Machine learning prediction of stone-free success in patients with urinary stone after treatment of shock wave lithotripsy
BMC Urology volume 20, Article number: 88 (2020)
The aims of this study were to determine the predictive value of decision support analysis for the shock wave lithotripsy (SWL) success rate and to analyze the data obtained from patients who underwent SWL to assess the factors influencing the outcome by using machine learning methods.
We retrospectively reviewed the medical records of 358 patients who underwent SWL for urinary stones (kidney and upper-ureter stones) between 2015 and 2018 and evaluated possible prognostic features, including patient characteristics and urinary stone characteristics measured on non-contrast computed tomography images. We split the data into an 80% training set and a 20% test set for predicting success and mainly used decision tree-based machine learning algorithms, such as random forest (RF), extreme gradient boosting trees (XGBoost), and light gradient boosting method (LightGBM).
In machine learning analysis, the prediction accuracies for stone-free were 86.0, 87.5, and 87.9%, and those for one-session success were 78.0, 77.4, and 77.0% using RF, XGBoost, and LightGBM, respectively. LightGBM yielded the best accuracy for stone-free prediction, and RF yielded the best accuracy for one-session success. The sensitivity and specificity of the machine learning models were 0.74 to 0.78 and 0.92 to 0.93 for stone-free, and 0.79 to 0.81 and 0.74 to 0.75 for one-session success, respectively. The area under the curve (AUC) values were 0.84 to 0.85 for stone-free and 0.77 to 0.78 for one-session success, and their 95% confidence intervals (CIs), averaged over the methods, were 0.730 to 0.933 and 0.673 to 0.866, respectively.
We applied selected machine learning methods to predict the outcome of SWL for urinary stones. The resulting predictive model reached an accuracy of about 88%. The feature importance reported by the machine learning algorithms can provide insights that match domain knowledge about the factors influencing SWL outcomes.
Shock wave lithotripsy (SWL), first introduced by Chaussy in 1980 [1], has been recognized as a convenient, noninvasive treatment for urinary stones and is now widely used as the primary treatment for urinary stones smaller than 2 cm because of its high stone-free rate [2,3,4]. However, the stone-free rate of SWL is affected by the size, location, composition, and radiological density of the stone. For large stones in particular, the success rate is relatively low and the retreatment rate is high, which requires more time and results in low cost-effectiveness [5].
Ineffective procedures and unnecessary resource waste can be avoided by evaluating in advance whether a patient with urinary stones is likely to respond well to SWL and choosing the treatment method accordingly. The widespread use of non-contrast computed tomography (NCCT) in the diagnosis of urinary stones has allowed accurate measurement of stone characteristics such as size, shape, location, and consistency, using Hounsfield units (HU). Therefore, by considering the factors that may affect the stone-free rate and applying SWL selectively, it is possible to reduce retreatment and economic costs. Many researchers have tried to identify these factors using statistical methods, and various studies have reported predictions of the stone-free rate after SWL [6,7,8].
Recently, machine learning and artificial intelligence technology have grown in importance and have attracted more attention in medicine with the advent of big data. In the medical field, researchers have applied machine learning methodology to the diagnosis and prediction of various diseases [9, 10]. The purposes of this study are to investigate retrospective information on patients diagnosed with urinary stones who underwent SWL and to establish binary classification machine learning models that predict whether a patient becomes stone-free and whether one-session success is achieved after SWL. Furthermore, the resulting prediction models could be implemented as a diagnostic support system for urinary stone treatment and extend its capability or functionality.
We retrospectively identified patients with kidney and upper-ureter stones who underwent their first session of SWL at our institution between January 2015 and December 2018. All data analysis was carried out in accordance with applicable laws and regulations described in the Declaration of Helsinki and was approved by the institutional review board of Chungnam National University Hospital (CNUH 2018–07-047). Three hundred fifty-eight patients with previously untreated stones and a solitary stone diameter of 5 to 20 mm were included. Patients were excluded if they were younger than 18 years; had a congenital genitourinary tract anomaly, a history of previous open urinary-tract surgery, or multiple stones; or had not undergone imaging for 4 weeks after SWL. We retrospectively reviewed the medical records and picture archiving and communication system (PACS) data of these patients and evaluated the possible prognostic features, including age and sex; presence of diabetes mellitus (DM) or hypertension (HTN); stone characteristics such as stone laterality, location, maximal length, stone volume, skin to stone distance (SSD), mean stone density (MSD), and stone heterogeneity index (SHI); double-J stenting and percutaneous nephrostomy (PCN) procedure before SWL; simple psoas muscle cross-sectional area measurement for sarcopenia; complete blood cell count; liver function test; renal function test; electrolyte test; and urinalysis.
Stone characteristics on NCCT
The stone characteristics were assessed on NCCT, and the maximal length was measured on each of the axial, coronal, and sagittal NCCT scans. The volume of the stone was computed by the ellipsoid method (X-axis length × Y-axis length × Z-axis length × π/6). The SSD was measured using radiographic calipers from the point of the largest stone diameter at 90° from the horizontal plane, because the shock wave is delivered vertically through the patient’s back. HU was carefully measured on the magnified axial NCCT image within a circle about 2–3 mm in diameter at the center of the middle cross-section, without including the adjacent tissue. MSD was defined as the mean value of HU, and SHI was defined as the standard deviation of HU.
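The three stone features defined above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample measurements are hypothetical, and the use of the population standard deviation for SHI is an assumption, since the text does not say whether the population or the sample form was used.

```python
import math
import statistics


def ellipsoid_volume_mm3(x_mm, y_mm, z_mm):
    """Stone volume by the ellipsoid method: X * Y * Z * pi / 6."""
    return x_mm * y_mm * z_mm * math.pi / 6


def stone_density_features(hu_values):
    """Return (MSD, SHI): mean and standard deviation of the HU values."""
    msd = statistics.fmean(hu_values)
    # Assumption: population SD; the source does not specify the variant.
    shi = statistics.pstdev(hu_values)
    return msd, shi


# Hypothetical measurements, for illustration only (not study data).
volume = ellipsoid_volume_mm3(10.0, 8.0, 6.0)
msd, shi = stone_density_features([900, 950, 1000, 1050, 1100])
print(f"volume = {volume:.1f} mm^3, MSD = {msd:.1f} HU, SHI = {shi:.1f} HU")
```

In practice the HU values would come from the pixels inside the region of interest reported by the PACS viewer rather than from a hand-typed list.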
SWL was performed on an outpatient basis, without anesthesia. The same lithotripter was used to treat all patients, with fluoroscopic guidance. The lithotripter was an electromagnetic lithotripter made by the DirexGroup (Integra SL, Initia Ltd., Israel). The intensity of the shock wave started at 10.0 kV and was gradually increased to less than 18.0 kV to improve stone fragmentation and reduce the risk of adjacent tissue trauma. The number of shock waves per SWL session varied from 2300 to 2500 at a rate of 60 shock waves per minute.
Additional SWL was performed at intervals of 1 week if evidence of stones remained. Stone-free status was defined as the absence of visible stones on X-ray studies, or an asymptomatic condition with clinically insignificant residual fragments ≤3 mm in maximal length, 4 weeks after the first SWL treatment, as measured by simple abdominal radiography or NCCT. One-session success was defined as being stone-free after a single SWL treatment. Adequate water intake and appropriate exercise were recommended for all patients.
The SWL data comprise 358 cases and 42 features, including the two target variables, stone-free and one-session success. The data were analyzed using well-known machine learning methods: random forest (RF), a statistical machine learning method [11]; extreme gradient boosting trees (XGBoost), a decision tree-based gradient boosting method [12]; and light gradient boosting method (LightGBM) [13]. The models were trained as binary classifiers for the two targets, stone-free and one-session success. Experiments used 80% of the data for training the prediction model (training set) and 20% for testing the trained model (test set). We repeated the random split 10 times and averaged the results, which is similar to an n-fold validation method, and performed the predictions for stone-free and one-session success. This sampling strategy indicates the predictive capability that models can obtain from the given SWL data set. To calculate sensitivity, specificity, and positive predictive value (PPV), we computed the confusion matrix for each sampled test set and averaged the values over the samples; for AUCs, we used the sklearn.metrics module in Python. For 95% CIs, we used bootstrapping with 1000 bootstrap resamples for each sampled data set and then averaged over the samples.
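The evaluation steps described above can be sketched with the Python standard library alone. This is not the authors' code: the study used sklearn.metrics for the AUC, whereas the rank-based AUC below is a hand-written stand-in, and the percentile form of the bootstrap interval, the choice of positive class, the rounding of the 20% test size, the function names, and the toy labels and scores are all assumptions made for illustration.

```python
import random


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def confusion_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and PPV from the 2x2 confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


def auc(y_true, scores, positive=1):
    """AUC as the chance that a positive case scores above a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:  # AUC is undefined with one class; redraw
            continue
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def holdout_splits(n_cases, n_repeats=10, test_frac=0.2, seed=0):
    """Yield (train_idx, test_idx) for repeated random 80/20 splits."""
    rng = random.Random(seed)
    n_test = round(n_cases * test_frac)  # rounding rule is an assumption
    for _ in range(n_repeats):
        idx = list(range(n_cases))
        rng.shuffle(idx)
        yield idx[n_test:], idx[:n_test]


# Toy labels and predicted scores (hypothetical, not the study data).
y_true = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.65, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]

metrics = confusion_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
ci_low, ci_high = bootstrap_auc_ci(y_true, scores)
print(metrics, auc_value, (ci_low, ci_high))
```

In a full experiment, each split from `holdout_splits` would be used to train a model on the training indices, the metrics would be computed on the test indices, and the per-split values would then be averaged over the 10 repetitions. Which outcome is treated as the positive class changes which number is called sensitivity and which specificity, so the `positive` argument should be set deliberately.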
The numbers of stone-free and one-session success cases were 253 (70.7%) and 154 (43.0%), respectively. Table 1 shows the patient and stone characteristics that we used for predictions. We present the prediction accuracies in Table 2. The prediction accuracies for stone-free were 86.0, 87.5, and 87.9%, and those for one-session success were 78.0, 77.4, and 77.0% using RF, XGBoost, and LightGBM, respectively. For stone-free prediction, LightGBM offered better accuracy than XGBoost and RF; for one-session success, RF showed better accuracy than XGBoost and LightGBM.
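One way to put these accuracies in context is to compare them with the accuracy of always predicting the more frequent class. This baseline is not reported in the study; the sketch below computes it only from the case counts given above, and the function name is hypothetical.

```python
def majority_baseline(n_positive, n_total):
    """Accuracy of always predicting the more frequent of two classes."""
    prevalence = n_positive / n_total
    return max(prevalence, 1 - prevalence)


n_cases = 358
outcomes = {"stone-free": 253, "one-session success": 154}

for name, n_pos in outcomes.items():
    prevalence = n_pos / n_cases
    baseline = majority_baseline(n_pos, n_cases)
    print(f"{name}: prevalence {prevalence:.1%}, baseline accuracy {baseline:.1%}")
```

On the full data set this gives baselines of about 70.7% for stone-free and 57.0% for one-session success; the exact baseline on any particular 20% test split would vary with the sampling.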
The sensitivity, specificity, positive predictive value (PPV), confidence interval, and area under the ROC curve (AUC) for the machine learning models are presented in Table 3. As shown in Table 3, the specificity for stone-free was higher than the sensitivity, which implies that the predictive models were more accurate in predicting stone-free cases. In contrast, for one-session success the models were better at predicting one-session failure. For the AUC of stone-free prediction, RF and LightGBM gave higher values than XGBoost; for one-session success, RF gave a higher value than XGBoost and LightGBM. We also report feature importance, which lends the prediction results some interpretability in terms of domain insights. Fig. 1 presents the feature importance of LightGBM for stone-free prediction. It shows that MSD was the main factor determining stone-free status, and that stone volume (mm³) and SSD at 90° (mm) played an important supporting role. This matches typical clinical intuition about stone-free outcomes, and LightGBM recovered it from the data. In addition, for the prediction of one-session success, stone volume (mm³) was the main factor (Fig. 2).
Being able to predict the result of treatment and the patient’s condition from easily obtained measurements would benefit all concerned. Such predictions could help in choosing an effective treatment for urinary tract stones and reduce unnecessary resource waste. One promising way to obtain them is to adopt machine learning based artificial intelligence methods.
Machine learning methods have the potential to make decisions that are best suited to the situation without the involvement of emotions, based solely on thorough statistics and calculations. Analyzing data with machine learning methods has several advantages. First, machine learning can provide interpretability for analysis and prediction results. Second, data sets that are too small for deep learning algorithms can be handled by certain types of machine learning algorithms, such as tree-based ones; statistical machine learning in particular is more effective for prediction with small data, even though more data usually yield higher accuracy. Third, machine learning can handle heterogeneous data, which differ considerably in their statistics and structure, at the same time; for instance, the properties of kidney stones and ureter stones differ distinctly [14, 15]. Machine learning can also find new value in data and derive the factors that matter most for the predicted target variables. These factors can be used to validate known domain insights and sometimes reveal factors that were not previously recognized. The main factors derived from machine learning are expected to play a major role in developing auxiliary and supporting systems for diagnosing diseases.
In this study, valid data accumulated from SWL treatment of urinary stones were available, and we applied three well-known tree-based machine learning methods and compared their results. The decision tree-based methods RF, XGBoost, and LightGBM increased the interpretability of the predicted results by reporting the importance of the features used in the prediction, which suggests a new use of machine learning methodology. We applied these decision tree-based algorithms rather than well-known deep learning algorithms because tree-based predictive models generally give better prediction accuracy when the amount of data is relatively small for deep learning. Moreover, the algorithms give some interpretation of their results through feature importance. As another strategy to overcome the relatively small amount of data, we pooled SWL treatment data for upper ureter and kidney stones without their positional information. The machine learning models trained on the pooled data were still able to predict stone-free and one-session success. Even though the positional information of the urinary stones was neglected, the predictive models achieved useful accuracy. After carefully filtering out cases with missing data, 358 cases were obtained for applying the machine learning algorithms. The experimental results were obtained by averaging over ten samplings of the given data set. In these prediction experiments, parameter tuning was not performed, for comparison purposes, because parameter tuning gives different results for different samplings of the given data.
The major contribution of this study is to help urologists choose the patients who would obtain the best results from SWL. After prediction analysis, patients who have a high risk of stone-free failure can select another method, such as percutaneous nephrolithotomy or retrograde intrarenal surgery using flexible ureteroscopy, to manage their urinary stones. The first objective of this study was achieved because the prediction accuracy of every method exceeded 85% for stone-free and was at least 77% for one-session success; in particular, LightGBM and XGBoost reached more than 87% in stone-free prediction.
In most SWL cases, stone composition cannot be analyzed unless fragments passed from the body are collected and analyzed directly. Therefore, patient and stone characteristics should play an important role in the pretreatment prediction of treatment outcomes. Even though stone volume, MSD, and SSD are known as important factors that can affect the stone-free rate after SWL, controversy about SSD still exists [16, 17]. In our feature importance analysis, as in various other studies that used general statistical methods, MSD and stone volume were the most influential factors, and SSD was less influential than MSD and stone volume.
MSD is the mean value of the HU of each pixel in a specific stone area and is known as a potential predictor of successful treatment of urinary stones with SWL [18,19,20]. Eisner et al. found that measuring the mean HU of defined regions just smaller than the stone, in magnified images on each slice of the transverse planes with a standard bone window, was the most accurate method of determining MSD [21]. In addition, PACS may provide pixel statistics such as the minimum, maximum, and standard deviation of HU values. Lee et al. [22] defined SHI as the standard deviation of stone density on NCCT and reported that SHI was independently associated with SWL success in patients with ureter stones. In our study, we easily determined MSD in the same way, by measuring the mean HU on NCCT using a PACS. Notably, MSD was a more important feature than SHI in the prediction of stone-free. Taken together, although the predictive value of SHI seems to be lower than that of MSD, SHI can play a supplementary role in the prediction of SWL treatment outcomes.
Whether body mass index (BMI) affects the success rate of SWL treatment has been controversial. Most studies have shown that BMI is an independent predictor of stone-free status after SWL [23, 24]. However, several studies took a different view [25, 26], so we considered muscle mass, a factor that can indicate whole-body health condition. No previous study has considered the relationship between muscle mass and the success rate of SWL treatment. Sarcopenia brings about mobility limitation and an inability to perform simple activities of daily life [27]. The psoas muscle cross-sectional area has been used in many studies to estimate overall muscle mass and has been shown to be a simple, easily measured, and reliable marker of sarcopenia [28,29,30]. In this study, the psoas muscle cross-sectional area ranked as an important feature in the prediction of one-session success rather than in that of stone-free. That is, the higher the muscle mass, the higher the activity in daily life. Muscle mass can therefore be regarded as an important factor in determining the extent of stone removal over a short period.
The results of blood and urine tests generally ranked low in feature importance. Among them, hemoglobin, glomerular filtration rate, and platelet count appeared to carry some, albeit limited, importance.
The current study has some limitations. Its retrospective design may have introduced sampling bias. To compensate for the retrospective sampling bias and small sample size, we applied three machine learning methods, which can reduce the bias of commonly used general statistical approaches. However, further studies with prospective data are needed to confirm our observations on feature importance.
MSD was the most significant feature, followed by stone volume, SSD, and stone length, in the stone-free prediction of SWL treatment outcomes in patients with urinary stones. For one-session success prediction, stone volume was the most significant feature, followed by MSD, stone length, SSD, and psoas muscle cross-sectional area. Thus, these parameters would be clinically useful, in that order.
We analyzed the outcome of SWL treatment using three machine learning methods and confirmed that prediction accuracy can reach as much as 87.9% by using various patient and stone characteristics. We suggest that bringing machine learning based artificial intelligence together with clinical practice is important. When further large studies, validated in a prospective group of urinary stone patients, become available, our machine learning methods might be useful for guiding SWL treatment selection and outcome prediction in patients with urinary stones.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
SWL: Shock wave lithotripsy
NCCT: Non-contrast computed tomography
PACS: Picture archiving and communication system
SSD: Skin to stone distance
MSD: Mean stone density
SHI: Stone heterogeneity index
XGBoost: Extreme gradient boosting trees
LightGBM: Light gradient boosting method
AUC: Area under the ROC curve
PPV: Positive predictive value
BMI: Body mass index
Chaussy C, Brendel W, Schmiedt E. Extracorporeally induced destruction of kidney stones by shock waves. Lancet. 1980;2(8207):1265–8.
Ben Khalifa B, Naouar S, Gazzah W, Salem B, El Kamel R. Predictive factors of extracorporeal shock wave lithotripsy success for urinary stones. Tunis Med. 2016;94(5):397–400.
Bres-Niewada E, Dybowski B, Radziszewski P. Predicting stone composition before treatment - can it really drive clinical decisions? Cent European J Urol. 2014;67(4):392–6.
Zumstein V, Betschart P, Abt D, Schmid HP, Panje CM, Putora PM. Surgical management of urolithiasis - a systematic analysis of available guidelines. BMC Urol. 2018;18(1):25.
Cone EB, Eisner BH, Ursiny M, Pareek G. Cost-effectiveness comparison of renal calculi treated with ureteroscopic laser lithotripsy versus shockwave lithotripsy. J Endourol. 2014;28(6):639–43.
Pareek G, Armenakas NA, Fracchia JA. Hounsfield units on computerized tomography predict stone-free rates after extracorporeal shock wave lithotripsy. J Urol. 2003;169(5):1679–81.
Patel T, Kozakowski K, Hruby G, Gupta M. Skin to stone distance is an independent predictor of stone-free status following shockwave lithotripsy. J Endourol. 2009;23(9):1383–5.
Gupta NP, Ansari MS, Kesarvani P, Kapoor A, Mukhopadhyay S. Role of computed tomography with no contrast medium enhancement in predicting the outcome of extracorporeal shock wave lithotripsy for urinary calculi. BJU Int. 2005;95(9):1285–8.
Obermeyer Z, Emanuel EJ. Predicting the future - big data, machine learning, and clinical medicine. N Engl J Med. 2016;375(13):1216–9.
De Silva D, Ranasinghe W, Bandaragoda T, Adikari A, Mills N, Iddamalgoda L, et al. Machine learning to support social media empowered patients in cancer care and cancer treatment decisions. PLoS One. 2018;13(10):e0205855.
Ho TK. Random decision forests. Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, Canada, August; 1995.
Chen T, Guestrin C. XGBoost: a scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016: ACM.
Ke G, Wang T, Chen W, Ma W, Ye Q, Liu TY, et al. LightGBM: a highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems. 2017;2017-December:3147–55.
Murphy KP. Machine learning: a probabilistic perspective. Cambridge, MA: MIT Press; 2012.
Hastie T, Tibshirani R, Friedman JH. The elements of statistical learning: data mining, inference, and prediction. 2017.
Wiesenthal JD, Ghiculete D, D'A Honey RJ, Pace KT. Evaluating the importance of mean stone density and skin-to-stone distance in predicting successful shock wave lithotripsy of renal and ureteric calculi. Urol Res. 2010;38(4):307–13.
Cho KS, Jung HD, Ham WS, Chung DY, Kang YJ, Jang WS, et al. Optimal skin-to-stone distance is a positive predictor for successful outcomes in upper ureter calculi following extracorporeal shock wave lithotripsy: a Bayesian model averaging approach. PLoS One. 2015;10(12):e0144912.
El-Nahas AR, El-Assmy AM, Mansour O, Sheir KZ. A prospective multivariate analysis of factors predicting stone disintegration by extracorporeal shock wave lithotripsy: the value of high-resolution noncontrast computed tomography. Eur Urol. 2007;51(6):1688–1693; discussion 93-4.
Weld KJ, Montiglio C, Morris MS, Bush AC, Cespedes RD. Shock wave lithotripsy success for renal stones based on patient and stone computed tomography characteristics. Urology. 2007;70(6):1043–1046; discussion 6-7.
Kacker R, Zhao L, Macejko A, Thaxton CS, Stern J, Liu JJ, et al. Radiographic parameters on noncontrast computerized tomography predictive of shock wave lithotripsy success. J Urol. 2008;179(5):1866–71.
Eisner BH, Kambadakone A, Monga M, Anderson JK, Thoreson AA, Lee H, et al. Computerized tomography magnified bone windows are superior to standard soft tissue windows for accurate measurement of stone size: an in vitro and clinical study. J Urol. 2009;181(4):1710–5.
Lee JY, Kim JH, Kang DH, Chung DY, Lee DH, Do Jung H, et al. Stone heterogeneity index as the standard deviation of Hounsfield units: a novel predictor for shock-wave lithotripsy outcomes in ureter calculi. Sci Rep. 2016;6:23988.
Ahmed MH, Ahmed HT, Khalil AA. Renal stone disease and obesity: what is important for urologists and nephrologists? Ren Fail. 2012;34(10):1348–54.
Hwang I, Jung SI, Kim KH, Hwang EC, Yu HS, Kim SO, et al. Factors influencing the failure of extracorporeal shock wave lithotripsy with Piezolith 3000 in the management of solitary ureteral stone. Urolithiasis. 2014;42(3):263–7.
Choi JW, Song PH, Kim HT. Predictive factors of the outcome of extracorporeal shockwave lithotripsy for ureteral stones. Korean J Urol. 2012;53(6):424–30.
Hatiboglu G, Popeneciu V, Kurosch M, Huber J, Pahernik S, Pfitzenmaier J, et al. Prognostic variables for shockwave lithotripsy (SWL) treatment success: no impact of body mass index (BMI) using a third generation lithotripter. BJU Int. 2011;108(7):1192–7.
Janssen I, Heymsfield SB, Ross R. Low relative skeletal muscle mass (sarcopenia) in older persons is associated with functional impairment and physical disability. J Am Geriatr Soc. 2002;50(5):889–96.
Shen W, Punyanitya M, Wang Z, Gallagher D, St-Onge MP, Albu J, et al. Total body skeletal muscle and adipose tissue volumes: estimation from a single abdominal cross-sectional image. J Appl Physiol (1985). 2004;97(6):2333–8.
Cruz-Jentoft AJ, Baeyens JP, Bauer JM, Boirie Y, Cederholm T, Landi F, et al. Sarcopenia: European consensus on definition and diagnosis: report of the European working group on sarcopenia in older people. Age Ageing. 2010;39(4):412–23.
Jones KI, Doleman B, Scott S, Lund JN, Williams JP. Simple psoas cross-sectional area measurement is a quick and easy method to assess sarcopenia and predicts major surgical complications. Colorectal Dis. 2015;17(1):O20–6.
The authors acknowledge the members of the Department of Urology, Chungnam National University, and the National Institute for Mathematical Sciences.
This research was supported by the Chungnam National University Hospital Research Fund, 2018–2019. This research was also supported by a National Institute for Mathematical Sciences (NIMS) grant funded by the Korea government, 2020 (No. NIMS-B20900000). The Chungnam National University Hospital Research Fund supported the design of the study. NIMS-B20900000 was mainly used for the machine learning analysis.
Ethics approval and consent to participate
All data analysis was carried out in accordance with applicable laws and regulations described in the Declaration of Helsinki and was approved by the institutional review board of Chungnam National University Hospital, reference number CNUH 2018–07-047. For this type of retrospective study, formal consent is not required.
Consent for publication
The authors have declared that no competing interests exist.
Yang, S.W., Hyon, Y.K., Na, H.S. et al. Machine learning prediction of stone-free success in patients with urinary stone after treatment of shock wave lithotripsy. BMC Urol 20, 88 (2020). https://doi.org/10.1186/s12894-020-00662-x