
A novel post-percutaneous nephrolithotomy sepsis prediction model using machine learning



To establish a predictive model for sepsis after percutaneous nephrolithotomy (PCNL) using machine learning to identify high-risk patients and enable early diagnosis and intervention by urologists.


A retrospective study including 694 patients who underwent PCNL was performed. A predictive model for sepsis using machine learning was constructed based on 22 preoperative and intraoperative parameters.


Sepsis occurred in 45 of 694 patients, including 16 males (35.6%) and 29 females (64.4%). Data were randomly split into an 80% training set and a 20% validation set via 100-fold Monte Carlo cross-validation. The variables included in this study showed little redundancy with one another. The model achieved good predictive power for postoperative sepsis (AUC = 0.89, 87.8% sensitivity, 86.9% specificity, and 87.4% accuracy). The top 10 variables that contributed to the model prediction were preoperative midstream urine bacterial culture, sex, days of preoperative antibiotic use, urinary nitrite, preoperative blood white blood cell (WBC) count, renal pyogenesis, staghorn stones, history of ipsilateral urologic surgery, cumulative stone diameters, and renal anatomic malformation.


Our predictive model is suitable for estimating the risk of sepsis after PCNL and, by supporting early intervention, may help reduce the incidence of sepsis.



Urolithiasis is the most common urinary system disease and has a high incidence worldwide [1]. According to surveys, the incidences in North America, Europe, and Asia are 7–13%, 5–9%, and 1–5%, respectively [2]. In recent decades, the incidence has been rising, causing not only suffering for patients but also a significant burden on health systems [3].

For complex calculi, such as staghorn calculi, PCNL is the most suitable treatment because of its advantages of high stone removal rate, less surgical trauma, and faster postoperative recovery [4]. However, PCNL is associated with many complications including sepsis, which can affect patient prognosis. Septic shock, which is a serious manifestation of sepsis, significantly increases patient mortality [5].

On the other hand, medical research has entered a new era with the advent of artificial intelligence (AI) [6]. Machine learning is an important branch of AI which is widely used in image recognition and prognosis prediction. For urinary calculi, machine learning is mainly used to assist clinicians in selecting appropriate surgical methods, predicting the success rate of surgery, and determining the composition of the calculi [6,7,8,9]. However, no relevant studies have been conducted on the application of machine learning to predict sepsis after PCNL. Therefore, this study aimed to establish a predictive model for sepsis after PCNL using machine learning. This can provide a reference for urologists to identify sepsis and start earlier intervention for high-risk patients.


The perioperative data of 694 patients who underwent PCNL at Changhai Hospital between January 2015 and February 2019 were collected, including 404 (58.2%) males and 290 (41.8%) females. All patients provided written informed consent. Urine bacterial cultures were performed for all patients before PCNL. Patients with positive cultures received appropriate antibiotic treatment based on the culture results, with the aim of achieving a negative urine culture before surgery. The F22 standard access was used for all PCNL procedures at Changhai Hospital. To avoid bias caused by different operators, only procedures performed by Professor Gao, who is highly experienced in PCNL, were included in this study. In cases of pyonephrosis, we usually stopped the operation immediately after placing the nephrostomy tube. However, in patients who had received a sufficient course of antibiotics and had a small stone load, or in whom a nephrostomy tube alone could not guarantee drainage unless the stone in the main pelvis was removed, we used ultrasonic negative-pressure aspiration, strictly controlled the operation time, and finished the operation as soon as possible after removing the stones in the pelvis. These patients were sent to the ICU for intensive care immediately after surgery.

The endpoint of this study was the occurrence of sepsis within 24 h after the operation. Patients were considered to have sepsis when their Sequential Organ Failure Assessment (SOFA) score was ≥ 2. Because sepsis developed early after surgery in some patients, postoperative laboratory results could not always be obtained in time. Therefore, 22 preoperative and intraoperative variables were used to construct the sepsis predictive model, so that clinicians could assess immediately after surgery whether a patient was likely to develop sepsis. Ten continuous variables were used: age, body mass index (BMI), preoperative blood WBC count, creatinine, procalcitonin, bilirubin levels, urinary WBC count, days of preoperative antibiotic use, cumulative stone diameters, and operation time. Twelve categorical variables were used: sex, renal anatomical malformation, urinary nitrite, hypertension, diabetes mellitus, isolated kidney, history of ipsilateral urologic surgery, preoperative drainage, preoperative midstream urine bacterial culture, staghorn stones, surgical access, and renal pyogenesis.

To prevent distortion of results with the use of conventional algorithms, we applied the synthetic minority oversampling technique (SMOTE) algorithm to adjust for imbalanced classifications. This algorithm simulated the samples of patients with sepsis and added artificially simulated new samples to the dataset, thus eliminating imbalance in the original data.
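The interpolation idea behind SMOTE can be illustrated with a minimal sketch. It is not the study's implementation: the function name `smote`, the toy feature values, and the choice of `k` are assumptions made for illustration, and the sketch handles continuous features only (categorical variables need a dedicated variant).

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (excluding itself)
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment between base and neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows with two continuous features
sepsis_rows = [[12.1, 3.0], [14.8, 5.0], [11.5, 2.0], [15.2, 6.0]]
new_rows = smote(sepsis_rows, n_new=6, k=2)
print(len(new_rows))  # 6
```

Each synthetic row lies on the line segment between two real minority rows, so it stays within the range of the observed sepsis cases. The sketch needs at least two minority rows to work.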

Correlation matrix analysis was used to analyse the 22 variables, with a redundancy threshold of 0.85. The final model used in this study is a three-layer machine learning framework with mixed super learners. In Layer 1, various machine learning algorithms, including Bayesian Classifier, Random Forest, Multi-Gaussian Weighted Classifier, and Support Vector Machine, were trained to minimise the effect of algorithm bias. In Layer 2, meta-training was applied by using the prediction results of each trained Layer 1 model as input features, yielding mixed super learners with increased predictive performance. The final decision in Layer 3 combined the Layer 2 prediction results by weighted majority voting. The Monte Carlo cross-validation scheme was applied with 80% training and 20% validation ratios across 100 folds. Each fold had a unique training-validation configuration. The Monte Carlo split resulted in 556 samples per fold for the training set. The validation set for each fold contained 69 sepsis samples (139 samples overall). The validation samples were subsampled equally to ensure that none of the label outcomes was over- or underrepresented during cross-validation. The predictive performance of the model was evaluated using the Monte Carlo cross-validation scheme with confusion matrix analysis. True-positive, true-negative, false-positive, and false-negative results were calculated by evaluating the validation samples using the established model pipeline in each fold. The sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and area under the curve (AUC) were calculated from the validation results of each Monte Carlo fold. The selected features and their ranks, as well as their respective value distributions, were calculated across the Monte Carlo folds by Smart Redundancy Reduction. The ranks represented the relative importance of the selected features in building the model.
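The final combination step, weighted majority voting, can be sketched as follows. The text does not state how the weights were derived, so the weights, votes, and function name below are hypothetical and serve only to show the mechanism.

```python
def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight among the learners' votes."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical Layer 2 outputs for one patient (1 = sepsis, 0 = no sepsis)
layer2_votes = [1, 0, 1, 0]
# Hypothetical weights, for example each learner's validation accuracy
learner_weights = [0.90, 0.70, 0.85, 0.60]

print(weighted_majority_vote(layer2_votes, learner_weights))  # 1
```

In this toy case the two learners voting for sepsis carry a total weight of 1.75 against 1.30, so the combined decision is sepsis even though the raw vote count is tied.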
Data processing, data analysis, machine learning, and model evaluation were conducted with Python packages including scikit-learn v1.3.2 and imbalanced-learn v0.11.0.
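The Monte Carlo cross-validation scheme and the confusion-matrix metrics can be sketched with the standard library alone. This is an illustration under stated assumptions, not the study's pipeline: the function names are hypothetical, the labels in the usage example are invented, and the equal subsampling of validation labels described above is not implemented.

```python
import random


def monte_carlo_splits(n_samples, n_folds=100, train_ratio=0.8, seed=0):
    """Yield (train_idx, val_idx) for repeated random hold-out splits."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_ratio)
    for _ in range(n_folds):
        order = list(range(n_samples))
        rng.shuffle(order)
        yield order[:n_train], order[n_train:]


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from binary labels (1 = sepsis)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


# 695 samples split 80/20 across 100 folds, as in the text
splits = list(monte_carlo_splits(695))
train_idx, val_idx = splits[0]
print(len(splits), len(train_idx), len(val_idx))  # 100 556 139

# Invented labels and predictions for a single fold
example = confusion_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
print(example["sensitivity"])  # 0.5
```

In practice a model would be fitted on each training index set and scored on the matching validation set, and the per-fold metrics would then be averaged. The metric function assumes each fold contains at least one positive and one negative label.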


Baseline characteristics of all patients are shown in Table 1. In our study, postoperative sepsis occurred in 45 of 694 patients, including 16 males (35.6%) and 29 females (64.4%). The proportion of patients with and without sepsis was unbalanced (6.5% vs. 93.5%). After data pre-processing using the SMOTE algorithm, the total number of samples was 695, of which 278 were positive and 417 were negative (40.0% vs. 60.0%). This reduced the interference caused by the low proportion of positive cases in data processing. A comparison of the patient distributions before and after applying the SMOTE algorithm is shown in Fig. 1.

Table 1 Baseline characteristics of all patients with and without sepsis
Fig. 1 Patient distribution before and after applying the SMOTE algorithm

The sepsis predictive model yielded 87.8% sensitivity, 86.9% specificity, 87.4% accuracy, and an AUC of 0.89. Table 2 summarises the Monte Carlo cross-validation performance of all the ensemble prediction models. Figure 2 shows the receiver operating characteristic curve of the predictive model.
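For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen sepsis case receives a higher predicted risk than a randomly chosen non-sepsis case. The sketch below computes it by direct pair counting; the labels and risk scores are invented for illustration and the function name `roc_auc` is an assumption.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented labels (1 = sepsis) and predicted risks
labels = [1, 1, 1, 0, 0, 0, 0]
risk = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(round(roc_auc(labels, risk), 3))  # 0.917
```

Here 11 of the 12 positive-negative pairs are ordered correctly, giving 11/12 ≈ 0.917. A value of 0.5 corresponds to the random-guess diagonal of a ROC plot.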

Table 2 Monte Carlo cross-validation performance of the established model scheme throughout the top-layer prediction model
Fig. 2 Mean cross-validation ROC curve of the built model. The thick blue line corresponds to the mean ROC curve, while the light blue shaded area represents the spread of all 100 ROC curves generated across the validation folds. FPR: False Positive Rate; TPR: True Positive Rate. The dashed diagonal line represents the random-guess reference for comparison

In our sepsis prediction model, the top ten variables were preoperative midstream urine bacterial culture, sex, days of preoperative antibiotic use, urinary nitrite level, preoperative WBC, renal pyogenesis, staghorn stones, history of ipsilateral urologic surgery, cumulative stone diameter, and renal anatomic malformation. Nine of the 10 most relevant features for sepsis prediction originated from the preoperative data. See Fig. 3 for the specific ranking.

Fig. 3 Feature importance ranking. Rank values are in percentages


In this study, we collected important preoperative and intraoperative clinical data from patients, combined them with machine learning methods, and developed a model that could predict the occurrence of sepsis early after PCNL. The results showed that this model had good predictive performance for postoperative sepsis (AUC = 0.89). This may improve urologists' ability to identify sepsis early after PCNL and help reduce the incidence of postoperative adverse events.

Among the common complications of PCNL, infection-based sepsis not only makes treatment more challenging, but also reduces the overall treatment effect [5]. In addition, patients with sepsis have long-term physical, psychological, and cognitive disorders that have a significant negative impact on their long-term prognosis [10]. Furthermore, septic shock, a subset of sepsis, can significantly increase postoperative mortality by affecting the cardiovascular system and cell metabolism [11]. Prevention and early treatment are key to positive outcomes in sepsis; therefore, many studies have focused on exploring the risk factors for sepsis, in the hope of early identification of high-risk patients. According to previous studies, the risk factors for sepsis include age, diabetes, urinary tract infection, stone burden, and positive bacterial cultures of renal pelvic urine and stone [12,13,14,15]. However, differences in the study populations, treatment processes, surgical technology, and many other factors in each study led to large discrepancies in the results, making early identification and treatment challenging for clinicians.

Currently, multivariate analysis is the main research method used to assess sepsis risk factors. Logistic regression, a commonly used method, assumes a linear relationship between the predictors and the log-odds of the outcome and works best with few missing values. However, in renal calculi studies, clinical data are easily lost. More importantly, correlation between several factors limits the application of logistic regression. In contrast, machine learning does not assume linearity and can automatically identify relationships between variables, allowing analysis even with missing data, which is closer to the conditions of clinical research. In addition, the algorithm was repeatedly optimised to improve the final predictive ability of the model, rather than simply performing mechanical repetition when processing a large amount of data. There have been some practical applications of artificial intelligence in predicting sepsis in patients with urinary stones. Hong et al. constructed a preliminary screening model for urosepsis based on ultrasound and urinalysis using an artificial neural network [16]. This model can provide a risk assessment for urosepsis in patients with upper urinary tract calculi, guide targeted examination or intervention, and improve the efficiency of diagnosis and treatment. Considering the low percentage of patients with sepsis, we used the SMOTE algorithm to address the sample imbalance problem. Furthermore, the parameters included in this study were not redundant, as no pair of variables was highly correlated.
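A redundancy check of this kind, using the threshold of 0.85 given in the methods, can be sketched as a pairwise screen. The sketch assumes Pearson correlation, which the text does not specify; the column names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def redundant_pairs(columns, threshold=0.85):
    """List feature pairs whose absolute correlation exceeds the threshold."""
    names = list(columns)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(columns[a], columns[b])) > threshold
    ]


# Hypothetical toy columns; real inputs would be the 22 study variables
toy = {
    "stone_diameter_mm": [20, 35, 50, 28, 60],
    "stone_diameter_cm": [2.0, 3.5, 5.0, 2.8, 6.0],
    "age": [51, 38, 64, 45, 40],
}
print(redundant_pairs(toy))  # [('stone_diameter_mm', 'stone_diameter_cm')]
```

An empty result would correspond to the situation described in the study, where no pair of variables was flagged as redundant. The sketch assumes no column is constant, since a constant column has zero variance and an undefined correlation.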

Another advantage of our predictive model is its ability to rank the importance of the variables after data processing. Among the 22 variables included in this study, the top 10 variables contributing to model prediction were preoperative midstream urine bacterial culture, sex, days of preoperative antibiotic use, urinary nitrite, preoperative blood WBC count, renal pyogenesis, staghorn stones, history of ipsilateral urologic surgery, cumulative stone diameters, and renal anatomic malformation. Most of the variables with higher importance were consistent with the results of previous studies on risk factors for sepsis. In a prospective single-centre study of 802 patients, Chen et al. found that positive urine culture and the simultaneous presence of urine leukocytes and nitrite were independent risk factors for sepsis [12]. Patel et al. also reported that a positive multidrug-resistant urine culture could significantly increase the risk of postoperative infectious complications despite appropriate preoperative antibiotics [17]. Sex is also an important risk factor for postoperative infection. Previous research showed that the incidence of sepsis after PCNL was 4 times higher in female patients than in male patients [12]. In terms of stone burden, Rivera et al. demonstrated that staghorn stones were independently associated with an increased risk of sepsis and that staghorn stones could increase the risk of postoperative infection by more than three times compared to multiple stones [18]. Patel et al. showed that 25% of the patients with postoperative infection events (including sepsis) had renal anatomical abnormalities [17]. In contrast, some studies have shown no correlation between renal anatomical malformations and postoperative infection [19]. Moreover, previous studies also showed that patients with a history of ipsilateral surgery are more likely to develop infection events after PCNL [15, 20].
This consistency suggests that the continuous optimisation of parameters in machine learning leads the model towards clinically plausible predictors. Furthermore, since nine of the 10 most relevant features for predicting sepsis were derived from preoperative data, urologists need to pay more attention to preoperative clinical data and evaluate patients more comprehensively, so that they can adjust the surgical strategy or intervene as soon as possible after surgery.

This study had some limitations. First, this was a single-centre retrospective study, and the total number of patients with sepsis was relatively small. Although the SMOTE algorithm was used, the predictive ability of the model may still be affected to some extent. Second, different centres may use different reference standards, and some of the variables included in this model were partly subjective, which may affect the predictive performance of the model. Finally, the importance ranking of some variables differed from the previous understanding of sepsis risk factors. For example, preoperative blood WBC count and creatinine level ranked higher than BMI and diabetes. In future work, we plan to collect cases after 2019 and conduct multi-centre studies to increase the number of cases. We will also continue to optimise the selection of variables and strive to further improve the predictive power of the model.


In conclusion, we established a machine learning model for predicting sepsis after PCNL. The model provides a reference for urologists in identifying patients at high risk of sepsis, so that early intervention may help reduce the incidence of sepsis.

Availability of data and materials

No datasets were generated or analysed during the current study.



PCNL: Percutaneous nephrolithotomy
WBC: White blood cell
AI: Artificial intelligence
BMI: Body mass index
SMOTE: Synthetic minority oversampling technique
AUC: Area under the receiver operating characteristic curve
ROC: Receiver operating characteristic


  1. Chewcharat A, Curhan G. Trends in the prevalence of kidney stones in the United States from 2007 to 2016. Urolithiasis. 2021;49(1):27–39.
  2. Sorokin I, Mamoulakis C, Miyazawa K, Rodgers A, Talati J, Lotan Y. Epidemiology of stone disease across the world. World J Urol. 2017;35(9):1301–20.
  3. Zeng G, Mai Z, Xia S, et al. Prevalence of kidney stones in China: an ultrasonography based cross-sectional study. BJU Int. 2017;120:109–16.
  4. Chen Y, Deng T, Duan X, Zhu W, Zeng G. Percutaneous nephrolithotomy versus retrograde intrarenal surgery for pediatric patients with upper urinary stones: a systematic review and meta-analysis. Urolithiasis. 2019;47:189–99.
  5. Wollin DA, Preminger GM. Percutaneous nephrolithotomy: complications and how to deal with them. Urolithiasis. 2018;46:87–97.
  6. Yang B, Veneziano D, Somani BK. Artificial intelligence in the diagnosis, treatment and prevention of urinary stones. Curr Opin Urol. 2020;30:782–7.
  7. Shabaniyan T, Parsaei H, Aminsharifi A, et al. An artificial intelligence-based clinical decision support system for large kidney stone treatment. Australas Phys Eng Sci Med. 2019;42:771–9.
  8. Choo MS, Uhmn S, Kim JK, et al. A prediction model using machine learning algorithm for assessing stone-free status after single session shock wave lithotripsy to treat ureteral stones. J Urol. 2018;200:1371–7.
  9. Black KM, Law H, Aldoukhi A, Deng J, Ghani KR. Deep learning computer vision algorithm for detecting kidney stone composition. BJU Int. 2020;125:920–4.
  10. Iwashyna TJ, Ely EW, Smith DM, Langa KM. Long-term cognitive impairment and functional disability among survivors of severe sepsis. JAMA. 2010;304:1787–94.
  11. Singer M, Deutschman CS, Seymour CW, et al. The third international consensus definitions for sepsis and septic shock (Sepsis-3). JAMA. 2016;315:801–10.
  12. Chen D, Jiang C, Liang X, et al. Early and rapid prediction of postoperative infections following percutaneous nephrolithotomy in patients with complex kidney stones. BJU Int. 2019;123:1041–7.
  13. Liu C, Zhang X, Liu Y, Wang P. Prevention and treatment of septic shock following mini-percutaneous nephrolithotomy: a single-center retrospective study of 834 cases. World J Urol. 2013;31:1593–7.
  14. Koras O, Bozkurt IH, Yonguc T, et al. Risk factors for postoperative infectious complications following percutaneous nephrolithotomy: a prospective clinical study. Urolithiasis. 2015;43:55–60.
  15. Sen V, Bozkurt IH, Aydogdu O, et al. Significance of preoperative neutrophil-lymphocyte count ratio on predicting postoperative sepsis after percutaneous nephrolithotomy. Kaohsiung J Med Sci. 2016;32:507–13.
  16. Hong X, Liu G, Chi Z, Yang T, Zhang Y. Predictive model for urosepsis in patients with upper urinary tract calculi based on ultrasonography and urinalysis using artificial intelligence learning. Int Braz J Urol. 2023;49:221–32.
  17. Patel N, Shi W, Liss M, et al. Multidrug resistant bacteriuria before percutaneous nephrolithotomy predicts for postoperative infectious complications. J Endourol. 2015;29:531–6.
  18. Rivera M, Viers B, Cockerill P, Agarwal D, Mehta R, Krambeck A. Pre- and postoperative predictors of infection-related complications in patients undergoing percutaneous nephrolithotomy. J Endourol. 2016;30:982–6.
  19. Liatsikos EN, Kallidonis P, Stolzenburg JU, et al. Percutaneous management of staghorn calculi in horseshoe kidneys: a multi-institutional experience. J Endourol. 2010;24:531–6.
  20. Draga RO, Kok ET, Sorel MR, Bosch RJ, Lock TM. Percutaneous nephrolithotomy: factors associated with fever after the first postoperative day and systemic inflammatory response syndrome. J Endourol. 2009;23:921–7.



The authors would like to thank the Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, for statistical support. Furthermore, the authors are grateful to all the patients for their generous participation.


This study was supported by Shanghai Science and Technology Support Project in Biomedicine (17441900800, Yonghan Peng).

Author information




XG and YP conceived and designed the project; RS and SZ collected data; WQ performed data analysis; RS and SM prepared the manuscript; All authors reviewed and revised the manuscript.

Corresponding authors

Correspondence to Yonghan Peng or Xiaofeng Gao.

Ethics declarations

Ethics approval and consent to participate

This retrospective study was in accordance with the ethical standards of Helsinki Declaration and its later amendments and was approved by the Ethics Committee of Changhai Hospital. Informed written consent was also obtained from all the patients in this study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Supplementary Fig. S1. Correlation matrix view of the data. Correlation values of 1 and -1 indicate a perfect linear and a perfect inverse linear relationship between two features, respectively. Feature pairs with a correlation value near 0 are considered non-redundant.

Supplementary Table S1. Preprocessing step algorithms and their parameter values, as performed in all Monte Carlo folds before machine learning.

Supplementary Table S2. Machine learning (ML) algorithms in the first ML layer with their parameters and value ranges across Monte Carlo (MC) folds.

Supplementary Table S3. Machine learning (ML) algorithms in the second ML layer with their parameters and value ranges across Monte Carlo (MC) folds. Occurrence of each ML type is represented in percentages across MC folds.

Supplementary Table S4. Average Monte Carlo (MC) cross-validation performance (%) of ML Layer 1 (ML-1) predictive models as determined by confusion matrix analytics across all MC folds.

Supplementary Table S5. Average Monte Carlo (MC) cross-validation performance (%) of ML Layer 2 (ML-2) predictive models as determined by confusion matrix analytics across all MC folds.

Supplementary Fig. S2. Box plot of the Monte Carlo (MC) cross-validation performance of the top-layer prediction model.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Shen, R., Ming, S., Qian, W. et al. A novel post-percutaneous nephrolithotomy sepsis prediction model using machine learning. BMC Urol 24, 27 (2024).
