External validation of the Simplified PADUA REnal (SPARE) nephrometry system in predicting surgical outcomes after partial nephrectomy

Background Pentafecta is a major goal in the era of partial nephrectomy (PN). The Simplified PADUA REnal (SPARE) nephrometry system was developed to evaluate tumor complexity. However, the ability of the SPARE system to predict pentafecta is yet to be determined. The aim of this study was to externally validate the applicability of the SPARE nephrometry system in predicting pentafecta achievement after partial nephrectomy, and to examine inter-observer concordance. Methods We retrospectively reviewed data of 207 consecutive patients who underwent PN between January 2012 and August 2018 at a tertiary referral center. We obtained SPARE, R.E.N.A.L., and PADUA scores and evaluated correlations among the nephrometries and surgical outcomes, including pentafecta, using the Spearman test. Logistic regression analysis was used to identify independent predictors of pentafecta outcomes. We compared the ability of the nephrometries to predict pentafecta achievement using receiver operating characteristic (ROC) curve analysis. Fleiss' generalized kappa was used to assess interobserver variation in the SPARE system. Results Based on the SPARE system, 120, 74, and 13 patients were stratified into low-risk, intermediate-risk, and high-risk groups, respectively. Regarding the individual components of pentafecta, there were significant differences in the complication rate (p = 0.03), ischemia time (p < 0.001), and percent change of estimated glomerular filtration rate (eGFR) (p < 0.001) among the three risk groups. In addition, higher tumor complexity was significantly associated with a lower achievement rate of pentafecta (p = 0.01). In Spearman correlation tests, SPARE nephrometry was correlated with ischemia time (ρ: 0.37, p < 0.001), operative time (ρ: 0.28, p < 0.001), complication rate (ρ: 0.34, p < 0.001), percent change of eGFR (ρ: 0.34, p < 0.001), and progression of chronic kidney disease stage (ρ: 0.17, p = 0.02). Multivariate analysis revealed that SPARE was significantly associated with pentafecta (OR: 0.67, p < 0.001).
In ROC curve analysis, SPARE showed fair predictive ability for the achievement of pentafecta (AUC: 0.71). The predictive ability for pentafecta was similar between nephrometries (SPARE vs. R.E.N.A.L., p = 0.78; SPARE vs. PADUA, p = 0.66). The interobserver concordance of SPARE was excellent (Kappa: 0.82, p = 0.03). Conclusions The SPARE system was a predictive factor of surgical outcomes after PN. This refined nephrometry had similar predictive abilities for pentafecta achievement compared with R.E.N.A.L. and PADUA.


Background
Partial nephrectomy (PN) is the standard of care despite the increased use of surgical approaches for T1 renal tumors and even selected T2 renal tumors [1]. Compared to radical nephrectomy, PN can achieve better renal function preservation without compromising the oncological and overall survival outcomes [2,3]. Both trifecta and pentafecta remain the major goals in the era of PN [4,5]. Trifecta is an evaluation of short-term outcomes and is defined as ischemia time ≤ 25 min, a negative surgical margin, and no major complications (a major complication being defined as a Clavien score of ≥ 3). Pentafecta is an evaluation of long-term outcomes that includes all of the criteria of trifecta plus > 90% preservation of estimated glomerular filtration rate (eGFR) and no increase in the stage of chronic kidney disease (CKD) at 1 year after PN. These surgical outcomes are impacted by factors including patient characteristics and tumor complexity [6]. Therefore, standard, reproducible, and precise evaluations of tumor complexity are important in surgical planning and patient counseling.
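As an illustration, the trifecta and pentafecta definitions given above can be written as a small check. This is a minimal sketch under stated assumptions, not the authors' code: the `PNOutcome` record and its field names are hypothetical, complications are summarized by the highest Clavien grade (0 when none occurred), and eGFR preservation is taken as the ratio of the 1-year value to the pre-operative value.

```python
from dataclasses import dataclass


@dataclass
class PNOutcome:
    """Hypothetical per-patient record; field names are illustrative only."""
    ischemia_min: float    # ischemia time in minutes
    positive_margin: bool  # True if the surgical margin is positive
    max_clavien: int       # highest Clavien grade, 0 if no complication
    egfr_pre: float        # pre-operative eGFR
    egfr_1y: float         # eGFR 1 year after PN
    ckd_stage_pre: int     # CKD stage before PN (1-5)
    ckd_stage_1y: int      # CKD stage 1 year after PN (1-5)


def achieved_trifecta(o: PNOutcome) -> bool:
    # Ischemia time <= 25 min, negative margin, no major (Clavien >= 3) complication.
    return o.ischemia_min <= 25 and not o.positive_margin and o.max_clavien < 3


def achieved_pentafecta(o: PNOutcome) -> bool:
    # Trifecta plus > 90% eGFR preservation and no increase in CKD stage at 1 year.
    egfr_preserved = o.egfr_1y / o.egfr_pre > 0.90
    no_ckd_upstaging = o.ckd_stage_1y <= o.ckd_stage_pre
    return achieved_trifecta(o) and egfr_preserved and no_ckd_upstaging


# Example: short ischemia and clean margin, but eGFR falls from 90 to 80 (about 89%).
patient = PNOutcome(20, False, 2, 90.0, 80.0, 1, 2)
print(achieved_trifecta(patient), achieved_pentafecta(patient))
```

In this example the patient meets the trifecta criteria but not pentafecta, because less than 90% of the pre-operative eGFR is preserved and the CKD stage increases.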
Several nephrometries have been developed and evaluated, of which the R.E.N.A.L. and PADUA systems are the most widely used and studied [7,8]. Both R.E.N.A.L. and PADUA have been significantly correlated with prolonged ischemia time and post-operative complications, which are components of trifecta [9]. However, controversy exists with regard to the application of these first-generation nephrometries in the prediction of postoperative renal function, which is a component of pentafecta [10,11]. Only the radius of the tumor and endophytic features are associated with split renal function after PN. Many factors in first-generation nephrometries may decrease their ability to predict functional outcomes [12]. The evolution of surgical techniques and the increasing use of PN may limit the use of first-generation nephrometries. Ficarra et al. proposed a revised version of PADUA, the Simplified PADUA REnal (SPARE) nephrometry system [13]. The SPARE system is composed of fewer variables, namely: 1) rim location; 2) renal sinus involvement; 3) exophytic rate; and 4) tumor size (Fig. 1). Even though fewer variables are used in the SPARE system, this has not negatively affected the ability to evaluate surgical complexity, and the accuracy of the original PADUA and SPARE in predicting overall complications has been shown to be similar [13]. Since the SPARE system is a novel tool, its application and inter-observer concordance have yet to be validated externally. Moreover, few studies have compared the ability of the SPARE system and first-generation nephrometries to predict pentafecta. Therefore, the aim of this study was to apply three nephrometries (SPARE, R.E.N.A.L., PADUA) in a contemporary series of PNs in order to externally validate the SPARE system and to perform head-to-head comparisons of their predictive performance.

Patients and data collection
After approval by the Institutional Review Board (IRB) of China Medical University & Hospital (CMUH108-REC3-063), 207 consecutive patients who underwent PN via open, laparoscopic, or robotic-assisted approaches for localized renal tumors between January 2012 and August 2018 at a tertiary referral center were included in this study. All methods were performed in accordance with the relevant guidelines and regulations, and a waiver of informed consent was granted by the IRB. Patients with multiple renal tumors within one kidney, a solitary kidney, or recurrent renal cell carcinoma were excluded. The surgical approach and the renorrhaphy technique were determined by the surgeons' expertise and the patients' preference. All PNs were conducted using the standard renal artery and renal vein on-clamp technique and conventional resection.
Imaging studies with either abdominal computed tomography (CT) or magnetic resonance imaging (MRI) were obtained from all patients pre-operatively. Warm ischemia was used in laparoscopic PN (LPN) and robotic PN (RPN), and cold ischemia was used in open PN (OPN). We collected the patients' demographic and clinical data and imaging studies electronically and analyzed them retrospectively. SPARE, R.E.N.A.L., and PADUA scores were obtained according to the original studies [7,8,13]. Based on risk stratification of the SPARE nephrometry, the patients were divided into three groups: low-risk group (score 0-3), intermediate-risk group (score 4-7), and high-risk group (score 8-10). Interobserver concordance of the SPARE nephrometry was assessed by two urologists and one radiologist (C.G. Heng, P.J. Hsiao, Y.P. Wang), each of whom was blinded to the clinical outcomes.
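The risk stratification described above amounts to a simple lookup on the total SPARE score. The sketch below is illustrative only: the function name `spare_risk_group` is hypothetical, and it assumes that the total score is an integer from 0 to 10, with the high-risk band covering scores 8 to 10.

```python
def spare_risk_group(score: int) -> str:
    """Map a total SPARE score (assumed range 0-10) to its risk group."""
    if not 0 <= score <= 10:
        raise ValueError(f"SPARE score must be between 0 and 10, got {score}")
    if score <= 3:
        return "low"           # score 0-3
    if score <= 7:
        return "intermediate"  # score 4-7
    return "high"              # score 8-10 (assumed upper band)


for s in (2, 5, 9):
    print(s, spare_risk_group(s))
```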

Outcome measures
We collected and analyzed preoperative demographics (gender, age, American Society of Anesthesiologists score, Charlson Comorbidity Index) and perioperative outcomes (operative time, ischemia time, estimated blood loss, complications, length of hospitalization). Complications were defined as surgery-related adverse events within 3 months after surgery, and were assessed using the Clavien-Dindo classification system. A major complication was defined as a Clavien score of ≥ 3. Renal function was assessed by serum creatinine (Cre) and eGFR based on the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation. Renal function was evaluated pre-operatively and on the 3rd day, on the 30th day, and at 1 year after surgery. Change in renal function was expressed as the absolute change of eGFR (ACE) and the percent change of eGFR (PCE). CKD upstaging was defined as upstaging of CKD status to stage III, IV, or V. The following pathology features were recorded: malignancy, the subtype of RCC, and surgical margin. Pentafecta was assessed as previously reported [5].
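The functional measures above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the sign convention (post-operative minus pre-operative, so a loss is negative) is an assumption because the text does not state it, and CKD stage is derived from eGFR alone using the conventional thresholds of 90, 60, 30, and 15 mL/min/1.73 m², which ignores the additional evidence of kidney damage normally required for stages I and II.

```python
def absolute_change_egfr(egfr_pre: float, egfr_post: float) -> float:
    """ACE: post-operative minus pre-operative eGFR (assumed sign convention)."""
    return egfr_post - egfr_pre


def percent_change_egfr(egfr_pre: float, egfr_post: float) -> float:
    """PCE: ACE expressed as a percentage of the pre-operative eGFR."""
    return (egfr_post - egfr_pre) / egfr_pre * 100.0


def ckd_stage(egfr: float) -> int:
    """CKD stage from eGFR only (simplification; see the assumptions above)."""
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


def ckd_upstaged(egfr_pre: float, egfr_post: float) -> bool:
    """True if the stage increased and the new stage is III, IV, or V."""
    pre, post = ckd_stage(egfr_pre), ckd_stage(egfr_post)
    return post > pre and post >= 3


print(absolute_change_egfr(80.0, 60.0), percent_change_egfr(80.0, 60.0))
print(ckd_upstaged(75.0, 55.0), ckd_upstaged(95.0, 70.0))
```

Under this reading, a fall from stage I to stage II is not counted as upstaging, because the definition above restricts upstaging to stages III through V.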

Statistical analyses
Categorical variables, including sex, positive surgical margin, and achievement of pentafecta, are displayed as percentages, and continuous variables, including SPARE, R.E.N.A.L., and PADUA scores, are displayed as median (IQR). The Mann-Whitney U-test and Kruskal-Wallis H-test were used to compare nonparametric continuous variables between two groups and among more than two groups, respectively. The Pearson chi-square test was used to compare categorical variables. Spearman correlation was used to evaluate relationships among SPARE, R.E.N.A.L., and PADUA scores and surgical outcomes. Associations between clinical features, including the nephrometries, and pentafecta were evaluated in univariate and multivariate analyses using a logistic regression model. Factors potentially associated with pentafecta, namely age, sex, ASA, CCI, BMI, hypertension, diabetes, pre-operative eGFR, surgical approach, SPARE, R.E.N.A.L., and PADUA, were included in univariate analysis. Variables with a P-value below 0.25 in the univariable models were entered into subsequent multivariable models, which used a backward stepwise selection process with a P-value threshold of 0.05 for variables to remain in the model. This method starts with all variables in the model and removes nonsignificant variables as well as those whose loss has a negligible effect on the fit of the model. Because the three nephrometry scores (SPARE, R.E.N.A.L., and PADUA) are similar proxy variables for tumor complexity and correlate with each other, three separate models were run for predicting pentafecta, each with a different scoring system locked into the multivariate logistic regression model. The predictive abilities of the nephrometries for pentafecta were evaluated and compared using ROC curve analysis. We assessed interobserver variation in the SPARE system according to Fleiss' generalized kappa. 
All analyses were performed using SPSS v.22 (SPSS, Chicago, IL, USA), and a P value < 0.05 was considered to be statistically significant.
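For readers who wish to reproduce two of these statistics outside SPSS, the sketch below computes Fleiss' kappa for a fixed number of raters per patient and the area under the ROC curve from raw scores, using the standard equivalence between the AUC and the normalized Mann-Whitney U statistic. It is a minimal illustration under stated assumptions, not the authors' code: the function names and the toy data are hypothetical, and the significance test used to compare AUCs between nephrometries is not covered.

```python
from collections import Counter
from typing import Hashable, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss' kappa; ratings[i] holds the category each rater gave subject i."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every subject needs the same number of ratings")

    category_totals: Counter = Counter()
    agreement_sum = 0.0
    for subject in ratings:
        counts = Counter(subject)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this subject.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )

    p_observed = agreement_sum / n_subjects
    total = n_subjects * n_raters
    p_expected = sum((c / total) ** 2 for c in category_totals.values())
    # Undefined (division by zero) if all ratings fall in a single category.
    return (p_observed - p_expected) / (1.0 - p_expected)


def auc_from_scores(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as P(pos score > neg score) + 0.5 * P(tie), over all pairs."""
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three raters assign a SPARE risk group to four tumors.
toy_ratings = [
    ["low", "low", "intermediate"],
    ["intermediate", "intermediate", "intermediate"],
    ["high", "high", "high"],
    ["low", "low", "low"],
]
print(round(fleiss_kappa(toy_ratings), 3))
print(round(auc_from_scores([3, 4, 5], [1, 2, 4]), 3))
```

Which group is treated as "positive" in `auc_from_scores` is a modelling choice; if a higher score is expected to lower the chance of pentafecta, passing the scores of patients who did not achieve pentafecta as `pos` yields an AUC above 0.5.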

Results
Based on the SPARE system, 120, 74, and 13 patients were stratified into low-risk, intermediate-risk, and high-risk groups, respectively. There were no significant differences among the three groups in baseline characteristics except for tumor size (p < 0.001), surgical approach (p = 0.03), and tumor complexity assessed by the three nephrometries (p < 0.001) (Table 1). Robotic surgery tended to be preferred over the other two operative approaches in the high-risk group. Forty-eight, 52, and 107 patients underwent PN via open, laparoscopic, and robotic approaches, respectively; 50% of the patients were male. The median (IQR) age was 58 (15) years, the American Society of Anesthesiologists score was 2 (1), the Charlson Comorbidity Index score was 2 (3), and the tumor size was 3.5 (1.9) cm.
The median (IQR) operative time was 227 (97) minutes, the ischemia time was 24 (11) minutes, the estimated blood loss was 150 (250) mL, and the length of hospital stay was 8 (2) days (Table 2). Peri-operative outcomes were significantly different among the three risk groups. Patients in the highest risk group had the longest operative time (p = 0.003) and the longest hospital stay (p = 0.02) (Table 2). Clear cell renal cell carcinoma (RCC) (45.9%) was the most common malignant tumor, followed by papillary RCC (8.7%) and chromophobe RCC (8.2%) (Table 2). Regarding the individual components of pentafecta, there were significant differences in the complication rate (p = 0.03), ischemia time (p < 0.001), and PCE (p < 0.001) among the three risk groups. In addition, higher tumor complexity was significantly associated with a lower achievement rate of pentafecta (p = 0.01) (Table 2).

Discussion
Achieving trifecta and pentafecta is the major goal of PN regardless of the surgical approach. Therefore, an effective and validated tool to evaluate tumor complexity and surgical difficulty is essential. However, the R.E.N.A.L. and PADUA systems are not without limitations [10,14]. The SPARE system, a refined version of PADUA, includes tumor size, exophytic rate, sinus involvement, and rim location (Fig. 1). Compared to R.E.N.A.L. and PADUA, the SPARE system had similar predictive ability in pentafecta achievement (Fig. 2). In other words, the fewer constituents of the SPARE system did not affect its efficacy while making it easier to calculate the score. Moreover, the interobserver concordance of the SPARE system was good in overall score and in most of the individual components (Table 6). As a result, the SPARE system appears to be a favorable choice when evaluating tumor complexity and predicting post-PN outcomes during clinical practice and patient counseling.
Most peri-operative outcomes in our study were similar to those of the RECORd1 project, a 4-year prospective observational multicenter study. In the RECORd1 project, the major complication rate was 3.5%, the positive surgical margin rate was 5.5%, and the median ischemia time was 16 min [15,16]. The longer median ischemia time (24 min) in our study may be due to the larger tumor size and the low-volume setting of our center (fewer than 50 PNs performed per year). Renal functional outcomes such as ACE on the 3rd day and 30th day were similar between our study and the RECORd1 project.
The current study revealed that surgical approach was correlated with complication rate (ρ = −0.23, p = 0.001) and ischemia time (ρ = −0.33, p < 0.001), but not with positive surgical margin (ρ = −0.03, p = 0.76), PCE (ρ = −0.06, p = 0.36), or achievement of pentafecta (ρ = 0.08, p = 0.23) (data not shown in tables). The RECORd1 project reported that the open surgical approach was a significant predictive factor of complications. In contrast, Serni et al. showed that surgical approach was not a predictor of trifecta outcome in patients with highly complex renal tumors who underwent simple enucleation [17]. The varying effect of the open surgical approach on trifecta/pentafecta outcomes between studies may be caused by differences in surgical technique and in the complexity of the renal tumors. Further studies are required to confirm this hypothesis.
In the current study, SPARE nephrometry was correlated with peri-operative outcomes including ischemia time, operative time, and complication rate. The RECORd1 project reported that modified PADUA is not an independent predictive factor of postoperative complications [15]. In contrast to the RECORd1 project, most patients in our cohort underwent standard PN by a minimally invasive approach (76.8%). Since the utilization rate of open partial nephrectomy has steadily decreased in recent decades [16], SPARE would be a more suitable nephrometry in the era of minimally invasive surgery.
Although there was a trend toward greater functional loss in the higher risk group, Ficarra et al. found that the SPARE system was not associated with functional outcomes [13]. In contrast, the SPARE system was correlated with PCE and pentafecta in our study. This may be due to the different approaches of PN between the two studies. PN was conducted using standard resection methods in our institute, whereas 25% of the patients in their cohort underwent PN by enucleation [13]. Since resected renal volume plays an important role in functional loss [18], the predictive ability of the SPARE system in functional outcomes may be influenced by the volume of resected non-neoplastic renal parenchyma. In addition, tumor contact surface area has a greater ability to predict post-operative renal function than R.E.N.A.L. and PADUA [11,19]. SPARE includes components such as radius (R) and exophytic rate (E), which is similar to tumor contact surface area [11]. The other two components of sinus involvement and rim location are related to the vascular territory of the kidneys which affect renal function deterioration [20]. As a result, the SPARE system may be correlated to functional outcomes to some extent. However, further well-designed studies are needed to confirm these hypotheses. In our study, both R.E.N.A.L. and PADUA had a good predictive ability for pentafecta achievement. R.E.N.A.L. has been confirmed to be an independent predictive factor of pentafecta achievement with a negative association [21]. Serni et al. showed that PADUA score was significantly associated with the achievement of trifecta and with a negative margin, but not with warm ischemia time [17]. In contrast, Ubrig et al. and Harke et al. reported conflicting results about the predictive ability of PADUA for trifecta achievement [22,23]. The difference regarding the predictive ability of PADUA in pentafecta achievement between studies may be explained by the following reasons. 
First, there were inconsistencies between studies in controlling for confounding factors such as comorbidities, and patient factors affect postoperative complication rates and functional change [6]. Differences in the methods of multivariate analysis between studies may have resulted in conflicting results. In our study, we included possible factors including age, Charlson Comorbidity Index, BMI, and pre-operative renal function in order to reduce selection bias. Second, unimportant and non-concordant factors in PADUA and a lack of central image review may have led to the difference in results between studies [24]. Third, different surgical approaches such as open/ laparoscopy/   [25], and they reported a Fleiss' generalized kappa in their study cohort of 0.37 to 0.80 for the various components of the PADUA. Spaliviero et al. directly compared interobserver concordance among R.E.N.A.L., PADUA, and C-index, and found that agreement using the C-index method was higher than with PADUA or R.E.N.A.L. [24]. However, limitations existed when scoring the constituents including location and involvement of the collecting system [24]. Therefore, Ficarra et al. refined PADUA into the SPARE system which successfully improved interobserver agreement according to our results. In our cohort, the interobserver concordance of renal sinus involvement was lower and the exophytic rate was higher compared with previous studies. This may be because exophytic rate is a semi-quantitative parameter while renal sinus involvement is a qualitative parameter.
To the best of our knowledge, the current study is the first to externally validate the SPARE system. We further confirmed that SPARE is a predictive factor not only of the overall complication rate, but also of pentafecta achievement. Besides complication rate, we also found similar predictive abilities for pentafecta achievement between the SPARE and R.E.N.A.L./PADUA systems in ROC analysis. Another strength of the current study is that we provided evidence of the reproducibility of the SPARE system between urologists and a radiologist. This result suggests that the SPARE system can be applied across different specialties. However, there are also limitations to this study. First, this is a single-center retrospective study with various confounding factors. However, we tried our best to reduce selection bias by including possible confounding factors which have previously been reported. Second, we lacked unified imaging protocols for CT and MRI because we are a tertiary referral center. However, most constituents of the SPARE system are quantitative or semi-quantitative, so there may not have been significant inconsistencies in the scoring. Third, only a small proportion of the patients (6.3%) were classified as being at high risk, which may have limited the findings. Fourth, the PN technique used in the current study was standard resection, so the applicability of SPARE for PN with enucleation is still unclear, and further studies are needed to confirm the efficacy of the SPARE system in high-risk renal tumors and PN with enucleation. Finally, we did not evaluate renal function using radio-isotope scans, which have been proven to be a more precise tool than serum Cre or eGFR [26], because the aim of this study was to assess pentafecta as defined by a change in renal function as assessed by eGFR [5]. Therefore, this may not have limited the interpretation of the results.

Conclusions
In conclusion, the results of this study showed that the SPARE system was a predictive factor of surgical outcomes after PN. This refined nephrometry had similar predictive abilities for pentafecta achievement compared with R.E.N.A.L. and PADUA. The reproducibility, efficacy, and ease of use mean that the SPARE system may replace R.E.N.A.L. and PADUA in clinical practice.