Converting between the International Prostate Symptom Score (IPSS) and the Expanded Prostate Cancer Index Composite (EPIC) urinary subscales: modeling and external validation
BMC Urology volume 24, Article number: 28 (2024)
Abstract
Background
Prostate-related quality of life can be assessed with a variety of questionnaires. The 50-item Expanded Prostate Cancer Index Composite (EPIC) and the International Prostate Symptom Score (IPSS) are two widely used options, which makes it difficult to compare studies that report only one of them. The goal of this study was, therefore, to develop and validate a model that converts between the EPIC and the IPSS to enable comparisons across studies.
Methods
Three hundred forty-seven consecutive patients who had previously received radiotherapy and surgery for prostate cancer at two institutions in Switzerland and Germany were contacted via mail and instructed to complete both questionnaires. The Swiss cohort was used to train and internally validate different machine learning models using fourfold cross-validation. The German cohort was used for external validation.
Results
Converting between the EPIC Urinary Irritative/Obstructive subscale and the IPSS using linear regressions resulted in mean absolute errors (MAEs) of 3.88 when predicting the IPSS and 6.12 when predicting the EPIC subscale, which are below the respective previously published minimal important differences (MIDs) of 5.2 and 10 points. Converting between the EPIC Urinary Summary and the IPSS was less accurate, with MAEs of 5.13 and 10.45, similar to the MIDs. More complex model architectures did not result in improved performance in this study. The study was limited to the German versions of the respective questionnaires.
Conclusions
Linear regressions can be used to convert between the IPSS and the EPIC Urinary subscales. While the equations obtained in this study can be used to compare results across clinical trials, they should not be used to inform clinical decision-making in individual patients.
Trial registration
This study was retrospectively registered on clinicaltrials.gov on January 14th, 2022, under the registration number NCT05192876.
Introduction
Patient-reported outcome measures (PROMs) such as quality of life (QoL) are becoming increasingly important in clinical research [1, 2]. For diseases of the prostate, such as prostate cancer (PCa) or benign prostate hyperplasia (BPH), several validated QoL questionnaires exist [3]. In prostate cancer, quality of life is of particular importance as the prognosis for localized disease tends to be favorable, and different therapies can have different toxicity profiles that can affect quality of life in different ways [4, 5].
While a certain degree of heterogeneity is desirable due to the different focus areas of the questionnaires, it also makes comparisons across studies difficult [6]. This has led to trials requiring patients to complete different QoL questionnaires, sometimes at the same point in time. This is cumbersome for the patient, and having to answer more questions has also been shown to reduce the likelihood of a questionnaire being completed [7].
Two common questionnaires to assess prostate-related QoL are the 50-item Expanded Prostate Cancer Index Composite (EPIC) and the International Prostate Symptom Score (IPSS) [8, 9].
The EPIC consists of 50 Likert items that are used to compute four subscales: Urinary, Bowel, Sexual, and Hormonal. Each subscale has two subdomains to assess symptom severity (Function) and its effect on QoL (Bother). The Urinary subscale has two additional subdomains called Incontinence and Irritative/Obstructive. Scores range from 0 to 100, with higher scores indicating better QoL.
The IPSS consists of eight Likert items. The first seven relate to lower urinary tract symptoms of BPH, while the eighth asks about the symptoms’ effect on QoL. The first seven items are added to calculate the total, which ranges from 0 to 35, with higher scores indicating higher symptom burden.
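The IPSS total described above can be written as a short function. This is a minimal sketch: the function name `ipss_total` is hypothetical, and the per-item range of 0 to 5 is an assumption that follows from seven items summing to a maximum of 35.

```python
from typing import Sequence


def ipss_total(symptom_items: Sequence[int]) -> int:
    """Return the IPSS total from the seven symptom items.

    The eighth item (effect on QoL) is deliberately not part of the total.
    Assumes each symptom item is scored from 0 to 5.
    """
    if len(symptom_items) != 7:
        raise ValueError("expected exactly seven symptom item scores")
    for score in symptom_items:
        if not 0 <= score <= 5:
            raise ValueError(f"item score {score} is outside the 0 to 5 range")
    return sum(symptom_items)


# Illustrative answers, not patient data.
print(ipss_total([1, 2, 0, 3, 1, 0, 2]))  # 9
```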
While the use of the IPSS in cancer patients comes with caveats [10], it has been used in a variety of studies [11,12,13]. In addition, the EPIC has also been deployed in non-cancer patients due to the breadth of its questions [14].
To address the problem of converting between questionnaires, several publications have attempted to derive conversion rules [15,16,17]. However, to the best of our knowledge, there is currently no established method to convert between the IPSS and EPIC. Vertosick and colleagues attempted conversions by taking only a subset of questions from QoL instruments and calculating conversion factors [3]. However, they were unable to compare IPSS and EPIC due to differences in the domains addressed by the questionnaires. The purpose of this study was, therefore, to collect data for training as well as internally and externally validating models to enable converting between the two.
Methods
The study was conducted in radiation oncology departments at two institutions, the Cantonal Hospital Winterthur in Switzerland and the Ruppiner Kliniken GmbH in Germany.
Three hundred and forty-seven consecutive patients who had received radiation therapy for prostate cancer in the post-operative setting at our institutions between 2010 and 2020 were identified and received the German versions of the EPIC and IPSS questionnaires in August 2020, unless a date of death had been documented in our electronic health records. Patients completed the questionnaires based on their current quality of life and symptom burden.
We received responses from 208 patients. Of these, 175 included a signed informed consent form and had no missing values in either questionnaire.
Training and internal validation were performed on the Swiss cohort (n = 142) using cross-validation, while the German cohort (n = 33) was held out for external validation to assess whether the models generalize to previously unseen data from another institution without showing signs of overfitting.
We used a three-step approach. First, we visualized relevant patient characteristics to ensure that there was a correlation between the EPIC and IPSS scores, that patient characteristics were similar in both the training and external validation sets, and that both sets contained a variety of scores ranging from poor through mediocre to excellent.
Second, we developed four baseline models with one input each: a model to predict the EPIC Urinary Summary score when only the total IPSS is known, a model to predict the total IPSS when only the EPIC Urinary Summary is known, a model to predict the EPIC Urinary Irritative/Obstructive score when only the total IPSS is known, and a model to predict the total IPSS when only the EPIC Urinary Irritative/Obstructive score is known. For all baseline models, we used a simple linear regression.
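A single-input baseline of this kind amounts to fitting one slope and one intercept. The following dependency-free sketch shows ordinary least squares for one input; the study itself used scikit-learn, and the function names `fit_line` and `predict` as well as the example scores are hypothetical and not taken from the study data.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares with one input; returns (slope, intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length and at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: Sequence[float]) -> List[float]:
    """Apply the fitted line to new input scores."""
    return [intercept + slope * xi for xi in x]


# Made-up, perfectly linear scores for illustration only (not study data).
ipss = [0, 5, 10, 20]
epic_irritative_obstructive = [100, 90, 80, 60]
slope, intercept = fit_line(ipss, epic_irritative_obstructive)
print(slope, intercept)  # -2.0 100.0
```

The negative slope in the toy data mirrors the direction of the two scales: a higher IPSS means more symptoms, while a higher EPIC score means better QoL.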
In the third step, we tried to improve upon the performance of the baseline models by using more complex machine learning algorithms and the raw answers to the questionnaires instead of the computed scores. For the purpose of this article, we refer to the models trained in this step as advanced models. For every task, we trained a linear regression, a support vector regression, a k-nearest neighbors regression, and an XGBoost model [18]. This yielded four models for each of the following tasks: (1) predicting the EPIC Urinary Summary score using all IPSS questions; (2) predicting the EPIC Urinary Irritative/Obstructive score using all IPSS questions; (3) predicting the total IPSS using all EPIC questions that are used for the computation of the EPIC Urinary subscale; and (4) predicting the total IPSS using only the most relevant EPIC questions that are used for the computation of the EPIC Urinary subscale.
Questions were considered relevant if the authors considered the content of the question to be reflected in one or multiple questions of the IPSS. We selected questions 6d, e, and f, which ask about weak urine stream or incomplete emptying, waking up to urinate, and the need to urinate frequently during the day.
All models were trained and internally validated using fourfold cross-validation, and the mean absolute error (MAE) was used for scoring. Hyperparameter tuning was performed using a randomized search with 250 iterations each, and the respective ranges can be found in the code (see below) [19].
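The cross-validation and scoring procedure can be illustrated with a dependency-free sketch for a single-input linear model. The study used scikit-learn, so this is not the authors' code: the function names, the assignment of folds by index modulo four, and the example scores are assumptions, and hyperparameter tuning is left out.

```python
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares with one input; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validated_mae(
    x: Sequence[float], y: Sequence[float], n_folds: int = 4
) -> List[float]:
    """Return one validation MAE per fold.

    Folds are assigned by index modulo n_folds (an assumption made for
    simplicity; the fold assignment used in the study is not described).
    """
    fold_maes = []
    for fold in range(n_folds):
        train_x = [xi for i, xi in enumerate(x) if i % n_folds != fold]
        train_y = [yi for i, yi in enumerate(y) if i % n_folds != fold]
        val_x = [xi for i, xi in enumerate(x) if i % n_folds == fold]
        val_y = [yi for i, yi in enumerate(y) if i % n_folds == fold]
        slope, intercept = fit_line(train_x, train_y)
        predictions = [intercept + slope * xi for xi in val_x]
        fold_maes.append(mean_absolute_error(val_y, predictions))
    return fold_maes


# Made-up scores (not study data): a linear trend plus small alternating noise.
ipss = list(range(0, 24, 2))
epic = [95 - 2 * s + (1 if i % 2 else -1) for i, s in enumerate(ipss)]
fold_maes = cross_validated_mae(ipss, epic)
print(len(fold_maes), sum(fold_maes) / len(fold_maes))
```

Averaging the per-fold MAEs gives a single internal validation score that can be compared across candidate models.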
Data preprocessing, analysis, and visualization were performed with Python (version 3.9.7) using the numpy (version 1.20.3), pandas (version 1.3.4), scikit-learn (version 0.24.2), matplotlib (version 3.4.3), and seaborn (version 0.11.2) packages.
The full dataset, notebook, environment file, and trained models have been uploaded to a public repository (https://github.com/windisch-paul/EPIC-IPSS-converter).
Some of the patients in the dataset have also been analyzed in another publication on the correlation between dose-volume histogram parameters and quality of life in patients with prostate cancer treated with surgery and radiotherapy [20].
Testing for significant differences between the training and the external validation data was performed using the Mann–Whitney U test.
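As an illustration of the test named above, the following sketch computes the Mann–Whitney U statistic by pair counting and a two-sided p-value from the normal approximation. It is dependency-free and makes simplifying assumptions that a statistics package would handle properly: no tie correction of the variance and no continuity correction. The function name `mann_whitney_u` and the example values are hypothetical, and the text does not state which implementation the study used.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic of x, approximate two-sided p-value).

    U counts the pairs in which a value from x exceeds a value from y;
    ties count as half a pair. The p-value uses the normal approximation
    without tie or continuity correction, so it is only a rough guide.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Made-up scores for two cohorts (not study data).
training = [82.0, 90.5, 75.0, 95.0, 88.0]
external = [70.0, 73.5, 65.0, 80.0]
u_stat, p_value = mann_whitney_u(training, external)
print(u_stat, p_value)
```

For reporting purposes, an established implementation with exact or tie-corrected p-values would be preferable to this approximation.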
Institutional review board approval was obtained from the ethical review committee of the canton of Zurich (Kantonale Ethikkommission) for a project (project number: BASEC 2020–02112) to analyze the effects and side effects of radiotherapy at our institution (ClinicalTrials.gov Identifier: NCT05192876, link: https://clinicaltrials.gov/study/NCT05192876). Written informed consent for the analysis of anonymized clinical and imaging data was obtained from all patients, and all data were gathered in accordance with the World Medical Association Declaration of Helsinki: Research involving human subjects.
Results
Selected patient characteristics and their distributions are visualized in Fig. 1. The median age at the survey was 72.5 years for the training set (range: 53.5 to 85.7 years, standard deviation: 6.9 years) and 69.2 years for the external validation set (range: 53.2 to 82.7 years, standard deviation: 6.9 years). The median IPSS was 6 for the training set (range: 0 to 28, standard deviation: 5.6) and 7 for the external validation set (range: 2 to 32, standard deviation: 8.9). The median EPIC Urinary Summary score was 82 for the training set (range: 22.9 to 100, standard deviation: 16.2) and 73.6 for the external validation set (range: 37.5 to 82.7, standard deviation: 12.4). The median EPIC Urinary Irritative/Obstructive score was 89.3 for the training set (range: 35.7 to 100, standard deviation: 12.4) and 82.1 for the external validation set (range: 39.3 to 92.9, standard deviation: 15.3).
There were significant differences between the training and the external validation set in terms of the EPIC Urinary Summary and Irritative/Obstructive scores (both p-values < 0.001) but not in terms of age and IPSS scores (p-values: 0.26 and 0.05).
We observed a strong negative correlation between the IPSS and both the EPIC Urinary Summary and Irritative/Obstructive subscales, with absolute Pearson correlation coefficients (PCCs) between 0.71 and 0.88.
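For readers less familiar with the PCC, the following sketch shows how it is computed from paired scores. The function name `pearson_r` and the example values are hypothetical; the toy data are perfectly linear, so they yield a coefficient of essentially -1 rather than the values observed in the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length and at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("neither sample may be constant")
    return covariance / math.sqrt(var_x * var_y)


# Made-up paired scores (not study data): IPSS rises as the EPIC score falls.
ipss = [0, 5, 10, 20]
epic = [100, 90, 80, 60]
print(pearson_r(ipss, epic))
```

The sign is negative because the two instruments are scored in opposite directions, which is why the text reports absolute values.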
The performance of the baseline models is depicted in Table 1 and Fig. 2. When using the IPSS as an input, predicting the EPIC Urinary Irritative/Obstructive subscale was more accurate than predicting the EPIC Urinary Summary with mean absolute errors on the external validation set of 6.12 and 10.45, respectively. Conversely, predicting the IPSS was more accurate when using the EPIC Urinary Irritative/Obstructive subscale as an input compared to using the EPIC Urinary Summary with mean absolute errors on the external validation set of 3.88 and 5.13, respectively.
The following equations were obtained:
The performance of the advanced models is depicted in Table 2 and Figs. 3 and 4. Using all IPSS questions as separate inputs with different model architectures instead of the total score as a single input resulted in only a minor performance improvement when predicting the EPIC Urinary Summary. The mean absolute error of the best advanced model on the external validation set was 9.29 compared to 10.45 for the corresponding baseline model.
For predicting the EPIC Urinary Irritative/Obstructive subscale, using all IPSS questions did not result in improved performance. The mean absolute error of the best advanced model on the external validation set was 6.36 compared to 6.12 for the corresponding baseline model.
Using all EPIC Urinary subscale questions with different model architectures resulted in a mean absolute error of 3.79 on the external validation set, which was an improvement over using only the EPIC Urinary Summary (MAE = 5.13) but only a minor improvement over using only the EPIC Urinary Irritative/Obstructive subscale (MAE = 3.88).
Using only relevant questions did not result in improved performance.
Discussion
Our study shows that predicting the IPSS from the EPIC is feasible, especially if the raw questions or the EPIC Urinary Irritative/Obstructive subscale is used as an input. Trying to predict the IPSS using the EPIC Urinary Summary is less accurate, which was to be expected considering that other factors beyond obstructive symptoms influence the Urinary Summary. Blanker and colleagues established a minimal important difference (MID) of 5.2 (95% CI 3.9 to 6.4) for the IPSS in a Dutch cohort [21]. Both of our baseline models’ mean absolute errors on the external validation set were below that threshold, with an MAE of 5.13 for the model that used the EPIC Urinary Summary as an input and an MAE of 3.88 for the model that used the EPIC Urinary Irritative/Obstructive subscale. However, it should be noted that the maximum absolute errors for a single patient in the external validation set were 20.27 and 14.39, respectively (Table 1). Therefore, while we believe that our equations can be used to compare trial populations, they should not be used to guide clinical decision-making in individual patients. Also, it should be noted that an older publication by Barry et al. suggested a lower MID of 3.1 [22].
Conversely, predicting the EPIC Urinary Irritative/Obstructive subscale using the IPSS was more accurate than predicting the EPIC Urinary Summary with MAEs of 6.12 and 10.45, respectively. Umbehr and colleagues have suggested an MID of 10 for the urinary domain of the EPIC using its German version [23]. Here again, while the MAEs were at or well below this level, the maximum absolute error in single patients in the external validation set was higher.
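The distinction between the average error and the error in a single patient can be made concrete with a small sketch. The names `ErrorSummary` and `summarize_errors` and the example values are hypothetical; only the MID of 5.2 for the IPSS is taken from the text [21].

```python
from typing import NamedTuple, Sequence


class ErrorSummary(NamedTuple):
    mae: float
    max_abs_error: float
    mae_below_mid: bool


def summarize_errors(
    y_true: Sequence[float], y_pred: Sequence[float], mid: float
) -> ErrorSummary:
    """Summarize prediction errors and compare the MAE with a given MID."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mae = sum(abs_errors) / len(abs_errors)
    return ErrorSummary(mae, max(abs_errors), mae < mid)


# Made-up observed and predicted IPSS totals (not study data).
observed = [4, 10, 25, 7]
predicted = [6.0, 9.0, 13.0, 8.0]
summary = summarize_errors(observed, predicted, mid=5.2)
print(summary.mae, summary.max_abs_error, summary.mae_below_mid)  # 4.0 12.0 True
```

In this toy example the MAE stays below the MID although one prediction is off by 12 points, which is the pattern that argues against using the conversion for individual patients.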
The fact that more sophisticated model architectures did not result in a relevant performance improvement in this study seems reasonable, considering the already high degree of correlation between the scores that could very well be modeled using a linear regression. In addition, the training set might have been too small for more complex model architectures to identify patterns.
We have included the equations obtained by the baseline models in an online converter for other researchers to use at https://www.epic-ipss-converter.com/.
In addition to comparing results across studies with different questionnaires, the models could also be used for quality assurance in studies where patients have completed both questionnaires at the same time point: If the result of one questionnaire deviates a lot from the value that was predicted based on the response to the other questionnaire, this might warrant further investigation.
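The quality-assurance idea can be sketched as a simple discrepancy check. The function name `flag_discrepancies`, the example values, and the threshold of 10 points are all hypothetical; the text does not propose a specific cutoff, and the predicted values would come from whichever conversion equation is used.

```python
from typing import List, Sequence


def flag_discrepancies(
    observed: Sequence[float], predicted: Sequence[float], threshold: float
) -> List[int]:
    """Return the positions where observed and predicted scores differ
    by more than the threshold."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted need the same length")
    return [
        index
        for index, (obs, pred) in enumerate(zip(observed, predicted))
        if abs(obs - pred) > threshold
    ]


# Made-up IPSS totals: reported by patients vs. predicted from the EPIC.
observed_ipss = [5, 20, 8]
predicted_ipss = [6.5, 7.0, 9.0]
print(flag_discrepancies(observed_ipss, predicted_ipss, threshold=10))  # [1]
```

A flagged response is only a prompt for a closer look, for example at data entry, and not evidence that either questionnaire was completed incorrectly.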
The strengths of this study include the dedicated collection of QoL data, the fact that patients completed both questionnaires at the same point in time, and rigorous preprocessing, which means that patients with missing values were dropped instead of relying on imputations. Also, an external validation set from an institution in a different country than the training cohort was used, which would have highlighted issues with overfitting. Furthermore, a broad range of EPIC and IPSS values was present in both the training and the external validation set.
Limitations of this study include the fact that the German versions of the questionnaires were used and that results for other languages might differ. However, at least for the English versions, this concern is mitigated by the validation processes that the German translations underwent [9, 23]. In addition, we would have preferred to conduct additional validations in previously published studies that used both questionnaires but did not find a study with patient-level data uploaded to a public repository.
As an outlook, we believe that improvements in model performance could be achieved by using training data that spans the full range of scores on both questionnaires. While our cohort already represented a variety of scores, no person in the training set had an IPSS greater than 28 or an EPIC Urinary Summary lower than 22, which might have limited the performance of the model in people with very low prostate-related QoL. Therefore, we suggest caution when deploying the model to populations whose mean scores are close to or even beyond the aforementioned thresholds.
Future studies could also investigate the possibility of converting between other popular disease-specific quality of life instruments, such as the Functional Assessment of Cancer Therapy-Prostate (FACT-P) [24].
Lastly, the possibility of converting between the questionnaires does, of course, not replace the need to carefully consider which instrument is the most appropriate when designing new studies.
Conclusions
Linear regressions can be used to convert between the IPSS and the EPIC Urinary subscales. More complex model architectures and using the raw answers to the questions did not provide a meaningful performance benefit in this study. While the results of this study can be used to compare results across clinical trials, they should not be used to inform clinical decision-making in individual patients.
Availability of data and materials
All data and code used to obtain the results of this study have been uploaded to https://github.com/windisch-paul/EPIC-IPSS-converter.
References
Atherton PJ, Sloan JA. Rising importance of patient-reported outcomes. Lancet Oncol. 2006;7:883–4.
Rivera SC, Kyte DG, Aiyegbusi OL, Slade AL, McMullan C, Calvert MJ. The impact of patient-reported outcome (PRO) data from clinical trials: a systematic review and critical analysis. Health Qual Life Outcomes. 2019;17:156.
Vertosick EA, Vickers AJ, Cowan JE, Broering JM, Carroll PR, Cooperberg MR. Interpreting patient reported urinary and sexual function outcomes across multiple validated instruments. J Urol. 2017;198:671–7.
Slevin F, Zattoni F, Checcucci E, Cumberbatch MGK, Nacchia A, Cornford P, et al. A systematic review of the efficacy and toxicity of brachytherapy boost combined with external beam radiotherapy for Nonmetastatic prostate cancer. Eur Urol Oncol. 2023. https://doi.org/10.1016/j.euo.2023.11.018.
Spratt DE, Shore N, Sartor O, Rathkopf D, Olivier K. Treating the patient and not just the cancer: therapeutic burden in prostate cancer. Prostate Cancer Prostatic Dis. 2021;24:647–61.
Martin NE. Now You’re Speaking My Language: Getting Patient-reported Outcomes to Talk to One Another. Eur Urol. 2019;75:731–2.
Edwards PJ, Roberts I, Clarke MJ, Diguiseppi C, Wentz R, Kwan I, et al. Methods to increase response to postal and electronic questionnaires. Cochrane Database Syst Rev. 2009;2009:MR000008.
Wei JT, Dunn RL, Litwin MS, Sandler HM, Sanda MG. Development and validation of the expanded prostate cancer index composite (EPIC) for comprehensive assessment of health-related quality of life in men with prostate cancer. Urology. 2000;56:899–905.
Badía X, García-Losa M, Dal-Ré R. Ten-language translation and harmonization of the International Prostate Symptom Score: developing a methodology for multinational clinical trials. Eur Urol. 1997;31:129–40.
Gewanter RM, Sandhu JS, Tin AL, Gross JP, Mazzarella K, Urban J, et al. Assessment of patients with prostate cancer and their understanding of the international prostate symptom score questionnaire. Adv Radiat Oncol. 2023;8:101200.
Malik R, Jani AB, Liauw SL. External beam radiotherapy for prostate cancer: urinary outcomes for men with high International Prostate Symptom Scores (IPSS). Int J Radiat Oncol Biol Phys. 2011;80:1080–6.
Tree AC, Ostler P, van der Voet H, Chu W, Loblaw A, Ford D, et al. Intensity-modulated radiotherapy versus stereotactic body radiotherapy for prostate cancer (PACE-B): 2-year toxicity results from an open-label, randomised, phase 3, non-inferiority trial. Lancet Oncol. 2022;23:1308–20.
Kishan AU, Ma TM, Lamb JM, Casado M, Wilhalme H, Low DA, et al. Magnetic Resonance Imaging-Guided vs Computed Tomography-Guided Stereotactic Body Radiotherapy for Prostate Cancer: The MIRAGE Randomized Clinical Trial. JAMA Oncol. 2023;9:365–73.
Viitala A, Anttinen M, Wright C, Virtanen I, Mäkelä P, Hovinen T, et al. Magnetic resonance imaging-guided transurethral ultrasound ablation for benign prostatic hyperplasia: 12-month clinical outcomes of a phase I study. BJU Int. 2022;129:208–16.
Namiki S, Takegami M, Kakehi Y, Suzukamo Y, Fukuhara S, Arai Y. Analysis Linking UCLA PCI With Expanded Prostate Cancer Index Composite: An Evaluation of Health Related Quality of Life in Japanese Men With Localized Prostate Cancer. J Urol. 2007;178:473–7.
Hedgepeth RC, Labo J, Zhang L, Wood DP Jr. Expanded Prostate Cancer Index Composite versus Incontinence Symptom Index and Sexual Health Inventory for Men to measure functional outcomes after prostatectomy. J Urol. 2009;182:221–7 discussion 227–8.
Singh K, Tin AL, Dunn RL, Kim T, Vickers AJ. Development and validation of crosswalks for patient-reported sexual and urinary outcomes between commonly used instruments. Eur Urol. 2019;75:723–30.
Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016. pp. 785–94.
Bergstra J, Bengio Y. Random search for hyper-parameter optimization. J Mach Learn Res. 2012;13(2).
Hanke L, Tang H, Schröder C, Windisch P, Kudura K, Shelan M, et al. Dose-volume histogram parameters and quality of life in patients with prostate cancer treated with surgery and high-dose volumetric-intensity-modulated arc therapy to the prostate bed. Cancers. 2023;15:3454.
Blanker MH, Alma HJ, Devji TS, Roelofs M, Steffens MG, van der Worp H. Determining the minimal important differences in the International Prostate Symptom Score and Overactive Bladder Questionnaire: results from an observational cohort study in Dutch primary care. BMJ Open. 2019;9:e032795.
Barry MJ, Fowler FJ Jr, O’Leary MP, Bruskewitz RC, Holtgrewe HL, Mebust WK, et al. The American Urological Association symptom index for benign prostatic hyperplasia. The measurement committee of the American urological association. J Urol. 1992;148:1549–57 discussion 1564.
Umbehr MH, Bachmann LM, Poyet C, Hammerer P, Steurer J, Puhan MA, et al. The German version of the Expanded Prostate Cancer Index Composite (EPIC): translation, validation and minimal important difference estimation. Health Qual Life Outcomes. 2018;16:36.
Ternov KK, Nolsøe AB, Bratt O, Fode M, Lindberg H, Kistorp C, et al. Quality of life in men with metastatic castration-resistant prostate cancer treated with enzalutamide or abiraterone: a systematic review and meta-analysis. Prostate Cancer Prostatic Dis. 2021;24:948–61.
Acknowledgements
Not applicable
Funding
No funding was received for this project.
Author information
Contributions
Conceptualization, P.W., R.F.; methodology, P.W.; formal analysis, P.W., I.B., H.T., C.S.; data curation, P.W., I.B.; writing—original draft preparation, P.W., I.B.; writing—review and editing, H.T., C.S., A.B., D.M.A., D.R.Z., R.F., M.S.; supervision, R.F., M.S.; project administration, D.R.Z.; All authors read and approved the final manuscript.
Ethics declarations
Consent for publication
Not applicable.
Competing interests
P.W. has a patent application titled ‘Method for detection of neurological abnormalities’ outside of the submitted work. The remaining authors declare no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Windisch, P., Becker, I., Tang, H. et al. Converting between the International Prostate Symptom Score (IPSS) and the Expanded Prostate Cancer Index Composite (EPIC) urinary subscales: modeling and external validation. BMC Urol 24, 28 (2024). https://doi.org/10.1186/s12894-024-01421-y
DOI: https://doi.org/10.1186/s12894-024-01421-y