- Research article
- Open Access
Review by urological pathologists improves the accuracy of Gleason grading by general pathologists
BMC Urology volume 15, Article number: 70 (2015)
Urologists use biopsy Gleason scores for patient counseling, prognosis prediction, and decision making. The accuracy of Gleason grading is very important. However, the variability of Gleason grading between general pathologists cannot be overlooked. Here we evaluate the discrepancy in the Gleason grading between 2 urologic pathologists and general pathologists as well as improvement in the accuracy of Gleason grading by general pathologists as a result of review by urologic pathologists.
The subjects enrolled in the study were 755 patients who underwent prostate needle biopsy at affiliate hospitals of Nara Medical University over a period of 2 years. The biopsy samples were diagnosed by general pathologists. All biopsy samples were sent to Nara Medical University where they were diagnosed by 2 urologic pathologists. The results were then returned to the general pathologists. We compared the diagnostic accuracy of the general pathologists with that of the urologic pathologists for the parameters of no malignancy, atypical small acinar proliferation, high grade prostatic intraepithelial neoplasia and Gleason score (6, 3 + 4, 4 + 3 and 8–10). We then evaluated the concordance rate between the general and urologic pathologists for each of four consecutive 6-month periods.
The overall concordance rate of urologic pathologists and general pathologists in the first, second, third and last 6-month periods was 71.8 % (140/198), 79.8 % (168/225), 89.7 % (166/185) and 89.9 % (133/148), respectively. The concordance rate of the Gleason score between urologic pathologists and general pathologists in the first, second, third and last 6-month periods was 47.5 % (38/80), 62.6 % (57/91), 76.9 % (50/65) and 78.7 % (48/61), respectively, and the kappa value was 0.55, 0.68, 0.81 and 0.84, respectively. The concordance rate improved significantly over the course of the four periods (P = 0.04).
The concordance rate of Gleason grading between the general pathologists and the urologic pathologists was 47.5 % in the first 6-month period. However, the concordance rate improved following review by the urologic pathologists.
Gleason et al introduced the Gleason grading system for prostate cancer in 1966, and it was modified in 1974 and 1977. Gleason grading is now accepted as the international standard for pathological grading in prostate cancer. In Japan, Gleason grading was introduced for clinical and pathological studies of prostate cancer in 2001; since then, it has been the standard for the pathological classification of prostate cancer. Previously reported studies demonstrated the ability of the Gleason grading system to serve as a predictor of the final pathological stage and prognosis [5–7]. Generally, urologists use biopsy Gleason scores (GS) for patient counseling, prognosis prediction, and decision making. The accuracy of Gleason grading is therefore very important; however, several studies have described interobserver variability [8, 9]. The variability of Gleason grading between general pathologists cannot be overlooked [8, 10]. To reduce this variability, the International Society of Urological Pathology (ISUP) convened a consensus conference on the Gleason grading of prostatic carcinoma in 2005. The 2005 ISUP modified Gleason system is considered the currently accepted version of Gleason grading [12, 13].
In this study, we evaluated discrepancies in Gleason grading between urological pathologists and general pathologists. We also sought to evaluate the impact of review by urological pathologists on the accuracy of Gleason grading by general pathologists.
Between April 2006 and March 2008, we enrolled 755 patients who underwent prostate needle biopsy at 2 hospitals affiliated with Nara Medical University. Approval for the study was obtained from the Nara Medical University Hospital Institutional Review Board. We obtained written informed consent from each enrolled patient before biopsy. Prostate-specific antigen (PSA) levels were assessed against the age-specific PSA reference range according to Ito et al. The cutoff value was 3.1 ng/mL in patients aged <65 years, 3.6 ng/mL in those aged 65–70 years, and 4.1 ng/mL in those aged ≥70 years. Biopsy was performed under transrectal ultrasonography guidance, with the number of cores (6–12) adjusted on the basis of prostate volume and age (Table 1) [14, 15]. The general pathologists at both affiliated hospitals evaluated the biopsy samples.
All biopsy samples were subsequently sent to Nara Medical University, where a urological pathological diagnosis was made by 2 experts in prostate cancer diagnosis who were blinded to the general pathologists’ evaluations. Each slide was diagnosed by both urological pathologists. When their diagnoses differed, they discussed the case and determined the final diagnosis. The results were then returned to the general pathologists, who reviewed the results given by the urological pathologists. The returned results described only GS, high grade prostatic intraepithelial neoplasia (HGPIN), atypical small acinar proliferation (ASAP), prostatitis, hypertrophy, or no malignancy (NM), and the portion of cancer in each core. This procedure was followed for all samples. We compared the diagnostic accuracy between general and urological pathologists for the parameters of no malignancy, ASAP, HGPIN, and GS (6, 3 + 4, 4 + 3 and 8–10), using the worst GS for each patient. We then evaluated the concordance rate between general and urological pathologists for each 6-month period.
We used the Kruskal–Wallis test or chi-square test to compare the distribution of each parameter across the periods. Concordance was measured on the basis of the percentage of concordance and Cohen’s kappa. Kappa values of 0.00–0.20, 0.21–0.40, 0.41–0.60, 0.61–0.80, and 0.81–1.00 represented slight, fair, moderate, substantial, and almost perfect concordance, respectively. These statistical analyses were performed using SPSS®, version 19 (SPSS Inc., Chicago, IL). Improvement of concordance over the course of the periods was estimated by the chi-square test for trend using GraphPad Prism®, version 5.01 (GraphPad Software, San Diego, CA). A p-value of <0.05 was considered to be significant.
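The two concordance measures described above can be illustrated with a minimal sketch. It assumes one pair of category labels per sample; the function names `cohens_kappa` and `agreement_band` and the example label pairs are hypothetical and are not study data. Only the verbal bands and the four kappa values passed to `agreement_band` at the end come from the text.

```python
from collections import Counter


def cohens_kappa(pairs):
    """Cohen's kappa for a list of (general, urological) category labels."""
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one rated sample is required")
    # Observed agreement: fraction of samples with identical labels.
    observed = sum(1 for a, b in pairs if a == b) / n
    # Chance agreement: sum over categories of the product of both raters' marginals.
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Verbal band for a kappa value, using the cut points quoted in the text."""
    if kappa < 0.0:
        # Assumption: the text defines no band below 0.
        return "below chance"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical label pairs (NOT study data): worst GS category per patient.
example_pairs = (
    [("6", "6")] * 20 + [("6", "3+4")] * 8 + [("3+4", "3+4")] * 15
    + [("3+4", "4+3")] * 5 + [("4+3", "4+3")] * 6 + [("8-10", "8-10")] * 10
)
kappa = cohens_kappa(example_pairs)
print(f"kappa = {kappa:.2f} ({agreement_band(kappa)})")

# Kappa values reported for the four 6-month periods.
for value in (0.55, 0.68, 0.81, 0.84):
    print(value, agreement_band(value))
```

Kappa corrects the raw percentage of concordance for the agreement expected by chance, which is why both figures are reported side by side.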
Table 2 shows patient characteristics for each 6-month period. No significant differences in age, PSA level, number of biopsy cores, or GS were noted among the 4 groups using the Kruskal–Wallis test or the chi-square test.
In the first period, the overall concordance rate of urological pathologists and general pathologists was 71.8 % (140/198 samples; Table 3). The urological pathologists diagnosed NM in 103 samples, ASAP in 4 samples, HGPIN in 11 samples, and prostate cancer in 80 samples. The general and urological pathologists’ diagnoses were in agreement for 99 of 103 samples (96.1 %) diagnosed with NM, 1 of 4 samples (25.0 %) diagnosed with ASAP, and 9 of 11 samples (81.8 %) diagnosed with HGPIN. For 38 of the 80 prostate cancer samples (47.5 %), the general and urological pathologists’ GS diagnoses were in agreement, and the kappa value was 0.55. The general pathologists undergraded 35.1 % (27/80) of the samples and overgraded 18.1 % (14/80) (Table 4). The general pathologists diagnosed 120 patients with NM and 5 patients with ASAP. Nine of the patients diagnosed with NM and 2 of those diagnosed with ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists.
In the second period, the overall concordance rate of urological and general pathologists was 79.8 % (168/225 samples; Table 5). The urological pathologists diagnosed NM in 126 samples, ASAP in 2 samples, HGPIN in 6 samples, and prostate cancer in 91 samples. The general and urological pathologists’ diagnoses were in agreement for 118 of 126 samples (93.7 %) diagnosed with NM, 2 of 2 samples (100 %) diagnosed with ASAP, and 1 of 6 samples (16.7 %) diagnosed with HGPIN. For 57 of the 91 prostate cancer samples (62.6 %), the general and urological pathologists’ GS diagnoses were in agreement, and the kappa value was 0.68. The general pathologists undergraded 23.1 % (21/91) of the samples and overgraded 14.3 % (13/91) (Table 4). The general pathologists diagnosed 125 patients with NM and 13 patients with ASAP. Two of the patients diagnosed with NM and 4 of those diagnosed with ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists.
In the third period, the overall concordance rate of urological and general pathologists was 89.7 % (166/185 samples; Table 6). The urological pathologists diagnosed NM in 115 samples, ASAP in 1 sample, HGPIN in 2 samples, and prostate cancer in 65 samples. The general and urological pathologists’ diagnoses were in agreement for 115 of 115 samples (100 %) diagnosed with NM, 1 of 1 sample (100 %) diagnosed with ASAP, and 0 of 2 samples (0 %) diagnosed with HGPIN. For 50 of the 65 prostate cancer samples (76.9 %), the general and urological pathologists’ GS diagnoses were in agreement, and the kappa value was 0.81. The general pathologists undergraded 11.9 % (8/65) of the samples and overgraded 14.1 % (9/65) (Table 4). The general pathologists diagnosed 117 patients with NM and 2 patients with ASAP. None of the patients diagnosed with NM and 1 of those diagnosed with ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists.
In the last period, the overall concordance rate of urological and general pathologists was 89.9 % (133/148 samples; Table 7). The urological pathologists diagnosed NM in 85 samples, ASAP in 1 sample, HGPIN in 1 sample, and prostate cancer in 61 samples. The general and urological pathologists’ diagnoses were in agreement for 84 of 85 samples (98.8 %) diagnosed with NM, 1 of 1 sample (100 %) diagnosed with ASAP, and 0 of 1 sample (0 %) diagnosed with HGPIN. For 48 of the 61 prostate cancer samples (78.7 %), the general and urological pathologists’ GS diagnoses were in agreement, and the kappa value was 0.84. The general pathologists undergraded 16.4 % (10/61) of the samples and overgraded 4.9 % (3/61) (Table 4). The general pathologists diagnosed 86 patients with NM and 3 patients with ASAP. One of the patients diagnosed with NM and 1 of those diagnosed with ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists.
The kappa value increased with time, and the concordance rate improved significantly across the four periods (p = 0.04).
Fifty-three patients were diagnosed with prostate cancer by urological pathologists on one positive core, and 243 patients were diagnosed with prostate cancer on two or more positive cores. Discrepancy between general and urological pathologists was found in 30 of 53 patients (56.6 %) and 76 of 243 patients (31.3 %), respectively (p < 0.01).
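The comparison above can be approximated with a Pearson chi-square test on a 2×2 table. The sketch below is an assumption about the form of the test, since the text does not state which test was used here or whether a continuity correction was applied. The function name `chi_square_2x2` is hypothetical; only the four counts come from the paragraph above.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        # Optional Yates continuity correction.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper tail of chi-square is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: one positive core, two or more positive cores.
# Columns: discrepant, not discrepant (53 - 30 = 23 and 243 - 76 = 167).
stat, p_value = chi_square_2x2(30, 23, 76, 167)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

With these counts the uncorrected statistic is about 12.1 on 1 degree of freedom, which is consistent with the reported p < 0.01.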
Biopsy GS is an important predictor of the likelihood of various final pathological stages at radical retropubic prostatectomy, and it is also a significant predictor of biochemical recurrence in patients who undergo radical prostatectomy [16, 17]. Biopsy GS is also associated with biochemical failure in those who have undergone permanent brachytherapy and external beam radiation therapy. Biopsy GS, in combination with PSA level and clinical stage, is a very important factor in decision making for initial therapy. However, several studies have described interobserver variability in Gleason grading [8, 9]. In particular, the variability in Gleason grading between general pathologists should not be overlooked [8, 10]. Burchardt et al. demonstrated that 29 German pathologists who analyzed a series of tissue microarray images showed 45.7 % concordance with biopsy GS assigned by an expert. Coard et al. reported 67 % overall concordance between anatomical pathologists and an experienced pathologist for consensus on prostate cancer GS. In the present study, the GS concordance between the general pathologists and the urological pathologists was 47.5 % and the kappa value was 0.55 in the first 6-month period. This level of concordance was not acceptable and was similar to the results of previous studies. These discrepancies may have been caused by (1) a sampling effect caused by tumor heterogeneity, (2) interpretational bias, or (3) the small volume of tissue for cancer biopsy [10, 21]. In the present study, patients who were diagnosed with prostate cancer on one positive core tended to be misdiagnosed more often than those who were diagnosed on two or more cores. Another reason for discordance may have been that the general pathologists did not refer to the 2005 ISUP consensus conference on the Gleason grading of prostatic carcinoma.
To reduce this discrepancy, Mikami et al used a 40-min educational lecture or a tutorial with an anatomical atlas. In the lecture group, the average concordance rates before and after the lecture were 55.7 % and 68.4 %, and the average kappa values were 0.43 and 0.67, respectively. In the atlas group, the average concordance rates before and after providing the atlas were 61.3 % and 74.5 %, and the average kappa values were 0.44 and 0.68, respectively. Allsbrook et al reported that concordance between general pathologists and urological pathologists improved to 77.4 % (kappa value = 0.73) using web-based virtual microscopy. In Egevad’s study, the proportion of correct GS improved from 70.5 % to 86.6 % after a teaching set of 40 images illustrating GS was distributed among 85 pathologists. The present study demonstrated an improvement in the accuracy of general pathologists’ GS after review by 2 urological pathologists. The rate of agreement and the kappa value increased over successive periods, improving from an initial 47.5 % (kappa value = 0.55) to a final 78.7 % (kappa value = 0.84). Furthermore, the rate of concordance was already high in the third period and remained high in the last period. Thus, approximately one year of review may be needed for this method to improve GS accuracy.
It is well known that general pathologists tend to underestimate GS [8, 10, 20, 24]. Coard et al. reported that anatomical pathologists undergraded 25.6 % of all biopsy specimens and overgraded 6.7 %, whereas Burchardt et al. reported that the rates of undergrading and overgrading were 38.9 % and 15.4 %, respectively. Similar to our report, Barqawi et al. evaluated differences between outside pathologists and the pathologists at their own institution; with respect to radical prostatectomy specimens, Gleason undergrading occurred in 46 % of outside diagnoses and 38 % of their institution’s diagnoses. The corresponding rates of undergrading and overgrading in the first period of the present study were 35.1 % and 18.1 %, respectively, showing that general pathologists tended to undergrade, as in other reports. Undergrading was particularly common for tumors with a GS of 6 and 3 + 4 in our study. Allsbrook et al. found 47 % undergrading of tumors with GS 5–6, and 43 % undergrading of tumors with GS 7. In the present study, the undergrading of GS 7 samples most probably resulted from mistaking Gleason pattern 4 for pattern 3, and the undergrading of GS 6 tumors most probably resulted from mistaking pattern 3 for pattern 2. This is in accordance with the studies of Allsbrook et al., Burchardt et al., and Mikami et al. Thus, there is a tendency for general pathologists to underestimate GS, especially in Gleason patterns 3 and 4.
However, in the present study, the rate of undergrading decreased to 16.4 % in the last period, after the general pathologists had experienced review by the urological pathologists. Mikami et al. reported an improvement in the rate of undergrading from 36.3 % to 14.2 % after a lecture. Egevad reported improvement of undergrading by the use of reference images. These findings suggest that the tendency for general pathologists to undergrade can improve when they study GS patterns using any of the common methods. Particular improvement in undergrading among general pathologists can be expected by preventing the misidentification of Gleason pattern 3 as pattern 2 and of pattern 4 as pattern 3.
Twenty patients who were diagnosed with NM or ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists. Such a discrepancy could be fatal. In the present study, this discrepancy improved with time, from 17/263 (6.5 %) in the first and second periods to 3/208 (1.4 %) in the third and fourth periods (p = 0.01, chi-square test). Furthermore, in 14 cases (70 %) there was only one positive core. These results suggest that the discrepancy was caused by small cancer volume and interpretative error.
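The pooled counts quoted above appear to follow from the per-period figures in the Results, under the assumption that the denominator is the number of patients the general pathologists diagnosed with NM or ASAP. The sketch below recomputes them; the tuple layout and the function name `pooled_missed` are hypothetical.

```python
# Per period: (NM by general pathologists, ASAP by general pathologists,
#              NM cases diagnosed as cancer on review, ASAP cases diagnosed
#              as cancer on review), as reported in the Results.
PERIODS = [
    (120, 5, 9, 2),   # first period
    (125, 13, 2, 4),  # second period
    (117, 2, 0, 1),   # third period
    (86, 3, 1, 1),    # last period
]


def pooled_missed(periods):
    """Return (missed cancers, NM + ASAP diagnoses) summed over the periods."""
    missed = sum(nm_cancer + asap_cancer for _, _, nm_cancer, asap_cancer in periods)
    total = sum(nm + asap for nm, asap, _, _ in periods)
    return missed, total


for label, chunk in (("periods 1-2", PERIODS[:2]), ("periods 3-4", PERIODS[2:])):
    missed, total = pooled_missed(chunk)
    print(f"{label}: {missed}/{total} = {100 * missed / total:.1f} %")
```

Under this assumption the sums reproduce 17/263 and 3/208, and the four periods together account for the 20 patients mentioned.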
A limitation of this study was the inability to isolate the general pathologists from other educational sources associated with Gleason scoring over a period of 2 years. Therefore, any improvement seen may not necessarily be a direct result of the experience of the review by the urological pathologists.
The concordance rate of GS between the urological and general pathologists was initially low (47.5 %), but following the expert reviews there was a significant improvement in the concordance rate over time.
ASAP: Atypical small acinar proliferation
GS: Gleason score
HGPIN: High grade prostatic intraepithelial neoplasia
ISUP: International Society of Urological Pathology
Gleason DF. Classification of prostatic carcinoma. Cancer Chemother Rep. 1966;50:125.
Gleason DF, Mellinger GT. Prediction of prognosis for prostatic adenocarcinoma by combined histological grading and clinical staging. J Urol. 1974;111:58.
Gleason DF. Histologic grading and clinical staging of prostatic carcinoma. In: Tannenbaum M, editor. Urologic Pathology: The Prostate. Philadelphia: Lea and Febiger; 1977. p. 171.
Japanese Urological Association and the Japanese Society of Pathology, editor. General rule for clinical and pathological studies on prostate cancer. 3rd ed. Tokyo: Kanahara-Shuppan; 2001.
Oesterling JE, Brendler CB, Epstein JI, Kimball AW Jr, Walsh PC. Correlation of clinical stage, serum prostatic acid phosphatase and preoperative Gleason grade with final pathological stage in 275 patients with clinically localized adenocarcinoma of the prostate. J Urol. 1987;38:92.
Epstein JI, Pizov G, Walsh PC. Correlation of pathologic findings with progression after radical retropubic prostatectomy. Cancer. 1993;71:3582.
Partin AW, Mangold LA, Lamm DM, Walsh PC, Epstein JI, Pearson JD. Contemporary update of prostate cancer staging nomograms (Partin Tables) for the new millennium. Urology. 2001;58:843.
Allsbrook WC Jr, Mangold KA, Johnson MH, Lane RB, Lane CG, Amin MB et al. Interobserver reproducibility of Gleason grading of prostatic carcinoma: urologic pathologists. Hum Pathol. 2001;32:74.
McLean M, Srigley J, Banerjee D, Warde P, Hao Y. Interobserver variation in prostate cancer Gleason scoring: are there implications for the design of clinical trials and treatment strategies? Clin Oncol (R Coll Radiol). 1997;9:222.
Coard KC, Freeman VL. Gleason grading of prostate cancer: level of concordance between pathologists at the University Hospital of the West Indies. Am J Clin Pathol. 2004;122:373.
Epstein JI, Allsbrook WC Jr, Amin MB, Egevad LL. The 2005 International Society of Urological Pathology (ISUP) Consensus Conference on Gleason Grading of Prostatic Carcinoma. Am J Surg Pathol. 2005;29:1228.
Billis A, Quintal MM, Meirelles L, Freitas LL, Costa LB, Bonfitto JF, et al. The value of the 2005 International Society of Urological Pathology (ISUP) modified Gleason grading system as a predictor of biochemical recurrence after radical prostatectomy. Int Urol Nephrol. 2014;46:935.
Dong F, Wang C, Farris AB, Wu S, Lee H, Olumi AF, et al. Impact on the clinical outcome of prostate cancer by the 2005 international society of urological pathology modified Gleason grading system. Am J Surg Pathol. 2012;36:838.
Ito K, Ohi M, Yamamoto T, Miyamoto S, Kurokawa K, Fukabori Y, et al. The diagnostic accuracy of the age-adjusted and prostate volume-adjusted biopsy method in males with prostate specific antigen levels of 4.1-10.0 ng/mL. Cancer. 2002;95:2112.
Tanaka N, Fujimoto K, Yoshikawa M, Tanaka M, Hirao Y, Kondo H, et al. Prostatic volume and volume-adjusted prostate-specific antigen as predictive parameters for T1c prostate cancer. Hinyokika Kiyo. 2007;53:459.
Han M, Partin AW, Zahurak M, Piantadosi S, Epstein JI, Walsh PC. Biochemical (prostate specific antigen) recurrence probability following radical prostatectomy for clinically localized prostate cancer. J Urol. 2003;169:517.
Tanaka N, Fujimoto K, Hirayama A, Torimoto K, Okajima E, Tanaka M, et al. Risk-stratified survival rates and predictors of biochemical recurrence after radical prostatectomy in a Nara, Japan, cohort study. Int J Clin Oncol. 2011;16:553.
Potters L, Purrazzella R, Brustein S, Fearn P, Huang D, Leibel SA, et al. The prognostic significance of Gleason grade in patients treated with permanent prostate brachytherapy. Int J Radiat Oncol Biol Phys. 2003;56:749.
Sabolch A, Feng FY, Daignault-Newton S, Halverson S, Blas K, Phelps L, et al. Gleason pattern 5 is the greatest risk factor for clinical failure and death from prostate cancer after dose-escalated radiation therapy and hormonal ablation. Int J Radiat Oncol Biol Phys. 2011;81:e351.
Burchardt M, Engers R, Müller M, Burchardt T, Willers R, Epstein JI, et al. Interobserver reproducibility of Gleason grading: evaluation using prostate cancer tissue microarrays. J Cancer Res Clin Oncol. 2008;134:1071.
King CR, McNeal JE, Gill H, Presti JC Jr. Extended prostate biopsy scheme improves reliability of Gleason grading: implications for radiotherapy patients. Int J Radiat Oncol Biol Phys. 2004;59:386.
Mikami Y, Manabe T, Epstein JI, Shiraishi T, Furusato M, Tsuzuki T, et al. Accuracy of Gleason grading by practicing pathologists and the impact of education on improving agreement. Hum Pathol. 2003;34:658.
Egevad L. Reproducibility of Gleason grading of prostate cancer can be improved by the use of reference images. Urology. 2001;57:291.
Barqawi AB, Turcanu R, Gamito EJ, Lucia SM, O'Donnell CI, Crawford ED, et al. The value of second-opinion pathology diagnosis on prostate biopsies from patients referred for management of prostate cancer. Int J Clin Exp Pathol. 2011;4:468–75.
We are very grateful to Shuji Watanabe (Saiseikai Chuwa Hospital), Yoshinori Nakagawa (Yamato Takada Municipal Hospital) and Shuya Hirao (Hirao Hospital) for their valuable cooperation in our study.
The authors declare that they have no competing interests.
YN analysed and interpreted the data, and drafted the manuscript. NT conceived of the study and revised this manuscript. KS participated in this study as a urological pathologist and helped to carry out this study. NK participated in this study as a urological pathologist and provided valuable help on the study. MM and SA provided valuable help on the experiments. KF participated in its design and gave final approval of the version to be published. All authors read and approved the final manuscript.
Cite this article
Nakai, Y., Tanaka, N., Shimada, K. et al. Review by urological pathologists improves the accuracy of Gleason grading by general pathologists. BMC Urol 15, 70 (2015). https://doi.org/10.1186/s12894-015-0066-x
- Gleason score
- Prostate biopsy
- General pathologist
- Urological pathologist