Review by urological pathologists improves the accuracy of Gleason grading by general pathologists

Background Urologists use biopsy Gleason scores for patient counseling, prognosis prediction, and decision making, so the accuracy of Gleason grading is very important. However, the variability of Gleason grading between general pathologists cannot be overlooked. Here we evaluate the discrepancy in Gleason grading between 2 urologic pathologists and general pathologists, as well as the improvement in the accuracy of Gleason grading by general pathologists as a result of review by urologic pathologists.
Methods The subjects enrolled in the study were 755 patients who underwent prostate needle biopsy at affiliate hospitals of Nara Medical University over a period of 2 years. The biopsy samples were diagnosed by general pathologists. All biopsy samples were then sent to Nara Medical University, where they were diagnosed by 2 urologic pathologists, and the results were returned to the general pathologists. We compared the diagnostic accuracy of the general pathologists with that of the urologic pathologists for the parameters of no malignancy, atypical small acinar proliferation, high grade prostatic intraepithelial neoplasia and Gleason score (6, 3 + 4, 4 + 3 and 8–10). We then evaluated the concordance rate between the general and urologic pathologists for each of four consecutive 6-month periods.
Results The overall concordance rate of urologic pathologists and general pathologists in the first, second, third and last 6-month periods was 71.8 % (140/198), 79.8 % (168/225), 89.7 % (166/185) and 89.9 % (133/148), respectively. The concordance rate of the Gleason score between urologic pathologists and general pathologists in the first, second, third and last 6-month periods was 47.5 % (38/80), 62.6 % (57/91), 76.9 % (50/65) and 78.7 % (48/61), respectively, and the kappa value was 0.55, 0.68, 0.81 and 0.84, respectively. The concordance rate improved significantly over the four periods (P = 0.04).
Conclusion The concordance rate of Gleason grading between the general pathologists and the urologic pathologists was 47.5 % in the first period. However, the concordance rate improved over time following review by the urologic pathologists.


Background
Gleason et al. [1] introduced the Gleason grading system for prostate cancer in 1966, and it was modified in 1974 [2] and 1977 [3]. Gleason grading is now accepted as the international standard for pathological grading in prostate cancer. In Japan, Gleason grading was introduced for clinical and pathological studies of prostate cancer in 2001; since then, it has been the standard for the pathological classification of prostate cancer [4]. Previous studies demonstrated that the Gleason grading system can serve as a predictor of the final pathological stage and prognosis [5][6][7]. Generally, urologists use biopsy Gleason scores (GS) for patient counseling, prognosis prediction, and decision making. The accuracy of Gleason grading is therefore very important; however, several studies have described interobserver variability [8,9]. In particular, the variability of Gleason grading between general pathologists cannot be overlooked [8,10]. To reduce this variability, the International Society of Urological Pathology (ISUP) convened a consensus conference on the Gleason grading of prostatic carcinoma in 2005 [11]. The 2005 ISUP modified Gleason system is considered the currently accepted version of Gleason grading [12,13].
In this study, we evaluated discrepancies in Gleason grading between urological pathologists and general pathologists. We also sought to evaluate the impact of review by urological pathologists on the accuracy of Gleason grading by general pathologists.

Methods
Between April 2006 and March 2008, we enrolled 755 patients who underwent prostate needle biopsy at 2 hospitals affiliated with Nara Medical University. Approval for the study was obtained from the Nara Medical University Hospital Institutional Review Board. We obtained written informed consent from each enrolled patient before biopsy. Prostate-specific antigen (PSA) levels were assessed using the PSA age-specific reference range according to Ito et al. [14]. The cutoff value was 3.1 ng/mL in patients aged <65 years, 3.6 ng/mL in those aged 65-70 years, and 4.1 ng/mL in those aged ≥70 years. Biopsy was performed under transrectal ultrasonography, with the number of cores (6-12) adjusted on the basis of prostate volume and age (Table 1) [14,15]. The general pathologists at both affiliated hospitals evaluated the biopsy samples.
All biopsy samples were subsequently sent to Nara Medical University, where a urological pathological diagnosis was made by 2 experts in prostate cancer diagnosis who were blinded to the general pathologists' evaluations. Each slide was diagnosed by both urological pathologists. When their diagnoses differed, they discussed the case and determined the final diagnosis. The results were then returned to the general pathologists, who reviewed the results given by the urological pathologists. The reported results were limited to GS, high grade prostatic intraepithelial neoplasia (HGPIN), atypical small acinar proliferation (ASAP), prostatitis, hypertrophy, or no malignancy (NM), and the portion of cancer in each core. This procedure was followed for all samples. We compared the diagnostic accuracy between general and urological pathologists for the parameters of NM, ASAP, HGPIN, and GS (6, 3 + 4, 4 + 3 and 8-10), using the worst GS for each patient. We then evaluated the concordance rate between general and urological pathologists for each 6-month period.
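The comparison of GS categories described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `grade_agreement` and the paired grades are hypothetical and are not study data, and the urological pathologists' grade is treated as the reference when counting undergrading and overgrading.

```python
# Ordered GS categories used for the comparison (lowest to highest grade).
GS_ORDER = ["6", "3+4", "4+3", "8-10"]


def grade_agreement(general, expert):
    """Return (concordant, undergraded, overgraded) fractions.

    `general` and `expert` are equal-length lists of GS categories, one pair
    per patient. The expert (urological) grade is treated as the reference.
    """
    if len(general) != len(expert) or not general:
        raise ValueError("need two non-empty lists of equal length")
    rank = {category: i for i, category in enumerate(GS_ORDER)}
    same = under = over = 0
    for g, e in zip(general, expert):
        if rank[g] == rank[e]:
            same += 1
        elif rank[g] < rank[e]:
            under += 1  # general pathologist assigned a lower category
        else:
            over += 1   # general pathologist assigned a higher category
    n = len(general)
    return same / n, under / n, over / n


# Hypothetical paired grades for five patients (NOT study data).
general = ["6", "6", "3+4", "4+3", "8-10"]
expert = ["6", "3+4", "3+4", "3+4", "8-10"]
concordant, undergraded, overgraded = grade_agreement(general, expert)
print(f"concordant {concordant:.0%}, undergraded {undergraded:.0%}, "
      f"overgraded {overgraded:.0%}")
```

Treating the four GS groups as an ordered scale is what makes the terms undergrading and overgrading well defined; NM, ASAP and HGPIN would need separate handling and are left out of this sketch.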
We used the Kruskal-Wallis test or the chi-square test to compare the distribution of each parameter across the periods. Concordance was measured on the basis of the percentage of concordance and Cohen's kappa. Kappa values of 0.00-0.20, 0.21-0.40, 0.41-0.60, 0.61-0.80, and 0.81-1.00 represented slight, fair, moderate, substantial, and almost perfect concordance, respectively. These statistical analyses were performed using SPSS®, version 19 (SPSS Inc., Chicago, IL). Improvement of concordance over the periods was estimated by the chi-square test for trend using GraphPad Prism®, version 5.01 (GraphPad Software, San Diego, CA). A p-value of <0.05 was considered to be significant. Table 2 shows patient characteristics for each 6-month period. No significant difference was noted in age, PSA level, the number of biopsy cores, or GS between the 4 groups using the Kruskal-Wallis test or the chi-square test.
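As an illustration of the concordance measure, the following Python sketch computes Cohen's kappa from a square agreement table and maps it to the bands listed above. It is a minimal sketch, not the SPSS procedure used in the study: the function names and the 2 × 2 example table are hypothetical, and the label returned for negative kappa values is an assumption, since the bands above start at 0.00.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of cases rated category i by rater A
    and category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("empty table")
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to the bands used in this paper."""
    if kappa < 0:
        return "less than chance"  # assumption: not defined in the paper
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical 2 x 2 agreement table (NOT study data).
example = [[40, 10],
           [5, 45]]
kappa = cohen_kappa(example)
print(round(kappa, 2), agreement_label(kappa))
```

Because kappa subtracts the agreement expected by chance, it can be considerably lower than the raw percentage of concordance when one category dominates.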

Results
In the first period, the overall concordance rate of urological pathologists and general pathologists was 71.8 % (140/198 samples; Table 3) [...] (14/80) samples (Table 4). The general pathologists diagnosed 120 patients with NM and 5 patients with ASAP. Nine of the patients diagnosed with NM and 2 of those diagnosed with ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists.
In the second period, the overall concordance rate of urological and general pathologists was 79.8 % (168/225 samples; Table 5) [...] (Table 4). The general pathologists diagnosed 125 patients with NM and 13 patients with ASAP. Two of the patients diagnosed with NM and 4 of those diagnosed with ASAP were diagnosed with prostate cancer by urological pathologists.
In the third period, the overall concordance rate of urological and general pathologists was 89.7 % (166/185; Table 6). The urological pathologists diagnosed NM in 115 samples, ASAP in 1 sample, HGPIN in 2 samples, and prostate cancer in 65 samples. The general and urological pathologists' diagnoses were in agreement for 115 of 115 samples (100 %) diagnosed with NM, 1 of 1 sample (100 %) diagnosed with ASAP and 0 of 2 samples (0 %) diagnosed with HGPIN. For 50 of the 65 prostate cancer samples (76.9 %), the general and urological pathologists' GS diagnoses were in agreement, and the kappa value was 0.81. General pathologists undergraded 11.9 % (8/65) of samples and overgraded 14.1 % (9/65) of samples (Table 4). The general pathologists diagnosed 117 patients with NM and 2 patients with ASAP. None of the patients diagnosed with NM and 1 of those diagnosed with ASAP were diagnosed with prostate cancer by urological pathologists.
In the last period, the overall concordance rate of urological and general pathologists was 89.9 % (133/148; Table 7) [...] (Table 4). The general pathologists diagnosed 86 patients with NM and 3 patients with ASAP. One of the patients diagnosed with NM and 1 of those diagnosed with ASAP were diagnosed with prostate cancer by urological pathologists.
The kappa value increased with time. The concordance rate improved significantly across the four periods (p = 0.04).
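A chi-square test for linear trend across ordered periods can be sketched as follows. This is a Cochran-Armitage type statistic with equally spaced scores (1 to 4), written with the Python standard library only. The input counts are the GS concordance counts reported for the four periods; the source does not state which counts or which variant of the test were entered into GraphPad Prism, so the output should not be read as a reproduction of the reported p-value. The function name is hypothetical.

```python
import math


def chi_square_trend(successes, totals, scores=None):
    """Chi-square test for linear trend in proportions (1 degree of freedom).

    successes[i] / totals[i] is the proportion in ordered group i.
    Returns (chi2, p_value).
    """
    if scores is None:
        scores = list(range(1, len(totals) + 1))  # assumed equal spacing
    n = sum(totals)
    p_bar = sum(successes) / n  # pooled proportion
    # Score-weighted sum of observed minus expected successes.
    t = sum(x * (s - m * p_bar)
            for x, s, m in zip(scores, successes, totals))
    s_xx = (sum(m * x * x for x, m in zip(scores, totals))
            - sum(m * x for x, m in zip(scores, totals)) ** 2 / n)
    variance = p_bar * (1 - p_bar) * s_xx
    if variance == 0:
        raise ValueError("trend statistic is undefined for these data")
    chi2 = t * t / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# GS concordance counts per 6-month period as reported in the text.
concordant = [38, 57, 50, 48]
graded = [80, 91, 65, 61]
chi2, p_value = chi_square_trend(concordant, graded)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

The statistic tests whether the proportion changes steadily with the period score, which is a more specific question than whether the four proportions differ at all.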
Fifty-three patients were diagnosed with prostate cancer by urological pathologists on the basis of one positive core, and 243 patients on the basis of two or more positive cores. Discrepancies between general and urological pathologists were found in 30 of 53 patients (56.6 %) and 76 of 243 patients (31.3 %), respectively (p < 0.01).
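The comparison of discrepancy rates between the two groups can be illustrated with a Pearson chi-square test on a 2 × 2 table. This is a sketch under assumptions: the source does not name the test used for this comparison, no continuity correction is applied here, and the function name is hypothetical. The counts are those given in the text.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: one positive core, two or more positive cores.
# Columns: discrepant, concordant (counts taken from the text).
chi2, p_value = chi_square_2x2(30, 53 - 30, 76, 243 - 76)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```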

Discussion
Biopsy GS is an important predictor of the likelihood of various final pathological stages at radical retropubic prostatectomy [7], and it is also a significant predictor of biochemical recurrence in patients who undergo radical prostatectomy [16,17]. Biopsy GS is also associated with biochemical failure in those who have undergone permanent brachytherapy [18] and external beam radiation therapy [19]. Biopsy GS, in combination with PSA level and clinical stage, is a very important factor in decision making for initial therapy. However, several studies have described interobserver variability in Gleason grading [8,9]. In particular, the variability in Gleason grading between general pathologists should not be overlooked [8,10]. Burchardt [...] overall concordance between anatomical pathologists and an experienced pathologist for consensus on prostate cancer GS [10]. In the present study, the concordance of GS between general pathologists and the urological pathologists was 47.5 % and the kappa value was 0.55 in the first 6-month period. This was not an acceptable concordance and was similar to the results of previous studies. These discrepancies may have been caused by (1) a sampling effect caused by tumor heterogeneity, (2) interpretational bias, or (3) the small volume of tissue for cancer biopsy [10,21]. In the present study, patients who were diagnosed with prostate cancer on one positive core tended to be misdiagnosed more often than those who were diagnosed on two or more cores. Another reason for discordance may have been that the general pathologists did not refer to the 2005 ISUP consensus conference on the Gleason grading of prostatic carcinoma [11].
To reduce this discrepancy, Mikami et al. [22] used a 40-min educational lecture or a tutorial with an anatomical atlas. In the lecture group, the average concordance rates before and after the lecture were 55.7 % and 68.4 %, and the average kappa values were 0.43 and 0.67, respectively. In the atlas group, the average concordance rates before and after providing the atlas were 61.3 % and 74.5 %, and the average kappa values were 0.44 and 0.68, respectively. Allsbrook et al. [8] reported that concordance between general pathologists and urological pathologists improved to 77.4 % (kappa value = 0.73) by [...] [20]. Similar to our study, Barqawi et al. evaluated differences between outside pathologists and pathologists at their own institution; with respect to radical prostatectomy specimens, Gleason undergrading occurred in 46 % of outside diagnoses and 38 % of their own institution's diagnoses [24]. The corresponding values in the first period in the present study were 35.1 % and 18.1 %, respectively, showing that general pathologists tended to undergrade, as in other reports. Undergrading was particularly common for tumors with a GS of 6 and 3 + 4 in our study. Allsbrook et al. found 47 % undergrading of tumors with GS 5-6, and 43 % undergrading of tumors with GS 7 [8]. In the present study, the undergrading of GS 7 tumors most probably resulted from mistaking Gleason pattern 4 for pattern 3, and the undergrading of GS 6 tumors most probably resulted from mistaking pattern 3 for pattern 2. This is in accordance with the studies of Allsbrook et al. [8], Burchardt et al. [20], and Mikami et al. [22]. Thus, there is a tendency for general pathologists to underestimate GS, especially in Gleason patterns 3 and 4.
However, in the present study, the rate of undergrading decreased to 16.4 % in the last period, after the general pathologists had experienced review by the urological pathologists. Mikami et al. reported an improvement in the rate of undergrading from 36.3 % to 14.2 % after a lecture [22]. Egevad reported improvement of undergrading through the use of reference images [23]. These findings suggest that the tendency for general pathologists to undergrade can improve when they study GS patterns using any of the common methods. Particular improvement in undergrading among general pathologists can be expected by preventing the mistaking of Gleason pattern 3 for pattern 2 and of pattern 4 for pattern 3. Twenty patients who were diagnosed with NM or ASAP by general pathologists were diagnosed with prostate cancer by urological pathologists. Such a discrepancy could be fatal. This discrepancy improved with time, from 17/263 (6.5 %) in the first and second periods to 3/208 (1.4 %) in the third and fourth periods (p = 0.01, chi-square test). Furthermore, in 14 of these 20 cases (70 %), only one core was positive. These results suggest that the discrepancy was caused by small cancer volume and interpretative error.
A limitation of this study was the inability to isolate the general pathologists from other educational sources associated with Gleason scoring over a period of 2 years. Therefore, any improvement seen may not necessarily be a direct result of the experience of the review by the urological pathologists.

Conclusion
The concordance rate of GS between the urological and general pathologists was initially low (47.5 %), but following the expert reviews there was a significant improvement in the concordance rate over time.