The conventional Fuhrman grading system, which categorizes renal cell carcinoma (RCC) into grades I, II, III, and IV, is the most widely used predictor of RCC cancer-specific mortality (CSM).
The aim of this study was to test the prognostic ability of simplified Fuhrman grading schemes (FGSs) that rely on two- or three-tiered classifications.
Design, setting, and participants
The current study addressed a population of 14
Univariable and multivariable analyses as well as prognostic accuracy analyses were performed for various FGSs to test their ability to predict CSM rates. The conventional four-tiered FGS was compared to a modified two-tiered FGS in which grades I and II and grades III and IV were combined. A second simplified three-tiered FGS in which grades I and II were combined but grades III and IV were kept separate was also tested.
Results and limitations
The overall 5-yr CSM-free rate was 81.5%. All three FGSs achieved independent predictor status in multivariable analyses. Prognostic accuracy of multivariable models that relied on various FGSs was 83.6% for the modified two-tiered FGS and 83.8% for both the conventional four-tiered and the modified three-tiered FGS.
Our findings indicate that the simplified FGSs perform as well as the conventional four-tiered FGS. The use of simplified grading schemes may represent an advantage for pathologists as well as for clinicians caring for patients with RCC.
Keywords: Fuhrman grade, Clear cell renal cell carcinoma, Prognosis, Mortality, Classification.
Fuhrman grade (FG) represents one of the foremost prognostic variables in patients with all stages of renal cell carcinoma (RCC). Zisman et al and Ficarra et al suggested two simplified revisions of the conventional four-tiered Fuhrman grading scheme (FGS). The Zisman et al model groups grades I and II and grades III and IV into a two-tiered scheme. The Ficarra et al model groups grades I and II but keeps grades III and IV separate, which yields a three-tiered FGS. The validity of these simplified grading schemes was recently confirmed in a large European cohort of 5453 patients from 14 centers of excellence. In the current study, we examined whether the simplified FGSs were equally informative and accurate when applied to a population-based cohort of patients with RCC who were treated with partial or radical nephrectomy in the United States.
2. Materials and methods
2.1. Study population
Patients diagnosed with RCC and treated with partial or radical nephrectomy between 1988 and 2004 were identified within nine Surveillance, Epidemiology, and End Results (SEER) cancer registries. The registries consist of the Atlanta, Georgia; Detroit, Michigan; San Francisco-Oakland, California; and Seattle-Puget Sound, Washington, metropolitan areas and the states of Connecticut, Hawaii, Iowa, New Mexico, and Utah. Two kidney cancer diagnostic codes (International Classification of Disease for Oncology [second edition; ICD-O-2] C64.9 code and the International Classification of Diseases, ninth revision [ICD-9] 189.0 code) were used as inclusion criteria. The presence of both diagnostic codes resulted in the identification of RCC patients and excluded patients with upper tract transitional carcinoma or noncortical renal tumors (ie, melanomas, sarcomas, and lymphomas). Only patients with clear cell histological subtype were included in this analysis. Further exclusion criteria were unknown FG, unknown tumor size, and unknown SEER stage.
2.2. Statistical analyses
Patients were stratified according to the conventional and modified FGSs. The Zisman et al modified FGS consists of two strata in which grades I and II are combined as low grade and grades III and IV are combined as high grade. The second modified scheme, proposed by Ficarra et al, consists of three strata in which grades I and II are combined and grades III and IV are considered separately.
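The stratification described above can be sketched as a simple mapping from the conventional grade to each simplified scheme. This is a minimal illustration only: the function name `collapse_grade`, the scheme keys, and the stratum labels are assumptions made for this sketch and do not come from the study.

```python
# Map conventional Fuhrman grades (1-4) onto the simplified schemes.
# Labels and names are illustrative assumptions, not the authors' code.
TWO_TIERED = {1: "low (I-II)", 2: "low (I-II)", 3: "high (III-IV)", 4: "high (III-IV)"}
THREE_TIERED = {1: "I-II", 2: "I-II", 3: "III", 4: "IV"}
FOUR_TIERED = {1: "I", 2: "II", 3: "III", 4: "IV"}

SCHEMES = {"two": TWO_TIERED, "three": THREE_TIERED, "four": FOUR_TIERED}


def collapse_grade(grade: int, scheme: str) -> str:
    """Return the stratum of a four-tiered grade under the chosen scheme."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme!r}")
    if grade not in FOUR_TIERED:
        raise ValueError(f"Fuhrman grade must be 1-4, got {grade!r}")
    return SCHEMES[scheme][grade]
```

For example, `collapse_grade(3, "two")` places a grade III tumor in the high-grade stratum, whereas `collapse_grade(3, "three")` keeps it as its own stratum.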
Kaplan-Meier plots were used to graphically explore the univariable ability of the three examined FGSs to stratify cancer-specific mortality (CSM) after partial or radical nephrectomy for RCC. Subsequently, univariable and multivariable Cox regression analyses addressing CSM were used with the intent of testing the univariable and multivariable prognostic ability of the three tested FGSs. Separate models were fitted with either no FG, conventional four-tiered FG, or one of the two modified FGSs. The covariates consisted of age categories (≤49 vs 50–59 vs 60–69 vs 70–79 vs ≥80), gender (female vs male), race (Caucasian vs African American vs Hispanic vs other), year of surgery, tumor size, and SEER stage (localized vs locoregional vs distant).
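For readers unfamiliar with the Kaplan-Meier method, the following is a minimal sketch of the product-limit estimator that underlies such plots. It is not the software used in the study; the function name `kaplan_meier` and its input layout are assumptions made for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the event-free probability.

    times  -- follow-up time of each patient
    events -- 1 if the event (here, cancer-specific death) occurred,
              0 if the patient was censored
    Returns a list of (time, event_free_probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

Applying the function separately to each grade stratum gives one curve per stratum, which is what a stratified Kaplan-Meier plot displays.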
Cox proportional hazards regression model coefficients were first used to quantify the univariable prognostic accuracy of the FGSs. Multivariable Cox regression coefficients were then used to quantify the prognostic ability of the three tested FGSs in combination with all other covariables. In accuracy analyses, a value of 100% indicates a perfect prediction, whereas 50% is equivalent to a toss of a coin. Predictive accuracy is usually quantified with receiver operating characteristics–derived area under the curve (AUC). In Cox regression models, the AUC is replaced with Harrell's concordance index. This method was used for AUC determination in the current analysis. Two hundred bootstrap resamples were used to reduce the overfit bias and to internally validate all accuracy estimates. The statistical significance of the differences between various accuracy estimates was tested with the Mantel-Haenszel test.
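As an illustration of the accuracy measure, the sketch below computes Harrell's concordance index for right-censored data. It is a simplified, assumption-laden example rather than the routine used in the study: the function name `harrell_c_index` is hypothetical, pairs with tied follow-up times are ignored, and no bootstrap correction is applied. Multiplying the result by 100 gives a percentage on the 50–100% scale described above.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair of patients is usable when the patient with the shorter
    follow-up had the event. The pair is concordant when that patient
    also has the higher predicted risk; tied risk scores count as 0.5.
    Pairs with tied follow-up times are skipped (a simplification).
    """
    n = len(times)
    if not (n == len(events) == len(risk_scores)):
        raise ValueError("inputs must have the same length")
    concordant = 0.0
    usable = 0
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A model whose risk scores rank every usable pair correctly returns 1.0, and a model that assigns the same score to everyone returns 0.5.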
All statistical tests were performed using the S-PLUS Professional v.1 (Mathsoft, Seattle, WA, USA) or the Statistical Package for Social Science v.15.0 (SPSS, Chicago, IL, USA). Moreover, all tests were two-sided with a significance level set at 0.05.
Patient characteristics are described in Table 1. Of the entire patient population (n
|No. of patients||14 064 (100.0)|
|Age groups, yr|
|11 022 (78.4)|
|Year of surgery quartiles|
|Type of nephrectomy|
|12 617 (89.7)|
|Tumor size (cm)|
The CSM-free rates of all patients were 88.8%, 81.5%, and 74.6% at 2, 5, and 10 yr following surgery, respectively (Fig. 1A). After stratification according to the conventional four-tiered FGS (Fig. 1B), 5-yr CSM rates were 6.7%, 13.2%, 34.4%, and 58.3% for FG I, II, III, and IV, respectively. Statistically significant intergroup differences in CSM were recorded among all four groups (all log-rank p-values <0.001). According to the Zisman et al modified two-tiered FGS (Fig. 1C), in which grades I and II and grades III and IV are combined, the 5-yr CSM rates were 11.1% and 39.0%, respectively (log-rank p-value <0.001). Finally, according to the Ficarra et al modified three-tiered FGS (Fig. 1D), in which grades I and II were combined and grades III and IV were unchanged, the respective 5-yr CSM rates were 11.1%, 34.4%, and 58.3% (both log-rank p-values <0.001).
Table 2 shows the univariable models that quantify the accuracy of the various FGSs. The conventional four-tiered FGS achieved 70.9% accuracy versus 67.7% for the modified two-tiered FGS proposed by Zisman et al and 68.7% for the three-tiered FGS suggested by Ficarra et al.
|Variable||Univariable analysis: HR; p-value||Univariable analysis: AUC of individual predictor variable, %||Model without FG: HR; p-value||Model with four-tiered FGS (I vs II vs III vs IV): HR; p-value||Model with three-tiered FGS (I–II vs III vs IV): HR; p-value||Model with two-tiered FGS (I–II vs III–IV): HR; p-value|
|Fuhrman grade||–; <0.001||70.9||–||–; <0.001||–||–|
|1.9; <0.001||1.5; <0.001|
|5.6; <0.001||2.7; <0.001|
|12.1; <0.001||4.4; <0.001|
|Fuhrman grade||–; <0.001||68.7||–||–||–; <0.001||–|
|3.4; <0.001||2.0; <0.001|
|7.4; <0.001||3.1; <0.001|
|4.0; <0.001||67.7||–||–||–||2.1; <0.001|
|Tumor size||1.0; <0.001||73.1||1.0; <0.001||1.0; <0.001||1.0; <0.001||1.0; <0.001|
|Year of surgery||1.0; <0.001||53.5||1.0; <0.001||0.9; <0.001||1.0; <0.001||1.0; <0.001|
|Type of surgery|
|3.7; <0.001||53.5||1.4; 0.02||1.3; 0.04||1.8; <0.001||1.3; 0.03|
|SEER stage||–; <0.001||78.1||–; <0.001||–; <0.001||–; <0.001||–; <0.001|
|5.1; <0.001||3.7; <0.001||3.1; <0.001||3.2; <0.001||3.2; <0.001|
|21.7; <0.001||14.8; <0.001||11.2; <0.001||11.6; <0.001||11.8; <0.001|
|Race||–; 0.004||51.3||–; 0.3||–; 0.2||–; 0.2||–; 0.2|
|0.9; 0.3||0.9; 0.2||0.9; 0.2||0.9; 0.1||0.9; 0.1|
|0.8; 0.001||0.9; 0.2||0.9; 0.3||0.9; 0.2||0.9; 0.2|
|0.9; 0.1||0.9; 0.2||0.9; 0.2||0.9; 0.1||0.9; 0.1|
|0.8; <0.001||52.7||0.9; 0.007||0.9; 0.02||0.9; 0.006||0.9; 0.02|
|Age group||–; <0.001||53.2||–; <0.001||–; <0.001||–; <0.001||–; <0.001|
|1.4; <0.001||1.2; 0.002||1.3; 0.001||1.3; <0.001||1.2; 0.001|
|1.5; <0.001||1.3; <0.001||1.3; <0.001||1.3; <0.001||1.3; <0.001|
|1.5; <0.001||1.5; <0.001||1.6; <0.001||1.6; <0.001||1.6; <0.001|
|1.6; <0.001||1.5; <0.001||1.7; <0.001||1.7; <0.001||1.6; <0.001|
|AUC of multivariable models, %||–||–||82.6||83.8||83.8||83.6|
|[Mantel-Haenszel test]||–||–||–||+1.2 (p
Four multivariable models are shown in Table 2. The first model does not include FG. The second model relies on the conventional four-tiered FGS. The third model relies on the three-tiered Ficarra et al scheme, in which grades I and II are combined and grades III and IV are considered separately. The fourth model relies on the Zisman et al two-tiered scheme, in which grades I and II are paired as one group and grades III and IV are considered as another group. In all models in which FG was considered, FG achieved independent predictor status irrespective of the type of FGS. Specifically, the first model, which did not rely on any FGS, demonstrated 82.6% accuracy. Both the second model, which included the conventional four-tiered FGS, and the third model, which relied on the Ficarra et al modified three-tiered FGS, resulted in 83.8% accuracy. Finally, the fourth model, which relied on the modified two-tiered FGS proposed by Zisman et al, resulted in 83.6% accuracy.
Taken together, in univariable models all FGSs contributed to accurate predictions (67.7–70.9%) that were clearly superior to 50% (flip of a coin). In multivariable models, the consideration of any FGS improved the prognostic ability relative to a model that did not consider FG (from 82.6% to 83.6–83.8%). Both modified FGSs (83.6–83.8%) demonstrated virtually the same accuracy as the conventional four-tiered FGS (83.8%).
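The size of these gains follows directly from the reported figures. The short sketch below restates that arithmetic; the dictionary keys and the function name `gain_over_base` are assumptions introduced for illustration, while the accuracy values are those reported in the text.

```python
# Accuracy (%) of the multivariable models, as reported in the text.
ACCURACY = {
    "no FG": 82.6,
    "four-tiered": 83.8,
    "three-tiered": 83.8,
    "two-tiered": 83.6,
}


def gain_over_base(scheme: str) -> float:
    """Accuracy gain, in percentage points, relative to the model without FG."""
    return round(ACCURACY[scheme] - ACCURACY["no FG"], 1)


for scheme in ("four-tiered", "three-tiered", "two-tiered"):
    print(f"{scheme}: +{gain_over_base(scheme)} percentage points")
```

The four- and three-tiered schemes each add 1.2 percentage points and the two-tiered scheme adds 1.0, so the simplified schemes differ from the conventional one by at most 0.2 percentage points.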
Finally, compared to the base model without FG, a statistically significant gain was noted when the conventional four-tiered FGS was included (1.2%, p
FG represents one of the foremost predictors of CSM. In virtually all prognostic models, FG showed independent predictor status. For example, the University of California Los Angeles Integrated Staging System (UISS) relies on FG for prognostic stratification. Similarly, the Mayo Clinic stage, size, grade, and necrosis score and the Karakiewicz et al nomogram also rely on FG for prediction of CSM.
Despite its central role in virtually all prognostic schemes and models, the interobserver agreement for FG assignment using the conventional four-tiered scheme is low to moderate. Lang et al  showed a κ of 0.22 for interobserver agreement when the four-tiered FGS was examined (n
Our results confirmed our hypothesis and demonstrated that simplified FGSs are as accurate as the conventional four-tiered FGS (Table 2). The alternative FGSs resulted in 83.6% and 83.8% accuracy, as compared to 83.8% for the conventional four-tiered FGS. Moreover, all tested FGSs improved the prognostic ability of CSM prediction (83.6–83.8%, all p-values <0.001) relative to models in which FG was not considered (82.6%).
We corroborate the findings of the renowned pathologist Rioux-Leclercq and colleagues, who performed a similar analysis in 5453 patients from European centers of excellence. In that report, a similar increase in accuracy was noted when various FGSs were tested (84.6%) relative to the multivariable model that did not rely on FG (83.8%). The similarity of our findings implies that FG is performed with similar accuracy on both continents. Moreover, our findings demonstrate that a simplified two-tiered FGS can be safely used in North America and results in no loss of prognostic ability relative to the conventional four-tiered FGS. Our results closely replicate those of Rioux-Leclercq et al from a cohort of European patients. It appears that on either continent, the use of the simplest two-tiered FGS, proposed by Zisman et al, does not undermine the ability to predict CSM after partial or radical nephrectomy.
From a practical standpoint, our findings imply that pathologists do not need to make the distinction between grades I and II, which are differentiated by the mere presence of nucleoli at ×400 magnification. Similarly, FG IV tumors are differentiated from FG III tumors by the identification of bizarre multilobed nuclei. This relatively subjective definition may undermine the reproducibility of FG IV assignment. The findings of Al-Aynati et al substantiate our results. In their study, Al-Aynati et al showed that the use of a simplified FG resulted in better intraobserver agreement. Similarly, the use of a simplified FG also resulted in better interobserver agreement. In consequence, simplified FGSs such as the two-tiered FGS proposed by Zisman et al may result in resource savings and in better interobserver agreement. This may prove useful in the pathological assessment of nephrectomy specimens as well as in grade assignment in renal mass biopsy samples. Nonetheless, our findings and those of Rioux-Leclercq et al need to be validated by a forum or an expert panel of urologic pathologists before being implemented in clinical practice.
From the perspective of accuracy, the use of a two-tiered FGS yields virtually the same results as a three- or four-tiered FGS. Therefore, a simplification from four categories to either three or two categories appears valid and justified. The recommendations for a better FGS consisting of two or three categories should not only be made by urologists and prognosticians but should also involve pathologists, since both predictive accuracy and pathological considerations are necessary to decide which of the simplified FGSs (two-tiered vs three-tiered) is ideal. It is a matter of debate whether the two-tiered or the three-tiered FGS is better. For example, our findings indicate a substantial degree of heterogeneity between FG III and IV. This may imply that patients with FG III and IV behave differently and should not be grouped. The convention not to group patients with FG III and IV is also supported by the excellent accuracy of a modified FGS that relies on three tiers (I and II vs III vs IV). Ficarra et al also arrived at this conclusion. We hope that this study and other similar studies will form the basis for a revision of the current FGS according to a consensus of experts in the field of genitourinary pathology, epidemiology, and urologic oncology.
Our results indicate that two- or three-tiered FGSs are as valuable as the conventional four-tiered FGS based on accuracy criteria. This implies that the performance of prognostic schemes such as the UISS or the Karakiewicz et al nomogram, which relied on the four-tiered FGS, is not undermined. Our findings only suggest that simplified FGSs may result in time and financial resource savings. For this to happen, the modified FGSs need to be prospectively used in clinical practice.
Our analyses relied only on the clear cell RCC histological subtype due to the lack of proven validity of FG in other histological subtypes. For example, Sika-Paotonu et al showed that FG was unrelated to CSM in patients with papillary renal cell carcinoma. Similarly, Delahunt et al used the same approach to refute the prognostic ability of FG in the chromophobe RCC variant.
Despite its practical value, our study has limitations. These include the lack of central pathological review for all specimens, which may be interpreted by some as a major limitation. However, in everyday clinical practice, the vast majority of specimens are not subjected to central review. In consequence, our study is reflective of a real-life phenomenon in which FG assignment is performed at each individual institution. Under such circumstances, the combination of grades into low and high FG strata might be highly valuable. Interobserver agreement may be reduced if multiple FG levels require assignment, as is the case for the conventional four-tiered FGS. Conversely, interobserver agreement may be improved if the number of variable levels is maximally reduced, as is the case for the modified two-tiered FGS. Consequently, the simplified FGS may prove particularly useful in community practice.
Additionally, it may be postulated that an observational database such as the SEER registry lacks detail. Indeed, some important variables were not recorded. For example, performance status, presence of tumor necrosis, and various hematologic and biochemistry values were not included. These values have been shown to predict CSM in specific patient groups. Consideration of these variables could decrease the independent prognostic contribution of various FGSs. Unfortunately, only detailed institutional databases that contain these variables can prove or disprove this potential limitation. Moreover, our study represents a retrospective analysis. This limitation is shared with all other studies that assessed the value and limitations of FG. In consequence, a prospective trial of the prognostic value of a simplified FGS will represent the ultimate proof of improved ability to predict prognosis. Finally, our study relied on 14 064 patients, which exceeds the sample sizes of all previous studies that addressed this topic. Despite our sample size, we could not address some of the issues that were examined in these previous studies. For example, inter- and intraobserver variability of FG assignment could not be tested due to lack of available pathologist identifiers. Similarly, the effect of institutional volume or institution type also could not be explored.
Simplified FGSs with two tiers or three tiers were shown to result in better, or at least equal, interobserver agreement relative to the conventional four-tiered FGS. Moreover, a large European study showed that a modified two-tiered FGS predicts CSM with virtually equal accuracy relative to the conventional four-tiered FGS. We confirmed the prognostic ability of two modified FGSs in a large North American population. Based on concordant findings from two large cohorts from Europe and North America, we propose the implementation of a simplified FGS into routine clinical practice.
Study concept and design: Sun, Lughezzani, Jeldres, Isbarn, Karakiewicz.
Acquisition of data: Sun, Lughezzani, Jeldres, Karakiewicz.
Analysis and interpretation of data: Sun, Lughezzani, Isbarn, Shariat, Latour, Karakiewicz.
Drafting of the manuscript: Sun, Lughezzani, Jeldres, Isbarn, Karakiewicz.
Critical revision of the manuscript for important intellectual content: Shariat, Arjane, Widmer, Pharand, Latour, Perrotte, Karakiewicz.
Statistical analysis: Sun, Lughezzani, Karakiewicz.
Obtaining funding: Karakiewicz.
Administrative, technical, or material support: Shariat, Perrotte, Karakiewicz.
Supervision: Arjane, Widmer, Pharand, Shariat, Perrotte, Karakiewicz.
Other (specify): None.
Financial disclosures: I certify that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: Dr. Karakiewicz is partially supported by the University of Montreal Urology Associates, Fonds de la Recherche en Santé du Québec, the University of Montreal Department of Surgery, and the University of Montreal Foundation.
Funding/Support and role of the sponsor: None.
- S.A. Fuhrman, L.C. Lasky, C. Limas. Prognostic significance of morphologic parameters in renal cell carcinoma. Am J Surg Pathol. 1982;6:655-663.
- A. Zisman, A.J. Pantuck, F. Dorey, et al. Improved prognostication of renal cell carcinoma using an integrated staging system. J Clin Oncol. 2001;19:1649-1657.
- V. Ficarra, G. Martignoni, N. Maffei, et al. Original and reviewed nuclear grading according to the Fuhrman system. Cancer. 2005;103:68-75.
- N. Rioux-Leclercq, P.I. Karakiewicz, Q.D. Trinh, et al. Prognostic ability of simplified nuclear grading of renal cell carcinoma. Cancer. 2007;109:868-874.
- B.F. Hankey, L.A. Ries, B.K. Edwards. The Surveillance, Epidemiology, and End Results program: a national resource. Cancer Epidemiol Biomarkers Prev. 1999;8:1117-1121.
- F.E. Harrell, K.L. Lee, D.B. Mark. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15:361-387.
- A. Zisman, A.J. Pantuck, J. Wieder, et al. Risk group assessment and clinical outcome algorithm to predict the natural history of patients with surgically resected renal cell carcinoma. J Clin Oncol. 2002;20:4559-4566.
- I. Frank, M.L. Blute, J.C. Cheville, C.M. Lohse, A.L. Weaver, H. Zincke. An outcome prediction model for patients with clear cell renal cell carcinoma treated with radical nephrectomy based on tumor stage, size, grade and necrosis: The SSIGN score. J Urol. 2002;168:2395-2400.
- P.I. Karakiewicz, A. Briganti, F.K. Chun, et al. Multi-institutional validation of a new renal cancer-specific survival nomogram. J Clin Oncol. 2007;25:1316-1322.
- H. Lang, V. Lindner, M. de Fromont, et al. Multicenter determination of optimal interobserver agreement using the Fuhrman grading system for renal cell carcinoma: assessment of 241 patients with >15-year follow-up. Cancer. 2005;103:625-629.
- M. Al-Aynati, V. Chen, S. Salama, H. Shuhaibar, D. Treleaven, L. Vincic. Interobserver and intraobserver variability using the Fuhrman grading system for renal cell carcinoma. Arch Pathol Lab Med. 2003;127:593-596.
- D. Sika-Paotonu, P.B. Bethwaite, M.R.E. McCredie, W.T. Jordan, B. Delahunt. Nucleolar grade but not Fuhrman grade is applicable to papillary renal cell carcinoma. Am J Surg Pathol. 2006;30:1091-1096.
- B. Delahunt, D. Sika-Paotonu, P.B. Bethwaite, et al. Fuhrman grading is not appropriate for chromophobe renal cell carcinoma. Am J Surg Pathol. 2007;31:957-960.
- R.J. Motzer, M. Mazumdar, J. Bacik, W. Berg, A. Amsterdam, J. Ferrara. Survival and prognostic stratification of 670 patients with advanced renal cell carcinoma. J Clin Oncol. 1999;17:2530-2540.
a Cancer Prognosis and Health Outcomes Unit, University of Montreal Health Center, Montreal, QC, Canada
b Department of Urology, Vita-Salute San Raffaele University, Milan, Italy
c Martini-clinic, Prostate Cancer Center Hamburg-Eppendorf, Hamburg, Germany
d Department of Urology, University of Montreal, Montreal, QC, Canada
e Department of Urology, Lille University Hospital, Lille, France
Corresponding author. Cancer Prognostics and Health Outcomes Unit, University of Montreal Health Center (CHUM), 1058, rue St-Denis, Montréal, Québec, Canada, H2X 3J4. Tel. +1 514 890 8000 35336; Fax: +1 514 412 7363.
These authors have made equal contributions to this manuscript.
© 2009 Published by Elsevier B.V.