Abstract
Aim: We describe a new method for expanding the tumor, lymph node, metastasis (TNM) staging system using a clustering algorithm, demonstrated on cases of breast cancer. Materials & methods: An unsupervised ensemble-learning algorithm was used to create dendrograms; cutting a dendrogram produced a prognostic system. Results: Each prognostic system contained groups of patients with similar outcomes. The prognostic systems based on tumor size and lymph node status recapitulated the general structure of the TNM for breast cancer. Adding histologic grade and estrogen receptor status to tumor size and lymph node status yielded a more detailed stratification of patients. Conclusion: Prognostic systems obtained by cutting dendrograms have the potential to improve and expand the TNM.
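The dendrogram-cutting step can be illustrated with a minimal sketch. It is not the authors' ensemble algorithm: it assumes a dissimilarity matrix between factor combinations is already available, and the combination labels, the dissimilarity values, the choice of average linkage and the helper name `cut_into_groups` are all hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cut_into_groups(dissimilarity, n_groups):
    """Build a dendrogram from a square dissimilarity matrix and cut it.

    Hypothetical helper: returns one group id per row of the matrix.
    """
    # linkage() expects the condensed (upper-triangle) form of the matrix.
    condensed = squareform(dissimilarity)
    # Average linkage is an assumption made for this sketch.
    tree = linkage(condensed, method="average")
    # Cut the dendrogram so that at most n_groups clusters remain.
    return fcluster(tree, t=n_groups, criterion="maxclust")


# Hypothetical combinations of tumor size (T) and lymph node status (N).
combinations = ["T1N0", "T1N1", "T2N0", "T2N1"]

# Made-up dissimilarities in outcome between combinations (symmetric,
# zero diagonal). In the described method these would be learned by the
# ensemble step, which is not reproduced here.
dissimilarity = np.array([
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.85, 0.95],
    [0.80, 0.85, 0.00, 0.15],
    [0.90, 0.95, 0.15, 0.00],
])

groups = cut_into_groups(dissimilarity, n_groups=2)

# Each group of combinations plays the role of one prognostic group.
prognostic_system = dict(zip(combinations, groups.tolist()))
print(prognostic_system)
```

Cutting the same tree at a larger `n_groups` gives a finer stratification, which is how a single dendrogram can yield prognostic systems of different granularity.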