Published in: Journal of Experimental & Clinical Cancer Research 1/2009

Open Access 01-12-2009 | Research

Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data

Authors: Desheng Huang, Yu Quan, Miao He, Baosen Zhou

Abstract

Background

Although a growing number of studies based on gene expression data have been reported in great detail, the choice of classification method remains a major challenge for methodologists. The main purpose of this research was to compare the performance of linear discriminant analysis (LDA) and its modifications for the classification of cancer based on gene expression data.

Methods

The classification performance of linear discriminant analysis (LDA) and its modifications was evaluated by applying the methods to six public cancer gene expression datasets. The methods were LDA, prediction analysis for microarrays (PAM), shrunken centroids regularized discriminant analysis (SCRDA), shrinkage linear discriminant analysis (SLDA) and shrinkage diagonal discriminant analysis (SDDA). All analyses were performed in R 2.8.0.
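As an illustration of how a shrunken-centroid method such as PAM ends up retaining only a subset of genes, the following is a minimal sketch of the soft-thresholding step. It is not the authors' R code: it uses only the Python standard library, and the function names, the toy expression matrix and the threshold value are hypothetical. The scaling factor m_k follows one common textbook formulation and should be read as an assumption.

```python
import math


def soft_threshold(value, delta):
    """Shrink value toward zero by delta; anything within delta of zero becomes 0."""
    return math.copysign(max(abs(value) - delta, 0.0), value)


def shrunken_centroids(X, y, delta, s0=0.0):
    """Return (shrunken class centroids, indices of genes that survive shrinkage).

    X: list of samples, each a list of expression values (one per gene).
    y: list of class labels, one per sample.
    Assumes at least two classes and non-zero within-class variance per gene
    (or s0 > 0), otherwise the standardisation below divides by zero.
    """
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    members = {k: [x for x, label in zip(X, y) if label == k] for k in classes}

    overall = [sum(x[j] for x in X) / n for j in range(p)]
    centroid = {k: [sum(x[j] for x in rows) / len(rows) for j in range(p)]
                for k, rows in members.items()}

    # Pooled within-class standard deviation of each gene.
    s = []
    for j in range(p):
        ss = sum((x[j] - centroid[k][j]) ** 2
                 for k, rows in members.items() for x in rows)
        s.append(math.sqrt(ss / (n - len(classes))))

    shrunk, selected = {}, set()
    for k, rows in members.items():
        m_k = math.sqrt(1.0 / len(rows) - 1.0 / n)  # assumed scaling factor
        shrunk[k] = []
        for j in range(p):
            scale = m_k * (s[j] + s0)
            d = (centroid[k][j] - overall[j]) / scale  # standardised difference
            d_shrunk = soft_threshold(d, delta)
            if d_shrunk != 0.0:
                selected.add(j)  # gene still separates at least one class
            shrunk[k].append(overall[j] + scale * d_shrunk)
    return shrunk, sorted(selected)


# Hypothetical toy data: gene 0 separates the classes, genes 1 and 2 are noise.
X = [
    [5.0, 1.0, 2.0],
    [6.0, 2.0, 1.0],
    [7.0, 1.5, 1.5],
    [1.0, 1.5, 2.0],
    [2.0, 1.0, 1.0],
    [3.0, 2.3, 1.2],
]
y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

centroids, selected = shrunken_centroids(X, y, delta=1.0)
print(selected)  # [0]
```

With `delta=0.0` no shrinkage is applied and all three genes are kept. Raising the threshold removes genes whose class centroids are not clearly separated from the overall centroid, which is the mechanism by which this family of methods performs gene selection.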

Results

For most datasets, PAM selected fewer feature genes than the other methods; the Brain dataset was an exception. Of the two shrinkage discriminant analysis methods, SLDA selected more genes than SDDA for most datasets, the exception being the 2-class lung cancer dataset. SLDA selected more genes than SCRDA for the 2-class lung cancer, SRBCT and Brain datasets, whereas the opposite held for the remaining datasets. The average test error of the modified LDA methods was lower than that of LDA.

Conclusions

With respect to average error, the classification performance of the modified LDA methods was superior to that of traditional LDA, and there was no significant difference among the modified methods.
Metadata
Title
Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data
Authors
Desheng Huang
Yu Quan
Miao He
Baosen Zhou
Publication date
01-12-2009
Publisher
BioMed Central
Published in
Journal of Experimental & Clinical Cancer Research / Issue 1/2009
Electronic ISSN: 1756-9966
DOI
https://doi.org/10.1186/1756-9966-28-149
