Published in: Journal of Translational Medicine 1/2009

Open Access 01-12-2009 | Research

A comparison of classification methods for predicting Chronic Fatigue Syndrome based on genetic data

Authors: Lung-Cheng Huang, Sen-Yen Hsu, Eugene Lin


Abstract

Background

In genomic association studies of disease susceptibility, it is essential to select a small number of genes that are more significant than the others. In this work, our goal was to compare computational tools with and without feature selection for predicting chronic fatigue syndrome (CFS) from genetic factors such as single nucleotide polymorphisms (SNPs).

Methods

We used the dataset from a previous study by the CDC Chronic Fatigue Syndrome Research Group. To uncover relationships between CFS and SNPs, we applied three classification algorithms: naive Bayes, the support vector machine, and the C4.5 decision tree. We also used two feature selection methods to identify a subset of influential SNPs: a hybrid approach combining the chi-squared and information-gain methods, and a wrapper-based method.
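To make the hybrid filter idea concrete, the following is a minimal Python sketch that scores categorical SNP genotypes against case/control labels with the chi-squared statistic and with information gain. The combination rule shown here (keep the SNPs that rank in the top k under both criteria) is an assumption for illustration, since the abstract does not state how the two scores were combined. The SNP names, genotypes, and labels are hypothetical toy data, not the CDC dataset.

```python
import math
from collections import Counter


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of a categorical feature vs. class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    class_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, nf in feature_counts.items():
        for c, nc in class_counts.items():
            expected = nf * nc / n
            observed = joint.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in class entropy after splitting on the feature."""
    n = len(labels)
    conditional = 0.0
    for f, nf in Counter(feature).items():
        subset = [c for x, c in zip(feature, labels) if x == f]
        conditional += (nf / n) * entropy(subset)
    return entropy(labels) - conditional


def hybrid_select(columns, labels, k):
    """Assumed rule: keep features ranked in the top k by BOTH scores."""
    def top_k(score):
        ranked = sorted(columns, key=lambda name: score(columns[name], labels),
                        reverse=True)
        return set(ranked[:k])
    return sorted(top_k(chi_squared) & top_k(information_gain))


# Hypothetical toy data: 4 cases, 4 controls, 3 SNPs.
labels = ["CFS", "CFS", "CFS", "CFS",
          "control", "control", "control", "control"]
snps = {
    "snp_a": ["AA", "AA", "AA", "AG", "GG", "GG", "GG", "AG"],  # partly informative
    "snp_b": ["CC", "CT", "CC", "CT", "CC", "CT", "CC", "CT"],  # uninformative
    "snp_c": ["TT", "TT", "TT", "TT", "GG", "GG", "GG", "GG"],  # separates classes
}

selected = hybrid_select(snps, labels, k=2)
print(selected)
```

Both scores are filter criteria: they rank each SNP on its own, without consulting a classifier, which is what distinguishes this approach from the wrapper-based method.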

Results

Among the predictive models, naive Bayes with wrapper-based feature selection performed best at inferring disease susceptibility from the complex relationship between CFS and SNPs.
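As an illustration of what a wrapper around naive Bayes involves, the sketch below scores candidate SNP subsets by the cross-validated accuracy of the classifier itself and grows the subset greedily. The search strategy (greedy forward selection), the evaluation scheme (leave-one-out), the Laplace smoothing, and the toy data are all assumptions made for this example; the abstract does not specify the settings used in the study.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, features):
    """Fit a categorical naive Bayes model on the chosen feature names."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> genotype counts
    levels = defaultdict(set)            # feature -> genotypes seen in training
    for row, c in zip(rows, labels):
        for f in features:
            value_counts[(f, c)][row[f]] += 1
            levels[f].add(row[f])
    return class_counts, value_counts, levels


def predict_nb(model, row, features):
    """Return the class with the highest log posterior for one sample."""
    class_counts, value_counts, levels = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c in sorted(class_counts):
        nc = class_counts[c]
        score = math.log(nc / total)
        for f in features:
            # Laplace smoothing; the extra level covers unseen genotypes.
            num = value_counts[(f, c)][row[f]] + 1
            den = nc + len(levels[f]) + 1
            score += math.log(num / den)
        if score > best_score:
            best, best_score = c, score
    return best


def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of naive Bayes restricted to `features`."""
    hits = 0
    for i in range(len(rows)):
        model = train_nb(rows[:i] + rows[i + 1:],
                         labels[:i] + labels[i + 1:], features)
        hits += predict_nb(model, rows[i], features) == labels[i]
    return hits / len(rows)


def wrapper_forward_select(rows, labels, candidates):
    """Greedily add the feature that most improves leave-one-out accuracy."""
    selected, best = [], loo_accuracy(rows, labels, [])
    improved = True
    while improved:
        improved = False
        for f in sorted(set(candidates) - set(selected)):
            acc = loo_accuracy(rows, labels, selected + [f])
            if acc > best:
                best, best_f, improved = acc, f, True
        if improved:
            selected.append(best_f)
    return selected, best


# Hypothetical toy data: 4 cases, 4 controls, 3 SNPs.
labels = ["CFS"] * 4 + ["control"] * 4
snp_a = ["AA", "AA", "AA", "AG", "GG", "GG", "GG", "AG"]
snp_b = ["CC", "CT", "CC", "CT", "CC", "CT", "CC", "CT"]
snp_c = ["TT", "TT", "TT", "TT", "GG", "GG", "GG", "GG"]
rows = [{"snp_a": a, "snp_b": b, "snp_c": c}
        for a, b, c in zip(snp_a, snp_b, snp_c)]

selected, best = wrapper_forward_select(rows, labels, ["snp_a", "snp_b", "snp_c"])
print(selected, best)
```

Because every candidate subset requires retraining the classifier, a wrapper costs far more than a filter, but it selects features that suit the specific model being evaluated.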

Conclusion

We demonstrated that our approach is a promising method to assess the associations between CFS and SNPs.
Metadata

Publisher: BioMed Central
Electronic ISSN: 1479-5876
DOI: https://doi.org/10.1186/1479-5876-7-81
