BMC Medical Informatics and Decision Making

Open Access 01-12-2017 | Research Article

Automatic migraine classification via feature selection committee and machine learning techniques over imaging and questionnaire data

Authors: Yolanda Garcia-Chimeno, Begonya Garcia-Zapirain, Marian Gomez-Beldarrain, Begonya Fernandez-Ruanova, Juan Carlos Garcia-Monco

Published in: BMC Medical Informatics and Decision Making | Issue 1/2017

Abstract

Background

Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of classification models, yet little is known about how feature selection methods perform on diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating the diagnosis of migraine using both DTIs and questionnaire answers related to emotion and cognition, factors that influence pain perception.

Methods

We selected 52 adult subjects, divided into three groups: a control group (15), subjects with sporadic migraine (19), and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance imaging with diffusion tensor to assess the white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gathered data about the pathology. The DTI images and test results were then fed into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest, and Univariate) to reduce the features of the first dataset, and into classification algorithms (Support Vector Machine (SVM), Boosting (AdaBoost), and Naive Bayes) to classify the migraine group. We also implemented a committee method, based on the feature selection algorithms, to improve classification accuracy.
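The committee idea described above can be illustrated with a minimal scikit-learn sketch. The abstract does not state how the committee combines the individual selectors, so the majority-vote rule used here is an assumption, as are the helper names `top_k_mask` and `committee_select`, the parameters `k` and `min_votes`, the choice of `LinearSVC` as the L1-based selector, and the synthetic data that stands in for the DTI and questionnaire features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def top_k_mask(scores, k):
    """Boolean mask marking the k highest-scoring features (hypothetical helper)."""
    mask = np.zeros(scores.shape[0], dtype=bool)
    mask[np.argsort(scores)[-k:]] = True
    return mask


def committee_select(X, y, k=10, min_votes=2, seed=0):
    """Keep features chosen by at least `min_votes` of four selectors.

    The voting rule is an assumption made for illustration; the source
    only says that a committee of feature selection algorithms is used.
    """
    X_scaled = StandardScaler().fit_transform(X)  # L1 models are scale-sensitive

    gtb = GradientBoostingClassifier(random_state=seed).fit(X, y)
    rf = RandomForestClassifier(n_estimators=200, random_state=seed).fit(X, y)
    l1 = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000,
                   random_state=seed).fit(X_scaled, y)
    uni = SelectKBest(f_classif, k=k).fit(X, y)

    # Each selector casts one vote for each of its k preferred features.
    votes = (
        top_k_mask(gtb.feature_importances_, k).astype(int)
        + top_k_mask(rf.feature_importances_, k)
        + top_k_mask(np.abs(l1.coef_).sum(axis=0), k)
        + uni.get_support()
    )
    return votes >= min_votes


if __name__ == "__main__":
    # Synthetic stand-in: 52 samples, 3 classes, 30 made-up features.
    X, y = make_classification(n_samples=52, n_features=30, n_informative=6,
                               n_classes=3, n_clusters_per_class=1,
                               random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)

    # Select features on the training split only, to avoid leaking test data.
    mask = committee_select(X_tr, y_tr)

    full = GaussianNB().fit(X_tr, y_tr).score(X_te, y_te)
    reduced = GaussianNB().fit(X_tr[:, mask], y_tr).score(X_te[:, mask], y_te)
    print(f"kept {mask.sum()} of {X.shape[1]} features")
    print(f"Naive Bayes accuracy: all={full:.2f}, committee={reduced:.2f}")
```

Selecting features on the training split alone matters with a sample this small: choosing them on the full dataset before evaluation would tend to inflate the reported accuracy. The numbers printed by this sketch come from synthetic data and say nothing about the study's results.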

Results

When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% with the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, and from 93 to 94% with boosting. The features determined to be most useful for classification were related to pain, analgesics, and the left uncinate brain (connected with pain and emotions).

Conclusions

The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.


Title: Automatic migraine classification via feature selection committee and machine learning techniques over imaging and questionnaire data
Authors: Yolanda Garcia-Chimeno, Begonya Garcia-Zapirain, Marian Gomez-Beldarrain, Begonya Fernandez-Ruanova, Juan Carlos Garcia-Monco
Publication date: 01-12-2017
Publisher: BioMed Central
Published in: BMC Medical Informatics and Decision Making / Issue 1/2017
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/s12911-017-0434-4
