Published in: Journal of Translational Medicine 1/2019

Open Access 01-12-2019 | Cataract | Research

Prediction of postoperative complications of pediatric cataract patients using data mining

Authors: Kai Zhang, Xiyang Liu, Jiewei Jiang, Wangting Li, Shuai Wang, Lin Liu, Xiaojing Zhou, Liming Wang

Abstract

Background

The common treatment for pediatric cataracts is to replace the cloudy lens with an artificial one. However, patients may suffer complications within 1 year after surgery, namely severe lens proliferation into the visual axis (SLPVA) and abnormally high intraocular pressure (AHIP), and the factors causing these complications are unknown.

Methods

The Apriori algorithm is employed to find association rules related to the complications. Random forest (RF) and Naïve Bayesian (NB) classifiers are used to predict the complications on datasets preprocessed with SMOTE (synthetic minority oversampling technique). Genetic feature selection is used to identify the features that are actually related to the complications.
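
As an illustration of the oversampling step, the sketch below shows the core idea of SMOTE: each synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. It is a minimal pure-Python sketch, not the authors' implementation; the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy records are all assumptions made for illustration, and the features are assumed to be numeric.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: two numeric features per minority-class record.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_samples = smote(minority, n_new=6)
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the region already occupied by the minority class. Categorical attributes would need a different treatment (for example, a nominal SMOTE variant), which this sketch does not cover.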

Results

First, the average classification accuracies in the three binary classification problems are over 75%. Second, the relationship between classification performance and the number of trees in the random forest is studied. The results show that all attributes except gender and age at surgery (AS) are related to complications; all attributes except secondary intraocular lens (IOL) placement, operation mode, AS, and area of cataracts are related to SLPVA; and all attributes except gender, operation mode, and laterality are related to AHIP. Next, the association rules related to the complications are mined. An additional 50 records were then used to test the performance of RF and NB, both of which achieved accuracies of over 65% on the three classification problems. Finally, we developed a web server to assist doctors.
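
To make the association-rule step concrete, the following is a minimal pure-Python sketch of Apriori-style mining: frequent itemsets are grown level by level, and rules are kept when their confidence (support of the whole itemset divided by support of the antecedent) reaches a threshold. The function names, the thresholds, and the toy records with placeholder items `attr_A`, `attr_B`, and `complication` are hypothetical and do not reproduce the study's data or rules.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    current = [frozenset([i]) for i in items]
    frequent = {}
    while current:
        # Keep the candidates whose support reaches the threshold.
        level = {}
        for c in current:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == len(a) + 1}
        # Prune step: every k-item subset of a candidate must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, len(c) - 1))]
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                confidence = support / frequent[ante]
                if confidence >= min_confidence:
                    out.append((ante, itemset - ante, support, confidence))
    return out

# Hypothetical toy records; item names are placeholders, not study attributes.
transactions = [
    frozenset({"attr_A", "attr_B", "complication"}),
    frozenset({"attr_A", "complication"}),
    frozenset({"attr_A", "attr_B", "complication"}),
    frozenset({"attr_B"}),
    frozenset({"attr_A"}),
]
frequent_itemsets = apriori(transactions, min_support=0.4)
found_rules = rules(frequent_itemsets, min_confidence=0.9)
```

In practice one would keep only the rules whose consequent is the complication of interest and review them with clinicians, since a high-confidence rule on a small dataset is an association and not evidence of a cause.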

Conclusions

The postoperative complications of pediatric cataract patients can be predicted, and the factors related to these complications have been identified. The association rules concerning the complications can serve as a reference for doctors.
Metadata
Title
Prediction of postoperative complications of pediatric cataract patients using data mining
Authors
Kai Zhang
Xiyang Liu
Jiewei Jiang
Wangting Li
Shuai Wang
Lin Liu
Xiaojing Zhou
Liming Wang
Publication date
01-12-2019
Publisher
BioMed Central
Keyword
Cataract
Published in
Journal of Translational Medicine / Issue 1/2019
Electronic ISSN: 1479-5876
DOI
https://doi.org/10.1186/s12967-018-1758-2
