BMC Medical Informatics and Decision Making

Open Access 01-12-2019 | Biomarkers | Research article

Methods for a similarity measure for clinical attributes based on survival data analysis

Authors: Christian Karmen, Matthias Gietzelt, Petra Knaup-Gregori, Matthias Ganzinger

Published in: BMC Medical Informatics and Decision Making | Issue 1/2019

Abstract

Background

Case-based reasoning is a proven method that draws on previously learned cases to support decisions about a new case. The accuracy of such a system depends on the applied similarity measure, which quantifies the similarity between two cases. This work proposes a collection of methods for similarity measures designed specifically for comparing clinical cases on the basis of survival data, such as those available from clinical trials.

Methods

Our approach is intended for scenarios in which longitudinal data, such as survival data, are to be used for case-based reasoning. This may be especially important where there is uncertainty about the ideal therapy decision. The collection of methods consists of definitions of the local similarity of nominal and numeric attributes, a calculation of attribute weights, a feature selection method, and finally a global similarity measure. All of them use survival time (consisting of survival status and overall survival) as the reference for similarity. As a baseline, we calculate a survival function for each value of any given clinical attribute.

Results

We define the similarity between values of the same attribute by relating their estimated survival functions to each other, and we quantify it by determining the area between the corresponding survival curves. The proposed global similarity measure is designed especially for cases from randomized clinical trials or other collections of clinical data with survival information. Overall survival can be considered an eligible alternative basis for similarity calculations. It is especially useful when similarity measures that depend on the classic solution-describing attribute “applied therapy” are not applicable, which is often the case for data from clinical trials containing randomized arms.
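The idea of comparing two attribute values through the area between their survival curves can be illustrated with a minimal sketch. The abstract does not state which estimator, time horizon, or normalisation the authors use, so the following are assumptions made only for illustration: a Kaplan-Meier estimate per attribute value, integration of the absolute difference up to a fixed horizon, and a similarity defined as one minus the area divided by the horizon. The function names (kaplan_meier, area_between, local_similarity) are hypothetical and not taken from the paper or its software.

```python
from typing import List, Sequence, Tuple

Step = Tuple[float, float]  # (time, survival probability from that time on)


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Step]:
    """Kaplan-Meier estimate as a right-continuous step function.

    events[i] is 1 if the event (death) was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    steps: List[Step] = [(0.0, 1.0)]
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            steps.append((t, surv))
        at_risk -= removed
    return steps


def evaluate(steps: List[Step], t: float) -> float:
    """Value of the step function at time t."""
    value = 1.0
    for time, s in steps:
        if time <= t:
            value = s
        else:
            break
    return value


def area_between(a: List[Step], b: List[Step], horizon: float) -> float:
    """Integral of |S_a(t) - S_b(t)| over [0, horizon]."""
    cuts = sorted({0.0, horizon} | {t for t, _ in a + b if 0.0 < t < horizon})
    area = 0.0
    for left, right in zip(cuts, cuts[1:]):
        # Both curves are constant on [left, right).
        area += abs(evaluate(a, left) - evaluate(b, left)) * (right - left)
    return area


def local_similarity(a: List[Step], b: List[Step], horizon: float) -> float:
    """Assumed normalisation: 1 means identical curves, 0 means maximally apart."""
    return 1.0 - area_between(a, b, horizon) / horizon


# Toy data (invented): survival times for two values of one attribute.
curve_a = kaplan_meier([1, 2, 3, 4], [1, 1, 1, 1])
curve_b = kaplan_meier([2, 4, 6, 8], [1, 1, 1, 1])
similarity = local_similarity(curve_a, curve_b, horizon=8.0)
```

With these toy inputs the area between the two curves is 2.5 over a horizon of 8, so the sketch yields a similarity of about 0.69. Other normalisations would be equally defensible.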

Conclusions

In silico evaluation scenarios showed that the mean accuracy of biomarker detection among the k = 10 most similar cases is higher (0.909–0.998) than for competing similarity measures, such as the Heterogeneous Euclidean-Overlap Metric (0.657–0.831) and the Discretized Value Difference Metric (0.535–0.671). The weight calculation method assigned biomarker attributes a weight more than six times (6.59–6.95) higher than that of non-biomarker attributes. These results suggest that the similarity measure described here is suitable for applications based on survival data.

In total, this leads to 100,000 single results considered for biomarker classification.

Title: Methods for a similarity measure for clinical attributes based on survival data analysis
Authors: Christian Karmen, Matthias Gietzelt, Petra Knaup-Gregori, Matthias Ganzinger
Publication date: 01-12-2019
Publisher: BioMed Central
Keyword: Biomarkers
Published in: BMC Medical Informatics and Decision Making / Issue 1/2019
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/s12911-019-0917-6
