Published in: BMC Medical Informatics and Decision Making 1/2012

Open Access 01-12-2012 | Research article

Measuring diversity in medical reports based on categorized attributes and international classification systems

Authors: Petra Přečková, Jana Zvárová, Karel Zvára

Abstract

Background

Narrative medical reports do not use standardized terminology and often provide insufficient information for statistical processing and medical decision making. The objectives of this paper are to propose a method for measuring diversity in medical reports written in any language, to compare diversity in narrative and structured medical reports, and to map attributes and terms to selected classification systems.

Methods

A new method based on the general concept of f-diversity is proposed for measuring the diversity of medical reports in any language. The method is based on categorized attributes recorded in narrative or structured medical reports and on international classification systems. Values of categories are expressed by terms. Using SNOMED CT and ICD-10, we map attributes and terms to predefined codes. We use f-diversities of the Gini-Simpson and Number of Categories types to compare the diversity of narrative and structured medical reports. The comparison is based on attributes selected from the Minimal Data Model for Cardiology (MDMC).
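
As a rough illustration of the two diversity types named above, the following Python sketch computes a Gini-Simpson index and a category count for one categorized attribute. It assumes the standard Gini-Simpson form (one minus the sum of squared category proportions) and normalizes the relative version by its maximum for k categories, 1 - 1/k; the abstract does not give the paper's exact f-diversity definitions, so the function names, the normalization, and the toy "smoking" values are all hypothetical.

```python
from collections import Counter


def category_proportions(values):
    """Return the observed proportion of each category (term)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def gini_simpson(values):
    """Gini-Simpson diversity: 1 - sum of squared category proportions."""
    return 1.0 - sum(p * p for p in category_proportions(values).values())


def number_of_categories(values):
    """Number of distinct categories observed for the attribute."""
    return len(set(values))


def relative_gini_simpson(values, k=None):
    """Gini-Simpson diversity divided by its maximum for k categories.

    Assumption: k defaults to the number of observed categories.
    """
    k = number_of_categories(values) if k is None else k
    if k < 2:
        return 0.0  # a single category carries no diversity
    return gini_simpson(values) / (1.0 - 1.0 / k)


# Hypothetical values of one attribute in six reports of each kind.
narrative = ["smoker", "non-smoker", "ex-smoker",
             "does not smoke", "smoker", "quit smoking"]
structured = ["smoker", "non-smoker", "non-smoker",
              "ex-smoker", "smoker", "non-smoker"]

for name, values in (("narrative", narrative), ("structured", structured)):
    print(name, number_of_categories(values), round(gini_simpson(values), 3))
```

In this toy data the free-text column uses more distinct terms for the same attribute, so both measures come out higher for it than for the structured column.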

Results

We compared the diversity of 110 Czech narrative medical reports and 1119 Czech structured medical reports. The selected categorized attributes of the MDMC mostly had different numbers of categories and used different terms in narrative and structured reports. We found more than 60% of MDMC attributes in SNOMED CT. We showed that attributes in narrative medical reports had greater diversity than the same attributes in structured medical reports. Further, we replaced each category value (term) used for attributes in narrative medical reports with the closest term and category used in the MDMC for structured medical reports. We found that relative Gini-Simpson diversities in structured medical reports were significantly smaller than those in narrative medical reports, except for the "Allergy" attribute.
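
The term-replacement step can be pictured with a small Python sketch. It assumes a hand-written lookup table from free-text terms to the closest structured category; the table, the "alcohol consumption" attribute, the category names, and the fallback label are all hypothetical, and the abstract does not say how the authors chose the closest term.

```python
from collections import Counter

# Hypothetical lookup: narrative term -> closest structured category.
CLOSEST_CATEGORY = {
    "abstinent": "none",
    "does not drink": "none",
    "occasionally": "occasional",
    "socially": "occasional",
    "daily": "regular",
    "every day": "regular",
}


def map_to_structured(terms, lookup, fallback="unmapped"):
    """Replace each narrative term with its closest structured category.

    Terms missing from the lookup are kept visible under a fallback label
    rather than silently dropped.
    """
    return [lookup.get(term.strip().lower(), fallback) for term in terms]


# Hypothetical narrative values of one attribute.
narrative_terms = ["Abstinent", "socially", "every day",
                   "does not drink", "occasionally", "daily"]
mapped = map_to_structured(narrative_terms, CLOSEST_CATEGORY)

print(Counter(narrative_terms))  # distinct free-text terms
print(Counter(mapped))           # categories after mapping
```

After mapping, the narrative values share the category set of the structured reports, which is what makes a like-for-like comparison of relative diversities possible.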

Conclusions

Terminology in narrative medical reports is not standardized. Therefore, it is nearly impossible to map attribute values (terms) to codes of known classification systems. The high diversity of terminology in narrative medical reports makes computer processing more difficult than for structured medical reports, and some information may be lost in the process. A standardized terminology would give healthcare providers complete and easily accessible information about patients, which would result in better healthcare.
Metadata

Title: Measuring diversity in medical reports based on categorized attributes and international classification systems
Authors: Petra Přečková, Jana Zvárová, Karel Zvára
Publication date: 01-12-2012
Publisher: BioMed Central
Published in: BMC Medical Informatics and Decision Making, Issue 1/2012
Electronic ISSN: 1472-6947
DOI: https://doi.org/10.1186/1472-6947-12-31
