Medical coding classification by leveraging inter-code relationships

Authors:
Yan Yan, Northeastern University, Boston, MA, USA
Glenn Fung, Siemens Healthcare, Malvern, PA, USA
Jennifer G. Dy, Northeastern University, Boston, MA, USA
Romer Rosales, Siemens Healthcare, Malvern, PA, USA

KDD '10: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, July 2010, Pages 193–202
https://doi.org/10.1145/1835804.1835831

Published: 25 July 2010

ABSTRACT

Medical coding, or classification, is the process of transforming information contained in patient medical records into standard predefined medical codes. Several coding conventions for diagnoses and medical procedures are accepted worldwide; in the United States, however, the Ninth Revision of the International Classification of Diseases (ICD-9) provides the standard for coding clinical records. Accurate medical coding is important because hospitals use it for insurance billing. Because a patient can be assigned several ICD-9 codes after discharge, the coding problem can be treated as a multi-label classification problem. In this paper, we introduce a multi-label large-margin classifier that automatically learns the underlying inter-code structure and allows the controlled incorporation of prior knowledge about medical code relationships. In addition to refining and learning the code relationships, our classifier uses this shared information to improve its performance. Experiments on a publicly available dataset of clinical free text and associated medical codes show that the proposed classifier outperforms related multi-label models on this problem.
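
As a rough illustration of the multi-label framing described in the abstract, the sketch below encodes each record's set of ICD-9 codes as a binary indicator row and counts how often pairs of codes appear together. It is not the paper's large-margin classifier: the records are hypothetical toy data, the helper names `build_label_matrix` and `cooccurrence_counts` are invented for this example, and raw co-occurrence counting is only a crude stand-in for the inter-code structure that the paper's model learns.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (an assumption for illustration only):
# each discharged patient record is associated with a set of ICD-9 codes.
records = [
    {"428.0", "401.9"},
    {"428.0", "401.9", "250.00"},
    {"250.00"},
    {"401.9"},
]


def build_label_matrix(records):
    """Return (sorted code list, binary indicator rows), one row per record."""
    codes = sorted({code for record in records for code in record})
    rows = [[1 if code in record else 0 for code in codes] for record in records]
    return codes, rows


def cooccurrence_counts(records):
    """Count how often each unordered pair of codes shares a record."""
    counts = Counter()
    for record in records:
        for a, b in combinations(sorted(record), 2):
            counts[(a, b)] += 1
    return counts


codes, Y = build_label_matrix(records)
pairs = cooccurrence_counts(records)

for code_pair, n in sorted(pairs.items()):
    print(code_pair, n)
```

Each row of `Y` is one multi-label target, so a record can carry several codes at once. Pairs with high counts in `pairs` are the kind of code relationship that a classifier could, in principle, exploit or refine.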

Index Terms

Computing methodologies → Machine learning

Published in
KDD '10: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
July 2010, 1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804
General Chairs: Bharat Rao (Siemens), Balaji Krishnapuram (Siemens)
Program Chairs: Andrew Tomkins (Google Inc.), Qiang Yang (Hong Kong University of Science and Technology)
Publisher: Association for Computing Machinery, New York, NY, United States
Copyright © 2010 ACM

Author Tags
L1 regularization
classification
large margin
medical coding
medical data mining
multi-label classification