DOI: 10.1145/1835804.1835831
research-article

Medical coding classification by leveraging inter-code relationships

Published: 25 July 2010

ABSTRACT

Medical coding, or classification, is the process of transforming the information contained in patient medical records into standard predefined medical codes. Several coding conventions for diagnoses and medical procedures are accepted worldwide; in the United States, however, the Ninth Revision of the International Classification of Diseases (ICD-9) provides the standard for coding clinical records. Accurate medical coding is important because hospitals use the codes for insurance billing. Because a patient can be assigned several ICD-9 codes after discharge, the coding problem can be seen as a multi-label classification problem. In this paper, we introduce a multi-label large-margin classifier that automatically learns the underlying inter-code structure and allows the controlled incorporation of prior knowledge about medical code relationships. In addition to refining and learning the code relationships, our classifier can also use this shared information to improve its performance. Experiments on a publicly available dataset of clinical free text and associated medical codes showed that our proposed multi-label classifier outperforms related multi-label models on this problem.
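To make the multi-label framing concrete, the following is a minimal sketch, not the classifier proposed in the paper. It assumes a handful of made-up discharge notes (the `records` list is hypothetical toy data) and shows how a set of ICD-9 codes per record becomes a binary label matrix, and how simple code co-occurrence counts could serve as one crude form of prior knowledge about inter-code relationships. The helper names `build_label_matrix` and `cooccurrence` are illustrative only.

```python
from itertools import combinations

# Hypothetical toy data: each discharge note is paired with a SET of ICD-9 codes,
# which is what makes this a multi-label (not multi-class) problem.
records = [
    ("chest pain with elevated troponin", {"410.9", "786.50"}),
    ("type 2 diabetes with hypertension", {"250.00", "401.9"}),
    ("hypertension and chest pain", {"401.9", "786.50"}),
]


def build_label_matrix(records):
    """Return (sorted code list, binary matrix) with one row per record."""
    codes = sorted({code for _, labels in records for code in labels})
    index = {code: j for j, code in enumerate(codes)}
    matrix = [[0] * len(codes) for _ in records]
    for i, (_, labels) in enumerate(records):
        for code in labels:
            matrix[i][index[code]] = 1
    return codes, matrix


def cooccurrence(codes, matrix):
    """Count how often each pair of codes is assigned to the same record."""
    counts = {}
    for row in matrix:
        present = [codes[j] for j, flag in enumerate(row) if flag]
        for a, b in combinations(present, 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


codes, Y = build_label_matrix(records)
pairs = cooccurrence(codes, Y)
```

Here each column of `Y` corresponds to one code and could be handed to a separate binary classifier; the counts in `pairs` are only a stand-in for the learned inter-code structure described in the abstract.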

Supplemental Material

kdd2010_yan_mccl_01.mov (mov, 89.1 MB)

Published in

KDD '10: Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
July 2010
1240 pages
ISBN: 9781450300551
DOI: 10.1145/1835804

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
