Published in: Knee Surgery, Sports Traumatology, Arthroscopy 4/2023

07-12-2022 | Artificial Intelligence | Review

Natural language processing: using artificial intelligence to understand human language in orthopedics

Authors: James A. Pruneski, Ayoosh Pareek, Benedict U. Nwachukwu, R. Kyle Martin, Bryan T. Kelly, Jón Karlsson, Andrew D. Pearle, Ata M. Kiapour, Riley J. Williams III

Abstract

Natural language processing (NLP) is the broad field of artificial intelligence in which computers are trained to understand and generate human language. Within healthcare research, NLP is commonly used for variable extraction and for classification/cohort identification tasks. While these tools are becoming increasingly popular and available as both open-source and commercial products, there is a paucity of literature in orthopedics describing the key tasks within these powerful pipelines. Curation and navigation of the electronic medical record are becoming increasingly onerous, and it is important for physicians and other healthcare professionals to understand potential methods of harnessing this large data resource. The purpose of this study is to provide an overview of the tasks required to develop an NLP pipeline for orthopedic research and to present recent examples of successful implementations.
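The two task types named in the abstract, variable extraction and classification, can be illustrated with a small rule-based sketch. This is not the authors' pipeline. The note text, the function names, and the negation cue list are all hypothetical and chosen only for illustration, and a real clinical system would need far more robust rules or a trained model.

```python
import re

# Hypothetical note text, invented purely for illustration (not real patient data).
NOTE = (
    "Procedure: Left total knee arthroplasty. "
    "Patient denies fever. Estimated blood loss 150 mL. "
    "No evidence of wound infection."
)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_laterality(text):
    """Variable extraction: return 'left' or 'right' if mentioned, else None."""
    match = re.search(r"\b(left|right)\b", text, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


def extract_blood_loss_ml(text):
    """Variable extraction: return the blood loss in mL as an int, else None."""
    match = re.search(r"blood loss\s+(\d+)\s*mL", text, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


def mentions_infection(text):
    """Classification: flag a note when a sentence mentions 'infection'
    and contains none of a few simple negation cues (an assumed, minimal list)."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        negated = re.search(r"\b(no|denies|without)\b", lowered)
        if "infection" in lowered and not negated:
            return True
    return False


if __name__ == "__main__":
    print(tokenize("Left knee"))
    print(extract_laterality(NOTE))
    print(extract_blood_loss_ml(NOTE))
    print(mentions_infection(NOTE))
```

Rules like these are easy to inspect but brittle. They only see negation cues that share a sentence with the keyword and that appear in the cue list, which is one reason machine-learning approaches are often compared against rule-based ones.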
Metadata
Title: Natural language processing: using artificial intelligence to understand human language in orthopedics
Authors: James A. Pruneski, Ayoosh Pareek, Benedict U. Nwachukwu, R. Kyle Martin, Bryan T. Kelly, Jón Karlsson, Andrew D. Pearle, Ata M. Kiapour, Riley J. Williams III
Publication date: 07-12-2022
Publisher: Springer Berlin Heidelberg
Published in: Knee Surgery, Sports Traumatology, Arthroscopy | Issue 4/2023
Print ISSN: 0942-2056
Electronic ISSN: 1433-7347
DOI: https://doi.org/10.1007/s00167-022-07272-0
