26-12-2023

Using Natural Language Processing to Identify Stigmatizing Language in Labor and Birth Clinical Notes

Authors: Veronica Barcelona, Danielle Scharp, Hans Moen, Anahita Davoudi, Betina R. Idnay, Kenrick Cato, Maxim Topaz

Published in: Maternal and Child Health Journal | Issue 3/2024

Abstract

Introduction

Stigma and bias related to race and other minoritized statuses may underlie disparities in pregnancy and birth outcomes. One emerging method to identify bias is the study of stigmatizing language in the electronic health record. The objective of our study was to develop natural language processing (NLP) methods that automatically and accurately identify two types of stigmatizing language in labor and birth notes: marginalizing language and its complement, power/privilege language.

Methods

We analyzed notes for all birthing people > 20 weeks’ gestation admitted for labor and birth at two hospitals during 2017. We preprocessed the note text, used TF-IDF values as input features, and tested machine learning classification algorithms to identify marginalizing and power/privilege language in clinical notes. The algorithms assessed were Decision Trees, Random Forest, and Support Vector Machines. Additionally, we applied a feature importance evaluation method (InfoGain) to identify words that are highly correlated with these language categories.
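
To make the two feature-level ideas named above concrete, the following is a minimal sketch of TF-IDF weighting and information gain in plain Python. It is not the authors' pipeline: the toy sentences, the labels, the whitespace tokenization, the particular TF-IDF variant (relative term frequency times the natural log of inverse document frequency), and the helper names `tf_idf`, `entropy`, and `info_gain` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def info_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` is present."""
    present = [y for doc, y in zip(docs, labels) if term in doc]
    absent = [y for doc, y in zip(docs, labels) if term not in doc]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


# Invented toy sentences and labels, not real clinical text.
docs = [s.split() for s in [
    "patient refused exam",
    "patient tolerated exam well",
    "patient refused medication",
    "patient resting comfortably",
]]
labels = [1, 0, 1, 0]  # 1 = hypothetical "marginalizing" label

weights = tf_idf(docs)
vocabulary = {term for doc in docs for term in doc}
ranked = sorted(vocabulary,
                key=lambda t: info_gain(docs, labels, t),
                reverse=True)
```

In this toy data, a word that appears in every sentence gets a TF-IDF weight of zero and an information gain of zero, while a word that appears only in the positively labeled sentences ranks first in `ranked`. In a full workflow, the TF-IDF weights would be passed as input features to a classifier such as a decision tree or support vector machine.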

Results

For marginalizing language, Decision Trees yielded the best classification, with an F-score of 0.73. For power/privilege language, Support Vector Machines performed best, achieving an F-score of 0.91. These results indicate that the selected machine learning methods can effectively classify both language categories in clinical notes.
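
For readers unfamiliar with the metric, the sketch below shows how an F-score can be computed from predicted and true labels. The abstract does not state which F-score variant was reported; this example assumes the common F1 form (the harmonic mean of precision and recall). The labels and the function name `precision_recall_f1` are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels: 1 = stigmatizing language present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here two of the three predicted positives are correct and two of the three true positives are found, so precision, recall, and F1 all equal 2/3.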

Conclusion

We identified well-performing machine learning methods to automatically detect stigmatizing language in clinical notes. To our knowledge, this is the first study to use NLP performance metrics to evaluate how well machine learning methods discern stigmatizing language. Future studies should further refine and evaluate NLP methods, incorporating recent algorithms based on deep learning.
Metadata
Publisher
Springer US
Print ISSN: 1092-7875
Electronic ISSN: 1573-6628
DOI
https://doi.org/10.1007/s10995-023-03857-4