Published in: Aesthetic Plastic Surgery 5/2024

25-01-2024 | Artificial Intelligence | Original Articles

Complications Following Body Contouring: Performance Validation of Bard, a Novel AI Large Language Model, in Triaging and Managing Postoperative Patient Concerns

Authors: Jad Abi-Rafeh, Vanessa J. Mroueh, Brian Bassiri-Tehrani, Jacob Marks, Roy Kazan, Foad Nahai

Abstract

Introduction

Large language models (LLMs) have revolutionized the way humans interact with artificial intelligence (AI) technology, with marked potential for applications in esthetic surgery. The present study evaluates the performance of Bard, a novel LLM, in identifying and managing postoperative patient concerns for complications following body contouring surgery.

Methods

The American Society of Plastic Surgeons’ website was queried to identify and simulate all potential postoperative complications following body contouring across different acuities and severities. Bard’s accuracy was assessed in providing a differential diagnosis, soliciting a history, suggesting a most-likely diagnosis, recommending an appropriate disposition, suggesting treatments/interventions to begin from home, and identifying red-flag signs/symptoms indicating deterioration or requiring urgent emergency department (ED) presentation.

Results

Twenty-two simulated body contouring complications were examined. Overall, Bard demonstrated a 59% accuracy in listing relevant diagnoses on its differentials, with a 52% incidence of incorrect or misleading diagnoses. Following history-taking, Bard demonstrated an overall accuracy of 44% in identifying the most-likely diagnosis, and a 55% accuracy in suggesting the indicated medical dispositions. Helpful treatments/interventions to begin from home were suggested with a 40% accuracy, whereas red-flag signs/symptoms indicating deterioration were shared with a 48% accuracy. A detailed analysis of performance, stratified according to latency of postoperative presentation (<48 hours, 48 hours to 1 month, or >1 month postoperatively), and according to acuity and indicated medical disposition, is presented herein.

Conclusions

Despite the promising potential of LLMs and AI in healthcare-related applications, Bard’s performance in the present study falls significantly short of accepted clinical standards, indicating a need for further research and development prior to adoption.

Level of Evidence IV

This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266.
Metadata
Title
Complications Following Body Contouring: Performance Validation of Bard, a Novel AI Large Language Model, in Triaging and Managing Postoperative Patient Concerns
Authors
Jad Abi-Rafeh
Vanessa J. Mroueh
Brian Bassiri-Tehrani
Jacob Marks
Roy Kazan
Foad Nahai
Publication date
25-01-2024
Publisher
Springer US
Published in
Aesthetic Plastic Surgery / Issue 5/2024
Print ISSN: 0364-216X
Electronic ISSN: 1432-5241
DOI
https://doi.org/10.1007/s00266-023-03819-9
