Open Access 05-03-2025 | Artificial Intelligence | Original Article

Assessing the performance of large language models (GPT-3.5 and GPT-4) and accurate clinical information for pediatric nephrology

Author: Nadide Melike Sav

Published in: Pediatric Nephrology

Abstract

Background

Artificial intelligence (AI) has emerged as a transformative tool in healthcare, offering significant advancements in providing accurate clinical information. However, the performance and applicability of AI models in specialized fields such as pediatric nephrology remain underexplored. This study aimed to evaluate the ability of two AI-based language models, GPT-3.5 and GPT-4, to provide accurate and reliable clinical information in pediatric nephrology. The models were evaluated on four criteria: accuracy, scope, patient friendliness, and clinical applicability.

Methods

Forty pediatric nephrology specialists with ≥ 5 years of experience rated GPT-3.5 and GPT-4 responses to 10 clinical questions using a 1–5 scale via Google Forms. Ethical approval was obtained, and informed consent was secured from all participants.

Results

Both GPT-3.5 and GPT-4 demonstrated comparable performance across all criteria, with no statistically significant differences observed (p > 0.05). GPT-4 exhibited slightly higher mean scores in all parameters, but the differences were negligible (Cohen’s d < 0.1 for all criteria). Reliability analysis revealed low internal consistency for both models (Cronbach’s alpha ranged between 0.019 and 0.162). Correlation analysis indicated no significant relationship between participants’ years of professional experience and their evaluations of GPT-3.5 (correlation coefficients ranged from −0.026 to 0.074).
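
For readers unfamiliar with the two statistics reported above, the following is a minimal sketch of how they are commonly computed. The abstract does not state which formulas or software were used, so this assumes the pooled-standard-deviation form of Cohen's d and the standard variance-based form of Cronbach's alpha. The function names and the ratings in the demo are hypothetical and are not data from the study.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d (assumed pooled-SD form) for two lists of ratings."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / pooled_var ** 0.5


def cronbach_alpha(items):
    """Cronbach's alpha.

    `items` is a list of k lists; each inner list holds one item's scores,
    with one score per rater, in the same rater order.
    """
    k = len(items)
    # Total score per rater, summed across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


if __name__ == "__main__":
    # Hypothetical 1-5 ratings from five raters, for illustration only.
    model_a = [3, 4, 4, 3, 5]
    model_b = [4, 4, 3, 4, 5]
    print("Cohen's d:", cohens_d(model_a, model_b))

    # Hypothetical scores on three items from the same five raters.
    item_scores = [
        [3, 4, 4, 3, 5],
        [4, 3, 5, 2, 4],
        [2, 5, 3, 4, 3],
    ]
    print("Cronbach's alpha:", cronbach_alpha(item_scores))
```

By Cohen's conventional benchmarks, d = 0.2 marks a small effect, so values below 0.1 indicate almost no separation between the two models' ratings. An alpha near zero means the items barely covary across raters; values of about 0.7 or higher are usually taken as acceptable internal consistency.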

Conclusions

While GPT-3.5 and GPT-4 provided a foundational level of clinical information support, neither model exhibited superior performance in addressing the unique challenges of pediatric nephrology. The findings highlight the need for domain-specific training and integration of updated clinical guidelines to enhance the applicability and reliability of AI models in specialized fields. This study underscores the potential of AI in pediatric nephrology while emphasizing the importance of human oversight and the need for further refinements in AI applications.

Metadata
Title: Assessing the performance of large language models (GPT-3.5 and GPT-4) and accurate clinical information for pediatric nephrology
Author: Nadide Melike Sav
Publication date: 05-03-2025
Publisher: Springer Berlin Heidelberg
Published in: Pediatric Nephrology
Print ISSN: 0931-041X
Electronic ISSN: 1432-198X
DOI: https://doi.org/10.1007/s00467-025-06723-3
