2nd International Congress on Sport Sciences and Interdisciplinary Research

Can AI chatbots provide reliable advice on creatine?

Paper ID : 1470-SPORTCONGRESS

Authors

Mahboobeh Borjian Fard *¹, Abolfazl Sabagh Tajanaki²

¹Assistant Professor

²University of Tehran

Abstract

Introduction: Large language models (LLMs) are increasingly used as accessible sources of sports nutrition information. However, concerns persist about their accuracy and potential biases, particularly for specialized topics such as creatine supplementation, where misinformation could lead to adverse health consequences. This study critically evaluated how well four prominent LLMs (GPT-4o, Claude 3.5 Sonnet, DeepSeek, and Gemini 1.5 Pro) provide evidence-based responses to common creatine-related queries.
Methods: Five key questions on creatine were selected. For each, a gold-standard evidence-based answer was established using Q1 journal articles (1-3) and authoritative position stands (e.g., ISSN). The models' responses were evaluated by independent raters against predefined criteria: accuracy, completeness, clarity, quality of evidence, and generalization bias. Both quantitative and qualitative analyses were performed, and the data were analyzed using Kruskal-Wallis and Fisher's exact tests.
Results: Although no statistically significant differences were found between the models in accuracy, completeness, clarity, or quality of evidence (p > 0.05), descriptive analysis revealed practical distinctions. GPT-4o and Gemini 1.5 Pro demonstrated the highest accuracy and completeness, with mean scores of 4.60/5. Regarding bias, DeepSeek exhibited the highest rate of unwarranted generalizations, with 60% temporal bias and 40% action-guiding bias. In contrast, GPT-4o showed the most stable and balanced performance with the least bias.
Conclusion: The findings indicate that although advanced LLMs are statistically comparable in their mean scores, their functional performance patterns and inherent biases can have significant clinical and practical implications for athletes and professionals. Users must be aware of the specific capabilities and limitations of each model.
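The two tests named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the rating lists are hypothetical placeholders rather than study data, the function names are invented for illustration, and the 2x2 layout for Fisher's exact test is an assumption (the abstract does not state how the contingency tables were built). The count of 3 biased responses out of 5 assumes that DeepSeek's 60% temporal bias refers to the five questions; the comparison row of 0 out of 5 is purely hypothetical.

```python
from collections import Counter
from math import comb


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Give each distinct value the average of the ranks it occupies (midrank).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    # (Undefined when every observation is identical.)
    ties = sum(t**3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n**3 - n))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Hypothetical 1-5 accuracy ratings for four models on five questions.
placeholder_scores = {
    "model_a": [5, 5, 4, 5, 4],
    "model_b": [4, 5, 4, 4, 5],
    "model_c": [4, 4, 3, 5, 4],
    "model_d": [5, 4, 5, 5, 4],
}
h_stat = kruskal_wallis_h(list(placeholder_scores.values()))
# With 4 groups (3 degrees of freedom), the chi-square critical value
# at alpha = 0.05 is 7.815.
print(f"H = {h_stat:.3f}, exceeds 7.815: {h_stat > 7.815}")

# Assumed layout: rows = models, columns = (biased, not biased) responses.
p_value = fisher_exact_2x2(3, 2, 0, 5)
print(f"Fisher exact p = {p_value:.3f}")
```

With only five questions per model, both tests have little power, which is consistent with sizeable descriptive differences that do not reach statistical significance.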

Keywords

Artificial Intelligence, Creatine Supplementation, Large Language Models

Status: Abstract Accepted (Poster Presentation)