A study took a structured approach, submitting typical gastroenterology questions to ChatGPT 4.0 and Google Bard. Independent reviewers rated the responses on a Likert scale and cross-referenced them against guidelines from authoritative gastroenterology bodies. Statistical analysis, including the Mann-Whitney U test, was then used to assess whether the differences in ratings were significant.
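A comparison like this can be sketched in a few lines. The snippet below is illustrative only, not the study's actual code or data: the Likert ratings are hypothetical, and it simply shows how two sets of reviewer scores would be compared with the Mann-Whitney U test described above.

```python
# Hypothetical sketch: comparing two sets of reviewer Likert ratings (1-5)
# with the Mann-Whitney U test. These ratings are invented for illustration
# and do not come from the study.
from scipy.stats import mannwhitneyu

chatgpt_ratings = [5, 4, 5, 4, 5, 3, 5, 4, 4, 5]  # hypothetical scores
bard_ratings = [3, 4, 3, 2, 4, 3, 3, 4, 2, 3]     # hypothetical scores

# Two-sided test of whether the two rating distributions differ.
stat, p_value = mannwhitneyu(chatgpt_ratings, bard_ratings,
                             alternative="two-sided")
print(f"U = {stat}, p = {p_value:.4f}")
```

The Mann-Whitney U test is a natural choice here because Likert ratings are ordinal, so a rank-based test avoids assuming the scores are normally distributed interval data.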
The results showed that ChatGPT 4.0 produced more reliable and accurate responses than Google Bard, as indicated by higher mean ratings and statistically significant differences in hypothesis testing. However, the study noted limitations in the structure of its data, such as the inability to conduct detailed correlation analysis.
In conclusion, the study found that ChatGPT 4.0 outperforms Google Bard in providing reliable and accurate responses to gastroenterology-related queries, underscoring the potential of AI tools like ChatGPT to enhance healthcare delivery. The study also highlights the need for broader and more diverse assessment of AI capabilities in healthcare to fully leverage their potential in clinical practice.