A study published in Scientific Reports evaluated ChatGPT’s proficiency in answering questions about colorectal cancer (CRC). Conducted by a team from the University of Texas at Arlington, the study used a CRC reference book to test ChatGPT (GPT-3.5) against expert answers to 131 CRC-related questions covering topics such as surgical management, radiation therapy, and pain control. ChatGPT’s responses were scored by clinical physicians specializing in CRC.
The evaluation found that while ChatGPT showed high accuracy and reproducibility, its answers lacked comprehensiveness in several areas, notably surgical management, basic information, and internal medicine. The authors suggest that updating AI models with more domain-specific data could improve the depth and breadth of their responses. Despite strong performance in some areas, ChatGPT’s underperformance relative to expert knowledge indicates that models like it are not yet ready for clinical deployment. The study highlights both the potential and the limitations of ChatGPT in the medical field, particularly for CRC questions, and suggests that future versions could support more accurate diagnosis and earlier treatment of CRC.