AI Chatbots Surpass Oncologists in Quality, Empathy, and Readability of Responses to Patient Questions About Cancer
Artificial intelligence chatbots have demonstrated superior performance compared to oncologists in providing accurate, empathetic, and readable responses to patient questions about cancer, according to a study published in JAMA Oncology. The research, which evaluated responses from three AI chatbots and six oncologists, highlights significant differences in overall quality, medical accuracy, completeness, and focus, suggesting potential implications for integrating AI into clinical practice.
Key Points:
- Study Overview: Researchers tested three AI chatbots (ChatGPT-3.5, ChatGPT-4, and Claude AI) and compared their responses to 200 patient questions about cancer against those from six oncologists.
- Evaluation Metrics: Responses were rated on overall quality, empathy, and readability on a scale from 1 (very poor) to 5 (very good) by two teams of attending oncology specialists.
- Superior Performance: All three chatbots outperformed oncologists in overall quality, empathy, and readability (P <.05 for all).
- Medical Accuracy: Claude AI and ChatGPT-4 provided significantly more accurate responses than oncologists (P <.05 for both).
- Completeness and Focus: Responses from all three chatbots were rated higher for completeness and focus compared to those from oncologists (P <.05 for all).
- Quality Scores:
  - Oncologists: 3.00 (95% CI, 2.91-3.09)
  - ChatGPT-3.5: 3.25 (95% CI, 3.18-3.32)
  - ChatGPT-4: 3.52 (95% CI, 3.46-3.59)
  - Claude AI: 3.56 (95% CI, 3.48-3.63)
- Empathy Scores:
  - Oncologists: 2.43 (95% CI, 2.32-2.53)
  - ChatGPT-3.5: 3.17 (95% CI, 3.10-3.24)
  - ChatGPT-4: 3.28 (95% CI, 3.20-3.36)
  - Claude AI: 3.62 (95% CI, 3.53-3.70)
- Readability Scores:
  - Oncologists: 3.07 (95% CI, 3.00-3.15)
  - ChatGPT-3.5: 3.42 (95% CI, 3.34-3.50)
  - ChatGPT-4: 3.77 (95% CI, 3.71-3.82)
  - Claude AI: 3.79 (95% CI, 3.72-3.87)
- Practical Implications: The study suggests potential benefits in integrating AI chatbots into clinical practice to enhance patient communication and support oncologists.
- Future Research: Further studies are needed to evaluate the implementation process, scope, and outcomes of chatbot-facilitated interactions in real-world clinical settings.
A review of seven studies of patient-initiated second opinions found that while most second opinions confirmed the initial diagnosis, between 10% and 60% of second opinions resulted in a change in diagnosis, management, or prognosis. (Support Care Cancer)