In a groundbreaking study, researchers have shown that large language models (LLMs), such as OpenAI’s GPT, can significantly enhance diagnostic reasoning in healthcare.
According to JAMA Network Open, the randomized clinical trial, led by Ethan Goh, MBBS, MS, and a team of doctors including Robert Gallo, MD, and Jason Hom, revealed that AI-powered tools can help physicians make faster, more accurate diagnoses, improving patient outcomes.
“A general mantra in health AI is that humans + AI > humans alone,” said Daniel Yang, a co-author of the study.
However, the researchers were surprised to find that the LLM alone outperformed both groups in the study in terms of diagnostic accuracy. The two groups were “Humans together with Conventional Resources” and “Humans together with Generative Artificial Intelligence.”
How was the study conducted?
Fifty physicians were recruited to review clinical vignettes based on patient cases and provide their diagnostic reasoning. Half of the physicians had access only to “conventional resources” such as the internet and standard support tools. The other half also had access to GenAI tools such as ChatGPT 4.
The researchers expected that the doctors armed with the LLM, ChatGPT, would outdo the doctors without it. They were wrong: the results showed that both groups performed the same in terms of diagnostic accuracy.
The performance of the LLM alone raised eyebrows: it scored higher than both physician groups.
“Are we going to be out of a job?” Sumant Ranji said, quoting a question that an audience member asked upon hearing the results of the study.
“Trained clinicians with a blank GPT4 prompt box and no prior training on prompt engineering is unlikely to get significant diagnostic value from the tool,” said Mr Yang.
The study stressed the need for careful training and clear guidelines to ensure that AI complements rather than supplants human expertise.
The researchers cautioned that the results should not be interpreted to mean that LLMs should be used for diagnosis without physician oversight. Rather, LLMs should be viewed and used as an aid that supports clinicians in making informed decisions and improving patient outcomes.
In Kenya, the use of AI in health is growing rapidly, creating room for more innovation and growth in the healthcare industry. Various applications have been developed to improve diagnostic accuracy and patient care, including Sophie Bot, M-TIBA, MYDAWA, ZuriHealth, iZola and Goodlife.
Sophie Bot is a Kenyan AI-powered chatbot developed to provide information on sexual and reproductive health. The chatbot uses artificial intelligence to offer users personalized health advice and answer sexual health queries through text-based conversations.
M-TIBA is a mobile platform that allows patients to save and manage funds for healthcare services. It uses AI to analyze health data, predict potential health risks for users, and provide personalized health services and remote healthcare access.
iZola focuses on providing virtual care, wellness advice, and health management. It uses AI-based algorithms to assess symptoms keyed in by users and provide initial diagnostic suggestions.
Goodlife uses AI in personal health assessments, supply chain optimization, customer service and virtual assistance. Its AI chatbots help users navigate the website and answer product-related queries, offering an enhanced customer experience.
MYDAWA, an e-health platform for medication purchases and consultations, uses AI for personalized medication recommendations and for supply chain and inventory optimization.
ZuriHealth, a telemedicine platform, leverages AI in consultations to help doctors quickly assess patient symptoms and suggest possible treatments, improving the accuracy and efficiency of remote consultations.
With the medical field evolving, the study offers a glimpse into a future where AI and human expertise work together to improve healthcare outcomes globally.