ChatGPT Medical Support Better Than Humans, Worries About Missing Human Touch Remain

Screens displaying the logos of OpenAI and ChatGPT, in Toulouse, southwestern France, on Jan. 23, 2023. (Lionel Bonaventure/AFP via Getty Images)
Naveen Athrappully
5/2/2023
Updated: 5/2/2023

AI chatbot ChatGPT’s responses to medical queries received higher ratings than responses from human physicians, according to a new study. However, researchers raised concerns that mechanizing such communication could take away the feeling of human support.

The study, published in JAMA Internal Medicine on April 28, involved researchers randomly selecting 195 questions from a public social media health forum to which a verified physician had responded. The same questions were then fed to ChatGPT, which generated its own responses. Researchers submitted the original questions, along with the randomly ordered responses from the verified physicians and ChatGPT, to a team of three licensed healthcare professionals. The trio compared the responses based on “the quality of information provided” and “the empathy or bedside manner provided.”

The three healthcare professionals preferred the responses of ChatGPT to those of the verified physician 78.6 percent of the time.

“ChatGPT messages responded with nuanced and accurate information that often addressed more aspects of the patient’s questions than physician responses,” said Jessica Kelley, a nurse practitioner with San Diego-based firm Human Longevity and a co-author of the study, according to an April 28 press release.

ChatGPT’s responses were rated “good” or “very good” 3.6 times more often than the physicians’ responses, and they received “empathetic” or “very empathetic” ratings 9.8 times more often.

In an interview with Fox News, Dr. John W. Ayers, vice chief of innovation in the Division of Infectious Diseases and Global Public Health at the University of California San Diego and an author of the study, raised concerns that people’s perception of medical support messages might change once they realize such messages can be generated by AI.

Ayers cited a study focused on mental health support that found that AI messages were preferred by people over human responses. “Interestingly, once the messages were disclosed as being written by AI, the support felt by the receiver of these messages disappeared,” he said.

“A worry I have is that in the future, people will not feel any support through a message, as patients may assume it will be written by AI.”

Study Limitations, Risks of Missed Diagnoses With AI

The study’s main limitation, its authors acknowledged, is that it used question-and-answer exchanges from an online forum, which may not reflect typical patient-physician conversations.

“We only studied responding to questions in isolation, whereas actual physicians may form answers based on established patient-physician relationships.”

“We do not know to what extent clinician responses incorporate this level of personalization, nor have we evaluated the chatbot’s ability to provide similar details extracted from the electronic health record.”

In an April 5 article published on Medium, Dr. Josh Tamayo-Sarver described the issues he discovered when he used ChatGPT to help diagnose his patients. “The results were fascinating, but also fairly disturbing,” he wrote.

As long as the material fed to ChatGPT was precise and highly detailed, the bot did a “decent job” of highlighting common diagnoses. For almost half of the patients, the correct diagnosis was among the six possible diagnoses ChatGPT suggested.

However, “a 50 percent success rate in the context of an emergency room is also not good,” Tamayo-Sarver wrote. In one case, ChatGPT failed to diagnose an ectopic pregnancy in a 21-year-old female patient, a condition in which the fetus develops in a fallopian tube rather than the uterus and which can be fatal if diagnosed late.

“My fear is that countless people are already using ChatGPT to medically diagnose themselves rather than see a physician. If my patient in this case had done that, ChatGPT’s response could have killed her,” Tamayo-Sarver wrote.