AI beats doctors on empathy and quality

3 minute read


Welcome to another instalment of ‘ChatGPT is coming for your job’.


How would you expect an AI to go at answering patients’ unsolicited medical questions compared with real-life, warm-blooded human doctors?

It’ll have reasonably accurate information but get some things wrong, you may think, and as for its bedside manner … come on, it’s by definition an unfeeling, heartless algorithm – there are some things you can’t program.

Boy, does this study have news for you.

The authors mined the subreddit r/AskDocs, a forum with around 474,000 members, for patients’ medical questions and the answers given by registered medical professionals.

They then plugged each question into a fresh session of OpenAI’s ChatGPT (version 3.5) and submitted both sets of answers (stripped of identifying information) to a panel of doctors working across six specialities, blinded to who had written what. Each pair was assessed by three members of the panel.
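(For the terminally curious, the querying step is close to trivial. Here’s a rough sketch in Python, assuming OpenAI’s current API client and the gpt-3.5-turbo model – the study actually used the ChatGPT web interface, so treat this as an approximation, not the authors’ pipeline.)

    # Sketch only: the study used the ChatGPT web interface, not the API.
    # The client and model name here are assumptions for illustration.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def answer_question(question: str) -> str:
        """One patient question per fresh session: no history is carried
        over between questions, mirroring the study's protocol."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": question}],
        )
        return response.choices[0].message.content

    questions = ["I stepped on a rusty nail; do I need a tetanus shot?"]
    ai_answers = [answer_question(q) for q in questions]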

The evaluators first judged which response was “better”, then rated each on a Likert scale for “the quality of information provided” (very poor, poor, acceptable, good or very good) and for “the empathy or bedside manner provided” (not empathetic, slightly empathetic, moderately empathetic, empathetic or very empathetic). The three ratings for each answer were averaged into a consensus score.
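(In code terms, that consensus step is just a label-to-number mapping plus an average. A minimal sketch – the 1-to-5 mapping is our assumption about how the ordinal labels were scored.)

    # Hypothetical illustration of consensus scoring: map each rater's
    # Likert label to 1-5, then average the three raters per answer.
    from statistics import mean

    QUALITY = {"very poor": 1, "poor": 2, "acceptable": 3,
               "good": 4, "very good": 5}
    EMPATHY = {"not empathetic": 1, "slightly empathetic": 2,
               "moderately empathetic": 3, "empathetic": 4,
               "very empathetic": 5}

    def consensus(ratings: list[str], scale: dict[str, int]) -> float:
        """Consensus score for one answer = mean of its three raters."""
        return mean(scale[r] for r in ratings)

    # Three evaluators rate one answer's quality:
    print(consensus(["good", "very good", "good"], QUALITY))  # ~4.33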

In nearly 80% of cases the evaluators preferred the AI’s answer to the human doctor’s.

On quality, they rated the humans’ answers 21% lower than ChatGPT’s, equating to an overall “good” score for the AI and “acceptable” for the humans.

The prevalence of sub-acceptable responses was an embarrassing 10 times higher for the doctors than for the chatbot (a mean of 27.2% vs 2.6%).

The proportion of good/very good responses was also much higher for the AI: 78.5% vs 22.1%.

Here’s the kicker: the human responses were rated 41% less empathetic than the bot’s. Responses rated empathetic or very empathetic were, again, 10 times more prevalent for the AI.

Yikes.

The AI also gave longer responses, with a mean of 180 words vs 52 words from the docs.

The authors of this paper deserve a medal for the most euphemistic framing of a study result.

They don’t say “an AI not only gives better answers, it’s also nicer than you – pull your socks up”.

They say the volume of electronic messaging has gone up 160% since the pandemic, and that additional messaging predicts additional doctor burnout, so wouldn’t it be great if AI could take on some of this burden for the sake of your mental health? For example, chatbots could draft responses for clinicians or support staff to edit.

Not only would it save clinicians time, but patients might even benefit: “If more patients’ questions are answered quickly, with empathy, and to a high standard, it might reduce unnecessary clinical visits, freeing up resources … Moreover, messaging is a critical resource for fostering patient equity, where individuals who have mobility limitations, work irregular hours, or fear medical bills, are potentially more likely to turn to messaging.”

Best of all, the AI won’t incur payroll tax or leave teabags in the sink.

Send story tips to penny@medicalrepublic.com.au for an instant 21% improvement in your bedside manner.
