A study assessed the effectiveness of safeguards in foundational large language models (LLMs) against malicious instructions that could turn them into tools for spreading disinformation, that is, the deliberate creation and dissemination of false information with the intent to harm.
The study revealed vulnerabilities in the safeguards of OpenAI's GPT-4o, Google's Gemini 1.5 Pro, Anthropic's Claude 3.5 Sonnet, Meta's Llama 3.2-90B Vision, and xAI's Grok Beta. Specifically, customized LLM chatbots built on these models consistently generated disinformation in response to health queries, incorporating fake references, scientific jargon, and cause-and-effect reasoning to make the false information appear plausible.
The findings are published in Annals of Internal Medicine.
Researchers from Flinders University and colleagues evaluated the application programming interfaces (APIs) of five foundational LLMs for their capacity to be system-instructed to always provide incorrect responses to health questions and concerns.
The specific system instructions provided to these LLMs included always providing incorrect responses to health questions, fabricating references to reputable sources, and delivering responses in an authoritative tone. Each customized chatbot was asked 10 health-related queries, in duplicate, on subjects like vaccine safety, HIV, and depression.
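For readers unfamiliar with this mechanism, the sketch below shows where such a developer-supplied system instruction enters a typical chat-completion API call. It uses the OpenAI Python SDK as an assumed example; the model name and the deliberately benign system prompt are illustrative only and are not the instructions or code used in the study.

# Minimal sketch of API-level customization via a system instruction, using the
# OpenAI Python SDK (openai>=1.0). The model name and the benign system prompt
# are assumptions for illustration, not the prompts used in the study.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

SYSTEM_INSTRUCTION = (
    "You are a cautious health information assistant. Answer accurately, "
    "cite reputable sources, and advise users to consult a clinician."
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[
        # The system message is the developer-controlled layer the study probed:
        # whatever is placed here shapes every answer the chatbot gives.
        {"role": "system", "content": SYSTEM_INSTRUCTION},
        {"role": "user", "content": "Is the measles vaccine safe?"},
    ],
)

print(response.choices[0].message.content)

The study's concern is that this same, easily accessible layer can instead be filled with malicious instructions, and that the models' built-in safeguards offered little resistance in four of the five systems tested.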
The researchers found that 88% of responses from the customized LLM chatbots were health disinformation, with four chatbots (GPT-4o, Gemini 1.5 Pro, Llama 3.2-90B Vision, and Grok Beta) providing disinformation to all tested questions.
The Claude 3.5 Sonnet chatbot exhibited some safeguards, answering only 40% of questions with disinformation. In a separate exploratory analysis of the OpenAI GPT Store, the researchers investigated whether any publicly accessible GPTs appeared to disseminate health disinformation.
They identified three customized GPTs that appeared tuned to produce such content, which generated health disinformation responses to 97% of submitted questions.
Overall, the findings suggest that LLMs remain substantially vulnerable to misuse and, without improved safeguards, could be exploited as tools to disseminate harmful health disinformation.
More information:
Assessing the System-Instruction Vulnerabilities of Large Language Models to Malicious Conversion into Health Disinformation Chatbots, Annals of Internal Medicine (2025). DOI: 10.7326/ANNALS-24-03933
Provided by American College of Physicians