OpenAI’s newest reasoning models are struggling with hallucinations more than ever, making up false information at higher rates than earlier versions. OpenAI’s own tests revealed this surprising trend, leaving many wondering why it’s happening.
The New York Times reports that OpenAI’s o3 and o4-mini models hallucinate far more than the older o1. For example, on the PersonQA benchmark, which tests answering questions about public figures, o3 hallucinated 33% of the time, more than double o1’s 15% rate. The o4-mini model did even worse, hitting a 48% hallucination rate.
When tested on SimpleQA, which covers more general knowledge questions, hallucination rates jumped to 51% for o3 and a staggering 79% for o4-mini, compared to 44% for o1. These numbers are pretty wild, especially since newer models are supposed to be more capable and accurate.
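To put those percentages in plain terms: benchmarks like PersonQA and SimpleQA grade each model answer against known facts, and the hallucination rate is roughly the share of answers containing fabricated claims. Here is a minimal sketch of that bookkeeping in Python; the grading labels are hypothetical and only illustrate the arithmetic, not OpenAI’s exact scoring scheme.

```python
# Minimal sketch: computing a hallucination rate from graded benchmark answers.
# The labels below ("correct", "hallucinated", "not_attempted") are hypothetical
# and only illustrate the arithmetic, not PersonQA's or SimpleQA's actual grading.
from collections import Counter

graded_answers = [
    "correct", "hallucinated", "correct", "not_attempted",
    "hallucinated", "correct", "correct", "hallucinated",
]

counts = Counter(graded_answers)
hallucination_rate = counts["hallucinated"] / len(graded_answers)

print(f"Hallucination rate: {hallucination_rate:.0%}")  # 38% for this toy sample
```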
OpenAI admits it doesn’t fully understand why these newer models hallucinate more. Some experts think it might be linked to the so-called “reasoning” models, which try to break down problems step-by-step, mimicking human thought processes. These models are designed to handle complex tasks better than simple text prediction.
OpenAI’s first reasoning model, o1, was praised for matching or beating PhD students in physics, chemistry, biology, math, and coding. It uses a “chain of thought” approach, thinking through problems carefully before answering.
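OpenAI doesn’t publish the internals of that chain-of-thought process, but the general idea can be approximated from the developer side by asking a model to reason through its steps before answering. A rough sketch using the official OpenAI Python SDK follows; the model name and prompt wording are illustrative assumptions, not OpenAI’s actual implementation.

```python
# Rough sketch: chain-of-thought style prompting via the OpenAI Python SDK
# (openai>=1.0). The model name and prompt are illustrative only; reasoning
# models like o1/o3 perform this kind of step-by-step thinking internally.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

response = client.chat.completions.create(
    model="o3-mini",  # assumption: substitute whichever model you have access to
    messages=[
        {
            "role": "user",
            "content": (
                "A train leaves at 3:15 pm and arrives at 5:40 pm. "
                "How long is the trip? Work through the steps before answering."
            ),
        }
    ],
)

print(response.choices[0].message.content)
```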
Despite the numbers, OpenAI’s Gaby Raila told the Times that hallucinations aren’t inherently more common in reasoning models, though the company is working to bring down the high rates seen in o3 and o4-mini.
AI models need to cut down on the nonsense and falsehoods if they’re going to be truly useful. Right now, it’s tough to trust their answers without double-checking everything, which defeats the point of saving time and effort, the main reason people turn to AI in the first place.
We’ll have to wait and see if OpenAI and other AI developers can get these hallucinations under control. Until then, it’s a wild ride with AI that sometimes just makes stuff up.
What do you think about these rising hallucination rates? Have you noticed weird or false answers from AI lately? Drop your thoughts in the comments below.