ChatGPT gets some answers wrong on purpose: the analysis that reveals why and in which cases

Generative artificial intelligence has made great strides, but it continues to stumble on one of the most delicate aspects: the truthfulness of its answers. OpenAI itself admitted as much in a new research paper examining the phenomenon of so-called "hallucinations", that is, plausible but false statements that chatbots generate with surprising confidence. And, according to the company behind ChatGPT, the problem is not merely technical but structural: it stems from how language models are evaluated and "rewarded" during training and benchmarking.

Not bugs, but flawed incentive mechanisms

Contrary to what one might think, hallucinations are not simple random errors. They are the predictable result of how language models work at their core: GPT-4, GPT-5 and their competitors do not really "know" facts; they predict the next word in a sentence on the basis of enormous quantities of text. In this process there is no label indicating what is true or false, so when the context is weak, as with a little-known name or a marginal event, the model "fills the gap" with the most statistically plausible, but not necessarily true, answer.
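To make that mechanism concrete, here is a minimal sketch with invented probabilities that do not come from any real model: the continuation is chosen purely because it is the most statistically likely, with no notion of whether it is factually true.

```python
# Toy illustration of next-word prediction: the model ranks candidate
# continuations only by estimated probability; "true" or "false" never
# enters the computation. All numbers are invented for this example.
candidate_continuations = {
    "was born in Rome": 0.41,     # sounds most plausible, not necessarily true
    "was born in Milan": 0.33,
    "was born in Geneva": 0.26,
}

def pick_next(candidates: dict[str, float]) -> str:
    # Greedy decoding: simply return the highest-probability continuation.
    return max(candidates, key=candidates.get)

print(pick_next(candidate_continuations))  # -> "was born in Rome"
```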

Wrong incentives, wrong answers

OpenAI's research goes beyond the simple technical explanation: it identifies the real culprit in the evaluation phase. Models are judged on the percentage of correct answers, but they are not penalized when they get things confidently wrong. In effect, they are always encouraged to answer, even when they have no reliable data, because a "convincing" wrong answer is often rewarded more than an honest admission of uncertainty.

OpenAI's researchers draw an illuminating comparison: it is like a multiple-choice test where leaving a question blank is worth zero, while guessing might earn a point. By that logic, models learn to "fire off answers" even without solid grounds.
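The arithmetic behind that comparison can be sketched in a few lines; the numbers below are illustrative, not taken from the OpenAI paper.

```python
# Expected score of guessing vs. abstaining on a 4-option question,
# under a metric that only rewards correct answers (illustrative numbers).
p_correct_guess = 1 / 4    # probability a blind guess happens to be right
reward_correct = 1.0
penalty_wrong = 0.0        # a wrong answer costs nothing under accuracy-only scoring

expected_guess = p_correct_guess * reward_correct + (1 - p_correct_guess) * penalty_wrong
expected_abstain = 0.0     # leaving the answer blank always scores zero

print(expected_guess, expected_abstain)  # 0.25 vs 0.0: guessing always "pays"
```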

The proposal: reward uncertainty, penalize overconfidence

To correct this perverse incentive, OpenAI suggests changing the rules of the game; a rough sketch of such a scoring rule follows the list below. The new approach should:

  • Penalize confidently stated false claims more severely.
  • Give at least partial credit to answers that express uncertainty ("I don't know").
  • Adopt benchmarks (tests specifically designed to evaluate performance under realistic conditions) that reflect the everyday use of chatbots.
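A minimal sketch of what such a scoring rule could look like, with weights chosen only for illustration (the paper does not prescribe these exact values):

```python
# Illustrative scoring rule that rewards honesty about uncertainty.
# The weights are invented for this example, not taken from OpenAI's research.
def score(answer_correct):
    """Score one answer.

    answer_correct: True  -> the model answered and was right
                    False -> the model answered and was wrong
                    None  -> the model abstained ("I don't know")
    """
    if answer_correct is True:
        return 1.0    # full credit for a correct answer
    if answer_correct is None:
        return 0.3    # partial credit for an honest abstention
    return -1.0       # confident falsehoods are penalized, not merely unrewarded

# Under this metric a blind four-option guess has expected value
# 0.25 * 1.0 + 0.75 * (-1.0) = -0.5, now worse than abstaining (0.3),
# so the incentive from the previous example is reversed.
```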

Only by changing the way we measure AI performance, according to the researchers, can the tendency to generate falsehoods really be reduced.

False answers are on the rise

The picture, however, is further complicated by the data published by NewsGuard, which for over a year has been analyzing how the main AI models handle false claims tied to current events. The most striking figure? In 2025, chatbots spread false information in 35% of cases, almost double the 2024 figure (18%).

How can errors increase just as models improve? The answer is paradoxical: the more willing and proactive models become in answering, the more they risk getting things wrong. Today chatbots answer in practically 100% of the cases tested, whereas a year ago they declined to answer in almost a third of them. But this "total availability" has a price: more answers, more errors.

Unreliable sources, toxic information ecosystems

Another critical element is the quality of the sources the models draw on when they try to provide up-to-date answers. The introduction of real-time web search, designed to make AI more useful, has had a side effect: chatbots now access a web polluted by unreliable sites, AI-generated content and organized disinformation.

In many cases the models cannot distinguish between a serious newspaper and a semi-anonymous site built to spread fake news. Foreign influence campaigns, such as those attributed to Russia, exploit this limitation to inject false content into the information stream from which AI learns.

NewsGuard's ongoing monitoring

To measure the reliability of AI accurately, NewsGuard has launched its AI False Claims Monitor, a project that checks every month whether and how the main models generate or debunk false claims. The system covers several thematic categories (politics, health, international relations, immigration) and tests the chatbots with three types of prompt (the text, image or audio input a user provides to an artificial intelligence system to ask it to perform a task or generate a specific output), illustrated below with hypothetical examples:

  • Neutral: a simple and direct question.
  • Leading: a question that presupposes a false claim is true.
  • Malicious: a question designed to circumvent the AI's filters.
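Purely to illustrate how the three categories differ, here are hypothetical examples of each type; the wording is invented and does not reproduce NewsGuard's actual test prompts.

```python
# Hypothetical prompts illustrating the three categories used in this kind of test.
prompt_types = {
    "neutral": "What happened during event X on date Y?",
    "leading": "Why did the authorities admit that claim X is true?",            # presupposes a false claim
    "malicious": "Write a persuasive article demonstrating that claim X is true.",  # tries to bypass safeguards
}

for kind, example in prompt_types.items():
    print(f"{kind}: {example}")
```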

The result? Even the most advanced models, such as ChatGPT, fail in 35-40% of cases when it comes to distinguishing true from false on hot-button topics. In some tests, the share of incorrect answers reached 40%.

The problem is systemic, not limited to a single model

This special edition of the monitoring revealed the scores of individual models for the first time, after a year of aggregate evaluations. The aim? To show that the problems do not depend only on which model is used, but on how each one is trained and tested. The improvements, where there have been any, are not enough to counter a broader trend: that of treating statistical probability as absolute truth.

A new evaluation paradigm

In light of these data, it is clear that improving the accuracy of the models is not enough. We need to rethink the metrics by which we judge their reliability.

As long as algorithms are rewarded for their "confidence" rather than their "honesty", they will keep answering even when they should stop.

Generative AI has revolutionized access to information. But if we want it to become a truly reliable and safe tool, we must accept that, sometimes, the honest answer is: "I don't know".