While we can’t eliminate hallucinations entirely, there are still things we can do to reduce the chances of hallucinations showing up in our model’s responses.
In the last article, we talked about probability, token prediction and how temperature can change the types of responses we get from AI models. However, it would be irresponsible of us to talk about generative AI without also addressing the elephant in the room: hallucination.
Hallucination is when generative AI models return a response that isn’t grounded in facts or based on their training data. You’ve almost certainly experienced it for yourself if you’ve chatted with an LLM for even a little while. As of the writing of this article, there is no known way to prevent AI models from occasionally generating hallucinated content. There are several things we can do to reduce hallucinations, but nothing will eliminate them completely.
This makes hallucinations one of the most prominent issues facing us today, as developers building with AI. Generative AI is incredibly impressive from a purely technical perspective, but if the output can’t be consistently trusted then our ability to build real-world apps leveraging it is limited. Of course, this varies based on field and use case: as we discussed in the previous article, there will be situations where a higher rate of error is OK and situations where mistakes are unacceptable.
Part of our role as ethical AI developers is to assess the situations where our application will be used and determine whether the rate of hallucination minimization that we can realistically achieve is within the bounds of acceptable risk—or whether the use of AI should be avoided entirely.
One of the most interesting (and challenging) aspects of AI hallucination is that we still don’t fully understand why it occurs or which underlying mechanisms are primarily responsible. We’ve identified several factors that make hallucinations more frequent or more severe, but none of them can be traced back directly as the root cause.
A logical assumption to make would be that errors in the training data are responsible for hallucinated content. After all, as they say: “garbage in, garbage out,” right? However, Kalai et al. show in their 2025 paper Why Language Models Hallucinate that “… even if the training data were error-free, the objectives optimized during language model training would lead to errors being generated.” They theorize that hallucination happens because most AI model training rewards producing an answer rather than declining, which “encourages” models to generate plausible responses even when uncertain.
Technically, guessing an answer that has even a slim chance of being correct (say, 1 out of 10,000) beats giving no answer, which has a 0% chance of being right. That makes guessing the “better” mathematical option, at least when “better” is evaluated as “highest chance of being right” (which is how most AI training benchmarks are currently structured).
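The arithmetic behind that incentive is easy to check for yourself. Here’s a minimal sketch under 0-1 scoring, using the 1-in-10,000 figure from above as the example probability:

```python
# Expected score under 0-1 grading: a guess with any nonzero chance of
# being right beats abstaining, which always scores zero.
def expected_score(p_correct: float) -> float:
    # 1 point if right, 0 points if wrong or if the model abstains
    return p_correct * 1 + (1 - p_correct) * 0

guess = expected_score(1 / 10_000)   # a long-shot guess
abstain = expected_score(0.0)        # "I don't know" always scores 0

print(guess > abstain)  # the guess has the higher expected score
```

Under this grading scale, a model that always guesses will score at least as well as an otherwise-identical model that sometimes abstains, which is exactly the dynamic the Kalai et al. paper describes.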
Let Model B be similar to Model A except that it never indicates uncertainty and always “guesses” when unsure. Model B will outperform A under 0-1 scoring, the basis of most current benchmarks. This creates an “epidemic” of penalizing uncertainty and abstention …
Another popular theory comes from the paper Shaking the foundations: delusions in sequence models for interaction and control by Ortega et al. in 2021. They argue that hallucinations are caused by the model integrating its own responses as reference data and further extrapolating from there. This pollutes the data and creates a kind of compounding ouroboros of “self-delusion” as the model consumes its own (incorrect) output and builds upon it. While Ortega et al.’s theory was related to training sequential decision-making models, the underlying pattern—a model building on its own outputs to create compounding errors—shows up in generative models as well.
… the model update triggered by the collected data differs depending upon whether the data was generated by the model itself (i.e. actions) or outside of it (i.e. observations), and mixing them up leads to wrong inferences. These take the form of self-delusions where the model takes its own actions as evidence about the world … due to the presence of confounding variables.
To reiterate: we can’t (at this point) eliminate hallucinations entirely, but there are still things we can do to reduce the chances of hallucinations showing up in our model’s responses. Whether you implement some or all of these will depend on what you’re building and what kinds of hallucinations your model is most often generating.
Hallucinations can be more likely when a model doesn’t have access to the correct information. This aligns with the “guessing” theory from earlier. If the training data was minimal or not well-aligned with the use of the model, then it’s more likely that the model will be asked questions that it’s unequipped to answer—and by extension, the model is more likely to make an unsubstantiated guess.
If you’re building and training your own models, then you have the ability to adjust the training data—but many application developers are using pretrained foundation models. That doesn’t mean we’re completely out of luck, though: we can still finetune the model we’re using, or use an approach like RAG to provide additional reference data to the model. The more current, accurate, relevant information we can provide a model, the lower our chances of hallucination will be.
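To make the RAG idea concrete, here’s a toy sketch of the pattern: retrieve the most relevant reference snippets and prepend them to the prompt, so the model answers from provided facts rather than guessing. The corpus, the keyword-overlap scoring, and the prompt wording are all hypothetical stand-ins; a real system would use a vector store and embedding search.

```python
# Toy RAG sketch: naive keyword-overlap retrieval plus prompt assembly.
# Real systems replace retrieve() with embedding-based vector search.
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Hypothetical relevance score: count of shared lowercase words
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    context = "\n".join(retrieve(query, corpus))
    return (
        "Answer using only the reference material below. "
        "If the answer isn't there, say you don't know.\n\n"
        f"Reference material:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "The v2 API requires an API key in the X-Api-Key header.",
    "Rate limits reset every 60 seconds.",
    "Our office dog is named Biscuit.",
]
print(build_prompt("How do I authenticate with the v2 API?", docs))
```

The key move is in `build_prompt`: the model is both given current, relevant facts and explicitly told to abstain when those facts don’t cover the question.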
Hallucinations are also common in longer exchanges with models. The longer the conversation gets, the more likely the model is to “get lost” and provide responses that may be technically correct but don’t make sense in the context of the current task.
The amount of history from the current conversation that an LLM can see at a given time is known as its context window. As a conversation gets longer, older messages eventually fall outside the context window and are no longer visible to the model.
A model also has limited attention that it can distribute among all the tokens in a given conversation, in order to help it keep track of references within the conversation. This is what helps a model read something like “Kathryn is a software engineer. They mostly use React.” and understand that “They” and “Kathryn” refer to the same person. As a model’s attention is stretched and more things fall out of its context window, it “forgets” information and becomes more likely to hallucinate.
The likelihood of this can be reduced by prompt engineering (“Remember when we discussed X? Make a plan based on X and adapt it to Y.”) or by providing additional context reference via Skills or reference documents. It can also be helpful to intentionally limit the length of a user’s exchange with the model—if you notice that the model is more likely to hallucinate during long exchanges, see if you can identify natural and non-disruptive places to reset and allow it to “start fresh.”
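One way to implement that kind of history management in your own app is to trim the oldest turns so each request stays within a token budget. A minimal sketch, using word count as a rough stand-in for a real tokenizer (the budget and messages here are illustrative):

```python
# Keep the most recent messages within a token budget, dropping the
# oldest turns first. len(text.split()) is a rough word-count proxy;
# a production app would count tokens with the model's own tokenizer.
def trim_history(messages: list[str], budget: int) -> list[str]:
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):    # walk from newest to oldest
        cost = len(msg.split())
        if used + cost > budget:
            break                     # everything older is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))       # restore chronological order

history = [
    "Kathryn is a software engineer.",
    "They mostly use React.",
    "Write a short bio for the team page.",
]
print(trim_history(history, budget=12))
```

Note the tradeoff this illustrates: trimming keeps the request inside the context window, but the dropped turns (here, the fact that “They” refers to Kathryn) are exactly the kind of information the model then “forgets”—which is why summarizing or restating key facts before resetting is often worth the effort.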
Occasionally, hallucinations can be caused (at least in part) by the user’s input. In Survey and analysis of hallucinations in large language models: attribution to prompting strategies or model behavior, Anh-Hoang, Tran and Nguyen define two main categories of hallucination: prompting-induced hallucinations and model-intrinsic hallucinations. Prompting-induced hallucinations occur when “prompts are vague, underspecified, or structurally misleading, pushing the model into speculative generation.” By being intentional and careful about shaping our own prompts—or by providing prompt guidance and instructions to shape our end users’ prompts—we can reduce the chances of hallucination.
We might also help reduce hallucinations by including instructions in our prompt for the model to “say ‘I don’t know’ if you’re unsure of the answer,” asking it to cite sources or similar. By telling the model that we would prefer no answer over the wrong answer, we might override that internal calculation about which option would be considered “better” based on the training benchmarks—we can attempt to change the grading scale, if you will.
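In practice, that instruction usually lives in a system message that wraps every user prompt. A sketch of the idea, using the message-list shape common to chat-style LLM APIs (the wording and helper name are my own, not from any particular SDK):

```python
# Wrap a user prompt with a system message that tells the model to
# prefer abstaining over guessing. The dict structure mirrors typical
# chat-completion APIs; the actual API call is omitted.
def with_abstention_guard(user_prompt: str) -> list[dict]:
    return [
        {
            "role": "system",
            "content": (
                "If you are not confident in an answer, say 'I don't know' "
                "instead of guessing, and cite a source when you can."
            ),
        },
        {"role": "user", "content": user_prompt},
    ]

messages = with_abstention_guard("What year was the v2 API released?")
```

This won’t eliminate hallucinations, but it gives the model an explicit signal that, for our purposes, abstaining scores better than a confident wrong answer.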
Kathryn Grayson Nanz is a developer advocate at Progress with a passion for React, UI and design and sharing with the community. She started her career as a graphic designer and was told by her Creative Director to never let anyone find out she could code because she’d be stuck doing it forever. She ignored his warning and has never been happier. You can find her writing, blogging, streaming and tweeting about React, design, UI and more. You can find her at @kathryngrayson on Twitter.