What Is LLM Hallucination?
LLM hallucination refers to the phenomenon in which a large language model generates text that is factually incorrect, internally inconsistent, or entirely fabricated — yet is presented with the same confident tone as accurate information. The term "hallucination" was adopted from cognitive science, where it describes perception without a corresponding external stimulus. In LLMs, hallucination occurs when the model generates text based on statistical patterns in its training data rather than grounded factual knowledge.
Hallucination is not a bug introduced by poor engineering; it is an emergent property of how autoregressive language models work. Models predict the most statistically likely next token given prior context, which produces fluent, coherent text that may nonetheless be factually wrong. The problem is compounded by the fact that most LLMs do not distinguish between high-confidence factual recall and low-confidence inference — both are delivered in the same authoritative register.
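The mechanics can be illustrated with a toy sketch. The probabilities below are invented for demonstration, not drawn from any real model; the point is that greedy decoding emits whichever token is most probable under the training distribution, and truth never enters the decision.

```python
# Toy illustration (not a real language model): an autoregressive model scores
# candidate next tokens, and greedy decoding always emits the most probable one.
# The probabilities below are invented for demonstration purposes.

def greedy_next_token(probs: dict[str, float]) -> str:
    """Pick the most probable next token; factual accuracy never enters the decision."""
    return max(probs, key=probs.get)

# Hypothetical distribution for the prompt "The capital of Australia is"
next_token_probs = {
    "Sydney": 0.46,    # frequent co-occurrence in training text, but wrong
    "Canberra": 0.41,  # the correct answer
    "Melbourne": 0.13,
}

print(greedy_next_token(next_token_probs))  # → Sydney: fluent, confident, false
```

A more frequent-but-wrong association outscores the correct answer, and the model asserts it in exactly the same register it would use for a fact it "knows" well.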
Why Hallucination Is Especially Dangerous in Journalism
In most consumer applications, occasional LLM hallucination is an inconvenience. In journalism, it is a professional and legal liability. A journalist who publishes AI-generated content without independent verification may inadvertently defame individuals with false attributions, misrepresent scientific findings, report invented statistics as real data, or create false impressions about public figures or events. Several high-profile cases — including the fabricated case citations in the Mata v. Avianca legal brief (2023) — demonstrate the real-world consequences of deploying LLMs without verification safeguards.
Common Types of Hallucination in News Contexts
- Entity hallucination: The model invents names of people, organisations, or places, or confuses similar entities (e.g., attributing a quote to the wrong politician).
- Statistical fabrication: The model generates plausible-sounding but invented statistics, percentages, or dates.
- Citation hallucination: The model invents URLs, DOI numbers, or publication names that do not exist.
- Temporal confusion: The model conflates events from different time periods or asserts outdated information as current.
- Synthesis errors: The model correctly retrieves two separate facts but combines them incorrectly, implying a causal or factual relationship that does not exist.
Detecting Hallucination: Technical Approaches
Several techniques exist for detecting LLM hallucination before content is published. Self-consistency checking prompts the same model multiple times with paraphrased questions and checks whether the answers agree; inconsistency signals uncertainty and potential hallucination. Retrieval-augmented generation (RAG) anchors responses in retrieved documents, making it possible to verify each claim against its cited source. Multi-model consensus runs the same query through multiple independent models: if ChatGPT, Gemini, and Perplexity all agree on a fact, confidence is higher than if only one model were queried. Perplexity scoring uses the model's own token-level uncertainty estimates to flag low-confidence passages for human review.
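The self-consistency idea can be sketched as follows. Here `query_model` is a hypothetical stand-in for a real LLM API call, stubbed with canned answers so the logic is runnable; the 0.8 agreement threshold is likewise an assumption for illustration.

```python
# Sketch of self-consistency checking. query_model() is a hypothetical
# stand-in for a real LLM API call, stubbed with canned answers for the demo.
from collections import Counter

def query_model(prompt: str) -> str:
    # Stand-in for a real API call; returns canned answers for this demo.
    canned = {
        "When was the bridge opened?": "1932",
        "In what year did the bridge open?": "1932",
        "What is the bridge's opening year?": "1998",  # inconsistent answer
    }
    return canned[prompt]

def self_consistency(paraphrases: list[str], threshold: float = 0.8):
    """Ask the same question in several phrasings; low agreement -> flag for review."""
    answers = [query_model(p) for p in paraphrases]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / len(answers)
    return top_answer, agreement, agreement < threshold  # flagged if below threshold

answer, agreement, flagged = self_consistency([
    "When was the bridge opened?",
    "In what year did the bridge open?",
    "What is the bridge's opening year?",
])
print(answer, round(agreement, 2), flagged)  # → 1932 0.67 True
```

With only two of three phrasings agreeing, the claim falls below the threshold and is routed to a human rather than published.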
Omniscient AI's multi-model architecture addresses hallucination risk by running claims through three independent models simultaneously. Where models disagree, the system flags the claim rather than returning a confident verdict, ensuring that genuine uncertainty is surfaced rather than suppressed.
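Omniscient AI's internal implementation is not described here; a minimal sketch of the general disagreement-flagging pattern, with the three models' verdicts stubbed as plain strings, might look like:

```python
# Illustrative sketch of multi-model consensus flagging (NOT Omniscient AI's
# actual implementation). Each element of `verdicts` represents one model's
# answer to the same factual query.

def consensus_verdict(verdicts: list[str]) -> str:
    """Return the shared verdict only if all models agree; otherwise flag it."""
    if len(set(verdicts)) == 1:
        return verdicts[0]
    return "FLAGGED: models disagree, needs human review"

print(consensus_verdict(["true", "true", "true"]))   # unanimous: verdict returned
print(consensus_verdict(["true", "false", "true"]))  # split: flagged for review
```

The key design choice is that any disagreement suppresses the confident verdict entirely, rather than letting a 2-to-1 majority masquerade as certainty.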
Newsroom Protocols to Prevent Hallucination-Driven Errors
Beyond technical safeguards, newsrooms need editorial protocols:
- Always treat AI-generated text as a first draft requiring human verification.
- Never publish statistics or attributed quotes from AI output without independently confirming the source.
- Use RAG-based tools rather than standalone chatbots for research.
- Deploy multi-model checking for all fact-sensitive content.
- Train staff to recognise the linguistic patterns associated with hallucinated content: overly confident assertions about specific numbers or dates, unusually convenient and round figures, and perfect-sounding but unverifiable expert quotes.
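Some of those linguistic red flags can be pre-screened mechanically before a human ever reads the draft. The patterns below are illustrative assumptions, not a vetted detector, and they complement rather than replace human verification.

```python
# Heuristic sketch of the linguistic red flags described above. The regexes
# and rules are illustrative assumptions, not a production detector.
import re

# Suspiciously round figures such as "50,000" or "200000"
ROUND_FIGURE = re.compile(r"\b(exactly\s+)?\d{1,3}(,000|0{3,})\b")
# Overconfident qualifiers paired with a specific year
CONFIDENT_DATE = re.compile(
    r"\b(definitely|precisely|exactly)\b.*\b(19|20)\d{2}\b", re.IGNORECASE
)

def red_flags(text: str) -> list[str]:
    """Return human-readable warnings for suspiciously 'convenient' claims."""
    flags = []
    if ROUND_FIGURE.search(text):
        flags.append("suspiciously round figure")
    if CONFIDENT_DATE.search(text):
        flags.append("overly confident date assertion")
    return flags

sample = "The factory definitely closed in 1987, laying off 50,000 workers."
print(red_flags(sample))  # → ['suspiciously round figure', 'overly confident date assertion']
```

A flagged sentence is not necessarily hallucinated; the heuristic simply prioritises which claims a fact-checker should verify first.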