The term "hallucination" makes AI errors sound exotic. They're not. They're a predictable consequence of how these systems work, and once you understand the mechanism, you can work around it without paranoia.
The short version: large language models predict the next token. They generate text that fits the pattern of a correct answer; they do not retrieve facts from a verified database. The output sounds confident because confident-sounding text is what the model learned to produce from human-written training data. When the model doesn't know something, it usually doesn't say "I don't know." It says something plausible.
That's the core problem. Everything else follows from it.
Why It Happens
A search engine retrieves documents. It finds web pages that contain your query terms and returns them to you. Each page either exists or it doesn't.
A language model does something completely different. It generates a response based on patterns in its training data. If the pattern says "academic papers about X are cited as [Author, Year]," the model will produce a citation that fits that pattern, whether or not the paper exists. It's not lying. It's completing the pattern, the way autocomplete finishes a sentence.
The training data shapes which patterns the model learns. But the model has no internal "check facts before outputting" step. It has a "generate the next most likely token" step, repeated once for every token in the response. The result is text that sounds like it was written by someone who knows what they're talking about, even when the specific claim is wrong.
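To make the mechanism concrete, here is a toy sketch. It is not how a real model is built: it counts word pairs in a three-line corpus and always picks the most frequent next word. The corpus, the author names, and the journal names are all invented for illustration. The point is that it produces a citation that fits the pattern and appears nowhere in what it was trained on.

```python
from collections import Counter, defaultdict

# Made-up "training data". Names, years, and journals are invented.
CORPUS = [
    "Smith 2019 in JournalA",
    "Jones 2021 in JournalB",
    "Lee 2020 in JournalB",
]

def train_bigrams(sentences):
    """Count which token follows which token in the corpus."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows

def generate(follows, start, max_tokens=5):
    """Greedy decoding: keep appending the most frequent next token."""
    output = [start]
    for _ in range(max_tokens):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # nothing ever followed this token in the corpus
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

model = train_bigrams(CORPUS)
print(generate(model, "Smith"))  # prints "Smith 2019 in JournalB"
```

In the corpus, Smith 2019 appeared in JournalA. The generator says JournalB because "JournalB" follows "in" more often. Nothing in the loop checks whether the finished sentence is true.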
Five Categories Where Hallucinations Are Most Common
1. Academic Citations
This is where hallucinations are most consistently dangerous. Ask an AI model for sources to support a claim, and it will generate citations that look real: plausible author names, journals that exist, years, volume numbers. Many of those papers do not exist. Some of the real papers don't say what the citation implies.
Never use an AI-generated citation without verifying it directly. Check Google Scholar, PubMed, or the journal's website. A correctly formatted citation tells you nothing about whether the paper exists.
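If you check many citations, the comparison step can be sketched in a few lines. This assumes you have already searched by hand and copied the result titles into a list; the function names and the example titles are hypothetical, and the similarity score comes from Python's standard `difflib`, chosen here only for convenience.

```python
import re
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

def normalize(title: str) -> str:
    """Lowercase, drop punctuation, and collapse spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())

def best_match(claimed: str, found_titles: List[str]) -> Tuple[Optional[str], float]:
    """Return the most similar title you found and its score (0.0 to 1.0)."""
    best, best_score = None, 0.0
    for candidate in found_titles:
        score = SequenceMatcher(None, normalize(claimed), normalize(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Hypothetical titles: one from the AI, two pasted from a manual search.
claimed_title = "A Study of Example Things"
search_results = ["A study of example things.", "An unrelated paper"]
print(best_match(claimed_title, search_results))
```

A high score only means a paper with that title turned up. Whether it says what the AI implied is still something you have to read for.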
2. Statistics and Numbers
"Studies show that 73% of..." is a classic hallucination pattern. The model knows that claims like this are often stated with specific percentages. It produces a specific percentage. Whether that percentage came from real research is another question entirely.
Treat any specific statistic from AI as unverified until you find the primary source. A real number has a real source: a named study, a government dataset, a company report. If you can't find the source, don't use the number.
3. Quotes from Named People
AI quotes are frequently fabricated. The model knows that Einstein said smart things about physics and life. It generates a quote that sounds like Einstein. Einstein may never have said or written it.
This applies equally to living public figures. A quote that sounds right, formatted with quotation marks and a name attached, is not verified. Find the original source before using it.
4. Recent Events Past the Training Cutoff
Every model has a training cutoff. The exact date varies by model and version, for Claude as much as for ChatGPT, so check the vendor's documentation for the one you're using. For anything that happened after that date, the model is either drawing on a web search integration (if it has one), extrapolating from older patterns, or making things up.
If you need current information, use Perplexity. It searches the web and cites sources. Don't ask a language model that has no real-time search layer to tell you about events from the last six months.
5. Legal, Medical, and Financial Specifics
General principles, fine. Specific application to your situation, risky. The difference between "how does HIPAA work generally" and "is this specific practice HIPAA compliant" is the difference between education and advice. AI can do the first well. The second requires someone who can be held accountable, can access your actual documents, and knows the specific jurisdiction.
The hallucination risk here isn't just factual inaccuracy. It's confident-sounding specific guidance that doesn't apply to your situation. Here, "consult a professional" is the actual answer, not just a disclaimer.
A Practical Verification Workflow
You don't need to verify everything AI produces. Structural output, formatting, style editing, general explanations, and brainstorming rarely need it. The risk concentrates in specific claims.
When you get output with specific facts, citations, statistics, or quotes:
Paste the specific claim into Perplexity. It will search the web and cite sources. If it finds support for the claim with real sources, you have something to work with. If it doesn't, the claim is suspect.
For citations, go directly to the source. Google Scholar for academic papers, PubMed for medical research. Search the exact title. If the paper doesn't come up, it may not exist. If it comes up but doesn't say what the AI implied, don't use the citation.
For quotes, search the exact phrase plus the person's name. Quote verification sites like Wikiquote can help for famous figures. If you can't find the original source, treat the quote as apocryphal.
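Before any of those checks, you have to spot the claims. Here is a rough sketch of a scanner for the three kinds described above. The regular expressions are assumptions made for this example: they only catch straight quotation marks, "(Author, Year)" style citations, and numbers followed by "%" or "percent", so expect both misses and false alarms. The sample text is invented.

```python
import re

# Heuristic patterns chosen for this sketch, not an exhaustive detector.
PATTERNS = {
    "statistic": re.compile(r"\b\d+(?:\.\d+)?\s?(?:%|percent\b)"),
    "citation": re.compile(r"\(\s*[A-Z][A-Za-z-]+(?: et al\.)?,? \d{4}\s*\)"),
    "quote": re.compile(r'"[^"]{15,}"'),
}

def flag_claims(text):
    """Return (kind, matched_text) pairs for spans worth checking by hand."""
    flagged = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            flagged.append((kind, match.group(0)))
    return flagged

# Invented sample output; none of these claims refer to anything real.
sample = (
    'Studies show that 73% of users agree (Smith, 2019). '
    'As one expert put it, "this is a placeholder quotation".'
)
for kind, span in flag_claims(sample):
    print(f"{kind}: {span}")
```

The scanner only tells you where to look. Each flagged span still goes through the manual steps above.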
Prompting to Reduce Risk
You can't eliminate hallucinations through prompting, but you can surface them. These prompts help the model flag uncertainty rather than paper over it:
"Tell me what you're uncertain about in this answer."
"Flag any claims that might be incorrect or that you can't verify."
"If you don't know, say so. I'd rather have a gap than a wrong answer."
These work because they give the model explicit permission to express uncertainty, and they set an expectation that gaps are acceptable. The model will still sometimes fill gaps with confident-sounding text. But you get more hedging, more "I believe" and "I'm not certain," and those phrases tell you where to verify.
One more: after getting a response with specific claims, ask "What would change your answer here, and what are you most uncertain about?" The follow-up often surfaces the weakest parts of the original response.
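If you send prompts from a script, the same wording can be attached automatically. This is a minimal sketch that only builds strings; it calls no AI service, and the function and constant names are made up for this example.

```python
# Wording taken from the prompts above.
UNCERTAINTY_INSTRUCTIONS = (
    "Flag any claims that might be incorrect or that you can't verify.",
    "If you don't know, say so. I'd rather have a gap than a wrong answer.",
)

FOLLOW_UP = (
    "What would change your answer here, "
    "and what are you most uncertain about?"
)

def with_uncertainty_request(question: str) -> str:
    """Append the uncertainty instructions to a question as plain text."""
    return "\n\n".join([question.strip(), *UNCERTAINTY_INSTRUCTIONS])

# Hypothetical question, shown only to illustrate the output shape.
print(with_uncertainty_request("Summarize the main causes of the 1929 crash."))
print()
print(FOLLOW_UP)  # send this as a second message after reading the answer
```

Keeping the follow-up as a separate message matches the advice above: ask it after you have the first response, not alongside the original question.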
The Trust Matrix
Not everything needs equal scrutiny. Here's a practical breakdown:
| Almost Always Reliable | Verify Before Using |
|---|---|
| Document structure and outlines | Specific statistics and percentages |
| Formatting and editing | Academic citations and paper titles |
| General reasoning and logic | Quotes attributed to named people |
| Style rewriting and tone adjustment | Events after the training cutoff |
| Brainstorming and ideation | Specific legal or regulatory details |
| Explaining general concepts | Medical specifics and drug interactions |
| Summarizing content you provide | Financial figures and market data |
The pattern in the left column: AI working with your input or generating structure. The pattern in the right column: AI making specific factual claims about the world. The left column is where models are genuinely good. The right column is where the verification workflow applies.
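The matrix also works as a lookup if you want to apply it in a script. In this sketch the category labels are shorthand invented for the rows of the table, and treating anything unlisted as "verify" is a cautious default chosen for the example, not something the table states.

```python
from enum import Enum

class Trust(Enum):
    USUALLY_RELIABLE = "almost always reliable"
    VERIFY = "verify before using"

# Left column of the table, as hypothetical short labels.
USUALLY_RELIABLE = {
    "outline", "formatting", "general_reasoning", "style_rewrite",
    "brainstorm", "concept_explanation", "summary_of_provided_text",
}

# Right column of the table, as hypothetical short labels.
VERIFY_FIRST = {
    "statistic", "citation", "quote", "recent_event",
    "legal_detail", "medical_detail", "financial_figure",
}

def trust_level(output_type: str) -> Trust:
    """Map a kind of AI output to the column it belongs in."""
    if output_type in USUALLY_RELIABLE:
        return Trust.USUALLY_RELIABLE
    # Covers VERIFY_FIRST and, by assumption, any label not listed at all.
    return Trust.VERIFY

print(trust_level("outline").value)   # prints "almost always reliable"
print(trust_level("citation").value)  # prints "verify before using"
```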
The Right Mental Model
Think of AI as an extremely capable collaborator who hasn't always done the reading. They can help you think through a problem, structure an argument, draft a document, and improve your writing. They'll sometimes state a fact with confidence that turns out to be wrong, the way a knowledgeable person occasionally misremembers a study or misattributes a quote.
You wouldn't fire that collaborator. You'd just know to check their specific claims before publishing anything that depends on them. Same principle here.
The goal isn't to use AI less. It's to use it for what it's actually good at, and to verify the narrow category of outputs where confident-sounding errors are most likely. That's a small workflow change for a large reduction in risk.