This patent describes validating LLM-generated output by checking entailment against grounding sources. It covers the grounding and RAG-validation layer for Gemini and Assistant LLM outputs, which ensures generated text is supported by source evidence.
Patent Overview
- Inventor: Yossi Matias, Avinatan Hassidim, others
- Assignee: Google LLC
- Filed: 2023
- Granted: Published 2025-03-20
The Challenge
LLMs hallucinate. Generated text often contains plausible-sounding but unsupported claims. Production deployment requires entailment validation — checking that generated claims are entailed by retrieved grounding sources before exposing output to users.
- Hallucination Damages Trust — LLMs generate confident-sounding wrong answers, and each unsupported answer erodes user trust.
- Retrieval Alone Isn't Enough — Retrieval brings in sources, but the LLM may still misquote them or extrapolate beyond them. Validation must check what is actually said.
- Entailment Models Are The Validator — For each claim, an entailment model checks that the grounding sources actually entail it. This requires NLP infrastructure for entailment.
- Validation Must Scale — Each LLM output contains many generated claims, so entailment checking must be fast.
- Action Required On Failure — Each failed entailment check requires an action: remove the claim, regenerate it, or add a caveat. The action layer matters as much as detection.
Innovation
How The System Works
The system parses LLM output into atomic claims, retrieves grounding sources, checks whether the sources entail each claim, aggregates the per-claim confidence, and triggers an action on failure (remove, regenerate, or caveat).
- Generate LLM Output — LLM produces candidate output.
- Parse Into Atomic Claims — Per output, parse into atomic claims for validation.
- Retrieve Grounding Sources — Per output, retrieve grounding sources relevant to claims.
- Run Entailment Per Claim — Per claim, entailment model checks whether sources entail the claim.
- Aggregate Entailment Confidence — Per output, aggregate confidence across claims.
- Trigger Action On Failure — Per failed claim, action triggered: remove, regenerate, caveat.
- Continuous Improvement — Entailment models and action policies recalibrate against fresh data.
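The steps above can be sketched end to end in a few lines. This is a minimal illustration, not the patent's implementation: the token-overlap `entailment_score` is a toy stand-in for a real entailment model, the one-claim-per-sentence parser is deliberately naive, and the names and thresholds (`keep_at`, `caveat_at`) are hypothetical. Regeneration is left out here because it acts on the whole output rather than on a single claim.

```python
import re
from dataclasses import dataclass


@dataclass
class ClaimResult:
    claim: str
    confidence: float  # stand-in entailment score in [0, 1]
    action: str        # "keep", "caveat", or "remove"


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def parse_claims(output: str) -> list[str]:
    """Toy claim parser: treat each sentence as one atomic claim."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", output) if s.strip()]


def entailment_score(claim: str, sources: list[str]) -> float:
    """Toy stand-in for an entailment model (assumption, not real NLI):
    the best fraction of claim tokens covered by any single source."""
    claim_tokens = tokens(claim)
    if not claim_tokens or not sources:
        return 0.0
    return max(len(claim_tokens & tokens(s)) / len(claim_tokens) for s in sources)


def choose_action(confidence: float, keep_at: float = 0.8, caveat_at: float = 0.5) -> str:
    """Hypothetical action policy with illustrative thresholds."""
    if confidence >= keep_at:
        return "keep"
    if confidence >= caveat_at:
        return "caveat"
    return "remove"


def validate_output(output: str, sources: list[str]) -> tuple[list[ClaimResult], float]:
    """Score every claim, pick an action, and aggregate with the weakest claim."""
    results = []
    for claim in parse_claims(output):
        confidence = entailment_score(claim, sources)
        results.append(ClaimResult(claim, confidence, choose_action(confidence)))
    overall = min((r.confidence for r in results), default=0.0)
    return results, overall


sources = ["The Eiffel Tower is in Paris and opened in 1889."]
output = "The Eiffel Tower is in Paris. It was painted green by Napoleon."
results, overall = validate_output(output, sources)
for r in results:
    print(f"{r.action:7} {r.confidence:.2f}  {r.claim}")
```

Aggregating with `min` is one possible choice: it makes the per-output score as strict as its weakest claim. A mean or a weighted score would be equally plausible under different product requirements.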
Entailment Is The Grounding Gate
The patent's load-bearing idea is that LLM output must pass entailment validation before user exposure. Per claim, entailment checking against grounding sources is the structural validation layer.
Claim-Level Validation, Source-Level Grounding
Validation happens at the claim level and grounding at the source level: each atomic claim is checked for entailment, and each check requires a grounding source. Together these form the validation architecture.
- Atomic Claim Parsing — Per output, parsed into atomic claims for validation.
- Grounding Source Retrieval — Per claim, grounding sources retrieved.
- Per-Claim Entailment Checking — Per claim, entailment model checks whether sources entail.
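To make the claim-level and source-level split concrete, here is a small sketch of claim parsing followed by per-claim retrieval. It assumes a keyword inverted index over source passages, which is only one simple way to retrieve; the source does not say how retrieval is done. The helper names (`split_atomic`, `build_index`, `retrieve`) are hypothetical, and the splitter is naive (it would break on decimals such as "3.5").

```python
import re
from collections import Counter, defaultdict


def terms(text: str) -> list[str]:
    """Lowercased alphanumeric terms of a text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_atomic(output: str) -> list[str]:
    """Naive atomic-claim splitter: break on sentence enders and semicolons."""
    return [p.strip() for p in re.split(r"[.!?;]+", output) if p.strip()]


def build_index(passages: list[str]) -> dict[str, set[int]]:
    """Inverted index: term -> ids of the passages that contain it."""
    index: dict[str, set[int]] = defaultdict(set)
    for pid, passage in enumerate(passages):
        for term in set(terms(passage)):
            index[term].add(pid)
    return index


def retrieve(claim: str, passages: list[str], index: dict[str, set[int]], k: int = 2) -> list[str]:
    """Return up to k passages sharing the most distinct terms with the claim."""
    hits: Counter = Counter()
    for term in set(terms(claim)):
        for pid in index.get(term, ()):
            hits[pid] += 1
    return [passages[pid] for pid, _ in hits.most_common(k)]


passages = [
    "Mount Everest is 8849 metres tall.",
    "The Nile flows through Egypt.",
    "Everest lies on the border of Nepal and China.",
]
index = build_index(passages)
claims = split_atomic("Everest is 8849 metres tall; the Nile flows through Egypt.")
grounding = {claim: retrieve(claim, passages, index, k=1) for claim in claims}
```

Each claim ends up with its own grounding passages, which is what lets the entailment check run independently per claim.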
Technical Foundation
The patent specifies the LLM, claim parser, source retriever, entailment model, confidence aggregator, action layer, and improvement loop.
- LLM — Generates candidate output.
- Claim Parser — Parses output into atomic claims.
- Source Retriever — Per claim, retrieves grounding sources.
- Entailment Model — Per claim, checks source-claim entailment.
- Confidence Aggregator — Aggregates per-claim confidence into per-output score.
- Action Layer — Per failed claim, triggers remove/regenerate/caveat actions.
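One way to picture how these components fit together is as a set of small interfaces wired into a single validator. The sketch below uses Python's `typing.Protocol` for that; every class and method name is a hypothetical illustration, since the source names the components but not their signatures.

```python
from dataclasses import dataclass
from typing import Protocol


class LLM(Protocol):
    def generate(self, prompt: str) -> str: ...


class ClaimParser(Protocol):
    def parse(self, output: str) -> list[str]: ...


class SourceRetriever(Protocol):
    def retrieve(self, claim: str) -> list[str]: ...


class EntailmentModel(Protocol):
    def score(self, claim: str, sources: list[str]) -> float: ...


class ConfidenceAggregator(Protocol):
    def aggregate(self, scores: list[float]) -> float: ...


class ActionLayer(Protocol):
    def act(self, claim: str, score: float) -> str: ...


@dataclass
class Validator:
    """Wires the components together; any object with matching methods fits."""
    parser: ClaimParser
    retriever: SourceRetriever
    entailment: EntailmentModel
    aggregator: ConfidenceAggregator
    actions: ActionLayer

    def run(self, output: str) -> tuple[float, list[tuple[str, str]]]:
        claims = self.parser.parse(output)
        # Retrieve sources and score entailment independently for each claim.
        scores = [self.entailment.score(c, self.retriever.retrieve(c)) for c in claims]
        decisions = [(c, self.actions.act(c, s)) for c, s in zip(claims, scores)]
        return self.aggregator.aggregate(scores), decisions
```

Because protocols are structural, a production entailment model and a test stub can be swapped without changing `Validator`, which also makes it easier to recalibrate one component at a time.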
The Process
Per LLM output, the validation pipeline runs before user exposure.
- LLM Generates — LLM produces candidate output.
- Parse Claims — Output parsed into claims.
- Retrieve Sources — Per claim, grounding sources retrieved.
- Check Entailment — Per claim, entailment validated.
- Aggregate Confidence — Per-output confidence aggregated.
- Act On Failure — Failed claims trigger appropriate action.
- Return Validated Output — Validated output returned to user.
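The "before user exposure" gate, together with the regenerate action, can be sketched as a bounded retry loop. This is an assumption about how such a gate might look rather than something the source specifies: `generate` and `validate` are hypothetical callables, and `min_confidence` and `max_attempts` are illustrative defaults.

```python
from typing import Callable, Optional


def generate_until_grounded(
    generate: Callable[[int], str],
    validate: Callable[[str], tuple[str, float]],
    min_confidence: float = 0.8,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first candidate that passes validation, or None.

    generate(attempt) produces a candidate output for the given attempt number.
    validate(candidate) returns (text_to_show, aggregate_confidence), where
    text_to_show already has failed claims removed or caveated.
    """
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        text_to_show, confidence = validate(candidate)
        if text_to_show and confidence >= min_confidence:
            return text_to_show  # only validated output reaches the user
    return None  # caller decides on a fallback, e.g. declining to answer
```

Returning `None` instead of the last failed draft keeps the gate strict: nothing is shown unless it passed validation.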
Quality Control
Entailment accuracy determines output trustworthiness. The patent specifies safeguards.
- Entailment-Model Validation — Entailment model validated against labeled entailment pairs.
- Source-Quality Filtering — Each source's quality is validated before it is used for entailment.
- Claim-Confidence Threshold — Each claim must meet a minimum confidence threshold to be retained.
- Action-Policy Calibration — The action policy is calibrated for each failure mode.
- Continuous Recalibration — Models and policies recalibrate against fresh data.
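As an illustration of how entailment-model validation and the claim-confidence threshold might connect, the sketch below picks a threshold from labeled entailment pairs. It assumes each pair has already been scored by the entailment model and labeled as truly entailed or not; the function names and the `target_precision` value are hypothetical, and the source does not state how thresholds are chosen.

```python
from typing import Optional


def precision_at(threshold: float, scored_pairs: list[tuple[float, bool]]) -> Optional[float]:
    """Share of retained pairs (score >= threshold) that are truly entailed."""
    kept = [entailed for score, entailed in scored_pairs if score >= threshold]
    if not kept:
        return None
    return sum(kept) / len(kept)


def pick_threshold(
    scored_pairs: list[tuple[float, bool]],
    target_precision: float = 0.95,
) -> Optional[float]:
    """Lowest observed score that meets the target precision, or None.

    Trying candidates lowest first retains as many claims as possible
    while still meeting the precision target.
    """
    for threshold in sorted({score for score, _ in scored_pairs}):
        precision = precision_at(threshold, scored_pairs)
        if precision is not None and precision >= target_precision:
            return threshold
    return None


# Illustrative labeled pairs: (model score, human label "entailed?").
labeled = [(0.95, True), (0.9, True), (0.7, False), (0.6, True), (0.3, False)]
threshold = pick_threshold(labeled, target_precision=0.95)
```

Rerunning this selection on fresh labeled pairs is one simple form the recalibration step could take.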
Real-World Application
LLM output entailment is foundational to grounded generative search and Assistant LLM deployments. The pattern of atomic-claim parsing plus source retrieval plus entailment validation underpins how generative responses earn user trust.
- Atomic-claim Validation Unit — Per claim, entailment validated independently.
- Source-grounded Validation Basis — Per claim, grounding source required.
- Action-triggered Failure Handling — Per failed claim, remove/regenerate/caveat actions.
Why Authoritative Source Pages Become AI Citation Targets
Per claim, the system retrieves grounding sources. Pages with clear, well-cited, authoritative content become preferred grounding sources for LLM responses. Source-citation worthiness is the AI-era discovery signal.
Why Verifiable Facts Beat Unsupported Claims
Entailment validation favors content where claims map to clear evidence. Pages structured with clear claim-evidence patterns (cited statistics, sourced quotes, verifiable assertions) entail cleanly and ground LLM responses reliably.
What This Means for SEO
This patent validates LLM output by parsing it into atomic claims, retrieving grounding sources, and checking entailment before exposing the answer. SEO implication: authoritative, well-cited pages become preferred grounding sources, and verifiable claim-evidence structure is the AI-era discovery signal.
- Authoritative Pages Become Citation Targets — The system retrieves grounding sources per claim. Pages with clear, well-cited, authoritative content become preferred grounding sources for generative responses, making source-citation worthiness the AI-era discovery signal.
- Verifiable Facts Beat Unsupported Claims — Entailment validation favors content where claims map to clear evidence. Pages structured with cited statistics, sourced quotes, and verifiable assertions entail cleanly and ground responses reliably, where unsupported assertions do not.
- Structure Claims With Their Evidence — Output is parsed into atomic claims, each checked against sources. Content that pairs each claim with its supporting evidence is easy to entail, so a claim-then-evidence structure makes your page citation-ready.
- Source Quality Is Filtered Before Use — Each grounding source is quality-validated before entailment. Being the kind of source the system trusts, authoritative and reliable, is a prerequisite for being used as grounding at all.
- Hallucination Defense Rewards Clarity — The whole system exists to catch unsupported generation. Content that states verifiable facts plainly reduces ambiguity and is more likely to be the evidence a grounded answer cites than a hedged or vague page.
- Failed Claims Get Removed Or Caveated — On entailment failure, claims are removed, regenerated, or caveated. If a generative answer would lean on weakly-supported content, that content gets dropped, so robustly-sourced pages are the ones that make it into answers.
- Citation Worthiness Is The New Visibility — As grounded generative search grows, being the cited source for a claim is the discovery surface. Producing clearly attributable, evidence-backed content positions you to be the source generative systems quote.