The patent adjusts passage scores using the surrounding context within the source document, so a passage in a topically coherent section scores higher than the same passage marooned in mixed content. Cross-listed with the 65 Google Patents collection as pat-42.
Patent Overview
- Inventor: Srinivasan Venkatachary
- Assignee: Google LLC
- Filed: 2015-09-29
- Granted: 2018-05-01
- Application Number: US 14/870,141
The Challenge
A candidate answer passage's score should depend on its surroundings. A passage in a topically-coherent section of a focused article carries more confidence than the same words appearing as a stray quote in a mixed-topic page. The system needs to read surrounding context as a passage-quality signal.
- Passage Alone Misses Document Context — Scoring a passage in isolation ignores what surrounds it. Two passages with identical words can have very different meanings depending on their document context.
- Topical Coherence Validates The Passage — When the section around the passage is topically coherent and matches the passage's claim, the passage gains validation from its container. Mixed-context placement weakens that validation.
- Document-Wide Context Matters Too — Beyond the immediate section, the document's overall topic and structure inform whether the passage belongs. A passage in an off-topic document loses score even if its immediate context looks coherent.
- Context Signal Must Be Computable — Reading surrounding context for every candidate adds cost. The system must compute the context signal efficiently within the scoring latency budget.
- Per-Domain Variance Is Real — Some content types (news articles) are tightly topical throughout; others (blog posts) wander. The context-adjustment model must handle both without false penalty.
Innovation
How The System Works
The system extracts features describing the passage's immediate section and its document-level context, computes a topical-coherence score reflecting how well the passage fits its surroundings, and applies the score as an adjustment to the base passage score before display gating.
- Identify The Passage's Section — Per candidate, identify the section it lives in: the surrounding paragraphs, headings, and section boundaries. Section detection uses document-structure signals.
- Compute Section Topical Coherence — Score how topically coherent the section is, and how well the passage's topic aligns with the section's. Coherent and aligned passages earn higher context scores.
- Read Document-Level Context — Beyond the section, the document's overall topic and structure inform context fit. Document-level signals combine with section-level ones.
- Calibrate For Content Type — Per content type (news, blog, reference, listicle), calibrate the context-coherence expectations. Different types tolerate different coherence levels.
- Compute Context Adjustment — From the coherence signals and content-type calibration, compute the context adjustment value. Positive adjustments lift the passage; negative lower it.
- Apply To Base Score — The context adjustment combines with the base passage score (computed separately) to produce the final score the display gate evaluates.
- Re-Evaluate As Documents Change — When documents are re-crawled and their structure changes, context adjustments recompute. The signal stays current with document evolution.
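The first step above, locating the section a passage lives in, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's method: it assumes section boundaries coincide with HTML headings (h1 through h6), relies on Python's standard `html.parser`, and the `Section`, `SectionSplitter`, and `find_section` names are hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List, Optional

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


@dataclass
class Section:
    heading: str = ""
    paragraphs: List[str] = field(default_factory=list)


class SectionSplitter(HTMLParser):
    """Groups paragraphs under the most recent heading (assumed boundary)."""

    def __init__(self) -> None:
        super().__init__()
        self.sections: List[Section] = [Section()]  # text before any heading
        self._tag: Optional[str] = None
        self._buf: List[str] = []

    def handle_starttag(self, tag, attrs):
        # Start buffering text when a heading or paragraph opens.
        if tag in HEADING_TAGS or tag == "p":
            self._tag = tag
            self._buf = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # ignore inline tags such as <b> inside a paragraph
        text = " ".join("".join(self._buf).split())
        if tag in HEADING_TAGS:
            self.sections.append(Section(heading=text))  # new section starts
        elif text:
            self.sections[-1].paragraphs.append(text)
        self._tag = None


def find_section(html: str, passage: str) -> Optional[Section]:
    """Returns the first section with a paragraph containing the passage."""
    splitter = SectionSplitter()
    splitter.feed(html)
    splitter.close()
    for section in splitter.sections:
        if any(passage in p for p in section.paragraphs):
            return section
    return None


doc = (
    "<h2>Espresso basics</h2><p>Espresso is brewed under pressure.</p>"
    "<p>A shot takes about 25 seconds.</p>"
    "<h2>Unrelated news</h2><p>The office moves next month.</p>"
)
section = find_section(doc, "A shot takes about 25 seconds.")
print(section.heading)  # Espresso basics
```

A production system would presumably also use paragraph boundaries and visual layout; headings alone are the simplest structural signal to demonstrate.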
Context Modulates Passage Score
The patent's load-bearing idea is that no passage exists alone. The text around it informs the system's confidence that the passage truly answers the query. Context becomes a first-class scoring dimension.
Surroundings Carry Signal
A passage in a focused topical section is implicitly endorsed by its container. A passage in mixed or off-topic surroundings lacks that endorsement. Reading the surroundings as evidence transforms passage scoring.
- Section-Level Coherence — How tightly the surrounding paragraphs and heading align with the passage's topic. Tight alignment validates the passage.
- Document-Level Context — Beyond the section, the document's overall topical character. A focused article supports its passages more than a wandering one.
- Content-Type Calibration — Different content types tolerate different coherence levels. News is tightly topical, blogs vary, and reference content is highly structured. Per-type calibration prevents false penalties.
Technical Foundation
The patent specifies the section identifier, the coherence scorer, the document-context reader, the per-type calibration, the adjustment computation, and the integration with base scoring.
- Section Identifier — Uses document structure (HTML headings, paragraph boundaries, visual layout) to identify the section a candidate passage lives in. Sections are the substrate for coherence analysis.
- Coherence Scorer — Per section, measures topical coherence using embedding similarity, named-entity overlap, and topical-classifier agreement. Coherent sections score high.
- Document Context Reader — Beyond the section, reads the document's overall topic and structure. Document-level alignment with the passage adds to the context signal.
- Content Type Classifier — Classifies documents by content type to inform per-type calibration. News, blog, reference, listicle have distinct expectations.
- Adjustment Computation — Combines section coherence, document-level context, and content-type calibration into a single adjustment value. The adjustment modulates the base passage score.
- Integration Layer — Adjustment combines with the base passage score from the scoring pipeline. Final score determines display gating.
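To make the scorer and the adjustment computation concrete, here is a rough sketch. Everything in it is an assumption for illustration: bag-of-words cosine stands in for embedding similarity, capitalized words stand in for named entities, the topical-classifier signal is omitted, and the weights and the `EXPECTED_COHERENCE` baselines are invented values, not figures from the patent.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Set

# Hypothetical per-type baselines: the coherence level treated as "normal".
EXPECTED_COHERENCE: Dict[str, float] = {
    "news": 0.6, "reference": 0.6, "blog": 0.4, "listicle": 0.3,
}


def tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine; a stand-in for embedding similarity."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def entities(text: str) -> Set[str]:
    """Capitalized words as a crude proxy for named entities."""
    return set(re.findall(r"\b[A-Z][a-z]+\b", text))


def jaccard(a: Set[str], b: Set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0


def context_adjustment(
    passage: str, section: str, document: str, content_type: str
) -> float:
    # Section-level signal: lexical similarity plus entity overlap.
    section_score = 0.7 * cosine(passage, section) + 0.3 * jaccard(
        entities(passage), entities(section)
    )
    # Document-level signal: similarity to the whole document.
    doc_score = cosine(passage, document)
    coherence = 0.6 * section_score + 0.4 * doc_score
    # Positive when coherence beats the type's baseline, negative otherwise.
    return coherence - EXPECTED_COHERENCE.get(content_type, 0.5)


passage = "Espresso is brewed by forcing hot water through ground coffee."
focused = passage + " Espresso machines heat water and push it through coffee under pressure."
mixed = passage + " The parking garage closes at nine. Quarterly sales figures arrive Friday."

adj_focused = context_adjustment(passage, focused, focused, "blog")
adj_mixed = context_adjustment(passage, mixed, mixed, "blog")
print(adj_focused > adj_mixed)  # True
```

Subtracting a per-type baseline is one simple way to express calibration: the same measured coherence yields a smaller adjustment for a type expected to be tightly topical than for one expected to wander.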
The Process
Context adjustment runs alongside base passage scoring. Per candidate, the adjustment is computed and combined with the base score before the display gate evaluates the final value.
- Receive Candidate — Candidate passage enters the scoring pipeline. Base scoring computes the standard score (separate patent in family).
- Identify Section — The section identifier finds the surrounding section in the source document.
- Score Section Coherence — Coherence scorer evaluates how tightly the section aligns with the passage topic.
- Read Document Context — Document-level topic and structure inform the broader context signal.
- Apply Content-Type Calibration — Per content type, the calibration adjusts coherence expectations. Output is the calibrated coherence signal.
- Compute Adjustment — Coherence signals plus calibration produce the adjustment value. The value modulates base score positively or negatively.
- Combine With Base Score — Final score equals base plus adjustment. The display gate evaluates final score against the threshold.
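The final two steps reduce to simple arithmetic, sketched below. The threshold value, the `ScoredPassage` type, and the choice of a greater-than-or-equal comparison are assumptions for illustration; the source states only that the final score is the base plus the adjustment and that the gate compares it against a threshold.

```python
from dataclasses import dataclass

DISPLAY_THRESHOLD = 0.75  # hypothetical gate value


@dataclass(frozen=True)
class ScoredPassage:
    text: str
    base_score: float
    context_adjustment: float

    @property
    def final_score(self) -> float:
        # Final score equals base plus adjustment.
        return self.base_score + self.context_adjustment


def passes_display_gate(
    p: ScoredPassage, threshold: float = DISPLAY_THRESHOLD
) -> bool:
    return p.final_score >= threshold


# Identical words and base score; only the surrounding context differs.
focused = ScoredPassage("same words", base_score=0.78, context_adjustment=0.05)
stray = ScoredPassage("same words", base_score=0.78, context_adjustment=-0.10)
print(passes_display_gate(focused))  # True
print(passes_display_gate(stray))    # False
```

In this example the base score alone would clear the gate, so the negative adjustment is what holds the stray passage back.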
Quality Control
Wrong context adjustment over- or under-credits passages. The patent specifies safeguards.
- Coherence Scorer Calibration — Per content type, the coherence scorer is calibrated against labeled data. False positives over-credit; false negatives under-credit.
- Adjustment Magnitude Bounds — Adjustments are bounded so they cannot completely override base scoring, which keeps the final score robust against context-scoring noise.
- Per-Content-Type Audit — Per type, accuracy is monitored. Types where the adjustment behaves poorly trigger recalibration.
- Outlier Detection — Extreme adjustment values are flagged for inspection. Most are pipeline artifacts; a few reveal real signal worth investigating.
- Continuous Feedback — User-engagement signals on displayed passages feed back into adjustment calibration. The system learns which context patterns predict good outcomes.
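Two of these safeguards, magnitude bounds and outlier detection, can be illustrated with a short sketch. The limit and the outlier cutoff are hypothetical values chosen for the example, and the function names are not taken from the patent.

```python
from typing import List, Tuple

MAX_ADJUSTMENT = 0.15     # hypothetical bound on the adjustment's magnitude
OUTLIER_THRESHOLD = 0.50  # hypothetical raw magnitude that triggers review


def bound_adjustment(raw: float, limit: float = MAX_ADJUSTMENT) -> float:
    """Clamps the raw adjustment into [-limit, +limit]."""
    return max(-limit, min(limit, raw))


def review_adjustments(raw_values: List[float]) -> Tuple[List[float], List[int]]:
    """Returns bounded adjustments and the indices of raw outliers."""
    bounded = [bound_adjustment(v) for v in raw_values]
    flagged = [i for i, v in enumerate(raw_values) if abs(v) >= OUTLIER_THRESHOLD]
    return bounded, flagged


bounded, flagged = review_adjustments([0.05, -0.30, 0.90])
print(bounded)  # [0.05, -0.15, 0.15]
print(flagged)  # [2]
```

Flagging is done on the raw value rather than the clamped one, so an extreme input is still surfaced for inspection even though its effect on the final score has already been capped.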
Real-World Application
Context-aware passage scoring underpins featured-snippet selection and Search Generative Experience (SGE) grounding choices. The patent's primitives are the technical reason topical coherence within content matters so much for direct-answer visibility.
- Section-level Primary Context Unit — Section coherence is the primary context dimension. Sections are the surrounding evidence for passage validation.
- Per-type Calibration Granularity — Content type calibrates coherence expectations. News, blog, reference, listicle have distinct profiles.
- Bounded Adjustment Magnitude — Adjustments modulate but do not dominate. Base scoring remains the foundation.
Why Section-Level Topic Coherence Matters
Pages structured with topically-coherent sections (one main idea per section, supporting paragraphs, clear topic sentences) earn higher context adjustments. Sprawling sections that mix topics weaken every passage they contain.
Why Topical Focus Beats Topical Breadth
A focused article on one topic gives its passages strong document-level context. A wandering article splits context across many topics, weakening each passage's standing. SEO benefits from depth over breadth at the article level.
What This Means for SEO
The patent adjusts an answer passage's score by the topical coherence of its surrounding section and document, so a passage embedded in focused, coherent content scores higher than the same words marooned in mixed content. SEO implication: section-level and article-level topical focus directly raises every passage's eligibility for direct answers.
- Section Coherence Boosts Passages — Pages with topically-coherent sections (one main idea per section, supporting paragraphs, clear topic sentences) earn higher context adjustments. Structure each section around a single idea so the passages within it inherit strong contextual endorsement.
- Topical Focus Beats Topical Breadth — A focused article gives its passages strong document-level context; a wandering article splits context and weakens each passage. Depth on one topic per article outperforms breadth for direct-answer eligibility.
- Surroundings Endorse The Passage — A passage in a focused topical section is implicitly endorsed by its container. The text around an answer is read as evidence of its trustworthiness. Surround your answers with on-topic supporting content rather than unrelated material.
- Avoid Stray Quotes In Mixed Pages — The same words score lower as a stray quote on a mixed-topic page than as part of a coherent section. Do not bury strong answers inside topically-scattered pages where the context adjustment penalizes them.
- Document-Level Context Compounds — The adjustment reads both immediate-section and document-level context. An article wholly about one topic reinforces every passage in it. Whole-page topical consistency compounds the boost across all your candidate passages.
- Coherence Gates Display — The context adjustment applies before display gating. A passage that would otherwise qualify can be held back if its surroundings are incoherent. Coherent structure is what lets a good passage clear the display threshold.
- Clean Topic Sentences Anchor Sections — Topic sentences signal a section's coherence to the scorer. Leading each section with a clear topic sentence helps the system read the section as focused, raising the context score for the answers inside.