An approach to improving search engine accuracy through consensus-based answer validation. This patent introduces techniques for generating high-quality short answers by leveraging multiple passages from different sources, ensuring users receive accurate, reliable information at a glance.
Patent Overview
- Granted — October 2023
The Challenge
Earlier systems selected short-answer passages without checking them against other sources. The points below describe that gap and how the patent responds to it.
- The Problem — Traditional search engines don't account for answer quality when selecting passages. The logic used to extract short answers fails to validate accuracy against other sources. This can result in misleading or incorrect answers being prominently displayed, like showing "Near...
- GAP Threshold — Percentage of examples satisfying the "good answer precision" threshold in typical datasets
- Threshold-Based Display — Displays short answers only when accuracy scores exceed predetermined thresholds, ensuring quality. "The improved scoring engine determines a degree of consensus between multiple passages from different sources, resulting in higher quality short answers that are more likely...
Innovation
How The System Works
The patent describes a multi-step method that turns a search query and its results into a scored short answer. Each step builds on the previous one.
- Core Method — Computer-implemented method for receiving search queries, generating search results with passages, selecting candidate and context passages, scoring using consensus, and providing short answers based on accuracy scores.
- Approach — This patent represents a fundamental shift in how search engines validate and present information to users. By leveraging consensus across multiple authoritative sources rather than relying on single passages, the system dramatically improves answer...
- Current State — Search engines display short answers in prominent callout positions, providing users with fast answers to factual queries without requiring clicks. These answers enable direct responses to diverse questions without curated knowledge bases. However...
- Accuracy Score Prediction — Employs a trained prediction engine to score candidate passages based on agreement with context passages.
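To make the consensus idea concrete, here is a minimal sketch. The patent describes a trained prediction engine; the token-overlap measure below is only a stand-in chosen for illustration, and the names `tokens`, `jaccard`, and `consensus_score` are hypothetical rather than taken from the patent.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for learned embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tokens two passages have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def consensus_score(candidate: str, context: list[str]) -> float:
    """Mean overlap between the candidate and each context passage.

    ASSUMPTION: overlap approximates "agreement". The patent's engine
    is a trained model, not a token comparison.
    """
    if not context:
        return 0.0
    cand = tokens(candidate)
    return sum(jaccard(cand, tokens(c)) for c in context) / len(context)


candidate = "Water boils at 100 degrees Celsius at sea level"
agreeing = [
    "At sea level water boils at 100 degrees Celsius",
    "Water boils at 100 degrees Celsius",
]
unrelated = ["The recipe calls for two cups of flour", "Bake for thirty minutes"]

print(consensus_score(candidate, agreeing))   # high: context agrees
print(consensus_score(candidate, unrelated))  # low: no shared tokens
```

A candidate that other sources echo scores high, while one that stands alone scores low, which is the behavior the bullets above attribute to consensus scoring.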
Technical Foundation
The implementation rests on a specific set of components and data structures. These are the parts the patent claims and the engineering that ties them together.
- Search Engine Data (233) — Contains search results with passages and rankings. Each result links to a website with text passages (paragraphs, specified word counts) and numerical rankings.
- Prediction Engine Data (253) — Includes input embeddings, intermediate data (loss function output, hidden layer values), and teacher-student model information.
- Desktop/Server Devices — Desktop devices include processors, memory, storage, high-speed interfaces, and display capabilities. Server implementations may involve distributed systems with multiple processors and network-attached storage.
- Rack Architecture — Each computing device may include multiple racks, each with one or more processors, network-attached storage devices, and other computer-controlled devices. Racks interconnect through rack switches.
- Prediction Engine Input — Inputting candidate passages, context passages, search queries, and respective titles into score prediction engines.
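The data described above could be modeled roughly as follows. This is a sketch under stated assumptions: the class names `SearchResult` and `ScoringInput`, the `url` field, and the `[QUERY]`, `[CANDIDATE]`, and `[CONTEXT]` markers are all hypothetical, since the source only says that candidate passages, context passages, the query, and titles are fed into the score prediction engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    """One search result: a page with a text passage and a numerical ranking."""
    url: str
    title: str
    passage: str
    rank: int  # 1 = top-ranked


@dataclass(frozen=True)
class ScoringInput:
    """Everything the score prediction engine receives for one query."""
    query: str
    candidate: SearchResult
    context: tuple[SearchResult, ...]

    def to_text(self) -> str:
        # ASSUMPTION: a flat, marker-delimited layout. The patent does
        # not specify how the inputs are serialized.
        parts = [
            f"[QUERY] {self.query}",
            f"[CANDIDATE] {self.candidate.title} | {self.candidate.passage}",
        ]
        parts += [f"[CONTEXT] {r.title} | {r.passage}" for r in self.context]
        return "\n".join(parts)


top = SearchResult("https://a.example", "Boiling point", "Water boils at 100 C.", 1)
other = SearchResult("https://b.example", "Water facts",
                     "At sea level, water boils at 100 C.", 2)
print(ScoringInput("boiling point of water", top, (other,)).to_text())
```

Keeping titles next to their passages reflects the bullet above, which lists "respective titles" as part of the engine input.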
The Process
The method runs as a sequence of numbered steps, from query reception to the display decision. Each step applies one transformation to the data.
- Step 502: Query Reception — The query manager receives query data representing a search query that a user entered into the search engine through one of several input modalities (keyboard, touchscreen, voice).
- Step 504: Results Generation — The search engine manager generates a plurality of search results for the query using conventional selection methods. Each result carries passages relating to the query.
- Step 506: Passage Selection — The prediction engine manager selects a set of passages for scoring: one candidate passage from the top-ranked result, with the remaining passages serving as context (typically 3 passages in total, drawn from the top results).
- Step 508: Accuracy Scoring — The prediction engine manager scores the candidate passage using the context passages, the query portion, and the titles, producing an accuracy score.
- Step 510: Display Decision — If the accuracy score satisfies the threshold, the prediction engine manager provides the candidate passage for display as a short answer on the search results page.
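Steps 506 through 510 can be sketched as a small pipeline. Everything here is illustrative: `Result`, `select_passages`, and `short_answer` are hypothetical names, the default threshold of 0.5 is an arbitrary placeholder, and the scorer is passed in as a plain function because the patent's trained prediction engine is not reproduced here.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class Result(NamedTuple):
    rank: int      # 1 = top-ranked
    title: str
    passage: str


# A scorer takes (query, candidate, context) and returns an accuracy score.
Scorer = Callable[[str, Result, Sequence[Result]], float]


def select_passages(results: Sequence[Result], total: int = 3):
    """Step 506: candidate from the top-ranked result, the rest as context."""
    ordered = sorted(results, key=lambda r: r.rank)[:total]
    if not ordered:
        raise ValueError("no search results to select from")
    return ordered[0], ordered[1:]


def short_answer(query: str, results: Sequence[Result], scorer: Scorer,
                 threshold: float = 0.5) -> Optional[str]:
    """Steps 506-510: return the candidate passage, or None to show no answer."""
    if not results:
        return None
    candidate, context = select_passages(results)
    score = scorer(query, candidate, context)             # Step 508
    return candidate.passage if score >= threshold else None  # Step 510


# Toy scorer (ASSUMPTION): fraction of context passages mentioning "100".
def toy_scorer(query: str, candidate: Result, context: Sequence[Result]) -> float:
    return sum("100" in r.passage for r in context) / len(context) if context else 0.0


results = [
    Result(2, "B", "Water boils at 100 C at sea level."),
    Result(1, "A", "Water boils at 100 C."),
    Result(3, "C", "The boiling point is 100 C."),
]
print(short_answer("boiling point of water", results, toy_scorer))
```

Returning `None` models the case where the score misses the threshold and no short answer is displayed.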
Quality Control
The system's main safeguard is cross-source validation: a candidate answer is checked against passages from other results before it is shown.
- Multi-Source Validation — Uses multiple passages from different search results to validate answer accuracy through consensus.
- Higher Answer Quality — Short answers are more likely to be correct because the system validates them against multiple authoritative sources rather than relying on a single passage.
Real-World Application
The patent shapes how the search engine behaves in production. These are the visible outcomes for users and content publishers.
- Enhanced User Trust — Consensus-based validation increases user confidence in displayed answers, improving the overall search experience.
- Bias Term Application — Adding a bias term (e.g., -0.5) to accuracy scores has the greatest positive effect on precision but may reduce recall.
- Reduced Network Data — Improved answer quality results in fewer follow-up user queries, reducing overall network traffic and server load.
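The precision and recall trade-off behind the bias term can be illustrated with a few lines of arithmetic. The scores and correctness labels below are invented toy values, not results from the patent, and the sketch assumes raw, unbounded scores compared against a threshold of 0.0; `precision_recall` is a hypothetical helper.

```python
def precision_recall(scored, threshold=0.0, bias=0.0):
    """Precision and recall of the answers displayed at a given bias.

    `scored` is a list of (raw_score, is_correct) pairs. An answer is
    displayed when raw_score + bias >= threshold.
    """
    shown = [ok for score, ok in scored if score + bias >= threshold]
    total_correct = sum(ok for _, ok in scored)
    true_pos = sum(shown)
    # Convention used here: precision is 0.0 when nothing is displayed.
    precision = true_pos / len(shown) if shown else 0.0
    recall = true_pos / total_correct if total_correct else 0.0
    return precision, recall


# Toy data (ASSUMPTION): six candidate answers with made-up scores.
SCORED = [(1.8, True), (0.9, True), (0.4, False),
          (0.3, True), (-0.2, False), (-0.6, True)]

print(precision_recall(SCORED))             # (0.75, 0.75)
print(precision_recall(SCORED, bias=-0.5))  # (1.0, 0.5)
```

With the negative bias, borderline answers drop below the threshold: every answer still shown is correct, but half of the correct answers are no longer displayed.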
What This Means for SEO
When short answers are extracted from multiple sources and scored against each other, the page that says the same thing as the consensus wins.
- Consensus Phrasing Is A Ranking Boost — Pages that phrase a fact the way the consensus does are easier to validate. Outlier phrasings, even when correct, lose to consensus phrasings in scoring.
- Citing The Same Sources As Authorities Builds Credibility — When your answer cites the same primary sources the trusted answers cite, you join the consensus cluster. Source overlap is a soft signal of source quality.
- Short Answers Need Long Context — A short answer phrase wins extraction only when surrounded by depth. The model scores the phrase by the context around it, not by the phrase alone.