Matches advertisements to search queries by producing ad queries that combine unigram features with features from external classifiers applied to search results. This Yahoo-era patent carries search-quality signals over into ad relevance.
Patent Overview
- Inventor: Andrei Broder
- Assignee: Yahoo! Inc.
- Filed: 2008-04-03
- Granted: Published 2009-10-08
The Challenge
Ad matching on raw queries fails when queries are short or ambiguous. Augmenting a query with knowledge from its search results (the organic-result content itself) produces a richer feature vector for ad-relevance matching. This is the mechanism that lets ads benefit from organic-search understanding.
- Raw Queries Are Short And Noisy — Search queries average two to three words. Ad matching on raw queries misses intent.
- Organic Results Reveal Intent — Top organic results for a query reveal what the query is really about. Content from those results enriches the query.
- Augmented Queries Improve Matching — Per query, augmented features improve ad-matching precision and recall.
- Augmentation Must Be Fast — Augmentation runs in real time for every query, so the latency budget is tight.
- Manipulation Defense Required — Augmentation must defend against query manipulation aimed at gaming ad-matching.
Innovation
How The System Works
The system runs the query against organic search to retrieve top results, extracts unigram features from result content, classifies results via external classification systems, augments the original query with these features, and uses the augmented feature vector for ad-relevance matching.
- Receive Query — Per query, ad-matching pipeline activates.
- Run Against Organic — Query runs against organic search; top results retrieved.
- Extract Unigram Features — Per top result, unigram features extracted from content.
- Apply Classification Features — Per result, external classification systems produce additional features.
- Augment Query Vector — Original query plus extracted features form augmented vector.
- Match Against Ad Inventory — Augmented vector matches against ad inventory.
- Rank Ads By Relevance — Matching ads ranked by combined relevance score.
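The feature-extraction and augmentation steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the stopword list, the `q:`/`u:`/`c:` feature prefixes, and the keyword-lookup `classify` function (a hypothetical stand-in for an external classification system) are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "to", "in", "is"}


def extract_unigrams(text: str) -> Counter:
    """Lowercase, tokenize on alphanumerics, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def classify(text: str) -> list[str]:
    """Hypothetical stand-in for an external classifier: keyword lookup."""
    taxonomy = {"travel": {"flight", "hotel"}, "autos": {"car", "sedan"}}
    words = set(extract_unigrams(text))
    return sorted(label for label, cues in taxonomy.items() if words & cues)


def augment_query(query: str, top_results: list[str]) -> Counter:
    """Combine query unigrams with unigram and class features from results."""
    features = Counter({f"q:{t}": c for t, c in extract_unigrams(query).items()})
    for doc in top_results:
        for term, count in extract_unigrams(doc).items():
            features[f"u:{term}"] += count   # unigram features from result content
        for label in classify(doc):
            features[f"c:{label}"] += 1      # classification features per result
    return features


# Toy "top organic results" for an ambiguous one-word query.
results = ["Jaguar sedan review: the car handles well", "Used Jaguar car prices"]
vector = augment_query("jaguar", results)
```

Here the one-word query "jaguar" picks up an `autos` class feature and car-related unigrams from the results, which is the kind of disambiguation the augmentation is meant to provide.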
Organic Results Enrich Ad Matching
The patent's load-bearing idea is that organic-search understanding can be borrowed for ad matching. Augmenting queries with top-result features turns brief queries into rich feature vectors.
Borrow Knowledge From Organic
Organic search has already done the intent inference. Borrowing those results' features enriches ad matching without re-doing the work.
- Top-Result Feature Extraction — Per top organic result, unigram features extracted.
- External Classification Augmentation — External classification systems add features beyond unigrams.
- Augmented Vector Matching — Augmented feature vector matches ad inventory.
Technical Foundation
The patent specifies the query handler, organic searcher, unigram extractor, classifier, augmenter, ad matcher, and ranker.
- Query Handler — Receives query for ad matching.
- Organic Searcher — Runs query against organic search; retrieves top results.
- Unigram Extractor — Per result, extracts unigram features.
- Classifier — External classification systems produce features.
- Augmenter — Original query plus features form augmented vector.
- Ad Matcher — Augmented vector matches ad inventory.
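One way to picture how these components fit together is as pluggable callables wired into a single handler. The sketch below is an assumption about structure, not something the source specifies: the `AdPipeline` class, its field names, and all the `toy_*` functions are hypothetical and exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable

Features = dict[str, float]


@dataclass
class AdPipeline:
    """Wires the components together; each field is a pluggable callable."""
    search: Callable[[str], list[str]]       # Organic Searcher
    extract: Callable[[str], Features]       # Unigram Extractor
    classify: Callable[[str], Features]      # Classifier
    match: Callable[[Features], list[str]]   # Ad Matcher

    def handle(self, query: str) -> list[str]:
        """Query Handler entry point; the loop below plays the Augmenter role."""
        vector: Features = dict(self.extract(query))
        for doc in self.search(query):
            for source in (self.extract(doc), self.classify(doc)):
                for name, weight in source.items():
                    vector[name] = vector.get(name, 0.0) + weight
        return self.match(vector)


# Toy components (all hypothetical).
def toy_search(query: str) -> list[str]:
    return ["cheap flights to rome", "rome flight deals"]


def toy_extract(text: str) -> Features:
    out: Features = {}
    for tok in text.lower().split():
        out[tok] = out.get(tok, 0.0) + 1.0
    return out


def toy_classify(text: str) -> Features:
    return {"class:travel": 1.0} if "flight" in text else {}


def toy_match(vector: Features) -> list[str]:
    ads = {"airline-ad": {"flights", "class:travel"}, "shoe-ad": {"sneakers"}}
    return [ad for ad, keys in ads.items() if keys & set(vector)]


pipeline = AdPipeline(toy_search, toy_extract, toy_classify, toy_match)
ads = pipeline.handle("rome")
```

Keeping each component behind a plain callable makes it straightforward to swap a toy classifier for a real external service without touching the handler.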
The Process
Per query, the augmentation and matching pipeline runs in real time.
- Receive Query — Query arrives.
- Organic Search — Top organic results retrieved.
- Extract Unigrams — Per result, unigrams extracted.
- Apply Classifiers — External classifiers produce features.
- Augment — Augmented vector built.
- Match Ads — Vector matches inventory.
- Rank And Return — Top ads returned.
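The final matching and ranking steps can be illustrated with a simple similarity score. The source does not say how relevance is computed, so cosine similarity over sparse feature dictionaries is used here purely as an assumed scoring function; the feature names and the toy inventory are hypothetical.

```python
import heapq
import math


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def rank_ads(
    vector: dict[str, float], inventory: dict[str, dict[str, float]], k: int = 2
) -> list[str]:
    """Return up to k ad ids, highest score first, dropping zero-score ads."""
    scored = ((cosine(vector, feats), ad_id) for ad_id, feats in inventory.items())
    return [ad_id for score, ad_id in heapq.nlargest(k, scored) if score > 0]


# Hypothetical augmented query vector and ad inventory.
vector = {"u:flight": 2.0, "c:travel": 1.0, "q:rome": 1.0}
inventory = {
    "airline": {"u:flight": 1.0, "c:travel": 1.0},
    "hotel": {"c:travel": 1.0},
    "shoes": {"u:sneakers": 1.0},
}
top_ads = rank_ads(vector, inventory, k=3)
```

Using `heapq.nlargest` avoids sorting the full inventory when only a handful of ads are returned, which matters under a tight per-query latency budget.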
Quality Control
Wrong augmentation hurts ad relevance. The patent specifies safeguards.
- Organic-Result Quality Validation — Top organic results validated for quality before feature extraction.
- Feature Stability — Per query, feature stability across organic-result snapshots monitored.
- Manipulation Pattern Detection — Suspicious query patterns flagged before augmentation.
- Adversarial Ad Defense — Ads manipulating organic content to manipulate augmentation detected.
- Continuous Recalibration — Augmentation and matching models refresh against fresh data.
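As one illustration of the feature-stability safeguard, the overlap between feature sets taken from successive organic-result snapshots can be measured directly. The source does not state which metric is used; Jaccard similarity and the 0.5 threshold below are assumptions chosen for the sketch.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def is_stable(snapshots: list[set[str]], threshold: float = 0.5) -> bool:
    """True when every consecutive pair of snapshots overlaps enough."""
    return all(
        jaccard(prev, cur) >= threshold
        for prev, cur in zip(snapshots, snapshots[1:])
    )


# Hypothetical feature sets for the same query at different times.
steady = [{"flight", "rome", "travel"}, {"flight", "rome", "deals"}]
drifting = [{"flight", "rome"}, {"sneakers"}]

steady_ok = is_stable(steady)
drifting_ok = is_stable(drifting)
```

A query whose features fail such a check could be matched on its raw terms alone until its organic results settle, though that fallback is also an assumption rather than something the source describes.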
Real-World Application
Query augmentation via organic-result features is foundational for modern ad-matching. The pattern of borrowing organic-search understanding for ad systems appears across search-ad infrastructure.
- Feature Source: organic-borrowed. Top organic results provide augmentation features.
- Feature Types: unigram + classifier. Unigram features combine with external classifications.
- Latency Class: real-time. Augmentation runs per query in real time.
Why Strong Organic Performance Improves Ad Auction Position
When your content ranks well organically, its terms and categories feed the augmented feature vectors for queries in your topical area. Indirectly, this shapes which ads compete and how, so a strong organic presence influences adjacent ad-auction dynamics.
Why Topical Relevance Compounds Across Ad And Organic
Content that wins organic for a topical area provides the augmentation features that shape ad matching for the topic. Topical relevance compounds across both surfaces.
What This Means for SEO
This Yahoo-era patent augments short ad queries with unigram and classification features pulled from the top organic results for that query. SEO implication: strong organic performance for a topic feeds the feature vectors that shape adjacent systems, so organic relevance has reach beyond organic.
- Organic Results Are Treated As Intent Truth — The system borrows the top organic results as the best available read of what a query means. Ranking organically for a topic means your content is part of how the query gets interpreted downstream.
- Short Queries Get Enriched From Your Content — Two-or-three-word queries are expanded using features extracted from result content. The words and topical framing on top-ranking pages become part of the augmented query vector.
- Topical Relevance Compounds Across Surfaces — Content that wins organic for a topical area supplies the augmentation features for that area. Winning the topic organically has knock-on effects on how related systems classify the same intent.
- Classification Features Reward Clear Topicality — Beyond raw words, external classifiers add category-level features from the results. Content that classifies cleanly into a clear topic contributes stronger, less ambiguous features.
- Augmentation Defends Against Manipulation — The patent specifies manipulation-pattern detection on suspicious query and content behavior. Trying to game augmentation by stuffing organic content with off-topic terms invites flagging rather than benefit.
- Feature Stability Favors Consistent Content — The system monitors feature stability across result snapshots. Content that consistently represents a topic produces stable features; volatile or contradictory pages produce noisy ones.
- Quality-Validated Results Drive The Vector — Top results are validated for quality before their features are used. Earning a top organic position with genuinely high-quality content is what gets your content into the augmentation pipeline at all.