The patent describes generating query suggestions tailored to the user. It extracts entities from the user's past queries and builds an entity collection that anchors suggestions to the user's topical interests rather than to population-average autocompletes.
Patent Overview
- Inventor: Nitin Gupta
- Assignee: Google LLC
- Filed: 2014-09-12
- Granted: 2016-05-17
- Application Number: US 14/484,757
The Challenge
Suggestion Surfaces Are Population-Average, Not Personal
Autocomplete and query-suggestion surfaces typically draw from population-average behaviors. A user who searches consistently for software-engineering topics still sees suggestions skewed by the global query distribution. The system needs to use the user's past queries to extract entities they actually care about, then anchor suggestions to those entities so the surface reflects the user rather than the average.
- Global Suggestions Miss Personal Topics — Without personalization, a user searching for niche topics sees suggestions dominated by mainstream queries. The surface fails the niche audience while serving the median user.
- Past Queries Carry Entity Signal — Each query a user has issued contains entities that reveal their interests. Aggregating entities across past queries produces a user-specific topical profile.
- Need Entity-Level Granularity — Personalizing at the query-string level is too coarse. Entity-level personalization captures the underlying interests while being robust to phrasing variations across queries.
- Candidate Suggestions Must Be Scored Against The Profile — Once an entity collection is built from past queries, candidate suggestions need a similarity measure against that collection so the most-relevant suggestions surface first.
- New Queries Should Update The Collection — The entity collection is dynamic. Each new query adds entities; old entities decay. The personalization adapts as the user's interests evolve.
Innovation
Entity Collections From Past Queries
The system extracts entities from one or more of the user's past queries to build an entity collection representing their topical interests. When candidate query suggestions are identified for the current query, each candidate is scored by similarity to the entity collection. Suggestions whose entities align most strongly with the user's collection surface first.
- Identify Past Queries — Pull one or more queries the user has issued in their history. Recency-weighting may emphasize recent queries over older ones.
- Extract Entities Per Query — For each past query, run entity extraction to identify the entities referenced. The extracted entities accumulate into a per-user entity collection.
- Receive Current Query — When the user types a new query (or partial query), produce candidate query suggestions from the standard suggestion pipeline.
- Extract Entities From Each Candidate — For each candidate suggestion, extract the entities it references.
- Compute Candidate-To-Collection Similarity — Score each candidate by how strongly its entities overlap with the user's entity collection. Stronger overlap means better personal fit.
- Rank Suggestions By Similarity — Surface the highest-similarity suggestions first. Generic suggestions that have no entity overlap with the user's profile fall to the bottom.
- Update Collection On New Query — Whichever suggestion the user selects (or whatever query they actually issue) feeds new entities back into the collection. Personalization improves with use.
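The steps above can be sketched end to end in a few lines of Python. This is a minimal illustration, not the patent's implementation: `ENTITY_LEXICON`, `extract_entities`, `build_collection`, and `rank_suggestions` are hypothetical names, and the substring lookup stands in for whatever entity extractor a real system would use. Similarity here is a plain overlap count.

```python
from collections import Counter

# Hypothetical toy lexicon mapping phrases to entity names. A real system
# would use a proper entity linker; this table is an assumption.
ENTITY_LEXICON = {
    "python": "Python (language)",
    "asyncio": "Python (language)",
    "rust": "Rust (language)",
    "borrow checker": "Rust (language)",
    "taylor swift": "Taylor Swift",
}


def extract_entities(query: str) -> set[str]:
    """Return the entities whose lexicon phrase occurs in the query text."""
    text = query.lower()
    return {entity for phrase, entity in ENTITY_LEXICON.items() if phrase in text}


def build_collection(past_queries: list[str]) -> Counter:
    """Count, per entity, how many past queries mention it."""
    collection = Counter()
    for query in past_queries:
        collection.update(extract_entities(query))
    return collection


def rank_suggestions(candidates: list[str], collection: Counter) -> list[str]:
    """Order candidates by the number of their entities found in the collection."""
    known = set(collection)

    def overlap(candidate: str) -> int:
        return len(extract_entities(candidate) & known)

    # sorted() is stable, so candidates with equal overlap keep pipeline order.
    return sorted(candidates, key=overlap, reverse=True)


past_queries = ["python asyncio tutorial", "rust borrow checker error"]
collection = build_collection(past_queries)

candidates = ["taylor swift tour", "rust vs go", "tax forms"]
ranked = rank_suggestions(candidates, collection)

# Feed the query the user actually issues back into the collection.
collection.update(extract_entities(ranked[0]))
```

With this toy data, "rust vs go" moves to the front because it is the only candidate sharing an entity with the collection; the other two keep their original relative order.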
Entity Collection As Personal Topic Profile
The patent's central contribution is treating past queries as a source of entities rather than as past query strings. The entity collection is a compact representation of the user's topical interests that drives suggestion ranking across all future queries.
Entities Are The Persistence Layer
Query strings change; entities persist. By extracting entities, the system gets a stable representation of user interests that handles phrasing variation gracefully.
- Past Query Mining — Run entity extraction across the user's query history. Accumulate the extracted entities into a per-user collection.
- Candidate Entity Match — For each suggestion candidate, extract its entities and compare against the user's collection. Overlap drives ranking.
- Continuous Update — Every new query updates the collection. Personalization compounds over time without explicit user setup.
Personalization is implicit, entity-driven, and silently improves with every search.
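One way to picture continuous updating with recency weighting is an exponentially decaying weight per entity. The sketch below is an assumption about how such a collection might be kept, not something the patent specifies: the class name `DecayingEntityCollection`, the `half_life_days` parameter, and the unit credit per query are all hypothetical, and timestamps are assumed to be non-decreasing.

```python
import math


class DecayingEntityCollection:
    """Per-user entity weights that fade over time (illustrative only)."""

    def __init__(self, half_life_days: float = 30.0):
        # half_life_days is a hypothetical tuning knob, not a patent value.
        self.decay_rate = math.log(2) / half_life_days
        self.weights: dict[str, float] = {}
        self.last_update_day = 0.0

    def _decay_to(self, day: float) -> None:
        """Shrink every weight by the time elapsed since the last update."""
        elapsed = day - self.last_update_day
        factor = math.exp(-self.decay_rate * elapsed)
        self.weights = {e: w * factor for e, w in self.weights.items()}
        self.last_update_day = day

    def add_query_entities(self, entities: set[str], day: float) -> None:
        """Decay existing weights, then credit each entity in the new query."""
        self._decay_to(day)
        for entity in entities:
            self.weights[entity] = self.weights.get(entity, 0.0) + 1.0


profile = DecayingEntityCollection(half_life_days=30.0)
profile.add_query_entities({"Rust (language)"}, day=0)
profile.add_query_entities({"Python (language)"}, day=30)
```

After these two updates the older entity has lost half its weight while the newer one carries full weight, so the profile leans toward the user's recent interests without any explicit setup.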
Technical Foundation
The Similarity Computation
The candidate-to-collection similarity is the core metric driving personalized suggestion ranking.
- User Entity Collection — Aggregate of entities extracted from past queries, optionally weighted by recency and frequency. Represents the user's topical profile.
- Candidate Suggestion Entities — Entities extracted from a candidate query suggestion.
- Similarity Measure — Function over (candidate entities, user collection). Common forms: overlap count, weighted Jaccard, embedding similarity if entities have embeddings.
- Personalization Weight — How much the similarity score influences the final ranking. Tunable so the personalization can be strengthened or relaxed per query class.
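The personalization weight can be illustrated as a linear blend between the suggestion pipeline's base score and the similarity score. The blend form, the function names `blended_score` and `rank`, the `CLASS_WEIGHTS` table, and all numeric scores below are assumptions chosen for the example; the source only says the weight is tunable per query class.

```python
def blended_score(base_score: float, similarity: float, weight: float) -> float:
    """Linear blend: weight 0 ignores personalization, weight 1 uses only it."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return (1.0 - weight) * base_score + weight * similarity


def rank(candidates: list[tuple[str, float, float]], weight: float) -> list[str]:
    """Rank (text, base_score, similarity) tuples by their blended score."""
    scored = sorted(
        candidates,
        key=lambda c: blended_score(c[1], c[2], weight),
        reverse=True,
    )
    return [text for text, _base, _sim in scored]


# Hypothetical per-query-class weights.
CLASS_WEIGHTS = {"navigational": 0.1, "informational": 0.5}

# (suggestion text, base popularity score, similarity to the user's collection)
candidates = [
    ("weather today", 0.9, 0.0),
    ("rust lifetimes explained", 0.6, 0.8),
]

light = rank(candidates, CLASS_WEIGHTS["navigational"])
strong = rank(candidates, CLASS_WEIGHTS["informational"])
```

With the low weight the popular generic suggestion stays on top; with the higher weight the suggestion that matches the user's entities overtakes it.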
Quality Metrics
- Candidate-Collection Similarity — Jaccard-style overlap. Weighted variants give more credit to candidates whose entities match higher-weighted entities in the user's collection.
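sim(C, U) = |entities(C) ∩ U| / |entities(C) ∪ U|, where C is a candidate suggestion, entities(C) is the set of entities extracted from it, and U is the set of entities in the user's collection.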
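The formula, and one possible weighted variant, can be worked through in Python. The weighted form below (sum of element-wise minima over sum of element-wise maxima, with each candidate entity counted at weight 1) is a common generalization of Jaccard and is an assumption here, as are the function names and the example weights, which are taken to lie in the range 0 to 1.

```python
def jaccard(candidate_entities: set[str], user_entities: set[str]) -> float:
    """Plain Jaccard: shared entities divided by all distinct entities."""
    union = candidate_entities | user_entities
    if not union:
        return 0.0
    return len(candidate_entities & user_entities) / len(union)


def weighted_jaccard(candidate_entities: set[str],
                     user_weights: dict[str, float]) -> float:
    """Weighted Jaccard: sum of minima over sum of maxima.

    Candidate entities count with weight 1.0; user weights are assumed
    to be normalized into [0, 1].
    """
    all_entities = candidate_entities | set(user_weights)
    numerator = 0.0
    denominator = 0.0
    for entity in all_entities:
        c = 1.0 if entity in candidate_entities else 0.0
        u = user_weights.get(entity, 0.0)
        numerator += min(c, u)
        denominator += max(c, u)
    return numerator / denominator if denominator else 0.0


# Hypothetical user collection with normalized weights.
user_weights = {"Rust (language)": 1.0, "Python (language)": 0.5, "Go (language)": 0.25}

plain_rust = jaccard({"Rust (language)"}, set(user_weights))
plain_go = jaccard({"Go (language)"}, set(user_weights))
weighted_rust = weighted_jaccard({"Rust (language)"}, user_weights)
weighted_go = weighted_jaccard({"Go (language)"}, user_weights)
```

Plain Jaccard gives both candidates the same score because each shares one of three entities. The weighted variant separates them, since matching the heavily weighted entity earns more credit than matching the lightly weighted one. Note that with either form the denominator grows as the user's collection grows, so a real system would likely need some normalization.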
Key Insight: Treating queries as evidence about entities (not as the entities themselves) is the abstraction that makes personalization scale. The user's collection is much smaller than their query history and remains useful even when the user's exact phrasing changes. Entity-level persistence handles vocabulary drift, paraphrasing, and partial queries gracefully.
What This Means for SEO
Entity-collection personalization is one of the quiet mechanisms that can shape a user's suggestions and, through the queries they go on to pick, the results they see. Understanding it changes how to think about entity-aligned content and audience-specific positioning.
- Entity Alignment Builds Personalized Authority — When a user has built up an entity collection on a topic, content that cleanly aligns with those entities surfaces more often for that user. Entity-aligned content compounds with audience-defined SEO.
- Entity Markup Compounds With This Signal — Pages with strong entity markup (schema, knowledge-graph references, explicit entity mentions) are easier to extract entities from and match against user collections. Implicit entity recognition is fine; explicit markup is better.
- Topical Consistency Beats Topic Hopping — Users who consistently engage with content on a single topic build a focused entity collection. Content that maintains topical consistency aligns with focused collections better than scattered content does.
- Long-Term Audiences Benefit Most — Personalization compounds with use. Long-term audiences who have built rich entity collections see your content surface more easily once you're in their topical neighborhood. First-touch acquisition matters because the long-term return is amplified.
- Niche Topics Are Advantaged — Mainstream entities appear in almost every user's collection, so they do little to distinguish one user from another. Niche entities appear in far fewer collections, so they sharply separate aligned content from non-aligned content. Niche-focused content therefore gets more personalization leverage.