Filters spurious interests out of a personalized content feed by combining a curation score with a good-interest probability, so that noisy click signals do not distort personalization.
Patent Overview
- Inventor: Steven D. Baker
- Assignee: Google LLC
- Filed: 2018-08-22
- Granted: 2021-12-14
- Application Number: US 16/109,318
The Challenge
Not Every Inferred Interest Should Reach The Feed
A personalized content feed only earns user trust when the interests behind its picks are real. Inferred interests can be noisy: a one-off search, a friend's link, an accidental tap. Feeding the system every inferred interest produces a feed full of false topics that erodes the value of personalization. The validation gate exists to prevent that erosion.
- Click Signals Are Noisy — A single click on a topic can come from curiosity, accident, or external context that does not reflect ongoing interest. Treating every click as an interest declaration over-fits to ephemeral behavior.
- Feeds Reward Reinforcement Loops — Without filtering, feeds quickly drift toward whatever the user clicked once, drowning out actual interests. The feed becomes a hall of mirrors reflecting noise back at the user.
- Need A Validation Gate Before Feed Insertion — Each candidate interest needs a pre-feed validation that combines multiple signals to decide whether to include it in the personalization profile. The gate is what prevents noise from compounding.
- Curation Quality Of The Entity Matters — Entities with poorly curated representations in the knowledge layer are risky personalization targets because the system has weak ability to find good content about them. Curation quality is a precondition for inclusion.
- User-Outcome Estimation Should Drive Decisions — Pure entity-quality signals are not enough. The system also needs an estimate of whether including this entity in the feed actually helps the user. That estimate must be probabilistic.
Innovation
Curation Score Plus Good-Interest Probability
For each candidate interest entity, the system computes two scores: a curation score that reflects how well the entity is structured and how curated its associated content is, and a good-interest probability that estimates whether including this entity in the feed will be valuable to the user. The combined signal decides whether the entity enters the personalization profile.
- Identify Candidate Interest Entity — An entity is surfaced as a candidate interest based on user behavior, query history, or other upstream signals. The candidate is provisional until validated.
- Compute Curation Score — Score the entity on how cleanly it is defined, how curated its associated content is, and how stable its representation is in the knowledge layer. Higher scores indicate the system can reliably find good content for the entity.
- Compute Good-Interest Probability — Estimate the probability that promoting this entity into the user's feed produces a positive user outcome. The estimate can use historical feed-performance data and user-specific signals.
- Combine The Two — Use both scores together. An entity needs both decent curation and a decent good-interest probability to qualify. Either alone is insufficient.
- Promote Or Exclude — Entities that clear both thresholds enter the personalization profile. Entities that fail are excluded, even if user behavior superficially suggested interest.
- Generate Personalized Feed — Build the user's content feed from web documents associated with the validated entities, weighted by the combined score. The feed reflects only validated interests.
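The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: the threshold values, the `Candidate` type, the sample entities, and the choice to combine the two scores by multiplying them are all assumptions made for the example, since the source gives no numeric values or combination formula.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the source does not give numeric values.
CURATION_THRESHOLD = 0.6
GOOD_INTEREST_THRESHOLD = 0.5


@dataclass(frozen=True)
class Candidate:
    entity: str
    curation_score: float       # assumed to lie in [0, 1]
    good_interest_prob: float   # probability in [0, 1]


def is_validated(c: Candidate) -> bool:
    """Promote only when both scores clear their thresholds."""
    return (c.curation_score >= CURATION_THRESHOLD
            and c.good_interest_prob >= GOOD_INTEREST_THRESHOLD)


def combined_score(c: Candidate) -> float:
    """Assumed combination rule for weighting: the product of the two scores."""
    return c.curation_score * c.good_interest_prob


def build_feed(candidates, docs_by_entity, limit=10):
    """Rank documents of validated entities by their entity's combined score."""
    ranked = []
    for c in candidates:
        if not is_validated(c):
            continue  # excluded even if behavior hinted at interest
        weight = combined_score(c)
        for doc in docs_by_entity.get(c.entity, []):
            ranked.append((weight, doc))
    # Stable sort: documents of the same entity keep their original order.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:limit]]


candidates = [
    Candidate("trail running", 0.9, 0.8),
    Candidate("celebrity gossip", 0.9, 0.2),  # one-off click, low probability
    Candidate("obscure widget", 0.3, 0.9),    # poorly curated entity
    Candidate("sourdough baking", 0.7, 0.6),
]
docs_by_entity = {
    "trail running": ["trail-shoes-review", "ultra-training-plan"],
    "celebrity gossip": ["gossip-roundup"],
    "obscure widget": ["widget-forum-thread"],
    "sourdough baking": ["starter-guide"],
}
feed = build_feed(candidates, docs_by_entity)
```

In this sketch only "trail running" and "sourdough baking" reach the feed. The other two candidates each fail one threshold, so their documents never appear regardless of how strong the other score is.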
Validation Before Personalization
The patent's contribution is putting a validation gate between interest inference and feed insertion. The gate combines structural quality (curation) with user-outcome probability so that only interests that pass both filters reach the feed.
Both Filters Required
An entity must clear both the curation threshold and the good-interest probability threshold to be promoted. The conjunction prevents either signal type from dominating.
- Curation Score — A structural quality measure of the entity itself: how well defined, how cleanly represented, how much curated content exists. The system's read of whether it can find good results for the entity.
- Good-Interest Probability — A user-outcome estimate of whether including this entity helps. Probabilistic, can be tuned per user, and incorporates historical feed-performance data.
Personalization respects two questions: can we serve this entity well, and should we?
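To see why the conjunction matters, it helps to compare it with a single averaged score. The sketch below is illustrative only: the function names, thresholds, and the averaging alternative are assumptions introduced for contrast, not something the source describes.

```python
def passes_conjunction(curation, probability,
                       curation_min=0.6, probability_min=0.5):
    """Gate described above: both scores must clear their own threshold."""
    return curation >= curation_min and probability >= probability_min


def passes_average(curation, probability, cutoff=0.55):
    """Hypothetical alternative: one cutoff on the mean of the two scores."""
    return (curation + probability) / 2 >= cutoff


# A very well curated entity the user is unlikely to care about.
well_curated_unwanted = (0.95, 0.2)
# A likely interest whose entity is poorly curated.
wanted_poorly_curated = (0.2, 0.95)
# An entity that is decent on both counts.
balanced = (0.7, 0.6)

results = {
    name: (passes_conjunction(*scores), passes_average(*scores))
    for name, scores in {
        "well_curated_unwanted": well_curated_unwanted,
        "wanted_poorly_curated": wanted_poorly_curated,
        "balanced": balanced,
    }.items()
}
```

Under the averaged rule, one very high score compensates for a very low one and both lopsided candidates pass. Under the conjunction, only the balanced candidate passes, which is the behavior the two-question framing asks for.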
Technical Foundation
Two Independent Quality Filters
The validation depends on combining a structural quality score about the entity itself with a probabilistic estimate about user value.
- Curation Score — A measure of how curated and well-defined the entity is in the system's knowledge layer. Poorly curated entities are risky personalization targets.
- Good Interest Probability — An estimate of the probability that including this entity in the user's feed will be valuable to the user, based on patterns of user behavior and feed performance.
- Combined Decision — Both signals must meet thresholds for the entity to enter the user's interest profile and influence feed selection.
- Threshold Stratification — Different threshold combinations can promote entities at different confidence levels (top interest vs. background interest), influencing how much weight each gets in feed selection.
Key Insight: Personalization quality depends as much on what is excluded as on what is included. The validation gate's primary value is rejecting noisy interest signals before they corrupt the feed. The combination of two independent filters (entity quality and user-outcome estimate) makes the gate resistant to single-signal failure modes.
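Threshold stratification can be pictured as an ordered list of threshold pairs, checked from strictest to loosest. The tier names follow the "top interest vs. background interest" wording above, but the numeric thresholds, the feed weights, and the `assign_tier` helper are hypothetical values chosen for illustration.

```python
from typing import Optional, Tuple

# Hypothetical tiers, strictest first:
# (tier name, minimum curation score, minimum good-interest probability, feed weight)
TIERS = [
    ("top", 0.8, 0.7, 1.0),
    ("background", 0.6, 0.5, 0.4),
]


def assign_tier(curation: float, probability: float) -> Optional[Tuple[str, float]]:
    """Return (tier name, feed weight) for the strictest tier cleared, else None."""
    for name, min_curation, min_probability, weight in TIERS:
        if curation >= min_curation and probability >= min_probability:
            return name, weight
    return None  # fails every tier: the entity stays out of the profile


strong = assign_tier(0.9, 0.8)      # clears the strict pair
moderate = assign_tier(0.9, 0.6)    # probability too low for "top"
rejected = assign_tier(0.5, 0.9)    # curation too low for any tier
```

Each tier is still a conjunction of both thresholds, so stratification changes how much weight a validated interest gets without letting a single strong signal admit an entity on its own.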
The Process
Validation Pipeline
The validation runs offline (for stable interests) and online (for newly inferred interests). The output is a validated interest set per user that the feed runtime consumes.
- Candidate Surfaced — Upstream signals (clicks, queries, dwell time) surface a candidate interest entity for the user.
- Curation Lookup — Look up the entity's curation score from the knowledge layer. Entities below the curation threshold are excluded immediately.
- Probability Computation — Estimate the good-interest probability using historical feed performance for this entity and similar users.
- Threshold Check — Compare both scores against their thresholds. The entity passes only if both clear.
- Profile Update — Promoted entities are added to the user's interest profile. The profile is consumed by the feed-selection runtime to weight content choices.
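The pipeline can be sketched as a single function that exits early on the curation lookup and only computes the probability for entities that survive it. Everything concrete here is an assumption for the sake of a runnable example: the in-memory `CURATION_SCORES` and `FEED_HISTORY` tables stand in for the knowledge layer and feed-performance logs, the thresholds are invented, and the smoothed positive-outcome rate is just one simple way to turn historical feed performance into a probability.

```python
CURATION_THRESHOLD = 0.6      # hypothetical value
PROBABILITY_THRESHOLD = 0.5   # hypothetical value

# Hypothetical stand-in for the knowledge layer: entity -> curation score.
CURATION_SCORES = {
    "trail running": 0.9,
    "obscure widget": 0.3,
    "celebrity gossip": 0.85,
}
# Hypothetical feed-performance log among similar users:
# entity -> (positive outcomes, impressions)
FEED_HISTORY = {
    "trail running": (80, 100),
    "celebrity gossip": (5, 100),
}


def good_interest_probability(entity, prior_positive=1, prior_total=2):
    """Smoothed positive-outcome rate; with no history it falls back to 0.5."""
    positive, total = FEED_HISTORY.get(entity, (0, 0))
    return (positive + prior_positive) / (total + prior_total)


def validate(entity, profile):
    """Run one surfaced candidate through the gate and report the decision."""
    curation = CURATION_SCORES.get(entity, 0.0)
    if curation < CURATION_THRESHOLD:
        return "excluded: curation"       # early exit, probability never computed
    probability = good_interest_probability(entity)
    if probability < PROBABILITY_THRESHOLD:
        return "excluded: probability"
    # Profile update: store an assumed weight for the feed-selection runtime.
    profile[entity] = curation * probability
    return "promoted"


profile = {}
surfaced = ["trail running", "obscure widget", "celebrity gossip"]
decisions = {entity: validate(entity, profile) for entity in surfaced}
```

Here "obscure widget" is dropped at the curation lookup, "celebrity gossip" passes curation but fails on its low positive-outcome rate, and only "trail running" lands in the profile. The same function could serve both the offline and online paths, since it depends only on the two lookups.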
What This Means for SEO
Personalized feeds (Discover, content surfacing within Search) rely on a clean inventory of validated interest entities. The implications for content strategy in feed-eligible spaces are direct.
- Entity-Aligned Content Is Feed-Eligible — Content that is clearly about a recognized entity, with structured signals (titles, schema, internal linking) reinforcing that alignment, has a better chance of being selected when that entity is a validated interest in a user's profile.
- Topical Clarity Increases Curation Score — Pages that drift across multiple loosely related entities split their signal and are less curated from the system's perspective. Tightly scoped pages reinforce their entity association and lift the underlying entity's curation score.
- Watch For Niche Entity Surfaces — Niche entities with limited curated content tend to fall below the curation threshold for feed inclusion. Building the canonical content for an emerging entity is one way to lift its curation score and unlock feed eligibility.
- Knowledge-Graph Presence Helps — Entities with strong knowledge graph representation (Wikipedia, structured data, authoritative sources) have higher curation scores. Authoritative coverage of an entity feeds the curation signal.
- User-Outcome Signals Compound — When users engage well with feed items about an entity (clicks, dwell, return visits), the good-interest probability for that entity rises. Pages that consistently produce good user outcomes feed the system's confidence in their underlying entities.
- Stable Entity Representation Matters — Entities whose representation changes frequently (renamed, redefined, merged) have lower curation scores because the system cannot rely on them. Establish stable canonical naming early.