Filters search results using user-supplied annotations on documents, supporting collaborative curation and personalized result filtering through declarative tags that can be applied per-document, per-domain, or per-topic.
Patent Overview
- Inventor: Ramanathan V. Guha
- Assignee: Google LLC
- Filed: 2010-09-30
- Granted: 2012-12-25
- Application Number: US 12/894,931
The Challenge
Algorithmic ranking captures broad relevance and quality signals, but individual users have specific filtering preferences that the algorithm cannot infer: 'show me only academic sources', 'exclude this domain', 'highlight content my colleagues endorsed'. The system needed user-supplied annotation as a filtering input.
- Algorithms Cannot Infer Every Preference — Some user filtering preferences are specific and idiosyncratic. The algorithm has no way to discover them automatically; user-supplied annotations fill the gap.
- Annotations Enable Collaborative Curation — When users in a community annotate documents (for example as academic, expert, or trusted), the annotations become a collaborative quality signal. The system can leverage this community-curated metadata.
- Annotations Must Be Granular — Per-document, per-domain, per-topic annotations all matter. The system must support multiple granularities and apply them appropriately.
- Filtering Must Be Composable — Users may apply multiple annotation filters simultaneously: 'academic sources only' plus 'exclude these domains'. The filter combinator must support boolean composition.
- Privacy Boundaries Matter — Personal annotations stay private; community annotations are shared. The system must respect these boundaries and let users control sharing.
Innovation
How The System Works
The patented system collects user-supplied annotations on documents, domains, and topics, stores them per user and per community with privacy boundaries, and applies them as filters during search retrieval, so the result set reflects both algorithmic ranking and user-declared filtering preferences.
- User Annotates Content — Through the UI, users annotate documents, domains, or topics with declarative tags such as 'academic', 'spam', 'high-quality', 'preferred-source', or custom tags. Annotations are saved to a per-user store.
- Community Annotation Sharing — Users can share annotations with communities (groups, organizations, or the public). Shared annotations become a collaborative curation signal.
- Apply At Query Time — When the user queries, applicable annotations apply as filters. Per-document annotations filter individual results; per-domain annotations filter all results from that domain.
- Compose Filters — Multiple filters combine via boolean composition. Users can intersect, union, or negate annotation-based filters as needed.
- Respect Privacy Boundaries — Personal annotations apply only to the annotating user. Community annotations apply within the sharing scope. Privacy is enforced at filter-application time.
- Return Filtered Results — Filtered results reflect both algorithmic ranking and user-applied annotations. Users see what they want and exclude what they do not.
- Refine Annotations Over Time — Users add, remove, or refine annotations as their preferences evolve. Communities update shared annotations as curation effort progresses.
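The query-time step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Annotation` class, the `tags_for` and `drop_tagged` helpers, and the rule that a domain annotation also covers subdomains are all assumptions made for the example, and topic-level matching is left out.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Annotation:
    target_kind: str  # "document" or "domain" (topic matching omitted here)
    target: str       # full URL for a document, hostname for a domain
    tag: str


def tags_for(url: str, annotations: list) -> set:
    """Collect every tag that applies to a result URL at either granularity."""
    host = urlparse(url).hostname or ""
    tags = set()
    for a in annotations:
        if a.target_kind == "document" and a.target == url:
            tags.add(a.tag)
        elif a.target_kind == "domain" and (
            host == a.target or host.endswith("." + a.target)
        ):
            # Assumption: a domain annotation also covers its subdomains.
            tags.add(a.tag)
    return tags


def drop_tagged(results: list, annotations: list, tag: str) -> list:
    """Keep the ranked order, removing results that carry the given tag."""
    return [url for url in results if tag not in tags_for(url, annotations)]


annotations = [
    Annotation("domain", "spam.example", "spam"),
    Annotation("document", "https://news.example/bad-post", "spam"),
]
ranked = [
    "https://spam.example/a",
    "https://news.example/bad-post",
    "https://news.example/good-post",
    "https://www.spam.example/b",
]
filtered = drop_tagged(ranked, annotations, "spam")
```

Here the domain annotation removes both results hosted on `spam.example`, while the document annotation removes only the single tagged page and leaves the rest of `news.example` in place.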
Annotation As Filter Substrate
The patent's load-bearing idea is to make user-supplied annotations a first-class filtering signal alongside algorithmic ranking. Personal and collaborative curation become composable inputs to search.
User Curation Beats Algorithmic Inference For Specifics
Algorithms infer broad preferences; users declare specific ones. Combining both produces results that respect both general quality and individual filtering intent.
- Multi-Granularity Annotations — Per-document, per-domain, per-topic annotations all supported. Users tag at the granularity that fits their need.
- Privacy-Bounded Sharing — Personal annotations stay private; shared annotations apply within sharing scope. Privacy is first-class.
- Composable Filters — Multiple annotation filters combine via boolean composition. Users build complex filtering preferences from simple primitives.
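One way to picture boolean composition is to treat each filter as a predicate over the tags attached to a result and to build larger filters from smaller ones. The combinator names below (`has_tag`, `intersect`, `union`, `negate`) are hypothetical and chosen to mirror the operators named in the text; the source does not specify an API.

```python
from typing import Callable, FrozenSet

# A filter is a predicate over the set of tags attached to one result.
TagFilter = Callable[[FrozenSet[str]], bool]


def has_tag(tag: str) -> TagFilter:
    return lambda tags: tag in tags


def intersect(*filters: TagFilter) -> TagFilter:
    """True only when every component filter accepts the result."""
    return lambda tags: all(f(tags) for f in filters)


def union(*filters: TagFilter) -> TagFilter:
    """True when at least one component filter accepts the result."""
    return lambda tags: any(f(tags) for f in filters)


def negate(f: TagFilter) -> TagFilter:
    return lambda tags: not f(tags)


# "academic sources only" plus "exclude anything tagged spam"
academic_not_spam = intersect(has_tag("academic"), negate(has_tag("spam")))

tagged_results = {
    "https://uni.example/paper": frozenset({"academic"}),
    "https://mill.example/paper": frozenset({"academic", "spam"}),
    "https://blog.example/post": frozenset({"preferred-source"}),
}
kept = [url for url, tags in tagged_results.items() if academic_not_spam(tags)]
```

Because every combinator returns another filter, expressions nest freely, for example `union(academic_not_spam, has_tag("preferred-source"))`.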
Technical Foundation
The patent specifies the annotation schema, the per-user store, the community sharing layer, the query-time filter applier, the boolean composition logic, and the privacy enforcement.
- Annotation Schema — Per annotation: target (document, domain, topic), tag label, optional metadata, owner, sharing scope. Schema is extensible for future tag types.
- Per-User Annotation Store — Annotations are indexed per user and per community for fast query-time lookup. Storage scales to millions of annotations per user.
- Community Sharing Layer — Annotations can be shared with groups, organizations, or public. Sharing model is explicit; defaults to private.
- Query-Time Filter Applier — At query time, applicable annotations apply as filters over the retrieval result set. Filter execution is fast.
- Boolean Composition Logic — Multiple filters compose via intersect, union, negate operators. Users can build complex filtering expressions through UI affordances.
- Privacy Enforcement — Per annotation, sharing scope determines who can apply it. Private annotations never leak across users; shared ones apply within scope.
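The schema fields and the scope check listed above could look roughly like this. The field names, the three-value `Scope` enum, and the `applies_to` function are assumptions for illustration only; the source names the fields (target, tag label, optional metadata, owner, sharing scope) but not their concrete representation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Scope(Enum):
    PRIVATE = "private"
    COMMUNITY = "community"
    PUBLIC = "public"


@dataclass(frozen=True)
class AnnotationRecord:
    target_kind: str                      # "document", "domain", or "topic"
    target: str
    tag: str
    owner: str
    scope: Scope = Scope.PRIVATE          # private unless explicitly shared
    community: Optional[str] = None       # set only when scope is COMMUNITY
    metadata: dict = field(default_factory=dict)


def applies_to(record: AnnotationRecord, user: str, memberships: set) -> bool:
    """Decide at filter-application time whether a user may apply a record."""
    if record.owner == user:
        return True
    if record.scope is Scope.PUBLIC:
        return True
    if record.scope is Scope.COMMUNITY:
        return record.community in memberships
    return False  # private annotations never cross users


store = [
    AnnotationRecord("domain", "journal.example", "academic", owner="alice"),
    AnnotationRecord("domain", "mill.example", "spam", owner="bob",
                     scope=Scope.COMMUNITY, community="lab"),
]
visible_to_carol = [r.tag for r in store if applies_to(r, "carol", {"lab"})]
visible_to_dave = [r.tag for r in store if applies_to(r, "dave", set())]
```

In this sketch Carol, a member of the hypothetical `lab` community, can apply Bob's shared annotation but not Alice's private one, and Dave, who belongs to no community, sees neither.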
The Process
The annotation pipeline runs across both UI flows (annotation creation, sharing) and query-time filter application. Storage and lookup scale to large user and community bases.
- User Creates Annotation — Through the UI, the user annotates a document, domain, or topic with a tag. The annotation is saved to the user's annotation store.
- Optional Sharing — The user can share an annotation with a community. The sharing scope is saved with the annotation record.
- Query Issued — User issues a search query. Applicable annotations are loaded for the user and any communities they belong to.
- Construct Filter Expression — From applicable annotations plus any explicit filter preferences, construct the filter expression for this query.
- Retrieve And Filter — Standard retrieval produces candidates; the filter expression applies as a post-retrieval filter or as scoring adjustments.
- Return Results — The filtered results are returned, reflecting both algorithmic ranking and annotation-based filtering.
- Refine — Users add or remove annotations as preferences evolve. Communities update shared sets as curation progresses.
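The "Retrieve And Filter" step mentions two modes: a hard post-retrieval filter and a scoring adjustment. A rough sketch that combines both is shown below. The function name `apply_annotations`, the additive boost, and the example scores are assumptions; the source does not say how a scoring adjustment would be computed.

```python
def apply_annotations(candidates, tags_by_url, exclude, boosts):
    """Filter and re-score retrieval candidates.

    candidates:  list of (url, score) pairs from standard retrieval
    tags_by_url: dict mapping url -> set of applicable tags
    exclude:     set of tags that remove a result outright
    boosts:      dict mapping tag -> additive score bonus (an assumption)
    """
    adjusted = []
    for url, score in candidates:
        tags = tags_by_url.get(url, set())
        if tags & exclude:
            continue  # hard post-retrieval filter
        bonus = sum(boosts.get(t, 0.0) for t in tags)
        adjusted.append((url, score + bonus))
    # Highest adjusted score first.
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


candidates = [
    ("https://mill.example/a", 0.875),
    ("https://blog.example/b", 0.75),
    ("https://uni.example/c", 0.625),
]
tags_by_url = {
    "https://mill.example/a": {"spam"},
    "https://uni.example/c": {"preferred-source"},
}
results = apply_annotations(
    candidates, tags_by_url, exclude={"spam"}, boosts={"preferred-source": 0.25}
)
result_urls = [url for url, _ in results]
```

The top algorithmic candidate is dropped because it carries an excluded tag, and the boosted result moves ahead of the unannotated one. Candidates still have to be retrieved before any annotation can affect them.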
Quality Control
Annotation abuse and privacy leaks must be prevented. The patent specifies safeguards.
- Privacy Enforcement — Personal annotations cannot leak. Sharing requires explicit user action; defaults protect privacy.
- Annotation Spam Detection — Annotation campaigns aimed at manipulating community shared sets are detected and demoted. Communities are protected from manipulation.
- Community Moderation — Communities can moderate shared annotations. Abusive contributions can be flagged and removed.
- Default Privacy Conservatism — New annotations default to private. Users must explicitly opt into sharing.
- Annotation Quality Filtering — Annotations with quality issues (vague tags, malformed targets) are flagged for user attention.
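As an illustration of the last safeguard, a validation pass might flag vague tags and malformed targets before an annotation is saved. The `VAGUE_TAGS` list, the `annotation_issues` function, and the specific checks are hypothetical; the source states only that such annotations are flagged for user attention.

```python
from urllib.parse import urlparse

# Hypothetical list of labels too generic to be useful as filters.
VAGUE_TAGS = {"good", "bad", "stuff", "misc"}


def annotation_issues(target_kind: str, target: str, tag: str) -> list:
    """Return human-readable issues; an empty list means nothing was flagged."""
    issues = []
    label = tag.strip().lower()
    if not label:
        issues.append("empty tag")
    elif label in VAGUE_TAGS:
        issues.append("vague tag")

    if target_kind == "document":
        parsed = urlparse(target)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            issues.append("malformed document URL")
    elif target_kind == "domain":
        # A domain target should be a bare hostname, not a URL or path.
        if "/" in target or "." not in target:
            issues.append("malformed domain")
    elif target_kind != "topic":
        issues.append("unknown target kind")
    return issues


ok = annotation_issues("document", "https://uni.example/paper", "academic")
flagged = annotation_issues("document", "uni.example/paper", "stuff")
```

Returning a list of issues rather than rejecting the annotation matches the text's wording that problems are flagged for the user's attention.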
Real-World Application
Annotation-based filtering primitives appear in features like search-result history, personalized filtering, and collaborative curation surfaces. While dedicated annotation features have varied over time, the underlying primitives shape personalized retrieval.
- Multi-granularity Annotation Targets — Per-document, per-domain, per-topic. Users tag at the granularity their preferences require.
- Privacy-bounded Sharing Model — Private by default; community sharing requires explicit opt-in. Privacy is first-class.
- Composable Filter Expressions — Boolean composition of multiple annotation filters. Users build complex preferences from simple primitives.
Why Personal Curation Compounds Over Time
Users who annotate their search experience build a personalized filter substrate that grows in value. The longer the curation history, the more precisely results match the user's expressed preferences.
Why Community Curation Has Authority Implications
When communities (academic, professional, expert) collaboratively annotate documents, the resulting shared annotation sets become trusted curation signals. Sites that earn positive community annotation gain visibility within those communities.
What This Means for SEO
The patent makes user-supplied and community annotations a first-class filtering signal applied alongside algorithmic ranking. SEO implication: personal and collaborative curation shape what individuals and communities see, so earning positive annotation within a community translates into durable visibility there.
- Community Curation Carries Authority — When academic, professional, or expert communities collaboratively annotate documents, the shared annotation sets become trusted curation signals. Content that earns positive annotation from a relevant community gains visibility within it that pure algorithmic ranking would not provide.
- Personal Filters Compound Over Time — Per-user annotations build a personalized filter substrate that grows in value with history. Content a user saves, tags, or endorses keeps surfacing for them on related queries. Becoming a user's curated reference is a compounding, query-independent advantage.
- Annotations Can Exclude As Well As Include — Filtering supports exclusion (hide this domain, exclude this source) as much as promotion. Negative experiences that prompt users to filter you out remove you from their result set entirely. Avoiding the behaviors that trigger exclusion matters as much as earning inclusion.
- Be The Source Experts Tag — Annotation operates at document, domain, and topic granularity. Content credible enough that experts in a field tag or endorse it gets the collaborative curation lift. Aim to be reference-grade within a community rather than broadly mediocre.
- Curation Composes With Ranking — Annotations apply on top of algorithmic ranking, not instead of it. You still need to rank to be a candidate, then annotation refines which candidates a user or community sees. Treat curation as an additive layer, not a substitute for fundamentals.
- Privacy Boundaries Scope The Signal — Annotations are stored per-user and per-community with privacy boundaries, so the lift is scoped to the relevant audience. Visibility gains concentrate within the community that produced the annotations, which makes community-specific reputation the lever.
- Declarative Preferences Beat Inferred Ones For Specifics — The patent exists because algorithms infer broad preferences while users declare specific ones. Content that serves a clearly-stated specific need (academic-only, this-domain-only) wins when users apply those declared filters.