Classifies ambiguous geographic references in queries and documents. Handles 'Springfield' disambiguation across the many U.S. cities that share the name, the structural primitive that keeps place-name collisions from breaking local search.
Patent Overview
- Inventor: Daniel Egnor
- Assignee: Google LLC
- Filed: 2010
- Granted: 2016-04-26
The Challenge
Place names are ambiguous. Many cities share a name: 'Springfield' exists in 30+ U.S. states, and 'Paris' could be in France or in Texas. Without disambiguation, location-aware ranking returns results for the wrong locale.
- Place Names Collide Widely — Springfield, Lincoln, Madison, Washington — many U.S. cities share names. Disambiguation is required.
- Context Determines Sense — Surrounding query terms, user location, and document context all contribute to disambiguation.
- Common-Name Bias Exists — Some namesakes are dominant (Paris, France vs Paris, Texas). Disambiguation must balance dominance with user signals.
- Per-Document And Per-Query Both Matter — Ambiguous references in documents and queries both need classification. The same disambiguation logic applies.
- Multi-Level Ambiguity Occurs — Some references are ambiguous at multiple levels (city vs state vs country). Hierarchical resolution required.
Innovation
How The System Works
The system identifies ambiguous geographic references in queries and documents, gathers context signals (surrounding terms, user location, document topic), runs classification models that resolve to canonical place entities, and outputs disambiguated references for downstream indexing and ranking.
- Identify Ambiguous References — Per document or query, identify ambiguous geographic references via place-name corpora.
- Gather Context Signals — Per reference, gather surrounding terms, user location (for queries), document topic and other geographic references nearby.
- Run Classification Model — Per reference, classification model produces probability distribution over candidate places.
- Apply Dominance Priors — Per candidate place, dominance prior incorporated. Dominant namesakes earn baseline boost unless context overrides.
- Resolve To Canonical Place — Above-threshold candidate wins. Reference resolved to canonical place entity.
- Output Disambiguated Reference — Resolved reference replaces ambiguous reference in downstream indexing or ranking.
- Validate Against Ground Truth — Classification validated against held-out labeled data.
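The identification and context-gathering steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `GAZETTEER` table, the `Reference` class, and the `find_ambiguous_references` function are hypothetical names, the gazetteer entries are a tiny hand-picked sample, and only single-word names are matched.

```python
import re
from dataclasses import dataclass

# Hypothetical miniature gazetteer: surface name -> candidate canonical places.
# A real place-name corpus is far larger; these entries are illustrative only.
GAZETTEER = {
    "springfield": ["Springfield, IL, US", "Springfield, MA, US", "Springfield, MO, US"],
    "paris": ["Paris, FR", "Paris, TX, US"],
    "chicago": ["Chicago, IL, US"],
}


@dataclass
class Reference:
    surface: str           # the name as written in the text (lowercased)
    position: int          # token index within the text
    candidates: list[str]  # canonical places that share the name
    context: list[str]     # nearby tokens, a stand-in for "surrounding terms"


def find_ambiguous_references(text: str, window: int = 3) -> list[Reference]:
    """Return references whose name maps to more than one candidate place."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    refs = []
    for i, tok in enumerate(tokens):
        candidates = GAZETTEER.get(tok, [])
        if len(candidates) > 1:  # a single candidate needs no disambiguation
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            refs.append(Reference(tok, i, candidates, context))
    return refs


refs = find_ambiguous_references("pizza near Springfield Illinois and Chicago")
```

Here "Chicago" is skipped because the sample gazetteer lists only one candidate for it, while "Springfield" is returned together with its neighboring tokens for the later classification step.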
Context Resolves Ambiguity
The patent's load-bearing idea is that ambiguous place names resolve via context. Surrounding terms, user location, and document topic combine into a classification that distinguishes the intended Springfield from the others.
Multi-Signal Disambiguation
No single signal resolves ambiguity reliably. Context terms, user location, document topic, and the dominance prior all contribute, and the classification combines them.
- Ambiguous Reference Identification — Per document or query, ambiguous references identified via place-name corpora.
- Multi-Signal Context Gathering — Surrounding terms, user location, document topic, neighbor references gathered.
- Classification Plus Dominance Prior — Classification model plus dominance prior combine to resolve canonical place.
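One way to picture how context evidence and a dominance prior might combine is a log-linear score normalized into a probability distribution. The sketch below is an assumption for illustration, not the model the patent describes: the prior values, the cue words, the `cue_weight` parameter, and the `posterior` function are all hypothetical.

```python
import math

# Hypothetical dominance priors: how likely each namesake is when no other
# evidence is available. Values are illustrative, not measured.
DOMINANCE_PRIOR = {"Paris, FR": 0.95, "Paris, TX, US": 0.05}

# Hypothetical context cues associated with each candidate place.
CONTEXT_CUES = {
    "Paris, FR": {"france", "louvre", "seine", "eiffel"},
    "Paris, TX, US": {"texas", "tx", "lamar", "dallas"},
}


def posterior(context_terms, cue_weight=4.0):
    """Combine context evidence with the dominance prior into a distribution."""
    terms = set(context_terms)
    log_scores = {}
    for place, prior in DOMINANCE_PRIOR.items():
        matches = len(terms & CONTEXT_CUES[place])
        # Each matching cue adds a fixed amount of log-evidence to the prior.
        log_scores[place] = math.log(prior) + cue_weight * matches
    # Normalize with a numerically stable softmax so probabilities sum to 1.
    top = max(log_scores.values())
    exps = {place: math.exp(score - top) for place, score in log_scores.items()}
    total = sum(exps.values())
    return {place: value / total for place, value in exps.items()}


no_context = posterior([])                   # falls back to the prior
with_context = posterior(["hotels", "texas"])  # context outweighs the prior
```

With no context the distribution equals the prior, so the dominant namesake wins by default. A single strong cue such as "texas" is enough, under these illustrative weights, to flip the result toward the lesser-known place.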
Technical Foundation
The patent specifies the ambiguous reference identifier, context gatherer, classification model, dominance prior, resolver, and validator.
- Ambiguous Reference Identifier — Per document or query, identifies ambiguous geographic references.
- Context Gatherer — Gathers surrounding terms, user location, document topic, neighbor references.
- Classification Model — Per reference, produces probability distribution over candidate places.
- Dominance Prior — Per candidate, dominance prior incorporated.
- Resolver — Above-threshold candidate wins; reference resolved to canonical place.
- Validator — Classification validated against labeled data.
The Process
For each ambiguous reference, the disambiguation pipeline runs in real time for queries and at indexing time for documents.
- Identify Reference — Ambiguous geographic reference identified.
- Gather Context — Surrounding signals gathered.
- Run Classification — Classification model produces distribution.
- Apply Dominance — Dominance prior incorporated.
- Resolve — Above-threshold candidate selected.
- Output Resolved Reference — Canonical reference output for downstream use.
- Validate Periodically — Classification validated against labels.
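The periodic validation step could look something like the sketch below, which scores any resolver against labeled examples. The `validate` function, the tuple layout of the labeled data, and the `toy_resolve` stand-in are assumptions made for illustration; the source does not specify metrics or data formats.

```python
from collections import Counter


def validate(resolve, labeled_examples):
    """Score a resolver against held-out labels.

    `resolve` is any callable mapping (name, context_terms) to a canonical
    place or None. `labeled_examples` is a list of (name, context, gold)
    tuples. Both shapes are assumptions for this sketch.
    """
    outcomes = Counter()
    for name, context, gold in labeled_examples:
        predicted = resolve(name, context)
        if predicted is None:
            outcomes["unresolved"] += 1
        elif predicted == gold:
            outcomes["correct"] += 1
        else:
            outcomes["wrong"] += 1
    resolved = outcomes["correct"] + outcomes["wrong"]
    return {
        # Share of resolved references that were resolved correctly.
        "precision": outcomes["correct"] / resolved if resolved else 0.0,
        # Share of all references that the resolver committed to.
        "coverage": resolved / len(labeled_examples) if labeled_examples else 0.0,
        "counts": dict(outcomes),
    }


def toy_resolve(name, context):
    """Hypothetical stand-in resolver that only knows about 'paris'."""
    if name != "paris":
        return None
    return "Paris, TX, US" if "texas" in context else "Paris, FR"


report = validate(toy_resolve, [
    ("paris", ["texas"], "Paris, TX, US"),
    ("paris", ["louvre"], "Paris, FR"),
    ("paris", ["lamar"], "Paris, TX, US"),   # missed: the toy resolver lacks this cue
    ("springfield", [], "Springfield, IL, US"),
])
```

Reporting precision and coverage separately makes it visible when a resolver is accurate only because it declines to resolve most references.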
Quality Control
Disambiguation errors send users to wrong locales. The patent specifies safeguards.
- Classification Validation — Classification validated against labeled disambiguation data.
- Threshold Calibration — Resolution threshold calibrated to balance over-resolution and under-resolution.
- Dominance Prior Tuning — Dominance priors tuned to prevent over-favoring dominant namesakes when context contradicts.
- Fallback For Low-Confidence — Low-confidence cases fall back to multi-result presentation rather than single-resolution.
- Continuous Recalibration — Classification, priors, and thresholds recalibrate against fresh data.
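The threshold and fallback safeguards above can be illustrated with a short sketch. It assumes a non-empty probability distribution per reference; the `resolve` and `calibrate_threshold` functions, the threshold grid, and the 0.95 precision target are hypothetical choices rather than values taken from the patent.

```python
def resolve(distribution, threshold):
    """Commit to one place if it clears the threshold, else return ranked candidates."""
    ranked = sorted(distribution.items(), key=lambda kv: kv[1], reverse=True)
    best_place, best_prob = ranked[0]
    if best_prob >= threshold:
        return {"resolved": best_place, "candidates": None}
    # Low confidence: fall back to presenting several candidates.
    return {"resolved": None, "candidates": [place for place, _ in ranked]}


def calibrate_threshold(labeled, target_precision=0.95, grid=None):
    """Pick the lowest grid threshold whose precision on labeled data meets the target.

    `labeled` is a list of (distribution, gold_place) pairs. A lower threshold
    resolves more references but risks more wrong-locale resolutions.
    """
    grid = grid or [i / 20 for i in range(10, 20)]  # 0.50 through 0.95
    for threshold in grid:
        correct = wrong = 0
        for distribution, gold in labeled:
            result = resolve(distribution, threshold)["resolved"]
            if result is None:
                continue  # unresolved cases do not count against precision
            if result == gold:
                correct += 1
            else:
                wrong += 1
        if correct + wrong and correct / (correct + wrong) >= target_precision:
            return threshold
    return None  # no threshold on the grid met the target


labeled = [
    ({"A": 0.90, "B": 0.10}, "A"),
    ({"A": 0.58, "B": 0.42}, "B"),  # a confident-looking but wrong top candidate
    ({"A": 0.80, "B": 0.20}, "A"),
]
chosen = calibrate_threshold(labeled)
```

In this toy data the threshold has to rise above 0.58 before the one misleading example stops being resolved, which is the trade-off between over-resolution and under-resolution that calibration is meant to balance.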
Real-World Application
Ambiguous-geographic-reference classification underpins place disambiguation in local search. The pattern of multi-signal context-based resolution appears across modern location-aware retrieval systems.
- Multi-signal Disambiguation Method — Surrounding terms, user location, document topic combine.
- Dominance-aware Prior Application — Dominant namesakes earn baseline; context can override.
- Canonical-resolved Output Format — Reference resolved to canonical place entity for downstream consumption.
Why Context-Rich Pages Resolve Cleanly
Pages with strong surrounding location context (state, region, ZIP, neighbor places) disambiguate cleanly. Pages with only the ambiguous place name risk wrong-locale resolution.
Why Local Markup Disambiguates
Schema.org local-business markup, address structure, and explicit coordinates provide unambiguous geographic signals that override place-name ambiguity entirely.
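As an illustration of what such markup might contain, the sketch below builds a Schema.org `LocalBusiness` object with a structured `PostalAddress` and `GeoCoordinates` and serializes it as JSON-LD. The business name, street address, and coordinates are made-up placeholders, and the helper function `local_business_jsonld` is hypothetical. How heavily any search engine weighs this markup is not something the example demonstrates.

```python
import json


def local_business_jsonld(name, street, city, region, postal_code, country, lat, lon):
    """Build a Schema.org LocalBusiness object with explicit address and coordinates."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,     # the ambiguous part on its own
            "addressRegion": region,     # state or region narrows the namesake
            "postalCode": postal_code,
            "addressCountry": country,
        },
        "geo": {"@type": "GeoCoordinates", "latitude": lat, "longitude": lon},
    }


markup = local_business_jsonld(
    name="Example Bakery",            # hypothetical business
    street="123 Example Street",      # hypothetical address
    city="Springfield", region="IL", postal_code="62701", country="US",
    lat=39.80, lon=-89.65,            # approximate, illustrative coordinates
)
print(json.dumps(markup, indent=2))
```

The printed JSON would typically be embedded in the page inside a `script` element of type `application/ld+json`. The region, postal code, country, and coordinates each narrow "Springfield" to a single place without relying on surrounding prose.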
What This Means for SEO
This patent resolves ambiguous place names like Springfield using surrounding terms, user location, document topic, and dominance priors. SEO implication: context-rich location signals and explicit local markup let your pages resolve to the right place instead of a wrong-locale namesake.
- Surround Place Names With Context — The classifier combines surrounding terms, neighbor places, and document topic to resolve ambiguity. Pages that include state, region, ZIP, and nearby landmarks disambiguate cleanly; a bare city name risks resolving to the wrong namesake.
- Local Markup Overrides Name Ambiguity — Schema.org local-business markup, structured address data, and explicit coordinates provide unambiguous geographic signals that bypass place-name guessing entirely. This is the most reliable way to anchor a page to its real location.
- Dominance Priors Favor The Famous Namesake — Ambiguous names carry a dominance prior toward the best-known place, overridable only by context. If your locale is the lesser-known namesake, you must supply strong context to win resolution against the dominant one.
- Disambiguation Applies To Documents Too — The same logic classifies references in your pages, not just in queries. Ambiguous location mentions in your content can be mis-resolved, weakening your geographic association, so be explicit in the body, not just metadata.
- Hierarchical Ambiguity Needs Hierarchical Clarity — Some references are ambiguous across city, state, and country levels. Specifying the full hierarchy removes ambiguity at every level instead of leaving the system to guess which level you mean.
- Low Confidence Triggers Multi-Result Fallback — When confidence is low, the system shows multiple candidates rather than committing. Weak location signals can leave you sharing a slot with namesakes instead of owning your locale outright.
- Consistency Across The Page Reinforces Resolution — Neighboring geographic references contribute to classification. A page that consistently references one coherent locale resolves more confidently than one that scatters mentions of several places.