This patent describes indexing documents with geographic-relevance metadata so that geographic ranking can be applied at retrieval time. It is the structural infrastructure that makes location-aware ranking efficient at web scale.
Patent Overview
- Inventor: Daniel Egnor
- Assignee: Google LLC
- Filed: 2010
- Granted: 2015-11-17
The Challenge
Computing geographic relevance per document at query time is too expensive. The system needs to pre-compute and index geographic-relevance metadata so retrieval and ranking can consume it within latency budgets.
- Per-Query Geographic Computation Is Too Slow — Running geographic-relevance analysis across candidate documents per query exceeds latency budgets.
- Documents Have Implicit Geography — Most documents carry implicit geographic anchoring (address mentions, locale words, place names). Extraction at indexing time pre-pays the cost.
- Indexing Structure Must Be Retrieval-Friendly — Geographic indexing must integrate with the broader index architecture without bloating storage or hurting retrieval performance.
- Place Resolution Must Be Robust — 'Springfield' must resolve to the right Springfield. Place-disambiguation runs at indexing time so retrieval can use the resolved entity.
- Multi-Locale Support Matters — Documents may carry multiple locale anchors (service-area pages, multi-city businesses). Indexing must support multiple per-document anchors.
Innovation
How The System Works
The system extracts geographic references from documents at indexing time, resolves ambiguous references to canonical place entities, attaches geographic-relevance metadata to the index, and exposes the metadata to retrieval and ranking.
- Extract Geographic References — Per document at indexing, extract geographic references (place names, addresses, locale words, ZIP codes).
- Resolve Ambiguous References — Per reference, resolve to canonical place entity using context, place hierarchy, and disambiguation models.
- Compute Per-Place Relevance — Per (document, place) pair, compute relevance signal from reference frequency, context, and structural position.
- Attach Metadata To Index — Geographic-relevance metadata attached per document in the index. Multi-place support for service-area and multi-city documents.
- Expose To Retrieval — Retrieval consumes geographic metadata for candidate filtering by query location.
- Expose To Ranking — Ranking consumes per-(document, place) relevance for location-aware scoring.
- Continuous Refresh — On each crawl, geographic extraction and relevance computation re-run, so the index stays current.
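The first two steps, extraction and resolution, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `GAZETTEER`, the context-word overlap rule, and the function names `extract_references` and `resolve` are hypothetical and are not taken from the patent. The sketch only handles single-word place names and treats any five-digit number as a ZIP code.

```python
import re

# Hypothetical toy gazetteer: surface name -> candidate canonical places.
# Each candidate lists context words that make it the more likely reading.
GAZETTEER = {
    "springfield": [
        {"id": "springfield-il", "context": {"illinois", "il", "lincoln"}},
        {"id": "springfield-mo", "context": {"missouri", "mo", "ozarks"}},
    ],
    "chicago": [{"id": "chicago-il", "context": {"illinois", "il"}}],
}

ZIP_RE = re.compile(r"\b\d{5}\b")   # naive: any standalone 5-digit number
WORD_RE = re.compile(r"[a-z]+")


def extract_references(text):
    """Return (place-name mentions, ZIP codes) found in the text."""
    words = WORD_RE.findall(text.lower())
    names = [w for w in words if w in GAZETTEER]
    zips = ZIP_RE.findall(text)
    return names, zips


def resolve(name, text):
    """Pick the candidate whose context words overlap the document most."""
    words = set(WORD_RE.findall(text.lower()))
    candidates = GAZETTEER[name]
    return max(candidates, key=lambda c: len(c["context"] & words))["id"]


doc = "Visit our Springfield office near the Ozarks, Missouri 65806."
names, zips = extract_references(doc)
resolved = [resolve(n, doc) for n in names]
print(resolved, zips)
```

A production resolver would presumably use a place hierarchy and a trained disambiguation model, as the step list says. The overlap count stands in for that only to show where context enters the decision.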
Indexing Time Pre-Pays Geographic Cost
The patent's load-bearing idea is that geographic analysis runs at indexing time, not query time. Pre-computed per-(document, place) relevance metadata enables fast location-aware retrieval and ranking within latency budgets.
Pre-Compute Geographic Anchoring
For each document, geographic references are extracted, ambiguities are resolved, per-place relevance is computed, and metadata is attached. All of this happens at indexing time, so query time stays fast.
- Geographic Reference Extraction — Place names, addresses, locale words, ZIP codes extracted per document.
- Canonical Place Resolution — Ambiguous references resolved to canonical place entities via context and hierarchy.
- Per-Place Relevance Metadata — Per (document, place), relevance signal computed and indexed.
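One way to picture the per-(document, place) relevance signal is a weighted count of mentions. The sketch below is an assumption, not the patent's formula: the source names frequency, context, and structural position as inputs but gives no numbers, so the `POSITION_WEIGHT` values and the function `per_place_relevance` are hypothetical, and the context input is left out for brevity.

```python
from collections import defaultdict

# Hypothetical weights: illustrative values only, not from the patent.
POSITION_WEIGHT = {"title": 3.0, "heading": 2.0, "body": 1.0, "footer": 0.5}


def per_place_relevance(mentions):
    """Score each place referenced by one document.

    mentions: iterable of (place_id, position) pairs, one per reference.
    Returns {place_id: score}, with scores normalized to sum to 1.0.
    """
    raw = defaultdict(float)
    for place_id, position in mentions:
        # Frequency enters through repeated pairs; position through the weight.
        raw[place_id] += POSITION_WEIGHT.get(position, 1.0)
    total = sum(raw.values())
    if total == 0:
        return {}
    return {place: weight / total for place, weight in raw.items()}


mentions = [
    ("austin-tx", "title"),
    ("austin-tx", "body"),
    ("austin-tx", "body"),
    ("dallas-tx", "footer"),
]
scores = per_place_relevance(mentions)
print(scores)
```

Because the result is a dictionary with one entry per place, a multi-city page naturally carries several anchors with different strengths.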
Technical Foundation
The patent specifies the geographic-reference extractor, place resolver, per-place relevance computer, index attacher, retrieval interface, and ranking interface.
- Geographic-Reference Extractor — Per document at indexing, extracts geographic references.
- Place Resolver — Resolves ambiguous references to canonical place entities.
- Per-Place Relevance Computer — Per (document, place), computes relevance from frequency, context, structural position.
- Index Attacher — Attaches geographic-relevance metadata to documents in the index.
- Retrieval Interface — Exposes geographic metadata to retrieval for candidate filtering.
- Ranking Interface — Exposes per-(document, place) relevance to ranking for location-aware scoring.
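The last three components (index attacher, retrieval interface, ranking interface) can be sketched as a small in-memory structure. This is a toy under stated assumptions: the class `GeoIndex`, its methods, the `rank` helper, and the linear blend controlled by `geo_weight` are all hypothetical names and choices, not the patent's design.

```python
from collections import defaultdict


class GeoIndex:
    """Toy in-memory index keyed both by document and by place."""

    def __init__(self):
        self.doc_places = {}                  # doc_id -> {place_id: relevance}
        self.place_docs = defaultdict(set)    # place_id -> {doc_id, ...}

    def attach(self, doc_id, place_scores):
        """Index attacher: store per-place relevance for one document."""
        self.doc_places[doc_id] = dict(place_scores)
        for place_id in place_scores:
            self.place_docs[place_id].add(doc_id)

    def candidates(self, place_id):
        """Retrieval interface: documents anchored to the query's place."""
        return set(self.place_docs.get(place_id, set()))

    def geo_score(self, doc_id, place_id):
        """Ranking interface: stored relevance, or 0.0 if not anchored."""
        return self.doc_places.get(doc_id, {}).get(place_id, 0.0)


def rank(index, place_id, text_scores, geo_weight=0.5):
    """Blend a text score with the stored geographic score (assumed linear)."""
    scored = [
        (text_scores.get(d, 0.0) + geo_weight * index.geo_score(d, place_id), d)
        for d in index.candidates(place_id)
    ]
    return [d for _, d in sorted(scored, reverse=True)]


index = GeoIndex()
index.attach("plumber-austin", {"austin-tx": 0.9})
index.attach("plumber-multi", {"austin-tx": 0.4, "dallas-tx": 0.6})
index.attach("plumber-dallas", {"dallas-tx": 1.0})

text_scores = {"plumber-austin": 0.5, "plumber-multi": 0.6, "plumber-dallas": 0.9}
ranked = rank(index, "austin-tx", text_scores)
print(ranked)
```

At query time the only work left is a lookup and a weighted sum, which is the point of computing the metadata during indexing.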
The Process
Geographic indexing runs at indexing time alongside content extraction. The metadata is stored in the index for query-time consumption.
- Crawl Document — Crawler fetches page content.
- Extract Geographic References — Per document, geographic-reference extractor runs.
- Resolve References — Place resolver maps references to canonical places.
- Compute Relevance — Per-place relevance computed.
- Attach To Index — Geographic metadata attached to index entry.
- Index Storage — Metadata persists in index for retrieval/ranking access.
- Refresh Per Crawl — Per crawl, geographic extraction re-runs and metadata updates.
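The refresh step implies that stale anchors must be removed as well as new ones added. A minimal sketch of that update follows, assuming the index is held as two plain dictionaries; the function `refresh` and both dictionary layouts are hypothetical and chosen only to make the step concrete.

```python
def refresh(doc_places, place_docs, doc_id, new_scores):
    """Replace a document's geographic metadata after a recrawl.

    doc_places: {doc_id: {place_id: relevance}}
    place_docs: {place_id: set of doc_ids}
    """
    old_places = set(doc_places.get(doc_id, {}))
    # Drop postings for places the page no longer references.
    for place_id in old_places - set(new_scores):
        place_docs[place_id].discard(doc_id)
        if not place_docs[place_id]:
            del place_docs[place_id]
    # Add or keep postings for the places referenced now.
    for place_id in new_scores:
        place_docs.setdefault(place_id, set()).add(doc_id)
    doc_places[doc_id] = dict(new_scores)


doc_places = {"page-1": {"dallas-tx": 1.0}}
place_docs = {"dallas-tx": {"page-1"}}

# The recrawled page now talks about Austin and Houston instead of Dallas.
refresh(doc_places, place_docs, "page-1", {"austin-tx": 0.8, "houston-tx": 0.2})
print(doc_places, place_docs)
```

After the call, the page no longer appears under the Dallas posting, so a Dallas query would not retrieve it on the strength of outdated content.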
Quality Control
Geographic extraction and resolution accuracy determine downstream ranking quality. The patent specifies safeguards.
- Extraction Accuracy Validation — Geographic-reference extraction validated against labeled documents.
- Place Resolution Accuracy — Place resolver validated against canonical place-name corpora.
- Multi-Place Support — Documents with multiple legitimate place anchors handled correctly.
- Spurious Reference Filtering — Filters out spurious geographic references (e.g., a place name used idiomatically, as in 'a New York minute', rather than as a location).
- Continuous Recalibration — Extraction and resolution models recalibrate against fresh data.
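Validation against labeled documents is commonly expressed as precision and recall. The sketch below shows one way to compute both over resolved place IDs. The function `precision_recall`, the data layout, and the sample labels are illustrative assumptions; the source does not say which metrics the patent uses.

```python
def precision_recall(predicted, labeled):
    """Compare predicted place IDs with hand-labeled ones, per document.

    Both arguments map doc_id -> set of canonical place IDs.
    """
    tp = fp = fn = 0
    for doc_id in set(predicted) | set(labeled):
        pred = predicted.get(doc_id, set())
        gold = labeled.get(doc_id, set())
        tp += len(pred & gold)   # correct places
        fp += len(pred - gold)   # spurious or wrongly resolved places
        fn += len(gold - pred)   # places the pipeline missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


labeled = {"d1": {"austin-tx"}, "d2": {"springfield-mo", "branson-mo"}}
predicted = {"d1": {"austin-tx"}, "d2": {"springfield-il", "branson-mo"}}

precision, recall = precision_recall(predicted, labeled)
print(round(precision, 3), round(recall, 3))
```

In this sample the wrong Springfield counts against both numbers: it is a false positive for the Illinois entity and a false negative for the Missouri one.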
Real-World Application
Geographic-relevance indexing is the structural foundation of location-aware retrieval and ranking. The pre-computed per-(document, place) metadata pattern underpins modern local search at web scale.
- Per-document Indexing Granularity — Each document carries its own geographic-relevance metadata.
- Multi-place Support Pattern — Multi-city and service-area documents carry multiple place anchors.
- Index-time Computation Stage — Geographic analysis runs at indexing time, so query-time processing stays fast.
Why Explicit Location Markup Helps Discovery
Schema.org local-business markup, NAP citations, and clear address presentation produce strong geographic references that extractors read cleanly. Implicit signals work too, but explicit markup is more reliable.
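To see why explicit markup is easier to read than free text, consider how a simple extractor could pull an address out of a Schema.org `LocalBusiness` JSON-LD block. This is a sketch under assumptions: the class `JsonLdCollector`, the function `extract_addresses`, and the sample page are hypothetical, and nothing here claims to mirror how any search engine actually parses markup. It uses only the Python standard library and does not handle `@graph` wrappers.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_addresses(html):
    """Return locality, region, and postal code from each JSON-LD address."""
    collector = JsonLdCollector()
    collector.feed(html)
    addresses = []
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed markup rather than fail the page
        for item in data if isinstance(data, list) else [data]:
            address = item.get("address") if isinstance(item, dict) else None
            if isinstance(address, dict):
                addresses.append({
                    "locality": address.get("addressLocality"),
                    "region": address.get("addressRegion"),
                    "postal_code": address.get("postalCode"),
                })
    return addresses


page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "LocalBusiness",
 "name": "Example Plumbing",
 "address": {"@type": "PostalAddress", "streetAddress": "1 Main St",
             "addressLocality": "Austin", "addressRegion": "TX",
             "postalCode": "78701"}}
</script>
</head><body>Example Plumbing</body></html>
"""
addresses = extract_addresses(page)
print(addresses)
```

The structured fields arrive already separated into locality, region, and postal code, so no disambiguation of free text is needed for this reference.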
Why Service-Area Pages Need Distinct Place Anchors
Multi-location businesses serve multiple places. Distinct landing pages per service area produce distinct per-place relevance metadata, surfacing each location independently in local search.
What This Means for SEO
This patent extracts and resolves geographic references at indexing time, attaching per-document, per-place relevance metadata so location-aware ranking runs fast at query time. SEO implication: explicit location markup is read cleanly, and distinct service-area pages each earn their own place anchor.
- Explicit Location Markup Reads Cleanly — Schema.org local-business markup, NAP citations, and clear address presentation produce strong geographic references the extractor reads reliably. Implicit signals work, but explicit markup is the dependable channel.
- Give Each Service Area Its Own Page — Multi-place support means distinct landing pages per service area each get distinct per-place relevance metadata. Separate, substantive location pages surface each area independently, where one combined page blurs them.
- Geography Is Pre-Computed, So Anchoring Must Be Clear — Relevance is computed at indexing time from references in your content. The location signals on the page at crawl are what get baked into the index, so vague or missing anchoring cannot be fixed at query time.
- Structural Position Of References Matters — Per-place relevance weighs reference frequency, context, and structural position. A location named in the title, headings, and address block signals more strongly than one buried once in the footer.
- Spurious Geographic Mentions Are Filtered — The system filters out idiomatic uses, such as a place name in a figure of speech, instead of treating them as real locations. Genuine, contextualized location references count; incidental ones do not, so anchor with intent.
- Place Resolution Happens At Index Time — Ambiguous references are resolved to canonical places during indexing. Providing unambiguous context in your content helps the index store the right place instead of a wrong-locale guess.
- Refresh Updates The Anchor — Geographic extraction re-runs each crawl, so improving your location signals updates the stored metadata. Adding markup and clearer anchoring takes effect as the page is recrawled.