Computes per-topic popularity by combining the location identifiers of documents users actually visit with the topics those documents cover, separating topical interest from raw visit volume.
Patent Overview
- Inventor: Amit Singhal
- Assignee: Google LLC
- Filed: 2004-03-29
- Granted: 2013-11-26
- Application Number: US 10/811,038
The Challenge
Popularity Without Topicality Misleads Ranking
Raw page visit counts are an unreliable popularity signal because many documents are popular for reasons unrelated to any one topic. A homepage that everyone visits says nothing about which topic it deserves to rank for. The system needs to correlate popularity with topicality so that a document’s popularity contributes to ranking only on the topics it actually covers, and a document that is popular for one topic does not steal authority for another.
- Generic Popularity Misallocates Authority — A page that gets many visits because it sits at the root of a major site contributes equal popularity signal to every topic it touches lightly. Real authority should be topic-specific.
- Topic Without Popularity Misses Real Authority — Topic match alone cannot tell which of two equally on-topic pages is actually used. Both signals together are needed to rank reliably.
- Visit Data Is Noisy — Users visit pages for many reasons. Click data, dwell, and visit volume all carry signal but also carry noise. The system has to extract the topical popularity component from the mix.
- Topics Are Inferred Per Document — Documents are not labeled with their topics in any clean way. The system has to infer the topic(s) of each visited document through content analysis or external mapping.
- Per-Topic Aggregation Is The Goal — The output is not a single popularity score per document. It is a per-topic popularity score, so the same document contributes differently to different topical rankings.
Innovation
Map Visits To Topics, Then Aggregate Per Topic
The system receives location identifiers (URLs or document IDs) of documents users have visited. It retrieves each document and maps it to one or more topics. It then computes a popularity value for each visited document and correlates that popularity with the document’s topics. The output is a per-topic popularity score that the ranking system uses to weight documents within each topical cluster.
- Receive Visit Location Identifiers — Logs of documents users have visited are the input. Each entry is a location identifier (URL) plus optional metadata like visit count, dwell, or user count.
- Retrieve Documents — Fetch the actual document content for each visited URL. Retrieval enables topic mapping in the next step.
- Map To Topics — For each retrieved document, infer one or more topics it covers. Topic inference can use content classification, taxonomy mapping, or learned topic models.
- Compute Per-Document Popularity — Determine each document’s popularity value. Sources include visit count, unique visitor count, return-visit rate, and dwell-time signals.
- Correlate Popularity With Topics — For each document’s topics, attribute its popularity value. Documents with multiple topics distribute popularity across them per the correlation weights.
- Aggregate Per Topic — Sum or average the attributed popularity within each topic. The result is a per-topic popularity score that ranks documents by their relevance-weighted usage.
- Feed Into Ranking — The per-topic popularity score becomes a feature in the topical ranking pipeline. Documents popular within a topic rank higher when queries fall under that topic.
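The attribution and aggregation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the URLs, popularity values, topic labels, and topic weights are all invented, and the function names `attribute_popularity` and `aggregate_per_topic` are hypothetical.

```python
from collections import defaultdict

# Hypothetical inputs: per-document popularity values and document-to-topic
# weights. The numbers and topic labels are invented for illustration.
doc_popularity = {
    "example.com/": 1000.0,
    "example.com/espresso-guide": 300.0,
}
doc_topics = {
    "example.com/": {"coffee": 0.25, "tea": 0.25, "kitchenware": 0.5},
    "example.com/espresso-guide": {"coffee": 1.0},
}


def attribute_popularity(doc_popularity, doc_topics):
    """Return {(doc, topic): popularity share}, using normalized topic weights."""
    attributed = {}
    for doc, popularity in doc_popularity.items():
        weights = doc_topics.get(doc, {})
        total = sum(weights.values())
        if total <= 0:
            continue  # no topic mapping, so there is nothing to attribute
        for topic, weight in weights.items():
            attributed[(doc, topic)] = popularity * weight / total
    return attributed


def aggregate_per_topic(attributed):
    """Sum the attributed popularity within each topic."""
    totals = defaultdict(float)
    for (_doc, topic), value in attributed.items():
        totals[topic] += value
    return dict(totals)


attributed = attribute_popularity(doc_popularity, doc_topics)
topic_totals = aggregate_per_topic(attributed)
```

With these invented numbers, the homepage contributes 250 to "coffee" while the focused guide contributes 300, even though the homepage has more than three times the raw popularity.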
Popularity Is Topic-Bound
The patent’s contribution is treating popularity as a per-topic quantity rather than a per-document one. A page that gets visits across multiple topics distributes its authority among them rather than carrying full authority into each. The ranking system can then promote documents based on their topical popularity, not their raw visit count.
Visits Belong To Topics
When a user visits a document, the visit contributes popularity to the topics that document covers. Documents that span topics share their popularity proportionally.
- Document-To-Topic Mapping — Every visited document is associated with one or more topics through content analysis. The mapping is what makes per-topic aggregation possible.
- Per-Topic Aggregation — Popularity values are aggregated within each topic separately. The same document contributes different amounts to different topics.
Popularity flows through topics, not around them.
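Seen one visit at a time, proportional sharing might look like the sketch below. It assumes each document carries a set of topic shares; the `record_visit` helper, the topic names, and the share values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def record_visit(topic_popularity, topic_shares):
    """Credit one visit to the document's topics, split in proportion to its shares."""
    total = sum(topic_shares.values())
    if total <= 0:
        return  # a document with no topic mapping credits no topic
    for topic, share in topic_shares.items():
        topic_popularity[topic] += share / total


# Hypothetical document that is three parts hiking to one part camping.
trail_page_shares = {"hiking": 3, "camping": 1}

topic_popularity = defaultdict(float)
for _ in range(4):  # four visits to the same document
    record_visit(topic_popularity, trail_page_shares)
```

Each visit adds 0.75 to "hiking" and 0.25 to "camping", so the four visits add up to exactly four units of popularity spread across the two topics rather than four units credited to each.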
Technical Foundation
What The System Computes
The pipeline links visit data to documents, documents to topics, and topics to popularity scores. Each link is computed offline and refreshed on a schedule.
- Visit Identifier — The URL or document ID of a visited resource. Carries optional metadata like visit count and visitor characteristics.
- Document-To-Topic Mapping — An assignment of documents to topics. Can be one-to-many (a document covers multiple topics) or one-to-one (a document is canonically about one topic).
- Per-Document Popularity — A scalar score per document representing its observed popularity. Combines visit count, unique visitor count, and engagement signals.
- Per-Topic Popularity Aggregation — The output: a popularity score per (document, topic) pair, suitable for use as a ranking feature in topical retrieval.
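The source does not say how the visit and engagement signals are combined into a per-document scalar. One plausible sketch, purely as an assumption, is a weighted sum of log-damped signals; the `VisitStats` fields, the `popularity_score` name, and the weights are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class VisitStats:
    """Hypothetical per-document signals gathered from visit logs."""
    visits: int
    unique_visitors: int
    mean_dwell_seconds: float


def popularity_score(stats, w_visits=0.5, w_unique=0.3, w_dwell=0.2):
    """Weighted sum of log-damped signals. The weights are illustrative assumptions."""
    return (
        w_visits * math.log1p(stats.visits)
        + w_unique * math.log1p(stats.unique_visitors)
        + w_dwell * math.log1p(stats.mean_dwell_seconds)
    )


quiet_page = VisitStats(visits=40, unique_visitors=30, mean_dwell_seconds=20.0)
busy_page = VisitStats(visits=4000, unique_visitors=2500, mean_dwell_seconds=95.0)

quiet_score = popularity_score(quiet_page)
busy_score = popularity_score(busy_page)
```

The log damping is a design choice made for this sketch: it keeps a page with 100 times the visits from receiving 100 times the score, which limits how much sheer volume can dominate the other signals.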
Key Insight: Separating popularity from topicality solves a structural problem with naive visit-based ranking. Sites that get visits across many topics (homepages, portals, major news domains) would otherwise carry full popularity into every topic, distorting topical rankings. Distributing popularity across topics matches how real authority works: usage of a document about Topic A is evidence of authority on Topic A, not on Topic B.
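The distortion described here can be shown with a small comparison. The sketch below ranks two invented documents for one topic, first by raw popularity and then by popularity scaled by the document's share of that topic. The `rank_for_topic` function, the domains, and every number are hypothetical.

```python
def rank_for_topic(topic, doc_popularity, doc_topic_share, distribute=True):
    """Rank documents covering `topic`, by raw or topic-distributed popularity."""
    scored = []
    for doc, popularity in doc_popularity.items():
        share = doc_topic_share.get(doc, {}).get(topic, 0.0)
        if share <= 0:
            continue  # the document does not cover this topic at all
        score = popularity * share if distribute else popularity
        scored.append((score, doc))
    scored.sort(reverse=True)
    return [doc for _score, doc in scored]


# Hypothetical portal with heavy traffic but little gardening content,
# next to a small site devoted to gardening.
doc_popularity = {"portal.example/": 10000.0, "gardenblog.example/": 900.0}
doc_topic_share = {
    "portal.example/": {"news": 0.5, "sports": 0.45, "gardening": 0.05},
    "gardenblog.example/": {"gardening": 1.0},
}

naive_order = rank_for_topic("gardening", doc_popularity, doc_topic_share, distribute=False)
topical_order = rank_for_topic("gardening", doc_popularity, doc_topic_share, distribute=True)
```

Under raw popularity the portal ranks first for gardening. Once popularity is distributed by topic share, the portal's gardening score falls to about 500 and the focused site's 900 puts it ahead.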
The Process
End-To-End Aggregation
An offline pipeline runs against visit logs and a document store, producing the per-topic popularity table that ranking consults.
- Visit Log Ingestion — Aggregate visit data from logs, click streams, or other behavioral sources. Normalize URLs and deduplicate.
- Document Retrieval And Topic Inference — Fetch each visited document, run topic inference, and store the document-to-topics mapping.
- Per-Document Popularity Computation — Combine visit volume, unique visitors, and engagement into a per-document popularity score.
- Correlation And Attribution — Attribute each document’s popularity to its topics per the correlation weights. Documents covering multiple topics distribute popularity across them.
- Per-Topic Aggregation — Aggregate attributed popularity within each topic to produce per-(document, topic) scores.
- Publish To Ranking — Write the per-topic popularity table to the ranking feature store for use in topical retrieval.
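The ingestion step is the most mechanical part of this pipeline, so it is the easiest to sketch. The example below assumes log entries arrive as (user, URL) pairs. The normalization rules and the `normalize_url` and `count_unique_visitors` names are illustrative choices, not details taken from the patent.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase the scheme and host, drop the fragment, trim a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def count_unique_visitors(log_entries):
    """Count distinct users per normalized URL, skipping repeated (user, URL) pairs."""
    seen = set()
    counts = Counter()
    for user_id, url in log_entries:
        key = (user_id, normalize_url(url))
        if key in seen:
            continue
        seen.add(key)
        counts[key[1]] += 1
    return counts


# Hypothetical log: three entries that all point at the same document.
log_entries = [
    ("u1", "http://Example.com/guide/"),
    ("u1", "http://example.com/guide"),
    ("u2", "http://example.com/guide#intro"),
]
visitor_counts = count_unique_visitors(log_entries)
```

All three entries collapse to `http://example.com/guide`, and the repeated visit from `u1` is counted once, which gives two unique visitors for that document.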
What This Means for SEO
Topical popularity is a quiet input to ranking that distinguishes pages with real audience traction from pages with raw visit counts inherited from site-wide patterns.
- Build Topical Depth, Not Just Volume — A page that is popular for its specific topic contributes more topical authority than a page that gets many visits across mixed topics. Topical concentration beats topical sprawl.
- Homepages Don’t Inherit Authority Per Topic — Your homepage gets visits for many reasons. Those visits do not become per-topic authority unless your homepage is genuinely about that specific topic. Topical landing pages capture per-topic popularity better than generic hubs.
- Topic Mapping Depends On Content Clarity — Documents that are clearly about one topic get cleaner topic attribution. Pages that drift across multiple topics split their popularity attribution and weaken their per-topic standing.
- Engagement Compounds Per Topic — When users return, spend time, and re-engage with your content on a topic, the topical popularity signal strengthens. CTR alone is not enough; dwell and return signals matter.
- Brand Visit Volume Confers Only Topic-Specific Authority — Visits to your branded content do not carry over to unrelated topics. Under this model, a high-traffic property gains no ranking advantage for a topic outside its content focus, even though its overall visit volume is high.
- Per-Topic Authority Persists — Pages that earn per-topic popularity tend to retain it because the aggregation runs on accumulated visit data. Sudden traffic spikes from unrelated sources do not produce topical authority the way sustained topical usage does.