Ranks news articles by combining article relevance with publisher authority, freshness, topical cluster signals, and story-trajectory data, so timely high-quality reporting rises above lower-quality content covering the same event.
Patent Overview
- Inventor: Krishna Bharat
- Assignee: Google LLC
- Filed: 2003-09-05
- Granted: 2009-08-18
- Application Number: US 10/657,377
The Challenge
News presents unique challenges for ranking. Many publishers cover the same event simultaneously, freshness matters intensely, and publisher authority varies wildly. Generic web ranking does not weight these factors appropriately for news content.
- Multiple Publishers Cover One Event — When a story breaks, dozens to hundreds of publishers cover it within hours. The system must distinguish authoritative reporting from echo, syndication, and aggregation.
- Freshness Matters In Hours, Not Days — A news story decays rapidly. An article from this morning ranks above one from last week even at similar content quality. Generic freshness signals are too coarse.
- Publisher Authority Varies By Topic — A tech publication is authoritative on technology and weak on politics; a sports publication is the reverse. Authority must be measured per topic per publisher.
- Story Clusters Require Coordinated Ranking — Many articles about one event form a story cluster. Ranking must promote diverse perspectives within the cluster, not just the top single article.
- Original Reporting Deserves Lift — Articles that broke the story or contributed unique reporting should rank above aggregators that paraphrased. The system needs original-content detection.
Innovation
How The System Works
The system clusters incoming news articles by event, scores each article on relevance plus freshness plus publisher topical authority plus originality, ranks within and across clusters, and surfaces a diverse representative set for each event story.
- Cluster Articles By Event — Articles covering the same event form a story cluster. Clustering uses title similarity, entity overlap, and temporal proximity.
- Compute Article Freshness — Per article, compute its age in hours and the rate at which the story is being covered. Recent articles in active stories score high freshness.
- Score Publisher Topical Authority — Each publisher has per-topic authority scores derived from prior coverage quality, citation patterns, and editorial reputation. Authority weighs into the article score.
- Detect Originality — Articles that broke a story or added unique reporting (quotes, data, on-the-ground sources) earn an originality bonus. Aggregators and pure-rewrite content get less credit.
- Score Within Cluster — Inside each story cluster, articles are ranked by composite score: relevance plus freshness plus authority plus originality.
- Diversify Across Clusters — The SERP shows representative articles from multiple clusters when the query spans multiple events. Diversity prevents one story from dominating.
- Refresh Continuously — As new articles publish and existing ones age, the rankings update continuously. News rankings are not batch; they are streaming.
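As a rough illustration of the within-cluster scoring step, the sketch below combines the four signals as a weighted sum and sorts a cluster by the result. The weights, the `Article` fields, and the assumption that every signal is normalized to the range 0 to 1 are hypothetical; the source names the signals but gives no numeric values or formula.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; the source gives no values.
WEIGHTS = {"relevance": 0.4, "freshness": 0.3, "authority": 0.2, "originality": 0.1}


@dataclass
class Article:
    url: str
    relevance: float    # all four signals are assumed normalized to [0, 1]
    freshness: float
    authority: float
    originality: float


def composite_score(article: Article, weights: dict = WEIGHTS) -> float:
    """Weighted sum of the four signals named in the text."""
    return (weights["relevance"] * article.relevance
            + weights["freshness"] * article.freshness
            + weights["authority"] * article.authority
            + weights["originality"] * article.originality)


def rank_cluster(articles: list) -> list:
    """Order one story cluster from highest to lowest composite score."""
    return sorted(articles, key=composite_score, reverse=True)


cluster = [
    Article("wire-rewrite", relevance=0.9, freshness=0.9, authority=0.3, originality=0.0),
    Article("original-report", relevance=0.9, freshness=0.8, authority=0.8, originality=1.0),
]
ranked = rank_cluster(cluster)
print([a.url for a in ranked])
```

With these made-up numbers the slightly older original report outranks the fresher rewrite, because authority and originality outweigh the small freshness gap.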
Multi-Signal News Ranking
The patent's load-bearing combination is event clustering plus freshness plus topical authority plus originality. None alone is sufficient for news; together they produce rankings that surface timely high-quality reporting.
Event Is The Unit, Not Article
Web ranking treats each document independently. News ranking treats events as the unit and articles as instances of coverage. The shift unlocks cluster-aware ranking and diversity.
- Event Clustering — Articles cluster by event. Each cluster is the substrate for within-cluster ranking and the unit for cross-cluster diversification.
- Per-Topic Publisher Authority — Publishers have authority that varies by topic. A tech publication's tech coverage ranks differently than its sports coverage.
- Originality Detection — Original reporting wins lift; aggregation gets less credit. The system must detect who broke the story and who added unique material.
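To make the event-as-unit idea concrete, here is a minimal sketch of online cluster assignment. It uses only two of the signals the text mentions, entity overlap (as Jaccard similarity) and temporal proximity, and leaves title similarity out for brevity. The `Cluster` type, the `assign` function, and the thresholds `min_sim` and `max_gap_hours` are assumptions made for illustration, not details from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    entities: set
    last_seen_hour: float
    article_ids: list = field(default_factory=list)


def jaccard(a: set, b: set) -> float:
    """Overlap between two entity sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign(clusters, article_id, entities, hour, min_sim=0.5, max_gap_hours=48.0):
    """Attach an article to the most similar recent cluster, or start a new one."""
    best, best_sim = None, 0.0
    for c in clusters:
        if hour - c.last_seen_hour > max_gap_hours:
            continue  # cluster is too old to be the same event
        sim = jaccard(entities, c.entities)
        if sim > best_sim:
            best, best_sim = c, sim
    if best is None or best_sim < min_sim:
        best = Cluster(entities=set(entities), last_seen_hour=hour)
        clusters.append(best)  # a new event spawns a new cluster
    else:
        best.entities |= entities  # an update merges into the existing cluster
        best.last_seen_hour = max(best.last_seen_hour, hour)
    best.article_ids.append(article_id)
    return best


clusters = []
assign(clusters, "a1", {"fed", "rates", "powell"}, hour=0.0)
assign(clusters, "a2", {"fed", "rates", "inflation"}, hour=2.0)
assign(clusters, "a3", {"earthquake", "chile"}, hour=3.0)
assign(clusters, "a4", {"fed", "rates", "powell", "inflation"}, hour=100.0)
print([c.article_ids for c in clusters])
```

In this run the first two articles share a cluster, the earthquake story opens a second one, and the last article starts a third because every earlier cluster falls outside the time window.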
Technical Foundation
The patent specifies the event clusterer, the freshness computation, the publisher-authority store, the originality detector, and the streaming ranking pipeline.
- Event Clusterer — Online clustering algorithm assigns incoming articles to story clusters. New events spawn new clusters; updates merge into existing ones.
- Freshness Function — Per article, freshness is a function of age and story-activity rate. Decay is steeper for hot stories than for slow-moving ones.
- Publisher Authority Store — Per-publisher, per-topic authority scores derived from coverage quality, citation patterns, and editorial reputation. Updated periodically as publishers evolve.
- Originality Detector — Compares each article to others in its cluster to identify unique content (quotes, data, photographs). Articles with unique material earn originality bonus.
- Composite Score Function — Combines relevance, freshness, authority, and originality into a single article score. Weights are tuned per query type so different news queries weight signals differently.
- Streaming Update Pipeline — Rankings update continuously as new articles publish and old ones age. The pipeline handles streaming inputs and produces continuously-updated rankings.
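One way to picture the freshness function is an exponential decay whose half-life shrinks as story activity rises, so hot stories decay faster than slow ones. The functional form, the `base_half_life_hours` default, and the use of articles per hour as the activity measure are assumptions; the source states only that freshness depends on age and story-activity rate.

```python
def freshness(age_hours: float, articles_per_hour: float,
              base_half_life_hours: float = 24.0) -> float:
    """Return a freshness score in (0, 1]; 1.0 means just published.

    The half-life is divided by (1 + activity rate), so a busier story
    loses freshness faster. This form is an illustrative assumption.
    """
    if age_hours < 0 or articles_per_hour < 0:
        raise ValueError("age and activity rate must be non-negative")
    half_life = base_half_life_hours / (1.0 + articles_per_hour)
    return 0.5 ** (age_hours / half_life)


slow_story = freshness(age_hours=6, articles_per_hour=0)
hot_story = freshness(age_hours=6, articles_per_hour=3)
print(round(slow_story, 3), round(hot_story, 3))
```

At the same age of six hours, the article in the hot story scores lower than the one in the slow story, which matches the steeper decay described above.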
The Process
The pipeline runs as a continuous stream. New articles enter the system constantly, and rankings update in near real time so the SERP always reflects the current state of the news.
- Article Published, Crawled, Indexed — Publisher emits article; crawler picks it up; the indexer extracts content and metadata.
- Cluster Assignment — The clusterer assigns the article to a story cluster (existing or new) based on similarity to current clusters.
- Compute Freshness — Article timestamp plus cluster activity rate produces the freshness score. Hot stories get steeper decay; slow stories get gentler.
- Apply Publisher Authority — The publisher's topical authority for the cluster's topic adds to the article's composite score.
- Run Originality Detection — Compare to other articles in the cluster. Unique content earns the originality bonus.
- Compute Composite Score — Combine relevance, freshness, authority, originality. The composite is the article's ranking score.
- Update Rankings — The ranking system reads the updated score and reshapes the cluster and SERP rankings. Users querying for the topic see the freshest authoritative reporting first.
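The originality step in this process can be approximated by comparing an article's text against earlier articles in the same cluster. The sketch below uses word shingles (overlapping runs of `k` words) as a stand-in technique. The source describes detecting unique quotes, data, and photographs but does not say how, so the `shingles` and `originality` helpers and the choice of `k = 3` are hypothetical.

```python
def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def originality(article_text: str, earlier_texts: list, k: int = 3) -> float:
    """Fraction of the article's shingles not found in earlier cluster articles."""
    own = shingles(article_text, k)
    if not own:
        return 0.0  # too short to judge
    seen = set()
    for text in earlier_texts:
        seen |= shingles(text, k)
    return len(own - seen) / len(own)


first = "the mayor resigned on monday after the audit"
rewrite = "The mayor resigned on Monday after the audit"
follow_up = first + " she told reporters the decision was final"

print(originality(first, []))           # first article in the cluster
print(originality(rewrite, [first]))    # pure rewrite of existing copy
print(originality(follow_up, [first]))  # adds a new quote
```

The first article is fully original, the rewrite scores zero, and the follow-up earns partial credit for the material it adds.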
Quality Control
News ranking is sensitive to misinformation, aggregation gaming, and publisher manipulation. The patent specifies safeguards.
- Misinformation Filtering — Articles from known misinformation sources are filtered or heavily demoted regardless of freshness or relevance. The filter is editorial plus algorithmic.
- Aggregation Detection — Pure-aggregation articles (rewrites with no original content) get little originality credit. The detector compares each article against other articles in its cluster that carry verified original material.
- Publisher Authority Audit — Per-topic publisher authority is audited periodically. Publishers whose coverage quality drops have their authority adjusted downward.
- Cluster Quality Verification — Clusters must be coherent (articles really cover one event). Bad clusters are flagged and split or merged as needed.
- Diversity Enforcement — The SERP enforces diversity across clusters so no single story dominates. The diversity rule is calibrated per news query type.
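Diversity enforcement can be sketched as a cap on how many results any one cluster may contribute. The `diversify` function, the `max_per_cluster` cap, and the `limit` on result count are illustrative assumptions; the source says only that the diversity rule is calibrated per query type.

```python
def diversify(scored, max_per_cluster=2, limit=5):
    """Pick top-scoring articles while capping each cluster's share.

    scored: iterable of (article_id, cluster_id, score) tuples.
    """
    picked = []
    per_cluster = {}
    for article_id, cluster_id, _score in sorted(scored, key=lambda r: r[2], reverse=True):
        if per_cluster.get(cluster_id, 0) >= max_per_cluster:
            continue  # this story already holds its maximum number of slots
        picked.append(article_id)
        per_cluster[cluster_id] = per_cluster.get(cluster_id, 0) + 1
        if len(picked) == limit:
            break
    return picked


results = [
    ("a", "c1", 0.9), ("b", "c1", 0.8), ("c", "c1", 0.7),
    ("d", "c2", 0.6), ("e", "c3", 0.5),
]
top = diversify(results, max_per_cluster=2, limit=4)
print(top)
```

Here article "c" is skipped even though it outscores "d" and "e", because its cluster already fills two slots.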
Real-World Application
This patent describes the ranking approach associated with Google News, Top Stories carousels on web Search, and the news sections in Discover. Its primitives shape how news content surfaces across Google's products.
- Real-time Update Cadence — Rankings update continuously as news flows in. New articles can rank within minutes of publication if they earn the composite score.
- Per-topic Authority Scope — Publisher authority is per-topic. The same publisher can be authoritative on one topic and weak on another.
- Event-clustered Organization Unit — Articles organize by event. Ranking happens within and across clusters with diversity enforcement.
Why Original Reporting Wins Top Stories Slots
The originality detector rewards articles that broke a story or added unique material. Publishers investing in original journalism win Top Stories visibility; aggregators see steadily less surface area over time.
Why Publisher Authority Compounds By Topic
Building per-topic editorial authority compounds visibility on the topic. A publisher recognized as authoritative on a niche earns ranking lift on every story in that niche, beyond what generic-publication metrics would predict.
What This Means for SEO
The patent clusters news by event and ranks articles on relevance, freshness, publisher topical authority, and originality, surfacing a diverse representative set per story. SEO implication: original reporting and per-topic publisher authority win Top Stories visibility, while aggregation loses surface area over time.
- Original Reporting Wins Top Stories — The originality detector rewards articles that broke a story or added unique material. Publishers investing in original journalism win Top Stories visibility, while aggregators see steadily less surface area over time.
- Publisher Authority Compounds By Topic — Building per-topic editorial authority compounds visibility on that topic. A publisher recognized as authoritative on a niche earns ranking lift on every story in it, beyond what generic-publication metrics would predict.
- Event Is The Unit, Not The Article — News ranking treats events as the unit and articles as instances of coverage. To win, your article must stand out within the event cluster, so adding a distinct angle or unique reporting on a covered event is what earns selection.
- Freshness Is Intense For News — Freshness weighs heavily in news ranking. Timely publishing on breaking events is essential; late coverage of a developing story competes poorly against fresher articles in the same cluster.
- Diversity Selection Limits Duplicates — The system surfaces a diverse representative set per event, so near-identical coverage competes for limited slots. Differentiated coverage (unique angle, data, or perspective) is favored over articles echoing the same wire copy.
- Aggregation Loses To Origination — Originality scoring structurally disadvantages content that merely republishes others' reporting. Sustainable news visibility comes from originating material, not aggregating it.
- Topic Focus Builds News Authority — Per-topic authority means a publisher focused on a niche outranks generalists on that niche's stories. Concentrating editorial investment in defined beats compounds news-ranking authority within them.