This patent scores documents using link-graph features: inbound link counts, link quality, anchor-text relevance, and link-source diversity. It describes a foundational link-based ranking signal layered on top of PageRank-style graph analysis.
Patent Overview
- Inventor: Jeffrey Dean, others
- Assignee: Google LLC
- Filed: 2003
- Granted: 2013-03-26
The Challenge
PageRank tells the system how the web votes globally. Link-based document scoring tells the system how the web votes for a specific document, in a specific topical and structural context. The scoring layer needs link signals beyond eigenvector centrality.
- Raw Link Count Is Gameable — Inbound link counts can be inflated through link farms. Counting alone misses link quality and source diversity.
- Anchor Text Carries Topical Signal — The text used to link to a document reveals what the linking site thinks the document is about. This signal is independent of the document's own self-description.
- Source Diversity Matters — 1,000 links from one source are weaker than 100 links from 100 different sources. Diversity captures the breadth of endorsement.
- Link Quality Varies — A link from a high-authority source carries more weight than a link from a low-quality source. Per-link quality scoring is required.
- Topical Relevance Of Links Matters — Links from topically aligned sources carry more weight than off-topic links. Topical relevance is a per-link feature.
Innovation
How The System Works
The system extracts per-document link features (count, quality, diversity, anchor relevance, topical alignment), combines them into a link-criteria score, and applies the score in conjunction with content-based scoring.
- Enumerate Inbound Links — Per document, collect every known inbound link from the link graph. Each link carries source, anchor text, and timing.
- Score Per-Link Quality — Per link, compute source-authority score, topical-alignment score, and anchor-relevance score.
- Compute Source Diversity — Measure how many distinct domains and topical clusters link in. Diversity bounds prevent single-source dominance.
- Anchor-Text Topical Match — Compare anchor-text distribution against document topical model. Matches earn relevance bonus.
- Aggregate Per-Document Score — Combine per-link contributions with diversity and anchor-match factors. Output is a per-document link-criteria score.
- Apply In Ranking — Link-criteria score multiplies into the broader ranking function alongside content-based and freshness-based scores.
- Detect Link-Pattern Manipulation — Pattern analysis flags link farms, spike anomalies, and reciprocal-link cliques. Manipulation triggers penalties or filtering.
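The enumerate, score, and aggregate steps above can be sketched as follows. This is a minimal illustration rather than the patent's actual formula: the `InboundLink` fields, the product used for per-link quality, and the distinct-domain ratio used for diversity are all assumptions chosen for clarity. Timing and topical clusters are left out to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundLink:
    """Hypothetical record for one inbound link; all scores assumed in [0, 1]."""
    source_domain: str
    source_authority: float   # authority of the linking source
    topical_alignment: float  # how close the source's topic is to the target's
    anchor_relevance: float   # how well the anchor text matches the target


def per_link_quality(link: InboundLink) -> float:
    # Assumption: a plain product, so a link weak on any one signal scores low.
    return link.source_authority * link.topical_alignment * link.anchor_relevance


def diversity_ratio(links: list[InboundLink]) -> float:
    # Assumption: share of links that come from distinct domains.
    if not links:
        return 0.0
    return len({link.source_domain for link in links}) / len(links)


def link_criteria_score(links: list[InboundLink]) -> float:
    # Sum per-link contributions, then scale by source diversity.
    return sum(per_link_quality(link) for link in links) * diversity_ratio(links)


one_source = [InboundLink("farm.example", 0.5, 0.5, 0.5)] * 1000
many_sources = [InboundLink(f"site{i}.example", 0.5, 0.5, 0.5) for i in range(100)]
print(link_criteria_score(one_source), link_criteria_score(many_sources))
```

Under these assumed weights, 100 links from 100 domains score well above 1,000 links from a single domain, which is the behavior the diversity step is meant to produce.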
Link Criteria Beyond Count
The patent's load-bearing idea is that link-based scoring needs multiple criteria. Count, quality, diversity, and anchor relevance combine into a multidimensional link signal that resists single-vector manipulation.
Quality And Diversity Beat Count
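A few high-quality, topically aligned, diverse-source links beat thousands of low-quality, off-topic, single-source links. The scoring weights enforce this: each link's contribution scales with its quality and with source diversity, so volume alone cannot dominate.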
- Per-Link Quality — Source authority, topical alignment, and anchor relevance combine per link. Quality times count beats count alone.
- Source Diversity — Distinct domains and topical clusters required. Diversity bounds prevent single-source dominance.
- Anchor-Topical Match — Anchor-text distribution aligned with document topic earns bonus. Off-topic anchors signal manipulation.
Technical Foundation
The patent specifies the link enumerator, per-link quality scorer, diversity calculator, anchor-relevance matcher, aggregator, and manipulation detector.
- Link Enumerator — Per document, queries link graph for all known inbound links. Outputs source, anchor, and timing for each.
- Per-Link Quality Scorer — Combines source-authority, topical-alignment, and anchor-relevance scores per link. Output is a per-link quality value.
- Diversity Calculator — Counts distinct domains and topical clusters among link sources. Bounded diversity prevents single-source gaming.
- Anchor-Relevance Matcher — Compares anchor-text distribution against document topical model. Match score contributes to per-link quality.
- Aggregator — Combines per-link contributions, diversity, and anchor-match factors into a per-document link-criteria score.
- Manipulation Detector — Pattern analysis flags link farms, reciprocal cliques, and anomalous spikes. Triggers penalties or filtering.
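Of the components above, the anchor-relevance matcher is the easiest to illustrate in isolation. The sketch below assumes the "document topical model" is a simple term-weight mapping and uses cosine similarity over whitespace-split terms. Both choices are stand-ins for illustration, not something the source specifies, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Mapping


def term_distribution(anchor_texts: list[str]) -> Counter:
    """Count lowercase whitespace-split terms across all anchor texts."""
    counts: Counter = Counter()
    for text in anchor_texts:
        counts.update(text.lower().split())
    return counts


def cosine_similarity(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """Cosine similarity of two sparse term-weight vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def anchor_match_score(anchor_texts: list[str], doc_topic_terms: Mapping[str, float]) -> float:
    """Compare the anchor-text term distribution against the document's topic terms."""
    return cosine_similarity(term_distribution(anchor_texts), doc_topic_terms)


topic = {"espresso": 3.0, "machine": 2.0, "coffee": 1.0}
on_topic = ["espresso machine reviews", "best espresso machine"]
off_topic = ["cheap casino bonus"]
print(anchor_match_score(on_topic, topic), anchor_match_score(off_topic, topic))
```

A production matcher would likely need stemming, stop-word handling, and a richer topic model. The point here is only that the match is a comparison between two distributions, not a keyword lookup.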
The Process
Link-criteria scoring runs at indexing time. Per-document scores are cached in the index and read at query time.
- Enumerate Links — Link graph provides all known inbound links per document.
- Score Per-Link Quality — Source authority, topical alignment, and anchor relevance combine per link.
- Compute Diversity — Distinct-domain and topical-cluster counts feed diversity score.
- Anchor-Topical Match — Anchor distribution compared against document topic model. Match bonus applied.
- Aggregate Score — Per-link contributions combine with diversity and anchor-match into per-document score.
- Manipulation Check — Pattern analysis flags suspicious link patterns. Flags trigger penalty or filtering.
- Cache And Apply — Score caches in index. Rankers consume at query time alongside content and freshness scores.
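The split between indexing-time computation and query-time consumption can be sketched as below. `LinkScoreIndex` and `rank` are hypothetical names. The multiplicative combination follows the description of the score multiplying into the broader ranking function, and treating a missing score as a neutral 1.0 is an assumption.

```python
class LinkScoreIndex:
    """Hypothetical in-memory stand-in for the index-side score cache."""

    def __init__(self) -> None:
        self._scores: dict[str, float] = {}

    def store(self, doc_id: str, link_score: float) -> None:
        # Called at indexing time, after the link-criteria score is computed.
        self._scores[doc_id] = link_score

    def lookup(self, doc_id: str, default: float = 1.0) -> float:
        # Called at query time; assumption: unknown documents get a neutral 1.0.
        return self._scores.get(doc_id, default)


def rank(candidates: dict[str, tuple[float, float]], index: LinkScoreIndex) -> list[str]:
    """Order candidates by content * freshness * cached link score, best first.

    `candidates` maps doc_id -> (content_score, freshness_score).
    """
    def final_score(doc_id: str) -> float:
        content, freshness = candidates[doc_id]
        return content * freshness * index.lookup(doc_id)

    return sorted(candidates, key=final_score, reverse=True)


index = LinkScoreIndex()
index.store("a", 2.0)
index.store("b", 0.5)
ranking = rank({"a": (0.6, 1.0), "b": (0.9, 1.0), "c": (0.5, 1.0)}, index)
print(ranking)
```

Because the expensive graph work happens once per document at indexing time, the query path only performs a lookup and a multiplication per candidate.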
Quality Control
Link-criteria scoring is among the most manipulated ranking inputs. The patent specifies safeguards.
- Per-Source Authority Cap — Low-authority sources contribute bounded score regardless of count. Prevents link-farm gaming.
- Diversity Requirement — Per-document score requires source diversity. Single-source link bombs hit the diversity floor.
- Anchor-Topical Alignment — Off-topic anchor text signals manipulation. Mismatched anchors earn no bonus and can trigger penalty.
- Pattern-Based Manipulation Detection — Link farms, reciprocal cliques, and spike anomalies trigger filtering. Detected manipulation incurs a penalty.
- Continuous Calibration — Per-link weights and manipulation classifiers recalibrate periodically against fresh labeled data.
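Two of the pattern checks named above, reciprocal links and spike anomalies, can be illustrated with simple heuristics. The functions, the z-score threshold of 3.0, and the floor of 1.0 on the standard deviation are all assumptions for this sketch. The source describes the detector only at the level of "pattern analysis", and a real system would presumably use trained classifiers.

```python
from statistics import mean, pstdev


def reciprocal_pairs(edges: set[tuple[str, str]]) -> set[frozenset[str]]:
    """Return unordered domain pairs that link to each other in both directions."""
    return {
        frozenset((source, target))
        for source, target in edges
        if source != target and (target, source) in edges
    }


def has_spike(daily_new_links: list[int], z_threshold: float = 3.0) -> bool:
    """Flag the latest day if its new-link count sits far above the prior history."""
    if len(daily_new_links) < 3:
        return False  # not enough history to judge
    *history, latest = daily_new_links
    baseline = mean(history)
    # Assumption: floor the spread at 1.0 so a flat history does not flag tiny changes.
    spread = max(pstdev(history), 1.0)
    return (latest - baseline) / spread > z_threshold


edges = {("a.example", "b.example"), ("b.example", "a.example"), ("a.example", "c.example")}
print(reciprocal_pairs(edges))
print(has_spike([2, 3, 2, 3, 2, 40]), has_spike([2, 3, 2, 3, 2, 3]))
```

Pairwise reciprocity is only the smallest case of a clique. Detecting larger reciprocal groups would require graph analysis beyond this sketch.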
Real-World Application
Link-criteria scoring is the multidimensional link layer that complements PageRank. Its primitives recur across modern link-based ranking systems and define what high-quality link earning looks like.
- Multi-criteria Link Score Form — Count, quality, diversity, anchor relevance combine. No single criterion dominates.
- Per-link Quality Granularity — Each inbound link carries its own quality value. High-quality links contribute more.
- Diversity-bounded Single-Source Defense — Distinct-domain and topical-cluster diversity required. Prevents single-source gaming.
Why High-Quality Links Beat Many Low-Quality
Per-source authority caps mean low-quality sources contribute bounded score regardless of count. A handful of high-authority links outweigh thousands of low-authority ones.
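A small worked example makes the effect of the cap concrete. The sketch below assumes one possible reading of the cap: links from sources below an authority threshold share a bounded total contribution. The threshold of 0.2, the cap of 1.0, and the function name are hypothetical values picked for illustration.

```python
def capped_link_score(
    source_authorities: list[float],
    low_threshold: float = 0.2,
    low_tier_cap: float = 1.0,
) -> float:
    """Sum link authority, bounding the total contributed by low-authority sources."""
    high_total = sum(a for a in source_authorities if a >= low_threshold)
    low_total = sum(a for a in source_authorities if a < low_threshold)
    # Assumption: however many low-authority links exist, they add at most low_tier_cap.
    return high_total + min(low_total, low_tier_cap)


few_strong = [0.9] * 5      # five high-authority links
many_weak = [0.05] * 5000   # five thousand low-authority links
print(capped_link_score(few_strong), capped_link_score(many_weak))
```

Without the cap, the 5,000 weak links would sum to roughly 250 and swamp the five strong ones. With it, their contribution stops at 1.0 and the strong links win.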
Why On-Topic Anchors Matter
Anchor text that aligns topically with the linked document earns a relevance bonus. Off-topic anchors signal manipulation and can trigger a penalty, so earn links from topically aligned sources.
What This Means for SEO
This patent scores documents on multidimensional link features (per-link quality, source diversity, anchor-text topical relevance) layered on top of PageRank-style centrality. SEO implication: a few high-authority, on-topic, diverse-source links outweigh thousands of low-quality, off-topic, single-source links.
- Quality Times Count, Not Count Alone — Each inbound link carries a per-link quality value from source authority, topical alignment, and anchor relevance. Low-authority sources are capped regardless of volume, so pursue authoritative linkers over bulk.
- Source Diversity Is Required — A diversity floor means the score requires distinct domains and topical clusters; 1,000 links from one source lose to 100 links from 100 sources. Earn links across many independent sites, not repeatedly from the same one.
- Keep Anchor Text On Topic — Anchor-text distribution is compared against the document's topical model. On-topic anchors earn a relevance bonus; off-topic anchors signal manipulation and can trigger penalty. Avoid exact-match anchor stuffing.
- Topically Aligned Sources Count More — Links from sites in your topic carry more weight than off-topic links. A relevant industry mention beats a generic high-traffic placement that has nothing to do with your subject.
- Link Farms And Cliques Are Detected — Pattern analysis flags link farms, reciprocal-link cliques, and spike anomalies, triggering penalties or filtering. Reciprocal link schemes and PBN structures register as patterns.
- Per-Source Authority Caps Defeat Volume Plays — Because low-authority sources contribute bounded score, scaling up cheap links hits a ceiling fast. Budget toward fewer, stronger placements rather than mass acquisition.
- Anchor Mismatch Can Backfire — Mismatched anchors earn no bonus and can trigger penalty, so manipulating anchor text aggressively is worse than letting natural, varied anchors form. Let editors describe your page in their own words.