The original 2004 filing that broke PageRank's uniform-link assumption. Every link on a page is weighted by features a real human would actually notice, and PageRank flows through those weights instead of being split evenly across outbound links.
Patent Overview
- Inventors: Corin Anderson, Jeffrey A. Dean, Alexis Battle
- Assignee: Google LLC
- Filed: June 17, 2004
- Granted: May 11, 2010
The Challenge
Classic PageRank assumes a random surfer who clicks every outbound link on a page with equal probability. That assumption is mechanically convenient but humanly wrong. The challenge: replace the uniform-click model with a probability that respects how a real reader would actually scan a page and decide where to click, without abandoning the recursive link-graph math that made PageRank work.
- Uniform Click Probability Is Unrealistic — Per link, the original model gives a footer link the same weight as a hero in-content link. Real readers do not behave that way.
- Anchor Text Is Discarded — Per outbound link, descriptive anchor text and generic anchor text pass identical PageRank under the random surfer.
- Position On Page Is Ignored — Per source page, above-the-fold body links and buried sidebar links pass the same authority despite very different reader attention.
- Link Styling And Context Are Invisible — Per link, font size, surrounding sentence, and visual styling all signal intent. The random surfer reads none of it.
- Link-Type Attributes Are Unread — Per link, attributes including rel values, target, and link type carry editorial intent that the uniform model collapses to a single equal-weight click.
Innovation
How The System Works
The system replaces the uniform click probability with a per-link probability computed from the link's features. PageRank then flows through each outbound link weighted by that probability, so prominent and contextually relevant links pass more authority than buried or generic ones.
- Extract Per-Link Features — Per outbound link on a source page, the system reads anchor text, font size, position, styling, surrounding text, link type, and target attributes.
- Score Click Probability — Per link, a model converts the feature vector into a probability that a reasonable reader would click that specific link.
- Normalise Across Outbound Links — Per source page, the per-link probabilities are normalised so the total click mass across all outbound links sums to one.
- Weight PageRank Flow — Per (source, target) pair, the source page's PageRank is split across outbound links by the normalised click probability instead of uniformly.
- Recurse Across The Web Graph — Per iteration, weighted PageRank propagates through the full link graph until the values converge.
- Combine With Other Signals — Per ranker, the reasonable-surfer PageRank value joins content relevance, query-link match, and other inputs for the final ranking.
- Refresh As Pages Change — Per cycle, feature extraction and probability scoring re-run as source pages are recrawled, so weights track on-page edits and template changes.
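The first three steps above (extract features, score click probability, normalise per page) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's model: the `LinkFeatures` fields, the logistic form, and every coefficient are hypothetical, and each target is assumed to appear at most once per page.

```python
import math
from dataclasses import dataclass


@dataclass
class LinkFeatures:
    """Hypothetical per-link features; the field names are illustrative."""
    target: str
    descriptive_anchor: bool  # anchor text describes the target
    in_body: bool             # link sits in the main content area
    font_size_px: float       # rendered font size of the anchor


def raw_click_score(link: LinkFeatures) -> float:
    """Map features to a score in (0, 1) with a hand-set logistic function.

    The coefficients are invented for illustration, not fitted to click data.
    """
    z = -2.0
    z += 1.5 if link.descriptive_anchor else 0.0
    z += 2.0 if link.in_body else 0.0
    z += 0.05 * (link.font_size_px - 12.0)
    return 1.0 / (1.0 + math.exp(-z))


def normalised_click_probabilities(links: list[LinkFeatures]) -> dict[str, float]:
    """Scale raw scores so they sum to one across a page's outbound links."""
    if not links:
        return {}
    scores = {link.target: raw_click_score(link) for link in links}
    total = sum(scores.values())
    return {target: score / total for target, score in scores.items()}


page_links = [
    LinkFeatures("guide", descriptive_anchor=True, in_body=True, font_size_px=16.0),
    LinkFeatures("terms", descriptive_anchor=False, in_body=False, font_size_px=11.0),
]
probabilities = normalised_click_probabilities(page_links)
```

With these made-up coefficients the in-body, descriptively anchored `guide` link ends up with most of the page's click mass, and the two probabilities sum to one, which is the property the PageRank split depends on.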
Not Every Link Is Worth The Same
The patent's load-bearing idea is that link equity is not a property of the URL alone. It is a property of how the link is presented on the source page, because that presentation predicts whether a real reader would ever follow it.
Feature-Weighted PageRank Flow
Per link, click probability is computed from features. Per source page, PageRank is split across outbound links by those probabilities, not evenly.
- Anchor Text — Per link, descriptive anchor text raises click probability above generic anchors.
- Position And Styling — Per link, in-body prominent links outweigh footer and sidebar links.
- Context And Attributes — Per link, surrounding text and link-type attributes shape the weight.
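In equation form, the weighted flow can be read as a small change to the classic PageRank recurrence. The notation below is an assumption introduced for illustration, not taken from the filing: $d$ is a damping factor, $N$ the number of pages, $B(v)$ the set of pages linking to $v$, $F(u)$ the set of pages $u$ links to, and $p(u \to v)$ the normalised click probability of the link from $u$ to $v$.

```latex
PR(v) = \frac{1 - d}{N} + d \sum_{u \in B(v)} PR(u)\, p(u \to v),
\qquad
\sum_{v \in F(u)} p(u \to v) = 1 .
```

The uniform model is the special case $p(u \to v) = 1 / |F(u)|$ for every outbound link of $u$.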
Technical Foundation
The patent specifies feature extraction, click-probability modelling, normalisation per source page, weighted PageRank propagation, and integration with the existing ranking pipeline.
- Link Feature Vector — Per link, features include anchor text tokens, font size, position coordinates, styling, surrounding text, link type, and target attributes.
- Click Probability Model — Per link, a learned function maps the feature vector to a click probability calibrated against observed user behaviour.
- Per-Page Normalisation — Per source page, link probabilities are scaled so they sum to one across all outbound links on the page.
- Weighted Random Walk — Per iteration, the PageRank transition matrix uses per-link click probabilities in place of the uniform 1/out-degree.
- Convergence And Storage — Per document, the converged reasonable-surfer PageRank value is stored for use at query time.
- Pipeline Integration — Per ranker, the weighted PageRank value is combined with content signals, query-link signals, and other inputs.
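The weighted random walk and its convergence loop can be sketched as a plain power iteration. This is an assumption-laden illustration rather than the patent's implementation: the damping factor of 0.85, the even redistribution of rank from pages without outbound links, and the function name `weighted_pagerank` all come from the textbook PageRank setup, not from the source.

```python
def weighted_pagerank(
    transitions: dict[str, dict[str, float]],
    damping: float = 0.85,
    tolerance: float = 1e-10,
    max_iterations: int = 200,
) -> dict[str, float]:
    """Power iteration over a graph whose edges carry click probabilities.

    transitions[source][target] is the normalised click probability of the
    link source -> target; each source's probabilities must sum to one.
    Pages without outbound links spread their rank evenly over all pages.
    """
    pages = set(transitions)
    for targets in transitions.values():
        pages.update(targets)
    if not pages:
        return {}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(max_iterations):
        # Rank held by pages with no outbound links is shared evenly.
        dangling = sum(rank[p] for p in pages if not transitions.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in transitions.items():
            for target, probability in targets.items():
                # Each link passes rank in proportion to its click probability.
                new_rank[target] += damping * rank[source] * probability
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tolerance:
            break
    return rank


graph = {
    "home": {"guide": 0.8, "terms": 0.2},
    "guide": {"home": 1.0},
    "terms": {"home": 1.0},
}
ranks = weighted_pagerank(graph)
```

In this toy graph `guide` and `terms` each receive exactly one inbound link from `home`, yet `guide` converges to a higher value because its link carries four times the click probability.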
The Process
From a crawl of source pages, the system extracts link features, scores click probability per link, propagates PageRank through the weighted graph, and stores the result for the live ranker.
- Crawl Source Pages — Per source page, the full DOM, layout, and outbound links are captured.
- Extract Link Features — Per link, features are read from the page rendering and HTML structure.
- Score Click Probability — Per link, the probability model produces a weight reflecting reader likelihood to click.
- Normalise Per Page — Per source page, weights are scaled to sum to one across outbound links.
- Propagate Weighted PageRank — Per iteration, PageRank flows through the weighted link graph until convergence.
- Store Per-Document Values — Per document, the reasonable-surfer PageRank is stored alongside other ranking signals.
- Use At Query Time — Per query, the weighted PageRank value participates in the final ranking computation.
Quality Control
Feature-weighted PageRank introduces manipulation risks if site owners can game styling or anchor text to inflate per-link weights. The patent describes safeguards that keep weights honest.
- Calibration Against Observed Behaviour — Per link, the probability model is fit to actual user-click data so synthetic styling tricks do not drift far from real reader behaviour.
- Feature Independence Checks — Per source page, no single feature can dominate the probability score. The model combines anchor, position, styling, and context signals.
- Spam And Boilerplate Detection — Per source page, footer blocks, sitewide templates, and link farms are identified and their per-link weights are damped.
- Anchor Text Diversity Sanity — Per target page, suspiciously identical anchor text across many sources is flagged so anchor stuffing does not freely inflate weighted PageRank.
- Iterative Recomputation — Per cycle, feature extraction and probability scoring rerun, so on-page changes and template fixes are reflected in current weights.
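One way to picture the anchor text diversity check is a simple share-of-most-common-anchor test per target page. The sketch below is hypothetical throughout: the function names, the `min_links` and `max_share` thresholds, and the example anchors are invented for illustration and are not values described in the filing.

```python
from collections import Counter


def dominant_anchor_share(anchors: list[str]) -> float:
    """Return the fraction of inbound links that use the most common anchor."""
    if not anchors:
        return 0.0
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    return counts.most_common(1)[0][1] / len(anchors)


def is_suspicious(anchors: list[str], min_links: int = 20, max_share: float = 0.8) -> bool:
    """Flag a target whose inbound anchors are nearly identical.

    min_links and max_share are illustrative thresholds, not published values.
    """
    return len(anchors) >= min_links and dominant_anchor_share(anchors) > max_share


# 45 of 50 inbound links share one anchor: flagged.
stuffed = ["cheap widgets"] * 45 + ["Acme", "this review", "widgets guide", "here", "acme.example"]
# 25 inbound links spread evenly over five anchors: not flagged.
natural = ["Acme", "acme widgets", "this review", "widget guide", "here"] * 5

stuffed_flag = is_suspicious(stuffed)
natural_flag = is_suspicious(natural)
```

A flagged target would then have the weights of the offending links damped rather than dropped outright, which keeps the check from zeroing out legitimately popular anchor phrases.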
Real-World Application
Reasonable-surfer weighting is the mechanical reason link equity has never been uniform. Every observation that link placement, anchor text, and on-page context affect how much authority a backlink passes traces back to this 2004 filing. The continuations in 2012 and 2016 refine the model, but the foundational logic is here.
- Per-link Probability Granularity — Every link on a source page carries its own click weight.
- Feature-driven Weighting Inputs — Anchor, position, styling, context, and attributes feed the probability model.
- Weighted PageRank Output Signal — Authority flows through the weighted graph instead of being split evenly.
Why Footer Links Pass Less Than In-Content Links
Per source page, an in-body link with rich anchor text and supporting context scores a high click probability and receives a large share of that page's PageRank flow. Footer-stuffed sitewide links sit in low-attention regions, score low probabilities, and pass thin authority even when the raw link count is high.
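A small worked example makes the gap concrete. The raw scores below are invented for illustration: one in-body link scored at 0.60 and forty footer links scored at 0.01 each.

```python
def authority_shares(raw_scores: dict[str, float]) -> dict[str, float]:
    """Normalise raw click scores into shares of the page's PageRank flow."""
    total = sum(raw_scores.values())
    return {link: score / total for link, score in raw_scores.items()}


# Hypothetical page: one prominent body link plus forty footer links.
raw_scores = {"body_link": 0.60}
raw_scores.update({f"footer_{i}": 0.01 for i in range(40)})

weighted = authority_shares(raw_scores)
uniform_share = 1.0 / len(raw_scores)  # what every link gets under the uniform model

body_share = weighted["body_link"]       # about 0.60 of the page's flow
footer_share = weighted["footer_0"]      # about 0.01 of the page's flow
footer_total = sum(v for k, v in weighted.items() if k.startswith("footer_"))
```

Under the uniform model each of the 41 links would receive roughly 2.4% of the flow. Under these assumed scores the single body link takes about 60%, while all forty footer links together share the remaining 40%.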
Why Anchor Text Still Matters Mechanically
Per link, descriptive anchor text raises the click probability the model assigns. Higher probability means a larger share of the source page's PageRank reaches the target. Anchor text is not a stylistic detail. It is an input into the weight the system uses to split authority.
What This Means for SEO
Reasonable-surfer weighting means backlink strategy is not a counting exercise. The same target page can receive very different amounts of PageRank from two backlinks depending on how each link is presented on its source page. Strategy must optimise for the features the model actually reads.
- Not All Backlinks Pass Equal Authority — A prominent in-content link wrapped in coherent supporting text passes more authority than a footer-stuffed sitewide link, even if both point at the same URL. Earned editorial links inside the body of relevant content outperform mass-placed boilerplate links by a wide margin in weighted PageRank terms.
- Anchor Text Matters Mechanically — Descriptive, query-relevant anchor text increases the per-link click probability the model assigns. That higher probability translates directly into a larger share of the source page's PageRank reaching your page. Generic anchors and bare-URL anchors leave authority on the table.
- Position On The Linking Page Matters — Above-the-fold and in-body links outweigh footer, sidebar, and related-widget links because the click-probability model gives them more weight. When negotiating placements or earning editorial mentions, the position of the link on the source page is part of the value the link carries.
- Link Styling And Surrounding Text Matter — A link wrapped in coherent context with explanatory surrounding sentences reads as deliberate editorial intent, which raises its weight. A link inside a generic blogroll or untouched template block reads as low-intent and is damped accordingly. Context is part of the link, not decoration around it.
- Link-Type Attributes Are Read As Signal — Attributes including rel values, target settings, and link type feed into the per-link weight. The nofollow, sponsored, and ugc markers all postdate the June 2004 filing (nofollow arrived in 2005, sponsored and ugc in 2019), but they express the same kind of editorial intent the reasonable-surfer model is designed to account for when computing how much authority flows through a link.
- Earn Links A Real Reader Would Click — The model is calibrated against how actual humans scan and click pages. Links that earn real reader attention earn weighted PageRank. SEO that chases raw link counts without considering whether anyone would ever click those links is optimising for a signal the system does not use.
- Anderson Is First Inventor On The Earliest Filing — This patent, filed in 2004 and granted in 2010, is the original Reasonable Surfer document, with Corin Anderson as first inventor and Jeffrey A. Dean and Alexis Battle as co-inventors. The 2012 continuation (US 8,117,209) and 2016 continuation (US 9,305,099) refine the model further, but the core insight that link weight depends on per-link features traces to this foundational filing.