The AltaVista ranking patent ranks linked-database search results using linear combinations of matrices and eigenvector analysis. It incorporates attractor and non-attractor matrices, probability weighting, co-citation, and bibliographic coupling: an early multi-matrix ranking primitive, filed in 2000 while PageRank was still new.
Patent Overview
- Inventor: Andrei Z. Broder
- Assignee: Overture Services Inc
- Filed: 2000-10-25
- Granted: 2003-05-06
The Challenge
Web-page ranking from linked-database searches needs to combine multiple link-graph signals. PageRank captures global authority; co-citation captures topical proximity; bibliographic coupling captures shared-reference topical signal. Combining these into a coherent ranking is the problem.
- Single-Signal Ranking Misses Dimensions — PageRank alone misses topical proximity; co-citation alone misses authority. Multi-signal combination required.
- Matrix Algebra Captures Signal Composition — Each signal is a matrix over the link graph. Linear combinations capture multi-signal integration.
- Attractor / Non-Attractor Bias Adjusts Ranking — Desirable sites (attractors) and undesirable sites (non-attractors) inject bias into ranking via matrix construction.
- Eigenvector Analysis Yields Stable Rankings — Dominant-eigenvector analysis of the combined matrix produces stable ranking signals.
- Quality Probability Weights Refine Combination — Per-signal probability weights tune the matrix combination toward high-quality content.
Innovation
How The System Works
The system constructs multiple matrices over the link graph (link-presence, co-citation, bibliographic coupling, attractor, non-attractor), combines them linearly with probability weights, runs eigenvector analysis, and produces ranked results.
- Build Link-Graph Matrices — Construct per-page-pair matrices for link presence, co-citation, bibliographic coupling.
- Build Attractor / Non-Attractor Matrices — Per attractor (desirable) and non-attractor (undesirable) sites, construct bias matrices.
- Apply Probability Weights — Per matrix, apply quality probability weights to bias toward high-quality content.
- Linearly Combine — Combine matrices via weighted linear sum.
- Run Eigenvector Analysis — Compute dominant eigenvector of combined matrix to yield per-page rank score.
- Rank Results — Per-page rank scores sort results.
- Recalibrate Periodically — Matrix weights and attractor sets refresh against fresh corpus.
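The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `combine` and `dominant_eigenvector`, the toy three-page graph, the uniform stand-in matrix, and the 0.85/0.15 weights are all hypothetical.

```python
def combine(matrices, weights):
    """Weighted linear sum of equally sized square matrices."""
    n = len(matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                combined[i][j] += weight * matrix[i][j]
    return combined


def dominant_eigenvector(matrix, iterations=100):
    """Power iteration: repeatedly apply the matrix and renormalize to sum 1."""
    n = len(matrix)
    vec = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        vec = [x / total for x in nxt]
    return vec


# Toy 3-page graph (assumed data); links[i][j] = 1 means page i links to page j.
links = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
n = len(links)

# Scores flow along links, so iterate on the transpose (links received).
votes = [[links[j][i] for j in range(n)] for i in range(n)]

# Hypothetical second signal: a uniform matrix standing in for a bias term.
uniform = [[1.0 / n] * n for _ in range(n)]

combined = combine([votes, uniform], weights=[0.85, 0.15])
scores = dominant_eigenvector(combined)

# Sort page indices by descending score to get the ranking.
ranking = sorted(range(n), key=lambda page: scores[page], reverse=True)
```

With these toy inputs, page 2 ranks first because it receives links from both other pages. A real system would add the co-citation, coupling, and attractor matrices to the list passed to `combine`.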
Multi-Matrix Eigenvector Ranking
The patent's load-bearing idea is that web-page ranking is a multi-matrix eigenvector problem. Combining link-presence, co-citation, bibliographic coupling, and quality-bias matrices yields ranking signals that a single link matrix, as in basic PageRank, does not capture.
Each Matrix Captures A Dimension
Link-presence captures direct linking; co-citation captures topical proximity; bibliographic coupling captures shared-reference topical signal; attractor/non-attractor capture quality bias. Each dimension is its own matrix; combination is the architectural primitive.
- Multi-Matrix Construction — Per signal type, matrix constructed over link graph.
- Probability-Weighted Combination — Per matrix, weight applied; linear combination yields composite.
- Eigenvector Ranking — Dominant eigenvector of composite matrix produces ranking scores.
Technical Foundation
The patent specifies the matrix builders, weight applier, combiner, eigenvector analyzer, ranker, and recalibration loop.
- Link-Presence Matrix Builder — Constructs per-pair matrix from direct links.
- Co-Citation Matrix Builder — Constructs co-citation matrix from shared inbound-link patterns.
- Bibliographic-Coupling Matrix Builder — Constructs coupling matrix from shared outbound-link patterns.
- Attractor/Non-Attractor Matrix Builder — Per attractor and non-attractor site, constructs bias matrix.
- Combiner — Weighted linear combination of matrices.
- Eigenvector Analyzer — Dominant eigenvector of combined matrix produces ranking scores.
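As an illustration of the two matrix builders that work from shared link patterns, the sketch below uses the standard bibliometric definitions: co-citation counts the pages that link to both members of a pair, and bibliographic coupling counts the pages that both members link to. The patent's exact construction may differ, and the function names and toy graph are assumptions made for this example.

```python
def cocitation(links):
    """cocitation[i][j] = number of pages that link to both i and j."""
    n = len(links)
    return [
        [sum(links[k][i] * links[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def coupling(links):
    """coupling[i][j] = number of pages that both i and j link to."""
    n = len(links)
    return [
        [sum(links[i][k] * links[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Toy 4-page graph (assumed data); links[i][j] = 1 means page i links to page j.
links = [
    [0, 0, 1, 1],  # page 0 links to pages 2 and 3
    [0, 0, 1, 1],  # page 1 links to pages 2 and 3
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

cc = cocitation(links)  # pages 2 and 3 are co-cited by pages 0 and 1
bc = coupling(links)    # pages 0 and 1 are coupled through pages 2 and 3
```

In matrix terms, with adjacency matrix A, co-citation is the product of A's transpose with A, and coupling is the product of A with its transpose. The diagonals therefore hold in-degree and out-degree respectively.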
The Process
Matrix construction runs at indexing; eigenvector analysis runs as a batch.
- Crawl Link Graph — Crawler updates link graph.
- Build Matrices — Per signal type, matrix constructed.
- Define Attractors / Non-Attractors — Per known desirable/undesirable sites, bias matrices constructed.
- Apply Weights — Per matrix, probability weight applied.
- Combine — Linear combination yields composite matrix.
- Eigenvector Analysis — Dominant eigenvector computed.
- Rank Results — Per-page scores sort SERP.
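The summary above does not say how the attractor and non-attractor bias matrices are built, so the following is only one plausible construction, stated as an assumption: each bias matrix is column-stochastic and sends every page's weight evenly to a target set. For attractors the targets are the attractor pages; for non-attractors the targets are all pages outside the non-attractor set. The name `bias_matrix` and the page sets are hypothetical.

```python
def bias_matrix(n, targets):
    """Column-stochastic matrix: every column spreads weight evenly over `targets`."""
    share = 1.0 / len(targets)
    column = [share if page in targets else 0.0 for page in range(n)]
    # Entry [i][j] is the weight page j passes to page i; it is the same for every j.
    return [[column[i]] * n for i in range(n)]


n = 4
attractors = {0}       # hypothetical curated "desirable" pages
non_attractors = {3}   # hypothetical curated "undesirable" pages

# Bias toward attractors: all weight flows to page 0.
toward_attractors = bias_matrix(n, attractors)

# Bias away from non-attractors: weight flows to every page except page 3.
away_from_non_attractors = bias_matrix(n, set(range(n)) - non_attractors)
```

Because both matrices have non-negative entries and columns that sum to 1, they can join a weighted linear combination with non-negative weights without introducing negative entries into the composite matrix.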
Quality Control
Multi-matrix ranking quality depends on weight calibration and matrix validity. The patent specifies safeguards.
- Weight Calibration — Per-matrix weights calibrated against labeled relevance data.
- Attractor / Non-Attractor Curation — Bias-set membership curated against quality criteria.
- Eigenvector Convergence — Eigenvector computation must converge. Non-convergence triggers tuning.
- Matrix Sparsity Handling — Sparse matrices require efficient eigenvector methods.
- Continuous Recalibration — Weights and attractor sets refresh against fresh data.
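The convergence and sparsity safeguards can be illustrated with a power iteration that stores only non-zero entries and reports whether it converged. This is a sketch under assumptions: the dict-of-dicts layout, the function name, the tolerance, and the PageRank-style damping term (used here as a stand-in for a uniform bias matrix) are not taken from the patent.

```python
def sparse_power_iteration(inbound, n, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration over a sparse matrix.

    inbound[i] maps source page j to the weight of the j -> i entry;
    missing entries are treated as zero. Returns (scores, steps, converged).
    """
    vec = [1.0 / n] * n
    for step in range(1, max_iter + 1):
        nxt = []
        for i in range(n):
            # Only the stored non-zero entries of row i are visited.
            linked = sum(w * vec[j] for j, w in inbound.get(i, {}).items())
            nxt.append(damping * linked + (1.0 - damping) / n)
        total = sum(nxt)
        nxt = [x / total for x in nxt]
        # L1 distance between successive iterates is the convergence measure.
        change = sum(abs(a - b) for a, b in zip(nxt, vec))
        vec = nxt
        if change < tol:
            return vec, step, True
    return vec, max_iter, False


# Toy column-stochastic graph (assumed data):
# page 0 splits its weight between pages 1 and 2, page 1 -> 2, page 2 -> 0.
inbound = {
    0: {2: 1.0},
    1: {0: 0.5},
    2: {0: 0.5, 1: 1.0},
}

scores, steps, converged = sparse_power_iteration(inbound, n=3)
```

A caller can treat `converged == False` as the signal to retune, for example by raising `max_iter`, loosening `tol`, or revisiting the matrix weights.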
Real-World Application
Broder's AltaVista ranking patent prefigures modern multi-signal ranking. Its matrix-combination approach anticipates later multi-signal ranking systems, even as PageRank dominated public attention.
- Multi-matrix Signal Integration: Each signal type gets its own matrix, and a linear combination integrates them.
- Probability-weighted Tuning Knob: Per-matrix weights tune how much each signal contributes.
- Eigenvector-based Ranking Method: The dominant eigenvector of the combined matrix yields the ranking scores.
Why Multi-Signal Link Analysis Wins
Direct links alone miss topical proximity. Co-citation and bibliographic coupling add dimensions PageRank doesn't capture. Multi-signal ranking is the structural pattern even in PageRank-descendant systems.
Why Earning Co-Citations Compounds
When third parties link to your content alongside other authoritative resources on the topic, co-citation signal accumulates. Earning these co-citations compounds across the multi-matrix ranking dimensions.
What This Means for SEO
This AltaVista patent ranks pages by combining link-presence, co-citation, bibliographic-coupling, and quality-bias matrices through eigenvector analysis. SEO implication: link authority has always been multi-dimensional, so earning topical co-citations matters alongside raw inbound links.
- Direct Links Are Only One Dimension — The model treats link presence as just one matrix among several. Chasing inbound-link count alone misses co-citation and coupling signals that the combined ranking also weighs.
- Co-Citation Builds Topical Authority — When third parties link to you alongside other authoritative resources on the same topic, the co-citation matrix accumulates signal. Getting mentioned in the same context as recognized authorities compounds your standing.
- Bibliographic Coupling Rewards Shared References — Pages that cite the same authoritative sources you cite are coupled to you in the link graph. Referencing the canonical sources in your field positions you within the right topical neighborhood.
- Attractor And Non-Attractor Sets Inject Bias — The system constructs explicit desirable and undesirable site sets that bias ranking. Associating with the undesirable set, through bad neighborhoods or spammy link patterns, applies negative bias rather than neutral treatment.
- Quality Weights Tune Each Signal — Per-matrix probability weights bias the combination toward high-quality content. A single high-quality link source contributes more than many low-quality ones because the weighting is built into the math.
- Stable Authority Beats Spikes — Eigenvector analysis produces stable rankings from the whole graph structure. Authority emerges from durable position in the link graph, not from short-lived bursts of links.
- Multi-Signal Link Earning Is The Strategy — Because no single matrix dominates, a link-earning program that produces direct links, co-citations, and shared-reference coupling together outperforms one optimizing only for link count.