Monika Henzinger, Google Search Patents

By NizamUdDeen · Updated January 1, 2026 · Reviewed by the Nizam SEO War Room editorial team.


This track collects 98 search and information-retrieval (IR) patents by Monika Henzinger, the first Director of Research at Google. She is a co-inventor on the foundational near-duplicate-page-detection patent (US 6,138,113, with Dean) and on an extensive set of document-scoring families (Query Analysis, Content Update, Inception Date, Link-Based Criteria, Historical Data), all cross-listed with Dean's canonical articles. Her independent contributions include the detecting-duplicate-files family with William Pugh, the AltaVista connectivity server, the connectivity-and-content-ranking patents, document-freshness determination, semantic-distance ranking, query-semantic-information ranking, anchor-text cross-language IR, in-context searching, the hypertext browser assistant, and usage-statistics-driven document retrieval. The track spans 1997 to 2017.

About the Monika Henzinger, Google Search Patents track

Near-Duplicate Detection

Method for Identifying Near Duplicate Pages in a Hyperlinked Database (US 6,138,113 · October 24, 2000)
Detecting Duplicate and Near-Duplicate Files (US 6,658,423 · December 2, 2003)
Detecting Duplicate and Near-Duplicate Files (2008) (US 7,366,718 · April 29, 2008)
Detecting Duplicate and Near-Duplicate Files (2011) (US 8,015,162 · September 6, 2011)
Detecting Duplicate and Near-Duplicate Files (2016) (US 9,275,143 · March 1, 2016)
System and Method for Near-Uniform Sampling of Web Page Addresses (US 6,594,694 · July 15, 2003)

Connectivity, Link Analysis & Ranking

Connectivity Server for Locating Linkage Information Between Web Pages (US 6,073,135 · June 6, 2000)
Method for Ranking Hyperlinked Pages Using Content and Connectivity Analysis (US 6,738,678 · May 18, 2004)
Method for Ranking Documents in a Hyperlinked Environment Using Connectivity and Selective Content Analysis (US 6,112,203 · August 29, 2000)
Method for Ranking Hyperlinked Pages (continuation 2006) (US 7,117,206 · October 3, 2006)
Method for Identifying Related Pages in a Hyperlinked Database (US 6,665,837 · December 16, 2003)
Method for Identifying Related Pages (2009) (US 7,630,973 · December 8, 2009)
Method and Apparatus for Finding Mirrored Hosts by Analyzing Connectivity and IP Addresses (US 6,487,555 · November 26, 2002)
Method and Apparatus for Finding Mirrored Hosts by Analyzing URLs (US 6,286,006 · September 4, 2001)
Method and Apparatus for Preventing Topic Drift in Queries in Hyperlinked Environments (US 6,321,220 · November 20, 2001)
Ranking Search Engine Results (US 7,451,388 · November 11, 2008)
Methods and Systems for Identifying Manipulated Articles (US 7,302,645 · November 27, 2007)
Identification of Web Sites That Contain Session Identifiers (US 7,886,217 · February 8, 2011)

Document Scoring & Freshness

Document Scoring Based on Query Analysis (US 8,051,071 · November 1, 2011)
Document Scoring Based on Document Content Update (US 8,112,426 · February 7, 2012)
Document Scoring Based on Document Inception Date (US 7,840,572 · November 23, 2010)
Document Scoring Based on Link-Based Criteria (US 8,407,231 · March 26, 2013)
Information Retrieval Based on Historical Data (US 7,346,839 · March 18, 2008)
Document Ranking Based on Document Classification (US 8,224,827 · July 17, 2012)
Systems and Methods for Determining Document Freshness (US 7,797,316 · September 14, 2010)
Systems and Methods for Determining Document Freshness (2011) (US 8,082,244 · December 20, 2011)
Systems and Methods for Determining Document Freshness (2013) (US 8,515,952 · August 20, 2013)
Document Ranking Based on Semantic Distance Between Terms in a Document (US 7,716,216 · May 11, 2010)
Document Ranking Based on Semantic Distance (2011) (US 8,060,501 · November 15, 2011)
Document Ranking Based on Semantic Distance (2013) (US 8,606,778 · December 10, 2013)
Search Queries Improved Based on Query Semantic Information (US 8,055,669 · November 8, 2011)
Search Queries Improved Based on Query Semantic Information (2013) (US 8,577,907 · November 5, 2013)

User-Facing Search Features

Systems and Methods for Using Anchor Text as Parallel Corpora for Cross-Language Information Retrieval (US 7,146,358 · December 5, 2006)
Anchor Text as Parallel Corpora (2010) (US 7,814,103 · October 12, 2010)
Anchor Text as Parallel Corpora (2014) (US 8,631,010 · January 14, 2014)
Systems and Methods for Performing In-Context Searching (US 7,305,380 · December 4, 2007)
In-Context Searching (2011) (US 7,962,469 · June 14, 2011)
In-Context Searching (2015) (US 9,111,000 · August 18, 2015)
In-Context Searching (2017) (US 9,665,650 · May 30, 2017)
Hypertext Browser Assistant (US 7,421,432 · September 2, 2008)
Hypertext Browser Assistant (2012) (US 8,316,016 · November 20, 2012)
Hypertext Browser Assistant (2013) (US 8,560,564 · October 15, 2013)
Methods and Apparatus for Employing Usage Statistics in Document Retrieval (US 8,001,118 · August 16, 2011)
Usage Statistics Document Retrieval (2012) (US 8,156,100 · April 10, 2012)
Usage Statistics Document Retrieval (2013) (US 8,352,452 · January 8, 2013)
Voice Interface for a Search Engine (US 7,027,987 · April 11, 2006)
Voice Search Engine Interface for Scoring Search Hypotheses (US 8,768,700 · July 1, 2014)
Ranking Video Articles (US 7,933,338 · April 26, 2011)
Finding Web Pages Relevant to Multimedia Streams (US 8,868,543 · October 21, 2014)
Methods and Systems for Improving Search Rankings Using Advertising Data (US 8,676,790 · March 18, 2014)

Why this inventor matters

Each inventor track inside the Nizam SEO War Room patents archive isolates one engineer's research arc — typically a decade or more of continuations, divisionals, and follow-up patents on a coherent research thread. Reading by inventor (rather than by topic) recovers the narrative: how the original disclosure evolved, what the continuations added, which claims got carved out into divisional applications, and how the thread eventually intersected with other research lines at Google or Microsoft. This is how working SEOs build durable intuition about search-engine internals — not by memorizing claim language, but by following the research bibliography that shipped the algorithms we now optimize against.

How to read this track

Start with the earliest filing — it sets the foundational disclosure. Continuations refine the claims; divisional applications split out separable inventions; the follow-up patents tend to introduce performance optimizations, edge-case handling, or downstream integration with other systems. Each patent on this site is annotated with the ranking surface it touches — query understanding, document retrieval, ranking, behavioral signals, knowledge graph, or AI search — so the practitioner can map the research back to the algorithm output observed on live SERPs.
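The reading order described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the archive's tooling: the `FAMILY` list and the `reading_order` helper are hypothetical names, the entries are copied from the Near-Duplicate Detection list on this page, and the issue date is used as a stand-in for filing order because filing dates are not listed here.

```python
from datetime import datetime

# Entries copied from the Near-Duplicate Detection list on this page:
# (title, patent number, issue date). Deliberately out of order.
FAMILY = [
    ("Detecting Duplicate and Near-Duplicate Files (2016)", "US 9,275,143", "March 1, 2016"),
    ("Detecting Duplicate and Near-Duplicate Files", "US 6,658,423", "December 2, 2003"),
    ("Detecting Duplicate and Near-Duplicate Files (2011)", "US 8,015,162", "September 6, 2011"),
    ("Detecting Duplicate and Near-Duplicate Files (2008)", "US 7,366,718", "April 29, 2008"),
]


def reading_order(entries):
    """Return entries sorted oldest first by issue date.

    Assumption: issue date approximates filing order within one family.
    """
    return sorted(entries, key=lambda e: datetime.strptime(e[2], "%B %d, %Y"))


# Print the family in the suggested reading order, earliest patent first.
for title, number, issued in reading_order(FAMILY):
    print(f"{issued}: {number} {title}")
```

The first entry printed is the earliest patent in the family, which is the one to read for the foundational disclosure; the later entries are the continuations.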

For example, a working SEO consultant can turn to the Monika Henzinger patent track when diagnosing a ranking drop, planning a content calendar, or briefing a client on why a tactic shifted. The track is most useful when read alongside the surrounding entries in the encyclopedia and patents archive. The platform also connects it to live SERP data, so the theory carries through to execution.

In summary, the Monika Henzinger patent track matters because its patents bear directly on the signals that search engines and AI answer engines use to rank and surface results. The full article above covers the mechanism in depth, the patents it derives from, and the related encyclopedia entries to read next.
