About the Monika Henzinger, Google Search Patents track
98 search and IR patents by Monika Henzinger, the first Director of Research at Google. She is co-inventor on the foundational near-duplicate-page-detection patent (US 6,138,113, with Dean) and on an extensive set of document-scoring families (Query Analysis, Content Update, Inception Date, Link-Based Criteria, and Historical Data, all cross-listed with Dean's canonical articles). Independent contributions include the detecting-duplicate-files family with William Pugh, the AltaVista connectivity server, the connectivity-and-content-ranking patents, document-freshness determination, semantic-distance ranking, query-semantic-information ranking, anchor-text cross-language IR, in-context searching, the hypertext browser assistant, and usage-statistics-driven document retrieval. The track spans 1997 to 2017.
Near-Duplicate Detection
- Method for Identifying Near Duplicate Pages in a Hyperlinked Database (US 6,138,113 · October 24, 2000)
- Detecting Duplicate and Near-Duplicate Files (US 6,658,423 · December 2, 2003)
- Detecting Duplicate and Near-Duplicate Files (2008) (US 7,366,718 · April 29, 2008)
- Detecting Duplicate and Near-Duplicate Files (2011) (US 8,015,162 · September 6, 2011)
- Detecting Duplicate and Near-Duplicate Files (2016) (US 9,275,143 · March 1, 2016)
- System and Method for Near-Uniform Sampling of Web Page Addresses (US 6,594,694 · July 15, 2003)
Connectivity, Link Analysis & Ranking
- Connectivity Server for Locating Linkage Information Between Web Pages (US 6,073,135 · June 6, 2000)
- Method for Ranking Hyperlinked Pages Using Content and Connectivity Analysis (US 6,738,678 · May 18, 2004)
- Method for Ranking Documents in a Hyperlinked Environment Using Connectivity and Selective Content Analysis (US 6,112,203 · August 29, 2000)
- Method for Ranking Hyperlinked Pages (continuation 2006) (US 7,117,206 · October 3, 2006)
- Method for Identifying Related Pages in a Hyperlinked Database (US 6,665,837 · December 16, 2003)
- Method for Identifying Related Pages (2009) (US 7,630,973 · December 8, 2009)
- Method and Apparatus for Finding Mirrored Hosts by Analyzing Connectivity and IP Addresses (US 6,487,555 · November 26, 2002)
- Method and Apparatus for Finding Mirrored Hosts by Analyzing URLs (US 6,286,006 · September 4, 2001)
- Method and Apparatus for Preventing Topic Drift in Queries in Hyperlinked Environments (US 6,321,220 · November 20, 2001)
- Ranking Search Engine Results (US 7,451,388 · November 11, 2008)
- Methods and Systems for Identifying Manipulated Articles (US 7,302,645 · November 27, 2007)
- Identification of Web Sites That Contain Session Identifiers (US 7,886,217 · February 8, 2011)
Document Scoring & Freshness
- Document Scoring Based on Query Analysis (US 8,051,071 · November 1, 2011)
- Document Scoring Based on Document Content Update (US 8,112,426 · February 7, 2012)
- Document Scoring Based on Document Inception Date (US 7,840,572 · November 23, 2010)
- Document Scoring Based on Link-Based Criteria (US 8,407,231 · March 26, 2013)
- Information Retrieval Based on Historical Data (US 7,346,839 · March 18, 2008)
- Document Ranking Based on Document Classification (US 8,224,827 · July 17, 2012)
- Systems and Methods for Determining Document Freshness (US 7,797,316 · September 14, 2010)
- Systems and Methods for Determining Document Freshness (2011) (US 8,082,244 · December 20, 2011)
- Systems and Methods for Determining Document Freshness (2013) (US 8,515,952 · August 20, 2013)
- Document Ranking Based on Semantic Distance Between Terms in a Document (US 7,716,216 · May 11, 2010)
- Document Ranking Based on Semantic Distance (2011) (US 8,060,501 · November 15, 2011)
- Document Ranking Based on Semantic Distance (2013) (US 8,606,778 · December 10, 2013)
- Search Queries Improved Based on Query Semantic Information (US 8,055,669 · November 8, 2011)
- Search Queries Improved Based on Query Semantic Information (2013) (US 8,577,907 · November 5, 2013)
User-Facing Search Features
- Systems and Methods for Using Anchor Text as Parallel Corpora for Cross-Language Information Retrieval (US 7,146,358 · December 5, 2006)
- Anchor Text as Parallel Corpora (2010) (US 7,814,103 · October 12, 2010)
- Anchor Text as Parallel Corpora (2014) (US 8,631,010 · January 14, 2014)
- Systems and Methods for Performing In-Context Searching (US 7,305,380 · December 4, 2007)
- In-Context Searching (2011) (US 7,962,469 · June 14, 2011)
- In-Context Searching (2015) (US 9,111,000 · August 18, 2015)
- In-Context Searching (2017) (US 9,665,650 · May 30, 2017)
- Hypertext Browser Assistant (US 7,421,432 · September 2, 2008)
- Hypertext Browser Assistant (2012) (US 8,316,016 · November 20, 2012)
- Hypertext Browser Assistant (2013) (US 8,560,564 · October 15, 2013)
- Methods and Apparatus for Employing Usage Statistics in Document Retrieval (US 8,001,118 · August 16, 2011)
- Usage Statistics Document Retrieval (2012) (US 8,156,100 · April 10, 2012)
- Usage Statistics Document Retrieval (2013) (US 8,352,452 · January 8, 2013)
- Voice Interface for a Search Engine (US 7,027,987 · April 11, 2006)
- Voice Search Engine Interface for Scoring Search Hypotheses (US 8,768,700 · July 1, 2014)
- Ranking Video Articles (US 7,933,338 · April 26, 2011)
- Finding Web Pages Relevant to Multimedia Streams (US 8,868,543 · October 21, 2014)
- Methods and Systems for Improving Search Rankings Using Advertising Data (US 8,676,790 · March 18, 2014)
Why this inventor matters
Each inventor track in the Nizam SEO War Room patents archive isolates one engineer's research arc: typically a decade or more of continuations, divisionals, and follow-up patents on a coherent research thread. Reading by inventor rather than by topic recovers the narrative: how the original disclosure evolved, what the continuations added, which claims were carved out into divisional applications, and how the thread eventually intersected with other research lines at Google or Microsoft. This is how working SEOs build durable intuition about search-engine internals: not by memorizing claim language, but by following the research record behind the algorithms we now optimize against.
How to read this track
Start with the earliest filing, which sets the foundational disclosure. Continuations refine the claims, divisional applications split out separable inventions, and follow-up patents tend to introduce performance optimizations, edge-case handling, or downstream integration with other systems. Each patent on this site is annotated with the ranking surface it touches (query understanding, document retrieval, ranking, behavioral signals, knowledge graph, or AI search) so that practitioners can map the research back to the output observed on live SERPs.
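The reading order described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the archive: the `Patent` record, the `family` labels, and the `reading_order` helper are hypothetical names, the five entries are copied from the listing above, and grant dates stand in for filing dates because the listing gives only grant dates.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby


@dataclass(frozen=True)
class Patent:
    family: str    # hypothetical label tying a parent patent to its continuations
    number: str
    granted: date  # grant date from the listing; used here as a proxy for filing order


# A few entries copied from the listing, deliberately out of order.
PATENTS = [
    Patent("Detecting Duplicate and Near-Duplicate Files", "US 8,015,162", date(2011, 9, 6)),
    Patent("Detecting Duplicate and Near-Duplicate Files", "US 6,658,423", date(2003, 12, 2)),
    Patent("Detecting Duplicate and Near-Duplicate Files", "US 7,366,718", date(2008, 4, 29)),
    Patent("In-Context Searching", "US 9,111,000", date(2015, 8, 18)),
    Patent("In-Context Searching", "US 7,305,380", date(2007, 12, 4)),
]


def reading_order(patents):
    """Group patents by family and list each family earliest grant first."""
    # groupby only merges adjacent items, so sort by family (then date) first.
    ordered = sorted(patents, key=lambda p: (p.family, p.granted))
    return {
        family: list(members)
        for family, members in groupby(ordered, key=lambda p: p.family)
    }


if __name__ == "__main__":
    for family, members in reading_order(PATENTS).items():
        print(family)
        for patent in members:
            print(f"  {patent.granted.isoformat()}  {patent.number}")
```

Grouping by family before sorting by date keeps each parent patent next to its continuations, which is the point of reading a track by research thread rather than as one flat chronological list.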