25 search and information retrieval (IR) patents and patent applications by Marc Najork, covering his Microsoft Research era (web crawler architecture, spam-resistant ranking, content evaluation, hyperlink databases) and his Google era (form information extraction, ML-driven semantic similarity, active learning, document activity logs for relevance training). One foundational entry, on document activity logs, is cross-listed with the 65 Google Patents collection. The publications span 2005 to 2025.
About the Marc Najork, Google Search Patents track
Modern ML & Information Extraction
- System for Information Extraction from Form-Like Documents (US 11,393,233 · July 19, 2022)
- System for Information Extraction from Form-Like Documents (continuation 2023) (US 11,830,269 · November 28, 2023)
- System for Information Extraction from Form-Like Documents (continuation 2024) (US 12,354,396 · July 8, 2025)
- System for Information Extraction from Form-Like Documents (app 2021) (US App 2021/0374395 · December 2, 2021)
- System for Information Extraction from Form-Like Documents (app 2022) (US App 2022/0375245 · November 24, 2022)
- System for Information Extraction from Form-Like Documents (app 2024) (US App 2024/0046684 · February 8, 2024)
- System for Information Extraction from Form-Like Documents (app 2025) (US App 2025/0308277 · October 2, 2025)
- Systems and Methods for Machine-Learned Prediction of Semantic Similarity Between Documents (US 11,694,034 · July 4, 2023)
- Machine-Learned Semantic Similarity (continuation 2025) (US 12,210,837 · January 28, 2025)
- Machine-Learned Semantic Similarity (app 2022) (US App 2022/0129638 · April 28, 2022)
- Machine-Learned Semantic Similarity (app 2023) (US App 2023/0297783 · September 21, 2023)
- Machine-Learned Semantic Similarity (app 2025) (US App 2025/0209277 · June 26, 2025)
- Systems and Methods for Active Learning (US 11,526,752 · December 13, 2022)
- Systems and Methods for Active Learning (app) (US App 2020/0250527 · August 6, 2020)
- Document Activity Logs for Machine Learning (US App 2023/0267277 · August 24, 2023)
Ranking & Quality Signals
- Domain-Based Spam-Resistant Ranking (US App 2007/0067282 · March 22, 2007)
- Systems and Methods for Ranking Documents Based Upon Structurally Interrelated Information (US App 2005/0060297 · March 17, 2005)
- Content Evaluation (US App 2006/0069667 · March 30, 2006)
- Social Network Recommended Content and Recommending Members for Personalized Search Results (US 8,949,232 · February 3, 2015)
- Social Network Recommended Content (app) (US App 2013/0086057 · April 4, 2013)
Content & URL Detection
- Using Content Analysis to Detect Spam Web Pages (US App 2006/0184500 · August 17, 2006)
- Systems and Methods for Inferring URL Normalization Rules (US App 2006/0218143 · September 28, 2006)
- Automatically Creating Training Data For Language Identifiers (US App 2015/0006148 · January 1, 2015)
Search Infrastructure
- Incremental Update Scheme for Hyperlink Database (US App 2007/0250480 · October 25, 2007)
- Fault Tolerance Scheme for Distributed Hyperlink Database (US App 2007/0220064 · September 20, 2007)
Why this inventor matters
Each inventor track in the Nizam SEO War Room patents archive isolates one engineer's research arc, typically a decade or more of continuations, divisionals, and follow-up patents on a coherent research thread. Reading by inventor rather than by topic recovers the narrative: how the original disclosure evolved, what the continuations added, which claims were carved out into divisional applications, and how the thread eventually intersected with other research lines at Google or Microsoft. This is how working SEOs build durable intuition about search-engine internals: not by memorizing claim language, but by following the research trail behind the algorithms we now optimize against.
How to read this track
Start with the earliest filing, which sets the foundational disclosure. Continuations refine the claims, divisional applications split out separable inventions, and follow-up patents tend to introduce performance optimizations, edge-case handling, or downstream integration with other systems. Each patent on this site is annotated with the ranking surface it touches (query understanding, document retrieval, ranking, behavioral signals, knowledge graph, or AI search) so that practitioners can map the research back to the algorithm output observed on live SERPs.
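
The "earliest first" reading order can be sketched in a few lines of Python. This is a minimal illustration, not part of the archive: the names `FORM_EXTRACTION_FAMILY` and `reading_order` are hypothetical, the entries are copied from the form-extraction list on this page, and the dates are the publication or grant dates shown here, used as a stand-in because filing dates are not listed.

```python
from datetime import datetime

# Entries copied from the "Modern ML & Information Extraction" list on this page:
# (kind, number, date as listed). The dates are publication/grant dates, which
# only approximate filing order (an assumption of this sketch).
FORM_EXTRACTION_FAMILY = [
    ("grant", "US 11,393,233", "July 19, 2022"),
    ("continuation", "US 11,830,269", "November 28, 2023"),
    ("continuation", "US 12,354,396", "July 8, 2025"),
    ("application", "US App 2021/0374395", "December 2, 2021"),
    ("application", "US App 2022/0375245", "November 24, 2022"),
    ("application", "US App 2024/0046684", "February 8, 2024"),
    ("application", "US App 2025/0308277", "October 2, 2025"),
]


def reading_order(entries):
    """Return the entries sorted oldest first by their listed date."""
    return sorted(entries, key=lambda e: datetime.strptime(e[2], "%B %d, %Y"))


# Print one patent family in suggested reading order.
for kind, number, date in reading_order(FORM_EXTRACTION_FAMILY):
    print(f"{date:>18}  {number}  ({kind})")
```

Sorting a single family this way puts the 2021 application first and the 2025 application last, so each continuation can be read against the disclosure that preceded it.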