Interactive system for building a curated information database where users designate preferred-authority sources for each topic, predating and informing modern knowledge-graph editorial curation.
Patent Overview
- Inventor: Prabhakar Raghavan
- Assignee: IBM Corporation
- Filed: 1998-08-29
- Granted: 2002-01-01
- Application Number: US 09/143,683
The Challenge
Curation Cannot Be Fully Automatic
Purely algorithmic ranking misses something that domain experts hold: knowledge of which sources are authoritative, which an algorithm cannot infer from graph structure or content alone. A medical professional knows which medical sources are trustworthy; a journalist knows which news sources have integrity. The system therefore needs to combine algorithmic ranking with explicit, human-curated preferred-authority designations, in an interactive framework that lets users build and refine the curation over time.
- Algorithmic Ranking Has Blind Spots — Algorithms rank by graph and content patterns. They miss editorial judgment about which sources are actually trustworthy for a topic. The blind spot is largest in expertise-heavy domains.
- Domain Experts Have Knowledge The Algorithm Lacks — Subject-matter experts can identify preferred-authority sources directly. Their input adds value the algorithm cannot derive on its own.
- Curation Must Be Interactive — A static editorial database goes stale. The curation system needs to support interactive creation, modification, and refinement of preferred-authority designations as the user's understanding evolves.
- Need Frame-Based Organization — Information is best organized in frames (slots and fillers) rather than flat lists. A topical frame holds preferred authorities, related topics, key concepts, and so on. The system supports frame-based curation explicitly.
- Hierarchy Plus Filters Plus Rankings — The curated database supports hierarchical organization, filters by attribute, and ranking by user-defined criteria. All three need to work together for the curation to be useful at query time.
Innovation
Interactive Frame-Based Curation Database
The patent describes a method by which users interactively create an information database structured as a frame-based hierarchy. In each frame, the user designates preferred-authority elements (canonical sources for the topic). The system supports cataloging, filtering, and relevance ranking over the curated database, blending algorithmic retrieval with human-curated authority designations.
- Define Frame Structure — Set up the frame-based organizational structure: top-level frames represent broad topics, nested frames represent sub-topics, leaf frames hold preferred-authority items.
- Catalog Information Elements — Add information elements (URLs, documents, references) to the appropriate frames. Each element carries metadata (source, type, last-verified-date).
- Designate Preferred Authorities — Within each frame, mark certain elements as preferred-authority. These are the canonical sources the user trusts for that topic.
- Apply Filters — Filter elements by attribute (source type, recency, format). Filters can be saved and reused across queries.
- Apply Relevance Ranking — Within a frame, rank elements by relevance to the current query or attention focus. The ranking combines the curated authority designation with content-based relevance.
- Iterate And Refine — The user revises the database interactively: adds new preferred authorities, demotes outdated ones, reorganizes frames, refines filters. The curation evolves with the user's understanding.
- Use At Query Time — When the user queries the database, the system returns preferred-authority elements first, then filtered and ranked supporting material. The curation drives the query response.
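The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the class names `Frame` and `Element`, the `source_type` field, and the example URLs are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """An information element; the field names are illustrative."""
    url: str
    source_type: str
    preferred: bool = False  # the user-set preferred-authority mark


@dataclass
class Frame:
    """A topical frame holding elements and nested sub-frames."""
    name: str
    elements: list[Element] = field(default_factory=list)
    children: dict[str, "Frame"] = field(default_factory=dict)

    def subframe(self, name: str) -> "Frame":
        # Return the nested frame, creating it on first use.
        return self.children.setdefault(name, Frame(name))

    def catalog(self, element: Element) -> Element:
        self.elements.append(element)
        return element

    def query(self, source_type: Optional[str] = None) -> list[Element]:
        # Apply the optional attribute filter first.
        hits = [e for e in self.elements
                if source_type is None or e.source_type == source_type]
        # sorted() is stable: preferred authorities come first and
        # catalog order is kept within each group.
        return sorted(hits, key=lambda e: not e.preferred)


root = Frame("medicine")
cardiology = root.subframe("cardiology")
cardiology.catalog(Element("https://example.org/blog-post", "blog"))
guideline = cardiology.catalog(Element("https://example.org/guideline", "guideline"))
cardiology.catalog(Element("https://example.org/review", "journal"))

guideline.preferred = True  # designate a preferred authority
results = [e.url for e in cardiology.query()]
```

Here `results` lists the guideline first, followed by the two unmarked elements in the order they were catalogued. Flipping `preferred` back to `False` is the "demote" half of the iterate-and-refine step.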
Human-Curated Authority Within Algorithmic Retrieval
The contribution is the explicit recognition that human curation belongs in the information system, not separate from it. The frame-based structure plus preferred-authority designations integrate human judgment into algorithmic retrieval in a way that scales with user input rather than requiring exhaustive labeling.
Curation Augments the Algorithm, It Doesn't Replace It
Preferred-authority designations are signals the algorithm reads alongside graph and content features. The curated database is consulted by the retrieval system; both contribute to the user's results.
- Frame-Based Organization — Information lives in nested frames that mirror topical structure. Each frame is a curation unit with its own preferred authorities and filters.
- Preferred-Authority Designation — User-marked elements that are canonical sources for their frame. The marking is explicit and persists across queries.
- Interactive Refinement — The curation is built interactively. Users add, remove, reorganize, and adjust the database as their understanding of the domain evolves.
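One way to picture the curated mark as a signal read alongside content features is a weighted sum. The sketch below is an assumption for illustration: the source does not give a scoring formula, and `content_relevance`, `blended_score`, and `authority_weight` are hypothetical names, with a toy term-overlap measure standing in for real content-based relevance.

```python
def content_relevance(query: str, text: str) -> float:
    """Fraction of query terms found in the text (a toy relevance measure)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def blended_score(query: str, text: str, preferred: bool,
                  authority_weight: float = 0.4) -> float:
    """Weighted sum of the curated mark and the content score (assumed form)."""
    authority = 1.0 if preferred else 0.0
    return (authority_weight * authority
            + (1.0 - authority_weight) * content_relevance(query, text))


# (label, text, preferred-authority mark); the documents are made up.
docs = [
    ("forum-thread", "statin dosage questions and statin side effects", False),
    ("society-guideline", "statin therapy guideline", True),
]
ranked = sorted(docs, key=lambda d: blended_score("statin dosage", d[1], d[2]),
                reverse=True)
```

With these inputs the preferred guideline ranks above the forum thread even though the thread matches more query terms. Because the mark is only one term in the sum, a preferred source with no content match would still rank below a strong unmarked match, which is the sense in which curation augments rather than replaces the algorithm.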
Knowledge graphs at modern search engines can be read as conceptual descendants of this curation primitive.
Technical Foundation
Database Components
The curated database has structure that the retrieval system consults at query time.
- Frame — A topical unit holding information elements and metadata. Frames nest hierarchically to represent topical relationships.
- Information Element — An indexed resource (URL, document, reference) placed in one or more frames. Each element carries metadata.
- Preferred-Authority Mark — A boolean (or graded) flag on an element indicating it is a canonical source for its frame. Marks are user-set.
- Filter And Ranking Rules — Per-frame filtering and ranking specifications that the retrieval system applies when querying the frame.
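The four components can be modelled roughly as follows. This is a sketch under assumptions: `InfoElement`, `TopicFrame`, the metadata fields, and the "highest mark first" ranking rule are hypothetical choices, not details taken from the patent. The mark is stored per frame rather than on the element, since one element may sit in several frames with different standing in each.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class InfoElement:
    """An indexed resource plus metadata; the fields are made up."""
    url: str
    source_type: str
    last_verified: str  # ISO date string, so string comparison orders dates


@dataclass
class TopicFrame:
    name: str
    elements: list[InfoElement] = field(default_factory=list)
    # Graded mark per element in this frame: 0.0 unmarked, 1.0 canonical.
    marks: dict[str, float] = field(default_factory=dict)
    # Saved, named filters that can be reused across queries.
    filters: dict[str, Callable[[InfoElement], bool]] = field(default_factory=dict)

    def run(self, filter_name: str) -> list[InfoElement]:
        keep = self.filters[filter_name]
        kept = [e for e in self.elements if keep(e)]
        # Ranking rule for this sketch: highest mark first.
        return sorted(kept, key=lambda e: self.marks.get(e.url, 0.0),
                      reverse=True)


handbook = InfoElement("https://example.org/handbook", "reference", "2001-06-01")
news = InfoElement("https://example.org/news", "news", "2001-11-15")
old = InfoElement("https://example.org/archive", "reference", "1997-02-10")

cardiology = TopicFrame("cardiology", [news, old, handbook])
patient_ed = TopicFrame("patient-education", [handbook, news])

cardiology.marks[handbook.url] = 1.0   # canonical in one frame
patient_ed.marks[handbook.url] = 0.5   # weaker mark in another frame
cardiology.filters["recent"] = lambda e: e.last_verified >= "2000-01-01"

recent = [e.url for e in cardiology.run("recent")]
```

Running the saved `"recent"` filter on the cardiology frame drops the 1997 archive entry and returns the handbook ahead of the news item.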
Key Insight: The patent recognizes that information retrieval is part computation, part curation. Algorithms can rank vast amounts of material; humans can mark a small number of canonical sources. The hybrid uses each capability where it is strongest. Modern knowledge-graph curation at Google, Wikipedia, and structured-data platforms follows a broadly similar hybrid pattern.
What This Means for SEO
Editorial curation as a layer above algorithmic ranking is part of modern search. Understanding the preferred-authority concept changes how you approach being recognized as a canonical source.
- Curated Authority Beats Pure Algorithmic Authority — When editorial curation designates a source as authoritative for a topic, that designation carries weight beyond what graph and content signals alone provide. Becoming a recognized authority in curated lists (Wikipedia, expert directories, professional bodies) compounds with algorithmic signals.
- Topical Frames Mirror Modern Knowledge Graphs — The frame-based structure described in the patent is conceptually what knowledge graphs do today. Pages that fit cleanly into known topical frames (well-defined entities, clear topic boundaries) are easier to recognize as canonical sources.
- Be Catalogued, Not Just Indexed — Being in the algorithmic index is necessary but not sufficient. Being explicitly catalogued in curated databases (Wikipedia infoboxes, Google Knowledge Panel sources, industry-association lists) signals preferred-authority status.
- Frame Membership Persists — Once a source is designated preferred-authority in a curation database, the designation persists across queries until explicitly revised. Stable inclusion in curated lists is a long-running ranking asset.
- Multi-Frame Membership Compounds — Sources that are preferred-authority in multiple related frames (medical doctor recognized in cardiology, internal medicine, and patient-education frames) accumulate stronger authority than single-frame sources.