Classifies whether a search query expresses commercial intent so the engine can adjust result mix, ad inventory, and result-page format to match transactional or comparison-shopping users.
Patent Overview
- Inventor: Amit Singhal
- Assignee: Google LLC
- Filed: 2005-10-26
- Granted: 2011-10-25
- Application Number: US 11/258,728
The Challenge
Same Words, Different Intent
A query for "laptop" can come from a buyer comparing models, a student looking for reviews, or a researcher reading about computing history. The engine cannot serve all three with the same result mix. A buyer needs product pages and ads. A reviewer needs comparison articles. A researcher needs background content. The system needs an explicit classifier that decides whether the query has commercial intent so the rest of the pipeline can route accordingly.
- One Result Mix Fails All Audiences — Without intent classification, the engine has to pick a default mix and serve it to every querier. Commercial users get too few product results; informational users get too many. Both groups are under-served.
- Ads Need Commercial Intent To Land — Advertising revenue depends on showing ads to users with buying intent. Showing product ads on informational queries wastes inventory and frustrates users. Showing them on commercial queries converts.
- Result Page Format Should Vary — Commercial queries benefit from product carousels, price tables, and shopping links. Informational queries benefit from article previews and answer passages. The page layout itself depends on intent.
- Need A Reliable Classifier — The system needs a classifier that flags commercial intent accurately. False positives (treating informational queries as commercial) annoy users. False negatives (missing commercial queries) lose revenue and serve worse results.
- Query Patterns Carry The Signal — Commercial queries share structural patterns: brand names, product categories, modifiers like "buy", "price", "deal", "review", "vs". These patterns can be learned and matched.
Innovation
Pattern-Based Commercial Query Detection
The system generates a list of query patterns associated with commercial intent. When a new query arrives, it is compared against the pattern list. Pattern matches classify the query as commercial of the corresponding type. The classification feeds into result-mix, ad-inventory, and page-format decisions downstream.
- Build A Pattern Library — Offline analysis of query logs identifies patterns associated with commercial outcomes (clicks on shopping results, ad clicks, ecommerce site visits). Each pattern represents a commercial query type.
- Tag Each Pattern By Type — Patterns are tagged with their commercial subtype: research, comparison, transactional, brand. Each subtype maps to a different result-mix policy.
- Receive User Query — When a query arrives at the engine, it enters the classification path before retrieval.
- Match Against The Library — Compare the incoming query to the pattern library. Use exact matching, normalized matching, or fuzzy matching depending on the pattern type.
- Classify Based On Best Match — If a pattern matches, classify the query as commercial of the matched subtype. If no pattern matches, classify as non-commercial.
- Route Downstream — Pass the classification to result-mix selection, ad-serving, and page-layout components. Each component uses the classification to adjust its behavior for the query.
- Refresh The Library Periodically — The pattern library is rebuilt periodically as new commercial queries emerge and old ones fade, which keeps the classifier aligned with evolving commerce vocabulary.
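The runtime steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation. The `Pattern` class, the `LIBRARY` contents, and the `classify` function are hypothetical names, and the four regexes are hand-written stand-ins for patterns that the source says are mined from query logs. The sketch returns the first matching pattern, which is one simple reading of "best match".

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pattern:
    regex: re.Pattern
    subtype: str  # "research", "comparison", "transactional", or "brand"


# Hypothetical library. In the described system these patterns come from
# offline log analysis; here they are written by hand for illustration.
LIBRARY = [
    Pattern(re.compile(r"^(buy|price of) .+"), "transactional"),
    Pattern(re.compile(r"^.+ vs .+$"), "comparison"),
    Pattern(re.compile(r"^reviews of .+"), "research"),
    Pattern(re.compile(r"^.+ discount code$"), "brand"),
]


def classify(query: str) -> Optional[str]:
    """Return the matched commercial subtype, or None for non-commercial."""
    # Lowercase and collapse whitespace before matching.
    normalized = " ".join(query.lower().split())
    for pattern in LIBRARY:
        if pattern.regex.search(normalized):
            return pattern.subtype  # first match wins in this sketch
    return None


print(classify("Buy  laptop"))           # transactional
print(classify("macbook vs thinkpad"))   # comparison
print(classify("history of computing"))  # None
```

A production system would presumably score competing matches rather than take the first one, but the control flow is the same: match, tag, or fall through to non-commercial.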
Classify Then Compose
Commercial query detection is the upstream decision that cascades into many downstream choices. The page mix, the ad load, the layout, the ranking signals: all depend on whether the engine has classified the query as commercial.
Pattern Match Is Cheap And Decisive
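Matching a query against a precomputed library is cheap at serving time: an exact or normalized match can be a single hash lookup, and the library itself is built entirely offline. The classification it produces drives expensive downstream decisions that would otherwise be applied uniformly to every query.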
- Pattern Library — Built offline from log analysis. Contains the structural signatures of commercial queries grouped by subtype (research, comparison, transactional, brand).
- Runtime Match — Fast pattern lookup at query time. Produces a classification that feeds the rest of the pipeline.
- Downstream Routing — Result-mix, ad-serving, and page-layout components consume the classification and adjust their behavior accordingly.
Intent classification is the upstream lever that bends all downstream serving.
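One way to picture the downstream routing is a policy table keyed by subtype. The sketch below is an assumption-heavy illustration: `ServingPolicy`, its three fields, and every number and layout name in `POLICIES` are invented for the example and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServingPolicy:
    product_result_share: float  # fraction of results reserved for product pages
    max_ads: int                 # upper bound on ads shown for the query
    layout: str                  # page-format identifier


# Hypothetical values; each commercial subtype maps to its own policy.
POLICIES = {
    "transactional": ServingPolicy(0.7, 4, "product_carousel"),
    "comparison": ServingPolicy(0.4, 2, "comparison_table"),
    "research": ServingPolicy(0.3, 1, "article_list"),
    "brand": ServingPolicy(0.5, 1, "brand_panel"),
}

# Default applied when the classifier reports no commercial intent.
NON_COMMERCIAL = ServingPolicy(0.0, 0, "article_list")


def select_policy(subtype: Optional[str]) -> ServingPolicy:
    """Map a classifier output (subtype or None) to a serving policy."""
    return POLICIES.get(subtype, NON_COMMERCIAL)


print(select_policy("transactional").layout)  # product_carousel
print(select_policy(None).max_ads)            # 0
```

Keeping the policy in a table rather than in branching code means the result-mix, ad-serving, and layout components can all read the same classification without knowing how it was produced.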
Technical Foundation
Pattern Construction And Matching
The pattern library is the core asset. Its quality determines how accurately the classifier flags commercial intent.
- Query Patterns — Structural signatures derived from log analysis. Can include literal words, regex-style placeholders, or n-gram templates.
- Pattern Types — Categorization of patterns by commercial subtype: research ("reviews of X"), comparison ("X vs Y"), transactional ("buy X", "price of X"), brand ("X discount code"). Each subtype maps to different downstream behavior.
- Match Function — How the runtime decides whether the query matches a pattern. Exact, normalized, fuzzy, or model-based matching can all be used per pattern type.
- Classification Output — The structured result of the classifier: a binary commercial flag plus the matched pattern type if commercial. Used as input to downstream serving components.
Key Insight: Treating commercial detection as a pattern-matching problem (rather than a content-analysis problem) lets the classifier run fast at query time while still benefiting from sophisticated offline pattern construction. The library is the expensive part; the runtime is cheap. This split is what makes commercial classification practical at search-engine scale.
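The per-pattern match function and the structured classification output might look like the following. This is a sketch under stated assumptions: the `Classification` record, the `MATCHERS` registry, the library entries, and the 0.85 fuzzy threshold are all hypothetical, and fuzzy matching is approximated with the standard library's `difflib.SequenceMatcher` rather than whatever method the patent contemplates.

```python
import difflib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Classification:
    commercial: bool
    subtype: Optional[str] = None  # set only when commercial is True


def normalize(text: str) -> str:
    return " ".join(text.lower().split())


def exact_match(query: str, pattern: str) -> bool:
    return query == pattern


def normalized_match(query: str, pattern: str) -> bool:
    return normalize(query) == normalize(pattern)


def fuzzy_match(query: str, pattern: str, threshold: float = 0.85) -> bool:
    # Similarity ratio in [0, 1]; the threshold is an arbitrary assumption.
    ratio = difflib.SequenceMatcher(None, normalize(query), normalize(pattern)).ratio()
    return ratio >= threshold


MATCHERS = {"exact": exact_match, "normalized": normalized_match, "fuzzy": fuzzy_match}

# Hypothetical entries: (pattern text, subtype, match function name).
LIBRARY = [
    ("cheap flights", "transactional", "fuzzy"),
    ("Nike", "brand", "normalized"),
]


def classify(query: str) -> Classification:
    for pattern, subtype, matcher_name in LIBRARY:
        if MATCHERS[matcher_name](query, pattern):
            return Classification(True, subtype)
    return Classification(False)


print(classify("cheap flghts"))      # misspelling still matches via fuzzy
print(classify("  NIKE "))           # matches after normalization
print(classify("weather tomorrow"))  # no match, so non-commercial
```

Storing the matcher name next to each pattern mirrors the idea that the match function is chosen per pattern type, so strict patterns and tolerant patterns can coexist in one library.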
The Process
Library Construction And Live Classification
Two pipelines: an offline one builds the pattern library; an online one matches queries against it.
- Log Mining — Offline analysis of query logs identifies queries that produced commercial outcomes (ad clicks, ecommerce-site visits, shopping-result clicks).
- Pattern Extraction — From the commercial-outcome queries, extract structural patterns. Group by subtype based on the outcome shape.
- Library Publication — Write the extracted patterns to a runtime-readable library. Include subtype tags and match function specifications.
- Runtime Classification — Each incoming query is compared against the library. A match means the query is classified as commercial; no match means it is classified as non-commercial.
- Route To Serving Components — Classification is passed to result-mix, ad-serving, and page-layout selection, each of which adapts its behavior.
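The offline half of the pipeline can be approximated with a toy log-mining pass. Everything here is assumed for illustration: the log format of `(query, outcome)` pairs, the outcome labels, the `MODIFIERS` vocabulary (taken from the modifier examples earlier in the article), and the support and rate thresholds. The sketch only extracts templates and their commercial-outcome rates; assigning subtypes is left out.

```python
from collections import Counter

# Hypothetical outcome labels standing in for the commercial outcomes
# named in the text: ad clicks, shopping-result clicks, ecommerce visits.
COMMERCIAL_OUTCOMES = {"ad_click", "shopping_click", "ecommerce_visit"}
MODIFIERS = {"buy", "price", "deal", "review", "vs"}


def to_template(query: str) -> str:
    """Keep modifier words, replace every other run of tokens with 'X'."""
    slots = []
    for token in query.lower().split():
        slot = token if token in MODIFIERS else "X"
        if slot == "X" and slots and slots[-1] == "X":
            continue  # collapse consecutive placeholders
        slots.append(slot)
    return " ".join(slots)


def build_library(log, min_support=2, min_rate=0.6):
    """Return {template: commercial-outcome rate} for qualifying templates."""
    totals, commercial = Counter(), Counter()
    for query, outcome in log:
        template = to_template(query)
        totals[template] += 1
        if outcome in COMMERCIAL_OUTCOMES:
            commercial[template] += 1
    return {
        template: commercial[template] / count
        for template, count in totals.items()
        if template != "X"  # a bare placeholder carries no signal
        and count >= min_support
        and commercial[template] / count >= min_rate
    }


log = [
    ("buy laptop", "shopping_click"),
    ("buy running shoes", "ad_click"),
    ("buy phone", "none"),
    ("laptop history", "none"),
    ("iphone vs pixel", "ecommerce_visit"),
    ("mac vs pc", "none"),
]
print(build_library(log))  # only "buy X" clears both thresholds here
```

In this toy log, "X vs X" is seen twice but converts only half the time, so it falls below the assumed 0.6 rate and is dropped. Rerunning the same function on fresh logs is one simple way to model the periodic library refresh.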
What This Means for SEO
Commercial query classification is the reason the SERP for transactional queries looks structurally different from the SERP for informational queries. Knowing this changes how you target each.
- Match Your Content Format To Query Intent — Commercial queries get product-oriented SERPs (carousels, shopping ads, price comparisons). Informational queries get article-oriented SERPs (answer passages, featured snippets, news). Targeting a commercial query with an informational article is a structural mismatch that limits ranking opportunity.
- Modifier Words Tilt The Classifier — Adding "buy", "price", "best", "vs", "review", "deal", "discount" to a query raises the commercial signal. Page titles and headings that include these modifiers help your content match commercial intent.
- Comparison Content Captures Mid-Funnel — "X vs Y" queries are classified as commercial-comparison subtype. Long-form comparison pages with clear pros/cons tables earn the traffic of users approaching a decision but not ready to buy.
- Pure Brand Queries Get Brand Treatment — Branded queries (just the brand name, or brand plus product line) are classified as commercial-brand subtype. The brand’s own pages tend to dominate these SERPs; third-party content has to add comparative or evaluative angles to compete.
- Transactional Queries Crowd Out Informational Content — Highly transactional queries have very little informational content in the top ranks because the engine routes them to commercial-mix serving. Don’t target transactional queries with explainer content; build for the intent the SERP rewards.
- Borderline Queries Reward Hybrid Content — Queries that are partially commercial (e.g., "how to choose a laptop") reward content that addresses both the informational and the commercial aspect. Cover the buying decision and include actionable product recommendations.
- Ad-Heavy SERPs Are A Signal — When the SERP shows many ads above and within the results, the engine has classified the query as strongly commercial. Organic ranking opportunity is compressed; focus on differentiation through depth and comparative value.