Neural-network architecture for generating answers to questions by encoding the candidate text passage and the question together, then producing the answer through a learned generation process.
Patent Overview
- Inventor: Nitin Gupta
- Assignee: Google LLC
- Filed: 2018-10-23
- Granted: 2021-08-17
- Application Number: US 16/168,419
The Challenge
Symbolic Answer Extraction Hits A Ceiling
Earlier answer-passage systems extracted an answer span from a passage using symbolic matching (entity recognition, dependency parsing, type checks). The approach works for simple factoid questions but struggles with multi-step reasoning, complex phrasing, and answers that require synthesis across the passage. A neural-network approach can handle these cases by learning end-to-end from question-passage-answer triples.
- Span Extraction Misses Synthesis Cases — When the answer is not literally present as a contiguous span in the passage, span extractors fail. Many real-world answers require combining facts within the passage.
- Rule-Based Reasoning Doesn't Scale — Hand-coded reasoning rules cover specific question types and break on unseen variations. Generalization requires learned models.
- Need Joint Encoding Of Question And Passage — The model has to attend to both the question and the passage simultaneously. Encoding them separately and combining at the end discards too much information.
- Output Should Be Generated, Not Just Selected — For complex questions, the answer needs to be generated word-by-word rather than picked as a span. Generative output handles paraphrasing and synthesis natively.
- Inference Latency Matters — Question-answering must run within search-serving latency budgets. The neural model has to be efficient enough for real-time use at search-engine scale.
Innovation
Encoder-Decoder Network For Answer Generation
The system uses an encoder neural network to process an input text passage and produce per-token encoded representations. A decoder network then generates the answer at each time step, attending over the encoded passage representations and conditioning on the input question. The architecture is end-to-end trainable on question-passage-answer triples.
- Receive Question And Passage — Inputs are an input question string and an input text passage. The passage is the document or document subset that's expected to contain the answer.
- Encode The Passage — Process the passage through an encoder neural network to generate a respective encoded representation for each passage token. The encoded representations preserve token identity plus contextual information from surrounding tokens.
- Encode The Question — Process the question through an encoder (possibly shared with the passage encoder, possibly separate). The encoded question representation drives the decoder's attention.
- Initialize Decoder — Start the decoder with a special start-of-answer token. The decoder will generate the answer one token at a time.
- Generate Token By Token — At each time step, the decoder attends over the encoded passage representations (weighted by the question), uses the previous decoder state, and predicts the next answer token. The token is appended to the answer.
- Stop At End-Of-Answer Token — Continue generating tokens until the decoder produces an end-of-answer signal or hits a maximum length.
- Return Generated Answer — Output the generated answer string. The answer may be a span from the passage, a paraphrase of a span, or a synthesis of multiple parts of the passage.
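The steps above can be sketched as a short decoding loop. This is a minimal illustration and not the patent's implementation: `encode`, `toy_step`, the `<start>`/`<end>` tokens, and the `max_len` default are all hypothetical stand-ins. A trained encoder would return one contextual vector per token and a trained decoder would predict the next token from attention over those vectors; here both are replaced by simple token rules so the control flow can be run as-is.

```python
from typing import Callable, List

START = "<start>"  # hypothetical start-of-answer token
END = "<end>"      # hypothetical end-of-answer token

# Hypothetical decoder-step signature:
# (passage encoding, question encoding, answer tokens so far) -> next token
DecoderStep = Callable[[List[str], List[str], List[str]], str]


def encode(text: str) -> List[str]:
    """Toy stand-in for an encoder: one lowercase token per word.

    A real encoder would return one contextual vector per token.
    """
    return [word.strip("?.,!").lower() for word in text.split()]


def toy_step(passage_enc: List[str], question_enc: List[str], answer: List[str]) -> str:
    """Toy stand-in for a trained decoder step.

    Emits the passage tokens that do not appear in the question, in order,
    then the end-of-answer token.
    """
    candidates = [tok for tok in passage_enc if tok not in question_enc]
    produced = len(answer) - 1  # tokens generated so far, excluding START
    return candidates[produced] if produced < len(candidates) else END


def generate_answer(passage: str, question: str, step: DecoderStep, max_len: int = 20) -> List[str]:
    passage_enc = encode(passage)    # encoded once, reused at every step
    question_enc = encode(question)
    answer = [START]                 # initialize the decoder
    for _ in range(max_len):         # hard cap on answer length
        token = step(passage_enc, question_enc, answer)
        if token == END:             # end-of-answer signal
            break
        answer.append(token)
    return answer[1:]                # drop the START token


print(generate_answer("Paris is the capital of France.",
                      "What is the capital of France?",
                      toy_step))  # prints ['paris']
```

Swapping `toy_step` for a learned model changes what each step predicts but not the loop itself: start token, one token per step, stop on the end token or the length cap.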
End-To-End Neural QA
The patent moves answer generation from rule-based extraction to learned generation. The encoder-decoder architecture, trained end-to-end on question-passage-answer triples, can generalize to phrasings and question types that hand-written symbolic rules do not cover.
Encode, Attend, Generate
The passage is encoded once. The decoder attends over the encoded passage while generating each answer token. Each token is conditioned on the question, the passage, and the previously generated answer tokens.
- Passage Encoder — Produces per-token encoded representations that preserve content and context. These are the inputs the decoder attends over.
- Question Encoder — Provides the query context that guides decoder attention. The question is what tells the decoder which parts of the passage to attend to.
- Generative Decoder — Produces the answer token by token, attending over the encoded passage at each step. Output can be span-like, paraphrased, or synthesized.
Neural QA bridges the symbolic-extraction era to modern LLM-based answer generation.
Technical Foundation
Architecture Components
The architecture combines a passage encoder, a question encoder, and a decoder with attention.
- Passage Encoder — Neural network that produces a representation for each passage token. Typically a transformer or RNN with cross-token attention.
- Question Encoder — Network that encodes the question. May share weights with the passage encoder or be separate.
- Decoder — Generates the answer one token at a time. At each step it attends over the passage representations conditioned on the question and the previously generated tokens.
- Attention Mechanism — Decides which passage tokens the decoder focuses on at each generation step. Learned end-to-end with the rest of the network.
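To make the attention component concrete, the sketch below computes attention weights over per-token passage vectors. It uses scaled dot-product attention, which is one common choice and an assumption here; the source only says the mechanism is learned end-to-end. The function names, the toy vectors, and the idea of forming the query by adding the decoder state to the question vector are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attend(query: Vector, passage_vectors: Sequence[Vector]) -> Tuple[List[float], List[float]]:
    """Return (weights over passage tokens, weighted-sum context vector)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, vec) / scale for vec in passage_vectors])
    dim = len(passage_vectors[0])
    context = [
        sum(w * vec[i] for w, vec in zip(weights, passage_vectors))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional encodings for a three-token passage.
passage_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [1.0, 0.0]    # assumed previous decoder state
question_vector = [2.0, 0.0]  # assumed pooled question encoding

# Assumption: the query combines decoder state and question by addition.
query = [d + q for d, q in zip(decoder_state, question_vector)]
weights, context = attend(query, passage_vectors)
print(weights)  # the first passage token receives the largest weight
```

In a trained network the vectors and the way the query is formed would be learned parameters rather than hand-picked values; the weighting-and-summing step is the part this sketch is meant to show.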
Key Insight: This patent sits at the inflection point between symbolic answer extraction and modern LLM-based answer generation. The encoder-decoder architecture it describes is conceptually close to what plausibly underlies the answer-generation parts of AI Overviews and Search Generative Experience today. The patent's contribution is making the QA pipeline neural and trainable rather than rule-based, opening the door to the LLM era.
The Process
Training And Inference
The pipeline involves offline training on question-passage-answer triples and online inference for serving.
- Training Corpus — Assemble a corpus of (question, passage, answer) triples. Sources include FAQ pages, structured Q&A datasets, and curated examples.
- Train Encoder-Decoder — Train the encoder-decoder network on the corpus. Loss penalizes mismatch between generated answer and ground-truth answer at each token.
- Validate On Held-Out Set — Test against held-out triples. Measure span accuracy, exact match, and BLEU/ROUGE for generative outputs.
- Deploy For Inference — Serve the trained model behind the question-answering API. Each user query that's classified as a question routes through this model.
- Generate Per-Query Answer — On query arrival, the system identifies the relevant passage, encodes it with the question, and runs the decoder to generate the answer.
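The training and validation steps above mention a per-token loss and an exact-match metric. A minimal sketch of both follows, assuming the decoder outputs a probability distribution over tokens at each step. The use of cross-entropy is an assumption (the source only says the loss penalizes per-token mismatch), and the function names and example distributions are hypothetical.

```python
import math
from typing import Dict, List


def token_cross_entropy(predicted: List[Dict[str, float]], target: List[str]) -> float:
    """Mean negative log-likelihood of the ground-truth tokens.

    predicted[i] is the decoder's distribution over tokens at step i;
    target[i] is the ground-truth token at that step.
    """
    if len(predicted) != len(target):
        raise ValueError("one predicted distribution is needed per target token")
    eps = 1e-12  # avoids log(0) when the target token got zero probability
    losses = [
        -math.log(max(dist.get(tok, 0.0), eps))
        for dist, tok in zip(predicted, target)
    ]
    return sum(losses) / len(losses)


def exact_match(generated: str, reference: str) -> bool:
    """Exact match after lowercasing and collapsing whitespace."""
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())
    return normalize(generated) == normalize(reference)


# Hypothetical decoder outputs for the two-token target ["paris", "<end>"].
predicted = [
    {"paris": 0.5, "london": 0.5},  # step 1: unsure between two cities
    {"<end>": 1.0},                 # step 2: confident the answer is complete
]
loss = token_cross_entropy(predicted, ["paris", "<end>"])
print(round(loss, 4))                   # prints 0.3466, i.e. ln(2) / 2
print(exact_match("Paris ", "paris"))   # prints True
```

Exact match is strict by design, which is why the validation step also lists overlap metrics such as BLEU and ROUGE for answers that are paraphrased rather than copied.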
What This Means for SEO
Neural QA is one of the foundations of modern direct-answer surfaces. Understanding the encode-attend-generate architecture changes how to think about question-targeted content.
- Passage Quality Drives Answer Quality — The decoder attends over the passage tokens. A well-structured passage with clean factual content produces a better generated answer than a noisy or rambling one. Quality at the passage level translates directly to the answer surface.
- Direct Answer Phrasing Helps Generation — When your passage contains the answer in clean canonical phrasing, the decoder is more likely to generate that phrasing. Passages where the answer is buried under filler produce weaker generations.
- Multi-Sentence Answers Are Native — Unlike span extraction, neural generation can synthesize answers across multiple sentences. Passages that cover an answer comprehensively in two-to-three sentences feed the generative model the structure it needs.
- Schema And Structured Data Help Indirectly — Strong entity markup helps the engine identify which passage is the answer candidate. Once a passage is selected, schema does not directly shape generation, but it raises the probability of your passage being the selected source.
- Citation Is Bound To The Source Passage — When a generated answer cites its source passage, being that source surfaces your domain as the attribution even if the user reads the synthesized answer rather than clicking through.