Patent Drafting Analysis of Amazon Technologies’ LLM Output Verification and Citation System | US 12,353,469 B1
IP Drafting Analysis · US 12,353,469 B1
A structural and strategic analysis of Amazon's granted patent covering RAG-based output verification for large language models, examining claim architecture, drafting quality, critical gaps, and prosecution positioning across all 20 claims.
US 12,353,469 B1 · Filed: Jun 28, 2024 · Granted: Jul 8, 2025 · G06F 16/00 · G06F 16/332 · G06F 16/334 · G06F 16/383 · G06Q 50/18
Published by PatSnap Insights Team · 13 min read · Verified by PatSnap Eureka Data
Overview
Structural Overview
The detailed description dominates at approximately 56% of total words, providing substantial embodiment support across the RAG pipeline, citation module, and query optimizer subsystems. The claim set comprises 20 claims structured across 3 independent claims — one system (Claim 1), one method (Claim 4), and one CRM (Claim 16) — with 17 dependent claims yielding a 5.67:1 dependent-to-independent ratio typical for software/AI art units. The 13 drawing sheets provide comprehensive coverage of data flow from query ingestion through embedding comparison, document ranking, LLM response generation, and citation verification, with FIG. 1A through FIG. 1G illustrating the end-to-end user interaction and FIG. 3 mapping the complete process flow.
Figure Inventory: 13 Sheets

| Figure | Description | Role |
|---|---|---|
| FIG. 1A | High-level system overview showing user (110) submitting query (130) and documents (135) to a verification and citation service (180) over network (190). | System architecture |
| FIG. 1B | Diagram showing query (130) being processed by an embedding model (140) to produce a query embedding (142). | Claim support |
| FIG. 1C | Illustration of database connector module (150) comparing query embedding (142) against knowledge base document embeddings (155-n) to identify relevant documents (160-1 through 160-10). | Claim support |
| FIG. 1D | Diagram showing query augmentation with contextual information (timeframe 132, business condition 134, company holdings 136, earnings 138) fed to embedding model (140) to produce augmented query embedding (145). | Claim support |
| FIG. 1E | Ranked documents output showing documents (160-1 through 160-10) sorted by similarity to augmented query embedding (145) via database connector module (150). | Claim support |
| FIG. 1F | Language model (170) generating answer (172) from query and highest-ranking documents, with citation module (174) identifying source document (160-5) for the answer. | Key embodiment |
| FIG. 1G | User interface display showing answer (172) presented on display (115) with source citation "SOURCE: DOCUMENT E" (160-5) rendered as a hyperlinked element (175). | UI/interface |
| FIG. 2 | Block diagram of full system (200) including user computer device (210) and output verification and citation service (280) with ingestion module (240), database connector module (250), modeling module (270), citation module (274), and automatic assistant module (285). | System architecture |
| FIG. 3 | End-to-end process flow chart (300) showing all steps from user query (310) through first embedding, knowledge base comparison, document ranking, query augmentation, second embedding, LLM response generation, citation identification, and final presentation (365). | Flow diagram |
| FIG. 4 | Diagram of query optimizer system (400) showing query junction (400), query generator (402), score generator (404), query optimizer (405), feature impact attribution (406), prompt generator (408), and output (410) with threshold decision logic. | Key embodiment |
| FIG. 5 | Processing system (580) architecture showing user (510), computer device (515), user interface (584), embedding model (540), database connector module (550), and language model (570) with context flow. | System architecture |
| FIG. 6A | High-level cloud provider network (600) environment showing query service (610) with prompt and response engineering system (613), agents (622), context aggregators (624), generative AI models (697), LLMs (699), compute services (690), and storage services (692). | System architecture |
| FIG. 6B | Detailed view of query service (610) agents (622) including orchestrator agent (601), sanitization agent (603), response validation agent (605), task-specific agents (607), context aggregators (624), and service data (615) components. | System architecture |
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims
Claim Architecture Analysis
The patent contains 3 independent claims covering all three standard software claim types: Claim 1 (system/apparatus), Claim 4 (method), and Claim 16 (non-transitory computer-readable storage medium, or CRM). The 17 dependent claims yield a 5.67:1 dependent-to-independent ratio, within the 4–8:1 range typical of software/cloud filings, reflecting deliberate layering of fallback positions. The tripartite structure of system, method, and CRM claims ensures enforcement coverage across service operators, practitioners performing the method, and distributors of executable software, providing comprehensive protection for Amazon's RAG-based verification pipeline.
Core inventive concept: The claims address LLM hallucination and factual inaccuracy in outputs derived from complex documents by requiring a two-stage embedding pipeline. The user's query is first converted to an embedding and compared against a knowledge base; the query is then augmented with temporal metadata, re-embedded, and used to re-rank the retrieved documents. A citation module then identifies a specific source document for quantitative data points in the LLM's response by executing structured query language (SQL) queries against a corpus and verifying numeric accuracy. Claim 1 specifically recites that "the corpus of data does not include any of the plurality of documents associated with the user," ensuring the LLM's base training is distinct from the domain-specific knowledge base used for grounding.
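To make the two-stage retrieval idea concrete, the following is a minimal Python sketch, not the patented implementation. It assumes a toy bag-of-words `embed` function in place of a learned embedding model, and the document texts, the `rank` helper, and the appended "fiscal year 2023" context are all hypothetical illustrations.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query: str, docs: dict[str, str]) -> list[str]:
    """Return document ids sorted by similarity to the query, best first."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(docs[d])), reverse=True)


# Hypothetical knowledge base.
docs = {
    "A": "revenue report fiscal year 2022 revenue grew",
    "B": "revenue report fiscal year 2023 revenue grew",
    "C": "employee handbook vacation policy",
}

# Stage 1: embed the raw query and shortlist candidates.
query = "what was revenue growth"
first_pass = rank(query, docs)

# Stage 2: augment the query with temporal context, re-embed, re-rank the shortlist.
augmented = query + " fiscal year 2023"
second_pass = rank(augmented, {d: docs[d] for d in first_pass[:2]})
```

In this toy data the raw query cannot distinguish documents A and B, while the temporally augmented query ranks B first, which is the kind of disambiguation the second embedding pass is meant to provide.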
Independent Claim Dissection

| Claim | Preamble | Transition | Key Body Elements |
|---|---|---|---|
| Claim 1 | A system operating on a provider network | comprising | one or more computer processors; at least one data store having a knowledge base with a plurality of user documents; a large language model trained on a corpus not including user documents; a querying and output validation service performing: extracting text, partitioning into chunks, augmenting with temporal metadata, indexing, receiving free-form question, converting to first query, comparing embedding to indexed embeddings via similarity analysis in common vector space, selecting subset, generating first response with data points identified from temporal metadata, identifying document with at least one data point, providing response and document to user |
| Claim 4 | A method | comprising | extracting text from corpus; partitioning into text chunks; augmenting with temporal metadata; indexing in searchable database; receiving query; generating first embedding; performing similarity analysis on embeddings; identifying subset of indexed augmented text chunks; generating LLM response with quantitative data points from temporal metadata; identifying particular document with particular quantitative data point; verifying accuracy of particular quantitative data point; providing response and identifier of particular document as source |
| Claim 16 | A non-transitory computer-readable storage medium having executable instructions stored thereon that, if executed by one or more processors of a computer system, cause the computer system to at least | (functional instructions) | extract text from corpus; partition into text chunks; augment with temporal metadata; index in searchable database; receive query; generate LLM response with first set of quantitative data points from temporal metadata; identify second set of quantitative data points from response; compare first quantitative data point to second quantitative data point; provide portion of response and identifier of source of quantitative data point to user with conditional logic based on whether first and second data points are equal or unequal |
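The ingestion steps shared by all three independent claims (extract, partition, augment with temporal metadata, index) can be sketched as follows. This is an illustrative sketch only: the fixed word-count chunking, the regex-based year detection, the `Chunk` type, and the in-memory dictionary index are assumptions made for the example and are not taken from the patent.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: a four-digit year is used as the temporal metadata.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


@dataclass
class Chunk:
    doc_id: str
    text: str
    period: Optional[str]  # temporal metadata attached to the chunk


def partition(text: str, max_words: int = 8) -> list[str]:
    """Split extracted text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def ingest(doc_id: str, text: str, max_words: int = 8) -> list[Chunk]:
    """Partition a document and tag each chunk with a year.

    A chunk that mentions no year inherits the first year found in the document.
    """
    doc_years = YEAR.findall(text)
    default = doc_years[0] if doc_years else None
    chunks = []
    for piece in partition(text, max_words):
        match = YEAR.search(piece)
        chunks.append(Chunk(doc_id, piece, match.group(0) if match else default))
    return chunks


# Hypothetical document text.
chunks = ingest(
    "10-K",
    "Fiscal year 2023 results. Revenue rose to 120 million dollars "
    "while costs fell. Outlook for 2024 remains stable.",
)

# Index the augmented chunks by period so retrieval can filter on time.
index: dict[Optional[str], list[Chunk]] = {}
for chunk in chunks:
    index.setdefault(chunk.period, []).append(chunk)
```

A production system would index embeddings of the augmented chunks rather than raw text; the dictionary here only shows where the temporal tag travels with each chunk.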
Claim Dependency Tree
1 System on provider network: querying and output validation service with LLM, knowledge base, temporal metadata augmentation, embedding comparison, and document citation
  2 Adds: the one of plurality of documents comprises a financial record, legal opinion, or medical report
  3 Adds: generating second query from first response data points; generating second response identifying document with second set of data points; determining comparison of second to first data points for accuracy verification
4 Method: extracting, partitioning, augmenting with temporal metadata, indexing, similarity analysis, LLM response generation with quantitative data points, accuracy verification, source citation
  5 Adds: verifying accuracy by identifying response portion, generating SQL query, executing SQL query, identifying corpus document from SQL response, determining document includes particular quantitative data point
  6 Adds: verifying accuracy by identifying response portion with one of quantitative data points; generating SQL query; executing SQL query; determining particular document includes data point differing from response; substituting particular data point for the one in the response
  7 Adds: identifying particular document by executing a regular expression function on response and corpus
  8 Adds: receiving query further comprises receiving query and at least one document; providing query and document as inputs to LLM; response generated from LLM output
  9 Adds: receiving query further comprises receiving free-form description and converting to structured query language query via query generator module
  10 Adds: receiving query further comprises calculating quality score, determining query not optimal, prompting user for additional information, updating query with additional information
  11 Adds: determining contextual information regarding query; augmenting query to include contextual information; subset identified based on augmented query
  12 Further: contextual information comprises at least one of: time associated with query; theme associated with query; or summary of query
  13 Adds: determining ranking of augmented text chunks based on similarity analysis; subset identified based on ranking
  14 Adds: similarity analysis is k-nearest neighbor analysis or cosine similarity analysis
  15 Adds: at least one of corpus of documents is received from the user
16 CRM: executable instructions for RAG pipeline: extract, partition, augment with temporal metadata, index, receive query, generate LLM response with first set of quantitative data points, identify second set, compare first to second, provide response with conditional source citation based on equality
  17 Adds: generate SQL query from response portion; identify SQL query response from corpus; response to SQL includes second set of quantitative data points
  18 Further: generate embedding from query; identify plurality of embeddings per augmented text chunk; perform similarity analysis; determine ranking of augmented chunks; identify subset based on ranking
  19 Further: receive at least one document from user; provide query and document as inputs to LLM; response generated from LLM output
  20 Adds: particular document comprises at least one of: a financial record; a legal opinion; or a medical report
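The regular-expression identification of Claim 7 and the equal/unequal comparison of Claim 16 can be illustrated with a small sketch. Everything here is hypothetical: the `NUMBER` pattern, the `data_points` and `check` helpers, and the "UNVERIFIED" label are assumptions chosen for the example, and the claims do not specify what is shown to the user in the unequal case beyond conditional logic.

```python
import re

# Assumption: a quantitative data point is any standalone integer or decimal.
NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")


def data_points(text: str) -> list[float]:
    """Extract numeric values from text with a regular expression."""
    return [float(n) for n in NUMBER.findall(text)]


def check(response: str, source_id: str, source_text: str) -> str:
    """Compare the first number in the response against numbers in the source.

    Equal: return the response with a source citation.
    Unequal or missing: return the response flagged as unverified.
    """
    answer_points = data_points(response)
    source_points = data_points(source_text)
    if answer_points and answer_points[0] in source_points:
        return f"{response} [SOURCE: {source_id}]"
    return f"{response} [UNVERIFIED against {source_id}]"


# Hypothetical source text and LLM responses.
source = "Operating margin: 4.2 percent"
verified = check("Operating margin was 4.2 percent.", "DOCUMENT E", source)
flagged = check("Operating margin was 4.5 percent.", "DOCUMENT E", source)
```

Exact float equality is adequate for this toy case because both values are parsed from the same decimal string; a real verifier would likely need unit handling and tolerance rules.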
| Metric | This Application | Software / Cloud Norm |
|---|---|---|
| Total claims | 20 | 15 – 25 |
| Independent claim count | 3 | 1 – 3 |
| Dependent : Independent ratio | 5.67 : 1 | 4 – 8 : 1 |
| Method claims present? | Yes (Claim 4) | Common |
| System / apparatus claims? | Yes (Claim 1) | Common |
Drafting Quality
Drafting Quality Signals
This patent demonstrates strong tripartite claim coverage across system, method, and CRM claim types, with Claim 4 providing the most detailed method recitation that aligns well with the FIG. 3 process flow. A notable weakness is the elevated §101 Alice risk inherent in the functional, process-oriented independent claims, particularly Claim 4, where the hardware tie-in relies on generic processor/storage recitations rather than structurally differentiated computing components.
✅ Antecedent Basis
Antecedent basis is consistently maintained throughout the claim set. In Claim 4, "a first embedding" is introduced before "the first embedding" is used in the similarity analysis step, and "a subset of the indexed augmented plurality of text chunks" is introduced before "the subset" is referenced in the LLM generation step. In Claim 16, "a first set of quantitative data points" and "a second set of quantitative data points" are both properly introduced before cross-referenced comparisons. No orphaned "the" references were identified in any of the 20 claims.
Key independent claim limitations map directly to specific figures and passages of the specification. The temporal metadata augmentation limitation in Claims 1 and 4 is supported by FIG. 1D (showing timeframe 132, business condition 134, company holdings 136, earnings 138) and columns 5–6 of the detailed description. The similarity analysis in a "common vector space" recited in Claim 1 is supported by the description at column 6 discussing k-nearest neighbor and cosine similarity analyses. The citation module's SQL query generation and execution for numeric verification maps to FIG. 1F (citation module 174) and columns 7–8.
All independent claims use "comprising" as the transition, which is the strategically optimal choice for software/AI claims as it permits the claimed system or method to include additional, unclaimed elements without defeating infringement. The dependent claims use consistent "wherein" and "further comprising" language, which is structurally correct. No "consisting of" or "consisting essentially of" language appears, avoiding unnecessary claim narrowing that would be inappropriate for a software service patent in this art unit.
Claim 1 recites "a querying and output validation service executed by the one or more computer processors, wherein the querying and output validation service is configured to perform operations comprising..." — this "configured to" functional label language could trigger §112(f) scrutiny if an examiner or court reads it as a means-plus-function limitation, requiring the specification to identify corresponding structure. While the specification describes modules (ingestion module 240, database connector module 250, modeling module 270, citation module 274) that could serve as corresponding structure, the mapping between "querying and output validation service" and these modules is implicit rather than express, creating potential indefiniteness risk in post-grant review.
All three independent claims face meaningful Alice/Mayo exposure at Alice Step 2A, Prong 1, as the claimed process of receiving a query, generating embeddings, comparing embeddings, and generating an LLM response could be characterized as an abstract idea of information retrieval and organization. The hardware tie-in in Claim 1 ("one or more computer processors," "at least one data store") is generic and would not alone survive Step 2A, Prong 2. The strongest §101 defense lies in Claim 4's specific recitation of "augmenting the plurality of text chunks with temporal metadata" combined with "verifying accuracy of at least the particular quantitative data point" — this functional pairing represents a concrete improvement to LLM reliability, but the absence of any unconventional hardware component weakens the overall §101 posture.
The dependent claims add meaningfully distinct fallback positions. Claims 5 and 6 add alternative accuracy verification mechanisms (SQL query for matching vs. SQL query for substitution on mismatch), providing two distinct verification pathways. Claim 7 adds RegEx-based document identification as an alternative to SQL, and Claims 8 and 9 add distinct input modes (user-uploaded document vs. free-form to SQL conversion). Claims 10 and 11/12 layer query optimization (quality score threshold) and contextual augmentation respectively, each adding a genuinely separate technical feature. The weakest fallbacks are Claims 2 and 20, which identically recite domain type limitations (financial record, legal opinion, or medical report) for their respective independent claims — this duplication adds no prosecution history differentiation.
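The two SQL-based verification pathways described for Claims 5 and 6 (confirm on match, substitute on mismatch) can be sketched with Python's built-in `sqlite3` module. This is a hedged illustration: the `facts` table, its columns, the sample row, and the `verify` helper are hypothetical, and the patent text summarized above does not specify a schema.

```python
import sqlite3
from typing import Optional

# Assumption: the ground-truth corpus is exposed as a simple relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (doc_id TEXT, metric TEXT, period TEXT, value REAL)")
conn.execute("INSERT INTO facts VALUES ('DOCUMENT E', 'revenue', '2023', 120.0)")


def verify(metric: str, period: str, claimed: float) -> tuple[float, Optional[str], bool]:
    """Check a number from an LLM response against the corpus.

    Returns (value_to_show, source_doc_id, was_corrected):
    - match: the claimed value with its source (Claim 5 style confirmation)
    - mismatch: the corpus value substituted for the claimed one (Claim 6 style)
    - no row found: the claimed value unchanged, with no source
    """
    row = conn.execute(
        "SELECT doc_id, value FROM facts WHERE metric = ? AND period = ?",
        (metric, period),
    ).fetchone()
    if row is None:
        return claimed, None, False
    doc_id, value = row
    return value, doc_id, value != claimed


confirmed = verify("revenue", "2023", 120.0)
corrected = verify("revenue", "2023", 125.0)
unknown = verify("profit", "2023", 1.0)
```

Parameterized queries (the `?` placeholders) are used so that text lifted from an LLM response is never interpolated directly into SQL.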
The abstract accurately describes the overall system flow but omits the two most novel claim elements that distinguish this invention: (1) the dual-embedding approach where the query is first converted to a basic embedding and then re-embedded after temporal augmentation, and (2) the automatic SQL-based numeric verification that corrects LLM hallucinations in quantitative data points. The abstract states only that "a source for the answer is identified in at least one of the documents" — an examiner reading only the abstract may fail to appreciate the automatic numeric accuracy verification mechanism that forms the core of Claims 4–7 and 16–17, potentially leading to a narrower search strategy in the art unit.
Figure support is comprehensive for the core claim elements. The temporal metadata augmentation of Claim 1/4 is specifically shown in FIG. 1D with labeled contextual information types. The two-stage embedding process (query embedding 142, augmented query embedding 145) is shown in FIGs. 1B, 1C, and 1E. The citation and verification mechanism of Claims 4–7 maps to FIG. 1F (citation module 174, source document 160-5) and FIG. 3 (steps 360–365). The query optimizer system of Claims 9–10 has dedicated figure support in FIG. 4 (query optimizer 405, score generator 404, feature impact attribution 406). One gap exists: Claim 3's iterative second-query generation from first-response data points lacks a dedicated figure, relying solely on textual description.
Scorecard
Strategic Intent Scorecard
Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.
Claim Breadth
3.5
Prosecution Defensibility
3.2
Spec–Claim Consistency
4.2
Dependent Claim Coverage
4
Claim Type Diversity
5
Figure Support Quality
4.3
Key observation: Claim Type Diversity scores at the maximum (5.0) because all three standard software protection formats — system (Claim 1), method (Claim 4), and CRM (Claim 16) — are represented, covering operators of the service, practitioners of the method, and distributors of executable instructions, leaving no obvious claim-type design-around. Prosecution Defensibility scores lowest (3.2) because the §101 exposure across all three independent claims is significant — the claims rely on generic computing hardware recitations rather than structurally differentiated components, and the absence of a concrete technical improvement framing in the preambles of Claims 1 and 16 leaves the claims vulnerable to an Alice Step 2A, Prong 2 rejection that the dependent claims, while layered, may not fully cure. Practitioners should note that adding a continuation with claims expressly framing the invention as improving LLM reliability metrics or reducing hallucination rates would significantly strengthen the §101 posture.
A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.
GAP 01 · HIGHEST IMPACT
No Apparatus Claim Covering Citation Module as Standalone Component
Claim 1 bundles the entire querying and output validation service (ingestion, embedding, ranking, LLM invocation, and citation verification) into a single system claim, leaving no independent claim directed specifically to the citation module (274) as a standalone verifiable component that can be added to existing LLM pipelines. A competitor that deploys only the citation/verification layer as a bolt-on service atop a third-party LLM, without operating the full end-to-end RAG pipeline, could argue non-infringement of Claim 1 because the competitor does not extract, partition, and index documents. A stronger filing would have included an independent apparatus claim reciting only a citation module that (1) identifies quantitative data points in LLM outputs, (2) executes SQL queries against a corpus to verify those data points, and (3) presents the source document to the user, thereby capturing the verification-only use case.
GAP 02 · HIGH IMPACT
Temporal Metadata Limitation Unnecessarily Present in All Three Independent Claims
All three independent claims (Claims 1, 4, and 16) require "augmenting the plurality of text chunks with temporal metadata," which means a competitor implementing the identical RAG-plus-verification system without temporal augmentation — for example, a purely semantic retrieval system for legal case law without time-dimension filtering — would not infringe any independent claim. The temporal metadata requirement is a specific implementation detail that narrows all three independent claims without a corresponding breadth trade-off, since the core inventive contribution (embedding-based retrieval combined with quantitative accuracy verification via SQL) does not require temporal metadata to function. A stronger filing would have drafted one independent claim without the temporal metadata limitation, relegating temporal augmentation to dependent claims, to cover the broader class of RAG-plus-verification systems.
GAP 03 · HIGH IMPACT
No Claim Covering Free-Form-to-SQL Query Conversion as Independent Feature
US 12,353,469 B1 protects a system, method, and computer-readable medium for verifying and providing citations for outputs generated by large language models (LLMs) when answering queries about complex documents. The patent addresses LLM hallucination in quantitative data by requiring that text chunks extracted from user documents be augmented with temporal metadata before indexing, and that the LLM's numeric outputs be automatically verified against a ground-truth corpus via structured query language queries, with the verified source document presented to the user alongside the answer.
US 12,353,469 B1 is owned by Amazon Technologies, Inc., headquartered in Seattle, WA (US). The listed inventors are Ladan Mahabadi (Seattle, WA), Alexander Illichmann (Seattle, WA), Tong Ge (Bellevue, NJ), Sudhir Hassan Manikya Raju (Bellevue, WA), Seema Yadav (Seattle, WA), Stebin Kodiamkunnel Sevichan (Everett, WA), Michiel David De Pooter (Seattle, WA), and Francesco Furno (New York, NY).
Claim 1 is a system claim covering a provider network system with a querying and output validation service that extracts, partitions, and temporally augments text from user documents, converts user free-form questions to queries, performs embedding-based similarity search, generates LLM responses with quantitative data points, and identifies a source document for those data points. Claim 4 is a method claim covering the same pipeline steps with the addition of an explicit quantitative data point accuracy verification step using SQL queries against a corpus. Claim 16 is a CRM claim covering a non-transitory storage medium whose instructions implement the same pipeline, with the added feature of comparing first and second sets of quantitative data points and conditionally presenting either the original or corrected data point based on whether the two sets match.
This patent covers technology that helps AI systems give more accurate and trustworthy answers when asked questions about complex financial, legal, or medical documents. When a user uploads a document and asks a question, the system intelligently searches the document using AI-generated mathematical representations (embeddings) of both the question and the document content, enriches the search with time-related context, and then asks an AI language model to generate an answer. Critically, any numbers or financial figures in the answer are automatically checked against the original document using database queries to catch AI errors, and the source document is shown to the user so they can verify the answer themselves.
G06F 16/00 (2019.01): Information retrieval; Database structures therefor; File system structures therefor.
G06F 16/332 (2019.01): Query formulation, feedback or interaction.
G06F 16/334 (2025.01): Query processing.
G06F 16/383 (2019.01): Retrieval characterised by using metadata.
G06Q 50/18 (2012.01): Legal services.
Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.