System architecture, RAG pipeline, process flowcharts
Published by PatSnap Insights Team · 12 min read · Verified by PatSnap Eureka Data
Overview
Structural Overview
The detailed description dominates at approximately 62% of total words (~4,200 of ~6,800), providing extensive embodiment narrative for each flowchart step but limited structural differentiation from the claims. The claim set comprises 20 claims across 3 independent claims — a method (Claim 1), a system (Claim 8), and a computer-readable storage medium (Claim 15) — with 17 dependent claims yielding a 5.67:1 dependent-to-independent ratio consistent with software patent norms. Figure coverage spans 7 sheets including two high-level system diagrams, a primary process flowchart, encoding sub-flowcharts, similarity-determination sub-flowcharts, a feature vector generation flowchart, and a generic computing device diagram.
Section Word Distribution
Figure Inventory — 7 Sheets
Figure (role): description
FIG. 1 (System architecture): Block diagram of example system 100 showing server(s) 102, client(s) 104, network(s) 106, GUI manager 108, response generator 110, dataset(s) 112, and GUI 114.
FIG. 2 (Key embodiment): Detailed block diagram of response generator 110 including pre-processor 202, encoder 204, second feature vectors 206, comparator 208, retriever 210, prompt generator 212, large language model 214, and dataset(s) 112.
FIG. 3 (Flow diagram): Flowchart 300 of the core RAG process: receive query (302), generate first feature vector (304), compare to second feature vectors (306), retrieve augmentation information (308), provide augmented prompt to LLM (310), receive response (312).
FIG. 4A (Claim support): Flowchart 400A showing step 402: encoding a query into a low-dimensional dense vector, supporting the first feature vector generation limitation.
FIG. 4B (Claim support): Flowchart 400B showing step 404: encoding pieces of augmentation information into low-dimensional dense vectors, supporting the second feature vector generation limitation.
FIG. 5 (Claim support): Flowchart 500 showing step 502: determining cosine similarities between the first feature vector and the plurality of second feature vectors, supporting the comparison limitation in Claims 3, 10, and 17.
FIG. 6A (Claim support): Flowchart 600A showing threshold-based selection: determining second feature vectors having cosine similarity satisfying a predetermined condition with a predetermined threshold.
FIG. 6B (Claim support): Flowchart 600B showing top-N selection: determining a predetermined number of second feature vectors having the highest cosine similarities to the first feature vector.
FIG. 6C (Claim support): Flowchart 600C showing combined top-N-with-threshold selection: a predetermined number of second feature vectors with the highest cosine similarities satisfying a predetermined condition with a predetermined threshold.
FIG. 7 (Flow diagram): Flowchart 700 of the augmentation information ingestion pipeline: receive file (702), pre-process (704), tokenize (706), generate feature vectors (708). It supports the augmentation information recited in Claims 7 and 14; the file-processing steps themselves are not recited in any of the 20 claims.
FIG. 8 (System architecture): Block diagram of exemplary computing environment 800 including computing device 802 with processor 810, storage 820, network-based server infrastructure 870, and on-premises servers 892, supporting the system and CRM claims.
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims
Claim Architecture Analysis
The patent presents 3 independent claims, Claim 1 (method), Claim 8 (system/apparatus), and Claim 15 (computer-readable storage medium/CRM), covering all three standard software patent claim types. The 17 dependent claims yield a 5.67:1 ratio, which sits inside the 4 to 8:1 range typical of software/AI filings and is adequate for a focused functional disclosure. The tripartite independent claim structure (method + system + CRM) provides full enforcement coverage, though the near-identical language across Claims 1, 8, and 15 creates redundancy rather than complementary claim scope.
Core inventive concept: The claims address the problem of large language models generating irrelevant or hallucinated responses when their training data is stale or domain-incomplete. The mechanism, expressed across Claims 1, 8, and 15, is a vector-comparison-based retrieval step that encodes a query into a first feature vector, compares it to pre-computed second feature vectors each corresponding to a piece of augmentation information, retrieves the semantically closest augmentation data, and constructs an augmented prompt incorporating that data before invoking the LLM — thereby grounding the response in current, domain-specific information rather than parametric training knowledge alone.
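The claimed sequence can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the filing: `toy_encode` is a keyword-count stand-in for a real encoder, the `llm` argument is any callable (the example passes one that simply echoes the prompt), and cosine similarity with a top-N cutoff is just one assumed way to satisfy the "predetermined condition" that the independent claims leave open.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]

def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def answer_query(
    query: str,
    encode: Callable[[str], Vector],
    store: List[Tuple[Vector, str]],
    llm: Callable[[str], str],
    top_n: int = 2,
) -> str:
    # Generate the first feature vector from the query.
    first_vector = encode(query)
    # Compare it to each pre-computed second feature vector.
    ranked = sorted(store, key=lambda item: cosine(first_vector, item[0]), reverse=True)
    # Retrieve the augmentation information for the closest vectors.
    retrieved = [text for _, text in ranked[:top_n]]
    # Build the augmented prompt from the query and the retrieved information.
    prompt = (
        "Answer using the information below.\n"
        + "\n".join(retrieved)
        + "\nQuestion: " + query
    )
    # Provide the prompt to the model and return its response.
    return llm(prompt)

# Hypothetical stand-in encoder: counts of three keywords.
VOCAB = ["refund", "shipping", "warranty"]

def toy_encode(text: str) -> List[float]:
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

DOCS = [
    "Refund requests are accepted within 30 days.",
    "Shipping takes 5 business days.",
    "Warranty covers 2 years.",
]
STORE = [(toy_encode(doc), doc) for doc in DOCS]

# The "LLM" here just echoes the prompt so the retrieval result is visible.
response = answer_query("how do I get a refund", toy_encode, STORE,
                        llm=lambda prompt: prompt, top_n=1)
print(response)
```

Passing the encoder and model as callables keeps the sketch independent of any particular embedding or LLM service, which mirrors how the independent claims avoid naming a specific mechanism.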
Independent Claim Dissection
Claim 1
Preamble: A method for augmenting a large language model
Transition: comprising
Key body elements: receiving a query; generating a first feature vector based on the query; comparing the first feature vector to a plurality of second feature vectors each corresponding to a piece of augmentation information to determine second feature vectors satisfying a predetermined condition; retrieving pieces of augmentation information corresponding to the determined subset; providing an augmented prompt generated based at least on the query and the retrieved augmentation information to the large language model; receiving a response generated by the large language model
Claim 8
Preamble: A system for augmenting a large language model
Transition: comprising
Key body elements: a processor; a memory device that stores program code structured to cause the processor to perform the same operational steps as Claim 1 (receive query, generate first feature vector, compare to second feature vectors, retrieve augmentation information, provide augmented prompt, receive response)
Claim 15
Preamble: A computer-readable storage medium comprising computer-executable instructions, that when executed by a processor, cause the processor to
Transition: comprising
Key body elements: receive a query; generate a first feature vector based on the query; compare the first feature vector to a plurality of second feature vectors each corresponding to a piece of augmentation information to determine a subset satisfying a predetermined condition; retrieve pieces of augmentation information corresponding to the determined subset; provide to the large language model an augmented prompt generated based at least on the query and the retrieved pieces of augmentation information; receive a response generated by the large language model
Claim Dependency Tree
Claim 1 (independent): Method for augmenting LLM: receive query, generate first feature vector, compare to second feature vectors, retrieve augmentation info, provide augmented prompt, receive response
  Claim 2: Adds: providing a user interface for querying domain-specific information; query received from user through UI; response provided to user through UI
  Claim 3: Adds: comparing comprises determining cosine similarities between at least a portion of first feature vector and corresponding portions of plurality of second feature vectors
  Claim 4: Further: determining second feature vectors having cosine similarity satisfying first predetermined relationship with first threshold; or first predetermined number with highest cosine similarities; or second predetermined number with highest cosine similarities satisfying second predetermined relationship with second threshold
  Claim 5: Adds: augmented prompt comprises a request for response to query based at least on retrieved augmentation information; and the retrieved augmentation information
  Claim 6: Adds: generating first feature vector comprises encoding query into low-dimensional dense vector using GPT-based or BERT-based encoder; second feature vectors generated by encoding augmentation information into low-dimensional dense vectors using GPT-based or BERT-based encoder
  Claim 7: Adds: pieces of augmentation information comprise at least one of domain-specific, entity-specific, product-specific, recent information unavailable at LLM generation, or information changed after LLM generation
Claim 8 (independent): System for augmenting LLM: processor and memory device with program code to perform same steps as Claim 1
  Claim 9: Adds: program code further structured to provide user interface for domain-specific querying; query received from user; response provided to user through UI
  Claim 10: Adds: comparing comprises determining cosine similarities between at least a portion of first feature vector and corresponding portions of plurality of second feature vectors
  Claim 11: Further: determine second feature vectors having cosine similarity satisfying first predetermined relationship with first threshold; or first predetermined number with highest cosine similarities; or second predetermined number with highest cosine similarities satisfying second predetermined relationship with second threshold
  Claim 12: Adds: augmented prompt comprises request for response to query based at least on retrieved augmentation information; and the retrieved augmentation information
  Claim 13: Adds: generate first feature vector comprises encode query into low-dimensional dense vector using GPT-based or BERT-based encoder; second feature vectors generated by encoding augmentation information into low-dimensional dense vectors
  Claim 14: Adds: pieces of augmentation information comprise at least one of domain-specific, entity-specific, product-specific, recent information unavailable at LLM generation, or information changed after LLM generation
Claim 15 (independent): Computer-readable storage medium with computer-executable instructions to perform same steps as Claim 1 when executed by a processor
  Claim 16: Adds: instructions further cause processor to provide user interface for domain-specific querying; query received from user; response provided to user through UI
  Claim 17: Adds: compare comprises determining cosine similarities between at least a portion of first feature vector and corresponding portions of plurality of second feature vectors
  Claim 18: Further: determine second feature vectors having cosine similarity satisfying first predetermined relationship with first threshold; or first predetermined number with highest cosine similarities; or second predetermined number with highest cosine similarities satisfying second predetermined relationship with second threshold
  Claim 19: Adds: augmented prompt comprises request for response to query based at least on retrieved augmentation information; and the retrieved augmentation information
  Claim 20: Adds: generate first feature vector comprises encode query into low-dimensional dense vector using GPT-based or BERT-based encoder; second feature vectors generated by encoding augmentation information into low-dimensional dense vectors using GPT-based or BERT-based encoder
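The three selection modes recited in Claims 4, 11, and 18 (threshold, top-N, and top-N with threshold) can be sketched as follows. This is an illustrative reading, not the filing's implementation: the function names are hypothetical, and "greater than or equal to" is assumed as the "predetermined relationship", which the claims do not fix.

```python
import math
from typing import List, Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_by_threshold(scores: List[float], threshold: float) -> List[int]:
    # Mode 1: every vector whose similarity meets the threshold (assumed >=).
    return [i for i, s in enumerate(scores) if s >= threshold]

def select_top_n(scores: List[float], n: int) -> List[int]:
    # Mode 2: the n vectors with the highest similarity, best first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n]

def select_top_n_above_threshold(scores: List[float], n: int, threshold: float) -> List[int]:
    # Mode 3: at most n vectors, each of which also meets the threshold.
    return [i for i in select_top_n(scores, n) if scores[i] >= threshold]

query_vector = [1.0, 0.0]
second_vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [-1.0, 0.0]]
scores = [cosine_similarity(query_vector, v) for v in second_vectors]

print(select_by_threshold(scores, 0.5))              # indices meeting the threshold
print(select_top_n(scores, 3))                       # three closest vectors
print(select_top_n_above_threshold(scores, 3, 0.5))  # closest vectors that also meet it
```

The combined mode matters when the store has few good matches: plain top-N would still return three items here, including one with zero similarity, while the thresholded variant returns only the two relevant ones.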
Metric: this application (software / cloud norm)
Total claims: 20 (norm: 15 to 25)
Independent claim count: 3 (norm: 2 to 4)
Dependent : Independent ratio: 5.67 : 1 (norm: 4 to 8 : 1)
Method claims present? Yes, Claim 1 (norm: always)
System / apparatus claims? Yes, Claim 8 (norm: always)
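The ratio in the table follows directly from the dependency tree. A short Python check, with the claim grouping transcribed from the tree above and the norm ranges taken from the table (the variable names are illustrative only):

```python
# Dependent claims grouped under each independent claim, per the dependency tree.
claim_families = {
    1: [2, 3, 4, 5, 6, 7],
    8: [9, 10, 11, 12, 13, 14],
    15: [16, 17, 18, 19, 20],
}

independent = len(claim_families)
dependent = sum(len(children) for children in claim_families.values())
total = independent + dependent
ratio = dependent / independent

print(f"{total} claims: {dependent} dependent / {independent} independent = {ratio:.2f} : 1")

# Norm ranges as stated in the metrics table.
within_total_norm = 15 <= total <= 25
within_ratio_norm = 4 <= ratio <= 8
print(within_total_norm, within_ratio_norm)
```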
Drafting Quality
Drafting Quality Signals
The claim set demonstrates solid structural coverage through a tripartite independent claim architecture (method, system, CRM) with GPT/BERT-specific encoder limitations providing meaningful technical anchoring in Claims 6, 13, and 20. However, the near-verbatim mirroring of dependent claim language across all three independent claim families — compare Claims 2/9/16, Claims 3/10/17, and Claims 4/11/18 — squanders dependent claim slots that could have introduced distinct technical fallback positions rather than tripling the same limitations.
Antecedent Basis
The claim language demonstrates clean antecedent basis throughout all 20 claims. Claim 1 introduces "a first feature vector" then consistently references "the first feature vector" in subsequent limitations; "a plurality of second feature vectors" is properly introduced before "the plurality of second feature vectors" and "the determined subset of second feature vectors" appear. Claims 8 and 15 replicate this pattern without deviation. No dangling "the" references or unintroduced terms were identified in any claim.
Every independent claim limitation maps directly to a specific figure and paragraph. FIG. 3 steps 302–312 and ¶[0049]–[0054] support Claim 1's core process steps. FIG. 4A/4B and ¶[0056]–[0057] support the first/second feature vector generation limitations. FIG. 5 and ¶[0059] support the cosine similarity comparison in Claims 3, 10, and 17. FIG. 7 and ¶[0065]–[0068] support the file pre-processing and tokenization limitations embedded in the additional embodiments sections at ¶[0100], ¶[0108].
All three independent claims (1, 8, 15) use "comprising" as the transition, which is the strategically correct open-ended choice for software/AI claims — it prevents a competitor from designing around by simply adding an additional processing step. The system claim (Claim 8) uses "comprising" at both the system level and for the program code sub-limitations, maintaining open-ended coverage. No missed opportunity to use "comprising" in place of more restrictive language was identified.
No explicit "means for" language appears in any claim, so the presumption that §112(f) applies is not triggered. However, the system claim (Claim 8) recites "a memory device that stores program code structured to cause the processor to" perform a list of functional steps. This functional-result language could attract §112(f)-adjacent scrutiny if an examiner argues the "program code structured to" formulation is a nonce word equivalent. The specification adequately supports this with FIG. 2's structural components (encoder 204, comparator 208, retriever 210, etc.) but the claim itself lacks structural recitation of those components.
All three independent claims (1, 8, 15) face meaningful Alice/Mayo exposure: the abstract idea of "retrieving relevant information to supplement a query" is a fundamental practice predating AI. The primary §101 defense resides in the feature vector comparison mechanism, specifically the cosine similarity computation recited in dependent Claims 3, 10, and 17 and the GPT/BERT encoder limitation in dependent Claims 6, 13, and 20. These technical anchors appear only in dependent claims, not in the independent claims that would be tested first in examination or litigation. An examiner applying the Alice two-step could readily characterize Claims 1, 8, and 15 as directed to the abstract idea of information retrieval-augmented querying without the specific technical implementation detail.
The 17 dependent claims are substantially tripled across three independent claim families rather than introducing genuinely distinct fallback positions. Claims 2, 9, and 16 are virtually identical (UI limitation); Claims 3, 10, and 17 are virtually identical (cosine similarity); Claims 4, 11, and 18 are virtually identical (threshold/top-N selection logic). Claims 6, 13, and 20 add valuable GPT/BERT encoder specificity that strengthens §101 defense. The most structurally distinct dependent claims are Claim 7 (augmentation information type taxonomy) and the implied file-preprocessing chain in ¶[0100]/[0108], but the pre-processing/tokenization pipeline visible in FIG. 7 is disclosed only in the spec's additional embodiments section and never claimed as a dependent claim.
An examiner reading only the abstract may identify the RAG concept but miss the specific technical mechanism that differentiates this patent from prior RAG disclosures. The abstract accurately describes the functional pipeline (feature vector generation, comparison, retrieval, augmented prompt construction, LLM response) but omits the cosine similarity computation and the GPT/BERT encoder specificity that constitute the strongest patentable distinction. An examiner relying on the abstract alone would likely classify this as a generic RAG system without recognizing the specific vector comparison approach that the dependent claims protect.
Figure coverage is comprehensive for the core pipeline. FIG. 2 supports all structural components of the response generator (pre-processor 202, encoder 204, comparator 208, retriever 210, prompt generator 212, LLM 214). FIGS. 4A/4B directly support the first and second feature vector generation limitations. FIGS. 5, 6A, 6B, 6C collectively support the three selection modes recited in Claims 4, 11, and 18. FIG. 7 supports the file ingestion pipeline. The one gap is the absence of a figure specifically illustrating the prompt structure described in ¶[0022]–[0026] (context, content, question) that is recited in Claims 5, 12, and 19, leaving the augmented prompt composition limitation without dedicated figure support.
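Since no figure shows the prompt layout, a small Python sketch may help make the context, content, question composition concrete. The function name, the instruction wording, and the bullet layout are all assumptions made for illustration; the filing's actual prompt text is not reproduced here.

```python
from typing import List

def build_augmented_prompt(query: str, retrieved: List[str]) -> str:
    """Compose a prompt with three parts: context, content, question."""
    # Context: the request to answer based on the retrieved information.
    context = "Answer the question using only the content provided below."
    # Content: the retrieved pieces of augmentation information.
    content = "\n".join(f"- {piece}" for piece in retrieved)
    # Question: the original user query, unchanged.
    return f"Context: {context}\nContent:\n{content}\nQuestion: {query}"

prompt = build_augmented_prompt(
    "What is the refund window?",
    ["Refund requests are accepted within 30 days.", "Refunds are issued to the original payment method."],
)
print(prompt)
```

Both elements recited in Claims 5, 12, and 19 appear in the result: the request to respond based on the retrieved information (the context line) and the retrieved information itself (the content block).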
Scorecard
Strategic Intent Scorecard
Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.
Scores (out of 5.0):
Claim Breadth: 3
Prosecution Defensibility: 2.8
Spec–Claim Consistency: 4.2
Dependent Claim Coverage: 2.5
Claim Type Diversity: 4.5
Figure Support Quality: 4
Key observation: The highest-scoring dimension is Claim Type Diversity (4.5/5.0). The tripartite independent claim structure across method (Claim 1), system (Claim 8), and CRM (Claim 15) provides the full enforcement spectrum for a software patent, ensuring coverage whether a competitor implements the technology as a process, a product, or distributes it as stored code.
The lowest-scoring dimension is Dependent Claim Coverage (2.5/5.0). The structural weakness is that 15 of 17 dependent claims are direct parallels of the same 5 limitations mirrored across three independent claim families, consuming claim budget without adding distinct fallback positions; a stronger filing would have used those slots to claim the specific pre-processing and tokenization pipeline of FIG. 7, re-ranking mechanisms, and multi-modal query handling disclosed in the specification but never claimed.
Practitioners reading this filing should note that the §101 defense currently depends on dependent Claims 6, 13, and 20 (GPT/BERT encoder specificity) rather than the independent claims, creating significant vulnerability if examination forces abandonment of those dependent claims.
A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.
GAP 01 · HIGHEST IMPACT
§101-critical technical anchors confined to dependent claims only
Claims 1, 8, and 15 recite the RAG pipeline at a purely functional level — "generating a first feature vector," "comparing," "retrieving" — without specifying any technical mechanism in the independent claim body, leaving all GPT/BERT encoder specificity and cosine similarity computation in dependent Claims 6/13/20 and 3/10/17 respectively. This creates a critical §101 exposure: if an examiner characterizes Claims 1, 8, and 15 as directed to the abstract idea of query-augmented information retrieval under Alice Step 1, Microsoft must rely on dependent claim amendments to inject technical content, potentially narrowing claim scope dramatically. A stronger filing would have incorporated at least the "low-dimensional dense vector" encoding mechanism and cosine similarity comparison into the body of each independent claim, preserving §101 eligibility while maintaining relatively broad coverage.
GAP 02 · HIGH IMPACT
FIG. 7 ingestion pipeline disclosed but never claimed
The specification at ¶[0065]–[0068] and FIG. 7 discloses a distinct four-step augmentation information ingestion pipeline (receive file, pre-process to remove markup and extract metadata, tokenize, generate feature vectors) that constitutes a commercially valuable method for building the vector store used by the RAG system. This pipeline is referenced in spec paragraphs ¶[0100] and ¶[0108] as additional embodiments but never appears as a dependent claim in any of the three independent claim families, and no independent claim directed to the ingestion method was filed. A competitor could implement the same downstream query/retrieval system using a substantially different ingestion pipeline and avoid infringement entirely, or could file their own patent on the ingestion method covering ground Microsoft disclosed but did not claim.
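The four disclosed ingestion steps (receive file, pre-process to remove markup and extract metadata, tokenize, generate feature vectors) can be sketched in Python. This is a toy illustration under stated assumptions: the input is assumed to be HTML-like text, the only metadata extracted is a `<title>`, and `embed` is a hashed bag-of-words stand-in for the GPT-based or BERT-based encoder the filing describes. All function names are hypothetical.

```python
import re
from html import unescape
from typing import Dict, List, Tuple

def preprocess(raw: str) -> Tuple[str, Dict[str, str]]:
    """Remove markup and pull out simple metadata (here, only a <title>)."""
    flags = re.IGNORECASE | re.DOTALL
    title = re.search(r"<title>(.*?)</title>", raw, flags=flags)
    metadata = {"title": unescape(title.group(1).strip()) if title else ""}
    body = re.sub(r"<title>.*?</title>", " ", raw, flags=flags)
    text = unescape(re.sub(r"<[^>]+>", " ", body))
    return " ".join(text.split()), metadata

def tokenize(text: str) -> List[str]:
    """Lowercase alphanumeric tokens; a real tokenizer would be model-specific."""
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(tokens: List[str], dims: int = 8) -> List[float]:
    """Stand-in encoder: deterministic hashed bag of words, NOT a GPT/BERT model."""
    vector = [0.0] * dims
    for token in tokens:
        vector[sum(token.encode("utf-8")) % dims] += 1.0
    return vector

def ingest(raw: str, dims: int = 8) -> Dict[str, object]:
    # Receive file contents, pre-process, tokenize, then generate the feature vector.
    text, metadata = preprocess(raw)
    tokens = tokenize(text)
    return {"text": text, "metadata": metadata, "vector": embed(tokens, dims)}

record = ingest("<html><title>Returns</title><body><p>Refunds within 30 days.</p></body></html>")
print(record["metadata"], record["text"], record["vector"])
```

Each record pairs the cleaned text with its vector, which is the shape the query-time comparison step needs. A competitor could swap any of these stages (for example, a different tokenizer or chunking scheme) without touching the claimed query pipeline, which is the design-around risk described above.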
GAP 03 · HIGH IMPACT
No claim coverage for LLM prioritization behavior
US 2024/0346256 A1 covers a method, system, and computer-readable storage medium for augmenting a large language model using retrieval-augmented generation (RAG). The patent addresses the problem of LLMs generating inaccurate or stale responses by encoding a user query into a feature vector, comparing it via cosine similarity to pre-computed vectors representing stored augmentation information, retrieving the most semantically similar augmentation data, and providing an augmented prompt combining the query and retrieved data to the LLM before generating a response.
US 2024/0346256 A1 is owned by Microsoft Technology Licensing, LLC, located in Redmond, Washington, US. The sole named inventor is Yinghua QIN, also of Redmond, Washington, US.
Claim 1 is a method claim covering the core RAG process: receiving a query, generating a first feature vector, comparing it to second feature vectors to identify augmentation information, retrieving that information, providing an augmented prompt to an LLM, and receiving the LLM's response. Claim 8 is a system/apparatus claim covering a processor and memory device with program code structured to perform the same operational steps. Claim 15 is a computer-readable storage medium (CRM) claim covering computer-executable instructions that, when executed by a processor, perform the same RAG pipeline steps.
This patent covers technology that makes AI chatbots and language models more accurate by supplementing their knowledge with up-to-date information at the time a question is asked. When a user submits a query, the system converts it into a numerical representation and compares that representation against a pre-built library of numerical representations of relevant documents or data — using a mathematical similarity measure called cosine similarity — to find the most relevant pieces of information. That retrieved information is then bundled with the original question in a specially constructed prompt sent to the AI model, steering it toward a more accurate and current answer rather than relying solely on potentially outdated training data.
G06F 40/40 (2020.01) — Natural language processing; Semantic processing, e.g. classification, natural language processing, NLP. G06F 40/40 (2006.01) — same classification under the earlier edition of the CPC scheme.
Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.