
Patent Drafting Analysis of QUALCOMM’s Hybrid Machine Learning Architecture with NPU and CIM Processing Elements | US 2023/0025068 A1

IP Drafting Analysis · US 2023/0025068 A1

Patent Drafting Analysis of QUALCOMM's Hybrid NPU and Compute-In-Memory Neural Network Architecture | US 2023/0025068 A1

A structural and strategic analysis of QUALCOMM's hybrid CIM/NPU machine learning circuit patent, examining claim architecture, drafting quality signals, §101 eligibility posture, critical prosecution gaps, and competitive design-around vulnerabilities.

US 2023/0025068 A1 · Filed: Jul 20, 2022 · Published: Jan 26, 2023 · G06N 3/063 · G06N 3/04 · G06F 15/80
Spec Words: 9,200 (across 6 sections)
Total Claims: 30 (3 independent · 27 dependent)
Figure Sheets: 15 (neural nets, CIM/NPU circuits, hybrid architecture, dataflow)
Published by PatSnap Insights Team · 12 min read · Verified by PatSnap Eureka Data
Overview

Structural Overview

The detailed description dominates at approximately 63% of total words (~5,800 of ~9,200), with a rich background section (~1,120 words) providing substantial context for the memory-wall problem motivating the hybrid CIM/NPU approach. The claim set comprises 30 claims total — 3 independent (Claims 1, 26, and 30) and 27 dependent — reflecting a tripartite structure spanning circuit, method, and processing system formats. The 15 figure sheets provide comprehensive coverage of neural network types, DCIM array architecture, CIM cell circuits, NPU architecture, hybrid dataflow, and an example device system.

Section Word Distribution

Detailed Desc.: 5,800 words
Claims: 2,750 words
Summary: 890 words
Background: 1,120 words
Brief Desc.: 820 words
Abstract: 145 words

Figure Inventory — 15 Sheets

Figure | Description | Role
FIG. 1A | Depicts a fully connected neural network (102) showing every node in a first layer communicating output to every node in the second layer. | Other
FIG. 1B | Illustrates a locally connected neural network (104) with local area nodes (110, 112, 114, 116) showing limited connectivity between layers. | Other
FIG. 1C | Shows a convolutional neural network (CNN) (106) with shared kernel weights (108) applied across spatial locations of an input image. | Other
FIG. 1D | Illustrates a deep convolutional network (DCN) (100) for visual feature recognition showing feature-extraction layers (118, 120) and classification layers (122, 124, 128) fed by a camera (130). | Other
FIG. 2 | Depicts a traditional convolution operation where a 12×12×3-channel input image (202) is convolved with a 5×5×3 kernel (204) to produce an 8×8×1-channel feature map (206). | Other
FIG. 3A | Shows a depthwise separable convolution (spatial fusion) where a 12×12×3 input (302) is convolved with three 5×5×1 kernels (304A-C) to yield an 8×8×3 feature map (306). | Other
FIG. 3B | Illustrates a pointwise convolution (channel fusion) where an 8×8×3 input (306) is convolved with a 1×1×3 kernel (308) to produce an 8×8×1 feature map (310). | Other
FIG. 4 | Block diagram of a digital compute-in-memory (DCIM) circuit (400) showing the CIM array (401) with word-lines (404), columns (406), adder trees (410), weight-shift adder tree (412), and activation-shift accumulator (416) with output. | Key embodiment
FIG. 5 | Circuit diagram of an 8-transistor (8T) SRAM CIM cell (500) showing the cross-coupled inverter pair (524), pass-gate transistors (502, 518), write word-line (WWL, 504), read word-line (RWL, 508), and read bit-line (RBL, 522). | Key embodiment
FIG. 6 | Block diagram of a neural processing unit (NPU) architecture (600) showing MAC units (612), adder tree (614), 32-bit accumulator register (616), digital post-processing logic (608), and Weight/Activation/Output TCM buses (602, 604, 610). | Key embodiment
FIG. 7A | Block diagram of a hybrid architecture (700) with DCIM PEs (702) and NPU PEs (703) sharing Weight TCM (706), Activation TCM (708), Output TCM (710), bus arbitration (712), digital post-processing logic (713), and global memory (704) via PE bus (716). | System architecture
FIG. 7B | Block diagram of a second hybrid architecture (750) featuring DCIM PEs (702) and NPU PEs (703) with FIFO circuits (728, 730), tile mappers (722, 724), bit-serial interleaver (726), weight buffer (720), and bus arbitration (712) for enabling pipelined data exchange. | System architecture
FIG. 8 | Comparison table (800) contrasting DCIM TOPS versus NPU TOPS performance across eight input/depth/kernel combinations (000–111), showing relative strengths for light versus heavy workloads. | Other
FIG. 9 | Data exchange block diagram (900) between a DCIM PE (902) and NPU PE (904) using digital post-processing (DPP) plus FIFO units (906, 908) in a hybrid architecture, illustrating pipelined first-output-batch generation. | Claim support
FIG. 10 | Output-stationary mapping scheme (1000) for NPU PEs showing depth cycles in Z-space, kernel cycles, and parallel inputs (Input 1, Input 2) with corresponding kernel and output tensor dimensions (Ni, No, KX, KY). | Claim support
FIG. 11 | Pseudo-weight-stationary mapping scheme (1100) for DCIM PEs showing parallel inputs, depth cycles with DCIM weight writes, partial sum registers (PS1, PSR1, PSW1) and output registers (OUT1, OUTN) across depth cycles 1 through 4. | Claim support
FIG. 12 | Flow diagram (1200) of neural network processing operations showing two steps: processing data in a neural network circuit (block 1205) comprising CIM PEs, NPU PEs, and a bus; and transferring processed data between CIM PEs and NPU PEs via the bus (block 1210). | Flow diagram
FIG. 13 | Block diagram of an electronic device (1300) integrating the hybrid neural network (1307) with NPU PEs (1308) and CIM PEs (1309) alongside CPU (1302), GPU (1304), DSP (1306), wireless connectivity (1312), sensors (1316), and memory (1324). | System architecture
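
The tensor shapes quoted for FIGs. 2, 3A, and 3B can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes stride 1 and no padding (which the 12→8 reduction with a 5×5 kernel implies), and the function names and the 256-output-channel comparison are hypothetical, not taken from the patent.

```python
def conv_output_size(in_size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one dimension."""
    return (in_size + 2 * padding - kernel) // stride + 1


def standard_conv_macs(size: int, c_in: int, kernel: int, c_out: int) -> int:
    """Multiply-accumulate count for a standard convolution (FIG. 2 style)."""
    out = conv_output_size(size, kernel)
    return out * out * kernel * kernel * c_in * c_out


def depthwise_separable_macs(size: int, c_in: int, kernel: int, c_out: int) -> int:
    """MAC count for a depthwise pass (FIG. 3A) followed by a pointwise pass (FIG. 3B)."""
    out = conv_output_size(size, kernel)
    depthwise = out * out * kernel * kernel * c_in  # one k x k x 1 kernel per input channel
    pointwise = out * out * c_in * c_out            # 1 x 1 x c_in kernel per output channel
    return depthwise + pointwise


# Shapes quoted in the figure inventory: 12x12x3 input, 5x5 kernel.
out_size = conv_output_size(12, 5)               # 8, matching the 8x8 feature maps
single = standard_conv_macs(12, 3, 5, 1)         # one output channel, as in FIG. 2
separable = depthwise_separable_macs(12, 3, 5, 1)

# Hypothetical wider layer (256 output channels is an assumption for illustration).
wide_standard = standard_conv_macs(12, 3, 5, 256)
wide_separable = depthwise_separable_macs(12, 3, 5, 256)
```

With a single output channel the separable form costs slightly more than the standard form; the saving appears only as the number of output channels grows, which is why the two-step form is usually discussed for wide layers.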
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims

Claim Architecture Analysis

The claim set contains 3 independent claims: Claim 1 (circuit/apparatus), Claim 26 (method), and Claim 30 (processing system), providing tripartite coverage across hardware, process, and system formats. The dependent-to-independent ratio is 9.00:1 (27 dependent claims over 3 independent claims), above the semiconductor/AI hardware norm of 4–8:1, which is consistent with a strategy of building layered fallback positions through extensive claim narrowing. The dependents are unevenly distributed: Claim 1 anchors 24 of them (Claims 2–25), addressing shared memory resources, TCM configuration, same-layer/adjacent-layer deployment, pseudo-weight-stationary mapping, DCIM configuration, output-stationary NPU mapping, bus arbitration, and FIFO circuit variants, while Claim 26 carries only three (Claims 27–29), covering memory bypass, digital post-processing, and the mapping schemes, and Claim 30 has none.

Core inventive concept: The claims address the trade-off between CIM's energy efficiency and NPU's throughput advantage by coupling a plurality of CIM processing elements and a plurality of NPU processing elements to a shared bus, enabling direct data transfer between them — as recited in Claim 1: "a bus coupled to the plurality of CIM PEs and to the plurality of NPU PEs." The inventive mechanism — transferring processed data between at least one CIM PE and at least one NPU PE via this shared bus without necessarily writing to global memory (Claims 10, 15, 27) — enables cascaded pipeline operation that avoids the memory-wall bottleneck of conventional separate CIM and NPU architectures.
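
To make the memory-bypass idea concrete, the following toy model contrasts a hand-off through global memory with a direct hand-off over a shared bus. It is a sketch under loud assumptions: the class names, the FIFO-backed bus, and the stand-in `cim_stage` and `npu_stage` functions are all hypothetical and say nothing about how the patent's circuits are actually implemented.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GlobalMemory:
    """Hypothetical global memory that counts its accesses."""
    store: dict = field(default_factory=dict)
    writes: int = 0
    reads: int = 0

    def write(self, key, value):
        self.writes += 1
        self.store[key] = value

    def read(self, key):
        self.reads += 1
        return self.store[key]


@dataclass
class SharedBus:
    """Hypothetical PE-to-PE bus, modeled as a simple FIFO."""
    fifo: deque = field(default_factory=deque)
    transfers: int = 0

    def send(self, value):
        self.transfers += 1
        self.fifo.append(value)

    def receive(self):
        return self.fifo.popleft()


def cim_stage(batch):
    """Stand-in for work done on a CIM PE (illustrative arithmetic only)."""
    return [2 * v for v in batch]


def npu_stage(batch):
    """Stand-in for work done on an NPU PE (illustrative arithmetic only)."""
    return [v + 1 for v in batch]


def run_via_global_memory(batches, mem):
    """CIM output is written to global memory, then read back by the NPU stage."""
    outputs = []
    for i, batch in enumerate(batches):
        mem.write(i, cim_stage(batch))
        outputs.append(npu_stage(mem.read(i)))
    return outputs


def run_via_bus(batches, bus):
    """CIM output is handed to the NPU stage over the bus, bypassing global memory."""
    outputs = []
    for batch in batches:
        bus.send(cim_stage(batch))
        outputs.append(npu_stage(bus.receive()))
    return outputs


batches = [[1, 2], [3, 4], [5, 6]]
mem, bus = GlobalMemory(), SharedBus()
via_memory = run_via_global_memory(batches, mem)
via_bus = run_via_bus(batches, bus)
```

Both paths compute the same result; the difference in this model is only that the bus path records three bus transfers and no global-memory accesses, while the memory path records three writes and three reads.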

Independent Claim Dissection

Claim | Preamble | Transition | Key Body Elements
Claim 1 | A neural-network-processing circuit | comprising | a plurality of CIM processing elements (PEs); a plurality of NPU PEs; and a bus coupled to the plurality of CIM PEs and to the plurality of NPU PEs
Claim 26 | A method for neural network processing | comprising | processing data in a neural-network-processing circuit comprising a plurality of CIM PEs, a plurality of NPU PEs, and a bus coupled to the plurality of CIM PEs and to the plurality of NPU PEs; and transferring the processed data between at least one of the plurality of CIM PEs and at least one of the plurality of NPU PEs via the bus
Claim 30 | A processing system | comprising | a plurality of CIM PEs; a plurality of NPU PEs; a bus coupled to the plurality of CIM PEs and to the plurality of NPU PEs; a memory having computer-executable instructions stored thereon; and one or more processors configured to execute the computer-executable instructions stored thereon to transfer processed data between at least one of the plurality of CIM PEs and at least one of the plurality of NPU PEs via the bus

Claim Dependency Tree

1. Circuit comprising CIM PEs, NPU PEs, and a shared bus coupling both PE types
2. Adds: one or more shared memory resources coupled to both CIM PEs and NPU PEs
3. Adds: shared memory resources comprise a tightly coupled memory (TCM)
4. Adds: TCM configured to store activations, weights, or outputs
5. Adds: at least one CIM PE configured to transfer data to at least one NPU PE (dep. on claim 3)
6. Adds: CIM PE transfers data to NPU PE without writing to or reading from TCM (dep. on claim 5)
7. Adds: at least one NPU PE configured to transfer data to at least one CIM PE (dep. on claims 3-6)
8. Adds: NPU PE transfers data to CIM PE without writing to or reading from TCM (dep. on claim 7)
9. Adds: at least one CIM PE configured to transfer data to at least one NPU PE (dep. on claims 1-4)
10. Adds: CIM PE transfers data to NPU PE without writing to or reading from global memory (dep. on claim 9)
11. Adds: CIM PE is in a same neural network layer as the NPU PE (dep. on claim 9)
12. Adds: CIM PE is in a first layer and NPU PE is in a second, different layer (dep. on claim 9)
13. Adds: second neural network layer is adjacent to the first neural network layer (dep. on claim 12)
14. Adds: at least one NPU PE configured to transfer data to at least one CIM PE (dep. on claims 1-6, 9)
15. Adds: NPU PE transfers data to CIM PE without writing to or reading from global memory (dep. on claim 14)
16. Adds: NPU PE is in a same neural network layer as the CIM PE (dep. on claim 14)
17. Adds: NPU PE is in a first layer and CIM PE is in a second, different layer (dep. on claim 14)
18. Adds: second neural network layer is adjacent to the first neural network layer (dep. on claim 17)
19. Adds: CIM PEs configured as pseudo-weight-stationary PEs (dep. on claim 1)
20. Adds: CIM PEs configured as digital compute-in-memory (DCIM) PEs (dep. on claim 1)
21. Adds: NPU PEs configured as output-stationary PEs (dep. on claim 1)
22. Adds: bus arbitration logic coupled between the bus and both CIM PEs and NPU PEs (dep. on claim 1)
23. Adds: digital processing circuit coupled between bus arbitration logic and both PE types (dep. on claim 22)
24. Adds: first FIFO circuit between digital processing circuit and CIM PEs; second FIFO circuit between digital processing circuit and NPU PEs (dep. on claim 23)
25. Adds: first FIFO circuit between bus arbitration logic and CIM PEs; second FIFO circuit between bus arbitration logic and NPU PEs (dep. on claim 22)
26. Method: processing data in circuit with CIM PEs, NPU PEs, and bus; transferring processed data between CIM PE and NPU PE via bus
27. Adds: circuit further comprises global memory or TCM; transferring comprises bypassing global memory or TCM (dep. on claim 26)
28. Adds: digitally post-processing the data in a digital processing circuit before transferring via the bus (dep. on claim 26)
29. Adds: CIM PEs configured as pseudo-weight-stationary PEs; NPU PEs configured as output-stationary PEs (dep. on claim 26)
30. Processing system comprising CIM PEs, NPU PEs, bus, memory with instructions, and processor executing instructions to transfer data between CIM PE and NPU PE via bus

Metric | This Application | Semiconductor / AI Hardware Norm
Total claims | 30 | 15 – 25
Independent claim count | 3 | 2 – 4
Dependent : Independent ratio | 9.00 : 1 | 4 – 8 : 1
Method claims present? | Yes (Claim 26) | Common
System / apparatus claims? | Yes (Claims 1 and 30) | Always
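
The headline metrics above follow directly from the claim dependency tree. As a minimal sketch, assuming only what this analysis states (Claims 2–25 trace back to Claim 1, Claims 27–29 to Claim 26, and Claim 30 stands alone), the counts can be reproduced as follows; the variable names and the `NORM_RATIO` bounds taken from the metrics table are illustrative choices, not part of any official tool.

```python
from collections import Counter

# Independent claims named in the analysis.
INDEPENDENT = {1, 26, 30}

# Root independent claim for each dependent claim, per the dependency tree.
ROOT_OF = {claim: 1 for claim in range(2, 26)}
ROOT_OF.update({claim: 26 for claim in range(27, 30)})

# Norm quoted in the metrics table (dependents per independent claim).
NORM_RATIO = (4, 8)

total_claims = len(INDEPENDENT) + len(ROOT_OF)
ratio = len(ROOT_OF) / len(INDEPENDENT)
dependents_per_root = Counter(ROOT_OF.values())
above_norm = ratio > NORM_RATIO[1]

for root in sorted(INDEPENDENT):
    print(f"Claim {root}: {dependents_per_root[root]} dependent claims")
print(f"Total claims: {total_claims}, ratio {ratio:.2f}:1, above norm: {above_norm}")
```

Breaking the count down per independent claim shows that the 9.00:1 average is driven almost entirely by the Claim 1 family.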
Drafting Quality

Drafting Quality Signals

The drafting quality is strong in structural breadth: Claim 1's three-element body (CIM PEs, NPU PEs, and a bus) is intentionally minimal, and the dependent chain in Claims 2–25 builds extensive layered fallback across memory-bypass, same-layer/cross-layer deployment, DCIM/pseudo-weight-stationary configuration, and FIFO buffering. The primary weakness is the absence of a computer-readable medium (CRM) claim type, which creates a litigation coverage gap that a competitor could exploit by distributing the hybrid architecture as firmware or a compiled FPGA bitstream rather than as a physical circuit or a running method.

Antecedent Basis
Antecedent basis is consistently maintained across all 30 claims. Claim 1 introduces "a plurality of CIM processing elements (PEs)," "a plurality of NPU PEs," and "a bus" using the indefinite article, and each subsequent dependent claim correctly references "the plurality of CIM PEs," "the plurality of NPU PEs," and "the bus" throughout Claims 2–25. Claim 26 independently re-introduces these elements in its method preamble, and Claims 27–29 depend on Claim 26 with correct "the" references. No orphaned antecedents or ambiguous referents were identified.
Spec–Claim Consistency
All three limitations of independent Claim 1 map to specific figures and paragraphs. The CIM PE limitation is directly supported by FIG. 4 (DCIM circuit 400) and ¶[0061]–[0071]; the NPU PE limitation is supported by FIG. 6 (NPU architecture 600) and ¶[0076]–[0077]; and the shared bus limitation is supported by FIG. 7A (PE bus 716) and ¶[0084]. The memory-bypass limitation in Claims 10 and 15 is supported by ¶[0101] and FIG. 12. No claim limitation lacks specification support.
Transition Word Usage
All independent and dependent claims use "comprising," the broadest open-ended transition, which is strategically appropriate for hardware circuit and system claims in this semiconductor art unit. The use of "comprising" in Claims 1, 26, and 30 correctly leaves open the possibility that the circuit or system contains additional processing elements (e.g., DSP blocks, CPUs) beyond the claimed CIM PEs and NPU PEs. The claim set includes no narrower "consisting of" fallback, and none is needed given the open-ended transition strategy.
§112(f) Means-Plus-Function Risk
No means-plus-function language appears in the formal claims filed with this application. The spec's ¶[0147] explicitly states that "no claim element is to be construed under the provisions of 35 U.S.C. §112(f) unless the element is expressly recited using the phrase 'means for'" — a boilerplate protective statement. Claims 22–25 use "bus arbitration logic" and "digital processing circuit," which are structural noun phrases rather than "means for" formulations, though ¶[0146] notes that operations may have "corresponding counterpart means-plus-function components." This mild tension is unlikely to create §112(f) vulnerability given the explicit disavowal.
⚠️ §101 Eligibility Risk
Claims 1 and 30 recite concrete hardware elements: physical CIM PEs (implemented as SRAM cells per FIG. 5), physical NPU PEs with MAC units (FIG. 6), and a physical bus. These provide a strong §101 defense under the machine-or-transformation test. However, Claim 30 also includes "memory having computer-executable instructions" and "one or more processors configured to execute" language that mirrors CRM-style claims and could invite an Alice analysis if the examiner characterizes the claimed configuration as merely a generic computer executing an abstract idea. Claim 26 (method) ties its steps to hardware only through the "circuit comprising" recitation and the transfer "via the bus," leaving it somewhat vulnerable if a future examiner applies the Alice/Mayo framework to the "processing data" step in the abstract.
Dependent Claim Fallback Quality
The dependent claims provide genuinely distinct fallback positions covering six technical axes: shared memory type (Claims 2–4); data-transfer direction, CIM→NPU and NPU→CIM (Claims 5–10, 14–15); same-layer versus cross-layer deployment (Claims 11–13, 16–18); CIM configuration, namely pseudo-weight-stationary mapping (Claim 19) and DCIM implementation (Claim 20); NPU mapping, namely output-stationary (Claim 21); and bus arbitration plus FIFO buffering variants (Claims 22–25). Claims 6 and 10, together with their NPU→CIM counterparts in Claims 8 and 15, add particularly valuable fallback by narrowing to memory-bypass transfer, which is the core commercialization advantage of this architecture. Claims 27–29 mirror part of this structure for the method branch, which is efficient but adds some redundancy.
⚠️ Abstract Quality
The abstract states that "one example neural-network-processing circuit generally includes a plurality of CIM processing elements (PEs), a plurality of neural processing unit (NPU) PEs, and a bus coupled to the plurality of CIM PEs and to the plurality of NPU PEs" — accurately naming the hardware components. However, it omits the specific inventive mechanism that distinguishes this architecture from prior art: the ability to transfer processed data between CIM PEs and NPU PEs via the shared bus without writing to global memory (the limitation that appears in Claims 10, 15, and 27). An examiner reading only the abstract may characterize the invention as merely a parallel processor array rather than identifying the memory-bypass data transfer as the novel contribution.
Figure Support Quality
Figure support is comprehensive: every structural claim limitation has direct figure support. The CIM PE limitation maps to FIG. 4 (DCIM circuit) and FIG. 5 (8T SRAM cell); the NPU PE limitation maps to FIG. 6; the shared bus maps to FIG. 7A (PE bus 716) and FIG. 7B; the TCM shared memory maps to FIG. 7A (706, 708, 710); the bus arbitration logic maps to FIG. 7A (712) and FIG. 7B; the FIFO circuits map to FIG. 7B (728, 730); the digital post-processing circuit maps to FIG. 7A (713) and FIG. 9 (906, 908); and the output-stationary and pseudo-weight-stationary mappings are supported by FIGs. 10 and 11 respectively. FIG. 13 supports Claim 30 as a system-level embodiment.
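
For readers unfamiliar with the dataflow terms referenced above (FIGs. 10 and 11), the sketch below contrasts a generic output-stationary loop order with a generic weight-stationary one on a small matrix product. This is the textbook form of the two dataflows, not the patent's specific mapping: the "pseudo-weight-stationary" scheme of FIG. 11 is not reproduced here, and the function names and the weight-load counter are assumptions made for illustration.

```python
def output_stationary(acts, weights):
    """Each output accumulator stays in place while the depth dimension streams past it."""
    rows, depth, cols = len(acts), len(weights), len(weights[0])
    out = [[0] * cols for _ in range(rows)]
    weight_loads = 0
    for i in range(rows):
        for j in range(cols):
            acc = 0  # the stationary partial sum for output (i, j)
            for k in range(depth):
                acc += acts[i][k] * weights[k][j]
                weight_loads += 1  # a weight is fetched on every MAC
            out[i][j] = acc
    return out, weight_loads


def weight_stationary(acts, weights):
    """Each weight is loaded once and held while all inputs stream past it."""
    rows, depth, cols = len(acts), len(weights), len(weights[0])
    out = [[0] * cols for _ in range(rows)]
    weight_loads = 0
    for k in range(depth):
        for j in range(cols):
            w = weights[k][j]  # the stationary weight
            weight_loads += 1
            for i in range(rows):
                out[i][j] += acts[i][k] * w  # partial sums are updated in the output buffer
    return out, weight_loads


acts = [[1, 2, 3], [4, 5, 6]]         # 2 inputs, depth 3
weights = [[1, 0], [0, 1], [2, 2]]    # depth 3, 2 outputs
os_out, os_loads = output_stationary(acts, weights)
ws_out, ws_loads = weight_stationary(acts, weights)
```

Both orders produce the same outputs; what changes is which operand stays put. In this toy count the weight-stationary order fetches each weight once (6 loads) while the output-stationary order fetches a weight per MAC (12 loads), at the price of revisiting partial sums more often.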
Scorecard

Strategic Intent Scorecard

Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.

Claim Breadth: 4.2 / 5.0
Prosecution Defensibility: 4.0 / 5.0
Spec–Claim Consistency: 4.5 / 5.0
Dependent Claim Coverage: 4.3 / 5.0
Claim Type Diversity: 3.5 / 5.0
Figure Support Quality: 4.5 / 5.0

Key observation: Spec–Claim Consistency and Figure Support Quality are the strongest dimensions (both 4.5/5.0), reflecting Qualcomm's thorough mapping of every structural claim element to specific DCIM array figures (FIGs. 4 and 5), NPU architecture (FIG. 6), hybrid system diagrams (FIGs. 7A and 7B), and dataflow figures (FIGs. 9–11). The weakest dimension is Claim Type Diversity (3.5/5.0). Although the specification's ¶[0010] broadly describes "non-transitory, computer-readable media comprising instructions," no computer-readable medium (CRM) claim was filed. This leaves a significant enforcement gap against competitors who distribute the hybrid NPU/CIM configuration as firmware images or FPGA bitstreams without manufacturing a physical circuit or executing the method themselves. Practitioners reviewing this filing should consider whether a continuation with explicit CRM claims, and potentially narrowed apparatus claims directed to the DCIM-specific cell architecture of FIG. 5, would close that gap.
Critical Gaps

3 Critical Gaps in This Claim Set

A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.


1. No CRM claim filed
2. Memory-bypass only in dependents
3. No workload routing/scheduling claims

Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.
