Patent Drafting Analysis of Google LLC’s Efficient Robot Control Based on Inputs From Remote Client Devices | US 12,138,810 B2
IP Drafting Analysis · US 12,138,810 B2
A structural and strategic analysis of Google's robot manipulation parameter prediction patent, covering claim architecture, drafting quality, ML-robotics eligibility risk, dependency tree integrity, and prosecution positioning across 18 claims.
US 12,138,810 B2 · Filed: Aug 11, 2023 · Granted: Nov 12, 2024 · B25J 9/16 · G05B 2219/36422 · G06N 3/008
System architecture, VR UI, flowcharts, robot hardware
Published by PatSnap Insights Team · 12 min read · Verified by PatSnap Eureka Data
Overview
Structural Overview
The detailed description dominates at approximately 63% of total words (~6,200 words), providing extensive support for the multi-layer ML training pipeline and confidence-threshold gating mechanism. The claim set comprises 18 claims — 3 independent (Claims 1, 9, 17) and 15 dependent — structured across method and system types with a 5:1 dependent-to-independent ratio. The 13 drawing sheets provide strong operational flow coverage via FIGS. 3–6 and illustrative UI coverage via FIGS. 2A–2F, though hardware block diagrams (FIGS. 7–8) serve primarily as generic computer architecture support.
Figure Inventory — 13 Sheets
Figure | Description | Role
FIG. 1A | System-level environment showing robots 170A/170B, robotic vision components 174A/174B, additional vision component 194, system 110 with prediction engine 112, visual representation engine 114, manipulation parameters engine 116, training data engine 143, and remote client device 130. | System architecture
FIG. 1B | Data flow diagram illustrating interactions among machine learning models 165, prediction engine 112, visual representation engine 114, manipulation parameters engine 116, training data engine 143, and remote client device 130 input/output path. | Flow diagram
FIG. 2A | VR rendering at remote client device showing robot simulation 270A, object representation 292A of spatula, and virtual buttons 282A1–282A3 for defining path to grasp pose, using predefined path, and saving defined path. | UI/interface
FIG. 2B | VR rendering showing robot simulation 270B with operator-defined waypoints 289B1, 289B2, grasp pose 289B3, and object representation 292B of spatula, illustrating waypoint-based path definition UI. | UI/interface
FIG. 2C | VR rendering showing robot simulation 270C with operator-defined grasp pose only (289C1) and object representation 292C, illustrating the simplified grasp-pose-only definition UI interaction. | UI/interface
FIG. 2D | 2D touchscreen rendering of spatula object 292D showing swipe gesture interface 282D for defining antipodal grasp with contact points 289D1 and 289D2 on the object. | UI/interface
FIG. 2E | 2D rendering of spatula 292E with predicted antipodal grasp indication 288E and confirmation/alternate-grasp UI elements 282E1, 282E2, illustrating predicted parameter confirmation workflow. | Claim support
FIG. 2F | Bounding-box object representation 292F of spatula with dashed bounding shapes and antipodal grasp contact points 289F1, 289F2, illustrating less-accurate but data-efficient object representation for low-bandwidth transmission. | Claim support
FIG. 3 | Flowchart 300 illustrating the method of causing a robot to manipulate an object: receiving vision data (352), optionally generating predicted parameters (354), selecting remote client device (358), transmitting visual representation (360), receiving UI input data (362), determining manipulation parameters (364), and causing robot manipulation (366). | Flow diagram
FIG. 4 | Flowchart 400 illustrating training instance generation: identifying manipulation parameters and vision data (452), generating success measures from sensors (454), generating training instances (456), updating prediction model parameters (458), and iterating. | Flow diagram
FIG. 5 | Flowchart 500 for selectively utilizing trained prediction models: receiving vision data (552), selecting manipulation parameters (554), checking trained model availability (556), generating predicted parameters with confidence measures (560), gating on confidence thresholds (562), and causing robot manipulation (572). | Flow diagram
FIG. 6 | Flowchart 600 illustrating prediction model training, validation, and deployment: training on operator-guided attempts (652), validating by comparing predictions to ground truth (656), deploying model (660), and optionally further training during deployment (662). | Flow diagram
FIG. 7 | Block diagram of robot 725 architecture showing robot control system 760, operational components 740a–740n, and sensors 742a–742m including vision, pressure, proximity, and other sensors. | System architecture
FIG. 8 | Block diagram of computing device 810 showing storage subsystem 824 (memory 825, ROM 832, RAM 830, file storage 826), processor(s) 814, network interface 818, and user interface input/output devices 822/820. | Other
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO. Draft a Similar Patent
Claims
Claim Architecture Analysis
The patent presents 3 independent claims: Claims 1 and 9 are method claims (each covering a distinct configuration of the confidence-threshold gating and visual representation workflow), and Claim 17 is a system/apparatus claim; the absence of a computer-readable medium (CRM) claim is a notable structural gap. The 15 dependent claims yield a 5:1 ratio, which is at the lower end of the typical 4–8:1 norm for robotics/ML patents (B25J class) and provides moderate but not robust fallback coverage. The dual independent method claim structure (Claims 1 and 9 sharing significant overlap) suggests a prosecution-driven split rather than a strategic claim-type diversification.
Core inventive concept: The claims solve the problem of robotic idle time and excessive remote human operator input by using a machine learning model to generate predicted robot control parameters with a confidence measure, then selectively bypassing human confirmation (controlling the robot directly per the predicted parameter) when confidence satisfies a threshold, while falling back to transmitting a visual object representation and predicted parameter indication to a remote client device for human UI input when confidence is insufficient — as recited across Claims 1, 9, and 17.
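The two-tier gating described above can be illustrated with a minimal Python sketch. It is not code from the patent: the names `Prediction`, `Action`, and `gate`, the threshold values 0.9 and 0.5, and the reading of "satisfies a threshold" as "greater than or equal to" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes of the two-tier confidence gate."""
    ACT_AUTONOMOUSLY = "control robot on predicted parameter, no confirmation"
    ASK_WITH_PREDICTION = "send object representation plus predicted-parameter indication"
    ASK_WITHOUT_PREDICTION = "send object representation only"


@dataclass(frozen=True)
class Prediction:
    parameter: str     # stand-in for a predicted control parameter, e.g. a grasp pose
    confidence: float  # assumed to lie in [0, 1]


def gate(pred: Prediction, upper: float = 0.9, lower: float = 0.5) -> Action:
    """Pick a control path from the confidence measure (illustrative thresholds)."""
    if lower > upper:
        raise ValueError("lower-bound threshold must not exceed the primary threshold")
    if pred.confidence >= upper:
        # Primary threshold satisfied: bypass the remote operator entirely.
        return Action.ACT_AUTONOMOUSLY
    if pred.confidence >= lower:
        # Primary fails, lower bound satisfied: ask the operator to confirm or correct.
        return Action.ASK_WITH_PREDICTION
    # Both thresholds fail: show the object without the low-confidence prediction.
    return Action.ASK_WITHOUT_PREDICTION
```

Under these assumptions, `gate(Prediction("grasp_pose_A", 0.95))` takes the autonomous path, while a confidence of 0.7 routes the prediction to the remote operator for confirmation and 0.2 sends the object representation alone.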
Independent Claim Dissection
Claim | Preamble | Transition | Key Body Elements
Claim 1 | A method | comprising | receiving vision data from vision components; generating via ML model a predicted parameter and confidence measure; determining if confidence satisfies threshold; if fails: transmitting object representation and visual indication of predicted parameter to remote client device (conditioned on lower-bound threshold); receiving UI input data from remote client device; causing robot to be controlled on received data; if satisfies threshold: causing robot to be controlled on predicted parameter without transmitting visual indication for confirmation
Claim 9 | A method | comprising | receiving vision data from vision components; generating via ML model a predicted parameter and confidence measure; determining if confidence satisfies threshold; generating visual representation for transmission to remote client device including object representation; determining whether to include visual indication of predicted parameter based on lower-bound threshold; transmitting visual representation; receiving UI input data; causing robot to be controlled on received data; if satisfies threshold: direct robot control without prior confirmation transmission
Claim 17 | A system | comprising | one or more vision components; memory storing instructions; one or more processors operable to: receive vision data; generate via ML model predicted parameter and confidence measure; determine if confidence satisfies threshold; if fails: transmit object representation and visual indication (conditioned on lower-bound threshold) to remote client device; receive UI input data; cause robot to be controlled on received data; if satisfies threshold: cause robot control without transmitting visual indication for confirmation
Claim Dependency Tree
Claim 1. Method: ML-predicted robot control parameter + confidence threshold gating; human UI input loop when confidence fails
  Claim 2. Adds: when lower-bound threshold also fails, transmit object representation without any visual indication of predicted parameter
  Claim 3. Adds: received data indicates confirmation of predicted parameter; robot controlled accordingly
  Claim 4. Further: confirmation UI element rendered alongside object representation; confirmation based on input directed to that element
  Claim 5. Adds: received data indicates alternative parameter; robot controlled according to alternative parameter
  Claim 6. Further: alternate parameter UI element rendered; alternative parameter indicated by input directed to that element
  Claim 7. Further: generating positive training instance based on alternative parameter; training ML model on positive training instance
  Claim 8. Adds: both confirmation UI element and alternate parameter UI element rendered alongside object representation and visual indication
Claim 9. Method: ML-predicted robot control parameter + confidence threshold gating; visual representation generation with conditional inclusion of predicted parameter indication
  Claim 10. Adds: visual representation lacks any visual indication of predicted parameter when lower-bound confidence threshold fails
  Claim 11. Adds: received data indicates confirmation of predicted parameter; robot controlled accordingly
  Claim 12. Further: confirmation UI element rendered alongside representation; confirmation based on input directed to that element
  Claim 13. Adds: received data indicates alternative parameter; robot controlled according to alternative parameter
  Claim 14. Further: alternate parameter UI element rendered; alternative parameter indicated by input directed to that element
  Claim 15. Further: generating positive training instance based on alternative parameter; training ML model on positive training instance
  Claim 16. Adds: both confirmation UI element and alternate parameter UI element rendered alongside object representation and visual indication
Claim 17. System: vision components + memory + processors implementing the confidence-gated ML parameter prediction and selective human confirmation loop
  Claim 18. Adds: processors further transmit object representation without visual indication when lower-bound confidence threshold also fails
Metric | This Application | Robotics / ML Software Industry Norm
Total claims | 18 | 15 – 25
Independent claim count | 3 | 2 – 4
Dependent : Independent ratio | 5.00 : 1 | 4 – 8 : 1
Method claims present? | Yes (Claims 1, 9) | Common
System / apparatus claims? | Yes (Claim 17) | Common
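The claim-count figures in this comparison follow directly from the dependency tree, and the arithmetic can be reproduced with a short Python sketch. The mapping below encodes only which independent claim each dependent claim ultimately hangs from, as stated in this analysis; the names `CLAIM_TREE`, `claim_metrics`, and `within_norm` are hypothetical helpers, and the norm ranges are the ones quoted in the comparison.

```python
# Independent claim -> dependent claims that trace back to it (per this analysis).
CLAIM_TREE = {
    1: range(2, 9),     # Claims 2-8
    9: range(10, 17),   # Claims 10-16
    17: range(18, 19),  # Claim 18
}


def claim_metrics(tree):
    """Count claims and compute the dependent-to-independent ratio."""
    independent = len(tree)
    dependent = sum(len(deps) for deps in tree.values())
    return {
        "total": independent + dependent,
        "independent": independent,
        "dependent": dependent,
        "ratio": dependent / independent,
    }


def within_norm(value, low, high):
    """True when a metric falls inside the quoted industry range (inclusive)."""
    return low <= value <= high


metrics = claim_metrics(CLAIM_TREE)
print(metrics)  # {'total': 18, 'independent': 3, 'dependent': 15, 'ratio': 5.0}
print(within_norm(metrics["ratio"], 4, 8))  # True
```

The ratio of 5.0 sits inside the quoted 4 to 8 range, but only one step above its lower bound, which is the basis for describing the fallback coverage as moderate.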
Drafting Quality
Drafting Quality Signals
The claim set demonstrates strong internal consistency between the three independent claims and the detailed description — the confidence-threshold gating mechanism recited in Claim 1 is directly mapped to FIG. 3 (blocks 354–356) and FIG. 5 (blocks 560–566), providing concrete written description support. However, the absence of a computer-readable medium (CRM) claim leaves a significant design-around corridor, and the functional language around 'confidence measure' in Claim 1 introduces potential §112(f) ambiguity that could attract examiner attention in any continuation prosecution.
✅
Antecedent Basis
The claims present clean antecedent basis throughout. In Claim 1, 'the predicted parameter' is properly anteceded by 'a predicted parameter for use in controlling a robot' introduced earlier in the same claim, and 'the confidence measure' is similarly anteceded by 'a confidence measure for the predicted parameter.' Claims 9 and 17 follow the same disciplined pattern. No orphaned 'the' references were identified across all 18 claims.
The core limitations of Claim 1 map directly to specific figures and paragraphs: the confidence-threshold gating corresponds to FIG. 5 block 562 and the detailed description at columns 25–26; transmitting the object representation with visual indication of the predicted parameter corresponds to FIG. 3 block 360A and FIG. 2E; and the direct robot control bypass corresponds to FIG. 3 block 366 and FIG. 5 block 566. The lower-bound threshold sub-limitation in Claim 1 is supported by the detailed description's two-threshold discussion at columns 11–12.
All three independent claims use 'comprising,' which is the strategically optimal open-ended transition for this technology class: it captures systems that include additional components (e.g., additional ML models, additional sensor types) beyond those recited. The specification discloses optional elements such as training instance generation and model retraining, which correctly remain outside the independent claim scope. No dependent claim was identified where a narrower transition would have been the better choice.
Claim 17's processor limitations recite functional steps ('receive,' 'generate,' 'determine,' 'transmit,' 'cause') without explicit structural definition, creating potential §112(f) exposure under the 'nonce word' doctrine. This risk is mitigated by the explicit structural recitation of 'one or more vision components,' 'memory storing instructions,' and 'one or more processors.' The term 'machine learning model' in Claims 1, 9, and 17 is purely functional and lacks structural definition within the claims; an examiner could request §112(a) written description support, which is available in the specification but is not recited in the claims themselves.
Claims 1 and 9 are method claims that include generating predictions using a 'machine learning model' and causing a robot to be controlled; the robot control element ('causing the robot to be controlled in the environment') is the primary §101 anchor. The physical transformation of a robot's state in a physical environment (manipulating objects) provides a meaningful hardware tie-in under Alice Step 2B. However, if an examiner characterizes the claims as abstract ideas (applying an ML model to data) with an insignificant post-solution activity (robot control), the two-threshold confidence gating structure, which is specifically technological, would be the strongest counterargument and should be emphasized during prosecution.
The dependent claim set shows significant structural mirroring: Claims 2–8 on Claim 1 and Claims 10–16 on Claim 9 are near-parallel, each covering the same fallback positions (no visual indication when lower-bound threshold fails, confirmation UI, alternative parameter UI, positive training instance generation). While this parallel structure provides prosecution flexibility, it does not add genuinely new technical fallback positions: Claims 3 and 11 are substantively identical (confirmation of predicted parameter), as are Claims 5 and 13, 6 and 14, and 7 and 15. Claims 7 and 15 (positive training instance generation from alternative parameter) add meaningful technical value by tying the UI feedback loop to model retraining.
The abstract describes the general theme accurately ('utilization of user interface inputs, from remote client devices, in controlling robot(s)') but omits the specific novel contribution: the two-tier confidence threshold mechanism that selectively bypasses human confirmation. An examiner reading only the abstract would identify a human-in-the-loop robot control system but would not identify the confidence-gated autonomous bypass as the core novelty — the mechanism that distinguishes this patent from prior art human-in-the-loop approaches cited in the background.
The 13 drawing sheets provide strong coverage of the major claim limitations. The confidence-threshold gating (core of Claims 1, 9, 17) is supported by FIG. 5 (blocks 560–566) and FIG. 3 (block 360A). The visual representation with predicted parameter indication is directly illustrated in FIG. 2E. The object representation as bounding shapes (reducing transmission data) is illustrated in FIG. 2F. One gap exists: no figure explicitly depicts the 'lower bound threshold confidence measure' as a separate decision node from the primary threshold — it is described textually but the flowcharts show only a single threshold branch decision.
Scorecard
Strategic Intent Scorecard
Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.
Dimension | Score (out of 5.0)
Claim Breadth | 3.5
Prosecution Defensibility | 3.8
Spec–Claim Consistency | 4.2
Dependent Claim Coverage | 3.0
Claim Type Diversity | 2.5
Figure Support Quality | 3.8
Key observation: Spec–Claim Consistency scores highest (4.2/5.0) because every independent claim limitation — confidence threshold gating, object representation transmission, and direct robot bypass — maps to specific flowchart blocks (FIGS. 3, 5) and detailed description paragraphs, giving prosecution strong §112(a) written description footing. Claim Type Diversity scores lowest (2.5/5.0) because the filing lacks a computer-readable medium (CRM) claim entirely and relies on only method and system types, leaving a significant enforcement gap for software-only implementations and cloud-based deployment of the ML model. Practitioners should consider filing a continuation specifically to add CRM claims and apparatus claims directed to the remote client device subsystem rather than the overall system.
A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.
GAP 01 · HIGHEST IMPACT
No Computer-Readable Medium Claim Filed
The claim set contains only two method claims (1, 9) and one system claim (17), with no computer-readable medium (CRM) or non-transitory storage medium claim. This structural absence means a competitor distributing the robot control software as a standalone executable, cloud API, or firmware update — without also owning the vision hardware — cannot be directly captured by the granted claims. A stronger filing would have included at least one CRM independent claim mirroring the method limitations of Claim 1, which is standard practice in robotics-ML patents filed by comparable assignees.
Claims 1 and 9 are both method claims covering the same underlying confidence-gating mechanism, with the primary distinction being that Claim 9 explicitly recites generating a visual representation and determining whether to include the predicted parameter indication, while Claim 1 focuses on the transmitting step. This structural overlap reduces prosecution flexibility — if a prior art reference covers the core confidence-threshold mechanism, both independent claims face simultaneous rejection without a meaningful independent fallback. A stronger filing would have differentiated the two independent method claims along a technically distinct axis (e.g., Claim 9 covering the training instance generation loop of FIG. 4, or the multi-robot queue selection mechanism of FIG. 3 block 358).
GAP 03 · HIGH IMPACT
No Claim Directed to Remote Client Device Subsystem
US 12,138,810 B2 protects a method and system for controlling robots using a machine learning model that predicts object manipulation parameters (e.g., grasp pose, trajectory) from vision data, together with a confidence-threshold gating mechanism that automatically controls the robot when confidence is high, and selectively solicits human operator input via a remote client device visual interface when confidence is insufficient. The core protected innovation is the two-tier confidence threshold architecture that minimizes remote human input while reducing robot idle time.
US 12,138,810 B2 is owned by Google LLC, Mountain View, California, US. The inventors are Johnny Lee (Mountain View, CA, US) and Stefan Welker (Mountain View, CA, US).
Claim 1 is a method claim covering receiving vision data, generating a predicted robot control parameter and confidence measure via ML model, transmitting an object representation and visual parameter indication to a remote client device when confidence falls between two thresholds, receiving UI input, and controlling the robot accordingly — or controlling the robot directly on the predicted parameter when confidence satisfies the upper threshold. Claim 9 is a method claim covering a similar confidence-gated workflow but with emphasis on generating a visual representation and determining whether to include the predicted parameter indication based on the lower-bound confidence threshold. Claim 17 is a system claim reciting vision components, memory, and processors implementing the same confidence-gated control architecture.
This patent covers a robotic control system where an AI/machine learning model watches objects (via cameras) and predicts how a robot should grasp or move those objects. If the AI is very confident in its prediction, the robot acts automatically without asking a human. If the AI is less confident, it sends a picture of the object to a human operator on a remote device, shows them the AI's predicted action, and lets them confirm or correct it — and uses that human feedback to improve the AI over time. The result is a robot that can work faster with less human supervision while still getting human help when the AI is uncertain.
B25J 9/16 (2006.01): Programme controls for programme-controlled manipulators. B25J 9/1689 (2013.01): Teleoperation. B25J 9/163 (2013.01): Control loop characterised by learning, adaptive, model-based, or rule-based expert control. B25J 9/1697 (2013.01): Vision controlled systems. B25J 9/1612 (2013.01): Programme controls characterised by the hand, wrist, or grip control.
Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.