Patent Drafting Analysis of Tesla’s Vision-Only Object Distance Estimation via ML | US 10,956,755 B2
IP Drafting Analysis · US 10,956,755 B2
Patent Drafting Analysis of Tesla's Vision-Only Object Distance Estimation Using Machine Learning | US 10,956,755 B2
A structural and strategic analysis of Tesla's camera-only distance prediction patent, examining claim architecture, drafting quality, §101 exposure, critical gaps, and prosecution positioning across all 23 claims.
US 10,956,755 B2 · Filed: Feb. 19, 2019 · Granted: Mar. 23, 2021 · G06K 9/00 · G06T 7/70 · G06N 20/00
System block diagram, training/inference flow diagrams, sensor capture, object detection
Published by PatSnap Insights Team · 12 min read · Verified by PatSnap Eureka Data
Overview
Structural Overview
The detailed description dominates at approximately 79% of total words (~6,500 of ~8,200), providing extensive embodiment coverage across system components, training data generation, model deployment, and inference workflows. The claim set comprises 23 claims across 4 independent claims — one system (Claim 1), one CRM (Claim 17), and two method claims (Claims 19 and 23) — with 19 dependent claims yielding a 4.75:1 dependency ratio. Six drawing sheets cover system architecture, training data flow, inference flow, multi-sensor capture, and object detection output, with FIG. 6 directly supporting the core claim limitation of outputting distance from image-only input.
Figure Inventory — 6 Sheets
| Figure | Description | Role |
| --- | --- | --- |
| FIG. 1 | Block diagram of the deep learning system 100 for autonomous driving, showing vision sensors 101, additional sensors 103, image pre-processor 105, deep learning network 107, AI processor 109, vehicle control module 111, and network interface 113. | System architecture |
| FIG. 2 | Flow diagram illustrating the 5-step training data creation process: receive vision data (201), receive related data (203), identify objects (205), determine ground truth (207), and package training data (209). | Flow diagram |
| FIG. 3 | Flow diagram showing the end-to-end ML lifecycle: prepare training data (301), train model (303), deploy model (305), receive sensor data (307), apply trained model (309), and control vehicle (311). | Flow diagram |
| FIG. 4 | Flow diagram illustrating the autonomous driving inference pipeline: receive sensor data (401), pre-process (403), initiate deep learning analysis (405), provide results to vehicle control (407), control vehicle (409), and transmit sensor/related data (411). | Flow diagram |
| FIG. 5 | Overhead diagram of autonomous vehicle 501 with forward sensor 503 and right-side sensor 553, showing fields of view 509/559, distance vectors 513/523/563, and bounding boxes around neighboring vehicles 511, 521, and 561 used to generate training ground truths. | Key embodiment |
| FIG. 6 | Analyzed vision data 601 from a forward-facing camera showing detected lane lines 603–609 and bounding boxes 611–619 for multiple vehicles, illustrating the inference-time output of the trained machine learning model predicting object distances from image data only. | Claim support |
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims
Claim Architecture Analysis
The patent contains 4 independent claims: Claim 1 (system/apparatus), Claim 17 (CRM/computer program product), Claim 19 (method: training pipeline), and Claim 23 (method: on-vehicle inference), providing enforcement coverage across three claim categories (system, CRM, and method). The 19 dependent claims against 4 independent claims yield a 4.75:1 ratio, slightly below the software/AI industry norm of 5–8:1, which leaves some dependent-claim real estate unused. Splitting the method coverage between a training-focused claim (Claim 19) and an inference-focused claim (Claim 23) is strategically sound, but the body of Claim 23 closely tracks the limitations of Claims 1 and 17, so it offers limited differentiation from them.
Core inventive concept: The claims address the high cost and complexity of emitting distance sensors (radar, lidar, ultrasonic) in autonomous vehicles by training a machine learning model using correlated camera images and emitting distance sensor outputs — then deploying a model that outputs object distance "using only the image data" (Claim 1) or "without using distance information from a first emitting distance sensor" (Claim 23), eliminating the need for dedicated distance sensors at inference time.
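The train-with-sensor, infer-with-camera-only idea can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the patent describes a deep learning network, whereas this sketch swaps in a one-variable least-squares fit, uses a hypothetical image-derived feature (bounding-box height in pixels), and runs on synthetic data. The names `TrainingSample`, `fit_inverse_height_model`, and `predict_distance` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TrainingSample:
    box_height_px: float     # hypothetical feature derived from the training image
    radar_distance_m: float  # correlated emitting-sensor output, used as ground truth


def fit_inverse_height_model(samples):
    """Least-squares fit of distance = a * (1 / box_height) + b.

    A stand-in for the trained ML model; the patent itself describes a
    deep learning network, not this linear fit.
    """
    xs = [1.0 / s.box_height_px for s in samples]
    ys = [s.radar_distance_m for s in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def predict_distance(model, box_height_px):
    """Inference uses only the image-derived feature; no distance sensor input."""
    a, b = model
    return a * (1.0 / box_height_px) + b


# Synthetic training data (assumption): distance is exactly 1500 / height.
training = [TrainingSample(h, 1500.0 / h) for h in (30.0, 50.0, 75.0, 100.0, 150.0)]
model = fit_inverse_height_model(training)

# At inference time only the camera-derived value is supplied.
estimate = predict_distance(model, 60.0)
print(round(estimate, 3))
```

The point of the sketch is the asymmetry between the two phases: the emitting-sensor reading appears only in `TrainingSample`, while `predict_distance` takes nothing but the image-derived input.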
Independent Claim Dissection
| Claim | Preamble | Transition | Key Body Elements |
| --- | --- | --- | --- |
| Claim 1 | A system, comprising: one or more processors configured to…; and a memory coupled to the one or more processors | comprising | receive image data from camera of vehicle depicting object in surrounding environment; utilize image data as input to trained ML model outputting distance from vehicle to object using only image data; wherein trained ML model was trained using training image and correlated output of emitting distance sensor; memory coupled to processors |
| Claim 17 | A computer program product, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: | comprising | receiving image data from camera of vehicle depicting object in surrounding environment; utilizing image data as input to trained ML model outputting distance using only image data; wherein trained ML model was trained using training image and correlated output of emitting distance sensor |
| Claim 19 | A method, comprising: | comprising | receiving selected image from camera of vehicle; receiving distance data from emitting distance sensor; identifying object using selected image as input to trained ML model; extracting distance estimate from received distance data; creating training image by annotating selected image with extracted distance estimate; training second ML model to predict distance measurement using training data set including training image; providing trained second ML model to second vehicle with second camera |
| Claim 23 | A method implemented by a processor included in a vehicle, the method comprising: | comprising | receiving image data from camera of vehicle depicting object in surrounding environment; utilizing image data as input to trained ML model outputting distance without using distance information from first emitting distance sensor; wherein trained ML model was trained using training image and correlated output of second emitting distance sensor |
Claim Dependency Tree
1 System: processors receive camera image data, use trained ML model to output object distance using only image data; model trained with emitting distance sensor ground truth
2 Adds: trained ML model also outputs direction of object with respect to vehicle
3 Adds: processors further configured to identify velocity vector of object with respect to vehicle
7 Adds: object is one of pedestrian, vehicle, obstacle, barrier, or traffic control object
8 Adds: training image and correlated emitting distance sensor output captured on a training vehicle
9 Further: emitting distance sensor output includes estimated distance and direction of second object identified in training image from training vehicle
10 Further: training image is one of a time series of images captured using image sensor of training vehicle
11 Further: second object identified at least in part by analyzing plurality of images in time series
12 Further: time series of images used to disambiguate emitting sensor output for second object among plurality of detected objects
13 Adds: identified distance of object used to predict a semantic label associated with the object
14 Adds: trained ML model was trained to predict distance output of emitting distance sensor based on input image
15 Adds: ML model distance output combined with output of installed emitting distance sensor to determine relative location of object
16 Adds: trained ML model outputs plurality of distances corresponding to points on bounding box associated with the object as depicted in image data
17 CRM: non-transitory storage medium with instructions to receive camera image data and utilize as input to trained ML model outputting distance using only image data, trained with emitting distance sensor ground truth
18 Adds: trained ML model outputs plurality of distances corresponding to points on bounding box of object as depicted in image data
19 Method (training pipeline): receive selected image and emitting distance sensor data, identify object via ML model, extract distance estimate, annotate training image, train second ML model, provide to second vehicle
20 Adds: captured image is part of time series of images; distance data is part of time series of captured distance data
21 Further: tracking identified object across time series; extracted distance estimate based on correlating tracked object to received distance data
22 Adds: identified object is a detected vehicle or pedestrian
23 Method (on-vehicle inference): processor in vehicle receives camera image data, utilizes input to trained ML model outputting distance without using distance from first emitting distance sensor; model trained using training image and correlated output of second emitting distance sensor
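The annotation step summarized for Claims 19–21 (pairing a captured image with a distance estimate extracted from time-series sensor data) might look roughly like the sketch below. This is only one plausible reading, not the patented method: it assumes nearest-timestamp matching with a hypothetical tolerance `max_gap_s`, a single tracked object, and made-up frame identifiers and readings. The helper names `nearest_reading` and `annotate_frames` are hypothetical.

```python
from bisect import bisect_left


def nearest_reading(sensor_times, t):
    """Return the index of the reading closest in time to t.

    sensor_times must be sorted in ascending order and non-empty.
    """
    i = bisect_left(sensor_times, t)
    if i == 0:
        return 0
    if i == len(sensor_times):
        return len(sensor_times) - 1
    before, after = sensor_times[i - 1], sensor_times[i]
    return i if (after - t) < (t - before) else i - 1


def annotate_frames(frames, readings, max_gap_s=0.05):
    """Pair each (timestamp, frame_id) with the nearest (timestamp, distance_m).

    Frames with no reading within max_gap_s are skipped rather than labeled
    with stale data. The tolerance value is an assumption for illustration.
    """
    if not readings:
        return []
    sensor_times = [t for t, _ in readings]
    annotated = []
    for t, frame_id in frames:
        j = nearest_reading(sensor_times, t)
        reading_time, distance_m = readings[j]
        if abs(reading_time - t) <= max_gap_s:
            annotated.append({"frame": frame_id, "distance_m": distance_m})
    return annotated


# Made-up timestamps (seconds) and distances (meters) for illustration only.
frames = [(0.00, "f0"), (0.10, "f1"), (0.50, "f2")]
readings = [(0.01, 42.0), (0.11, 41.5), (0.21, 41.0)]
training_images = annotate_frames(frames, readings)
print(training_images)
```

Frame `f2` is dropped because no sensor reading falls within the tolerance; a real pipeline would also need to resolve which detected object each reading belongs to, which is the disambiguation problem Claim 12 addresses.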
| Metric | This Application | Software / AI / Autonomous Vehicle Norm |
| --- | --- | --- |
| Total claims | 23 | 20 – 30 |
| Independent claim count | 4 | 3 – 5 |
| Dependent : Independent ratio | 4.75 : 1 | 5 – 8 : 1 |
| Method claims present? | Yes (Claims 19, 23) | Common |
| System / apparatus claims? | Yes (Claim 1) | Common |
Drafting Quality
Drafting Quality Signals
The patent demonstrates strong structural coverage: four independent claim types (system, CRM, training method, inference method) provide broad enforcement reach, and the key 'using only the image data' limitation in Claims 1, 17, and 23 is directly supported by FIG. 6 and the detailed description. However, the dependent claims are heavily clustered on Claim 1 (Claims 2–16), leaving Claim 17 with a single dependent claim (Claim 18) and Claim 23 with none, which creates shallow fallback positions for the CRM and on-vehicle inference branches.
Antecedent Basis
Antecedent basis is well maintained throughout the claim set. The phrase "the trained machine learning model" in Claim 1's body is properly introduced in the same claim's limitations. In Claim 9, "the second object" is introduced by "a second object" in the same claim. In Claim 19, "the identified object" is properly introduced by "identifying an object," and "the received distance data" tracks back to "receiving distance data." No antecedent basis violations were identified in any of the 23 claims.
The core limitation in Claims 1, 17, and 23 — that the trained ML model outputs distance "using only the image data" — maps directly to FIG. 6 (analyzed vision data 601 showing bounding boxes 611–619 with predicted properties) and the detailed description col. 3 ll. 1–5 and col. 13 ll. 1–10. The training pipeline limitations of Claim 19 (annotating image with distance estimate, training second model) are supported by FIG. 2 steps 201–209 and col. 8–12. The "time series" limitations of Claims 10–12, 20–21 are supported by col. 9 ll. 1–55. All independent claim limitations map to specific figures and paragraphs.
All four independent claims use "comprising," the broadest available transition word, which is strategically appropriate for a software/AI patent where competitors may implement the core concept with additional components. The open-ended "comprising" in Claim 1 allows infringement even when an accused system also includes a dedicated distance sensor at deployment, which is commercially significant given Tesla's own fleet use of radar alongside cameras. No "consisting of" language appears, avoiding unnecessary narrowing.
No "means for" or "step for" language appears in any of the 23 claims. Functional language such as "configured to receive," "configured to output," and "configured to identify" in Claim 1 is paired with structural recitation of "one or more processors" and "a memory coupled to the one or more processors," satisfying the structural disclosure requirement and avoiding §112(f) treatment. The specification provides extensive structural support for the processor/memory components through FIG. 1 (AI processor 109, deep learning network 107), further insulating against §112(f) challenges.
Claims 1 and 17 carry moderate Alice exposure because the core concept — using a neural network to predict distances from images — could be characterized as an abstract idea (mathematical model predicting sensor output) applied to a general processor/storage medium. The §101 defense relies on the hardware tie-in: "one or more processors" in Claim 1 and "a camera of a vehicle" limiting the input source, plus the vehicular control application context. However, the CRM claim (Claim 17) lacks explicit vehicle control output limitations, which weakens its concrete-application anchor relative to Claim 1. Claims 19 and 23, which recite a specific training pipeline and a processor-in-vehicle respectively, have stronger eligibility posture due to their concrete operational steps.
Fifteen of the 19 dependent claims (Claims 2–16) depend directly or indirectly from Claim 1, and three more (Claims 20–22) depend from Claim 19. That leaves Claim 17 with a single dependent claim (Claim 18) and Claim 23 with none. This means that if Claim 1 is invalidated, Claim 17, which shares its core limitations, has only one fallback (Claim 18's bounding box limitation), and Claim 23 has zero fallback positions. Among Claims 2–16, substantively distinct fallbacks include Claim 13 (semantic label prediction), Claim 15 (combination with installed sensor), and Claim 16 (bounding box distances), but Claims 4–6 (sensor type specificity) add only minor differentiation against a determined invalidity challenger.
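The fallback distribution is easy to tally once each dependent claim is mapped to the independent claim it ultimately hangs from. The sketch below does that using the grouping stated elsewhere in this analysis (Claims 2–16 under Claim 1, Claim 18 under Claim 17, Claims 20–22 under Claim 19). It records only the root independent claim, not the intermediate chain, and the names `ROOT_OF` and `fallback_counts` are hypothetical.

```python
# Root independent claim for each dependent claim, per the grouping in this
# analysis. Intermediate dependencies (e.g. claim-on-claim chains) are not modeled.
ROOT_OF = {}
for claim in range(2, 17):   # Claims 2-16 trace back to Claim 1
    ROOT_OF[claim] = 1
ROOT_OF[18] = 17             # Claim 18 depends from Claim 17
for claim in range(20, 23):  # Claims 20-22 trace back to Claim 19
    ROOT_OF[claim] = 19

INDEPENDENT = [1, 17, 19, 23]


def fallback_counts(independent, root_of):
    """Count how many dependent claims sit under each independent claim."""
    counts = {claim: 0 for claim in independent}
    for root in root_of.values():
        counts[root] += 1
    return counts


counts = fallback_counts(INDEPENDENT, ROOT_OF)
ratio = len(ROOT_OF) / len(INDEPENDENT)

for claim in INDEPENDENT:
    print(f"Claim {claim}: {counts[claim]} dependent claim(s)")
print(f"Dependent : independent ratio = {ratio}:1")
```

Laying the counts out this way makes the imbalance concrete: the overall ratio matches the 4.75:1 figure, but almost all of the depth sits under one independent claim.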
The abstract reads: "The trained machine learning model has been trained using a training image and a correlated output of an emitting distance sensor" — which correctly identifies the novel training mechanism. However, the abstract focuses on the system claim framing only and does not mention the key differentiating limitation that the model operates "using only the image data" at inference time, which is the commercially most significant aspect of the invention (eliminating the dedicated distance sensor at deployment). An examiner reading only the abstract could easily conflate this with standard sensor fusion approaches, potentially triggering broader prior art searches that miss the point of novelty.
Figure support is comprehensive for the independent claim limitations. FIG. 1 supports the processor/memory/camera system architecture of Claim 1. FIG. 5 directly supports the training data collection mechanism (emitting distance sensor correlated with training image) recited in Claims 1, 17, and 19. FIG. 6 supports the inference-time output of distance from image data only. The bounding box limitations of Claims 16 and 18 are supported by FIG. 6 (bounding boxes 611–619) and the detailed description at col. 15 ll. 20–30. The time series limitations of Claims 10–12 and 20–21 are textually supported but lack a dedicated figure showing the temporal tracking process, which is a minor gap.
Scorecard
Strategic Intent Scorecard
Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.
Claim Breadth: 3.8
Prosecution Defensibility: 3.5
Spec–Claim Consistency: 4.2
Dependent Claim Coverage: 2.8
Claim Type Diversity: 4.5
Figure Support Quality: 4
Key observation: Claim Type Diversity scores highest (4.5/5) because the patent covers all four relevant claim types (system, CRM, training method, and on-vehicle inference method), providing enforcement pathways against system manufacturers, software distributors, training pipeline operators, and vehicle operators alike. Dependent Claim Coverage scores lowest (2.8/5) because Claim 23 (the on-vehicle inference method) carries zero dependent claims and Claim 17 (the CRM) has only one. If either independent claim is successfully challenged, there is at most one fallback position to narrow to, a significant litigation vulnerability that practitioners should address in any continuation filing. In practical terms, a continuation application adding 10–15 dependent claims to Claims 17 and 23, with limitations drawn from the detailed description (e.g., time-series disambiguation, semantic labeling, 3D bounding boxes), would substantially improve the portfolio's invalidity resilience.
A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.
GAP 01 · HIGHEST IMPACT
No Dependent Claims on Inference Method Claim 23
Claim 23, the on-vehicle inference method claim, recites the commercially most valuable embodiment (operating without a first emitting distance sensor) yet has zero dependent claims. A single successful invalidity challenge, or a claim construction ruling that narrows Claim 23, therefore leaves Tesla with no fallback positions for the method-in-vehicle claim type. A competitor only has to design around one independent claim formulation, with no narrower claims behind it as secondary barriers. A stronger filing would have added 5–8 dependent claims to Claim 23 mirroring the structure of dependent Claims 2–16 from Claim 1, specifically including direction output (analogous to Claim 2), velocity vector identification (analogous to Claim 3), semantic label prediction (analogous to Claim 13), and bounding box distance outputs (analogous to Claim 16).
GAP 02 · HIGH IMPACT
Training Pipeline Claim 19 Lacks Fleet-Collection Limitations
Claim 19 covers the training pipeline broadly but does not capture Tesla's most commercially distinctive aspect: the automated fleet-scale data collection system described extensively in the specification (col. 4 ll. 33–50 and col. 17 ll. 1–25), where triggered inaccurate predictions automatically cause sensor data transmission to a training server. The absence of the fleet-collection mechanism and trigger-based data capture from any claim means a competitor could train a model using the same image-annotation-with-emitting-sensor methodology but via manual collection or small-scale testing, potentially practicing Claim 19 while being entirely outside Tesla's actual commercial method. A stronger filing would have included dependent claims on Claim 19 directed to automatic trigger-based data collection, anonymized fleet data aggregation, and time-series ground truth determination.
GAP 03 · HIGH IMPACT
No Apparatus Claim for Training Server System
US 10,956,755 B2 protects a system, computer program product, and methods for estimating the distance of objects from a vehicle using only camera image data processed by a trained machine learning model, without requiring a dedicated emitting distance sensor (radar, lidar, or ultrasonic) at inference time. The core problem solved is reducing the cost and complexity of autonomous driving systems by training a neural network on paired camera images and emitting distance sensor outputs, then deploying the model to production vehicles that may lack those expensive sensors. The system uses a trained ML model — trained on annotated images correlated with radar/lidar ground truth — to output object distances directly from visual image data.
US 10,956,755 B2 is owned by Tesla, Inc., headquartered in Palo Alto, California, USA. The inventors are James Anthony Musk of San Francisco, California; Swupnil Kumar Sahai of Saratoga, California; and Ashok Kumar Elluswamy of Sunnyvale, California.
Claim 1 is a system claim covering one or more processors and coupled memory configured to receive camera image data and use a trained ML model to output object distance using only image data, where the model was trained with emitting distance sensor ground truth. Claim 17 is a computer-readable storage medium (CRM) claim with the same core limitations as Claim 1 embodied as stored computer instructions. Claim 19 is a training-pipeline method claim covering receiving a camera image and emitting distance sensor data, identifying objects, extracting distance estimates, annotating training images, training a second ML model, and deploying it to a second vehicle. Claim 23 is an on-vehicle inference method claim where a processor in a vehicle outputs distance from camera image data without using distance information from a first emitting distance sensor, using a model trained with a second emitting distance sensor's output.
This patent covers technology that allows a self-driving car to figure out how far away other objects are — like other cars, pedestrians, or obstacles — using only video or still images from its cameras, without needing an expensive radar or lidar sensor to measure distance directly. The trick is in how the system is trained: during development, a vehicle equipped with both cameras and radar/lidar sensors collects paired data, and the AI learns to replicate what the distance sensor would measure using only the camera images. Once trained, the AI model can be installed on production vehicles that don't have radar or lidar, significantly reducing hardware costs while maintaining accurate distance detection for autonomous driving.
G06K 9/00 (2006.01) — Data recognition; Presenting data; Record carriers; Handling record carriers — covers image data recognition and computer vision processing. G06T 7/70 (2017.01) — Image analysis: determining position or orientation of objects or cameras — covers the estimation of object distance and position from image data. G06N 20/00 (2019.01) — Machine learning — covers the application of machine learning techniques, including the training and deployment of the neural network models central to this patent.
Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.