Patent Drafting Analysis of Tesla’s Vision-Based ML Model for Autonomous Driving with Adjustable Virtual Camera | US 2023/0057509 A1
IP Drafting Analysis · US 2023/0057509 A1
A structural and strategic analysis of Tesla's adjustable virtual camera patent, covering claim architecture, drafting quality signals, critical gaps, and prosecution positioning across method, system, and CRM claim types.
US 2023/0057509 A1 · Filed: Aug 18, 2022 · Published: Feb 23, 2023 · G06V 20/58 · G06N 20/00
Vehicle system, ML model architecture, VRU/non-VRU networks, virtual camera projections, process flowchart
Published by PatSnap Insights Team · 12 min read · Verified by PatSnap Eureka Data
Overview
Structural Overview
The detailed description dominates the specification at approximately 57% of total words (~3,900 of ~6,800), providing substantial embodiment support across 10 figures and 104 numbered paragraphs. The claim set comprises 20 claims (3 independent and 17 dependent) covering method (Claim 1), system (Claim 10), and non-transitory computer storage media, or CRM (Claim 19), formats. Figure coverage is thorough for system architecture and process flows but gives only schematic-level support for the virtual camera projection mechanism that is central to the inventive concept.
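The proportions quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the rounded figures stated in the paragraph; the variable names are illustrative and not taken from any tool.

```python
# Rounded figures as quoted in the overview; variable names are illustrative.
detailed_description_words = 3_900
total_spec_words = 6_800
independent_claims = 3
dependent_claims = 17

# Share of the specification taken up by the detailed description.
share = detailed_description_words / total_spec_words
total_claims = independent_claims + dependent_claims

print(f"Detailed description share: {share:.1%}")
print(f"Total claims: {total_claims}")
```

Because the word counts are approximate, the computed share only confirms that "approximately 57%" is consistent with the quoted totals.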
Figure Inventory — 10 Sheets
| Figure | Description | Role |
| --- | --- | --- |
| FIG. 1A | Block diagram of an autonomous vehicle (100) showing image sensors 102A–102F positioned about the vehicle with processor system 120. | Key embodiment |
| FIG. 1B | Block diagram of the processor system 120 determining object/signal information 124 from image information 122 via the Vision-Based Machine Learning Model Engine 126. | System architecture |
| FIG. 2 | Block diagram of the vision-based ML model showing backbone networks 200 feeding into VRU network 210 and Non-VRU network 230 with their respective output attributes 212 and 232. | Key embodiment |
| FIG. 3A | Block diagram of the VRU network 210 showing Fixed Projection Engine 302, Frame Selector Engine 304, and Video Modules 308A–308B producing VRU Velocity 310 and VRU Detection 312 outputs. | Claim support |
| FIG. 3B | Block diagram illustrating an example panoramic view projection 322 associated with a 1.5-meter virtual camera height from image information 320 processed by VRU network 210. | Claim support |
| FIG. 4A | Block diagram of the Non-VRU network 230 showing Transformer Network Engine 402, Frame Selector Engine 404, Video Modules 408A–408C, and output heads for Non-VRU Velocity 410, Detection 412, and Attributes 414. | Claim support |
| FIG. 4B | Block diagram illustrating an example periscope projection view 422 associated with a 20-meter virtual camera height from image information 420 processed by Non-VRU network 230. | Claim support |
| FIG. 5 | Block diagram showing the vision-based ML model 502 used in combination with a super narrow machine learning model 504, depicting combined processing branches and video module outputs. | Key embodiment |
| FIG. 6 | Flowchart 600 illustrating the 5-step process (602–610) for identifying VRU and non-VRU objects using the vision-based ML model: obtain images, forward pass, project into virtual camera vector space, aggregate temporally indexed features, determine object/signal information. | Flow diagram |
| FIG. 7 | Block diagram of vehicle 700 showing the hardware components: Electric Motor 702, Batteries 704, Propulsion System 706, Processor System 120, and Display 708. | Key embodiment |
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims
Claim Architecture Analysis
The patent contains 3 independent claims: Claim 1 (method), Claim 10 (system comprising one or more processors and non-transitory computer storage media), and Claim 19 (non-transitory computer storage media), giving tripartite coverage across method, system, and CRM formats. The dependent-to-independent ratio of 5.67:1 (17 dependent claims to 3 independent) sits at the low end of the norm for AI/software patent applications (typically 5–9:1), leaving limited fallback depth. Notably, Claims 4–6 and 13–15 introduce the dual-branch structure and numerical range limitations on virtual camera height, which create specific infringement triggers and fallback positions.
Core inventive concept: The claims address occlusion and inaccurate object detection in autonomous vehicles caused by reliance on a single fixed-height virtual camera perspective. They do so by projecting multi-sensor image features into a vector space associated with a virtual camera set at a "particular height" that is adjustable by object type (vulnerable road user, or VRU, vs. non-VRU), aggregating those temporally indexed projected features across video modules, and then determining object positions via multiple heads of the machine learning model.
Independent Claim Dissection
| Claim | Preamble | Transition | Key Body Elements |
| --- | --- | --- | --- |
| Claim 1 | A method implemented by a vehicle processor system | comprising | obtaining images from a multitude of image sensors positioned about a vehicle; determining features via forward pass through first portion of ML model; projecting features into a vector space associated with a virtual camera at a particular height; aggregating projected features with prior image features via video modules; determining plurality of objects positioned according to virtual camera via model heads |
| Claim 10 | A system comprising one or more processors and non-transitory computer storage media storing instructions that when executed by the one or more processors, cause the processors to perform operations, wherein the system is included in an autonomous or semi-autonomous vehicle | comprising | obtaining images from multitude of image sensors; determining features via forward pass through first portion of ML model; projecting features via second portion of ML model into vector space associated with virtual camera at particular height; aggregating via plurality of video modules; determining plurality of objects via plurality of heads |
| Claim 19 | Non-transitory computer storage media storing instructions that when executed by a system of one or more processors which are included in an autonomous or semi-autonomous vehicle, cause the system to perform operations | comprising | obtaining images from multitude of image sensors; determining features via forward pass through first portion of ML model; projecting features via second portion of ML model into vector space associated with virtual camera at particular height; aggregating via plurality of video modules; determining plurality of objects via plurality of heads according to virtual camera |
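The five operations shared by Claims 1, 10, and 19 can be pictured as a simple data flow. The sketch below is a toy illustration under loud assumptions: every class, function, and piece of arithmetic in it is hypothetical, chosen only to show the order of operations, and none of it reflects Tesla's actual model or the projection geometry described in the specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Features = List[float]


@dataclass
class VirtualCamera:
    height_m: float  # stands in for the claimed "particular height"


def extract_features(images: Sequence[Sequence[float]]) -> Features:
    """Placeholder for the 'first portion' of the model: one mean value per sensor image."""
    return [sum(img) / len(img) for img in images]


def project(features: Features, camera: VirtualCamera) -> Features:
    """Placeholder projection: scaling by height is an assumption, not real geometry."""
    return [f * camera.height_m for f in features]


def aggregate(current: Features, history: List[Features]) -> Features:
    """Placeholder for the video modules: average the current frame with prior frames."""
    frames = history + [current]
    return [sum(values) / len(frames) for values in zip(*frames)]


def run_step(images, camera, history, heads: Sequence[Callable[[Features], float]]):
    features = extract_features(images)      # obtain images, forward pass
    projected = project(features, camera)    # project into the virtual camera space
    fused = aggregate(projected, history)    # aggregate with prior projected features
    history.append(projected)                # remember this frame for the next step
    return [head(fused) for head in heads]   # each head produces one output


# Two toy "sensor images", a 1.5 m virtual camera, and two trivial heads.
history: List[Features] = []
outputs = run_step([[1.0, 3.0], [2.0, 4.0]], VirtualCamera(1.5), history, [max, min])
print(outputs)  # [4.5, 3.0]
```

Keeping `history` outside `run_step` mirrors the claim language, which aggregates the current projected features with features from prior images rather than recomputing them.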
Claim Dependency Tree
1 Method claim: vehicle processor system obtains multi-sensor images, projects features into virtual camera vector space at particular height, aggregates via video modules, determines objects via model heads
2 Adds: first portion includes individual backbone networks per image sensor, each determining portion of features
3 Further: ML model includes first branch (attention network receiving aggregated backbone features) and second branch
4 Adds: ML model includes first branch (associated with virtual camera at particular height) and second branch (associated with different virtual camera at different height)
6 Further: different height is less than 20 meters and greater than 2 meters; second branch identifies non-VRU objects and determines respective velocities
7 Adds: features are projected into the vector space, and the vector space is warped
8 Further: vector space is three-dimensional and warped to enlarge a center of the vector space
9 Further: vector space is three-dimensional and objects are elongated in the vector space
10 System claim: processors and non-transitory storage media in autonomous/semi-autonomous vehicle, performing same core operations as Claim 1
11 Adds: first portion includes individual backbone networks per image sensor, each determining portion of features
12 Further: ML model includes first branch (attention network receiving aggregated backbone features) and second branch
13 Adds: ML model includes first and second branch; first branch associated with virtual camera at particular height; second branch at different height
15 Further: different height is less than 20 meters and greater than 2 meters
16 Adds: features are projected into the vector space, and the vector space is warped
17 Further: vector space is three-dimensional and warped to enlarge a center of the vector space
18 Further: vector space is three-dimensional and objects are elongated in the vector space
19 CRM claim: non-transitory computer storage media in autonomous/semi-autonomous vehicle performing same core operations as Claims 1 and 10
20 Adds: ML model includes first and second branch; first branch at particular height less than 2 meters; second branch at different height less than 20 meters and greater than 2 meters
| Metric | This Application | Software / AI Industry Norm |
| --- | --- | --- |
| Total claims | 20 | 18 – 30 |
| Independent claim count | 3 | 3 – 5 |
| Dependent : Independent ratio | 5.67 : 1 | 5 – 9 : 1 |
| Method claims present? | Yes (Claim 1) | Common |
| System / apparatus claims? | Yes (Claim 10) | Common |
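The ratio in the table follows directly from the claim counts, and each numeric metric can be compared against the quoted norm range. The snippet below is a minimal sketch of that comparison; the figures are copied from the table, while the dictionary layout and the `within` helper are illustrative assumptions.

```python
# Figures copied from the comparison table; the data layout is illustrative.
application = {"total_claims": 20, "independent_claims": 3}
norms = {
    "total_claims": (18, 30),
    "independent_claims": (3, 5),
    "dep_to_indep_ratio": (5.0, 9.0),
}

# Dependent claims are whatever remains after removing the independent ones.
dependent_claims = application["total_claims"] - application["independent_claims"]
application["dep_to_indep_ratio"] = dependent_claims / application["independent_claims"]


def within(value: float, bounds: tuple) -> bool:
    """Return True when value lies inside the inclusive (low, high) range."""
    low, high = bounds
    return low <= value <= high


report = {name: within(application[name], norms[name]) for name in norms}
for name, ok in report.items():
    status = "within" if ok else "outside"
    print(f"{name}: {application[name]:.2f} is {status} the norm {norms[name]}")
```

With 17 dependent and 3 independent claims the ratio rounds to 5.67, which falls inside the table's range but close to its lower bound.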
Drafting Quality
Drafting Quality Signals
The independent claims (1, 10, 19) exhibit strong structural parallelism and good hardware anchoring through the vehicle processor system and image sensor array, reducing §101 Alice risk. However, several key technical terms — including "particular height," "video modules," and "plurality of heads" — lack explicit structural definitions in the claims, creating potential §112(a) written description vulnerability and prosecution risk if prior art surfaces at similar height ranges.
Antecedent Basis
Antecedent basis is generally well-maintained across all 20 claims. Key elements such as "a multitude of image sensors" (introduced in each independent claim) are correctly followed by "the images" and "the features" in subsequent limitations. The dependent claims consistently refer back to "the machine learning model" and "the virtual camera" with proper antecedent basis established in Claims 1, 10, and 19 respectively. No orphan "the" references were identified in a full claim scan.
Strong mapping exists between the independent claims and the specification. FIG. 6 (flowchart steps 602–610) directly maps to the five method steps of Claim 1. FIG. 3A maps to the "first portion" backbone network and frame selector engine limitations. FIGS. 3B and 4B visually demonstrate the "particular height" virtual camera projection limitation in Claims 1, 10, and 19. Paragraphs [0064]–[0068] provide detailed written description of the fixed projection engine and frame selector engine that support the projection and aggregation steps.
All three independent claims correctly use "comprising" as the open transition, which is strategically appropriate for this AI/autonomous driving technology space where additional hardware components (e.g., LiDAR, radar) in a competitor's system would not negate infringement. There is no inappropriate use of "consisting of" or "consisting essentially of" that would unnecessarily limit scope. The dependent claims similarly use "wherein" for additional limitations, which is the correct approach for method and system refinements.
No explicit "means for" language appears in the claims, but functional labels such as "a plurality of video modules" and "a plurality of heads of the machine learning model" in Claims 1, 10, and 19 risk §112(f) treatment if an examiner reads them as functional claiming without structural context. While the specification provides structural support for "video modules" at paragraphs [0070]–[0071] and FIG. 3A, the claim language itself does not recite structural parameters. A stronger drafting approach would have included at least one structural characteristic (e.g., "convolutional neural network video modules") to eliminate this ambiguity.
The claims provide a strong §101 defense because all three independent claims anchor the operations to a specific hardware system: Claim 1 recites "a vehicle processor system," Claim 10 explicitly recites "one or more processors and non-transitory computer storage media" included in "an autonomous or semi-autonomous vehicle," and Claim 19 mirrors Claim 10's hardware tie-in. The obtaining step also grounds the claims in physical image sensors. Under the USPTO's Alice/Mayo framework, this hardware-specific recitation, combined with the technical problem addressed (reducing sensor complexity while improving object detection accuracy), supports a credible "practical application" argument against a §101 rejection.
The dependent claim set is structurally repetitive between the method and system/CRM branches: Claims 2–9 largely mirror Claims 11–18 and Claim 20, adding no genuinely new fallback positions across claim types. Claims 5 and 14 (particular height less than 2 meters) and Claims 6 and 15 (different height range with non-VRU specification) add meaningful numerical range limitations. However, Claims 7–9 and 16–18 (vector space warping/elongation) are virtually identical across claim types, representing drafting repetition rather than strategic fallback depth. A stronger filing would have included dependent claims directed to specific backbone network architectures, training methods, or CIPV-specific detection scenarios described at length in the specification.
An examiner reading the abstract would correctly identify the broad subject matter (multi-sensor image processing for autonomous driving with virtual camera projection) but the abstract fails to highlight the specific novel mechanism — the adjustable virtual camera height that differs between VRU and non-VRU branches. The abstract states "The features are projected into a vector space associated with a virtual camera at a particular height" but omits the critical dual-height, dual-branch architecture that distinguishes this invention from prior art single-height projection systems. This omission could cause a search examiner to under-search the relevant prior art landscape.
Figure support is strong for the system architecture and process flow aspects of the claims. FIG. 6 maps to all five steps of Claims 1, 10, and 19 with direct correspondence to blocks 602–610. FIGS. 3A and 4A support the backbone network, frame selector, and video module limitations. FIGS. 3B and 4B directly illustrate the virtual camera projection at 1.5 meters and 20 meters respectively, supporting the height limitations in Claims 5, 6, 14, and 15. One gap is the absence of a figure specifically showing the vector space warping mechanism recited in Claims 7–9 and 16–18 — only textual description in [0066] and [0080] supports these limitations.
Scorecard
Strategic Intent Scorecard
Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.
| Dimension | Score (out of 5.0) |
| --- | --- |
| Claim Breadth | 3.5 |
| Prosecution Defensibility | 3.8 |
| Spec–Claim Consistency | 4.2 |
| Dependent Claim Coverage | 2.8 |
| Claim Type Diversity | 4.5 |
| Figure Support Quality | 4.0 |
Key observation: Claim Type Diversity scores highest (4.5/5.0) because Tesla's tripartite filing structure across method (Claim 1), system (Claim 10), and non-transitory computer storage media (Claim 19) provides enforcement coverage against direct infringers operating the vehicle system, manufacturers supplying the processor system, and software providers distributing the model. Dependent Claim Coverage scores lowest (2.8/5.0) because the 17 dependent claims are structurally repetitive across the three independent branches rather than adding distinct technical limitations. Claims 2–9 and 11–18 are near-mirror images of each other, so an invalidity finding on any single dependent limitation would neutralize the matching fallback positions across all three independent claim branches simultaneously. Practitioners should consider filing a continuation with dependent claims directed to the transformer network architecture, the CIPV detection use case, and the HDR image preprocessing pipeline, all of which are described at length in the specification but entirely absent from the current claim set.
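The highest and lowest dimensions named above can be read straight off the scorecard. The sketch below reproduces that lookup; the six scores come from the scorecard, while the unweighted mean is an assumption added purely for illustration and is not a figure reported by the analysis.

```python
# Scores as listed in the scorecard (scale of 0 to 5).
scores = {
    "Claim Breadth": 3.5,
    "Prosecution Defensibility": 3.8,
    "Spec-Claim Consistency": 4.2,
    "Dependent Claim Coverage": 2.8,
    "Claim Type Diversity": 4.5,
    "Figure Support Quality": 4.0,
}

highest = max(scores, key=scores.get)
lowest = min(scores, key=scores.get)

# Assumption: a simple unweighted mean; the source does not state an overall score.
mean_score = round(sum(scores.values()) / len(scores), 2)

print(f"Highest: {highest} ({scores[highest]})")
print(f"Lowest: {lowest} ({scores[lowest]})")
print(f"Unweighted mean (illustrative): {mean_score}")
```

The 1.7-point spread between the two extremes is what drives the recommendation to strengthen the dependent claims rather than the claim-type mix.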
A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.
GAP 01 · HIGHEST IMPACT
No Apparatus Claim for the Vehicle Processor System Hardware
Claim 10's preamble recites "a system comprising one or more processors and non-transitory computer storage media" but does not independently claim the vehicle processor system 120 as a standalone apparatus with its structural hardware components (matrix processors, BiFPN networks, backbone networks). This means a manufacturer supplying only the physical processor hardware without the software instructions would not directly infringe any claim. A competitor could design around by arguing the system claim requires both processors AND storage media with the software loaded, while a pure hardware supply would escape. A stronger filing would have added an apparatus claim directed to the vehicle processor system 120 itself, reciting the backbone network architecture and transformer engine as structural hardware components.
GAP 02 · HIGH IMPACT
"Particular Height" Undefined — Design-Around via Height Selection
Claims 1, 10, and 19 recite projecting features into a vector space associated with a virtual camera "at a particular height" without defining any height range or differentiating between VRU and non-VRU branches at the independent claim level. The dual-height, dual-branch architecture that distinguishes this invention is only introduced in dependent Claims 4–6 and 13–15, meaning a competitor using a single adjustable height or a slightly different height range outside those numerical limits may avoid the dependent claims while still performing the independent claim's projection step. A stronger filing would have included the dual-height, dual-branch mechanism in the independent claims, or added additional intermediate dependent claims with overlapping height ranges to close the design-around corridor.
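The design-around corridor described above comes down to simple interval checks on the two virtual camera heights. The function below is a toy sketch that encodes only the numerical limits as summarized in this analysis (a particular height below 2 meters, and a different height between 2 and 20 meters); the function name and return format are hypothetical, and the result is an arithmetic check, not claim construction or an infringement opinion.

```python
def height_limits_met(particular_height_m: float, different_height_m: float) -> dict:
    """Check two heights against the numerical limits summarized for the dependent claims."""
    return {
        # Summarized for Claims 5, 14, and 20: particular height less than 2 meters.
        "particular height < 2 m": particular_height_m < 2.0,
        # Summarized for Claims 15 and 20: different height between 2 and 20 meters.
        "2 m < different height < 20 m": 2.0 < different_height_m < 20.0,
    }


# Heights inside both ranges, e.g. a low pedestrian view and a mid-height vehicle view.
inside = height_limits_met(1.5, 15.0)

# A higher second camera falls outside the summarized range for the second branch.
outside = height_limits_met(1.5, 25.0)

print(inside)
print(outside)
```

A configuration like the second one would still involve projecting features at "a particular height," which is why the gap analysis focuses on the independent claims rather than the numerical dependents.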
GAP 03 · HIGH IMPACT
No Training Method Claims for End-to-End ML Model
US 2023/0057509 A1 protects a vision-based machine learning model for autonomous driving that uses multiple image sensors positioned about a vehicle and projects extracted image features into a vector space associated with a virtual camera at an adjustable height. The patent covers method, system, and non-transitory computer storage media claim types. The specific technical problem solved is reducing reliance on costly and error-prone non-visual sensors (radar, LiDAR) by using adjustable virtual camera heights to separately optimize object detection for vulnerable road users (pedestrians, at low heights ~1.5m) and non-vulnerable road users (vehicles, at higher heights ~15-20m).
US 2023/0057509 A1 is owned by Tesla, Inc., headquartered in Austin, Texas, US. The listed inventors are John Emmons (Austin, TX), Danny Hung (Austin, TX), Ethan Knight (Austin, TX), and Lane McIntosh (Austin, TX).
Claim 1 is a method claim covering a vehicle processor system that obtains multi-sensor images, determines features via a forward pass through a first ML model portion, projects features into a virtual camera vector space at a particular height, aggregates temporally indexed projected features via video modules, and determines object positions via model heads. Claim 10 is a system claim (processors and non-transitory storage media in an autonomous vehicle) performing the same core operations. Claim 19 is a non-transitory computer storage media (CRM) claim storing instructions for an autonomous vehicle system to perform the same operations.
This patent covers a software system that helps autonomous vehicles 'see' and understand their surroundings using only cameras — without needing radar or LiDAR sensors. The system takes images from multiple cameras placed around the vehicle and uses a machine learning model to combine all those images into a single simulated viewpoint from a virtual camera placed at a chosen height above the vehicle. By adjusting the virtual camera height — lower for detecting pedestrians and cyclists, higher for detecting other vehicles — the system can more accurately identify and track objects around the car. This height-adjustable approach reduces manufacturing complexity while improving detection accuracy.
G06V 20/58 (2006.01) — Image or video recognition or understanding relating to traffic or road conditions, specifically scene understanding for autonomous vehicles. G06N 20/00 (2019.01) — Machine learning, covering computational methods and systems for learning-based processing of data.
Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.