Patent Drafting Analysis of Google LLC’s Photo Relighting Using Deep Neural Networks and Confidence Learning | US 12,136,203 B2
IP Drafting Analysis · US 12,136,203 B2
Patent Drafting Analysis of Google LLC's Photo Relighting Neural Network with Confidence Learning | US 12,136,203 B2
A structural and strategic analysis of US 12,136,203 B2, examining claim architecture, drafting quality, critical gaps, and prosecution positioning for Google's confidence-learning-based photo relighting system.
US 12,136,203 B2 · Filed: Aug 22, 2023 · Granted: Nov 5, 2024 · G06T 5/00 · G06T 5/94 · G06T 15/50
Example images, CNN architecture, confidence learning, network diagrams
Published by PatSnap Insights Team · 12 min read · Verified by PatSnap Eureka Data
Overview
Structural Overview
The detailed description dominates at approximately 63% of total specification words (~6,200 of ~9,800), providing extensive technical coverage of the CNN architecture, confidence learning mechanism, and distributed computing infrastructure. The claim set comprises 20 claims (3 independent and 17 dependent) spanning two method claim families (Claims 1 and 13) and one apparatus claim (Claim 20); the dependent-to-independent ratio of 5.67:1 sits toward the upper end of the software/AI industry norm. The 23 drawing sheets are predominantly photographic examples of relighting results (FIGS. 7–18), with key structural figures covering the CNN architecture (FIGS. 4–6), machine learning framework (FIG. 19), and computing infrastructure (FIGS. 20–22).
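The two headline proportions above follow from simple arithmetic on the quoted counts. A minimal Python sketch, assuming the approximate word counts given in the text; the variable names are illustrative only:

```python
# Counts quoted in the overview (the word counts are approximate figures).
detailed_description_words = 6200
total_spec_words = 9800
independent_claims = 3
dependent_claims = 17

# Share of the specification taken up by the detailed description.
description_share = detailed_description_words / total_spec_words

# Dependent-to-independent claim ratio.
dep_to_indep_ratio = dependent_claims / independent_claims

print(f"Detailed description share: {description_share:.0%}")
print(f"Dependent:independent ratio: {dep_to_indep_ratio:.2f}:1")
```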
Section Word Distribution
Figure Inventory — 23 Sheets
| Figure | Description | Role |
| --- | --- | --- |
| FIG. 1 | Illustrates images 110, 120, 130 with imperfect lighting conditions including inconsistent lighting, multiple shadows, and dim/moody greenish tint on human faces. | Other |
| FIG. 2 | Shows image 210 of a scene with multiple light sources alongside images 220, 230, 240, 250 of a face captured under different individual lighting conditions for OLAT training data. | Claim support |
| FIG. 3 | Depicts indoor images 310, 320, 330 and outdoor images 340, 350, 360 representing various lighting conditions for training the convolutional neural network. | Claim support |
| FIG. 4 | Diagram depicting training of fully-convolutional neural network 430, showing input original image 410 and target lighting model 420 flowing through the network to produce target image 440 and original lighting model 450. | System architecture |
| FIG. 5 | Block diagram of convolutional neural network 430 showing encoder layers Orig L1–L4 (510–516), OLM information layers (520–524), TLM information layers (530–534), and decoder target layers (540–546) with skip connections. | System architecture |
| FIG. 6 | Block diagram illustrating confidence learning 630, showing OLM Info 610 generating light prediction 640 at Patch 1, multiplied by prediction confidence value 650 to produce updated light prediction 660, with non-predicted normal direction 622 indicated. | Key embodiment |
| FIG. 7 | Shows image 700 of a human face with groundtruth and predicted original lighting environment maps side by side, demonstrating CNN lighting model prediction accuracy. | Claim support |
| FIG. 8 | Shows image 800 of a human face comparing groundtruth and predicted original lighting models, confirming light comes predominantly from behind the face. | Claim support |
| FIG. 9 | Shows image 900 with groundtruth and predicted original lighting maps indicating two light sources from behind the face on both sides. | Claim support |
| FIG. 10 | Displays original image 1010, ground-truth target image 1020, and predicted target image 1030 for a face originally backlit with uniform light, relighted to show three-source target lighting. | Claim support |
| FIG. 11 | Displays original image 1110, ground-truth target image 1120, and predicted target image 1130 for a face with dim backlit lighting relighted with bright back-lighting. | Claim support |
| FIG. 12 | Shows original image 1210, ground-truth target image 1220, and predicted target image 1230 where a face with stronger left-side backlit light is relighted with a single dominant large light source. | Claim support |
| FIG. 13 | Shows original image 1310, ground-truth target image 1320, and predicted target image 1330 for a uniformly backlit face relighted with two right-dominant light sources. | Claim support |
| FIG. 14 | Shows original image 1410, ground-truth target image 1420, and predicted target image 1430 for a face backlit with one large white source relighted with three mixed white/yellow light sources. | Claim support |
| FIG. 15 | Shows original image 1510, ground-truth target image 1520, and predicted target image 1530 for a uniformly backlit face relighted with a single left-side light source. | Claim support |
| FIG. 16 | Shows original image 1610, ground-truth target image 1620, and predicted target image 1630 for a face with left-dominant light relighted with a dominant right-side bright source. | Claim support |
| FIG. 17 | Shows original image 1710, ground-truth target image 1720, and predicted target image 1730 for a right-lit face relighted with three light sources including two large backlighting sources. | Claim support |
| FIG. 18 | Shows original image 1810, ground-truth target image 1820, and predicted target image 1830 for a face lit by one close white source relighted with three yellow light sources. | Claim support |
| FIG. 19 | Diagram 1900 illustrating training phase 1902 and inference phase 1904 of trained machine learning model(s) 1932, showing training data 1910, ML algorithms 1920, inference requests 1940, and predictions 1950 with feedback loop 1960. | System architecture |
| FIG. 20 | Depicts distributed computing architecture 2000 with server devices 2008, 2010, network 2006, and programmable devices 2004a–2004e including tablet, phone, wearable, and vehicle. | System architecture |
| FIG. 21 | Block diagram of computing device 2100 showing processors 2103, data storage 2104 with computer-readable instructions 2106 and trained neural network model 2112, cameras 2118, sensors 2120, and power system 2122. | Key embodiment |
| FIG. 22 | Depicts cloud-based server system with three computing clusters 2209a–2209c, each containing computing devices 2200, cluster storage arrays 2210, and cluster routers 2211 connected via network 2006. | System architecture |
| FIG. 23 | Flowchart of method 2300 showing three steps: train neural network (2310), receive input image and lighting model data (2320), and determine output image using trained neural network (2330). | Flow diagram |
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims
Claim Architecture Analysis
The patent presents 3 independent claims: Claims 1 and 13 are computer-implemented method claims and Claim 20 is an apparatus claim (computing device). This establishes coverage across method and apparatus formats but notably omits a computer-readable medium (CRM) claim type. The 17 dependent claims yield a 5.67:1 dependent-to-independent ratio, toward the upper end of the typical 4–6:1 norm for software/AI patents, indicating reasonable fallback layering. Claim 13 is structurally distinct from Claim 1: it recites an already-trained neural network performing dual prediction (initial lighting model and relighting) without a training step of its own, targeting the inference deployment scenario separately.
Core inventive concept: The claims address the technical problem of applying desired lighting models to existing photographs by training a neural network using confidence learning — where per-patch light predictions are mathematically combined with prediction confidence values — enabling the network to weight reliable predictions over less confident ones. As expressed across Claims 1 and 13, this involves jointly predicting an initial lighting model and a relighting of the object by replacing the initial lighting model with a target lighting model, using a network trained on datasets pairing images with corresponding lighting models indicative of environmental light source locations.
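To make the confidence-weighting idea concrete, here is a minimal Python sketch of one way per-patch light predictions could be combined with confidence values. The analysis only states that each patch's prediction is multiplied by its confidence value; the summation across patches, the normalisation by total confidence, the function name, and the toy grid are all assumptions of this sketch, not the patent's actual formula.

```python
def combine_patch_predictions(predictions, confidences):
    """Confidence-weighted average of per-patch lighting predictions.

    predictions: list of per-patch lighting grids (each a list of rows of floats).
    confidences: list of per-patch confidence grids with the same shape.
    Each output cell is sum(confidence * prediction) / sum(confidence) over patches.
    Dividing by the total confidence is an assumption of this sketch.
    """
    rows, cols = len(predictions[0]), len(predictions[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            weighted = sum(p[r][c] * w[r][c] for p, w in zip(predictions, confidences))
            total = sum(w[r][c] for w in confidences)
            # Cells that no patch is confident about stay at zero.
            combined[r][c] = weighted / total if total > 0 else 0.0
    return combined


# Two hypothetical patches, each predicting a tiny 1x2 lighting grid.
patch_predictions = [[[1.0, 0.0]], [[0.0, 4.0]]]
patch_confidences = [[[0.75, 0.0]], [[0.25, 1.0]]]
result = combine_patch_predictions(patch_predictions, patch_confidences)
print(result)
```

In this toy input the second patch has zero weight from the first patch on the right-hand cell, so that cell takes the second patch's value unchanged, which is the "weight reliable predictions over less confident ones" behaviour described above.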
Independent Claim Dissection
| Claim | Preamble | Transition | Key Body Elements |
| --- | --- | --- | --- |
| Claim 1 | A computer-implemented method | comprising | receiving a training dataset of images each associated with a corresponding lighting model indicative of environmental light source locations; training a neural network by receiving input image and target lighting model, predicting an initial lighting model and a relighting replacing the initial with target; providing the trained neural network |
| Claim 13 | A computer-implemented method | comprising | receiving, by a computing device, an input image and data about a target lighting model; predicting by a trained neural network (i) an initial lighting model indicative of environmental light source locations, and (ii) a relighting of the object applying the target lighting model, the neural network having been trained by the specified training process; providing, by the computing device, an output image comprising the relighting |
| Claim 20 | A computing device | comprising | one or more processors; data storage with computer-executable instructions that when executed cause: receiving input image and target lighting model data; predicting by trained neural network (i) initial lighting model and (ii) relighting of object applying target lighting model, neural network trained by specified process; providing output image comprising the relighting |
Claim Dependency Tree
Claim 1 (independent): Computer-implemented method, training-focused: receiving training dataset with lighting models, training neural network, predicting initial lighting model and relighting
- Claim 5 adds: training utilizes deep supervision technique to constrain one or more intermediate layers
- Claim 6 adds: training is based on a generative adversarial net loss function
- Claim 7 adds: training utilizes confidence learning based on light predictions and prediction confidence values associated with lighting of the input image
- Claim 8 adds: the object comprises a reflection property that diffusely reflects light
- Claim 10 adds: training the neural network comprises training the neural network at the computing device
- Claim 11 adds: initial lighting model, target lighting model, and given lighting model comprise data for color, intensity, albedo, direction, surface normal, or light source location
- Claim 12 adds: plurality of images comprise objects under plurality of different lighting conditions including different directions, intensities, colors, or numbers of light sources
Claim 13 (independent): Computer-implemented method, inference-focused: receive input image and target lighting model, predict initial lighting model and relighting using pre-trained neural network, provide output image
- Claim 14 adds: the object comprises a reflection property that diffusely reflects light
- Claim 16 adds: relighting is modeled using the initial lighting model predicted by the trained neural network; method further includes providing the initial lighting model
- Claim 17 adds: providing the output image comprises determining request to apply target lighting model, sending request to second computing device with trained neural network, receiving output image from second computing device
- Claim 18 adds: providing the output image comprises obtaining the trained neural network at the computing device and determining output using obtained neural network
- Claim 19 adds: computing device comprises an image capturing device; receiving the input image comprises capturing the input image using the image capturing device
Claim 20 (independent): Computing device apparatus: processors and data storage with instructions to perform the method of Claim 13's functional scope
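The dependency tree can also be held as a small lookup table, which makes it easy to see which independent claim each listed fallback sits under. A minimal Python sketch: the grouping follows the order in which claims are listed above, the names `INDEPENDENT` and `FAMILY_ROOT` are illustrative, and claims not shown in the tree are deliberately left out rather than guessed. It records the claim family only, not whether a dependency is direct or runs through an intermediate claim.

```python
# Independent claims as summarised in the dependency tree.
INDEPENDENT = {1: "training method", 13: "inference method", 20: "computing device"}

# Listed dependent claim -> independent claim it appears under.
# Claims absent from the tree are intentionally omitted.
FAMILY_ROOT = {
    5: 1, 6: 1, 7: 1, 8: 1, 10: 1, 11: 1, 12: 1,
    14: 13, 16: 13, 17: 13, 18: 13, 19: 13,
}


def dependents_by_family(family_root, independents):
    """Group the listed dependent claims under their independent claim."""
    families = {claim: [] for claim in independents}
    for dependent, root in sorted(family_root.items()):
        families[root].append(dependent)
    return families


families = dependents_by_family(FAMILY_ROOT, INDEPENDENT)
for root, deps in families.items():
    print(f"Claim {root} ({INDEPENDENT[root]}): {deps or 'no dependents listed'}")
```

Laid out this way, it is immediately visible that the apparatus claim has no listed fallbacks and that Claim 7 belongs to the training-method family only.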
| Metric | This Application | Software / AI Industry Norm |
| --- | --- | --- |
| Total claims | 20 | 15–25 |
| Independent claim count | 3 | 2–4 |
| Dependent : Independent ratio | 5.67 : 1 | 4–6 : 1 |
| Method claims present? | Yes (Claims 1 & 13) | Common |
| System / apparatus claims? | Yes (Claim 20) | Common |
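The comparison against industry norms can be expressed as a simple range check. A minimal Python sketch using the figures from the comparison above; treating the norm bounds as inclusive is an assumption, and `check_against_norms` is a hypothetical helper name:

```python
# Norm ranges copied from the comparison table (bounds treated as inclusive).
NORMS = {
    "total_claims": (15, 25),
    "independent_claims": (2, 4),
    "dependent_to_independent_ratio": (4.0, 6.0),
}

# Values for this application; the ratio is 17 dependent / 3 independent claims.
application = {
    "total_claims": 20,
    "independent_claims": 3,
    "dependent_to_independent_ratio": 17 / 3,
}


def check_against_norms(values, norms):
    """Return {metric: bool} for whether each value lies inside its norm range."""
    return {
        name: norms[name][0] <= value <= norms[name][1]
        for name, value in values.items()
    }


report = check_against_norms(application, NORMS)
for name, ok in report.items():
    print(f"{name}: {'within norm' if ok else 'outside norm'}")
```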
Drafting Quality
Drafting Quality Signals
The patent exhibits notable strengths in its specification-to-claim mapping — FIG. 6 and the confidence learning discussion directly support the key differentiating limitation in Claim 7 (confidence learning based on light predictions and confidence values), and the extensive figure set (FIGS. 7–18) provides rich visual support for the inference claims in Claim 13. A significant weakness is the absence of a computer-readable medium (CRM) claim, leaving a major design-around pathway open and creating enforcement gaps against parties who distribute the relighting software without operating a computing device.
✅
Antecedent Basis
Antecedent basis is consistently maintained throughout the 20 claims. In Claim 13, "a trained neural network" is introduced in the predicting step and properly referenced as "the neural network" in the training history clause. In Claim 20, "data storage" is introduced in the claim body, after the "A computing device comprising" preamble and transition, and correctly recalled as "the data storage" thereafter. No orphaned "the [element]" references were identified across dependent Claims 2–12 or 14–19.
Key claim limitations map clearly to specific figures and paragraphs. The confidence learning limitation recited in Claim 7 is directly supported by FIG. 6 (confidence learning 630, light prediction 640, prediction confidence value 650) and the detailed description columns 10–11. The dual-prediction structure of Claims 13 and 20 — predicting both initial lighting model and relighting — maps directly to FIG. 4 and column 6 discussing outputs 440 and 450. The lighting model data representations claimed in Claim 11 (color, intensity, albedo, direction, surface normal) are enumerated in the detailed description column 7.
All independent claims (1, 13, 20) use "comprising" as the transition, which is strategically optimal for this AI/image processing technology — it ensures the claims read on embodiments that include additional neural network components, loss functions, or image processing steps beyond those explicitly recited. The dependent claims also use "wherein" appropriately for limiting the parent claims rather than "consisting of," preserving maximum coverage at each fallback level. No missed opportunities for broader transitional language were identified.
Clause 20 of the specification's additional clauses section (not the formal Claim 20) recites "means for carrying out the computer-implemented method," which is classic §112(f) language. However, this appears only in the non-limiting additional clauses section, not in the formal claims as granted. The formal Claim 20 uses structural apparatus language ("one or more processors" and "data storage"), which avoids §112(f) invocation. No means-plus-function language was identified in any of the 20 formal granted claims, so this risk is effectively managed.
Claims 1 and 13 face moderate Alice/Mayo exposure as computer-implemented methods directed to image processing using neural networks — an area the USPTO has found abstract in related cases. The §101 defense rests primarily on the hardware tie-in in Claim 20 ("one or more processors" and "data storage") and the "computing device" recitation in Claims 1 and 13. However, Claims 1 and 13 do not independently recite specific hardware beyond the computing device, which could be argued as a generic computer. The confidence learning mechanism (Claim 7) and dual simultaneous prediction of lighting model and relighting (Claims 13, 20) provide the strongest technical improvement arguments under the Enfish/McRO line of cases for surviving Alice at step 2A prong 2.
The dependent claims add genuinely distinct fallback positions across multiple dimensions. Claims 2–6 specify different training variations (cycle loss, L2, log L1, and GAN loss functions, with Claim 5 adding deep supervision), which provides layered fallback against prior art using specific training regimes. Claim 7 adds the key confidence learning limitation, making it the single most valuable dependent claim for prosecution history estoppel management. Claims 11 and 12 add data representation specifics and training dataset diversity requirements respectively. However, Claims 8/14 and 9/15 are near-parallel duplicates across the two method claim families (diffuse reflection property and face-of-a-person), adding less distinct value.
The abstract accurately describes the apparatus and method structure and mentions confidence learning explicitly — "training of the neural network can utilize confidence learning that is based on light predictions and prediction confidence values" — which is the core differentiating feature. However, the abstract does not distinguish this invention from prior art single-image relighting work (e.g., the referenced Sunkavalli publications), omitting any mention of the dual simultaneous prediction of both the initial lighting model and the relighting output that distinguishes Claim 13. An examiner reading only the abstract might incorrectly categorize this as a straightforward CNN relighting patent rather than one focusing on confidence-weighted per-patch light prediction.
Figure support is comprehensive for all key claim limitations. The lighting model grid structure (16×32 cells per FIG. 4) supports Claim 11's data representation limitations. FIG. 5 supports the encoder-decoder architecture implicit in the trained neural network claims. FIG. 6 directly maps to Claim 7's confidence learning limitation. FIG. 19 supports Claim 10 regarding on-device training. FIG. 21 supports Claim 20's computing device architecture. One minor gap: no figure explicitly depicts the distributed request-response scenario in Claim 17, though FIGS. 20 and 22 provide general infrastructure support for this embodiment.
Scorecard
Strategic Intent Scorecard
Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.
| Dimension | Score (out of 5) |
| --- | --- |
| Claim Breadth | 3.5 |
| Prosecution Defensibility | 3.8 |
| Spec–Claim Consistency | 4.2 |
| Dependent Claim Coverage | 3.6 |
| Claim Type Diversity | 3.0 |
| Figure Support Quality | 4.0 |
Key observation: Spec–Claim Consistency scores highest (4.2/5) because the key claim limitations, including the dual prediction structure of Claims 13 and 20, the confidence learning of Claim 7, and the lighting model data representations of Claim 11, map directly to named figures and specific paragraphs in the detailed description, providing strong §112 written description support. Claim Type Diversity scores lowest (3.0/5) because the patent omits a computer-readable medium (CRM) claim entirely, a gap that creates an immediate design-around opportunity for software distributors and limits Google's enforcement options against parties who distribute relighting software without operating hardware. A practitioner reviewing this patent should consider whether a continuation application adding CRM claims, and potentially a system claim reciting the specific encoder-decoder architecture of FIG. 5, would meaningfully strengthen the portfolio position.
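The highest and lowest dimensions named in the key observation can be read straight off the scorecard values. A minimal Python sketch; the dictionary simply restates the six scores, and the variable names are illustrative:

```python
# Scorecard values as reported (each on a 5-point scale).
scores = {
    "Claim Breadth": 3.5,
    "Prosecution Defensibility": 3.8,
    "Spec-Claim Consistency": 4.2,
    "Dependent Claim Coverage": 3.6,
    "Claim Type Diversity": 3.0,
    "Figure Support Quality": 4.0,
}

# Pick the dimension names with the largest and smallest scores.
highest = max(scores, key=scores.get)
lowest = min(scores, key=scores.get)

print(f"Highest: {highest} ({scores[highest]}/5)")
print(f"Lowest: {lowest} ({scores[lowest]}/5)")
```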
A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.
GAP 01 · HIGHEST IMPACT
No Computer-Readable Medium Claims Filed
The claim set entirely omits a computer-readable medium (CRM) or non-transitory computer-readable storage medium claim type, despite the specification explicitly describing such embodiments (see "Additional Example Embodiments" Clause 18: "An article of manufacture including one or more computer readable media"). This gap allows any party that distributes relighting software (as a downloadable app, SDK, or cloud API client) to potentially avoid infringement of the method claims (which require active execution) and the apparatus claim (which requires operating a computing device), creating an immediate design-around pathway. A stronger filing would have included at least one independent CRM claim in the formal claims, which the specification fully supports, as a third claim type alongside the method and apparatus claims.
GAP 02 · HIGH IMPACT
Confidence Learning Limited Only to Dependent Claim 7
The confidence learning mechanism, the patent's core technical differentiator over prior single-image relighting art, appears only in dependent Claim 7 (depending from Claim 1) and is not recited in any of the three independent claims (Claims 1, 13, or 20). Because Claim 7 sits in Claim 1's dependent hierarchy, the only confidence learning protection attaches to the training method, leaving Claims 13 and 20 (the inference and apparatus claims) without a confidence-learning fallback against competitors who specifically implement confidence-weighted per-patch light prediction. A stronger filing would have incorporated the confidence learning limitation into all three independent claims, or alternatively filed Claim 13 in a form that explicitly incorporates the neural network's confidence-learning training history.
GAP 03 · HIGH IMPACT
No System Claim on Distributed Training Architecture
US 12,136,203 B2 protects computer-implemented methods and computing devices for applying lighting models to images of objects using a trained neural network. The patent solves the technical problem of adjusting lighting in already-captured photographs by training a convolutional neural network with confidence learning — where per-patch light predictions are mathematically combined with prediction confidence values to weight reliable predictions — enabling the network to simultaneously predict the original lighting model of an input image and apply a target lighting model to generate a relighted output image.
US 12,136,203 B2 is assigned to Google LLC, located in Mountain View, California, US. The inventors are Tiancheng Sun (Mountain View, CA, US), Yun-Ta Tsai (Los Gatos, CA, US), and Jonathan Barron (Alameda, CA, US).
Claim 1 is a computer-implemented method covering the training phase — receiving a training dataset with lighting models, training a neural network to predict both an initial lighting model and a relighting of an image, and providing the trained neural network. Claim 13 is a computer-implemented method covering the inference phase — receiving an input image and target lighting model, using a pre-trained neural network to predict the initial lighting model and relighted output, and providing the output image. Claim 20 is an apparatus claim directed to a computing device with processors and data storage containing instructions to perform the functional equivalents of Claim 13's inference steps.
This patent covers AI-powered photo relighting technology — a system that can take an existing photograph of a person or object and change the lighting to make it look like the photo was taken under different lighting conditions. The key innovation is a confidence learning approach where the neural network learns to trust its own predictions more when they are reliable and less when they are uncertain, which leads to more accurate and natural-looking relighting results. This technology is applicable to portrait photography enhancement, virtual photography studios, and real-time camera applications on mobile devices.
Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.