
Patent Drafting Analysis of Origin Research Wireless’s Radio-Based Voice Activity Detection System | US 12,272,375 B2

IP Drafting Analysis · US 12,272,375 B2


A structural and strategic analysis of US 12,272,375 B2, examining claim architecture, drafting quality, dependent claim fallback coverage, §101 eligibility exposure, and prosecution positioning for this mmWave-based voice activity detection patent.

US 12,272,375 B2 · Filed: Oct 4, 2022 · Granted: Apr 8, 2025 · Classifications: G10L 25/78, G10L 25/18, G10L 25/30, G10L 25/90, H01Q 3/22, H04W 4/021
Spec Words: 38,500 (across 6 sections)
Total Claims: 18 (2 independent · 16 dependent)
Figure Sheets: 32 (system diagrams, radar maps, neural networks, performance charts)
Published by PatSnap Insights Team · 14 min read · Verified by PatSnap Eureka Data
Overview

Structural Overview

The detailed description dominates at approximately 85% of total words (~34,000 words), reflecting a highly expansive specification that incorporates broad background technical boilerplate characteristic of this applicant's patent family. The claim set comprises 18 claims total — 2 independent (Claims 1 and 18 covering system and method respectively) and 16 dependent claims — yielding an 8:1 dependent-to-independent ratio. The 32 drawing sheets provide extensive visual coverage spanning system overviews, radar feature maps, neural network architectures, signal spectrograms, and performance comparison charts.

Section Word Distribution

Detailed Description: 34,000 words
Claims: 6,800 words
Summary: 3,400 words
Background: 4,800 words
Brief Description: 2,700 words
Abstract: 700 words

Figure Inventory — 32 Sheets

Figure | Description | Role
FIG. 1 | Overview of speech enhancement and separation (RadioSES) system showing mmWave sensing and acoustic sensing modules feeding a Speech Enhancement and Separation block outputting per-person audio streams. | System architecture
FIG. 2 | Block diagram of RadioSES system 200 showing mmWave radar (Tx 211, Rx 212), smart speaker 221, source detection and localization block 201, and RadioSESNet deep learning module 202. | Key embodiment
FIG. 3 | Constant false alarm rate (CFAR) window in range-azimuth space showing cell under test, guard cells, and training cells for target detection. | Claim support
FIG. 4 | Amplitude map of range-azimuth plane for radio feature extraction showing two speaker locations at approximately 1 m and 1.5 m range. | Claim support
FIG. 5 | Variance map for radio feature extraction showing the variance of channel impulse response at each range-azimuth bin to identify stationary versus dynamic objects. | Claim support
FIG. 6 | Detection map for radio feature extraction showing binary detection output at specific range-azimuth bins corresponding to two speaker locations. | Claim support
FIG. 7 | DBSCAN clustering output showing localized positions of Person I and Person II in range-azimuth space used for speaker number estimation. | Claim support
FIG. 8A | Unimodal (audio-only) RadioSESNet architecture showing encoder, masker with time-frequency representation, multiplicative masks, and dual decoders producing per-speaker audio outputs. | Key embodiment
FIG. 8B | Multimodal (audio-radio) RadioSESNet architecture extending FIG. 8A with a parallel radio encoder branch feeding radio features into the masker alongside audio features. | Key embodiment
FIG. 9 | Detailed RadioSESNet masker structure showing adaptive encoders for radio and audio, Radio/Audio DPRNN blocks, Fusion+DPRNN(×4) multimodal masker, and adaptive decoders outputting Audio Mask 1 and Audio Mask 2. | Key embodiment
FIG. 10 | DPRNN block workflow showing input reshaping operation, intra-block BLSTM(T) processing, inter-block BLSTM(S) processing, and output block with normalization layers. | Key embodiment
FIG. 11 | Learning curves comparing audio-radio (AR) system versus audio-only (AO) system across training epochs for clean and noisy separation validation sets. | Other
FIG. 12 | Scatter plot comparing output SiSDR of audio-radio versus audio-only systems across varying input SiSDR levels showing consistent performance gains of AR approach. | Other
FIG. 13 | Differential gain (ΔSiSDR) scatter plot showing relative improvement of audio-radio over audio-only system with median improvement line across input SiSDR values. | Other
FIG. 14A | Experimental setup showing distances (50 cm, 75 cm, 100 cm) from collocated microphone and radar device to seated speakers for separation performance evaluation. | Other
FIG. 14B | Orientation variation experimental setup showing speaker at 75 cm distance from device at angle θ relative to radar boresight. | Other
FIG. 14C | Head orientation variation experimental setup showing speaker at 50 cm rotating head at angle θ to test robustness of system to head movement. | Other
FIG. 15 | Block diagram of Bot 1500 (Type 1 / transmitter device) showing processor 1502, memory 1504, transceiver 1510, synchronization controller 1506, power module 1508, and wireless signal generator 1522. | System architecture
FIG. 16 | Block diagram of Origin 1600 (Type 2 / receiver device) showing processor 1602, memory 1604, transceiver 1610, synchronization controller 1606, channel information extractor 1620, and motion detector 1622. | System architecture
FIG. 17 | Flow chart of method 1700 for radio-assisted signal estimation: obtain baseband mixture signal, obtain radio feature, construct adaptive filter, filter mixture signal, generate source estimation. | Flow diagram
FIG. 18 | System 1800 for radio-assisted signal estimation showing sensor 1810, transmitter 1820 sending first radio signal through wireless channel 1840, receiver 1830 receiving second radio signal, and processor 1835. | Key embodiment
FIG. 19 | First adaptive filter 1900 architecture with first baseband filter 1910, second baseband filter 1920, third filter 1930, and fourth filter 1940 producing first output signal 1909. | Key embodiment
FIG. 20 | Detailed diagram of first adaptive filter 1900 showing first pre-processing 2011, first transformation 2012, transformed-domain filters, third pre-processing 2031, fourth transformed-domain filter 2041, and post-processing 2043. | Key embodiment
FIG. 21A | Microphone spectrogram showing frequency components of background noise, target speech, and interference signals over a 20-second recording period. | Claim support
FIG. 21B | Radio spectrogram showing frequency components up to 400 Hz captured by mmWave radar, demonstrating radio-only captures target speaker vibration without background noise or interference. | Claim support
FIG. 21C | Comparative VAD output timeline showing Audio-VAD and Silero-VAD triggering false alarms during interference while Radio-VAD correctly detects only the target speaker's voice activity. | Claim support
FIG. 22 | Overview diagram of VAD system 2200 with smart device 2210 (radio + mic), mmWave-based VAD 2220, and microphone recording and processing subsystem 2230 showing sound vs. vibration sensing paths. | System architecture
FIG. 23 | Neural network architecture for radio-based VAD showing Conv32@[2×16] encoder, BiLSTM(dim=1)×4 blocks, LSTM(dim=2) inter-block, fully connected layers, overlap-and-add, downsample, producing VAD output. | Key embodiment
FIG. 24A | Bar chart comparing Radio-VAD, Audio-VAD, and Silero VAD on accuracy, precision, recall, and F1-score metrics in test set I (closed condition, seen users). | Other
FIG. 24B | Bar chart comparing Radio-VAD, Audio-VAD, and Silero VAD on accuracy, precision, recall, and F1-score metrics in test set II (open condition, unseen users). | Other
FIG. 25 | Audio-radio multimodal VAD framework showing dual encoders for radio and audio inputs, concatenation of modalities, DPRNN block, and decoder producing VAD output. | Key embodiment
FIG. 26 | Flow chart of method 2600 for radio-based VAD: obtain radio signal through wireless channel (2602), compute time series of CI (2604), detect voice activity without any other signal (2606). | Flow diagram
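
The CFAR window that FIG. 3 describes (cell under test, guard cells, training cells) can be illustrated with a minimal cell-averaging CFAR sketch over a toy range-azimuth power map. The window sizes, the scale factor, and the names `ca_cfar_2d` and `demo_map` are assumptions made for illustration; they are not taken from the patent, which may use a different CFAR variant.

```python
def ca_cfar_2d(power, guard=1, train=2, scale=4.0):
    """Cell-averaging CFAR over a 2D power map given as a list of rows.

    guard, train and scale are illustrative values, not values from the patent.
    """
    rows, cols = len(power), len(power[0])
    half = guard + train
    detections = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    if abs(dr) <= guard and abs(dc) <= guard:
                        continue  # skip the cell under test and its guard cells
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += power[rr][cc]  # training cell inside the map
                        count += 1
            noise_estimate = total / count
            detections[r][c] = power[r][c] > scale * noise_estimate
    return detections


def demo_map():
    """Flat noise floor with two strong reflectors (hypothetical data)."""
    grid = [[1.0] * 12 for _ in range(12)]
    grid[3][4] = 30.0
    grid[8][9] = 25.0
    return grid
```

Calling `ca_cfar_2d(demo_map())` flags only the two strong cells, because each cell is compared against a noise level estimated from its own neighbourhood rather than against a single global threshold.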
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims

Claim Architecture Analysis

The patent contains 2 independent claims, Claim 1 (system) and Claim 18 (method), with 16 dependent claims providing an 8:1 dependent-to-independent ratio. That sits at the upper end of the typical 4–8:1 range for wireless communications patents, suggesting meaningful layering of fallback positions. Both independent claims are structured around a transmitter/receiver/processor triad and recite a distinctive VAD limitation, detecting voice activity of a target source 'without using any media signal', which establishes interference-resilient, radio-only detection as the core novel contribution. Notably absent is a computer-readable medium (CRM) claim type, leaving a significant claim format gap.

Core inventive concept: Claims 1 and 18 address the problem of conventional audio-based VAD systems triggering false alarms when interfering speakers or background noise are present in a venue. The inventive mechanism, recited across both independent claims, is: extracting a time series of channel information (TSCI) from a standard-compliant radio signal (WiFi, LTE, IEEE 802.11), performing beamforming to compute a directional TSCI (DTSCI) associated with the target voice source's direction, computing a time series of radio feature (TSRF) from the DTSCI, and detecting voice activity via pitch profile and intermittent voiced/unvoiced speech sequence detection — all 'without using any media signal.'
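
The chain recited in the independent claims (TSCI, beamforming, DTSCI, TSRF, spectral analysis) can be sketched on synthetic data. This is an illustrative toy under stated assumptions, not the patented implementation: the four-antenna half-wavelength array, the 1 kHz sounding rate, the phase-based radio feature, the two vibration rates, and every function name below are hypothetical choices made only to show how steering toward one direction isolates that source's vibration.

```python
import cmath
import math

FS = 1000  # channel-sounding rate in Hz (assumed)
N = 500    # number of CSI snapshots (assumed)
K = 4      # receive antennas at half-wavelength spacing (assumed)


def steering(theta_deg, k):
    """Phase at antenna k for a plane wave arriving from theta_deg."""
    return cmath.exp(1j * math.pi * k * math.sin(math.radians(theta_deg)))


def synth_tsci(sources, beta=0.1):
    """One CSI time series per antenna; each source vibrates at its own rate."""
    tsci = []
    for k in range(K):
        series = []
        for n in range(N):
            h = 0j
            for theta, f0 in sources:
                phase = beta * math.sin(2 * math.pi * f0 * n / FS)
                h += cmath.exp(1j * phase) * steering(theta, k)
            series.append(h)
        tsci.append(series)
    return tsci


def beamform(tsci, theta_deg):
    """Delay-and-sum beamformer: a directional TSCI steered toward theta_deg."""
    return [
        sum(tsci[k][n] * steering(theta_deg, k).conjugate() for k in range(K)) / K
        for n in range(N)
    ]


def radio_feature(dtsci):
    """Toy radio feature: mean-removed phase of the directional TSCI."""
    phase = [cmath.phase(v) for v in dtsci]
    mean = sum(phase) / len(phase)
    return [p - mean for p in phase]


def dominant_frequency(x, lo=50, hi=400, step=2):
    """Frequency in Hz of the strongest DFT component between lo and hi."""
    def magnitude(f):
        return abs(sum(v * cmath.exp(-2j * math.pi * f * n / FS)
                       for n, v in enumerate(x)))
    return max(range(lo, hi + 1, step), key=magnitude)


# (direction in degrees, vibration rate in Hz); both sources are hypothetical
sources = [(0.0, 150), (40.0, 220)]
tsci = synth_tsci(sources)
target_peak = dominant_frequency(radio_feature(beamform(tsci, 0.0)))
other_peak = dominant_frequency(radio_feature(beamform(tsci, 40.0)))
```

Steering toward 0 degrees makes the 150 Hz source dominate the feature spectrum, while steering toward 40 degrees makes the 220 Hz source dominate. That is the sense in which a directional TSCI lets the target be detected separately from a non-target source without any audio input.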

Independent Claim Dissection

Claim | Preamble | Transition | Key Body Elements
Claim 1 | A system for radio-based voice activity detection, | comprising: | transmitter transmitting standard-compliant radio signal (mobile cellular, WLAN, WiFi, IEEE 802.11/802.11bf) through wireless channel; receiver receiving signal where channel is impacted by target voice source and non-target voice source; processor configured to: extract TSCI (CSI/CFR/CIR), perform beamforming, compute directional TSCI (DTSCI), compute time series of radio feature (TSRF), compute radio spectrogram, detect pitch profile with time profile of pitch and harmonics, detect sequence of intermittent voiced/unvoiced speech based on continuous time trend and harmonics, detect voice activity of target separately from non-target voice source without using any media signal
Claim 18 | A method for radio-based voice activity detection, | comprising: | obtaining standard-compliant radio signal transmitted from transmitter to receiver through wireless channel (impacted by target voice and non-target voice, compliant to mobile cellular/WLAN/WiFi/IEEE 802.11/802.11bf); extracting set of TSCI (CSI/CFR/CIR); performing beamforming; computing DTSCI associated with direction of target voice source; computing TSRF from DTSCI; computing radio spectrogram; detecting pitch profile comprising time profile of pitch and harmonics; detecting sequence of intermittent voiced and unvoiced speech based on continuous time trend and harmonics; detecting voice activity of target voice source separately from non-target voice source without using any media signal

Claim Dependency Tree

Claim 1. System for radio-based VAD: transmitter, receiver, processor extracting TSCI/DTSCI/TSRF, detecting pitch profile, detecting voiced/unvoiced sequence, detecting voice activity without media signal
Claim 2. Adds: media signal definition (microphone signal, speech signal, vocal signal, audio signal, telephone signal, visual/video/multimedia signal)
Claim 3. Adds: standard-compliant radio signal is data communication signal; voice activity detected without using data payload
Claim 4. Further: voice activity detected without using any media signal data in the data communication signal
Claim 5. Further: voice activity of target voice source associated with voice producing motion of target voice source
Claim 6. Adds: each RF of TSRF computed based on sliding window of DTSCI; TSRF is baseband signal bandlimited to 1 MHz
Claim 7. Adds: detecting instantaneous pitch where instantaneous fundamental frequency is greater than lower threshold and less than upper threshold
Claim 8. Further: detecting at least one instantaneous harmonic of instantaneous pitch where frequency is integer multiple of fundamental frequency
Claim 9. Further: detecting pitch profile comprising plurality of instantaneous pitches; detecting voice-related time trend of plurality
Claim 10. Further: voice-related time trend comprises local continuity of pitches, habitual pitch, long term pitch, timing/pacing, fast/slow pitch change
Claim 11. Further (from 10): processor processes TSRF with neural network and detects pitch profile based on neural network processing
Claim 12. Further (from 10): processor computes frequency decomposition (STFT, wavelet, filter-bank, Fourier, multi-resolution, time-frequency, voiceprint, waterfall) and detects pitch profile based on decomposition
Claim 13. Further (from 12): detecting pitch profile by processing frequency decomposition with neural network
Claim 14. Further (from 13): processor configured to detect instantaneous pitch based on frequency decomposition of TSRF in time window
Claim 15. Further (from 14): processor detects instantaneous harmonics based on frequency decomposition of TSRF in time window
Claim 16. Further (from 15): transmitter or receiver has array of antennas; each TSCI associated with respective antenna; DTSCI associated with direction relative to antenna array
Claim 17. Further (from 16): processor associates target voice source with DTSCI component; associates non-target voice sources with different DTSCI components; selects component; rejects non-target sources
Claim 18. Method for radio-based VAD: obtain standard-compliant radio signal, extract TSCI, beamform, compute DTSCI/TSRF/spectrogram, detect pitch profile and voiced/unvoiced sequence, detect voice activity without media signal
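
The pitch limitations summarized for Claims 7 and 8 (a fundamental frequency between a lower and an upper threshold, plus harmonics at integer multiples of it) can be made concrete with a short sketch. The 80 Hz and 300 Hz thresholds, the 5 Hz tolerance, the peak list, and the function names are assumptions for illustration only; the tree above does not state the patent's actual threshold values.

```python
def detect_pitch(peaks, lower_hz=80.0, upper_hz=300.0):
    """Strongest spectral peak strictly between the two thresholds, else None.

    peaks is a list of (frequency_hz, magnitude) pairs for one time window.
    The threshold defaults are illustrative, not values from the patent.
    """
    candidates = [(mag, f) for f, mag in peaks if lower_hz < f < upper_hz]
    return max(candidates)[1] if candidates else None


def detect_harmonics(peaks, f0, tolerance_hz=5.0, max_order=4):
    """Peaks lying within tolerance_hz of an integer multiple of f0."""
    found = []
    for order in range(2, max_order + 1):
        target = order * f0
        near = [f for f, _ in peaks if abs(f - target) <= tolerance_hz]
        if near:
            closest = min(near, key=lambda f: abs(f - target))
            found.append((order, closest))
    return found


# Hypothetical spectral peaks from one window of a radio spectrogram
frame = [(50.0, 0.9), (120.0, 1.0), (241.0, 0.5), (330.0, 0.1), (362.0, 0.3)]
f0 = detect_pitch(frame)
harmonics = detect_harmonics(frame, f0)
```

In this toy frame the strong 50 Hz peak is rejected by the lower threshold, 120 Hz is selected as the instantaneous pitch, and the peaks near 240 Hz and 360 Hz are accepted as its second and third harmonics.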
Metric | This Application | Wireless / Signal Processing Norm
Total claims | 18 | 15 – 25
Independent claim count | 2 | 2 – 4
Dependent : Independent ratio | 8.00 : 1 | 4 – 8 : 1
Method claims present? | Yes (Claim 18) | Common
System / apparatus claims? | Yes (Claim 1) | Always
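
The ratio in the table, and the depth of the pitch-detection chain, can be reproduced from a parent map of the dependency tree. The map below is an assumption inferred from the tree's labels: 'Adds' is read as depending directly on Claim 1, 'Further' as depending on the preceding claim, and '(from N)' as depending on Claim N. The patent's actual dependencies should be checked against the claim text.

```python
# Parent of each dependent claim, inferred from the tree labels (assumption)
PARENT = {2: 1, 3: 1, 4: 3, 5: 4, 6: 1, 7: 1, 8: 7, 9: 8, 10: 9,
          11: 10, 12: 10, 13: 12, 14: 13, 15: 14, 16: 15, 17: 16}
INDEPENDENT = [1, 18]


def chain(claim):
    """Path from a claim up to the independent claim it ultimately depends on."""
    path = [claim]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


ratio = len(PARENT) / len(INDEPENDENT)                    # dependent : independent
via_claim_7 = sorted(c for c in PARENT if 7 in chain(c))  # claims on the pitch chain
deepest = max(PARENT, key=lambda c: len(chain(c)))        # most deeply nested claim
```

Under these assumptions, 11 of the 16 dependent claims pass through Claim 7, and all 16 hang off Claim 1, so Claim 18 has no dependents of its own.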
Drafting Quality

Drafting Quality Signals

The patent's primary strength is the precisely engineered 'without using any media signal' negative limitation in Claims 1 and 18, which creates a clear differentiation from prior art audio-based VAD systems and provides a prosecution-tested basis for novelty. A significant structural weakness is the absence of a computer-readable medium (CRM) claim type — the entire dependent claim tree hangs off only two independent claims covering system and method, leaving the software-implementation pathway entirely unprotected.

Antecedent Basis
Antecedent basis is clean across all 18 claims. The key coined terms — TSCI, DTSCI, TSRF — are all introduced in Claim 1 with indefinite articles ('a set of time series of channel information (TSCI)', 'a directional TSCI (DTSCI)', 'a time series of radio feature (TSRF)') and are then referenced with 'the' in all subsequent dependent claims. Claims 7 through 17, which reference 'the instantaneous pitch' and 'the pitch profile,' consistently trace back to the initial introductions in Claim 7. No orphaned 'the [element]' references were identified.
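
A first-pass antecedent check of the kind described here can be approximated in a few lines. The sketch below is a deliberately naive illustration: the claim snippets are invented toy text rather than the patent's claim language, the function name is hypothetical, and claims are scanned in numeric order instead of along each claim's real dependency chain.

```python
import re


def antecedent_report(claims, terms):
    """Map each term to the claims that say 'the <term>' before any 'a/an <term>'.

    claims maps claim number to claim text; terms lists the phrases to track.
    """
    problems = {}
    for term in terms:
        pattern = re.compile(r"\b(a|an|the)\s+" + re.escape(term) + r"\b",
                             re.IGNORECASE)
        introduced = False
        for number in sorted(claims):
            for match in pattern.finditer(claims[number]):
                if match.group(1).lower() in ("a", "an"):
                    introduced = True  # indefinite article introduces the term
                elif not introduced:
                    problems.setdefault(term, []).append(number)
    return problems


# Invented toy claims, not quotations from US 12,272,375 B2
toy_claims = {
    1: "A system comprising a processor configured to compute "
       "a directional TSCI and a pitch profile.",
    2: "The system of claim 1, wherein the directional TSCI is "
       "associated with a direction.",
    3: "The system of claim 1, wherein the harmonic is an integer "
       "multiple of the pitch profile.",
}
report = antecedent_report(toy_claims, ["directional TSCI", "pitch profile", "harmonic"])
```

Here only 'the harmonic' in toy claim 3 is reported, since it is never introduced with an indefinite article. A production check would also need noun-phrase extraction and dependency-aware scoping.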
Spec–Claim Consistency
The core claim limitations map directly to specific figures and specification sections. The TSCI extraction limitation in Claim 1 is supported by FIG. 2's channel information module and the detailed description at columns 21–24. The beamforming and DTSCI limitation maps to FIG. 2 (digital beamforming block) and the mathematical derivation at column 54. The TSRF computation and pitch detection limitation maps to FIG. 23 (neural network VAD architecture) and FIG. 21B (radio spectrogram). The 'without using any media signal' limitation is supported by FIG. 21C and the RadioVAD discussion at columns 82–89.
Transition Word Usage
Both independent claims correctly use 'comprising' as the transition word, preserving open-ended claim scope that cannot be defeated by the addition of further elements to an accused device or method. Dependent claims also use 'comprising' or 'wherein' transitions appropriately. No instances of 'consisting of' or 'consisting essentially of' were used, which is strategically appropriate given the multi-component system architecture and the desire to capture systems that may add further processing steps or hardware modules.
§112(f) Means-Plus-Function Risk
No 'means for' or 'step for' language appears in any of the 18 claims. The processor limitations in Claim 1 are recited as 'a processor configured for' followed by specific functional steps. Because the claims avoid the word 'means,' they carry a presumption against §112(f) treatment, although under Williamson v. Citrix that presumption is rebuttable if a claim term fails to recite sufficiently definite structure. The coined acronyms (TSCI, DTSCI, TSRF) are defined explicitly in the claim body itself, which further reduces indefiniteness risk. The neural network processing recited in Claims 11 and 13 is functional but tied to the specific 'processor' structure of Claim 1.
⚠️ §101 Eligibility Risk
Claims 1 and 18 present moderate §101 exposure under Alice because the core inventive step (detecting voice activity by computing pitch profiles and voiced/unvoiced sequences from a TSRF) is a signal processing method that could be characterized as a mathematical concept at Alice step one (USPTO Step 2A, Prong One). The hardware tie-in (transmitter, receiver, processor) provides a meaningful defense under Step 2A, Prong Two ('integrates the abstract idea into a practical application'), particularly because Claim 1 explicitly ties the VAD to a 'standard-compliant radio signal' transmitted through a physical wireless channel impacted by the target voice source's physical motion. The 'without using any media signal' limitation reinforces the physical-domain anchor. However, Claims 11 and 13 add only neural network processing, which an examiner may characterize as pure software limitations that contribute only abstract-idea elements.
⚠️ Dependent Claim Fallback Quality
The dependent claim structure is heavily weighted toward narrowing the pitch-detection mechanism (Claims 7–17 all descend through the pitch/harmonics/time-trend chain), leaving relatively weak fallback positions if the pitch-based VAD limitations are challenged. Claims 3–5 provide meaningful fallback by specifying data-communication signal types and voice-producing-motion limitations. Claims 16 and 17 add valuable antenna array and DTSCI component selection specificity. However, the entire 16-claim dependent tree hangs off only 2 independent claims, and no dependent claims add independently novel system configurations (e.g., different TSCI extraction methods, different beamforming algorithms) that could survive if the independent claim structure is invalidated.
⚠️ Abstract Quality
The abstract describes the system architecture accurately ('a transmitter configured to transmit a radio signal through a wireless channel… a receiver… a processor configured for: computing a time series of channel information…') and calls out the key distinguishing feature ('detecting the voice activity of the target voice source based on the time series of CI (TSCI) of the wireless channel, without using any media signal'). However, the abstract does not name the pitch-profile and voiced/unvoiced sequence detection mechanism — which is the most technically specific claim limitation — and therefore an examiner reading only the abstract would not identify this as the novel technical contribution distinguishing it from prior wireless VAD art.
⚠️ Figure Support Quality
Most structural claim limitations have figure support: transmitter/receiver hardware is shown in FIG. 15 and FIG. 16; beamforming is illustrated in FIG. 2 and FIG. 3; TSRF and pitch profile detection is supported by FIG. 21B, FIG. 21C, and FIG. 23; voiced/unvoiced sequence detection appears in FIG. 21C. However, the 'standard-compliant radio signal' limitation (WiFi, LTE, IEEE 802.11bf) is stated in text but lacks a figure showing the specific protocol compliance hardware, and the 'without using any media signal' limitation has behavioral support (FIG. 21C) but no structural figure specifically illustrating the exclusion of media signal processing paths in the system architecture.
Scorecard

Strategic Intent Scorecard

Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.

Claim Breadth: 3.8 / 5
Prosecution Defensibility: 3.5 / 5
Spec–Claim Consistency: 4.0 / 5
Dependent Claim Coverage: 3.2 / 5
Claim Type Diversity: 2.5 / 5
Figure Support Quality: 3.5 / 5
Key observation: Spec–Claim Consistency scores highest (4.0/5) because the detailed description provides direct figure and equation support for each key claim limitation: TSCI extraction and beamforming map to FIG. 2 and the column 54 derivation, pitch detection maps to FIG. 23, and interference rejection maps to FIG. 21C. Together these create a strong written description defense. Claim Type Diversity scores lowest (2.5/5) because the patent contains only system and method claim types and omits a computer-readable medium (CRM) claim entirely, despite the deep learning neural network embodiments described extensively in the specification (FIGs. 9, 23, 25). This creates a straightforward design-around pathway for software-only implementations. Practitioners handling this patent family should prioritize continuation claims directed to CRM and apparatus-implementing-neural-network formats to close this gap.
Critical Gaps

3 Critical Gaps in This Claim Set

A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.


No CRM software claim filed · Standard-compliance over-limits scope · RadioSES speech separation unprotected

Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.
