
Patent Drafting Analysis of Netskope’s ML-Based Anomaly Detection Initialization | US 12,244,617 B2

IP Drafting Analysis · US 12,244,617 B2

Patent Drafting Analysis of Netskope's Machine Learning Anomaly Detection Initialization | US 12,244,617 B2

A structural and strategic analysis of Netskope's granted patent covering online streaming ML-based anomaly detector initialization, examining claim architecture, drafting quality, §101 eligibility positioning, critical gaps, and prosecution defensibility across 20 method claims.

US 12,244,617 B2 · Filed: Jul 5, 2023 · Granted: Mar 4, 2025
Classifications: H04L 9/40 · G06F 21/55 · G06N 5/02 · G06N 7/01 · G06N 20/00
Spec Words: 12,400 (across 7 sections)
Total Claims: 20 (1 independent · 19 dependent)
Figure Sheets: 12 (system architecture, ML flow, data tables, process flowcharts)
Published by PatSnap Insights Team · 13 min read · Verified by PatSnap Eureka Data
Overview

Structural Overview

The detailed description dominates the specification at approximately 63% of total words (~9,200 of ~14,600), providing extensive narrative support for the online streaming ML architecture, anomaly detection loop, and hash-space storage mechanisms. The claim set comprises 20 claims, all method claims, with a single independent claim (Claim 1) anchoring 19 dependent claims — an unusually steep dependency ratio that concentrates all validity risk on one independent claim. The 12 drawing sheets span system architecture, process flowcharts, data tables, and sample output, collectively providing solid figure coverage for the core transformation and scoring pipeline.

Section Word Distribution

Detailed Description: 9,200 words
Claims: 3,500 words
Summary: 820 words
Background: 1,370 words
Brief Description: 660 words
Abstract: 190 words

Figure Inventory — 12 Sheets

| Figure | Description | Role |
| --- | --- | --- |
| FIG. 1 | System 100 illustrating the machine learning based anomaly detection architecture, including security-related events 102, transformer 104, tenant models 106, user models 116, online machine learner 122, loss function analyzer 132, anomaly detection engine 142, and network security system 146 interconnected via network 114. | System architecture |
| FIG. 2 | Exemplary architecture 200 of the machine learning based anomaly detection system, showing pipeline 202, mapper data manager 204, *.4anom files 206, OML watcher/classifier 208, OML instance 216, OML watcher/modeler 212, and models 214 with anomaly decision flow. | System architecture |
| FIG. 3 | Sample security-related event 300 shown as a plain text representation of a connection event including output/target feature, space ID (User2864), standard candle feature (SC:0.1), application (Google Drive), and multiple feature-value pairs for source/destination IP, location, and time dimensions. | Other |
| FIG. 4 | Illustration 400 of assigning time-based features of security-related events into multiple sets of periodic bins (day-of-week bucket with 7 distinct values, time-of-day bucket with 24 distinct values, and part-of-day bucket with 6 distinct values), using an example timestamp of 23:50:12, 05/14/2016. | Claim support |
| FIG. 5 | Feature table 500 depicting a sample of features for a connection event, including application, source IP sub-features, destination IP sub-features, user agent, OS, browser version, and time-based features (Hour-of-Day, Part-of-Day, Day-of-Week), with grouping and UI columns. | Claim support |
| FIG. 6 | Learning process data table 600 showing average error, event error, event counter, actual value, predicted value, and feature count columns during online machine learning, illustrating the convergence behavior of likelihood coefficients over thousands of events. | Claim support |
| FIG. 7 | Sample anomaly output 700 showing detected anomalies for Tenant1 when the anomaly threshold is set to a 50× relative-error spike, listing specific users, applications, locations, days, and anomaly multipliers across five detected anomaly events in 2,991,859 total events. | Other |
| FIG. 8 | Flowchart 800 illustrating a representative method of initializing an anomaly detector, with eight steps (810–880) covering feeding events to an online machine learner, assigning feature-value pairs to categorical bins, Boolean coding, grouping into sub-streams, loss function correlation, likelihood coefficient calculation, prevalencist probability determination, and storing/supplying for initialization. | Flow diagram |
| FIG. 9 | Flowchart 900 showing a method of detecting anomalies based on activity models learned using machine learning, with five steps (910–950) covering relative-error ratio determination, candle value determination, likelihood coefficient evaluation, overall likelihood coefficient calculation, and anomaly event determination. | Flow diagram |
| FIG. 10 | Flowchart 1000 illustrating a method of detecting an anomaly event not frequently observed, with six steps (1010–1060) covering loosely supervised ML with loss function analyzer, hash-space mapping, production event transformation, hash function application, history-based contrast construction, and anomaly reporting. | Flow diagram |
| FIG. 11 | Flowchart 1100 showing a method of detecting infrequently observed anomaly events in an ongoing event stream, with seven steps (1110–1170) covering hash-space expansion of compressed likelihood coefficients, event reception, feature-value transformation, hash function application, scoring with previously unseen pair handling, history contrast construction, and reporting. | Flow diagram |
| FIG. 12 | Block diagram of example computer system 1200 used to generate anomalies, including computer system 1210 with processor(s) 1214, network interface 1216, bus subsystem 1212, storage subsystem 1224 (memory subsystem 1226, file storage subsystem 1228, RAM 1234, ROM 1232), user interface input/output devices 1222/1218, and application server 1220. | System architecture |
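The periodic binning described for FIG. 4 can be illustrated with a short Python sketch. It maps the figure's example timestamp (23:50:12, 05/14/2016) to the three bucket types. The function name `time_bins` and the choice of six equal four-hour blocks for part-of-day are assumptions made for illustration; the analysis above only states that the bucket has six distinct values.

```python
from datetime import datetime


def time_bins(ts: datetime) -> dict:
    """Map one timestamp to three periodic bins of differing granularity."""
    return {
        "day_of_week": ts.weekday(),  # 0 = Monday .. 6 = Sunday (7 values)
        "time_of_day": ts.hour,       # 0 .. 23 (24 values)
        # Assumption: six equal 4-hour blocks; the real boundaries may differ.
        "part_of_day": ts.hour // 4,  # 0 .. 5 (6 values)
    }


example = datetime(2016, 5, 14, 23, 50, 12)  # timestamp used in FIG. 4
bins = time_bins(example)
print(bins)  # {'day_of_week': 5, 'time_of_day': 23, 'part_of_day': 5}
```

Binning the same timestamp at several granularities lets a model learn, for instance, that a user is active on Saturdays in general without needing to have seen that exact hour before.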
Analysis powered by PatSnap Eureka. Patent text and figures publicly available from USPTO.
Claims

Claim Architecture Analysis

The patent contains exactly one independent claim (Claim 1, a method claim) and 19 dependent claims, a 19:1 dependent-to-independent ratio far above the software/cloud norm of 4–8:1. All 20 claims are method claims; no system/apparatus or computer-readable medium (CRM) claims were filed, which leaves significant design-around and enforcement gaps. The claim strategy concentrates all protection in that single independent method claim, with dependent Claims 2–20 layering in specific features such as hash-space storage, candle values, user-seasoning logic, time-binning, and SGD analyzers.

Core inventive concept: Claim 1 addresses the problem of initializing an anomaly detector for real-time security event streams without labeled training data by recasting unsupervised learning as a supervised problem. It feeds security-related events labeled with space IDs to an online machine learner, assigns feature-value pairs to categorical bins coded with Boolean values, and groups events into per-space-ID sub-streams. A loss function analyzer then correlates the coded feature-value pairs through an origin with a target feature "artificially labeled as a constant," yielding the probability predictions (recited in Claim 3 as likelihood coefficients and prevalencist probability values) that initialize the anomaly detector.
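To make the constant-target idea concrete, the following Python sketch trains a linear model with no intercept toward a fixed label using stochastic gradient descent on squared error. It is an illustration under stated assumptions, not the patented implementation: the class name `ConstantTargetSGD`, the target value of 1.0, the learning rate, the per-event step normalization, and the sample feature-value strings are all hypothetical.

```python
class ConstantTargetSGD:
    """Linear model with no bias term, trained online toward a constant label."""

    TARGET = 1.0  # assumed value of the constant every event is labeled with

    def __init__(self, learning_rate: float = 0.5):
        self.learning_rate = learning_rate
        self.coefficients = {}  # coded feature-value pair -> coefficient

    def predict(self, active_pairs) -> float:
        # No intercept ("through an origin"): a Boolean-coded event scores
        # the sum of the coefficients of the pairs it contains.
        return sum(self.coefficients.get(p, 0.0) for p in active_pairs)

    def learn(self, active_pairs) -> float:
        """Apply one SGD step on squared error; return the pre-update error."""
        error = self.TARGET - self.predict(active_pairs)
        # Assumption: divide the step by the number of active pairs so each
        # update moves the prediction a fixed fraction toward the target.
        step = self.learning_rate * error / len(active_pairs)
        for p in active_pairs:
            self.coefficients[p] = self.coefficients.get(p, 0.0) + step
        return error


model = ConstantTargetSGD()
usual = {"app=Google Drive", "day_of_week=5", "src_country=US"}
for _ in range(20):
    model.learn(usual)

unusual = {"app=Google Drive", "day_of_week=5", "src_country=XX"}
usual_score = model.predict(usual)      # close to the constant target
unusual_score = model.predict(unusual)  # lower: one pair has no coefficient yet
```

Because every event carries the same label, frequently seen feature-value pairs accumulate coefficients that sum toward the target, while an event containing a never-seen pair falls short of it. That shortfall is the kind of error spike a detector can threshold on.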

Independent Claim Dissection

| Claim | Preamble | Transition | Key Body Elements |
| --- | --- | --- | --- |
| Claim 1 | A method of initializing an anomaly detector that handles a stream of security-related events of one or more organizations | comprising | feeding stream to online machine learner (each event comprising a space ID of a plurality of space IDs and feature-value pairs); transforming events by assigning feature-value pairs to categorical bins and coding with Boolean values; analyzing transformed stream using loss function analyzer by grouping into per-space-ID sub-streams and separately analyzing each sub-stream via loss function correlating coded feature-value pairs with a target feature artificially labeled as a constant; storing probability predictions for coded feature-value pairs by space ID; initializing the anomaly detector using the probability predictions |
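The transforming and grouping steps of Claim 1 can be sketched in Python as follows. This is a minimal illustration, assuming each event arrives as a dictionary with a `space_id` and a `features` mapping; those field names, the helper names, and the second user ID are hypothetical (User2864 and Google Drive come from the FIG. 3 sample event).

```python
from collections import defaultdict


def boolean_code(feature_values: dict) -> set:
    """Assign each feature-value pair to its own categorical bin.

    A bin's presence in the returned set stands for True, absence for False.
    """
    return {f"{feature}={value}" for feature, value in feature_values.items()}


def group_by_space_id(events: list) -> dict:
    """Split a stream into one sub-stream of coded events per space ID."""
    substreams = defaultdict(list)
    for event in events:
        substreams[event["space_id"]].append(boolean_code(event["features"]))
    return dict(substreams)


events = [
    {"space_id": "User2864", "features": {"app": "Google Drive", "part_of_day": "5"}},
    {"space_id": "User0001", "features": {"app": "Google Drive", "part_of_day": "2"}},
    {"space_id": "User2864", "features": {"app": "Google Drive", "part_of_day": "1"}},
]
substreams = group_by_space_id(events)
```

Each sub-stream can then be analyzed separately, so that what counts as normal is learned per space ID rather than across the whole stream.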

Claim Dependency Tree

1. Method of initializing anomaly detector: feeds security events to online ML learner, transforms to categorical bins with Boolean coding, analyzes sub-streams via loss function correlating to constant target, stores and uses probability predictions for initialization
2. Adds: during training, majority of feature-value pairs assigned to at least two categorical bins
3. Adds: separately analyzing each sub-stream further comprises calculating likelihood coefficients and determining prevalencist probability values; storing comprises storing likelihood coefficients and prevalencist probability values by space ID; initializing further uses likelihood coefficients and prevalencist probability values
4. Further: storing likelihood coefficients and prevalencist probability values for multiple space IDs in hash-space as tenant activity model; updating tenant activity model with new events
5. Further: storing likelihood coefficients and prevalencist probability values for particular space ID in hash-space as user activity model; updating user activity model with new events
6. Further: accumulating non-zero likelihood coefficients for frequently appearing feature-value pairs; updating and converging likelihood coefficients over time to match target feature
7. Further: determining relative-error ratio, standard candle value, lowest likelihood coefficient feature-value pairs, and overall likelihood coefficient; determining that a production event is an anomaly when the relative-error ratio, standard candle value, and overall likelihood coefficient all exceed a threshold
8. Further: (depends on 7) distinguishing seasoned vs. unseasoned user by initializing second space ID with loss function analyzer with standard candle value; maturing standard candle value to target value responsive to threshold events
9. Further: (depends on 8) seasoned space IDs have non-zero standard candle values; unseasoned space IDs have near-zero standard candle values
10. Adds: annotating security-related events with prevalencist probability values between 0 and 1, indicative of occurrence frequency
11. Adds: loss function analyzer is a stochastic gradient descent (SGD) analyzer
12. Further: (depends on 3) storing likelihood coefficients annotated with corresponding coded feature-value pairs in respective slots of hash space; retrieving by applying hash function to coded feature-value pairs
13. Adds: security-related events include connection events and application events
14. Adds: learning user-specific activity habits based on separate analysis of sub-streams by space ID; persisting in hash-space separate user-states representing occurrence frequencies of all past events for individual users
15. Adds: updating tenant and user activity models over time, including maturing and storing frequently occurring anomalous events as normal user activity
16. Adds: online machine learner is an online streaming processor that learns features for 5,000 to 50,000 security-related events per second per hardware node
17. Adds: online machine learner processes 50,000 to 5 million features per second per hardware node
18. Adds: features of feature-value pairs include one or more time dimensions, source location, source IP address, destination location, destination IP address, source device identity, application used, activity type and detail, manipulated object dimension, or combination thereof
19. Adds: assigning time-based features of the security-related events into multiple sets of periodic bins with varying granularity
20. Further: (depends on 19) assigning time-based features into at least one of: day-of-week periodic bin with seven distinct values; time-of-day periodic bin with twenty-four distinct values; part-of-day periodic bin with six distinct values
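The hash-space storage and retrieval recited in Claim 12 can be pictured with a small Python sketch: a fixed-size array of slots, where the slot for a coefficient is found by hashing the space ID together with the coded feature-value pair. The class `HashSpace`, the slot count, the BLAKE2b hash, and the key format are assumptions chosen for illustration; the patent's actual hash function and layout are not described in this analysis.

```python
import hashlib


class HashSpace:
    """Fixed-size slot array addressed by hashing (space ID, coded pair)."""

    def __init__(self, n_slots: int = 2 ** 18):
        self.n_slots = n_slots
        self.slots = [0.0] * n_slots  # 0.0 stands for "nothing learned yet"

    def _slot(self, space_id: str, coded_pair: str) -> int:
        # A stable hash is used so slots survive process restarts
        # (Python's built-in hash() is randomized per process for strings).
        key = f"{space_id}|{coded_pair}".encode("utf-8")
        digest = hashlib.blake2b(key, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.n_slots

    def store(self, space_id: str, coded_pair: str, coefficient: float) -> None:
        self.slots[self._slot(space_id, coded_pair)] = coefficient

    def lookup(self, space_id: str, coded_pair: str) -> float:
        return self.slots[self._slot(space_id, coded_pair)]


space = HashSpace()
space.store("User2864", "app=Google Drive", 0.33)
seen = space.lookup("User2864", "app=Google Drive")  # 0.33
unseen = space.lookup("User2864", "app=Unknown")     # 0.0 unless the keys collide
```

A fixed slot array keeps memory bounded regardless of how many distinct feature-value pairs appear, at the cost of occasional collisions between unrelated pairs.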
| Metric | This Application | Software / Cloud Norm |
| --- | --- | --- |
| Total claims | 20 | 15–25 |
| Independent claim count | 1 | 2–4 |
| Dependent : Independent ratio | 19.0 : 1 | 4–8 : 1 |
| Method claims present? | Yes (Claim 1) | Common |
| System / apparatus claims? | No | Common |
Drafting Quality

Drafting Quality Signals

Claim 1's use of 'comprising' throughout provides open-ended breadth appropriate to this software-intensive ML security technology, and the detailed description provides strong narrative support for the loss function analyzer and categorical binning mechanisms. However, the complete absence of apparatus/system and CRM claims is a material drafting weakness, leaving Netskope without direct infringement coverage against system implementers and creating a significant litigation gap.

Antecedent Basis
The claim set is generally clean on antecedent basis. Claim 1 introduces "an online machine learner," "a loss function analyzer," and "a stream of security-related events" with proper indefinite articles, and all subsequent references use "the online machine learner" and "the loss function analyzer" consistently. Dependent claims 3–9 and 12 correctly carry forward defined terms from Claim 1. No orphaned "the" references were identified in the reviewed claims.
Spec–Claim Consistency
The key claim limitations map directly to specific figures and columns. The categorical binning and Boolean coding limitations in Claim 1 are supported by FIG. 4 (time-based bins) and the detailed description's Transformer 104 discussion (cols. 10–12). The loss function analyzer correlating "through an origin" with a constant target is explicitly described in the detailed description (cols. 14–15) and illustrated in FIG. 8 (steps 840–850). The hash-space storage in Claim 12 maps to FIGs. 10–11 and cols. 27–28.
Transition Word Usage
All independent and dependent claims use "comprising" as the transition, which is strategically optimal for this technology: it preserves open-ended coverage, so an accused method that adds further steps still falls within the claim scope. No "consisting of" or "consisting essentially of" transitions appear, which is appropriate given the multi-step method structure. The nested "comprising" within Claim 1 ("the transforming comprising:" and "the analyzing comprising:") is well-structured and narrows scope within each limitation without over-restricting the broader claim.
⚠️
§112(f) Means-Plus-Function Risk
No "means for" language appears in any claim, which avoids the §112(f) trap. However, the use of functional label language such as "loss function analyzer" in Claims 1, 3, 6–9, and 11 may invite §112(f) scrutiny if an examiner or court reads it as a functional claim element without structural definition. The specification does provide structural description of the loss function analyzer 132 (e.g., as an SGD-based regression engine) in col. 7–8, but the claim itself does not recite the structural context, creating moderate risk that a challenger could argue the term lacks sufficient structural definiteness.
⚠️
§101 Eligibility Risk
Claim 1 faces meaningful Alice/Mayo risk because it claims a method that could be characterized as abstract mathematical and statistical operations — correlating feature-value pairs via a loss function, calculating likelihood coefficients, and determining probability values. The hardware tie-in is limited: Claim 1 recites no processor, computer, or machine hardware; the "online machine learner" is a functional label, not a recited hardware component. Claims 16 and 17 add processing-speed limitations (5,000–50,000 events/sec per node and 50,000–5 million features/sec per node), but these are dependent-only safeguards. A stronger filing would have included hardware-anchored apparatus claims to create a cleaner §101 defense at the independent level.
Dependent Claim Fallback Quality
The dependent claims generally add meaningful distinct limitations. Claim 3 adds a critical layer by introducing likelihood coefficients and prevalencist probability values as explicit outputs, making it the practical fallback if Claim 1 is narrowed or invalidated. Claims 4–5 add tenant vs. user activity model distinctions. Claim 7 adds the multi-factor anomaly determination mechanism (relative-error ratio + candle value + overall likelihood coefficient), which is the core detection logic. Claims 16–17 add hardware-performance parameters that strengthen §101 and §112 positions. However, Claim 11 (SGD specification) is relatively weak as a standalone fallback since SGD is a standard technique.
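The multi-factor determination in Claim 7 can be sketched in Python as a conjunction of three threshold checks. This is a hedged illustration only: the analysis does not define how each quantity is computed, so the ratio formula (event error over average error, echoing the FIG. 6 columns), the names `Thresholds` and `is_anomaly`, and the candle and coefficient threshold values are assumptions. The 50× relative-error threshold is the example setting reported for FIG. 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    relative_error_ratio: float = 50.0  # example setting reported for FIG. 7
    candle: float = 0.05                # hypothetical value
    overall_coefficient: float = 0.5    # hypothetical value


def relative_error_ratio(event_error: float, average_error: float) -> float:
    """Assumed definition: how many times larger this error is than usual."""
    return abs(event_error) / max(abs(average_error), 1e-12)


def is_anomaly(ratio: float, candle: float, overall: float,
               t: Thresholds = Thresholds()) -> bool:
    """Flag an event only when all three factors exceed their thresholds."""
    return (ratio > t.relative_error_ratio
            and candle > t.candle
            and overall > t.overall_coefficient)


ratio = relative_error_ratio(event_error=0.75, average_error=0.01)
seasoned_user = is_anomaly(ratio, candle=0.1, overall=0.9)
unseasoned_user = is_anomaly(ratio, candle=0.001, overall=0.9)
```

Requiring all three conditions means a large error spike alone is not enough. In this sketch an unseasoned user, whose candle value is still near zero as described for Claim 9, is not flagged even when the spike is large.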
⚠️
Abstract Quality
The abstract accurately describes the system and key technical mechanisms, specifically mentioning the transformation of unsupervised to supervised learning by fixing a target label, the loss function analyzer correlating through an origin, and the anomaly score based on likelihood coefficients and prevalencist probability values. However, although it refers to fixing a target label, the abstract does not spell out the "artificially labeled as a constant" mechanism, the core novelty that distinguishes this approach from standard ML methods. An examiner reading only the abstract might therefore not immediately identify the precise novel contribution. A stronger abstract would have foregrounded this constant-target labeling trick as the primary inventive step.
Figure Support Quality
Figure support is strong. FIG. 1 covers the system architecture supporting Claim 1's machine learner/loss function analyzer limitations; FIG. 4 directly supports Claims 19–20 (time-based periodic bins); FIG. 5 directly supports Claim 18 (feature dimensions including source/destination IP, application, time); FIG. 8 maps step-by-step to Claim 1's method steps (810–880); and FIG. 12 provides the computer system substrate. The only notable gap is that the hash-space retrieval mechanism (Claim 12) is described in FIGs. 10–11 procedurally but lacks a dedicated structural diagram showing the hash table layout.
Scorecard

Strategic Intent Scorecard

Multi-dimensional assessment of this application's patent strategy quality, based on claim structure, specification depth, and prosecution positioning.

Claim Breadth: 3.2 / 5
Prosecution Defensibility: 3 / 5
Spec–Claim Consistency: 4.2 / 5
Dependent Claim Coverage: 4 / 5
Claim Type Diversity: 1.5 / 5
Figure Support Quality: 4 / 5
Key observation: Spec–Claim Consistency scores highest (4.2/5) because the detailed description and 12 figures comprehensively map to Claim 1's limitations — FIG. 8 traces all eight method steps, FIGs. 4–5 directly support the binning limitations, and FIG. 12 anchors the computer system substrate. Claim Type Diversity scores lowest (1.5/5) because the entire patent consists of 20 method claims with zero apparatus, system, or CRM claims — an omission that allows device or system implementers to avoid direct infringement liability entirely, as they need not perform the claimed method steps. Practitioners should consider filing a continuation with apparatus claims mirroring Claim 1's structure (e.g., "a system comprising: one or more processors configured to...") and a CRM claim to close this enforcement gap.
Critical Gaps

3 Critical Gaps in This Claim Set

A senior-attorney lens on the three highest-priority structural weaknesses — what each exposes in prosecution and litigation, and what a stronger filing would have done differently.

1. No apparatus or CRM claims filed
2. Single independent claim validity risk
3. §101 exposure (no hardware in Claim 1)

Disclaimer: This analysis is generated by PatSnap Eureka AI based on publicly available patent data from the USPTO. It does not constitute legal advice and should not be relied upon as such. Patent data may be subject to change as prosecution progresses. Scores and assessments reflect automated analysis and may not capture all relevant legal or technical nuances. Always consult a qualified patent attorney for formal legal opinions on patentability, freedom to operate, or infringement.
