Cogita-PRO Anomaly Detection

Olivera Stojanovic
January 21, 2026

One of the most useful features of Cogita-PRO is anomaly detection. By definition, an anomaly is something that happens unexpectedly and rarely, deviating from the norm: an outlier. These can be some of the most interesting situations to arise in chip verification … if we can find them.

Since DV is primarily spec-driven, traditional methods include directed and constrained-random tests with coverage models to flag bugs. We code what we can anticipate. But what about unexpected and, therefore, unmodeled bugs? Behaviors that our coverage models miss entirely?

And, of course, these anomalies can hide among millions of signals, thousands of modes and gigabytes of simulation logs. Humans simply can’t inspect this manually. 

The challenge: catch unknown unknowns at scale. 

Types of Anomalies

In most cases, RTL and testbench bugs arise from a specific combination of data or an unusual sequence of events.
 

  1. Data anomalies: When patterns diverge from the norm, the data content can be inverted, shifted, misaligned, or otherwise odd. Perhaps a rare combination of field values within a packet. Unusual data repetition can also be a symptom of incorrect behavior.
  2. Sequence pattern anomalies: In this case, the error is not caused by unusual data values but by a unique event sequence that deviates from all previous execution patterns. Detecting such temporal anomalies helps reveal what sequence of interactions led to the issue and how it differs from normal behavior.

Cogita-PRO provides a suite of anomaly-detection algorithms tailored for verification datasets.

Moreover, Cogita-PRO layers anomaly-detection algorithms to eliminate false positives and ensure the user sees only the most relevant results. This pipelining of algorithms can be configured by the user, or Cogita-PRO can gather the results and present a unified set of overall conclusions.
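To make the layering idea concrete, here is a minimal sketch (not Cogita-PRO's implementation) in which a record is reported only when two independent detectors, a z-score filter and a rarity filter, both flag it. Requiring agreement between layers is what suppresses single-method false positives; all data below is invented for illustration.

```python
from collections import Counter

def zscore_flags(values, threshold=2.0):
    """Flag indices whose value sits far from the mean of the dataset."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
    return {i for i, v in enumerate(values) if abs(v - mean) / std > threshold}

def rarity_flags(values, max_count=1):
    """Flag indices whose exact value occurs at most `max_count` times."""
    counts = Counter(values)
    return {i for i, v in enumerate(values) if counts[v] <= max_count}

def pipeline(values):
    # A record is reported only if BOTH layers agree it is anomalous.
    return sorted(zscore_flags(values) & rarity_flags(values))

latencies = [10, 11, 10, 12, 10, 11, 10, 250]  # one extreme spike at index 7
print(pipeline(latencies))  # [7]
```

Intersecting the layers trades recall for precision; a union (or weighted vote) would be the opposite configuration of the same pipeline.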

Anomaly type: Data

1. Neural network model training and use of the trained models

Data Scale Type: Large-scale, high-dimensional (big datasets with many occurrences and many data-field columns)

Use case/application: Models built on passing tests are applied to failing tests at subsystem or SoC level

Key benefits:

  • Captures non-linear feature interactions and complex data dependencies
  • Generalizes across test scenarios when trained on diverse passing data
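The train-on-passing, score-on-failing workflow can be sketched as follows. As a stand-in for a trained neural network, this toy uses a linear autoencoder (one principal component via SVD); the fields, data, and the burst/length relation are all invented. The point it demonstrates is the workflow: transactions that break correlations learned from passing runs get a large reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented passing-test transactions: two correlated fields, length ~= 2 * burst.
burst = rng.integers(1, 9, size=500).astype(float)
passing = np.column_stack([burst, 2 * burst + rng.normal(0.0, 0.1, 500)])

mean = passing.mean(axis=0)
X = passing - mean
# One principal direction (via SVD) plays the role of the trained encoder/decoder.
_, _, vt = np.linalg.svd(X, full_matrices=False)
w = vt[0]

def score(rows):
    """Reconstruction error: large when a row breaks the learned correlation."""
    c = rows - mean
    recon = np.outer(c @ w, w)          # encode to 1-D, decode back
    return np.linalg.norm(c - recon, axis=1)

normal_err = score(passing).max()       # worst error seen on passing data
suspect = np.array([[4.0, 8.0],         # obeys length ~= 2 * burst
                    [4.0, 1.0]])        # violates the learned relation
print(score(suspect) > normal_err)      # only the second row is anomalous
```

A real network replaces the SVD with nonlinear layers, which is what captures the non-linear interactions the bullet above refers to.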


2. Ensemble

Data Scale Type: Large-scale, high-dimensional (big datasets with many occurrences and many data-field columns)

Use case/application: Detects anomalies in multi-field value combinations within a single test execution

Key benefits:

  • Identifies rare multi-dimensional combinations 
  • Highlights specific field combinations contributing to anomaly score
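A minimal sketch of the ensemble idea (not the product's algorithm): each ensemble member scores the rarity of one field-pair combination, and averaging the members surfaces transactions whose joint values are rare even though each field value alone is common. The field names and data are invented.

```python
from collections import Counter
from itertools import combinations

def ensemble_scores(rows):
    """Average the rarity of every field-pair combination across members."""
    members = list(combinations(range(len(rows[0])), 2))  # one member per pair
    tables = {m: Counter(tuple(r[i] for i in m) for r in rows) for m in members}
    n = len(rows)
    return [
        sum(1 - tables[m][tuple(r[i] for i in m)] / n for m in members)
        / len(members)
        for r in rows
    ]

# Fields: opcode, burst, resp. Every single value is common, but the joint
# combination (WRITE, burst=1) occurs exactly once, in the last transaction.
rows = [("WRITE", 4, "OK")] * 40 + [("READ", 1, "OK")] * 40 + [("WRITE", 1, "OK")]
scores = ensemble_scores(rows)
print(scores.index(max(scores)))  # 80: the transaction with the rare combination
```

Because each member is tied to a specific field pair, the highest-scoring members directly name the field combination driving the anomaly score, matching the second bullet above.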


3. Descriptive analysis

Data Scale Type: Large-scale, high-dimensional (big datasets with many occurrences and many data-field columns)

Use case/application: Post-detection explainability layer that analyzes flagged anomalies to identify distinguishing characteristics compared to normal transactions within the test

Key benefits:

  • Root cause attribution - identifies which specific fields/combinations deviate from baseline
  • Anomaly interpretability - translates probabilistic scores into actionable debug insights
  • Feature importance ranking - prioritizes which data fields contribute most to anomaly classification
  • Comparative analysis - quantifies how suspect transactions differ from the cluster of normal transactions
  • Reduces debug time - eliminates manual comparison of thousands of field values
  • Multi-algorithm fusion - explains anomalies detected by any upstream detection method (NN, ensemble, statistical)
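The ranking step can be sketched under simple assumptions: order the fields of a flagged transaction by how many standard deviations each one sits from the normal-transaction baseline. Field names and values are invented for illustration.

```python
import statistics as st

def explain(suspect, normal_rows, field_names):
    """Rank fields by how far the suspect deviates from the normal baseline."""
    report = []
    for i, name in enumerate(field_names):
        col = [r[i] for r in normal_rows]
        mu, sigma = st.mean(col), st.pstdev(col) or 1.0
        report.append((name, abs(suspect[i] - mu) / sigma))
    return sorted(report, key=lambda t: -t[1])  # most deviant field first

# Fields: burst, length, latency. The suspect's data fields look normal;
# only its latency deviates from the baseline distribution.
normal = [(4, 64, 10), (4, 64, 11), (8, 128, 10), (8, 128, 12)]
suspect = (4, 64, 95)
print(explain(suspect, normal, ["burst", "length", "latency"])[0][0])  # latency
```

This turns a bare anomaly score into a per-field attribution a debug engineer can act on, rather than a manual comparison of thousands of field values.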

4. Extreme-value outliers

Data Scale Type: Small-scale and large-scale

Use case/application: Identifies extreme field-value outliers, extreme occurrence counts of otherwise-successful values, and timing-distribution outliers

Key benefits:

  • Detects timing violations and performance anomalies
  • Pinpoints transactions with unusual field values or frequencies
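One common way to flag timing-distribution outliers is an interquartile-range fence; the sketch below illustrates the idea and is not necessarily the method Cogita-PRO uses. The latencies are invented.

```python
def iqr_outliers(values, k=1.5):
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    s = sorted(values)
    q1, q3 = s[len(s) // 4], s[(3 * len(s)) // 4]
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < lo or v > hi]

# Per-transaction latencies (cycles); one transaction stalls far beyond the rest.
latencies = [12, 13, 11, 14, 12, 13, 12, 11, 13, 480]
print(iqr_outliers(latencies))  # [480]
```

Unlike a mean-based z-score, the IQR fence is robust to the outlier itself inflating the baseline, which matters on small datasets.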


Anomaly type: Sequence pattern 

1. Automated Transaction Path Extraction, Classification and Golden Reference Model for Failure Analysis

Data Scale Type: Small-scale and large-scale

Use case/application: This method is ideal for NoC fabrics and multi-path subsystems, where transaction routing exhibits high combinatorial diversity.

Key benefits - this automatic classification reveals:

  • Protocol conformance - whether transactions follow expected paths
  • Path diversity - how many variants exist in actual execution
  • Anomalous flows - rare or unexpected sequences
  • Execution coverage - which protocol paths are actually exercised
  • Differential comparison - flags deviations between failing runs and the learned reference
  • Learned specification - the golden model acts as a specification derived from actual passing behavior
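The learned-specification idea can be sketched in a few lines: collect the set of routing paths observed in passing runs, then flag any path in a failing run that falls outside that set. The node names below are invented, not from a real fabric.

```python
def learn_paths(passing_traces):
    """The set of hop sequences seen while passing: a learned specification."""
    return {tuple(t) for t in passing_traces}

def anomalous_paths(golden, failing_traces):
    """Differential check: paths in a failing run absent from the golden set."""
    return [t for t in failing_traces if tuple(t) not in golden]

passing = [["CPU0", "NI0", "R1", "R3", "MEM"],
           ["CPU1", "NI1", "R2", "R3", "MEM"]]
failing = [["CPU0", "NI0", "R1", "R3", "MEM"],        # known-good route
           ["CPU0", "NI0", "R1", "R2", "R3", "MEM"]]  # detour never seen passing
golden = learn_paths(passing)
print(anomalous_paths(golden, failing))
```

Counting occurrences per path instead of using a set would additionally expose rare-but-legal routes, i.e. the path-diversity bullet above.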


2. Automated Discrete State Machine Transition Extraction and Modeling

Data Scale Type: Small-scale and large-scale

Use case/application: Protocol FSM verification, transaction-type state tracking, multi-FSM concurrent analysis

Key benefits:

  • Automatic transition discovery - extracts complete state graph from logs
  • Illegal transition detection - identifies state sequences absent in golden model
  • Multi-field FSM correlation - tracks coupled state machine behavior
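A minimal sketch of transition extraction: derive the observed transition relation from passing logs, then report any transition in a failing trace that the learned graph lacks. The state names are illustrative, not from a real protocol FSM.

```python
def learn_transitions(traces):
    """Extract the observed state-transition relation from passing traces."""
    return {(a, b) for t in traces for a, b in zip(t, t[1:])}

def illegal_transitions(model, trace):
    """Report transitions in a trace that the learned model never contains."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in model]

passing = [["IDLE", "REQ", "GRANT", "DONE", "IDLE"],
           ["IDLE", "REQ", "RETRY", "REQ", "GRANT", "DONE", "IDLE"]]
model = learn_transitions(passing)

# The failing trace skips GRANT entirely: REQ -> DONE was never observed.
print(illegal_transitions(model, ["IDLE", "REQ", "DONE", "IDLE"]))
```

Keying the traces by an extra field (e.g. transaction ID) extends the same sketch to tracking several coupled FSMs concurrently.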


3. Automated Discrete State Machine Sequence Extraction and Modeling

Data Scale Type: Small-scale and large-scale

Use case/application: Temporal ordering violations, multi-FSM interaction patterns, causality chain analysis

Key benefits:

  • Temporal anomaly isolation - detects illegal event orderings
  • Multi-dimensional sequence comparison - analyzes concurrent field evolution
  • Pass/fail differential - highlights sequence deviations causing failures
  • Interaction pattern discovery - reveals unexpected cross-FSM dependencies
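Temporal-ordering checks can be sketched with n-grams: collect the event n-grams seen in passing runs, then report n-grams in a failing run that never occurred, which localizes the illegal ordering. The AXI-style event names (`ar`, `r`, `aw`, `w`, `b`) are illustrative only.

```python
def ngrams(seq, n=3):
    """All length-n windows of a sequence of events."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def sequence_deviations(passing_runs, failing_run, n=3):
    """n-grams in the failing run that no passing run ever produced."""
    legal = set().union(*(ngrams(r, n) for r in passing_runs))
    return sorted(ngrams(failing_run, n) - legal)

passing = [["ar", "r", "aw", "w", "b"],
           ["aw", "w", "b", "ar", "r"]]
failing = ["ar", "aw", "w", "b", "r"]  # the read response slips past the write
print(sequence_deviations(passing, failing))
```

The window size `n` trades sensitivity for noise: larger windows catch longer illegal orderings but flag more never-seen-but-legal patterns.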


Regression analysis

All anomaly detection algorithms can be deployed with Cogita-PRO in regression mode to perform real-time anomaly detection without user interaction. Then, in the event of an anomaly, an immediate alert is issued and the user can view the results, correlate the anomaly to any UVM errors or use it for regression triage. Cogita-PRO can then be launched in interactive mode and the regression results are immediately viewable. 

Conclusion

Verification anomalies—whether data-driven or sequence-driven—are among the hardest bugs to find because they arise from rare combinations of data and events. Cogita-PRO's suite of tailored algorithms automates their detection across all scales of verification data, from block-level to full SoC regression, enabling verification teams to focus on fixing bugs rather than hunting for them in massive log files.

Real-world example:

Memory Access Time Anomaly in a Multi-CPU, Shared-Memory NoC System

System Context

  • 4–8 CPU clusters (e.g., Cortex-A class or custom RISC-V)
  • Shared L3 cache + DRAM controller
  • Coherent NoC (AXI-based with QoS, virtual channels, and credit-based flow control)
  • Mix of real-time and best-effort traffic

Bug Scenario

Under specific traffic interleavings, one CPU experiences sporadic 10–50× memory access latency spikes, even though:

  • No deadlock occurs
  • No protocol violation is flagged
  • Performance counters look mostly normal in aggregate

This only happens:

  • When three or more CPUs issue bursts of write-backs
  • While another CPU issues cache-miss reads with low QoS
  • During LLC eviction pressure

Root Cause (Observed in Real Systems)

A rare interaction between:

  • NoC credit starvation on a return path
  • A QoS downgrade rule triggered when write buffers exceed a threshold
  • A fairness watchdog that incorrectly resets priority after a long stall

The result:

  • One CPU’s read responses get stuck behind write responses
  • The stall is long but finite, so watchdogs do not fire
  • The system “recovers” without errors, but latency spikes are extreme

Why This Is an Anomaly (Not a Simple Bug)

  • Average latency is fine
  • Max latency occasionally explodes
  • Happens only under rare traffic mixes
  • Reproducing it requires timing alignment, not a single bad condition

Anomaly Detection Angle

Instead of checking:

      “Did latency exceed X?”

Cogita-PRO detects:

  • Temporal patterns such as:
    • Repeated long gaps between AR and R channels for one master
    • Correlation between write-buffer occupancy and read starvation
  • Deviation from historical per-CPU latency distributions
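The last point can be sketched as a per-CPU tail check (the numbers are invented, not from a real system): compare each CPU's current maximum latency against that CPU's own history. Mean latency stays unremarkable; the max exposes the spike.

```python
import statistics as st

def tail_anomalies(history, current_max, k=3.0):
    """Flag CPUs whose current max latency deviates from their own history."""
    flagged = []
    for cpu, past in history.items():
        mu, sigma = st.mean(past), st.pstdev(past) or 1.0
        if (current_max[cpu] - mu) / sigma > k:
            flagged.append(cpu)
    return flagged

# Historical per-run max read latencies (cycles), per CPU.
history = {"cpu0": [20, 22, 21, 23], "cpu1": [20, 21, 22, 20]}
current = {"cpu0": 24, "cpu1": 800}  # cpu1 hits the 10-50x spike
print(tail_anomalies(history, current))  # ['cpu1']
```

Comparing each CPU against its own distribution, rather than a global threshold, is what lets this catch a spike that aggregate performance counters average away.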
