...

Before presenting the results we have to introduce how we define true positives, false positives, and false negatives. It might seem trivial to define these, but in reality an anomaly is very often not a single data point but a series of data points. In IT operations, anomalies are often deviations from the usual behaviour that persist for some time. When labelling our data manually we might miss the actual start and end points of these deviations, and sometimes it is impossible to define a start and end point precisely. For example, does an anomaly start when a deviation is well established, or should the anomaly also include the oscillations that preceded it? Does an anomaly end when the value of a metric is back to a normal value, or when it is on its way back to the normal value but has not reached it yet?

In the following, a labelled anomaly is a series of datapoints that are labelled as anomalous; these labels are used to test the algorithms. A detected anomaly is a series of datapoints considered anomalous by our algorithm. To account for the variability in labelling, we have established the following rules:

  • A labelled anomaly that overlaps with a detected anomaly is considered detected. None of the datapoints in the labelled anomaly are counted as false negatives, even if they do not correspond to datapoints in a detected anomaly.

  • All the datapoints of a detected anomaly are considered true positives if the detected anomaly does not last more than 50% longer than its overlap with the labelled anomalies.

  • If a detected anomaly lasts more than 50% longer than its overlap with the labelled anomalies, the additional points are considered false positives.

  • All the points of a detected anomaly that does not overlap with any labelled anomaly are considered false positives.

  • All the points of a labelled anomaly that does not overlap with any detected anomaly, and is therefore not detected, are considered false negatives.
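
The rules above can be sketched as a small counting routine. This is only an illustration under several assumptions that are not stated in the text: anomalies are represented as half-open index intervals `(start, end)`, labelled anomalies do not overlap each other, and the 50% rule is read as "the part of a detected anomaly outside its overlap is tolerated up to 50% of the overlap length, rounded down". The names `overlap_len`, `count_outcomes`, and `tolerance` are hypothetical and chosen for this example.

```python
from typing import List, Tuple

# Assumption: an anomaly is a half-open interval [start, end) of datapoint indices.
Interval = Tuple[int, int]


def overlap_len(a: Interval, b: Interval) -> int:
    """Number of datapoints shared by two half-open intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def count_outcomes(
    labelled: List[Interval],
    detected: List[Interval],
    tolerance: float = 0.5,  # assumed reading of the "50%" rule
) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) in datapoints.

    Assumes the labelled intervals are disjoint from one another.
    """
    tp = fp = fn = 0

    # A labelled anomaly with no overlapping detection is missed entirely:
    # all of its points are false negatives. If it overlaps a detection,
    # none of its points are counted as false negatives.
    for lab in labelled:
        if not any(overlap_len(lab, det) > 0 for det in detected):
            fn += lab[1] - lab[0]

    for det in detected:
        length = det[1] - det[0]
        overlap = sum(overlap_len(det, lab) for lab in labelled)
        if overlap == 0:
            # Detection with no labelled counterpart: all points are false positives.
            fp += length
            continue
        extra = length - overlap            # points outside the labelled anomalies
        allowed = int(tolerance * overlap)  # tolerated extra points, rounded down
        excess = max(0, extra - allowed)    # points beyond the tolerance
        tp += length - excess
        fp += excess

    return tp, fp, fn


if __name__ == "__main__":
    labelled = [(10, 20), (40, 50)]
    detected = [(15, 30), (60, 65)]
    print(count_outcomes(labelled, detected))  # (7, 13, 10)
```

In this example the first detection overlaps the first labelled anomaly on 5 points, so 2 extra points are tolerated and the remaining 8 count as false positives; the second detection has no overlap (5 false positives), and the second labelled anomaly is never detected (10 false negatives). Labelled points that belong to a detected anomaly but are not covered by the detection are counted neither as true positives nor as false negatives in this sketch, which is also an assumption.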

...

These are the test data for which labels are available:

...

Let’s zoom in and circle the labelled anomalies in red:

...