Once per week we perform a full scan of all the metrics to determine which series are producing data, at which frequency, and how often the value changes. This step also onboards new environments and new resources connected to the machine learning pipeline. It runs on a weekly schedule, during the night between Saturday and Sunday. A new environment is onboarded on the first Sunday after it has been active (producing data) for at least 6 days. A new metric is onboarded on the first Sunday after it has produced enough data; the amount required varies depending on the unique behaviour of the metric, but the absolute minimum is 7 data points on average in the last 7 days. For new metrics added to a pre-existing environment, the first week of anomaly detection might produce false positives.
The scan is done to optimise our computational resources. We can use more complex algorithms for rapidly changing metrics and lighter-weight algorithms for stable metrics. Non-active metrics (time series with no data points in the last 7 days) are not considered for analysis.
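The onboarding and activity rules above can be illustrated with a minimal sketch. It is not the pipeline's actual code: the function names are hypothetical, "active for at least 6 days" is read as 6 days elapsed since the first data point, "first Sunday after" is read as the first Sunday on or after that day, and the metric minimum is read as a total of 7 points over the last 7 days.

```python
from datetime import date, timedelta

MIN_ENV_ACTIVE_DAYS = 6      # from the text: environment active for at least 6 days
MIN_POINTS_LAST_7_DAYS = 7   # from the text; read here as a total over 7 days (assumption)
SUNDAY = 6                   # date.weekday(): Monday is 0, Sunday is 6


def first_sunday_on_or_after(day: date) -> date:
    """Return the given day if it is a Sunday, else the next Sunday."""
    return day + timedelta(days=(SUNDAY - day.weekday()) % 7)


def environment_onboarding_date(first_data_day: date) -> date:
    """Hypothetical helper: earliest weekly scan that can onboard an environment."""
    eligible_from = first_data_day + timedelta(days=MIN_ENV_ACTIVE_DAYS)
    return first_sunday_on_or_after(eligible_from)


def is_inactive(points_last_7_days: int) -> bool:
    """A series with no data points in the last 7 days is skipped."""
    return points_last_7_days == 0


def meets_metric_minimum(points_last_7_days: int) -> bool:
    """Absolute minimum only; the real threshold can be higher per metric."""
    return points_last_7_days >= MIN_POINTS_LAST_7_DAYS
```

Under these assumptions, an environment whose first data arrives on a Monday can be onboarded the following Sunday, while one that starts on a Tuesday waits until the Sunday after that.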
Every week we do a full scan of all the time series and how they have been behaving in the past 7 days. We identify the typical rate at which the data come in (frequency) and how often they change (activity). Each time series is then classified according to frequency and activity into three main groups:
...
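As a rough illustration of the two measurements used for classification, the sketch below computes a frequency and an activity value for one series over the scan window. The specific choices are assumptions, not the documented method: frequency is taken as the median gap between consecutive data points, activity as the share of consecutive points whose value differs, and both function names are hypothetical. The thresholds that separate the groups are not shown because they are not stated here.

```python
from datetime import datetime
from statistics import median
from typing import Optional, Sequence


def typical_interval_seconds(timestamps: Sequence[datetime]) -> Optional[float]:
    """Median gap between consecutive data points, in seconds (assumed measure)."""
    if len(timestamps) < 2:
        return None  # not enough points to measure a gap
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return median(gaps)


def activity_ratio(values: Sequence[float]) -> Optional[float]:
    """Share of consecutive points whose value changed (assumed measure)."""
    if len(values) < 2:
        return None  # a single point cannot show a change
    changes = sum(1 for a, b in zip(values, values[1:]) if a != b)
    return changes / (len(values) - 1)
```

The median is used instead of the mean so that a single long outage in the 7-day window does not distort the typical interval.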