Onboarding, preprocessing and filtering of the data
Once per week we perform a full scan of all the metrics to determine which series are producing data, at what frequency, and how often their values change. This step also onboards new environments and new resources connected to the machine learning pipeline. It runs on a fixed schedule, every week in the night between Saturday and Sunday. A new environment is onboarded on the first Sunday after it has been active (produced data) for at least 6 days. A new metric is onboarded on the first Sunday after it has produced enough data; the required amount varies depending on the unique behaviour of the metric, but the absolute minimum is an average of 7 data points over the last 7 days. For new metrics added to a pre-existing environment, the first week of anomaly detection might produce false positives.
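The onboarding criteria above could be checked roughly as follows. This is a minimal sketch, not the pipeline's actual implementation; the function names and data shapes are assumptions, and only the thresholds (6 active days, 7 data points over 7 days) come from the text.

```python
from datetime import date, timedelta

MIN_ACTIVE_DAYS = 6          # environments: active (producing data) for at least 6 days
MIN_POINTS_LAST_7_DAYS = 7   # metrics: absolute minimum of data points over the last 7 days

def environment_is_onboardable(days_with_data: set) -> bool:
    """An environment qualifies once it has produced data on at least
    MIN_ACTIVE_DAYS distinct days."""
    return len(days_with_data) >= MIN_ACTIVE_DAYS

def metric_is_onboardable(point_dates: list, today: date) -> bool:
    """A metric qualifies once it has at least MIN_POINTS_LAST_7_DAYS
    data points in the 7 days preceding `today`."""
    window_start = today - timedelta(days=7)
    recent = [d for d in point_dates if d >= window_start]
    return len(recent) >= MIN_POINTS_LAST_7_DAYS
```

In this sketch, both checks would run during the Sunday scan, and series passing them would be onboarded that week.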
This is done to optimise our computational resources: we use more complex algorithms for rapidly changing metrics and lighter-weight algorithms for stable metrics. Non-active metrics (time series with no data points in the last 7 days) are not considered for analysis.
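For the activity filter, a minimal sketch, assuming each series carries the timestamp of its most recent data point (the names here are illustrative):

```python
from datetime import datetime, timedelta

def is_active(last_point_at: datetime, now: datetime) -> bool:
    """A series counts as active if it has at least one data point
    in the last 7 days; inactive series are skipped entirely."""
    return now - last_point_at <= timedelta(days=7)
```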
During this weekly scan we review how each time series has behaved in the past 7 days. We identify the typical rate at which data points come in (frequency) and how often the values change (activity). Each time series is then classified by frequency and activity into three main groups (see the sketch after this list):
High frequency, high activity: Time series with data points registered more frequently than every 15 minutes and whose values change more frequently than every 15 minutes.
High frequency, low activity: Time series with data points registered more frequently than every 15 minutes and whose values change every 15 minutes or more slowly.
Low frequency: Time series with data points arriving at a typical interval greater than 15 minutes.
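A minimal sketch of this three-way classification, assuming each series has already been summarised by two numbers: the typical interval between data points and the typical interval between value changes, both in minutes. The names are illustrative; only the 15-minute threshold comes from the definitions above.

```python
from enum import Enum

THRESHOLD_MIN = 15  # minutes; the high/low boundary for both frequency and activity

class SeriesClass(Enum):
    HIGH_FREQ_HIGH_ACTIVITY = "high frequency, high activity"
    HIGH_FREQ_LOW_ACTIVITY = "high frequency, low activity"
    LOW_FREQ = "low frequency"

def classify(point_interval_min: float, change_interval_min: float) -> SeriesClass:
    """Classify a series from its typical interval between data points
    (frequency) and its typical interval between value changes (activity)."""
    if point_interval_min > THRESHOLD_MIN:
        return SeriesClass.LOW_FREQ                 # points arrive less often than every 15 min
    if change_interval_min < THRESHOLD_MIN:
        return SeriesClass.HIGH_FREQ_HIGH_ACTIVITY  # values change more often than every 15 min
    return SeriesClass.HIGH_FREQ_LOW_ACTIVITY       # frequent points, slowly changing values
```

For example, a gauge reporting every minute but changing value only once an hour would land in the high-frequency, low-activity class and be handled by a lighter-weight algorithm.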
This scan can be time-consuming and computationally intensive, which is why it is done only once per week; for the rest of the week we use the values already calculated.