...
Even after the initial training is done, the ML model continues to learn from new data, which increases the accuracy of anomaly detection.
The reasons for at least 7 days of training are:
One full week of data is needed to capture daily and weekly cycles.
More training data reduces the number of false positives and false negatives.
The most recent week of data (also once more than the initial 7 days are available) has the highest “weight” in anomaly detection.
Beyond 7 days, we also consider the following data patterns (counted back from the current day) to classify whether there is an anomaly:
14 days ago
21 days ago
28 days ago
All data between 8 and 28 days (inclusive) before the day for which the corridors are formed.
Monthly and yearly cycles.
More recent data has a higher weight when baselines are built, but weekly, monthly, and yearly cycles are also up-weighted.
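
The look-back days and the weighting idea described above can be sketched in Python. This is only an illustration under stated assumptions: the function names `reference_days` and `illustrative_weight`, the 1/days decay, and the factor of 2 for same-weekday lags are hypothetical and are not the product's actual formula. Monthly and yearly cycles are left out of the sketch.

```python
from datetime import date, timedelta


def reference_days(target: date) -> dict:
    """Group the history days named in the text, relative to `target`.

    `target` is the day for which the corridors are formed.
    """
    return {
        # The most recent week: 1 to 7 days before the target day.
        "last_week": [target - timedelta(days=d) for d in range(1, 8)],
        # The same weekday 14, 21 and 28 days before the target day.
        "weekly_lags": [target - timedelta(days=d) for d in (14, 21, 28)],
        # Every day from 8 to 28 days (inclusive) before the target day.
        "window_8_to_28": [target - timedelta(days=d) for d in range(8, 29)],
    }


def illustrative_weight(days_ago: int) -> float:
    """Hypothetical weight for a history day (assumption, not the real formula).

    Recent days weigh more (1 / days_ago), and days that fall on the same
    weekday as the target (multiples of 7) are doubled to mimic the
    up-weighting of weekly cycles.
    """
    if days_ago < 1:
        raise ValueError("days_ago must be at least 1")
    weight = 1.0 / days_ago
    if days_ago % 7 == 0:
        weight *= 2.0
    return weight


if __name__ == "__main__":
    groups = reference_days(date(2024, 3, 29))
    print("weekly lags:", [d.isoformat() for d in groups["weekly_lags"]])
    for days_ago in (1, 7, 8, 14, 21, 28):
        print(days_ago, "days ago ->", round(illustrative_weight(days_ago), 3))
```

With these assumed numbers, every day of the most recent week outweighs any older day, and a same-weekday lag such as 14 days ago outweighs a more recent but unaligned day such as 8 days ago, which mirrors the two properties stated in the text.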