...
We have been evaluating Eyer's core anomaly detection algorithm by calculating F1 score, recall, and precision using a subset of the Server Machine Dataset, a publicly available labelled dataset. Initially, we tested the dataset as is, then proceeded to relabel it. The relabelling was prompted by the observation that some events, which appeared anomalous based on the data, were not labelled as such. We'll discuss this observation further in the relabelling section. This test was conducted on a smaller dataset, and we plan to release additional updates with larger datasets in the future. Our algorithm consistently shows high recall in both the original and relabelled cases, indicating its effectiveness in identifying all labelled anomalies. For the original dataset, precision and, consequently, the F1 score were low, but improved significantly after relabelling. In the relabelled dataset, we achieved an F1 score of 0.81 (0.86 when considering only critical anomalies) after four weeks of data collection.
...