Anomaly detection in time series

May 14, 2019/Ondrej Kurak

Anomaly detection is one of the areas I work on at Gauss Algorithmic. So when I attended Machine Learning Prague 2019, I was curious about Vítězslav Vlček's lecture "Data-driven System health determination in Monitoring Softwares for Operational Intelligence", in which his methods of anomaly detection were to be presented.

Major problems in anomaly detection

Unlike other problems commonly encountered in machine learning, the hardest part here is determining the "degree of anomaly" of individual cases, and the scarcity of labeled data is also a problem. Together with the strong imbalance between anomalous and normal examples, this makes common techniques almost impossible to use.

Time-series anomalies

The main topic of the lecture was anomaly detection in time series, specifically in several interdependent series. The example presented used CPU, RAM and disk usage. Because these three time series are closely interconnected, it is difficult to create a prediction model for all three variables at once. And even if such a model were successfully created, it would be extremely complex and difficult to interpret.

Vítězslav Vlček presented his own method of solving this problem, inspired by the wave function collapse algorithm. Here, behavior that has not previously occurred in these metrics is considered anomalous. To give you a better idea, I have visualized all three signals (CPU, RAM and disk) in a graph.

Figure 1: Processor, RAM and disk usage

We split these three signals into tiles according to a selected time interval, which lets us model their interdependence. Their subsequent development can then be predicted by "placing" tiles so that they connect as well as possible to all three signals at once. An anomaly is then defined by the difference between the actual development of the signals and the prediction based on the tiles.

Figure 2: Splitting a time interval into tiles
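
To make the tiling idea more concrete, here is a minimal Python sketch. It is not Vlček's implementation: the function names (`make_tiles`, `tile_distance`, `anomaly_scores`), the sample values and the use of the nearest known tile as the "prediction" are my own simplifying assumptions. Each tile keeps only the start and end value of every signal in one window, and the anomaly score of an observed tile is its distance to the best-fitting tile seen before.

```python
from typing import List, Sequence, Tuple

# One tile = one (start, end) pair per signal for a single time window.
Tile = Tuple[Tuple[float, float], ...]


def make_tiles(signals: Sequence[Sequence[float]], width: int) -> List[Tile]:
    """Cut equally sampled signals into windows of `width` steps.

    Neighbouring windows share their boundary sample, so the end of one
    tile is the start of the next one.
    """
    length = min(len(s) for s in signals)
    return [
        tuple((s[i], s[i + width]) for s in signals)
        for i in range(0, length - width, width)
    ]


def tile_distance(a: Tile, b: Tile) -> float:
    """Sum of absolute start and end differences over all signals."""
    return sum(abs(sa - sb) + abs(ea - eb) for (sa, ea), (sb, eb) in zip(a, b))


def anomaly_scores(known: List[Tile], observed: List[Tile]) -> List[float]:
    """Score each observed tile by its distance to the closest known tile."""
    return [min(tile_distance(t, k) for k in known) for t in observed]


# Hypothetical usage percentages, invented for illustration only.
history = make_tiles(
    [
        [10, 20, 30, 20, 10],  # CPU
        [40, 45, 50, 45, 40],  # RAM
        [5, 5, 5, 5, 5],       # disk
    ],
    width=2,
)
current = make_tiles(
    [
        [10, 20, 30, 60, 95],  # CPU spikes in the second window
        [40, 45, 50, 45, 40],
        [5, 5, 5, 5, 5],
    ],
    width=2,
)

scores = anomaly_scores(history, current)
print(scores)  # prints [0, 85]
```

The first window matches a known tile exactly, so its score is 0; the second window has no close counterpart in the history and stands out. A real system would also need a threshold on the score and some normalization between signals, both of which are left out here.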

An advantage of this method is that the same anomaly is not reported repeatedly: once an anomaly has occurred, there is a tile against which it can be compared. If that is not the desired behavior, it is still possible to implement a system for forgetting old tiles, or to use only tiles that have previously occurred at least once. The method is expected to have low memory demands, since a tile does not need to store the entire course of the signal: only the coordinates of the start and end of each signal are saved.
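
The forgetting mechanism could look roughly like the following sketch. Again, this is only my interpretation, not code from the lecture: the class `TileMemory` and its parameters `max_age` and `min_count` are hypothetical, and tiles are assumed to be already quantized to integer levels so that they can be compared for equality.

```python
from collections import Counter
from typing import Dict, List, Tuple

# One tile = quantized (start, end) levels for each signal; nothing else is stored.
Tile = Tuple[Tuple[int, int], ...]


class TileMemory:
    """Remembers tiles and forgets those not seen for `max_age` observations."""

    def __init__(self, max_age: int, min_count: int = 1) -> None:
        self.max_age = max_age        # assumed parameter: lifetime of an unused tile
        self.min_count = min_count    # assumed parameter: occurrences needed to count as known
        self.count: Counter = Counter()
        self.last_seen: Dict[Tile, int] = {}
        self.clock = 0

    def _forget(self) -> None:
        """Drop every tile whose last occurrence is older than `max_age`."""
        stale = [t for t, seen in self.last_seen.items()
                 if self.clock - seen > self.max_age]
        for tile in stale:
            del self.last_seen[tile]
            del self.count[tile]

    def observe(self, tile: Tile) -> bool:
        """Record a tile and return True if it was not yet known (anomalous)."""
        self.clock += 1
        self._forget()
        anomalous = self.count[tile] < self.min_count
        self.count[tile] += 1
        self.last_seen[tile] = self.clock
        return anomalous


# Hypothetical quantized tiles for (CPU, RAM, disk).
normal: Tile = ((1, 1), (4, 4), (0, 0))
spike: Tile = ((1, 9), (4, 4), (0, 0))

memory = TileMemory(max_age=3)
stream: List[Tile] = [normal, spike, normal, spike,
                      normal, normal, normal, normal, spike]
flags = [memory.observe(t) for t in stream]
print(flags)
```

In this run the spike is flagged the first time, accepted the second time because its tile is still remembered, and flagged again at the end because it has been forgotten in the meantime. Raising `min_count` would require a tile to occur several times before it counts as normal.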

Evaluation

In my opinion, the idea behind this method of detecting anomalies is interesting. The algorithm itself is simple and computationally undemanding. However, its use might be somewhat problematic because it is limited to a very specific kind of problem. I will follow the further development of this method as well as its use.
