Anomaly detection on time series can be approached in multiple ways. One method uses deep learning-based autoencoder models built on an encoder-decoder architecture. Before deep-diving into the methodology, let us walk through the high-level flow of time-series anomaly detection with autoencoder models:
- Pre-process the data into sliding windows with an input of 10 days' worth of hourly data and an output of the following 2 days' data, since neural networks tend to work better when the mapping converges from a larger input to a smaller output than when it diverges (see the windowing sketch after this list)
- Construct a deep learning model whose layers fit these input and output dimensions, using LSTM cells (or any other recurrent neural network)
- Train the model on the training data and predict on unseen days
- The model's prediction gives only a central value; upper and lower bounds must be derived from the standard deviation of the output values by adding/subtracting 1.96 * stdev to/from the predicted central value
- If the actual value crosses either the upper or the lower bound, an anomaly alert is raised to notify the user
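As referenced in the first bullet, here is a minimal sketch of the windowing step, assuming the hourly readings sit in a 1-D NumPy array (`make_windows` and the constants are illustrative names, not from the article):

```python
import numpy as np

INPUT_HOURS = 10 * 24   # 10 days of hourly data as input
OUTPUT_HOURS = 2 * 24   # the following 2 days as the prediction target

def make_windows(series, step=OUTPUT_HOURS):
    """Slice a 1-D hourly series into (input, output) window pairs,
    advancing by `step` hours so consecutive outputs tile the series."""
    X, y = [], []
    for start in range(0, len(series) - INPUT_HOURS - OUTPUT_HOURS + 1, step):
        X.append(series[start : start + INPUT_HOURS])
        y.append(series[start + INPUT_HOURS : start + INPUT_HOURS + OUTPUT_HOURS])
    # LSTM layers expect (samples, timesteps, features)
    return np.array(X)[..., np.newaxis], np.array(y)[..., np.newaxis]
```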
Of all the steps listed above, our area of interest is the development and application of the deep learning-based autoencoder model.
In an encoder-decoder architecture, the data is first squeezed into a latent space of reduced dimensions, and that reduced representation is then reconstructed to match the output values. LSTM cells are used in both the encoder and the decoder to memorize time-series patterns, including long-term dependencies such as seasonality, trend, and level. The latent space is also known as the thought vector, a compressed representation of the input vector.
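A sketch of one way to construct such an encoder-decoder in Keras (the latent dimension and other hyperparameters are illustrative assumptions, not values prescribed here):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

def build_encoder_decoder(input_steps=240, output_steps=48, latent_dim=64):
    """Encoder squeezes 10 days of hourly input into a latent 'thought
    vector'; decoder reconstructs the next 2 days from that vector."""
    model = Sequential([
        # Encoder: compress the input sequence into the latent space
        LSTM(latent_dim, input_shape=(input_steps, 1)),
        # Repeat the thought vector once per output timestep
        RepeatVector(output_steps),
        # Decoder: unroll the latent representation back into a sequence
        LSTM(latent_dim, return_sequences=True),
        # One predicted value per output hour
        TimeDistributed(Dense(1)),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```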
LSTM stands for Long Short-Term Memory, which means these cells have the capacity to memorize both long-term and short-term patterns in the data. With this capacity, the network can remember short-term patterns such as day-level cycles as well as long-term patterns such as monthly or yearly seasonality.
The constructed model is trained on 10 days as input and the consecutive 2 days as output, and this training continues recursively by advancing through the input time series 2 days at a time. Why advance by only 2 days? Because after predicting 2 days, the next 2 days become the next consecutive prediction, bridging the gap so the model predicts seamlessly.
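One way to read this recursive training is as a walk-forward loop. The sketch below assumes the hypothetical `make_windows` and `build_encoder_decoder` helpers from the earlier sketches, plus an `hourly_series` array:

```python
import numpy as np

# Walk-forward training: each window pair advances by 48 hours, so
# consecutive 2-day output blocks tile the series without gaps.
X, y = make_windows(hourly_series)   # hourly_series: hypothetical 1-D array
model = build_encoder_decoder()
for X_win, y_win in zip(X, y):
    # One window at a time, trained for a prescribed number of iterations
    model.fit(X_win[np.newaxis], y_win[np.newaxis], epochs=20, verbose=0)
```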
At every step, the model is trained for a prescribed number of iterations to update the weights of the input (encoder) and output (decoder) layers. Once training is completed, the model is deployed in a production environment, where it uses the latest available 10 days of data to predict values for the consecutive 2 days. The user can choose to retrain the model once the actual values have been collected. As discussed earlier, upper and lower bounds are derived from the variation in the data, so the algorithm can trigger alerts and inform the user whenever the actual value crosses either limit.
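A sketch of that production step: predict 2 days from the latest 10 days, band the central values with 1.96 * stdev, and flag crossings. Here the stdev is taken over the predicted output values, as the earlier bullet suggests, and `predict_with_bounds` is an illustrative name:

```python
import numpy as np

def predict_with_bounds(model, last_10_days, actual_2_days):
    """Predict the next 2 days, wrap the central values in a
    1.96 * stdev band, and flag hours where the actual value escapes it."""
    central = model.predict(last_10_days[np.newaxis, :, np.newaxis]).ravel()
    stdev = central.std()              # variation of the predicted outputs
    upper = central + 1.96 * stdev
    lower = central - 1.96 * stdev
    anomalies = (actual_2_days > upper) | (actual_2_days < lower)
    return central, lower, upper, anomalies
```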
Pratap Dangeti is the Principal Data Scientist at CrunchMetrics. He has close to 9 years of experience in analytics across domains such as banking, IT, credit & risk, manufacturing, hi-tech, utilities, and telecom. His technical expertise includes statistical modelling, machine learning, big data, deep learning, NLP, and artificial intelligence. As a hobbyist, he has written two books in the field of machine learning and NLP.