Hawkular Data Mining currently offers a time series prediction engine. The engine can be used for alert prediction or for predictive charts in a user interface. The project is split into several modules, so the time series models, or even the web application, can be used without the Hawkular integration code.
The most important Hawkular Data Mining modules are:
Lightweight library for time series modeling and forecasting.
Web archive for standalone usage without Hawkular.
Web archive with Hawkular integration code designed to be deployed in Hawkular.
In the standalone distribution, metric data has to be pushed to Data Mining; however, it can easily be customized to use any metrics storage.
The forecast artifact is a lightweight library for time series modeling and forecasting. It can easily be used in any Java project. The following time series models and statistical functions are implemented:
Simple exponential smoothing
Double exponential smoothing (Holt’s linear trend)
Seasonal triple exponential smoothing (Holt-Winters)
Simple moving average
Weighted moving average
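As a rough illustration of the two moving-average models in the list above, the following minimal sketch computes a next-step forecast from the most recent observations. The class and method names are hypothetical and are not taken from the forecast library; the library's own classes may be structured quite differently.

```java
final class MovingAverages {

    /** Next-step forecast: the plain mean of the last {@code window} observations. */
    static double simpleMovingAverage(double[] series, int window) {
        if (window <= 0 || window > series.length) {
            throw new IllegalArgumentException("window must be in 1..series.length");
        }
        double sum = 0.0;
        for (int i = series.length - window; i < series.length; i++) {
            sum += series[i];
        }
        return sum / window;
    }

    /**
     * Next-step forecast: weighted mean of the last {@code weights.length} observations.
     * weights[0] applies to the oldest point in the window, the last weight to the newest.
     */
    static double weightedMovingAverage(double[] series, double[] weights) {
        if (weights.length == 0 || weights.length > series.length) {
            throw new IllegalArgumentException("need 1..series.length weights");
        }
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        int offset = series.length - weights.length;
        for (int i = 0; i < weights.length; i++) {
            weightedSum += weights[i] * series[offset + i];
            weightTotal += weights[i];
        }
        if (weightTotal == 0.0) {
            throw new IllegalArgumentException("weights must not sum to zero");
        }
        // Normalise, so the weights do not have to sum to 1.
        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        double[] series = {10, 12, 14, 16};
        System.out.println(simpleMovingAverage(series, 2));
        System.out.println(weightedMovingAverage(series, new double[] {1, 2, 3}));
    }
}
```

Giving the newest observations larger weights makes the weighted variant react faster to recent changes than the simple mean over the same window.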
All variants of exponential smoothing contain an optimizer that finds the best smoothing parameters for a given training data set. The optimizers minimize the mean squared error of one-step-ahead predictions using a nonlinear optimization algorithm (Nelder-Mead simplex).
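To make the objective concrete, here is a minimal sketch for simple exponential smoothing only. It computes the mean squared error of the one-step-ahead predictions for a given smoothing factor and then searches for the factor with the lowest error. Note the simplification: a plain grid search over alpha stands in for the Nelder-Mead simplex, and initialising the level with the first observation is an assumption. All names are hypothetical and not part of the forecast library.

```java
final class SimpleExponentialSmoothingSketch {

    /** Mean squared error of one-step-ahead forecasts for the smoothing factor alpha. */
    static double oneStepMse(double[] y, double alpha) {
        if (y.length < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double level = y[0];                 // assumption: initial level = first observation
        double squaredErrorSum = 0.0;
        for (int t = 1; t < y.length; t++) {
            double error = y[t] - level;     // the current level is the forecast for time t
            squaredErrorSum += error * error;
            level += alpha * error;          // same as alpha * y[t] + (1 - alpha) * level
        }
        return squaredErrorSum / (y.length - 1);
    }

    /** Grid search over alpha in 0.01..0.99; a stand-in for a simplex optimizer. */
    static double bestAlpha(double[] y) {
        double best = 0.01;
        double bestMse = Double.POSITIVE_INFINITY;
        for (int i = 1; i <= 99; i++) {
            double alpha = i / 100.0;
            double mse = oneStepMse(y, alpha);
            if (mse < bestMse) {
                bestMse = mse;
                best = alpha;
            }
        }
        return best;
    }

    /** Forecast for the next step: the level after all observations have been processed. */
    static double forecast(double[] y, double alpha) {
        double level = y[0];
        for (int t = 1; t < y.length; t++) {
            level += alpha * (y[t] - level);
        }
        return level;
    }

    public static void main(String[] args) {
        double[] training = {3.0, 3.4, 2.9, 3.8, 4.1, 3.9, 4.6};
        double alpha = bestAlpha(training);
        System.out.println("alpha = " + alpha + ", next = " + forecast(training, alpha));
    }
}
```

A grid search is only practical here because simple exponential smoothing has a single parameter; for the double and triple variants, with two and three parameters, a multidimensional optimizer such as Nelder-Mead is the more natural choice.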
AutomaticForecaster can be used to select the best model. It selects the model based on the Akaike information criterion (AIC, or AICc with small-sample correction) or the Bayesian information criterion (BIC). If the time series changes over time, AutomaticForecaster can select a model different from the previously selected one, so it can be used on arbitrary time series streams that exhibit concept drift.
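The sketch below shows how such criteria can rank candidate models. It uses the common least-squares forms (Gaussian errors, additive constants dropped), which is an assumption: the formulas used inside AutomaticForecaster may differ. The class, the method names and the sample numbers are hypothetical. Here sse is the sum of squared one-step-ahead errors, n the number of observations and k the number of estimated parameters.

```java
final class InformationCriteria {

    /** AIC = n * ln(sse / n) + 2k */
    static double aic(double sse, int n, int k) {
        return n * Math.log(sse / n) + 2.0 * k;
    }

    /** AICc = AIC + 2k(k + 1) / (n - k - 1); the correction matters for small n. */
    static double aicc(double sse, int n, int k) {
        if (n - k - 1 <= 0) {
            throw new IllegalArgumentException("AICc needs n > k + 1");
        }
        return aic(sse, n, k) + (2.0 * k * (k + 1)) / (n - k - 1);
    }

    /** BIC = n * ln(sse / n) + k * ln(n) */
    static double bic(double sse, int n, int k) {
        return n * Math.log(sse / n) + k * Math.log(n);
    }

    public static void main(String[] args) {
        int n = 50;
        // Hypothetical fit results: a 1-parameter model and a 2-parameter model.
        double simple = aicc(120.0, n, 1);
        double withTrend = aicc(95.0, n, 2);
        // The model with the lower criterion value is preferred.
        System.out.println(simple < withTrend ? "simple" : "with trend");
    }
}
```

All three criteria reward a lower error and penalise extra parameters, so a more complex model is selected only when it reduces the error enough to pay for its additional parameters.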
The implemented statistical functions are:
Augmented Dickey-Fuller test
Autocorrelation function (ACF)
Time series decomposition
Time series lagging
Time series differencing
Automatic period identification based on ACF
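Three of the functions listed above (differencing, the autocorrelation function and ACF-based period identification) can be sketched in a few lines. The period heuristic used here, taking the first positive local peak of the ACF, is an assumption chosen for illustration and may not match the library's approach. All names are hypothetical.

```java
final class AcfSketch {

    /** Differencing: returns y[t] - y[t - lag] for t = lag..n-1. */
    static double[] difference(double[] y, int lag) {
        if (lag <= 0 || lag >= y.length) {
            throw new IllegalArgumentException("lag must be in 1..n-1");
        }
        double[] result = new double[y.length - lag];
        for (int t = lag; t < y.length; t++) {
            result[t - lag] = y[t] - y[t - lag];
        }
        return result;
    }

    /** Sample autocorrelation at the given lag (NaN for a constant series). */
    static double autocorrelation(double[] y, int lag) {
        int n = y.length;
        if (lag < 0 || lag >= n) {
            throw new IllegalArgumentException("lag must be in 0..n-1");
        }
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= n;
        double numerator = 0.0;
        double denominator = 0.0;
        for (int t = 0; t < n; t++) {
            denominator += (y[t] - mean) * (y[t] - mean);
            if (t >= lag) {
                numerator += (y[t] - mean) * (y[t - lag] - mean);
            }
        }
        return numerator / denominator;
    }

    /** Heuristic: the first lag whose ACF value is a positive local peak, or -1 if none. */
    static int estimatePeriod(double[] y, int maxLag) {
        double[] acf = new double[maxLag + 1];
        for (int lag = 0; lag <= maxLag; lag++) {
            acf[lag] = autocorrelation(y, lag);
        }
        for (int lag = 1; lag < maxLag; lag++) {
            if (acf[lag] > 0 && acf[lag] > acf[lag - 1] && acf[lag] >= acf[lag + 1]) {
                return lag;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        double[] y = new double[24];
        for (int i = 0; i < y.length; i++) {
            y[i] = 1 + (i % 4);              // repeating pattern 1, 2, 3, 4
        }
        System.out.println(estimatePeriod(y, 12));
    }
}
```

An identified period could then serve as the season length of a seasonal model such as Holt-Winters, and differencing is a common way to remove a trend before testing a series for stationarity.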
When Data Mining is deployed in Hawkular, predictions can be enabled by creating relationships in Inventory:
from tenant to tenant, to forecast all metrics of the given tenant
from tenant to metric type, to forecast all metrics of the given type
from tenant to metric, to forecast the given metric
The following diagram depicts the interaction with other modules. The numbers denote the order of the metric data flow from the agent to the Data Mining and Metrics modules.
The Hawkular Data Mining Git repository is hosted on GitHub. Maven and Java 8 are needed to build it.