Overview of Hawkular
It’s a hawk with a monocular. Hawks are known for their very sharp vision and are very good hunters: they can catch prey at high speed by anticipating its movements. The analogy is that this project aims to monitor things and catch anomalies in fast-paced environments.
The project started around the end of 2014 as a successor to the RHQ project.
It is a set of open source (Apache License v2) projects for building monitoring solutions, sponsored by Red Hat. These projects provide REST services for all kinds of monitoring needs. From collecting rain sensor data and sending an SMS when it rains, to monitoring Docker containers or doing Application Performance Monitoring, we aim to provide generic solutions to common problems.
The monitoring services provided by Hawkular have been adopted by different projects and are central to the Middleware management solution in the ManageIQ project.
IoT enthusiasts who need to collect metrics and possibly trigger alerts
Operators who are looking for a solution to store metrics from StatsD, collectd, syslog…
Developers of solutions who need long-term time series database storage
Users of ManageIQ who are looking for Middleware management
Users of Kubernetes/Heapster who want to store Docker container metrics in long-term time series database storage, thanks to the Heapster sink for Hawkular.
Hawkular Services is the flagship project here. It provides services to store metrics, alert on metrics, and keep a graph view of an inventory (how "things" are connected)… If you are looking to use or build on a monitoring solution, this is likely what you will want to use.
Hawkular APM (Application Performance Management) is a separate project specifically designed to monitor applications. The project makes it possible to detect and capture fragments of business requests so that one can see and visualize where time is spent (which layer of an application architecture, which microservice…).
Hawkular Metrics is a time series database (TSDB), backed by Cassandra for scalability. Hawkular Metrics is used and exposed by Hawkular Services. If you just need a TSDB without alerting and inventory, this is the project that you will want to use.
Hawkular services are a set of independent services sharing information over a communication bus. Each service is focused on a particular task, for example storing metrics, storing an inventory of resources, or evaluating rules to trigger actions. While those services can scale independently, at this time a single package with all the services is provided to simplify the installation process.
The Hawkular development team chose Cassandra as the single storage backend. While this requires maintaining an additional piece in the architecture, we believe that the benefits outweigh the drawbacks. By using Cassandra, Hawkular can offer replication and horizontal scalability for the most demanding needs.
Push vs. pull is a long-running debate: some solutions let collectors send the data to the server (push), while others have the server query metrics from an API (pull).
Hawkular Metrics offers a REST API to store and retrieve metrics. While Hawkular provides some client libraries, the same API can be used with any tool, framework or language that supports HTTP communication, including command-line tools such as curl.
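As a rough illustration of storing a datapoint over plain HTTP, the sketch below builds such a request with Python's standard library. The base URL, the `gauges/{id}/raw` path, the `Hawkular-Tenant` header and the tenant name are assumptions made for this example, not details stated in this overview; check them against the API reference for your Hawkular version.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed values for illustration only; adjust them to your own deployment.
BASE_URL = "http://localhost:8080/hawkular/metrics"
TENANT = "my-tenant"


def build_gauge_request(metric_id, value, timestamp_ms=None):
    """Build (but do not send) a POST request carrying one gauge datapoint."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    body = json.dumps([{"timestamp": timestamp_ms, "value": value}]).encode("utf-8")
    metric_path = urllib.parse.quote(metric_id, safe="")
    return urllib.request.Request(
        url=f"{BASE_URL}/gauges/{metric_path}/raw",  # assumed endpoint layout
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Hawkular-Tenant": TENANT,  # assumed tenant header
        },
    )


def send(request):
    """Send a prepared request and return the HTTP status code."""
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `send(build_gauge_request("response.time", 42.5))` would post one value to a server running at the assumed address. Building and sending are kept separate so the request can be inspected without a live server.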
Three types of metrics are available:
Availability: a metric that represents availability: up or down
Counter: numeric value that always goes up; example: the total number of visitors of a web page
Rates: the rate of change can be retrieved for counters
Gauge: numeric value that can go up or down; example: the response time on an application
Aggregate buckets (min/max/average on some intervals) can be obtained for gauges when individual datapoints are not needed
All metrics can have tags for easy filtering and a retention time to automatically prune older data.
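To make counter rates and gauge buckets concrete, here is a small sketch of the arithmetic in plain Python. It only illustrates the idea; the function names and the (timestamp in milliseconds, value) tuple layout are hypothetical, and the server may compute these values differently.

```python
from statistics import mean


def counter_rates(points):
    """Return (timestamp_ms, per-second rate) for consecutive counter datapoints."""
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        elapsed_seconds = (t1 - t0) / 1000
        rates.append((t1, (v1 - v0) / elapsed_seconds))
    return rates


def gauge_buckets(points, start_ms, end_ms, bucket_ms):
    """Summarise gauge datapoints as min/max/avg per fixed-width time bucket."""
    buckets = []
    for bucket_start in range(start_ms, end_ms, bucket_ms):
        bucket_end = bucket_start + bucket_ms
        values = [v for t, v in points if bucket_start <= t < bucket_end]
        if values:  # empty buckets are skipped in this sketch
            buckets.append({
                "start": bucket_start,
                "min": min(values),
                "max": max(values),
                "avg": mean(values),
            })
    return buckets
```

For example, a visitor counter sampled as `[(0, 100), (10_000, 150)]` gives a rate of 5 visitors per second over those ten seconds.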
Through the ptrans network protocol adapter, Hawkular supports metrics collected through:
Additionally, Heapster has a native sink for Hawkular, which makes it possible to collect metrics from a Kubernetes environment (container, pod and host metrics in general).
Powerful alerting is a must-have in any monitoring system. In general, alerting can be used to notify people or to execute operations (restart a machine, autoscale an environment…).
Hawkular Alerts embeds a rule engine but hides the complexity behind a simple REST API.
Several kinds of conditions can be defined:
Threshold: (X < 10, X >= 20)
Threshold Range: X inside [10,20), X outside [100,200]
Compare two metrics: X < 80% Y
String: X starts with "ABC", X matches "A.*B"
Availability: X is DOWN
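The condition kinds listed above can be read as simple predicates. The sketch below expresses each one in plain Python to show what they evaluate. It is not the Hawkular Alerts API: every function name is hypothetical, and treating "matches" as a full-string regular expression match is an assumption.

```python
import operator
import re

_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def threshold(x, op, limit):
    """Threshold: X < 10, X >= 20."""
    return _OPS[op](x, limit)


def in_range(x, low, high, low_inclusive=True, high_inclusive=True):
    """Threshold range: X inside [10,20); negate the result for 'outside'."""
    above = x >= low if low_inclusive else x > low
    below = x <= high if high_inclusive else x < high
    return above and below


def compare(x, op, ratio, y):
    """Compare two metrics: X < 80% Y becomes compare(x, "<", 0.8, y)."""
    return _OPS[op](x, ratio * y)


def starts_with(x, prefix):
    """String: X starts with "ABC"."""
    return x.startswith(prefix)


def matches(x, pattern):
    """String: X matches "A.*B" (assumed to mean a full-string match)."""
    return re.fullmatch(pattern, x) is not None


def is_down(availability):
    """Availability: X is DOWN."""
    return availability.upper() == "DOWN"
```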
When many measurements are made, some may not be relevant, and one needs to separate noise from real issues. To do so, Hawkular Alerts offers several kinds of dampening:
Strict: N consecutive true conditions are needed before the action is triggered
Relaxed count: N true conditions out of M measurements
Strict time: Condition is true for the time T
Relaxed time: N true conditions in the time T
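The two count-based dampening modes can be sketched as small state machines. This is an illustration of the idea only, not the Hawkular Alerts implementation: the class names are hypothetical, and resetting the state after firing is an assumption.

```python
from collections import deque


class StrictDampening:
    """Fire only after N consecutive true evaluations."""

    def __init__(self, n):
        self.n = n
        self.streak = 0

    def evaluate(self, condition_is_true):
        self.streak = self.streak + 1 if condition_is_true else 0
        if self.streak >= self.n:
            self.streak = 0  # assumption: start counting again after firing
            return True
        return False


class RelaxedCountDampening:
    """Fire when at least N of the last M evaluations were true."""

    def __init__(self, n, m):
        self.n = n
        self.window = deque(maxlen=m)  # keeps only the last M results

    def evaluate(self, condition_is_true):
        self.window.append(bool(condition_is_true))
        if sum(self.window) >= self.n:
            self.window.clear()  # assumption: reset the window after firing
            return True
        return False
```

With the sequence true, false, true, a strict dampener with N=2 never fires, while a relaxed one with N=2 and M=3 fires on the third evaluation. That difference is what lets the relaxed mode tolerate an occasional good reading in the middle of a real problem.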
Hawkular has notification plugins for:
Mobile Push notification (through Aerogear UPS)
SMS notification (with Twilio)
You can also add your own plugin, which may take actions rather than notify.
Hawkular Services and Metrics have a Grafana plugin for dashboard-like visualization. Additionally, Hawkular provides ready-to-use Angular directives to embed graphs in any HTML page. One can build a customized report in very little time.
To better react to system failures and/or do root cause analysis, a good understanding of the environment is needed. Hawkular Inventory provides a graph database designed to store information about how various parts of the architecture are connected together.
A simple REST API makes it possible to record and query how elements are related, for example deployments in a server or a group of temperature sensors.
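To give a feel for what recording and querying relationships means, here is a toy in-memory graph in Python. It does not use or mirror the Hawkular Inventory API; the class, method and relation names are all hypothetical.

```python
from collections import defaultdict


class Inventory:
    """Toy graph: nodes are element names, edges are labelled relations."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, source, relation, target):
        """Record that `source` is linked to `target` by `relation`."""
        self._edges[(source, relation)].add(target)

    def related(self, source, relation):
        """Elements directly linked to `source` by `relation`."""
        return sorted(self._edges.get((source, relation), set()))

    def reachable(self, source, relation):
        """Elements reachable from `source` by following `relation` repeatedly."""
        seen, stack = set(), [source]
        while stack:
            current = stack.pop()
            for target in self._edges.get((current, relation), ()):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return sorted(seen)
```

For instance, after `inv.relate("server-1", "contains", "app.war")` and `inv.relate("app.war", "contains", "datasource-1")`, asking what `server-1` transitively contains returns both elements. This is the kind of question that helps when tracing the impact of a failure.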
Hawkular Application Performance Management (APM), formerly known as Hawkular Business Transaction Management (BTM), can be used to instrument JVM based applications, to observe the application invocations (traces) that may be executing across one or more servers (on-premises and/or cloud).
Fragments of the trace instances, captured from different servers, are reported to the APM server where further analysis is performed. In addition to providing statistical information about the trace instances, it is also possible to extract business relevant properties from the executing applications, for later analysis.
More detailed documentation on this capability can be found here.