Hawkular Blog

Connecting Hawkular Agent to Hawkular Services over SSL

19 September 2017, by Josejulio Martínez

SSL provides identification of the communicating parties, confidentiality and integrity of the information being shared.

In production environments, network eavesdropping could be a concern. You can use SSL to provide a secure communication channel between servers.

Hawkular Services expects Hawkular Agents to connect, push information and listen for commands. These commands also include server credentials that could be vulnerable to a man-in-the-middle attack.

A previous article shows how to configure Hawkular Services with SSL and have Hawkular Agents connect to it through SSL. This guide shows how it is done for Dockerized Hawkular Services and Agents.

Preparing your certificates

Before starting, we need to prepare the certificates that we are going to use. The public and private keys for Hawkular Services need to be in PEM or PKCS12 format. For this guide we will use PEM.

We can create self-signed certificates (in PEM format) using:

keytool -genkey -keystore hawkular.keystore -alias hawkular -dname "CN=hawkular-services" -keyalg RSA -storepass hawkular -keypass hawkular -validity 36500 -ext san=ip:,dns:hawkular-services
keytool -importkeystore -srckeystore hawkular.keystore  -destkeystore hawkular.p12 -deststoretype PKCS12 -srcalias hawkular -deststorepass hawkular -destkeypass hawkular -srcstorepass hawkular
openssl pkcs12 -in hawkular.p12 -password pass:hawkular -nokeys -out hawkular-services-public.pem
openssl pkcs12 -in hawkular.p12 -password pass:hawkular -nodes -nocerts -out hawkular-services-private.key

When creating the certificates, remember to include the host that Hawkular will use (as both Common Name (CN) and Subject Alternative Name (SAN)); otherwise certificate validation will fail. Throughout this guide, the host is set to hawkular-services.
Always include san=ip: in your certificate, as this will be used internally by Hawkular Services.

Starting Cassandra

Hawkular Services requires a Cassandra instance. You can start one with:

docker run --name hawkular-cassandra -e CASSANDRA_START_RPC=true -d cassandra:3.0.12

Starting Hawkular Services on SSL

Now that a Cassandra instance is ready, Hawkular Services can be started. It will be linked to the Cassandra instance (hawkular-cassandra):

docker pull hawkular/hawkular-services
docker run --name hawkular-services --link=hawkular-cassandra -e CASSANDRA_NODES=hawkular-cassandra -e HAWKULAR_HOSTNAME=hawkular-services -e HAWKULAR_USE_SSL=true -p 8443:8443 -v `pwd`/hawkular-services-private.key:/client-secrets/hawkular-services-private.key:z -v `pwd`/hawkular-services-public.pem:/client-secrets/hawkular-services-public.pem:z hawkular/hawkular-services

To avoid guessing the host, we explicitly set it to hawkular-services using the HAWKULAR_HOSTNAME environment variable.
If you don’t specify any certificate, Hawkular Services will create one, but we won’t be able to connect agents unless we export that certificate.

Starting a Hawkular Agent

By now Hawkular Services should be listening on the default secure port 8443. Any agent that wants to connect will need to know and trust its certificate. If you are self-signing your certificate, you will need to pass the Hawkular Services certificate when starting the agent.

docker pull hawkular/wildfly-hawkular-javaagent
docker run --name hawkular-agent-01 --link=hawkular-services -e HAWKULAR_SERVER_PROTOCOL=https -e HAWKULAR_SERVER_ADDR=hawkular-services -e HAWKULAR_SERVER_PORT=8443 -v `pwd`/hawkular-services-public.pem:/client-secrets/hawkular-services-public.pem:z hawkular/wildfly-hawkular-javaagent

The host must match the one specified in the certificate, so HAWKULAR_SERVER_ADDR must be set to hawkular-services. Update it as needed if you use another value.
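
To make the host-matching requirement concrete, here is a minimal Python sketch of the check a client performs. It is a simplification and not what the agent actually runs: it only does exact matches (no wildcards), it treats the Common Name and the SAN entries alike, and the dictionary is hand-written in the shape returned by Python's ssl.SSLSocket.getpeercert(). The function names are illustrative.

```python
def names_in_certificate(cert):
    """Collect DNS/IP names from a dict shaped like ssl.SSLSocket.getpeercert()."""
    names = {value for kind, value in cert.get("subjectAltName", ())
             if kind in ("DNS", "IP Address")}
    # Also include the Common Name from the subject.
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.add(value)
    return names


def host_is_covered(cert, host):
    """True when `host` appears among the certificate's names (exact match only)."""
    return host in names_in_certificate(cert)


# Hand-written example mirroring the certificate created in this guide.
example_cert = {
    "subject": ((("commonName", "hawkular-services"),),),
    "subjectAltName": (("DNS", "hawkular-services"),),
}

print(host_is_covered(example_cert, "hawkular-services"))  # True
print(host_is_covered(example_cert, "localhost"))          # False
```

This is why connecting to the same container under a different name, such as localhost, fails validation unless that name is also present in the certificate.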

Testing the setup

Any Hawkular client that supports connecting over SSL can be used. Curl, ManageIQ and HawkFX are shown below.

If using self-signed certificates, the client machine needs to trust the Hawkular Services certificate, or the client must allow specifying which certificate to trust. See here for more info.


Curl

This is the simplest way to test: we only need to tell curl how to resolve hawkular-services and pass the public certificate.

curl -H "Hawkular-Tenant: hawkular" --resolve "hawkular-services:8443:" --cacert hawkular-services-public.pem -X GET https://hawkular-services:8443/hawkular/metrics/metrics

It will output metrics stored on Hawkular Services.
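
For scripted checks, the same request can be made from Python's standard library. The sketch below is only a rough equivalent of the curl call: it assumes hawkular-services already resolves on the client (for example through /etc/hosts, since nothing here plays the role of --resolve) and that hawkular-services-public.pem is in the working directory. The function names are illustrative.

```python
import ssl
import urllib.request

METRICS_URL = "https://hawkular-services:8443/hawkular/metrics/metrics"


def build_request(url=METRICS_URL, tenant="hawkular"):
    """Build the GET request carrying the tenant header used in the curl call."""
    return urllib.request.Request(url, headers={"Hawkular-Tenant": tenant}, method="GET")


def fetch_metrics(ca_file="hawkular-services-public.pem"):
    """Fetch the metric definitions, trusting only the certificate in `ca_file`."""
    # Passing cafile means the system trust store is not loaded.
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(build_request(), context=context) as response:
        return response.read().decode("utf-8")


# Usage, once Hawkular Services is running and resolvable:
#     print(fetch_metrics())
```

If the certificate does not cover the host name in the URL, urlopen raises an error instead of returning data, which is the behaviour we want from a validating client.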


ManageIQ

The setup can be quickly tested using the dockerized ManageIQ as follows:

docker run --link=hawkular-services --privileged -d -p 8444:443 manageiq/manageiq:fine-3

Once ManageIQ has started, add a Middleware Provider. There are three secure ways to do that:

  1. SSL: Requires that the client box trusts the Hawkular Services certificate.

  2. SSL trusting custom CA: Requires providing the certificate to trust.

  3. SSL without validation: No validation is performed.

We will focus on (2) and (3), as (1) requires trusting the certificate on the machine itself.

Before going into details, navigate to https://localhost:8444/ and log in with the default username admin and password smartvm. Click on Middleware → Configuration → Add a New Middleware Provider.

SSL trusting custom CA

We need to select SSL trusting custom CA in Security Protocol and copy the certificate from the contents of hawkular-services-public.pem. Fill the Hostname with the one used in the certificate and complete the required information.

Figure 1: Middleware Provider SSL trusting custom CA

SSL without validation

We need to select SSL without validation in Security Protocol. With this option no certificate validation is performed; just fill in the required information.

Figure 2: Middleware Provider SSL without validation


HawkFX

One can connect using HawkFX by either installing the certificate on the machine or disabling the verification as shown in the image. You will also need to update your /etc/hosts to resolve hawkular-services.

sudo su -c "echo ' hawkular-services' >> /etc/hosts"

Alternatively, if no verification is used, one could use localhost instead of hawkular-services.

Figure 3: HawkFX without validation


Securing communications between a dockerized Hawkular Agent and Hawkular Services is possible with self-signed certificates. Connecting clients is also possible with the additional step of providing the certificate.

Hawkular Alerts with OpenTracing

06 September 2017, by John Mazzitelli

Two recent blogs discuss how OpenTracing instrumentation can be used to collect application metrics:

A further interesting integration can be the addition of Hawkular Alerts to the environment.

As the previous blog and demo discuss, Hawkular Alerts is a generic, federated alerts system that can trigger events, alerts, and notifications from different, independent systems such as Prometheus, ElasticSearch, and Kafka.

Here we can combine the two. Let’s follow the directions for the OpenTracing demo (using the Jaeger implementation) and add Hawkular Alerts.

What this can show is OpenTracing application metrics triggering alerts when (as in this example) OpenTracing spans encounter larger-than-expected error rates.

(Note: these instructions assume you are using Kubernetes / Minikube - see the Hawkular OpenTracing blogs linked above for more details on these instructions)


Here we start minikube giving it enough resources to run all of the pods necessary for this demo. We also start up a browser pointing to the Kubernetes dashboard, so you can follow the progress of the remaining instructions.

minikube start --cpus 4 --memory 8192

minikube dashboard


kubectl create -f https://raw.githubusercontent.com/coreos/prometheus-operator/v0.11.0/bundle.yaml

kubectl create -f https://raw.githubusercontent.com/objectiser/opentracing-prometheus-example/master/prometheus-kubernetes.yml

(Note: the last command might not work depending on your version - if you get errors, download a copy of prometheus-kubernetes.yml and edit it, changing “v1alpha1” to “v1”)


kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-kubernetes/master/all-in-one/jaeger-all-in-one-template.yml

The following will build and deploy the Jaeger example code that will produce the OpenTracing data for the demo:

mkdir -p ${HOME}/opentracing ; cd ${HOME}/opentracing

git clone git@github.com:objectiser/opentracing-prometheus-example.git

cd opentracing-prometheus-example/simple

eval $(minikube docker-env)

mvn clean install docker:build

kubectl create -f services-kubernetes.yml

(Note: The last command might not work depending on your version - if you get errors, edit services-kubernetes.yml, changing “v1alpha1” to “v1”)


The following will deploy Hawkular Alerts and create the trigger definition that will trigger an alert when the Jaeger OpenTracing data indicates an error rate that is over 30%:

kubectl create -f https://raw.githubusercontent.com/hawkular/hawkular-alerts/master/dist/hawkular-alerts-k8s.yml

Next use minikube service hawkular-alerts --url to determine the Hawkular Alerts URL and point your browser to the path “/hawkular/alerts/ui” at that URL (i.e. http://host:port/hawkular/alerts/ui).

From the browser page running the Hawkular Alerts UI, enter a tenant name in the top right text field (“my-organization” for example) and click the “Change” button.

Navigate to the “Triggers” page (found in the left-hand nav menu).

Click the kebab menu icon at the top and select “New Trigger”.

In the text area, enter the following to define a new trigger that will trigger alerts when the Prometheus query shows that there is a 30% error rate or greater in the accountmgr or ordermgr servers:

      "name":"High Error Rate",
      "description":"Data indicates high error rate",
         "expression":"(sum(increase(span_count{error=\"true\",span_kind=\"server\"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind) / sum(increase(span_count{span_kind=\"server\"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)) > 0.3"

Figure 1: Create New Alert Trigger

Figure 2: Alert Trigger

Now navigate back to the “Dashboard” page (again via the left-hand nav menu). From this Dashboard page, look for alerts when they are triggered. We’ll next start generating the data that will trigger these alerts.


export ORDERMGR=$(minikube service ordermgr --url)


Once the data starts to be collected, you will see alerts in the Hawkular Alerts UI as error rates exceed 30% over the past minute (as per the Prometheus query).

Figure 3: Alerts Dashboard

Figure 4: Alert

If you look at the alerts information in the Hawkular Alerts UI, you’ll see the conditions that triggered the alerts. For example, one such alert could look like this:

Time: 2017-09-01 17:41:17 -0400
External[prometheus]: prometheus-test[Event [tenantId=my-organization,
id=1a81471d-340d-4dba-abe9-5b991326dc80, ctime=1504302077288, category=prometheus,
dataId=prometheus-test, dataSource=none, text=[1.504302077286E9, 0.3333333333333333],
context={service=ordermgr, version=0.0.1}, tags={}, trigger=null]] matches
[(sum(increase(span_count{error="true",span_kind="server"}[1m])) without
(pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind) /
sum(increase(span_count{span_kind="server"}[1m])) without
(pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)) > 0.3]

Notice the “ordermgr” service (version "0.0.1") had an error rate of 0.3333 (33%) which caused the alert since it is above the allowed 30% threshold.
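
To see how the numbers in that alert relate to the threshold, the following Python sketch decodes the text field of the event shown above. It assumes, based on that output, that the field holds a [timestamp, value] pair with the timestamp in epoch seconds; the helper name is made up for this example.

```python
import json
from datetime import datetime, timezone

THRESHOLD = 0.3  # the 30% error rate used in the trigger expression


def parse_sample(text):
    """Parse the '[timestamp, value]' pair carried in the event's text field."""
    timestamp, value = json.loads(text)
    return datetime.fromtimestamp(timestamp, tz=timezone.utc), value


when, ratio = parse_sample("[1.504302077286E9, 0.3333333333333333]")
print(when.strftime("%Y-%m-%d %H:%M:%S %z"))  # 2017-09-01 21:41:17 +0000
print(ratio > THRESHOLD)                      # True
```

The decoded time, 21:41:17 UTC, is the same instant as the 17:41:17 -0400 shown in the alert, and the value is the error ratio that crossed the threshold.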

At this point, the Hawkular Alerts UI provides the ability for system admins to log notes about the issue, acknowledge the alert and mark the alert resolved if the underlying issue has been fixed. These lifecycle functions (also available as REST operations) are just part of the value add of Hawkular-Alerts.

You could do more complex things such as only trigger this alert if this Prometheus query generated results AND some other condition was true (say, ElasticSearch logs match a particular pattern, or if a Kafka topic had certain data). This demo merely scratches the surface, but does show how Hawkular Alerts can be used to work with OpenTracing to provide additional capabilities that may be found useful by system administrators and IT support personnel.

Canary Deployment in OpenShift using OpenTracing based Application Metrics

18 August 2017, by Gary Brown

In a previous article we showed how OpenTracing instrumentation can be used to collect application metrics, in addition to (but independent from) reported tracing data, from services deployed within a cloud environment (e.g. Kubernetes or OpenShift).

In this article we will show how this information can be used to aid a Canary deployment strategy within OpenShift.

Figure 1: Error ratio per service and version

The updated example application

We will be using the same example as used in the previous article.

However since writing that article, the configuration of the tracer and Prometheus metrics support has been simplified. There is now no explicit configuration of either, with only some auto configuration of MetricLabel beans to identify some custom labels to be added to the Prometheus metrics, e.g.

Metrics configuration used in both services:
@Configuration
public class MetricsConfiguration {

    @Bean
    public MetricLabel transactionLabel() {
        return new BaggageMetricLabel("transaction", "n/a"); (1)
    }

    @Bean
    public MetricLabel versionLabel() {
        return new ConstMetricLabel("version", System.getenv("VERSION")); (2)
    }
}

1 This metric label identifies the business transaction associated with the metrics, which can be used to isolate the specific number of requests, duration and errors that occurred when the service was used within the particular business transaction
2 This metric label identifies the service version, which is especially useful in the Canary deployment use case being discussed in this article

The first step is to follow the instructions in the example for deploying and using the services within OpenShift.

Once the ./genorders.sh script has been running for a while, generating plenty of metrics for version 0.0.1 of the services, deploy the new version of the services. This is achieved by:

  • updating the versions in the pom.xml files within the simple/accountmgr and simple/ordermgr folders from 0.0.1 to 0.0.2

  • re-running the mvn clean install docker:build command from the simple folder

  • deploying the canary versions of the services using the command oc create -f services-canary-kubernetes.yml

As our services accountmgr and ordermgr determine the backing deployment based on the respective labels app: accountmgr and app: ordermgr, simply having a second deployment with these labels will make them serve requests in a round-robin manner.
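
The effect of a second deployment sharing the same label can be illustrated with a small Python simulation. It assumes strict rotation across pods, which is an idealisation of what the platform's load balancing does, and the pod names are invented for the example.

```python
from collections import Counter
from itertools import cycle

# Hypothetical pods selected by the label app: ordermgr (names are illustrative).
pods = [
    {"name": "ordermgr-1", "version": "0.0.1"},
    {"name": "ordermgr-canary-1", "version": "0.0.2"},
]


def distribute(pods, requests):
    """Hand out `requests` to pods in strict rotation and count hits per version."""
    rotation = cycle(pods)
    return Counter(next(rotation)["version"] for _ in range(requests))


print(dict(distribute(pods, 10)))  # {'0.0.1': 5, '0.0.2': 5}
```

With one pod per version, each version receives half of the traffic; adding more pods of one version shifts the share accordingly.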

This deployment script has been pre-configured with the 0.0.2 version, and to only start a single instance of the new version of the services. This may be desirable if you want to monitor the behaviour of the new service versions over a reasonable time period, but as we want to see results faster we will scale them up to see more activity. You can do this by expanding the deployment area for each service in the OpenShift web console and selecting the up arrow to scale up each service:

Figure 2: Scaling up canary deployment

Now we can monitor the Prometheus dashboard, using the following query, to see the error ratio per service and version:

sum(increase(span_count{error="true",span_kind="server"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind) / sum(increase(span_count{span_kind="server"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)

The result of this query can be seen in Figure 1 at the top of the article. This chart shows that version 0.0.2 of the accountmgr service has not generated any errors, while version 0.0.2 of the ordermgr service appears to be less error prone than version 0.0.1.
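
The comparison the query performs can be sketched in Python as follows. The span counts are made up for illustration and are not taken from the chart; the function names are likewise hypothetical.

```python
def error_ratios(span_counts):
    """Map (service, version) to errors / total, as the Prometheus query does."""
    ratios = {}
    for key, (errors, total) in span_counts.items():
        ratios[key] = errors / total if total else 0.0
    return ratios


def canary_is_better(ratios, service, old, new):
    """True when the new version shows a lower error ratio than the old one."""
    return ratios[(service, new)] < ratios[(service, old)]


# Invented one-minute counts of (error spans, all server spans).
counts = {
    ("accountmgr", "0.0.1"): (4, 40),
    ("accountmgr", "0.0.2"): (0, 38),
    ("ordermgr", "0.0.1"): (12, 40),
    ("ordermgr", "0.0.2"): (5, 40),
}

ratios = error_ratios(counts)
for service in ("accountmgr", "ordermgr"):
    print(service, canary_is_better(ratios, service, "0.0.1", "0.0.2"))
```

A real decision would look at the ratios over a longer period rather than a single one-minute window.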

Based on these metrics, we could decide that the new versions of these services are better than the previous, and therefore update the main service deployments to use the new versions. In the OpenShift web console you can do this by clicking the three vertical dots in the upper right hand side of the deployment region and selecting Edit YAML from the menu. This will display an editor window where you can change the version from 0.0.1 to 0.0.2 in the YAML file.

Figure 3: Update the service version

After you save the YAML configuration file, in the web console you can see the service going through a "rolling update" as OpenShift incrementally changes each service instance over to the new version.

Figure 4: Rolling update

After the rolling update has completed for both the ordermgr and accountmgr services, you can scale down or completely remove the canary version of each deployment.

An alternative to performing the rolling update would simply be to name the canary version something else (i.e. specific to the version being tested), and when it comes time to switch over, simply scale down the previous deployment version. This would be more straightforward, but wouldn’t show off the cool rolling update approach in the OpenShift web console :-)

Figure 5: Scaling down canary deployment

Although we have updated both services at the same time, this is not necessary. Generally microservices would be managed by separate teams and subject to their own deployment lifecycles.


This article has shown how application metrics, captured by instrumenting services using the OpenTracing API, can be used to support a simple Canary deployment strategy.

These metrics can similarly be used with other deployment strategies, such as A/B testing, which can be achieved using a weighted load balancing capability within OpenShift.

Advanced Behaviour Detection with Nelson Rules

15 August 2017, by Lucas Ponce

Modeling Conditions

Hawkular Alerting offers several types of Conditions for defining Triggers. Most of the Conditions deal with numeric data but String, Availability and Event data are also supported.

Modeling scenarios for detecting behaviours is highly dependent on the nature of the Domain being represented. The Domain may only require simple numeric threshold conditions to efficiently detect unexpected situations.

In other domains, it can be non-trivial to identify unusual metric variations that may lead to a problem. Simple thresholds are not expressive enough to detect metric patterns or trends that can identify potential problems.

Nelson Rules

Hawkular Alerting supports Conditions based on Nelson Rules to enable advanced detection on Numeric metrics.

These rules are based on the mean and the standard deviation of the samples and offer additional techniques for modeling complex scenarios.

For example,

{
   "trigger": {
      "id": "nelson-rule-trigger",
      "name": "Nelson Rule Trigger",
      "description": "An example Trigger that uses Nelson Rules Conditions.",
      "enabled": true
   },
   "conditions": [
      {
         "type": "NELSON", (1)
         "dataId": "metric-data-id",
         "activeRules": ["Rule1","Rule2"], (2)
         "sampleSize": 75 (3)
      }
   ]
}

  1. Mark this Condition as a NelsonRule

  2. Define the Nelson Rules to activate (Rule1, Rule2, …​, Rule8) for metric-data-id (all rules are activated by default)

  3. Define the sampleSize (by default this value is set to 50)

Each rule represents a specific pattern as described below:

Rule 1

Nelson Rule 1

One sample is grossly out of control.

Rule 2

Nelson Rule 2

Some prolonged error has been detected.

Rule 3

Nelson Rule 3

An unusual trend has been detected.

Rule 4

Nelson Rule 4

The oscillation of a metric is beyond an expected amount of noise.

Note that the rule is concerned with directionality only. The position of the mean and the size of the standard deviation have no bearing.

Rule 5

Nelson Rule 5

There is a medium tendency for samples to be mediumly out of control.

The side of the mean for the third point is unspecified.

Rule 6

Nelson Rule 6

There is a strong tendency for samples to be out of control.

Rule 7

Nelson Rule 7

A greater variation would be expected.

Rule 8

Nelson Rule 8

Jumping from above to below whilst missing the first standard deviation band is rarely random.
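
As an illustration of how such patterns can be checked, here is a minimal Python sketch of the first three rules using their commonly cited definitions (a point beyond three standard deviations, nine points on one side of the mean, six points steadily rising or falling). It is not the Hawkular Alerting implementation: the mean and standard deviation are passed in explicitly instead of being computed from the configured sample size, and the function names are made up.

```python
def rule1(samples, mean, stddev):
    """Rule 1: any sample more than 3 standard deviations from the mean."""
    return any(abs(x - mean) > 3 * stddev for x in samples)


def _longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def rule2(samples, mean, run=9):
    """Rule 2: `run` or more consecutive samples on the same side of the mean."""
    above = _longest_run(x > mean for x in samples)
    below = _longest_run(x < mean for x in samples)
    return max(above, below) >= run


def rule3(samples, run=6):
    """Rule 3: `run` or more consecutive samples steadily rising or falling."""
    pairs = list(zip(samples, samples[1:]))
    rising = _longest_run(b > a for a, b in pairs)
    falling = _longest_run(b < a for a, b in pairs)
    # A run of n points contains n - 1 consecutive steps.
    return max(rising, falling) >= run - 1


print(rule1([10, 11, 9, 25], mean=10, stddev=2))  # True: 25 is 7.5 deviations away
print(rule2([11] * 9, mean=10))                   # True: nine samples above the mean
print(rule3([1, 2, 3, 4, 5, 6]))                  # True: six samples steadily rising
```

The remaining rules follow the same shape: each one scans a window of recent samples for a specific pattern relative to the mean and the standard deviation bands.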


Applying Nelson Rules in our scenario can help to detect potential "out of control" situations.

But as discussed, modeling scenarios is highly dependent on the nature of the Domain; applying Nelson Rules is a useful tool to help identify a problem. However, the alerts are predictive, and a Domain Analyst may need to evaluate the quality of the model.

Hawkular Alerts with Prometheus, ElasticSearch, Kafka

11 August 2017, by John Mazzitelli

Federated Alerts

Hawkular Alerts aims to be a federated alerting system. That is to say, it can fire alerts and send notifications that are triggered by data coming from a number of third-party external systems.

Thus, Hawkular Alerts is more than just an alerting system for use with Hawkular Metrics. In fact, Hawkular Alerts can be used independently of Hawkular Metrics. This means you do not even have to be using Hawkular Metrics to take advantage of the functionality provided by Hawkular Alerts.

This is a key differentiator between Hawkular Alerts and other alerting systems. Most alerting systems only alert on data coming from their respective storage systems (e.g. the Prometheus Alert Engine alerts only on Prometheus data). Hawkular Alerts, on the other hand, can trigger alerts based on data from various systems.

Alerts vs. Events

Before we begin, a quick clarification is in order. When it is said that Hawkular Alerts fires an "alert" it means some data came into Hawkular Alerts that matched some conditions which triggered the creation of an alert in Hawkular Alerts backend storage (which can then trigger additional actions such as sending emails or calling a webhook). An "alert" typically refers to a problem that has been detected, and someone should take action to fix it. An alert has a lifecycle attached to it - alerts are opened, then acknowledged by some user who will hopefully fix the problem, then resolved when the problem can be considered closed.

However, there can be conditions that occur that do not represent problems but nevertheless are events you want recorded. There is no lifecycle associated with events and no additional actions are triggered by events, but "events" are fired by Hawkular Alerts in the same general manner as "alerts" are.

In this document, when it is said that Hawkular Alerts can fire "alerts" based on data coming from external third-party systems such as Prometheus, ElasticSearch, and Kafka, this also means events can be fired as well as alerts. In other words, you can record any event (not just a "problem", aka "alert") that can be gleaned from the data coming from external third-party systems.

See alerting philosophy for more.
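
The difference can be pictured with a small Python sketch of the alert lifecycle. The status names follow the open, acknowledged and resolved steps described above; the allowed transitions and the class itself are assumptions made for illustration, not the Hawkular Alerts data model. An event, by contrast, would simply be a recorded item with no status at all.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


# Transitions allowed in this sketch; the real engine may permit others.
TRANSITIONS = {
    Status.OPEN: {Status.ACKNOWLEDGED, Status.RESOLVED},
    Status.ACKNOWLEDGED: {Status.RESOLVED},
    Status.RESOLVED: set(),
}


class Alert:
    """An alert starts open and moves forward through its lifecycle."""

    def __init__(self, text):
        self.text = text
        self.status = Status.OPEN
        self.notes = []

    def move_to(self, new_status, note=None):
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status.name} to {new_status.name}")
        self.status = new_status
        if note:
            self.notes.append(note)


alert = Alert("Inventory of widgets is low")
alert.move_to(Status.ACKNOWLEDGED, note="looking into it")
alert.move_to(Status.RESOLVED, note="restocked")
print(alert.status.name, alert.notes)
```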


There is a recorded demo found here that will illustrate what this document is describing. After you read this document, you should watch the demo to gain further clarity on what is being explained. The demo is the multiple-sources example which you can run yourself found here (note: at the time of writing, this example is only found in the next branch, to be merged in master soon).


Hawkular Alerts can take the results of Prometheus metric queries and use the queried data for triggers that can fire alerts.

This Hawkular Alerts trigger will fire an alert (and send an email) when a Prometheus metric indicates our store’s inventory of widgets is consistently low (as defined by the Prometheus query you see in the "expression" field of the condition):

{
   "trigger": {
      "id": "low-stock-prometheus-trigger",
      "name": "Low Stock",
      "description": "The number of widgets in stock is consistently low.",
      "severity": "MEDIUM",
      "enabled": true,
      "tags": {
         "prometheus": "Prometheus"
      },
      "actions": [
         {
            "actionPlugin": "email",
            "actionId": "email-notify-owner"
         }
      ]
   },
   "conditions": [
      {
         "type": "EXTERNAL",
         "alerterId": "prometheus",
         "dataId": "prometheus-dataid",
         "expression": "rate(products_in_inventory{product=\"widget\"}[30s])<2"
      }
   ]
}

Integration with Prometheus Alert Engine

As a side note, though not demonstrated in the example, Hawkular Alerts also has an integration with Prometheus' own Alert Engine. This means the alerts generated by Prometheus itself can be forwarded to Hawkular Alerts, which can in turn use them for additional processing, perhaps combined with data that is unavailable to Prometheus, to fire other alerts. For example, Hawkular Alerts can take Prometheus alerts as input and feed them into other conditions that trigger on the Prometheus alert along with ElasticSearch logs.


Hawkular Alerts can examine logs stored in ElasticSearch and trigger alerts based on patterns that match within the ElasticSearch log messages.

This Hawkular Alerts trigger will fire an alert (and send an email) when ElasticSearch logs indicate sales are being lost due to items being out of stock (as defined by the condition which looks for a log category of "FATAL", which happens to mean a lost sale in the case of the store’s logs). Notice dampening is enabled on this trigger - this alert will only fire once the logs have indicated a lost sale 3 times.

{
   "trigger": {
      "id": "lost-sale-elasticsearch-trigger",
      "name": "Lost Sale",
      "description": "A sale was lost due to inventory out of stock.",
      "severity": "CRITICAL",
      "enabled": true,
      "tags": {
         "Elasticsearch": "Localhost instance"
      },
      "context": {
         "timestamp": "@timestamp",
         "filter": "{\"match\":{\"category\":\"inventory\"}}",
         "interval": "10s",
         "index": "store",
         "mapping": "level:category,@timestamp:ctime,message:text,category:dataId,index:tags"
      },
      "actions": [
         {
            "actionPlugin": "email",
            "actionId": "email-notify-owner"
         }
      ]
   },
   "dampenings": [
      {
         "triggerMode": "FIRING",
         "evalTrueSetting": 3
      }
   ],
   "conditions": [
      {
         "type": "EVENT",
         "dataId": "inventory",
         "expression": "category == 'FATAL'"
      }
   ]
}

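The two less obvious pieces of this trigger, the field mapping and the dampening, can be sketched in Python. This is an interpretation rather than the alerter's code: it assumes each mapping pair reads as source-field:event-field (which is consistent with the condition testing category against a log level), and that the dampening requires three consecutive true evaluations. The log document is invented.

```python
MAPPING = "level:category,@timestamp:ctime,message:text,category:dataId,index:tags"


def to_event(document, mapping=MAPPING):
    """Copy Elasticsearch fields into event fields; each pair is read as source:target."""
    event = {}
    for pair in mapping.split(","):
        source, target = pair.split(":")
        if source in document:
            event[target] = document[source]
    return event


class StrictDampening:
    """Report True only once `needed` consecutive evaluations were true."""

    def __init__(self, needed=3):
        self.needed = needed
        self.streak = 0

    def evaluate(self, condition_met):
        self.streak = self.streak + 1 if condition_met else 0
        if self.streak >= self.needed:
            self.streak = 0  # start counting again after firing
            return True
        return False


# Invented log document for illustration.
doc = {
    "level": "FATAL",
    "category": "inventory",
    "message": "Lost sale: widget out of stock",
    "@timestamp": "2017-08-11T10:00:00Z",
}
event = to_event(doc)
dampening = StrictDampening(needed=3)
fired = [dampening.evaluate(event["category"] == "FATAL") for _ in range(3)]
print(event["dataId"], fired)  # inventory [False, False, True]
```

Under this reading, the log level becomes the event category and the log category becomes the dataId, which is why the condition can use dataId "inventory" and test for category 'FATAL'.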

Hawkular Alerts can examine data retrieved from Kafka message streams and trigger alerts based on that Kafka data.

This Hawkular Alerts trigger will fire an alert when data over a Kafka topic indicates a large purchase was made to fill the store’s inventory (as defined by the condition which evaluates to true when any number over 17 is received on the Kafka topic):

{
   "trigger": {
      "id": "large-inventory-purchase-kafka-trigger",
      "name": "Large Inventory Purchase",
      "description": "A large purchase was made to restock inventory.",
      "severity": "LOW",
      "enabled": true,
      "tags": {
         "Kafka": "Localhost instance"
      },
      "context": {
         "topic": "store",
         "kafka.bootstrap.servers": "localhost:9092",
         "kafka.group.id": "hawkular-alerting"
      },
      "actions":[ ]
   },
   "conditions": [
      {
         "type": "THRESHOLD",
         "dataId": "store",
         "operator": "GT",
         "threshold": 17
      }
   ]
}

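How such a threshold condition is evaluated can be sketched in a few lines of Python. Only the GT operator appears in the trigger above; the other operator names in the table are assumed by analogy, and the function is illustrative rather than part of Hawkular Alerts.

```python
import operator

# Only GT is shown in the trigger; the other names are assumed by analogy.
OPERATORS = {
    "GT": operator.gt,
    "GTE": operator.ge,
    "LT": operator.lt,
    "LTE": operator.le,
}


def matches(condition, data_id, value):
    """True when `value`, received for `data_id`, satisfies the threshold condition."""
    if condition["type"] != "THRESHOLD" or condition["dataId"] != data_id:
        return False
    return OPERATORS[condition["operator"]](value, condition["threshold"])


condition = {"type": "THRESHOLD", "dataId": "store", "operator": "GT", "threshold": 17}

print(matches(condition, "store", 18))  # True
print(matches(condition, "store", 17))  # False: 17 is not over 17
```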
But, Wait! There’s More!

The above only mentions the different ways Hawkular Alerts retrieves data for use in determining what alerts to fire. What is not covered here is the fact that Hawkular Alerts can stream data in the other direction as well - Hawkular Alerts can send alert and event data to things like an ElasticSearch server or a Kafka broker. There are additional examples (mentioned below) that demonstrate this capability.

The point is Hawkular Alerts should be seen as a common alerting engine that can be shared by multiple third-party systems and used as both a consumer and a producer - as a consumer of data from external third-party systems (which is used to fire alerts and events) and as a producer that sends notifications of alerts and events to external third-party systems.

More Examples

Take a look at the Hawkular Alerts examples for more examples on using external systems as data to be used for triggering alerts. (note: at the time of writing, some examples are currently in the next branch such as the Kafka ones).



© 2016 | Hawkular is released under Apache License v2.0