Hawkular Blog

Running Hawkular Agent

11 December 2017, by John Mazzitelli

Now that Hawkular is moving towards integration with Prometheus as its metrics collection and storage system, the agent has had some changes. This document discusses those changes and how to run the Hawkular Agent, specifically version 2.0.0.Final.

First, the Hawkular Agent now runs only as a javaagent; it no longer runs inside WildFly as a subsystem extension. To run the agent, attach it as a javaagent to whatever JVM you want to monitor (be it a WildFly or EAP app server or any JVM application that exposes metrics via JMX). For example, add this command line option to the command that starts the JVM:

-javaagent:hawkular-javaagent.jar=config=hawkular-javaagent-config.yaml,delay=10

The configuration file contains all the settings necessary to start the agent and connect to the Hawkular Server. Once connected, the agent will pull down additional configuration settings from the server.
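As background on the mechanism: the JVM hands everything after the = sign in the -javaagent option to the agent as a single string, via a premain method. The sketch below shows how such a string could be split into options. The class name AgentArgsSketch and the parsing rules are assumptions for illustration, not the Hawkular Agent’s actual code.

```java
import java.lang.instrument.Instrumentation;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: NOT the Hawkular Agent's real argument parser.
final class AgentArgsSketch {

    // The JVM calls premain with the text that follows "=" in -javaagent:<jar>=<args>.
    public static void premain(String agentArgs, Instrumentation inst) {
        Map<String, String> options = parse(agentArgs);
        System.out.println("agent options: " + options);
    }

    // Splits "key=value,key=value" into an ordered map.
    static Map<String, String> parse(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.trim().isEmpty()) {
            return options;
        }
        for (String pair : agentArgs.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                options.put(pair.trim(), ""); // option given without a value
            } else {
                options.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse("config=hawkular-javaagent-config.yaml,delay=10");
        System.out.println("config file: " + options.get("config"));
        System.out.println("delay: " + options.get("delay"));
    }
}
```

Running main parses the same option string used in the example command line above and prints the two values it contains.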

Running in WildFly

If you are attaching the agent to a WildFly server, you must configure some additional settings.

If you are running WildFly in standalone mode, the standalone.conf file should have something like:

JBOSS_MODULES_SYSTEM_PKGS="org.jboss.byteman,org.jboss.logmanager"
...
JAVA_OPTS="$JAVA_OPTS -Djava.util.logging.manager=org.jboss.logmanager.LogManager"
...
JAVA_OPTS="$JAVA_OPTS -javaagent:$JBOSS_HOME/bin/hawkular-javaagent.jar=config=$JBOSS_HOME/standalone/configuration/hawkular-javaagent-config.yaml,delay=10"

If you are running WildFly in domain mode, the host controller’s domain.conf file should have something like:

HOST_CONTROLLER_JAVA_OPTS="$HOST_CONTROLLER_JAVA_OPTS -Dhawkular.agent.metadata.domain=true \
   -Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager \
   -Djava.util.logging.manager=org.jboss.logmanager.LogManager \
   -Djboss.host.server-excluded-properties=jboss.modules.system.pkgs,java.util.logging.manager,hawkular.agent.metadata.domain \
   -javaagent:$JBOSS_HOME/bin/hawkular-javaagent.jar=config=$JBOSS_HOME/domain/configuration/hawkular-javaagent-config-domain.yaml,delay=10"

Running in WildFly Domain Mode

When running WildFly in domain mode, a Hawkular Agent runs in the host controller. A Hawkular Agent must also run in each of the slave servers spawned by the host controller. You ensure an agent is attached to each slave server by adding the proper -javaagent command line option to the server JVM command found in the host controller’s host.xml. You also have to tell each slave server’s agent which port to bind its metrics exporter endpoint to. For example:

  <jvms>
    <jvm name="default">
      <jvm-options>
        ...
        <!-- HAWKULAR SETTINGS -->
        <option value="-Djboss.modules.system.pkgs=org.jboss.byteman,org.jboss.logmanager"/>
        <option value="-Djava.util.logging.manager=org.jboss.logmanager.LogManager"/>
        <option value="-javaagent:${jboss.home.dir}/bin/hawkular-javaagent.jar=config=${jboss.domain.config.dir}/hawkular-javaagent-config-metrics-only.yaml,delay=10"/>
      </jvm-options>
    </jvm>
  </jvms>
  <servers>
    <server name="server-one" group="main-server-group">
      <jvm name="default">
        <jvm-options>
          <!-- HAWKULAR SETTINGS -->
          <option value="-Dhawkular.agent.metrics.port=9780"/>
        </jvm-options>
      </jvm>
    </server>
  </servers>
  ...

The slave server agents are to be run in metrics-only mode (that is, no managed-servers section is defined in their configuration files).

Example

To see an example WildFly distribution configured to run the Hawkular Agent in either standalone or domain mode, build from source or grab from a Maven repository the Maven artifact "org.hawkular.agent:hawkular-javaagent-wildfly-dist". Look at its bin/ directory for the domain.conf and standalone.conf files and look in the standalone/configuration and domain/configuration directories for the different agent configuration files. Note also the host-hawkular.xml that is to be used with the --host-config option when running in domain mode.

How It Works

When the agent starts, it reads its configuration file from disk. The agent then attempts to connect to the Hawkular Server. The agent will download some additional configuration from the Hawkular Server that it overlays on top of its existing configuration (this pulls down specific metadata for the types of resources it is to manage).

The agent will also pull down a separate configuration file that is used to configure the Prometheus JMX Exporter which is started by the agent. This is called the "Metrics Exporter". Thus you will get an agent that collects inventory and stores it in Hawkular, but you will also have a metrics exporter that will be used to expose metrics for collection by Hawkular’s Prometheus Server.

The agent determines which configuration files to pull down via the settings in the local agent configuration file. The local agent configuration file is the one named in the -javaagent command line option (for example, this-config-file.yaml in -javaagent:hawkular-javaagent.jar=config=this-config-file.yaml,delay=10). The relevant settings are:

subsystem:
  type-version: WF10
...
metrics-exporter:
  config-file: WF10

As an example, if the agent is to monitor EAP6, you would replace "WF10" with "EAP6". As more supported products are added, additional configuration files are loaded on the Hawkular Server which can then be pulled down by the agent by simply changing the values of these type-version and config-file settings.

Note	A WildFly host controller agent must be told to download the metrics exporter config file "WF10-DOMAIN", while its type-version should still be "WF10". See the WildFly+Agent distro (Maven artifact org.hawkular.agent:hawkular-javaagent-wildfly-dist) for an example.

When an agent itself is registered in inventory on the Hawkular Server, Hawkular’s Prometheus Server will be told to scrape the agent’s metrics exporter endpoint, thus the agent’s metrics are collected automatically.

How It Works with WildFly Domain Mode

The agent runs in the host controller. The host controller’s agent can get all of the inventory for the host controller and its slave servers. A metrics exporter also runs in the host controller’s agent, so metrics for the host controller can be collected.

An agent runs in each slave server as well, but those agents run in "metrics-only" mode. No inventory is collected by these agents, but they do start a metrics exporter so metrics from each slave server can be collected.

To configure the Prometheus JMX Exporter in the slave server’s agent, you must turn it on in "slave proxy mode" via the settings shown below, within the local agent configuration file. This is the agent configuration file you told the slave servers to use in their -javaagent command line argument; see the host.xml entry you added for the slave servers. For example:

<jvms>
  <jvm name="default">
    <jvm-options>
      <option value="-javaagent:${jboss.home.dir}/bin/hawkular-javaagent.jar=config=${jboss.domain.config.dir}/hawkular-javaagent-config-metrics-only.yaml,delay=10"/>

metrics-exporter:
  proxy:
    mode: slave
    data-dir: ${jboss.domain.base.dir}/data/hawkular-metrics-exporter

The host controller’s agent must turn on the Prometheus JMX Exporter in "master proxy mode" in the host controller agent’s local configuration file (this is the configuration file you told the host controller agent to use in its -javaagent command line option within the domain.conf file):

metrics-exporter:
  proxy:
    mode: master
    data-dir: ${jboss.domain.data.dir}/hawkular-metrics-exporter

Note that the slaves and the master must use the same data directory. It is recommended to use a subdirectory under WildFly’s domain/data directory, as in the examples above.

Under the covers, each slave server writes a file to the data directory describing the metrics exporter endpoint it started. The master collects this information from all slaves and makes sure the Hawkular Server tells the Prometheus Server to scrape those slave endpoints as well as the host controller agent’s own metrics exporter endpoint.
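The hand-off through the shared data directory can be sketched as follows. The file naming (one "<server-name>.endpoint" file per slave), the file content (a single host:port line) and the second port number are assumptions made for illustration; the agent’s real on-disk format is not described in this post.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the slave/master hand-off; not the agent's real code or file format.
final class ProxyDataDirSketch {

    // Creates a scratch directory that stands in for the shared data-dir.
    static Path newTempDir() {
        try {
            return Files.createTempDirectory("hawkular-metrics-exporter");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Slave side: record the endpoint the local metrics exporter bound to.
    static void publishEndpoint(Path dataDir, String serverName, String hostAndPort) {
        try {
            Files.createDirectories(dataDir);
            Files.write(dataDir.resolve(serverName + ".endpoint"), // assumed file naming
                    hostAndPort.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Master side: gather every endpoint the slaves have published so far.
    static List<String> collectEndpoints(Path dataDir) {
        List<String> endpoints = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dataDir, "*.endpoint")) {
            for (Path file : files) {
                endpoints.add(new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(endpoints); // stable order for display
        return endpoints;
    }

    public static void main(String[] args) {
        Path dataDir = newTempDir();
        publishEndpoint(dataDir, "server-one", "127.0.0.1:9780");
        publishEndpoint(dataDir, "server-two", "127.0.0.1:9781");
        System.out.println("endpoints to scrape: " + collectEndpoints(dataDir));
    }
}
```

The sketch also shows why both sides must point at the same directory: the master can only discover what the slaves wrote where it is looking.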

Hawkular Alerts in ManageIQ

11 October 2017, by Edgar Hernández

The Hawkular project includes an alerting module that can be used to send notifications when certain conditions are met. The alerting module is bundled in Hawkular Middleware Manager (also known as Hawkular Services).

ManageIQ also has alerting capabilities, and Hawkular Middleware Manager integrates with them. Once Hawkular is added as a provider, ManageIQ alerting features can be used to monitor servers managed by Hawkular.

ManageIQ alerting terminology

In ManageIQ, two terms are used:

Alert: Defines which type of infrastructure item should be monitored, the conditions, and the actions to take when the conditions are met.
Alert profile: A relation between a set of alerts and a set of infrastructure items. All alerts in the profile are applied to all infrastructure items in the profile. If any of the infrastructure items meets the conditions in one of the alerts, the actions of that alert will run.

ManageIQ has support for several types of infrastructure items, but Hawkular Middleware Manager supports only a subset of them. Middleware servers are an example of infrastructure items that are supported by Hawkular.

Hawkular alerting terminology

Hawkular’s alerting module has several kinds of objects (and terms). The two objects that are relevant from a ManageIQ perspective are:

Group trigger: Defines the set of conditions that should be met to fire an alert or an event. A group trigger is a template; no alerts or events will be fired until a member is added to it.
Group member: It’s an association of a group trigger with the actual data to be evaluated. Internally to Hawkular, group members are managed instances of group triggers. Changes to the group trigger are pushed down to the members.

Currently, ManageIQ creates group triggers that raise events (not alerts) and polls Hawkular to catch the events and run configured actions, if needed.

Terminology relationships

When a user creates a ManageIQ alert, a Hawkular group trigger is created in the background. From then, any changes to the ManageIQ alert are replicated to the associated Hawkular group trigger until the alert is deleted, which causes the group trigger to also be deleted.

While ManageIQ alerts have a direct relation with Hawkular group triggers, ManageIQ alert profiles are a little bit more complicated. For each alert in a ManageIQ profile, one Hawkular group member is created for each infrastructure item in the profile.

This means that nothing is created in Hawkular if:

The alert profile is empty (has neither alerts nor infrastructure items)
The alert profile has alerts but has no infrastructure items
The alert profile has infrastructure items but has no alerts

If a ManageIQ alert profile has one alert and multiple assigned infrastructure items, then the relevant Hawkular group trigger will have as many members as there are infrastructure items in the ManageIQ alert profile.

If a ManageIQ alert profile has multiple alerts and only one assigned infrastructure item, then each Hawkular group trigger will have one member.

Combining these two examples: if the ManageIQ alert profile has a set of alerts and a set of infrastructure items, each alert’s Hawkular group trigger will have one member per infrastructure item in the profile.
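The relationship described in this section can be modelled in a few lines. This is only an illustrative model: the class, the alert names and the server names are hypothetical, and no ManageIQ or Hawkular API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the alert profile mapping; all names are illustrative.
final class GroupMemberSketch {

    // One group member per (alert, infrastructure item) pair in the profile.
    static List<String> membersFor(List<String> alerts, List<String> items) {
        List<String> members = new ArrayList<>();
        for (String alert : alerts) {
            for (String item : items) {
                members.add(alert + " -> " + item);
            }
        }
        return members;
    }

    public static void main(String[] args) {
        List<String> alerts = Arrays.asList("high-heap", "slow-requests");
        List<String> servers = Arrays.asList("server-a", "server-b", "server-c");

        // Two alerts and three servers: each group trigger gets three members.
        for (String member : membersFor(alerts, servers)) {
            System.out.println(member);
        }

        // A profile with alerts but no servers creates nothing in Hawkular.
        System.out.println(membersFor(alerts, Collections.<String>emptyList()).size());
    }
}
```

If either side of the profile is empty, the nested loop produces no members, which corresponds to the three "nothing is created" cases listed earlier.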

Creating ManageIQ alerts

Alerts are created by navigating to Control > Explorer > Alerts. In the alerts tree, select the All alerts folder. This will enable the Add a New Alert option under the Configuration button.

In the form to create/edit an alert, be sure to select Middleware server in the Based On field. Currently, this is the only infrastructure item supported by Hawkular. If you choose something else, the alert won’t be managed by Hawkular. All other options can be filled as desired.

When the alert is created, it will be available in the control explorer and will be available to be included in an alert profile. In the background, a Hawkular group trigger is also created.

Creating an alert profile

Alert profiles are created by navigating to Control > Explorer > Alert Profiles. In the alerts tree, select the Middleware server Alert Profiles item. This will enable the Add a Middleware Server Alert Profile option under the Configuration button.

In the form to create/edit an alert profile, write a description and select the desired alerts to evaluate. At least one alert is required to create the alert profile.

When the alert profile is created, it will be available in the control explorer which will also list the alerts contained in the profile. Nothing will be created in Hawkular because when the profile is created it is still not assigned to middleware servers.

Assigning middleware servers to an alert profile

To assign middleware servers to an alert profile, select the desired profile in the control explorer. In the toolbar, use the Edit assignments for this Alert Profile option under the Configuration button.

This will show the assignments page. An alert profile can be assigned to specific middleware servers or to all inventoried middleware servers (The Enterprise).

Once you have chosen the desired middleware servers (or the enterprise) and changes are saved, the view page of the alert will be displayed again.

In the background, Hawkular group members will be created to make the configuration effective, and alerts should start triggering.

Viewing alerts in the timeline

The timeline of the ManageIQ Hawkular provider will log events if the alert is configured to show timeline events. The Hawkular provider’s timeline can be accessed through the summary page of the provider, under the Monitoring menu:

ManageIQ Hawkular provider timeline menu button

If the alert has enabled the standard Show on Timeline configuration, the options to query the events are:

Event type: Management Events
Category: Alarm/Status Change/Errors

Conclusion

ManageIQ and Hawkular, although two independent projects, can be connected to complement each other’s features. This post discussed how the alerting integration works and how to configure a basic alert.

At the time of writing, ManageIQ supports only Middleware Servers as targets, and only a limited set of metrics is available to configure alerts. There is ongoing work to provide a wider range of metrics, which is expected to be available in future versions.

Connecting Hawkular Agent to Hawkular Services over SSL

19 September 2017, by Josejulio Martínez

SSL provides identification of the communicating parties, confidentiality and integrity of the information being shared.

In production environments, network eavesdropping could be a concern. You can use SSL to provide a secure communication channel between servers.

Hawkular Services expects Hawkular Agents to connect, push information and listen for commands. These commands also include server credentials that could be vulnerable to a man-in-the-middle attack.

A previous article shows how to configure your Hawkular Services with SSL and have Hawkular Agents connect to it through SSL. This guide shows how this is done for Dockerized Hawkular Services and Agents.

Preparing your certificates

Before starting, we need to prepare the certificates that we are going to use. The public and private keys for Hawkular Services need to be in PEM or PKCS12 format. For this guide we will use PEM.

We can create self-signed certificates (in PEM format) using:

keytool -genkey -keystore hawkular.keystore -alias hawkular -dname "CN=hawkular-services" -keyalg RSA -storepass hawkular -keypass hawkular -validity 36500 -ext san=ip:127.0.0.1,dns:hawkular-services
keytool -importkeystore -srckeystore hawkular.keystore  -destkeystore hawkular.p12 -deststoretype PKCS12 -srcalias hawkular -deststorepass hawkular -destkeypass hawkular -srcstorepass hawkular
openssl pkcs12 -in hawkular.p12 -password pass:hawkular -nokeys -out hawkular-services-public.pem
openssl pkcs12 -in hawkular.p12 -password pass:hawkular -nodes -nocerts -out hawkular-services-private.key

Important

When creating the certificates, remember to include the host that Hawkular will use (as Common Name (CN) and Subject Alternative Name (SAN)); otherwise certificate validation will fail. Throughout this guide, the host is set to hawkular-services.

Warning

Always include san=ip:127.0.0.1 in your certificate, as this will be used internally by Hawkular Services.
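One way to double-check a generated certificate before using it is to list its SAN entries. The sketch below uses the standard java.security.cert API; the class name SanCheckSketch is illustrative, and it assumes the PEM file passed as the first argument (defaulting to hawkular-services-public.pem) holds a single X.509 certificate.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.Collection;
import java.util.List;

// Illustrative helper for inspecting a certificate's Subject Alternative Names.
final class SanCheckSketch {

    static final int DNS_NAME = 2;   // GeneralName type code for dNSName
    static final int IP_ADDRESS = 7; // GeneralName type code for iPAddress

    // Each SAN entry is a list whose first element is the type code and second the value.
    static boolean hasSan(Collection<List<?>> sans, int type, String value) {
        if (sans == null) {
            return false; // the certificate carries no SAN extension
        }
        for (List<?> san : sans) {
            if (san.size() >= 2
                    && Integer.valueOf(type).equals(san.get(0))
                    && value.equals(san.get(1))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        String pemFile = args.length > 0 ? args[0] : "hawkular-services-public.pem";
        try (InputStream in = Files.newInputStream(Paths.get(pemFile))) {
            X509Certificate cert = (X509Certificate)
                    CertificateFactory.getInstance("X.509").generateCertificate(in);
            Collection<List<?>> sans = cert.getSubjectAlternativeNames();
            System.out.println("dns:hawkular-services present: "
                    + hasSan(sans, DNS_NAME, "hawkular-services"));
            System.out.println("ip:127.0.0.1 present: "
                    + hasSan(sans, IP_ADDRESS, "127.0.0.1"));
        }
    }
}
```

Both lines should report true for a certificate created with the keytool command above; a false result points to a missing -ext san entry.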

Starting Cassandra

Hawkular Services requires a Cassandra instance. You can start one with:

docker run --name hawkular-cassandra -e CASSANDRA_START_RPC=true -d cassandra:3.0.12

Starting Hawkular Services on SSL

Now that a Cassandra instance is ready, Hawkular Services can be started. It will be linked to the Cassandra instance (hawkular-cassandra):

docker pull hawkular/hawkular-services
docker run --name hawkular-services --link=hawkular-cassandra -e CASSANDRA_NODES=hawkular-cassandra -e HAWKULAR_HOSTNAME=hawkular-services -e HAWKULAR_USE_SSL=true -p 8443:8443 -v `pwd`/hawkular-services-private.key:/client-secrets/hawkular-services-private.key:z -v `pwd`/hawkular-services-public.pem:/client-secrets/hawkular-services-public.pem:z hawkular/hawkular-services

Note	To avoid guessing the host, we explicitly set it to hawkular-services using the HAWKULAR_HOSTNAME environment variable.

Warning

If you don’t specify any certificate, Hawkular Services will create one, but we won’t be able to connect agents unless we export that certificate.

Starting a Hawkular Agent

By now there should be a Hawkular Services instance listening on the default secure port 8443. Any agent that wants to connect will need to know and trust its certificate. If you are self-signing your certificate, you will need to pass the Hawkular Services certificate when starting the agent.

docker pull hawkular/wildfly-hawkular-javaagent
docker run --name hawkular-agent-01 --link=hawkular-services -e HAWKULAR_SERVER_PROTOCOL=https -e HAWKULAR_SERVER_ADDR=hawkular-services -e HAWKULAR_SERVER_PORT=8443 -v `pwd`/hawkular-services-public.pem:/client-secrets/hawkular-services-public.pem:z hawkular/wildfly-hawkular-javaagent

Note	The host must match the one specified in the certificate, so HAWKULAR_SERVER_ADDR must be set to hawkular-services. Update as needed if you use another value.

Testing the setup

Any Hawkular client that supports connecting using SSL can be used. Curl, ManageIQ and HawkFX are shown below.

Note	If using self-signed certificates, the client machine needs to trust the Hawkular Services certificate, or the client must let you specify which certificate to trust. See here for more info.

Curl

Curl is the simplest way to test: we only need to tell curl how to resolve hawkular-services and pass the public certificate.

curl -H "Hawkular-Tenant: hawkular" --resolve "hawkular-services:8443:127.0.0.1" --cacert hawkular-services-public.pem -X GET https://hawkular-services:8443/hawkular/metrics/metrics

It will output metrics stored on Hawkular Services.

ManageIQ

It can be quickly tested using the dockerized ManageIQ as follows:

docker run --link=hawkular-services --privileged -d -p 8444:443 manageiq/manageiq:fine-3

Once ManageIQ has started, add a Middleware Provider. There are three secure ways to do that:

(1) SSL: Requires that the client box trusts the Hawkular Services certificate.
(2) SSL trusting custom CA: Requires providing the certificate to trust.
(3) SSL without validation: No validation is performed.

We will focus on (2) and (3), as (1) requires trusting the certificate on the machine itself.

Before going into details, navigate to https://localhost:8444/ and log in with the default username admin and password smartvm. Click on Middleware → Configuration → Add a New Middleware Provider.

SSL trusting custom CA

We need to select SSL trusting custom CA in Security Protocol and copy the certificate from the contents of hawkular-services-public.pem. Fill the Hostname with the one used in the certificate and complete the required information.

Figure 1: Middleware Provider SSL trusting custom ca

SSL without validation

We need to select SSL without validation in Security Protocol. No certificate validation is performed with this option, so only fill in the required information.

Figure 2: Middleware Provider SSL without validation

HawkFX

One can connect using HawkFX by either installing the certificate on the machine or disabling the verification as shown in the image. You will also need to update your /etc/hosts to resolve hawkular-services.

sudo su -c "echo '127.0.0.1 hawkular-services' >> /etc/hosts"

Alternatively, if no verification is used, one could use localhost instead of hawkular-services.

Figure 3: HawkFX without validation

Conclusion

Securing communications between a dockerized Hawkular Agent and Hawkular Services is possible with self-signed certificates. Connecting clients is also possible with the additional step of providing the certificate.

Hawkular Alerts with OpenTracing

06 September 2017, by John Mazzitelli

Two recent blogs discuss how OpenTracing instrumentation can be used to collect application metrics:

A further interesting integration can be the addition of Hawkular Alerts to the environment.

As the previous blog and demo discuss, Hawkular Alerts is a generic, federated alerts system that can trigger events, alerts, and notifications from different, independent systems such as Prometheus, ElasticSearch, and Kafka.

Here we can combine the two. Let’s follow the directions for the OpenTracing demo (using the Jaeger implementation) and add Hawkular Alerts.

What this can show is OpenTracing application metrics triggering alerts when (as in this example) OpenTracing spans encounter a larger-than-expected error rate.

(Note: these instructions assume you are using Kubernetes / Minikube - see the Hawkular OpenTracing blogs linked above for more details on these instructions)

START KUBERNETES

Here we start minikube giving it enough resources to run all of the pods necessary for this demo. We also start up a browser pointing to the Kubernetes dashboard, so you can follow the progress of the remaining instructions.

minikube start --cpus 4 --memory 8192

minikube dashboard

DEPLOY PROMETHEUS

kubectl create -f https://raw.githubusercontent.com/coreos/prometheus-operator/v0.11.0/bundle.yaml

kubectl create -f https://raw.githubusercontent.com/objectiser/opentracing-prometheus-example/master/prometheus-kubernetes.yml

(Note: the last command might not work depending on your version - if you get errors, download a copy of prometheus-kubernetes.yml and edit it, changing “v1alpha1” to “v1”)

DEPLOY JAEGER

kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-kubernetes/master/all-in-one/jaeger-all-in-one-template.yml

The following will build and deploy the Jaeger example code that will produce the OpenTracing data for the demo:

mkdir -p ${HOME}/opentracing ; cd ${HOME}/opentracing

git clone git@github.com:objectiser/opentracing-prometheus-example.git

cd opentracing-prometheus-example/simple

eval $(minikube docker-env)

mvn clean install docker:build

kubectl create -f services-kubernetes.yml

(Note: The last command might not work depending on your version - if you get errors, edit services-kubernetes.yml, changing “v1alpha1” to “v1”)

DEPLOY HAWKULAR-ALERTS AND CREATE ALERT TRIGGER

The following will deploy Hawkular Alerts and create the trigger definition that will trigger an alert when the Jaeger OpenTracing data indicates an error rate that is over 30%:

kubectl create -f https://raw.githubusercontent.com/hawkular/hawkular-alerts/master/dist/hawkular-alerts-k8s.yml

Next use minikube service hawkular-alerts --url to determine the Hawkular Alerts URL and point your browser to the path “/hawkular/alerts/ui” at that URL (i.e. http://host:port/hawkular/alerts/ui).

From the browser page running the Hawkular Alerts UI, enter a tenant name in the top right text field (“my-organization” for example) and click the “Change” button.

Navigate to the “Triggers” page (found in the left-hand nav menu).

Click the kebab menu icon at the top and select “New Trigger”.

In the text area, enter the following to define a new trigger that will trigger alerts when the Prometheus query shows that there is an error rate above 30% in the accountmgr or ordermgr services:

{
   "trigger":{
      "id":"jaeger-prom-trigger",
      "name":"High Error Rate",
      "description":"Data indicates high error rate",
      "severity":"HIGH",
      "enabled":true,
      "autoDisable":false,
      "tags":{
         "prometheus":"Test"
      },
      "context":{
         "prometheus.url":"http://prometheus:9090"
      }
   },
   "conditions":[
      {
         "type":"EXTERNAL",
         "alerterId":"prometheus",
         "dataId":"prometheus-test",
         "expression":"(sum(increase(span_count{error=\"true\",span_kind=\"server\"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind) / sum(increase(span_count{span_kind=\"server\"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)) > 0.3"
      }
   ]
}

Figure 1: Create New Alert Trigger

Figure 2: Alert Trigger

Now navigate back to the “Dashboard” page (again via the left-hand nav menu). From this Dashboard page, look for alerts when they are triggered. We’ll next start generating the data that will trigger these alerts.

GENERATE SOME SAMPLE OPEN TRACING APPLICATION DATA

export ORDERMGR=$(minikube service ordermgr --url)

${HOME}/opentracing/opentracing-prometheus-example/simple/genorders.sh

Once the data starts to be collected, you will see alerts in the Hawkular Alerts UI as error rates exceed 30% over the past minute (as per the Prometheus query).

Figure 3: Alerts Dashboard

Figure 4: Alert

If you look at the alerts information in the Hawkular Alerts UI, you’ll see the conditions that triggered the alerts. For example, one such alert could look like this:

Time: 2017-09-01 17:41:17 -0400
External[prometheus]: prometheus-test[Event [tenantId=my-organization,
id=1a81471d-340d-4dba-abe9-5b991326dc80, ctime=1504302077288, category=prometheus,
dataId=prometheus-test, dataSource=none, text=[1.504302077286E9, 0.3333333333333333],
context={service=ordermgr, version=0.0.1}, tags={}, trigger=null]] matches
[(sum(increase(span_count{error="true",span_kind="server"}[1m])) without
(pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind) /
sum(increase(span_count{span_kind="server"}[1m])) without
(pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)) > 0.3]

Notice the “ordermgr” service (version "0.0.1") had an error rate of 0.3333 (33%) which caused the alert since it is above the allowed 30% threshold.
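The threshold check behind this alert is simple arithmetic, sketched below. The span counts (1 error out of 3 server spans) are made-up values chosen to reproduce the 0.3333 ratio; the real Prometheus query works on per-minute increases of the span_count metric. Treating zero traffic as a zero ratio is also an assumption of this sketch.

```java
// Illustrative arithmetic only; the counts are hypothetical, not taken from the demo.
final class ErrorRatioSketch {

    // Fraction of server spans that were flagged as errors.
    static double errorRatio(long errorSpans, long totalSpans) {
        if (totalSpans == 0) {
            return 0.0; // assumption: no traffic is treated as no errors
        }
        return (double) errorSpans / totalSpans;
    }

    // Mirrors the "> 0.3" comparison at the end of the trigger expression.
    static boolean shouldAlert(double ratio, double threshold) {
        return ratio > threshold;
    }

    public static void main(String[] args) {
        double ratio = errorRatio(1, 3);
        System.out.println("error ratio: " + ratio);
        System.out.println("alert: " + shouldAlert(ratio, 0.3));
    }
}
```

Because the comparison is strict, a ratio of exactly 0.3 would not fire the alert, while the 0.3333 value in the alert text above does.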

At this point, the Hawkular Alerts UI provides the ability for system admins to log notes about the issue, acknowledge the alert and mark the alert resolved if the underlying issue has been fixed. These lifecycle functions (also available as REST operations) are just part of the value add of Hawkular-Alerts.

You could do more complex things such as only trigger this alert if this Prometheus query generated results AND some other condition was true (say, ElasticSearch logs match a particular pattern, or if a Kafka topic had certain data). This demo merely scratches the surface, but does show how Hawkular Alerts can be used to work with OpenTracing to provide additional capabilities that may be found useful by system administrators and IT support personnel.

Canary Deployment in OpenShift using OpenTracing based Application Metrics

18 August 2017, by Gary Brown

In a previous article we showed how OpenTracing instrumentation can be used to collect application metrics, in addition to (but independent from) reported tracing data, from services deployed within a cloud environment (e.g. Kubernetes or OpenShift).

In this article we will show how this information can be used to aid a Canary deployment strategy within OpenShift.

Figure 1: Error ratio per service and version

The updated example application

We will be using the same example as used in the previous article.

However since writing that article, the configuration of the tracer and Prometheus metrics support has been simplified. There is now no explicit configuration of either, with only some auto configuration of MetricLabel beans to identify some custom labels to be added to the Prometheus metrics, e.g.

Metrics configuration used in both services:

@Configuration
public class MetricsConfiguration {

    @Bean
    public MetricLabel transactionLabel() {
        return new BaggageMetricLabel("transaction", "n/a"); // (1)
    }

    @Bean
    public MetricLabel versionLabel() {
        return new ConstMetricLabel("version", System.getenv("VERSION")); // (2)
    }

}

1	This metric label identifies the business transaction associated with the metrics, which can be used to isolate the specific number of requests, duration and errors that occurred when the service was used within the particular business transaction
2	This metric label identifies the service version, which is especially useful in the Canary deployment use case being discussed in this article

The first step is to follow the instructions in the example for deploying and using the services within OpenShift.

Once the ./genorders.sh script has been running for a while and has generated plenty of metrics for version 0.0.1 of the services, deploy the new version of the services. This is achieved by:

updating the versions in the pom.xml files within the simple/accountmgr and simple/ordermgr folders from 0.0.1 to 0.0.2
re-running the mvn clean install docker:build command from the simple folder
deploying the canary versions of the services using the command oc create -f services-canary-kubernetes.yml

As our services accountmgr and ordermgr determine the backing deployment based on the respective labels app: accountmgr and app: ordermgr, simply having a second deployment with these labels will make them serve requests in a round-robin manner.
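As a simplified model of why a second deployment with the same label starts receiving traffic, the sketch below rotates requests across every pod that matches a label. The pod names and the RoundRobinSketch class are hypothetical; the real routing is done by the platform, not by application code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified, hypothetical model of rotating requests across matching pods.
final class RoundRobinSketch {

    private final List<String> pods;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> pods) {
        if (pods.isEmpty()) {
            throw new IllegalArgumentException("at least one pod is required");
        }
        this.pods = new ArrayList<>(pods);
    }

    // Returns the pod that serves the next request.
    String pick() {
        // floorMod keeps the index valid even if the counter overflows.
        return pods.get(Math.floorMod(next.getAndIncrement(), pods.size()));
    }

    public static void main(String[] args) {
        // Two pods of the current version plus one canary pod, all labelled app: ordermgr.
        RoundRobinSketch service = new RoundRobinSketch(Arrays.asList(
                "ordermgr-0.0.1-a", "ordermgr-0.0.1-b", "ordermgr-0.0.2-canary"));
        for (int request = 1; request <= 6; request++) {
            System.out.println("request " + request + " -> " + service.pick());
        }
    }
}
```

In this model the canary’s share of traffic is simply its share of the matching pods, which is why scaling the canary up produces results faster.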

This deployment script has been pre-configured with the 0.0.2 version, and to only start a single instance of the new version of the services. This may be desirable if you want to monitor the behaviour of the new service versions over a reasonable time period, but as we want to see results faster we will scale them up to see more activity. You can do this by expanding the deployment area for each service in the OpenShift web console and selecting the up arrow to scale up each service:

Figure 2: Scaling up canary deployment

Now we can monitor the Prometheus dashboard, using the following query, to see the error ratio per service and version:

sum(increase(span_count{error="true",span_kind="server"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind) / sum(increase(span_count{span_kind="server"}[1m])) without (pod,instance,job,namespace,endpoint,transaction,error,operation,span_kind)

The result of this query can be seen in Figure 1 at the top of the article. This chart shows that version 0.0.2 of the accountmgr service has not generated any errors, while version 0.0.2 of the ordermgr service appears to be less error prone than version 0.0.1.

Based on these metrics, we could decide that the new versions of these services are better than the previous, and therefore update the main service deployments to use the new versions. In the OpenShift web console you can do this by clicking the three vertical dots in the upper right hand side of the deployment region and selecting Edit YAML from the menu. This will display an editor window where you can change the version from 0.0.1 to 0.0.2 in the YAML file.
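The comparison made by eye on the chart could also be written down as a rule. In the sketch below, the ratios are made-up sample values, not readings from Figure 1, and the "promote when the canary is no worse than the current version" rule is an assumption; a real rollout policy would likely also consider traffic volume and observation time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical decision rule; the sample ratios are invented for illustration.
final class CanaryDecisionSketch {

    // Each value holds {error ratio of the current version, error ratio of the canary}.
    static List<String> servicesToPromote(Map<String, double[]> ratiosByService) {
        List<String> promote = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : ratiosByService.entrySet()) {
            double current = entry.getValue()[0];
            double canary = entry.getValue()[1];
            if (canary <= current) { // assumed rule: canary must be no worse
                promote.add(entry.getKey());
            }
        }
        return promote;
    }

    public static void main(String[] args) {
        Map<String, double[]> ratios = new LinkedHashMap<>();
        ratios.put("accountmgr", new double[] {0.10, 0.00});
        ratios.put("ordermgr", new double[] {0.33, 0.20});
        System.out.println("promote: " + servicesToPromote(ratios));
    }
}
```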

Figure 3: Update the service version

After you save the YAML configuration file, in the web console you can see the service going through a "rolling update" as OpenShift incrementally changes each service instance over to the new version.

Figure 4: Rolling update

After the rolling update has completed for both the ordermgr and accountmgr services, you can scale down or completely remove the canary version of each deployment.

An alternative to performing the rolling update would be to give the canary deployment a different name (i.e. one specific to the version being tested) and, when it comes time to switch over, simply scale down the previous deployment version. This would be more straightforward, but wouldn’t show off the cool rolling update approach in the OpenShift web console :-)

Figure 5: Scaling down canary deployment

Although we have updated both services at the same time, this is not necessary. Generally microservices would be managed by separate teams and subject to their own deployment lifecycles.

Conclusion

This article has shown how application metrics, captured by instrumenting services using the OpenTracing API, can be used to support a simple Canary deployment strategy.

These metrics can similarly be used with other deployment strategies, such as A/B testing, which can be achieved using a weighted load balancing capability within OpenShift.

Older posts:

15 August 2017 - Advanced Behaviour Detection with Nelson Rules
11 August 2017 - Hawkular Alerts with Prometheus, ElasticSearch, Kafka
28 July 2017 - OpenTracing EJB instrumentation
24 July 2017 - Grafana: new query interface
18 July 2017 - Protecting Jaeger UI with a sidecar security proxy
10 July 2017 - OpenTracing JAX-RS Instrumentation
26 June 2017 - Using OpenTracing to collect Application Metrics in Kubernetes
13 June 2017 - OpenTracing Spring Boot Instrumentation
06 June 2017 - Hawkular Metrics 0.27.0 - Release
31 May 2017 - Hawkular and Grafana Out of the Box
25 May 2017 - Kubernetes and OpenShift Templates for Jaeger
03 May 2017 - Hawkular Services 0.37.Final
27 April 2017 - Hawkular Alerting Tutorial: Lesson 06 - Events!
19 April 2017 - Hawkular APM: The Future
07 April 2017 - CloudNativeCon summary
06 April 2017 - Alerts and Notifications for Elasticsearch using Hawkular Alerting
05 April 2017 - Deploying Hawkular OpenShift Agent Easily
04 April 2017 - Hawkular Metrics 0.26.0 - Release
25 March 2017 - Collecting Application Metrics Within OpenShift
24 March 2017 - Distributed Tracing with Apache Camel and OpenTracing
24 March 2017 - Introducing Hawkular Java Agent
22 March 2017 - Adjusting sampling rates for Hawkular APM on OpenShift
15 March 2017 - Distributed Tracing Workshop and OpenTracing Collaboration
14 March 2017 - Hawkular Alerting: Tutorial Now Available!
07 March 2017 - Hawkular Metrics 0.25.0 - Release
22 February 2017 - Processing Hawkular-Metrics data with Python Pandas
15 February 2017 - Hawkular Services 0.32.Final
13 February 2017 - Monitoring Canary Releases with Hawkular APM
08 February 2017 - Hawkular Metrics 0.24.0 - Release
07 February 2017 - Hawkular Metrics - 2017 Roadmap
04 February 2017 - Hawkular APM: Comparing performance of service versions
03 February 2017 - Display custom events in Grafana
31 January 2017 - Hawkular Services 0.30.0.Final
27 January 2017 - Getting the status of Hosa
17 January 2017 - Monitoring Microservices on OpenShift with HOSA
16 January 2017 - Reporting Dropwizard metrics to Hawkular
13 January 2017 - Extending Complex Event Processing in Hawkular Alerting
04 January 2017 - Hawkular Metrics 0.23.0 - Release
22 December 2016 - Hawkinit
16 December 2016 - Hawkular APM improvements for OpenShift
06 December 2016 - Hawkular Metrics 0.22.0 - Release
25 November 2016 - Using Hawkular APM on OpenShift
16 November 2016 - Hawkular APM OpenTracing JavaScript
09 November 2016 - Hawkular Services 0.20.0.Final
25 October 2016 - Hawkular Metrics 0.21.0 - Release
24 October 2016 - Define Alert Triggers via HawkFX
24 October 2016 - Monitoring Microservices with OpenShift, Hawkular Metrics and Grafana
17 October 2016 - Hawkular APM supports OpenTracing and Alerts
14 October 2016 - Hawkular APM Distributed Tracing of Polyglot Application using Zipkin Instrumentations
06 October 2016 - Hawkular Metrics 0.20.0 - Release
19 September 2016 - Using Hawkular APM on Red Hat's Microservices Reference Architecture example
14 September 2016 - Consuming Hawkular API over SSL with self signed certificates
05 September 2016 - Hawkular Metrics 0.19.0 - Release
31 August 2016 - Metric Data Analysis using the Apache Spark
11 August 2016 - Vert.x agent inventory implementation
01 August 2016 - Hawkular Metrics 0.18.0 - Release
19 July 2016 - What is new in Android Client
14 July 2016 - Monitoring Application Performance within Openshift
14 July 2016 - Getting started with ManageIQ and Hawkular
13 July 2016 - Introducing HawkFX
12 July 2016 - Hawkular and JMX/Jolokia
12 July 2016 - Hawkular and Prometheus
06 July 2016 - Hawkular Metrics 0.17.0 - Release
05 July 2016 - Hawkular Services 0.0.5.Final
05 July 2016 - Scaling stateful services: an example in Hawkular Alerting
27 June 2016 - Hawkular APM 0.9.0.Final Released
14 June 2016 - Hawkular Services 0.0.2.Final
01 June 2016 - Hawkular BTM component rename
01 June 2016 - Hawkular Metrics 0.16.0 - Release
26 May 2016 - Monitoring Microservices for Application Performance, Distributed Tracing and Business Transactions
24 May 2016 - Obtaining business metrics without the need to instrument your application
02 May 2016 - Hawkular Metrics 0.15.0 - Release
28 April 2016 - New Hawkular packaging
27 April 2016 - Hawkular Inventory 0.15.0.Final - Release
22 April 2016 - Collecting Metrics from Prometheus Endpoints
21 April 2016 - Hawkular Data Mining 0.1.0.Final Released
19 April 2016 - Monitoring JVM applications with jmxtrans
08 April 2016 - Hawkular in ManageIQ sprint demo(s)
29 March 2016 - Hawkular Metrics 0.14.0 - Release
21 March 2016 - QR Code support for Android Client
16 March 2016 - Hawkular Metrics - Roadmap
15 March 2016 - The eleventh milestone of Hawkular released
02 March 2016 - Hawkular Metrics 0.13.0 - Release
23 February 2016 - Integration with ManageIQ
02 February 2016 - Hawkular Metrics 0.12.0 - Release
25 January 2016 - Hawkular-BTM now includes Application Performance Management
21 January 2016 - The nineth milestone of Hawkular released
21 January 2016 - Hawkular Command Gateway Clients
20 January 2016 - Hawkular Data Mining - Predictive Charts
12 January 2016 - Hawkular Metrics 0.11.0 - Release
05 January 2016 - We are hiring !
16 December 2015 - The eigth milestone of Hawkular released
01 December 2015 - BTM: Instrumenting an application running from a docker image
30 November 2015 - Monitoring Business Transactions in JBoss Ticket Monster App (Part 2)
19 November 2015 - Yet another release of Hawkular, the seventh one !
10 November 2015 - Monitoring Business Transactions in JBoss Ticket Monster App
30 October 2015 - Hawkular Metrics 0.9.0 - Release
30 October 2015 - Hawkular Metrics 0.10.0 - Release
24 October 2015 - Introduction to Hawkular Data Mining Module
21 October 2015 - The sixth milestone of Hawkular released
12 October 2015 - Hawkular Metrics 0.8.0 - Release
09 October 2015 - Titan Graph DB Performance Tips
06 October 2015 - Hawkular-BTM 0.4.0 released
05 October 2015 - Hawkular Metrics - 2k Commits
30 September 2015 - Hawkular Metrics 0.7.0 - Release
23 September 2015 - The fifth milestone of Hawkular released
03 September 2015 - Monitoring Rails App using Hawkular Metrics
29 August 2015 - Hawkular Metrics 0.6.0 - Release
27 August 2015 - The fourth milestone of Hawkular released
25 August 2015 - Introduction to AutoResolve triggers
19 August 2015 - Using Hawkular Alerts as a standalone engine
14 August 2015 - Visualising Business Transaction Information using Hawkular BTM 0.3.0.Final with RTGov Integration
30 July 2015 - Hawkular, all good things make three!
22 July 2015 - Monitoring a Vert.x Application using Hawkular BTM 0.2.0.Final
02 July 2015 - Hawkular, the second release!
01 July 2015 - Hawkular-BTM 0.1.0 released
01 July 2015 - Monitoring a SwitchYard Application using Hawkular BTM 0.1.0.Final
22 June 2015 - Hawkular Metrics 0.4.0 - Release
17 June 2015 - MarsJUG and RivieraDEV retrospective
04 June 2015 - Hawkular, the first release!
01 June 2015 - Hawkular Metrics 0.3.4 - Release
30 April 2015 - Introducing the latest Hawkular component: Business Transaction Management
29 April 2015 - Hello GSoC Students
17 April 2015 - Intro to Hawkular, a middleware open-source management solution
14 April 2015 - Hawkular-Monitor Agent
09 April 2015 - Alert notifiers for mobile devices
08 April 2015 - Testing collectd integration
07 April 2015 - Hawkular Metrics 0.3.1 - Release
01 April 2015 - Hawkular Discontinued
30 March 2015 - Dockerized Hawkular builds available
24 February 2015 - The Kettle starts boiling
16 February 2015 - Hawkular Metrics 0.2.7 - Release
