Introduction to AutoResolve triggers

A blog post by Lucas Ponce




Hawkular Alerts provides simple alert lifecycle management and automated alert resolution as core features.

Alert Lifecycle

Triggers are defined to detect problems. When a problem happens, the engine generates a new alert carrying information about the data that matched the defined conditions. Each alert also has a simple lifecycle that indicates the status of the problem.

A new Alert starts with OPEN status, optionally moves to ACKNOWLEDGED to indicate the alert has been seen and the issue is being resolved, and is finally set to RESOLVED to indicate that the problem has been fixed.
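The lifecycle described above can be pictured as a small state machine. The following is only an illustrative sketch in Python, not Hawkular code: the `Status` and `Alert` names are hypothetical, and treating RESOLVED as a final state that rejects further moves is an assumption made for the example.

```python
from enum import Enum


class Status(Enum):
    OPEN = "OPEN"
    ACKNOWLEDGED = "ACKNOWLEDGED"
    RESOLVED = "RESOLVED"


# Forward moves described in the text; ACKNOWLEDGED is an optional step.
# Assumption: RESOLVED is final and nothing moves backwards.
ALLOWED = {
    Status.OPEN: {Status.ACKNOWLEDGED, Status.RESOLVED},
    Status.ACKNOWLEDGED: {Status.RESOLVED},
    Status.RESOLVED: set(),
}


class Alert:
    """Hypothetical alert that only tracks its lifecycle status."""

    def __init__(self, trigger_id):
        self.trigger_id = trigger_id
        self.status = Status.OPEN  # every new alert starts as OPEN

    def move_to(self, new_status):
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot move from {self.status.name} to {new_status.name}"
            )
        self.status = new_status


alert = Alert("check-firefox-process")
alert.move_to(Status.ACKNOWLEDGED)  # someone has seen the alert
alert.move_to(Status.RESOLVED)      # the problem has been fixed
```

An alert may also go straight from OPEN to RESOLVED, which is what happens when nobody acknowledges it before the problem is fixed.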

The alert lifecycle can be handled manually or managed automatically by the AutoResolve tooling.

AutoResolve triggers

A trigger has one or more conditions that are set to detect a problem. When incoming data matches the condition expressions, an alert is fired to report the new issue. If the engine keeps receiving matching data, new alerts will be generated even if they refer to the same problem.

The detection of multiple alerts belonging to the same problem can be controlled with fine granularity.

Hawkular Alerts allows you to specify that once a problem has triggered an alert, the trigger will not evaluate more data until the issue is resolved, which avoids repeated alerts for the same issue.

Resolution of a problem can be done manually or automatically using AUTORESOLVE triggers.

A trigger defines conditions that are responsible for detecting a problem. These conditions are called FIRING conditions in Hawkular Alerts. Optionally, a trigger can define conditions that are responsible for detecting when the problem is gone. These are called AUTORESOLVE conditions. In the engine, a trigger is in FIRING mode while it evaluates data to detect a problem, and in AUTORESOLVE mode while it evaluates data to detect that the problem is no longer present.

This combination of FIRING and AUTORESOLVE conditions ensures that only one alert is generated per problem and that the trigger automatically returns to FIRING mode once the problem is resolved.
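To make the mode switching concrete, here is a minimal simulation in Python. It is a sketch, not the engine's implementation: the `Trigger` class and `count_alerts` helper are hypothetical, the FIRING condition is assumed to be "value is not UP", and the AUTORESOLVE condition is assumed to be "value is UP".

```python
class Trigger:
    """Hypothetical trigger that watches a stream of availability values."""

    def __init__(self, auto_resolve=True):
        self.auto_resolve = auto_resolve
        self.mode = "FIRING"
        self.alerts = []  # timestamps at which an alert was fired

    def process(self, timestamp, value):
        if self.mode == "FIRING":
            if value != "UP":  # FIRING condition matched: problem detected
                self.alerts.append(timestamp)
                if self.auto_resolve:
                    # Stop looking for the problem, start looking for recovery.
                    self.mode = "AUTORESOLVE"
        elif value == "UP":  # AUTORESOLVE condition matched: problem gone
            self.mode = "FIRING"


def count_alerts(values, auto_resolve):
    trigger = Trigger(auto_resolve)
    for timestamp, value in enumerate(values):
        trigger.process(timestamp, value)
    return len(trigger.alerts)


data = ["UP", "DOWN", "DOWN", "DOWN", "UP", "DOWN"]
print(count_alerts(data, auto_resolve=True))   # prints 2: one alert per outage
print(count_alerts(data, auto_resolve=False))  # prints 4: one alert per DOWN value
```

With AutoResolve enabled, the three consecutive DOWN values produce a single alert, and the trigger is ready to fire again only after an UP value has been seen.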

Example of alerting on process availability

Let’s define a simple example to show how AutoResolve triggers work.

In the create-definition-check-process.sh bash script we define an AutoResolve trigger by activating the autoResolve and autoResolveAlerts flags.

This fragment shows how to mark a trigger as AutoResolve:

{
 "id": "check-firefox-process",
 "name": "Firefox process",
 "description": "Check availability firefox process",
 "actions": {
   "email": ["my-group-to-notify"]
 },
 "firingMatch": "ALL",
 "autoResolveMatch": "ALL",
 "enabled": true,
 "autoDisable": false,
 "autoEnable": false,
 "autoResolve": true, (1)
 "autoResolveAlerts": true, (2)
 "severity": "HIGH"
}
  1. Setting the autoResolve flag to true makes the trigger use both FIRING and AUTORESOLVE modes. Setting it to false runs the trigger in FIRING mode only.

  2. Setting the autoResolveAlerts flag to true automatically marks all unresolved alerts for the trigger as resolved when the AUTORESOLVE conditions detect that the problem is gone. Setting it to false leaves the alert lifecycle untouched.
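The effect of the second flag can be sketched in a few lines of Python. This is an illustration under assumptions, not Hawkular code: the function name `on_problem_gone` and the alert dictionaries with `id` and `status` keys are hypothetical.

```python
def on_problem_gone(alerts, auto_resolve_alerts):
    """Return the alerts as they look after the AUTORESOLVE conditions match.

    alerts: list of dicts with hypothetical "id" and "status" keys.
    auto_resolve_alerts: value of the trigger's autoResolveAlerts flag.
    """
    if not auto_resolve_alerts:
        # Flag is false: the lifecycle is left for someone to handle manually.
        return [dict(alert) for alert in alerts]
    # Flag is true: every unresolved alert of the trigger becomes RESOLVED.
    return [dict(alert, status="RESOLVED") for alert in alerts]


alerts = [
    {"id": "alert-1", "status": "OPEN"},
    {"id": "alert-2", "status": "ACKNOWLEDGED"},
]
print(on_problem_gone(alerts, auto_resolve_alerts=True))
print(on_problem_gone(alerts, auto_resolve_alerts=False))
```

In both cases the trigger itself goes back to FIRING mode; the flag only decides whether the existing alerts are closed for you.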

The next fragment shows how to declare the FIRING and AUTORESOLVE conditions linked to the trigger:

{
 "triggerMode": "FIRING", (1)
 "type": "AVAILABILITY",
 "dataId": "firefox-process",
 "operator": "NOT_UP"
}

...

{
 "triggerMode": "AUTORESOLVE", (2)
 "type": "AVAILABILITY",
 "dataId": "firefox-process",
 "operator": "UP"
}
  1. triggerMode indicates the mode in which this condition is evaluated. FIRING mode is used to detect the problem we want to alert on. In our example, it is an occurrence of the Firefox process not being up.

  2. AUTORESOLVE conditions are evaluated to detect when the problem is no longer present.
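As a rough picture of how the two conditions above are used, the following Python sketch picks the conditions that belong to the current mode and checks them against one availability datum. The helper names `matches` and `conditions_met` are hypothetical, the "all conditions must match" rule mirrors the "ALL" value of firingMatch and autoResolveMatch in the trigger definition, and the "UP" data value is an assumption.

```python
CONDITIONS = [
    {"triggerMode": "FIRING", "type": "AVAILABILITY",
     "dataId": "firefox-process", "operator": "NOT_UP"},
    {"triggerMode": "AUTORESOLVE", "type": "AVAILABILITY",
     "dataId": "firefox-process", "operator": "UP"},
]


def matches(condition, datum):
    """Check one availability condition against one datum."""
    if condition["dataId"] != datum["id"]:
        return False  # this datum belongs to some other data source
    is_up = datum["value"] == "UP"
    if condition["operator"] == "UP":
        return is_up
    if condition["operator"] == "NOT_UP":
        return not is_up
    raise ValueError(f"unsupported operator: {condition['operator']}")


def conditions_met(conditions, mode, datum):
    """True when every condition of the given mode matches (the "ALL" case)."""
    active = [c for c in conditions if c["triggerMode"] == mode]
    return bool(active) and all(matches(c, datum) for c in active)


down = {"id": "firefox-process", "value": "DOWN"}
up = {"id": "firefox-process", "value": "UP"}
print(conditions_met(CONDITIONS, "FIRING", down))      # prints True
print(conditions_met(CONDITIONS, "AUTORESOLVE", down))  # prints False
print(conditions_met(CONDITIONS, "AUTORESOLVE", up))    # prints True
```

Only the conditions of the current mode are looked at, which is why a DOWN value is ignored while the trigger is waiting for recovery.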

Finally, the send-data-check-process.sh bash script shows how to check the process we want to monitor and send availability data about it.

The format of the payload is shown in the next fragment:

{
 "availability": [
    {"id": "firefox-process", (1)
     "type": "AVAILABILITY",
     "timestamp": $timestamp,
     "value": "$firefox_availability"
     }
 ]
}
  1. This id should match the dataId field defined in the conditions.
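For readers who prefer to build this payload outside of bash, here is a small Python sketch that produces the same structure. It is not a port of send-data-check-process.sh: the `build_payload` function is hypothetical, the timestamp is assumed to be in milliseconds since the epoch, and the "UP" and "DOWN" values are assumptions about what the script sends. Actually delivering the payload to the server is left out.

```python
import json
import time


def build_payload(data_id, is_up, timestamp_ms=None):
    """Build an availability payload shaped like the fragment above."""
    if timestamp_ms is None:
        # Assumption: timestamps are milliseconds since the Unix epoch.
        timestamp_ms = int(time.time() * 1000)
    return {
        "availability": [
            {
                "id": data_id,  # must match the dataId of the conditions
                "type": "AVAILABILITY",
                "timestamp": timestamp_ms,
                "value": "UP" if is_up else "DOWN",  # assumed data values
            }
        ]
    }


payload = build_payload("firefox-process", is_up=False)
print(json.dumps(payload, indent=1))
```

The is_up argument would come from whatever process check you use; how that check is done is independent of the payload format.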

The examples can be run using the Hawkular Alerts standalone deployment used in the previous post.

More details about Hawkular Alerts features can be found at




Published by Lucas Ponce on 25 August 2015

