Monitoring

Web Interface

The web interface of Stretchy Orchestrator provides an overview of the application’s health (System settings → System health):

Nodes

Lists all nodes, whether they are enabled and online, and their current usage according to the Orchestrator’s internal database.

Stuck Machines

Lists machines that cannot complete a required step in the machine lifecycle and might need human intervention to advance.

Waiting Machines

Lists machines that are waiting for a node with the necessary resources to run.

See Troubleshooting for advice on how to fix problems.

The usage reported by Stretchy Orchestrator does not necessarily match the actual usage on a node, because resources are considered occupied for as long as a machine is assigned to that node.

Endpoints

Stretchy Orchestrator provides a number of endpoints that let you monitor it with common monitoring software like Nagios. All endpoints are available over HTTP and Java Management Extensions (JMX).

The following Stretchy-specific endpoints are available:

Table 1. Monitoring endpoints
ID          Description
machines    Returns all machines that are stuck or waiting.
nodes       Lists all nodes, their current state, and usage.

All HTTP endpoints are mapped to /actuator/<id>. For example, the list of nodes can be accessed at /actuator/nodes.

By default, all endpoints are disabled. To enable them, add their IDs to the configuration property management.endpoints.web.exposure.include (for HTTP) or management.endpoints.jmx.exposure.include (for JMX). Both properties take a comma-separated list.

For example, to expose only the machines and nodes endpoints over HTTP, add the property shown in Example 1 to your application.properties.

Example 1. Enabling HTTP Monitoring Endpoints
management.endpoints.web.exposure.include=machines,nodes
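
Once the endpoints are enabled, they can be queried like any other HTTP resource. The following Python sketch builds the /actuator/<id> URL described above and fetches the endpoint. It assumes HTTP Basic authentication and a JSON response body, neither of which is confirmed here; the function names are hypothetical, so adapt them to your installation.

```python
import base64
import json
import urllib.request


def actuator_url(base_url: str, endpoint_id: str) -> str:
    """Build the URL of an HTTP monitoring endpoint from its ID."""
    return f"{base_url.rstrip('/')}/actuator/{endpoint_id}"


def fetch_endpoint(base_url: str, endpoint_id: str, user: str, password: str):
    """Fetch an endpoint and return its parsed JSON body.

    Assumption: the server accepts HTTP Basic authentication and answers
    with JSON. Change the headers if your setup differs.
    """
    request = urllib.request.Request(actuator_url(base_url, endpoint_id))
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, fetch_endpoint("http://localhost:8080", "nodes", "monitor", "secret") would request /actuator/nodes; the host and credentials are placeholders.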

For JMX, you must also enable JMX itself with spring.jmx.enabled=true, as shown in Example 2.

Example 2. Enabling JMX Monitoring Endpoints
spring.jmx.enabled=true
management.endpoints.jmx.exposure.include=machines,nodes

Stretchy Orchestrator is based on Spring Boot, which features a number of built-in endpoints that you can enable to monitor additional aspects of Stretchy Orchestrator. The Spring Boot documentation also covers more complex setups.

Metrics

Metrics are a very experimental feature. Stretchy Orchestrator exposes few Stretchy-specific metrics and supports even fewer monitoring systems out of the box. We are looking for feedback on how to evolve the feature.

While the monitoring endpoints expose structured data, metrics provide individual values like counters or gauges, for example, the number of operational nodes or running machines.

Metrics are collected and published with the help of Micrometer and Spring Boot. As a result, a large number of application-agnostic meters are available in addition to the Stretchy-specific ones. Refer to the Spring Boot documentation for more information on the application-agnostic metrics.

Available Metrics

Nodes

Node metrics are published under meter names that start with stretchy.nodes. For example, the number of operational nodes is available as stretchy.nodes.operational.

Supported Monitoring Systems

Generic

When you enable the metrics endpoint as shown in Example 3, a list of all meter names is published at /actuator/metrics. To view information about a particular meter, append its name as a selector, for example, /actuator/metrics/stretchy.nodes.operational.

Example 3. Enabling Metrics Endpoint
management.endpoints.web.exposure.include=metrics
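
The drill-down response for a single meter follows the Spring Boot Actuator layout: a JSON object with a measurements list whose entries carry a statistic and a value. The Python sketch below extracts one value from such a response. The helper name and the sample payload are illustrative only, not actual Stretchy Orchestrator output.

```python
def meter_value(payload: dict, statistic: str = "VALUE") -> float:
    """Return the value of one statistic from a /actuator/metrics/<name> response."""
    for measurement in payload.get("measurements", []):
        if measurement.get("statistic") == statistic:
            return float(measurement["value"])
    raise KeyError(f"statistic {statistic!r} not found in {payload.get('name')!r}")


# Illustrative payload in the Spring Boot Actuator format (values are made up).
sample = {
    "name": "stretchy.nodes.operational",
    "measurements": [{"statistic": "VALUE", "value": 3.0}],
    "availableTags": [],
}

print(meter_value(sample))  # prints 3.0
```

Gauges report a single VALUE statistic, while other meter types such as timers report statistics like COUNT or TOTAL_TIME, which is why the statistic is a parameter.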

Prometheus

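To expose metrics in a format that can be scraped by Prometheus, enable the Prometheus endpoint as shown in Example 4. The metrics are then available at /actuator/prometheus. Because the property takes a comma-separated list, add prometheus to any endpoint IDs you already expose instead of replacing them.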

Example 4. Enabling Prometheus Endpoint
management.endpoints.web.exposure.include=prometheus
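
To check the scrape output by hand, a small parser for the Prometheus text format can help. The Python sketch below reads sample lines into a dictionary. It assumes that lines carry no trailing timestamp and that Micrometer's default naming applies, which turns dots into underscores (stretchy.nodes.operational would then appear as stretchy_nodes_operational). The sample text is illustrative, not actual output.

```python
def parse_samples(text: str) -> dict:
    """Map each sample name (including any labels) to its float value.

    Assumption: lines have the form "<name>[{labels}] <value>" without a
    trailing timestamp. Comment lines (# HELP, # TYPE) are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Illustrative scrape output (values are made up).
SAMPLE = """\
# TYPE stretchy_nodes_operational gauge
stretchy_nodes_operational 3.0
# TYPE jvm_threads_live_threads gauge
jvm_threads_live_threads 42.0
"""

print(parse_samples(SAMPLE)["stretchy_nodes_operational"])  # prints 3.0
```

In production, Prometheus itself does the scraping; a parser like this is only useful for quick checks or for feeding values into another tool.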

Security

The monitoring-related parts of the web interface and HTTP endpoints are only accessible to authenticated users with the roles ADMIN and MONITOR.

Access to JMX endpoints is regulated by the JVM, not Stretchy Orchestrator. See Oracle’s documentation on Monitoring and Management Using JMX for how to configure it.