Monitoring
Web Interface
The web interface of Stretchy Orchestrator provides an overview of the application’s health (System settings ➤ System health):
- Nodes: Lists all nodes, whether they are enabled and online, and their current usage according to the Orchestrator’s internal database.
- Stuck Machines: Machines that cannot complete a required step in the machine lifecycle and might need human intervention to advance.
- Waiting Machines: Lists machines that are waiting for a node with the necessary resources to run.
See Troubleshooting for advice on how to fix problems.
| The usage reported by Stretchy Orchestrator does not necessarily match the actual resource usage on a node, because resources are considered occupied for as long as a machine is assigned to the node. |
Endpoints
Stretchy Orchestrator provides a number of endpoints that let you monitor it with common monitoring software like Nagios. All endpoints are available over HTTP and Java Management Extensions (JMX).
The following Stretchy-specific endpoints are available:
| ID | Description |
|---|---|
| machines | Returns all machines that are stuck or waiting. |
| nodes | Lists all nodes, their current state, and usage. |
All HTTP endpoints are mapped to /actuator/<id>. For example, the list of nodes can be accessed at /actuator/nodes.
By default, all endpoints are disabled. To enable them, add their IDs to the configuration property management.endpoints.web.exposure.include (for HTTP) or management.endpoints.jmx.exposure.include (for JMX). Both properties take a comma-separated list.
For example, to only expose the endpoints for machines and nodes over HTTP, add the property in Example 1 to your application.properties.
management.endpoints.web.exposure.include=machines,nodes
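Once an endpoint is exposed over HTTP, a monitoring system such as Nagios can poll it. The following Python sketch is one possible shape for such a check, not a plugin shipped with Stretchy Orchestrator. The function names and the mapping from HTTP status to result are illustrative assumptions. Only the /actuator/<id> path and the conventional Nagios exit codes are taken as given. Authentication is left out because it depends on your setup, so against a secured installation this check reports UNKNOWN until you add credentials.

```python
import urllib.error
import urllib.request

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def endpoint_url(base_url: str, endpoint_id: str) -> str:
    """Build the URL of an actuator endpoint from its ID."""
    return f"{base_url.rstrip('/')}/actuator/{endpoint_id}"


def status_for(http_status: int) -> int:
    """Map an HTTP status code to a Nagios exit code (illustrative policy)."""
    if http_status == 200:
        return OK
    if http_status in (401, 403, 404):
        # Missing credentials, missing role, or endpoint not exposed:
        # the check itself is misconfigured, so report UNKNOWN.
        return UNKNOWN
    return CRITICAL


def check(base_url: str, endpoint_id: str, timeout: float = 5.0) -> int:
    """Request the endpoint and return a Nagios exit code."""
    url = endpoint_url(base_url, endpoint_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return status_for(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; reuse the same mapping.
        return status_for(err.code)
    except OSError:
        # Connection refused, DNS failure, or timeout.
        return CRITICAL
```

A wrapper script could call `sys.exit(check("http://localhost:8080", "nodes"))`, where the base URL is a placeholder for your installation. The sketch only looks at the HTTP status. It does not inspect the response body, whose structure is not described here.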
For JMX, you must also enable JMX itself with spring.jmx.enabled=true, as shown in Example 2.
spring.jmx.enabled=true
management.endpoints.jmx.exposure.include=machines,nodes
| Stretchy Orchestrator is based on Spring Boot, which features a number of built-in endpoints that you can enable to monitor additional aspects of Stretchy Orchestrator. Its documentation shows more complex setups, too. |
Metrics
| Metrics are a very experimental feature. Stretchy Orchestrator exposes few Stretchy-specific metrics and supports even fewer monitoring systems out of the box. We are looking for feedback on how to evolve the feature. |
While the monitoring endpoints expose structured data, metrics provide individual values like counters or gauges, for example, the number of operational nodes or running machines.
Metrics are collected and published with the help of Micrometer and Spring Boot. This means that, besides the Stretchy-specific metrics, a large number of application-agnostic meters are available, too. Refer to the Spring Boot documentation for more information on them.
Available Metrics
- Nodes: Metrics like the number of operational nodes are published under the stretchy.nodes meter name.
Supported Monitoring Systems
Generic
By enabling the metrics endpoint as shown in Example 3, a list of all meter names is published at /actuator/metrics. You can drill down to view information about a particular meter by providing its name as a selector, for example, /actuator/metrics/stretchy.nodes.operational.
management.endpoints.web.exposure.include=metrics
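The response for a single meter is a JSON document in the format of Spring Boot's metrics endpoint, with a `measurements` array of statistic/value pairs. The Python sketch below reads one value from such a document. It assumes that stretchy.nodes.operational is a gauge, which reports a single `VALUE` statistic. The helper name `meter_value` is hypothetical, and the sample payload uses a made-up value rather than real output.

```python
import json


def meter_value(payload: str, statistic: str = "VALUE") -> float:
    """Return one statistic from a metrics endpoint response body."""
    document = json.loads(payload)
    for measurement in document["measurements"]:
        if measurement["statistic"] == statistic:
            return measurement["value"]
    raise KeyError(f"{document['name']} has no statistic {statistic}")


# Sample response body with a made-up value, not real output.
sample = """
{
  "name": "stretchy.nodes.operational",
  "measurements": [{"statistic": "VALUE", "value": 3.0}],
  "availableTags": []
}
"""

print(meter_value(sample))
```

In practice, the payload would be the body returned by /actuator/metrics/stretchy.nodes.operational for an authenticated user.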
Prometheus
To expose metrics in a format that can be scraped by Prometheus, enable the Prometheus endpoint as shown in Example 4. The metrics are then available at /actuator/prometheus.
management.endpoints.web.exposure.include=prometheus
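Meter names look different in the Prometheus output, because Micrometer rewrites dotted names with underscores for Prometheus. The Python sketch below approximates that mapping and reads a value from the text format. It is a simplification: Micrometer can also append suffixes, for example for counters or base units, so verify the actual names against the output of /actuator/prometheus. Both helper names are hypothetical, and the sample text is made up.

```python
def prometheus_name(meter_name: str) -> str:
    """Approximate the Prometheus name of a dotted meter name."""
    return meter_name.replace(".", "_")


def sample_value(exposition: str, metric: str) -> float:
    """Return the first sample of `metric` from Prometheus text output."""
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the label block or at the first space.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name == metric:
            # Assumes no trailing timestamp, so the value is the last field.
            return float(line.rsplit(" ", 1)[1])
    raise KeyError(metric)


# Made-up excerpt in the Prometheus text format, not real output.
sample = """\
# TYPE stretchy_nodes_operational gauge
stretchy_nodes_operational 3.0
"""

name = prometheus_name("stretchy.nodes.operational")
print(name, sample_value(sample, name))
```

A Prometheus server does this parsing itself. The sketch is mainly useful for ad hoc checks or for confirming which names to use in alerting rules.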
Security
The monitoring-related parts of the web interface and HTTP endpoints are only accessible to authenticated users with the roles ADMIN and MONITOR.
Access to JMX endpoints is regulated by the JVM, not Stretchy Orchestrator. See Oracle’s documentation on Monitoring and Management Using JMX for how to configure it.