Prometheus Store
All collected metrics are stored in a Prometheus server inside the mesh. This store can be accessed through both APIs and a built-in UI. To access the UI, visit the Prometheus URL from the Intelligence 360 Application. The default path of the UI is /services/prometheus/latest.
NOTE: you can always find the URLs of core Grey Matter components, like the Prometheus base URL, with the path.
Grey Matter aggregates metrics from every instance of every Service throughout the Fabric mesh and presents them for insight and analysis. The main key indicators are brought forth in the and views of the Intelligence 360 Application, but a great deal more can be accessed whenever needed.
From the UI, you can execute queries against the collected metrics and graph the results.
Grey Matter metrics can also be queried directly. For example, we can find the system CPU usage for all services that Prometheus monitors by running the following query:
Similarly, the request duration for a specific route can be queried:
In addition to the UI, Prometheus exposes /api/{version}/query as an API endpoint. This endpoint can be used to pull historical metrics for reporting and custom analysis. The examples below demonstrate the types of queries that can be performed, but a full explanation of the available options can be found in the Prometheus HTTP API documentation.
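As a rough illustration of how a request to this endpoint might be assembled, the sketch below builds an instant-query URL in Python. The host name `mesh.example.com` is hypothetical, the version `v1` is assumed to be the one your deployment serves, and `up` is used only because it is a metric that Prometheus generates for every scrape target.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql, version="v1"):
    """Return the instant-query URL for a PromQL expression.

    base_url is the Prometheus base URL (hypothetical in this sketch);
    version fills the {version} segment of /api/{version}/query.
    """
    base = base_url.rstrip("/")
    # urlencode percent-escapes braces, quotes and other PromQL punctuation.
    return f"{base}/api/{version}/query?{urlencode({'query': promql})}"


# Hypothetical base URL built from the default UI path mentioned above.
url = build_query_url("https://mesh.example.com/services/prometheus/latest", "up")
print(url)
```

The resulting URL can be fetched with any HTTP client that carries the credentials your mesh requires.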
The Prometheus server deployed with Grey Matter comes with many useful recording rules that give quick access to frequently needed or computationally expensive expressions. You can see a list of all available recording rules by navigating to the Status > Rules page in the Prometheus UI, or by accessing the /rules route.
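If the rules are fetched programmatically, the response can be filtered down to recording rules. The sketch below assumes the payload follows the shape of the Prometheus HTTP API rules response (`data.groups[].rules[]`, each rule carrying a `type` of `recording` or `alerting`). The sample payload, including the group name and the alert name, is made up for illustration.

```python
import json


def recording_rule_names(payload):
    """Collect the names of all recording rules in a rules API response."""
    names = []
    for group in payload.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            # Alerting rules share the same list; keep recording rules only.
            if rule.get("type") == "recording":
                names.append(rule["name"])
    return names


# Hypothetical, heavily trimmed response body.
sample = json.loads("""
{
  "status": "success",
  "data": {
    "groups": [
      {
        "name": "example-group",
        "rules": [
          {"name": "overviewQueries:avgUpPercent:avg", "type": "recording"},
          {"name": "ExampleAlert", "type": "alerting"}
        ]
      }
    ]
  }
}
""")

print(recording_rule_names(sample))
```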
These rules can be used as is or built upon to form more complex queries. For example, the overviewQueries:avgUpPercent:avg rule computes the uptime for a service at each scrape interval (usually every 15s) and stores it as a new time series. We can combine this new time series with one of Prometheus's built-in functions to return the percentage of uptime for the edge service over the past hour.
In the result of this query, each value array contains a timestamp representing the instant at which the metric was captured and the corresponding percentage value.
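A value array of this kind can be unpacked with a few lines of Python. The sketch below assumes the usual Prometheus encoding, where the timestamp is a Unix time in seconds and the sample value arrives as a string. The numbers in the example are invented.

```python
from datetime import datetime, timezone


def parse_sample(value):
    """Split a [timestamp, "value"] pair into a UTC datetime and a float."""
    ts, raw = value
    # Prometheus sends sample values as strings to avoid precision loss.
    return datetime.fromtimestamp(float(ts), tz=timezone.utc), float(raw)


# Hypothetical sample: 99.5 percent uptime captured at Unix time 1600000000.
when, pct = parse_sample([1600000000, "99.5"])
print(when.isoformat(), pct)
```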
This gives us an array of .
To narrow down results, we can provide a job parameter to the request. The job should map to the of the service:
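One way such a filter might be expressed is as a PromQL label matcher on `job`. The sketch below is an assumption rather than the documented request format: it supposes that the recording rule keeps a `job` label and uses the standard PromQL function `avg_over_time` to average the rule over a window.

```python
def uptime_query(job, window="1h"):
    """Build a PromQL expression averaging the uptime rule for one job.

    Assumes the recorded series carries a `job` label (not confirmed by
    the surrounding text).
    """
    rule = "overviewQueries:avgUpPercent:avg"
    # Doubled braces emit literal { and } around the label matcher.
    return f'avg_over_time({rule}{{job="{job}"}}[{window}])'


print(uptime_query("edge"))
```

The resulting string can be passed as the `query` parameter of the query endpoint.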
Alerting is currently performed directly through Prometheus. This requires setting up an Alertmanager and configuring alerting rules in the Prometheus configuration of the deployment.