Catalog

Summary

The Grey Matter Catalog service connects to the Fabric mesh's xDS interface to provide high-level summaries and more easily consumable views of the current state of the mesh. It powers the Intelligence 360 application and any other application that needs to understand what is present in the mesh.

Features

Multi Zone Management
Human-friendly Service Management
Real-time Service Metrics Access

Note: Catalog deploys with Swagger API documentation. For the most up-to-date fields and complete usage, refer to the Swagger docs served by your deployment.

Zones

In Grey Matter, each zone is a unique data plane. Catalog provides the ability to track multiple zones, gather high-level information on service statuses, and present them to applications.

A sample zone is shown below. This zone is owned by Decipher and covers the core (Greymatter) and Demo capabilities. It has 7 service clusters (4 stable, 2 warning, and 1 down) with a total of 9 unique service instances.

{
  "zoneName": "default-zone",
  "owners": [
    "Decipher"
  ],
  "capabilities": [
    "Greymatter",
    "Demo"
  ],
  "clusterCount": 7,
  "clusterStableCount": 4,
  "clusterWarningCount": 2,
  "clusterDownCount": 1,
  "instanceCount": 9
}
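
As a minimal sketch of how an application might consume a zone summary, the Python below parses the sample zone and produces a one-line status string. The function names are hypothetical, and the consistency check assumes the stable, warning, and down counts always add up to clusterCount, as they do in the sample; the source does not state that guarantee.

```python
import json

# The sample zone from above, as an application might receive it.
ZONE_JSON = """
{
  "zoneName": "default-zone",
  "owners": ["Decipher"],
  "capabilities": ["Greymatter", "Demo"],
  "clusterCount": 7,
  "clusterStableCount": 4,
  "clusterWarningCount": 2,
  "clusterDownCount": 1,
  "instanceCount": 9
}
"""


def status_counts_consistent(zone: dict) -> bool:
    """Return True if the per-status counts add up to clusterCount.

    Assumption: every cluster is in exactly one of the three states.
    """
    per_status = (
        zone["clusterStableCount"]
        + zone["clusterWarningCount"]
        + zone["clusterDownCount"]
    )
    return per_status == zone["clusterCount"]


def summarize_zone(zone: dict) -> str:
    """Build a short human-readable summary of a zone record."""
    return (
        f"{zone['zoneName']}: {zone['clusterCount']} clusters "
        f"({zone['clusterStableCount']} stable, "
        f"{zone['clusterWarningCount']} warning, "
        f"{zone['clusterDownCount']} down), "
        f"{zone['instanceCount']} instances"
    )


zone = json.loads(ZONE_JSON)
print(summarize_zone(zone))
# prints: default-zone: 7 clusters (4 stable, 2 warning, 1 down), 9 instances
```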

Service Entry

Within each zone, Catalog also tracks each service cluster in the Fabric service mesh. Catalog watches the mesh for instances of these services, reports when they're up/down, and presents user-settable metadata for each.

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| clusterName | String | | The name of the cluster (logical group name of service instances), provided by gm-control. |
| zoneName | String | | The zone in which this service cluster resides. |
| name | String | | The name of the service to be displayed in the Intelligence 360 Dashboard (make this as human friendly as possible). |
| version | String | | The semver version of the service. |
| owner | String | | The name of the service owner like "Decipher" or "Cool Customer". Used to sort by Owner in the Intelligence 360 Dashboard. |
| capability | String | | The name of the service capability like "Storage" or "Security". Used to sort by Capability in the Intelligence 360 Dashboard. |
| documentation | String | | A URL path to documentation relative to the root URL of the service or a full path depending on your needs. |
| prometheusJob | String | | The name of the Prometheus Job used to store and query time series data associated with this service. This must match the cluster name. |
| minInstances | Number | | The minimum number of instances this service should scale to. If below this number, the service will be in a warning state in the Intelligence 360 Dashboard. |
| maxInstances | Number | | The maximum number of instances this service should scale to. If above this number, the service will be in a warning state in the Intelligence 360 Dashboard. |
| enableInstanceMetrics | Boolean | true | Enable the instance metrics view in the Intelligence 360 Dashboard. |
| enableHistoricalMetrics | Boolean | false | Enable the historical metrics view in the Intelligence 360 Dashboard. |
| metricsTemplate | String | | URL template for constructing the service's instance metrics endpoint e.g. http://{{host}}:{{port}}/metrics. |
| metricsPort | Number | 8081 | TCP port serving up service instance metrics. |
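
To illustrate the metricsTemplate and metricsPort fields, here is a rough Python sketch of how such a template could be expanded into a concrete URL. It assumes plain textual substitution of the {{host}} and {{port}} placeholders; how Catalog itself expands the template is not documented here, and the function name and host address are hypothetical.

```python
def render_metrics_url(template: str, host: str, port: int) -> str:
    """Expand a metricsTemplate-style string into a concrete URL.

    Assumption: {{host}} and {{port}} are the only placeholders and are
    replaced verbatim. Catalog's real expansion rules may differ.
    """
    return template.replace("{{host}}", host).replace("{{port}}", str(port))


# Hypothetical instance address combined with the default metricsPort.
url = render_metrics_url("http://{{host}}:{{port}}/metrics", "10.0.0.5", 8081)
print(url)  # prints: http://10.0.0.5:8081/metrics
```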

Example Service Cluster

[
  {
    "clusterName": "catalog",
    "zoneName": "default-zone",
    "name": "Grey Matter Catalog",
    "version": "1.0",
    "owner": "Decipher",
    "capability": "Greymatter",
    "runtime": "GO",
    "documentation": "/services/catalog/1.0/",
    "prometheusJob": "catalog",
    "minInstances": 1,
    "maxInstances": 1,
    "authorized": true,
    "clusterID": "",
    "meshID": "",
    "enableInstanceMetrics": true,
    "enableHistoricalMetrics": true,
    "instances": [
      {
        "name": "27e2d92fd05aaeafbdde6c97f561a2e0",
        "startTime": 1585274080308
      }
    ],
    "metricsTemplate": "",
    "metricsPort": 8081
  }
]
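
The minInstances and maxInstances rule can be sketched as a small check over a service entry. This is an illustration, not Catalog's implementation: the function name is hypothetical, the "stable" label is borrowed from the zone counts, and the "down" state is deliberately not modeled because its exact criteria are not described here.

```python
def scaling_state(entry: dict) -> str:
    """Classify a service entry by its instance count.

    Returns "warning" when the number of instances is below minInstances
    or above maxInstances, and "stable" otherwise.
    """
    count = len(entry.get("instances", []))
    if count < entry["minInstances"] or count > entry["maxInstances"]:
        return "warning"
    return "stable"


# Trimmed copy of the example service cluster above.
catalog_entry = {
    "clusterName": "catalog",
    "minInstances": 1,
    "maxInstances": 1,
    "instances": [
        {"name": "27e2d92fd05aaeafbdde6c97f561a2e0", "startTime": 1585274080308}
    ],
}

print(scaling_state(catalog_entry))  # prints: stable
```

With one running instance and both bounds set to 1, the example entry is within range. Adding a second instance, or removing the only one, would move it into the warning state under this sketch.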

Metrics

Catalog also serves as a normalized passthrough to the real-time metrics for each service instance. Request them from Catalog's /metrics endpoint (e.g., curl host:port/metrics/<cluster_name>/<instance_id>). See the deployed Swagger docs for more detail.
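
The same request can be sketched in Python using only the standard library. The base URL below is a placeholder for wherever your Catalog is reachable, the function names are hypothetical, and the sketch assumes the endpoint returns JSON without requiring client certificates or other authentication, which may not hold for your deployment.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def instance_metrics_url(base_url: str, cluster_name: str, instance_id: str) -> str:
    """Build <base>/metrics/<cluster_name>/<instance_id> with escaped segments."""
    return (
        f"{base_url.rstrip('/')}/metrics/"
        f"{quote(cluster_name, safe='')}/{quote(instance_id, safe='')}"
    )


def fetch_instance_metrics(
    base_url: str, cluster_name: str, instance_id: str, timeout: float = 5.0
) -> dict:
    """GET the metrics for one instance and decode the JSON body.

    Assumption: no authentication or custom TLS setup is needed.
    """
    url = instance_metrics_url(base_url, cluster_name, instance_id)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Placeholder base URL; the instance id is taken from the example cluster.
example_url = instance_metrics_url(
    "http://localhost:8080", "catalog", "27e2d92fd05aaeafbdde6c97f561a2e0"
)
print(example_url)
# prints: http://localhost:8080/metrics/catalog/27e2d92fd05aaeafbdde6c97f561a2e0
```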

Example Returned Metrics

{
  "grey-matter-metrics-version": "1.0.0",
  "Total/requests": 25,
  "HTTP/requests": 0,
  "HTTPS/requests": 25,
  "RPC/requests": 0,
  "RPC_TLS/requests": 0,
  "route/services/catalog/1.0/metrics/GET/errors.count": 0,
  "route/services/catalog/1.0/metrics/GET/in_throughput": 0,
  "route/services/catalog/1.0/metrics/GET/out_throughput": 148004,
  "route/services/catalog/1.0/summary/GET/requests": 2,
  "route/services/catalog/1.0/summary/GET/routes": "",
  "route/services/catalog/1.0/summary/GET/status/200": 2,
  "route/services/catalog/1.0/summary/GET/status/2XX": 2,
  "route/services/catalog/1.0/summary/GET/latency_ms.avg": 0.000000,
  "route/services/catalog/1.0/summary/GET/latency_ms.count": 2,
  "route/services/catalog/1.0/summary/GET/latency_ms.max": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.min": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.sum": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.p50": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.p90": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.p95": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.p99": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.p9990": 0,
  "route/services/catalog/1.0/summary/GET/latency_ms.p9999": 0,
  "route/services/catalog/1.0/summary/GET/errors.count": 0,
  "route/services/catalog/1.0/summary/GET/in_throughput": 0,
  "route/services/catalog/1.0/summary/GET/out_throughput": 10248004,
  "all/requests": 25,
  "all/routes": "",
  "all/status/200": 21,
  "all/status/500": 1,
  "all/status/304": 3,
  "all/status/2XX": 21,
  "all/status/5XX": 1,
  "all/status/3XX": 3,
  "all/latency_ms.avg": 0.000000,
  "all/latency_ms.count": 25,
  "all/latency_ms.max": 0,
  "all/latency_ms.min": 0,
  "all/latency_ms.sum": 0,
  "all/latency_ms.p50": 0,
  "all/latency_ms.p90": 0,
  "all/latency_ms.p95": 0,
  "all/latency_ms.p99": 0,
  "all/latency_ms.p9990": 0,
  "all/latency_ms.p9999": 0,
  "all/errors.count": 0,
  "all/in_throughput": 0,
  "all/out_throughput": 5022597,
  "route/services/prometheus/api/v1/GET/requests": 10,
  "route/services/prometheus/api/v1/GET/routes": "",
  "route/services/prometheus/api/v1/GET/status/200": 10,
  "route/services/prometheus/api/v1/GET/status/2XX": 10,
  "route/services/prometheus/api/v1/GET/latency_ms.avg": 0.000000,
  "route/services/prometheus/api/v1/GET/latency_ms.count": 10,
  "route/services/prometheus/api/v1/GET/latency_ms.max": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.min": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.sum": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.p50": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.p90": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.p95": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.p99": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.p9990": 0,
  "route/services/prometheus/api/v1/GET/latency_ms.p9999": 0,
  "route/services/prometheus/api/v1/GET/errors.count": 0,
  "route/services/prometheus/api/v1/GET/in_throughput": 0,
  "route/services/prometheus/api/v1/GET/out_throughput": 201782,
  "route/services/slo/latest/objectives/GET/requests": 2,
  "route/services/slo/latest/objectives/GET/routes": "",
  "route/services/slo/latest/objectives/GET/status/200": 2,
  "route/services/slo/latest/objectives/GET/status/2XX": 2,
  "route/services/slo/latest/objectives/GET/latency_ms.avg": 0.000000,
  "route/services/slo/latest/objectives/GET/latency_ms.count": 2,
  "route/services/slo/latest/objectives/GET/latency_ms.max": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.min": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.sum": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.p50": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.p90": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.p95": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.p99": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.p9990": 0,
  "route/services/slo/latest/objectives/GET/latency_ms.p9999": 0,
  "route/services/slo/latest/objectives/GET/errors.count": 0,
  "route/services/slo/latest/objectives/GET/in_throughput": 0,
  "route/services/slo/latest/objectives/GET/out_throughput": 2838,
  "go_metrics/runtime/num_goroutines": 20,
  "system/start_time": 1585326138311,
  "system/cpu.pct": 5.729167,
  "system/cpu_cores": 2,
  "os": "linux",
  "os_arch": "amd64",
  "system/memory/available": 6736568320,
  "system/memory/used": 1492824064,
  "system/memory/used_percent": 17.851688,
  "process/memory/used": 71762168
}
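
Because the returned metrics are a flat map of slash-delimited keys, a consumer typically filters by key prefix. The sketch below pulls the status-class counters and a 5XX rate out of the "all" group; the helper names are hypothetical and the sample is a small subset of the response above.

```python
def status_class_counts(metrics: dict, prefix: str = "all") -> dict:
    """Collect the 2XX/3XX/5XX style counters under <prefix>/status/."""
    marker = f"{prefix}/status/"
    return {
        key[len(marker):]: value
        for key, value in metrics.items()
        if key.startswith(marker) and key.endswith("XX")
    }


def server_error_rate(metrics: dict, prefix: str = "all") -> float:
    """Fraction of requests under <prefix> that returned a 5XX status."""
    total = metrics.get(f"{prefix}/requests", 0)
    if not total:
        return 0.0
    return metrics.get(f"{prefix}/status/5XX", 0) / total


# Subset of the example response above.
sample = {
    "all/requests": 25,
    "all/status/200": 21,
    "all/status/500": 1,
    "all/status/304": 3,
    "all/status/2XX": 21,
    "all/status/5XX": 1,
    "all/status/3XX": 3,
}

print(status_class_counts(sample))  # prints: {'2XX': 21, '5XX': 1, '3XX': 3}
print(server_error_rate(sample))    # prints: 0.04
```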
