Troubleshoot

Learn how to debug common problems.

Fabric Mesh Communication

404 Errors

If the 404 occurs only for the correct URL of your service (e.g. /services/fibonacci/1.0/), a missing route configuration is the most likely cause. Check that the response Control returned when you ran greymatter create route < 2_sidecar/route.json looks correct.

“No healthy upstream”

A request returning “no healthy upstream” indicates that the Sidecar responsible for handling the request knows which service to send it to, but no healthy instance of that service is up and running.

To verify this, send a request to the /clusters endpoint of the Sidecar's admin server. A healthy service will have both a named host and valid IP addresses. If you see no IP addresses, as in the example below, then this is indeed the issue.

my-service::default_priority::max_connections::1024
my-service::default_priority::max_pending_requests::1024
my-service::default_priority::max_requests::1024
my-service::default_priority::max_retries::3
my-service::high_priority::max_connections::1024
my-service::high_priority::max_pending_requests::1024
my-service::high_priority::max_requests::1024
my-service::high_priority::max_retries::3
my-service::added_via_api::true
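
The check above can be scripted. The following is a minimal Python sketch, assuming the admin server returns the line-oriented text format shown above, where each line reads cluster::field::... and a host line carries an IPv4 address:port as its second field. The function name clusters_without_hosts, the second service and the sample address are hypothetical.

```python
import re

# Second field of a host line, e.g. "10.0.0.12:8080" (IPv4 only in this sketch).
HOST_FIELD = re.compile(r"^\d{1,3}(?:\.\d{1,3}){3}:\d+$")


def clusters_without_hosts(clusters_text):
    """Return the sorted names of clusters that have no host address lines."""
    hosts = {}
    for line in clusters_text.splitlines():
        parts = line.strip().split("::")
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank or malformed lines
        cluster, field = parts[0], parts[1]
        # Register the cluster even if it only has settings lines.
        addresses = hosts.setdefault(cluster, set())
        if HOST_FIELD.match(field):
            addresses.add(field)
    return sorted(name for name, addrs in hosts.items() if not addrs)


# Hypothetical sample: "my-service" has only settings lines,
# "other-service" also lists one host.
sample = """\
my-service::default_priority::max_connections::1024
my-service::added_via_api::true
other-service::default_priority::max_connections::1024
other-service::10.0.0.12:8080::cx_active::0
"""
print(clusters_without_hosts(sample))
```

Any cluster named in the result has no host addresses and would be expected to answer with “no healthy upstream”.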

If you just deployed your service, wait a few minutes. To avoid network chattiness, the control plane waits between sending updates. If the error persists after more than a few minutes, inspect the network chain to this service.

Typically:
1. Verify that the service is announcing with the correct labels.
2. Verify that the Sidecar deployed with the service has XDS_CLUSTER set to match the proxy API object's name field.
3. Verify that the API cluster objects that should reference this service have their name field set to match the above fields.
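
The name comparisons in the last two checks can be sketched in Python. This is an illustration only: it assumes you have already collected the Sidecar's XDS_CLUSTER value and loaded the proxy and cluster objects as dictionaries. The function name find_name_mismatches and the dictionary shapes are hypothetical; only the name field comes from the text above.

```python
def find_name_mismatches(xds_cluster, proxy, clusters):
    """Describe every object whose name field differs from XDS_CLUSTER.

    xds_cluster: value of the Sidecar's XDS_CLUSTER environment variable
    proxy:       dict for the proxy API object (assumed to carry "name")
    clusters:    dicts for the cluster API objects that should reference
                 this service (assumed to carry "name")
    """
    problems = []
    if proxy.get("name") != xds_cluster:
        problems.append(
            f"proxy name {proxy.get('name')!r} != XDS_CLUSTER {xds_cluster!r}"
        )
    for cluster in clusters:
        if cluster.get("name") != xds_cluster:
            problems.append(
                f"cluster name {cluster.get('name')!r} != XDS_CLUSTER {xds_cluster!r}"
            )
    return problems


# Hypothetical values: the cluster object has a typo in its name.
for problem in find_name_mismatches(
    "fibonacci", {"name": "fibonacci"}, [{"name": "fibonaci"}]
):
    print(problem)
```

An empty result means the names agree; it does not confirm that the service is announcing with the correct labels, which still has to be checked separately.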

“Upstream connect error or disconnect/reset before headers”

A request returning “Upstream connect error or disconnect/reset before headers” indicates that the Sidecar responsible for handling the request both knows which service should receive it and has a healthy instance to send it to. However, the service did not send a response.

The most likely cause is that the microservice at the end of the request path is not yet up and running. Even though the network path is set up, this error is returned while the end server is still booting, until it is ready to accept requests.

A less common cause is that the IP address sent to one of the Sidecars along the network path is stale. This can result from incorrect service announcement or from incorrect instance resolution behind a load balancer. Check the IP addresses returned from the Sidecar's admin endpoints against the service registry (Kubernetes, Consul, DC/OS, AWS, etc.) to confirm.
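
One quick way to see whether the end server is listening yet is to attempt a plain TCP connection to it. The following Python sketch uses only the standard library; the function name is_accepting_connections, the host and the port are hypothetical placeholders for your microservice's address.

```python
import socket


def is_accepting_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts and timeouts.
        return False


# Hypothetical address of the microservice behind the Sidecar.
if is_accepting_connections("127.0.0.1", 8080):
    print("service port is open")
else:
    print("service is not accepting connections yet")
```

A successful connection only shows that the port is open. The application may still be initializing, so treat a True result as a necessary condition for readiness rather than proof of it.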
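
The comparison can be done as a simple set difference once both address lists are in hand. This is a minimal Python sketch, assuming you have already collected the addresses from the Sidecar's admin endpoints and from your service registry as address:port strings. The function name compare_addresses and the sample addresses are hypothetical.

```python
def compare_addresses(sidecar_addrs, registry_addrs):
    """Compare the Sidecar's view of a service with the registry's view.

    Returns a dict with:
      "stale":   addresses the Sidecar knows that the registry does not list
      "missing": addresses the registry lists that the Sidecar does not know
    """
    sidecar, registry = set(sidecar_addrs), set(registry_addrs)
    return {
        "stale": sorted(sidecar - registry),
        "missing": sorted(registry - sidecar),
    }


# Hypothetical data: the Sidecar still holds 10.0.0.99, which the registry dropped.
result = compare_addresses(
    ["10.0.0.12:8080", "10.0.0.99:8080"],
    ["10.0.0.12:8080", "10.0.0.13:8080"],
)
print(result)
```

Any entry under "stale" is a candidate for the stale-address problem described above.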
