Set Up Rate Limiting in Grey Matter on Kubernetes
Envoy allows you to configure how many requests per second a service can field. This is useful for preventing DDoS attacks and for ensuring your server's resources aren't easily overrun.
Unlike circuit breakers, which are configured on each cluster, rate limiting is configured across multiple listeners or clusters. Overall, circuit breakers are good for avoiding cascading failures caused by a bad downstream host. However, as the Envoy documentation notes, they fall short when a:
[A] large number of hosts are forwarding to a small number of hosts and the average request latency is low (e.g., connections/requests to a database server). If the target hosts become backed up, the downstream hosts will overwhelm the upstream cluster. In this scenario it is extremely difficult to configure a tight enough circuit breaking limit on each downstream host such that the system will operate normally during typical request patterns but still prevent cascading failure when the system starts to fail. Global rate limiting is a good solution for this case.
This is because rate limiting sets a global number of requests per second across a service, independent of the number of instances configured for a cluster. This guide shows how to configure rate limiting on the edge node of a Grey Matter deployment in an edge namespace.
You will need:
An existing Grey Matter deployment running on Kubernetes, set up with a running Fabric mesh
kubectl or oc set up with access to the cluster
Rate limiting relies on an external service to regulate and track the current number of requests per second. For this example we use the open-source Envoy rate limit service, a gRPC service (originally developed at Lyft) that tracks request counts in Redis. It is a good example of this architecture.
Begin by creating a Deployment for the rate limit service by applying the following configs to Kubernetes. Note that all of these examples use edge as the namespace; this may differ for your deployment. Additionally, be sure to replace the Redis password with your own:
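As a sketch, such a deployment might look like the following, assuming the envoyproxy/ratelimit image and a ConfigMap holding the rate limit rules. Names such as ratelimit-config, the image tag, and the Redis address are illustrative placeholders, not values from your deployment:

```yaml
# Illustrative only -- names, the image tag, and the Redis password are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit-config
  namespace: edge
data:
  config.yaml: |
    domain: edge
    descriptors:
      - key: generic_key
        value: default
        rate_limit:
          unit: second
          requests_per_unit: 100
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ratelimit
  namespace: edge
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimit
  template:
    metadata:
      labels:
        app: ratelimit
    spec:
      containers:
        - name: ratelimit
          image: envoyproxy/ratelimit:master   # pin a specific tag in practice
          env:
            - name: REDIS_SOCKET_TYPE
              value: tcp
            - name: REDIS_URL
              value: redis:6379                # address of your Redis instance
            - name: REDIS_AUTH
              value: CHANGE_ME                 # replace with your own Redis password
            - name: RUNTIME_ROOT
              value: /data
            - name: RUNTIME_SUBDIRECTORY
              value: ratelimit
          volumeMounts:
            - name: config
              mountPath: /data/ratelimit/config
      volumes:
        - name: config
          configMap:
            name: ratelimit-config
```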
When applied with kubectl, this service will serve as a central hub limiting the number of requests coming from the domain "edge" to 100 requests per second.
To configure rate limiting on a sidecar, we need to give the sidecar an additional cluster that points to the rate limit service. As a convenience, the Grey Matter Sidecar allows clusters to be defined using environment variables. The following sample environment variables define a cluster named ratelimit that points to our deployed rate limit service:
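The exact variable names depend on your sidecar version, so the sketch below uses hypothetical names purely to show the shape of such a definition; consult your Grey Matter Sidecar reference for the actual spelling. Only the cluster name, address, and port are meaningful here:

```yaml
# Hypothetical variable names -- check the Grey Matter Sidecar reference
# for the exact spelling supported by your version.
env:
  - name: PROXY_STATIC_CLUSTER_NAME      # hypothetical
    value: ratelimit
  - name: PROXY_STATIC_CLUSTER_ADDRESS   # hypothetical
    value: ratelimit.edge.svc.cluster.local
  - name: PROXY_STATIC_CLUSTER_PORT      # hypothetical
    value: "8081"                        # the rate limit service's gRPC port
```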
Make sure that you can see this cluster on whichever sidecar you have configured. It's listed at localhost:8001/clusters under the ratelimit cluster.
Now let's update the listener config for the sidecar we've configured. Edit the listener in the Grey Matter CLI and add the following attributes:
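As a sketch, the added attributes might look like the following, assuming the listener object exposes active_http_filters and http_filters fields; the exact field and filter names here are an approximation, so check your Grey Matter version's listener reference before applying:

```json
{
  "active_http_filters": ["envoy.rate_limit"],
  "http_filters": {
    "envoy_rate_limit": {
      "domain": "edge",
      "timeout": "0.25s",
      "failure_mode_deny": false
    }
  }
}
```

The domain must match the domain configured in the rate limit service (here, "edge"), and failure_mode_deny controls whether requests are allowed through when the rate limit service is unreachable.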
You should see that requests to the ratelimit cluster succeed on the sidecar's localhost:8001/clusters endpoint. You should also see logs in the ratelimit pod on every request, since logging is set to debug. Whenever requests exceed the configured limit, further requests are rejected until the rate drops back under the threshold.
The rate limit we set can be tested by lowering the limit to 1 request per second and spamming the sidecar. Make a series of curl requests to the edge. The response code should be 429, although if TLS is enabled this often surfaces as an SSL error. You should also see something like the following logs in the ratelimit pod:
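For example, a quick burst of requests can be sent like this, where the edge host and port are placeholders for your deployment:

```shell
# Fire 10 rapid requests at the edge; with the limit lowered to 1 request/second,
# most of these should come back as HTTP 429 (Too Many Requests).
for i in $(seq 1 10); do
  curl -sk -o /dev/null -w "%{http_code}\n" https://<edge-host>:<edge-port>/
done
```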
This shows that the ratelimit service is registering calls from edge and checking the request rate against the configured maximum of 1 request per second. In a production environment, a higher limit such as 100 requests per second is a reasonable starting point.