Set Up Rate Limiting in Grey Matter on Kubernetes

Envoy allows you to configure how many requests / second a service can field. This is useful for mitigating DDoS attacks and otherwise making sure your server's resources aren't easily overrun.

Unlike circuit breakers, which are configured on each cluster, rate limiting is configured globally across multiple listeners or clusters. Circuit breakers are good for avoiding cascading failures caused by a bad downstream host. However, the Envoy documentation describes a scenario where they fall short:

[A] large number of hosts are forwarding to a small number of hosts and the average request latency is low (e.g., connections/requests to a database server). If the target hosts become backed up, the downstream hosts will overwhelm the upstream cluster. In this scenario it is extremely difficult to configure a tight enough circuit breaking limit on each downstream host such that the system will operate normally during typical request patterns but still prevent cascading failure when the system starts to fail. Global rate limiting is a good solution for this case.

This is because rate limiting sets a global number of requests / second for a service, independent of how many instances are configured for its cluster. This guide shows how to configure rate limiting on the edge node of a Grey Matter deployment running in an edge namespace.

Prerequisites

  1. greymatter set up with a running Fabric mesh.

  2. An existing Grey Matter deployment running on Kubernetes (tutorial)

  3. kubectl or oc set up with access to the cluster

Steps

1. Deploy Ratelimit Service

Rate limiting relies on an external service to track and regulate the current number of requests / second. For this example we use Envoy's open source rate limit service, which is based on Lyft's original rate limiting service. This blog post is a good overview of the architecture.

Begin by creating a deployment for the rate limit service by applying the following configs to Kubernetes. Note that these examples deploy the rate limit service into the default namespace and use edge as the rate limit domain – this may differ for your deployment. Additionally, be sure to replace the Redis password with your own:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit
  name: ratelimit
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimit
  template:
    metadata:
      labels:
        app: ratelimit
    spec:
      serviceAccountName: default
      containers:
        - name: ratelimit
          image: "envoyproxy/ratelimit:v1.4.0"
          imagePullPolicy: IfNotPresent
          env:
          - name: USE_STATSD
            value: "false"
          - name: LOG_LEVEL
            value: "debug"
          - name: REDIS_SOCKET_TYPE
            value: "tcp"
          - name: REDIS_URL
            value: "redis.default.svc:6379"
          - name: RUNTIME_ROOT
            value: "/"
          - name: RUNTIME_SUBDIRECTORY
            value: "ratelimit"
          - name: REDIS_AUTH
            valueFrom:
              secretKeyRef:
                name: redis-password
                key: password
          command: ["/bin/sh","-c"]
          args: ["mkdir -p /ratelimit/config && cp /data/ratelimit/config/config.yaml /ratelimit/config/config.yaml && cat /ratelimit/config/config.yaml &&  /bin/ratelimit"]
          ports:
            - name: server
              containerPort: 8081
            - name: debug
              containerPort: 6070
          volumeMounts:
            - name: ratelimit-config
              mountPath: /data/ratelimit/config
              readOnly: true
      volumes:
        - name: ratelimit-config
          configMap:
            name: ratelimit
---
kind: Service
apiVersion: v1
metadata:
  name: ratelimit
  labels:
    app: ratelimit
spec:
  ports:
    - name: server
      port: 8081
      protocol: TCP
      targetPort: 8081
    - name: debug
      port: 6070
      protocol: TCP
      targetPort: 6070
  selector:
    app: ratelimit
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ratelimit
  namespace: default
data:
  config.yaml: |-
    ---
    domain: edge
    descriptors:
      - key: path
        value: "/"
        rate_limit:
          unit: second
          requests_per_unit: 1

When applied with kubectl, this service serves as a central hub that limits requests for the domain "edge" to 1 request / second (per the requests_per_unit setting in the ConfigMap above).
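
Assuming the manifests above are saved to a file such as ratelimit.yaml (a placeholder name) and a Redis instance is already reachable at redis.default.svc:6379, creating the Redis secret and applying the configs might look like the following:

# Create the secret referenced by the deployment's secretKeyRef;
# replace the literal value with your own Redis password.
kubectl create secret generic redis-password \
  --from-literal=password='<your-redis-password>' \
  --namespace default

# Apply the Deployment, Service, and ConfigMap.
kubectl apply -f ratelimit.yaml --namespace default

# Confirm the ratelimit pod starts successfully.
kubectl get pods --namespace default -l app=ratelimit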

2. Configure Grey Matter Sidecar To Use Ratelimit Filter

In order to configure rate limiting on a sidecar, we need to give the sidecar an additional cluster that points to the rate limit service. As a convenience, the Grey Matter Sidecar allows a cluster to be defined using environment variables. The following sample values (which the sidecar receives as environment variables) define a cluster named ratelimit that points to the ratelimit service we just deployed:

...
      tcp_cluster:
        type: 'value'
        value: 'ratelimit'
      tcp_host:
        type: 'value'
        value: 'ratelimit.default.svc.cluster.local'
      tcp_port:
        type: 'value'
        value: '8081'
...

Make sure that this cluster appears on whichever sidecar you have configured. It shows up as the ratelimit cluster on the sidecar's admin endpoint at localhost:8001/clusters.
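
One quick way to check, assuming the default Envoy admin port of 8001 and using placeholders for the sidecar pod name and namespace:

# Forward the sidecar's admin port locally (pod name and namespace are placeholders).
kubectl port-forward pod/<edge-pod> 8001:8001 -n <namespace>

# In another terminal, the ratelimit cluster should appear in the cluster list.
curl -s localhost:8001/clusters | grep ratelimit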

Now let's update the listener config for the sidecar we've configured. Edit the listener in the Grey Matter CLI and add the following attributes:

...
    "active_network_filters": [
    "envoy.rate_limit"
  ],
  "network_filters": {
    "envoy_rate_limit": {
      "stat_prefix": "edge",
      "domain": "edge",
      "failure_mode_deny": true,
      "descriptors": [
        {
          "entries": [
            {
              "key": "path",
              "value": "/"
            }
          ]
        }
      ],
      "rate_limit_service": {
        "grpc_service": {
          "envoy_grpc": {
            "cluster_name": "ratelimit"
          }
        }
      }
    }
  },
...
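
For reference, a typical workflow with the greymatter CLI might look like the sketch below; edge-listener is a placeholder for your actual listener key, and the exact verbs may differ between CLI versions:

# Inspect the current listener object (listener key is a placeholder).
greymatter get listener edge-listener

# Open the listener in an editor, add the active_network_filters and
# network_filters attributes shown above, then save to push the change.
greymatter edit listener edge-listener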

After this change, the localhost:8001/clusters endpoint on the sidecar should show successful requests to the ratelimit cluster. You should also see logs in the ratelimit pod for every request, since the log level is set to debug. If more than 1 request arrives within a second, the excess requests are blocked until the next one-second window.

3. Trust but Verify

The rate limit we set (1 request / second) can be tested by spamming the sidecar with a series of curl requests to the edge. Once the limit is exceeded, the response code should be 429, although if TLS is enabled this may surface as an SSL error instead.
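
A minimal sketch of such a test, with the edge host and port as placeholders:

# Fire ten requests in quick succession; -k skips TLS verification for testing only.
for i in $(seq 1 10); do
  curl -sk -o /dev/null -w "%{http_code}\n" https://<edge-host>:<edge-port>/
done

The first request in each one-second window should succeed and the rest should return 429. You should also see something like the following logs in the ratelimit pod: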

time="2020-09-03T21:54:40Z" level=debug msg="returning normal response"
time="2020-09-03T21:54:40Z" level=debug msg="cache key: edge_path_/_1599170080 current: 1"
time="2020-09-03T21:54:40Z" level=debug msg="returning normal response"
time="2020-09-03T21:54:40Z" level=debug msg="starting get limit lookup"
time="2020-09-03T21:54:40Z" level=debug msg="looking up key: path_/"
time="2020-09-03T21:54:40Z" level=debug msg="found rate limit: path_/"
time="2020-09-03T21:54:40Z" level=debug msg="starting cache lookup"
time="2020-09-03T21:54:40Z" level=debug msg="looking up cache key: edge_path_/_1599170080"
time="2020-09-03T21:54:40Z" level=debug msg="cache key: edge_path_/_1599170080 current: 3"

This shows that the ratelimit service is registering calls from edge and checking them against the configured maximum of 1 request / second. If you wish to raise the limit for a production environment, 100 requests / second is a reasonable starting point.
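
If you do raise the limit, update requests_per_unit in the ratelimit ConfigMap and restart the deployment; the startup command shown earlier copies the config file only once, when the container starts. A sketch, assuming the default namespace used above:

# Change requests_per_unit (e.g. from 1 to 100) in the ConfigMap.
kubectl edit configmap ratelimit --namespace default

# Restart so the container copies in the updated config at startup.
kubectl rollout restart deployment ratelimit --namespace default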
