Scaling Kubernetes Deployments in AWS with Container Insights Metrics

Harry Tsiligiannis

Kubernetes supports several autoscaling mechanisms -

  • Horizontal pod autoscaler - adjusts the number of replicas of an application.
  • Vertical pod autoscaler - adjusts the resource requests and limits of a container.
  • Cluster autoscaler - adjusts the number of nodes of a cluster.

Each of these components scales a different aspect of the infrastructure and addresses a different use case.

To make use of these autoscaling mechanisms, some form of metrics is required. The simplest solution is to set up the metrics server, which enables autoscaling based on CPU and memory. However, what happens if your metrics are hosted by a third-party or external service?

Support for external metrics was introduced in Kubernetes v1.10. AWS provides CloudWatch Container Insights as a solution to collect, aggregate, and summarise metrics and logs from your containerized applications and microservices. In this article, we are going to cover how to leverage metrics from Container Insights to scale apps horizontally.

Prerequisites

Before commencing, the following is required -

  • An AWS account with permissions to create EKS and IAM resources
  • eksctl installed
  • kubectl installed
  • The AWS CLI installed and configured

Getting started

Create an EKS cluster by running the following command.

eksctl create cluster \
--name test-cluster \
--with-oidc \
--managed
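
Once the cluster has been provisioned, you can confirm that the worker nodes have joined it:

kubectl get nodes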

Setting up the CloudWatch Agent to collect cluster metrics

Before we deploy the CloudWatch agent to the cluster, we must grant IAM permissions that enable the Amazon EKS worker nodes to send metrics to CloudWatch. We can achieve this by utilising eksctl to create the necessary IAM Role and Service Account.

eksctl create iamserviceaccount \
    --name=cloudwatch-agent \
    --namespace=amazon-cloudwatch \
    --cluster=test-cluster \
    --attach-policy-arn=arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
    --approve
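
You can verify that the service account was created and annotated with the IAM role by running the following command. The output should include an eks.amazonaws.com/role-arn annotation pointing at the newly created role.

kubectl describe serviceaccount cloudwatch-agent -n amazon-cloudwatch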

Next, create a file named cwagent-configmap.yaml and paste the following contents into it.

apiVersion: v1
data:
  cwagentconfig.json: |
    {
      "logs": {
        "metrics_collected": {
          "kubernetes": {
            "metrics_collection_interval": 60
          }
        },
        "force_flush_interval": 5
      }
    }
kind: ConfigMap
metadata:
  name: cwagentconfig
  namespace: amazon-cloudwatch

Afterwards, create the ConfigMap in the cluster by running the following command.

kubectl apply -f cwagent-configmap.yaml

Finally, deploy the CloudWatch agent as a DaemonSet using the following command.

kubectl apply -f https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/cwagent/cwagent-daemonset.yaml
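
The agent runs as one pod per worker node. You can confirm that the DaemonSet pods are running with:

kubectl get pods -n amazon-cloudwatch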

From the data that the agent exports, CloudWatch creates aggregated metrics at the cluster, node, pod, task, and service level. You can find a full list of the metrics that are collected for Amazon EKS and Kubernetes here.
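
As a quick sanity check, you can list the metrics that Container Insights has published for the cluster with the AWS CLI:

aws cloudwatch list-metrics \
    --namespace ContainerInsights \
    --dimensions Name=ClusterName,Value=test-cluster

To fully utilise these metrics for autoscaling, we will need to install an additional component.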

The k8s-cloudwatch-adapter is an implementation of the Kubernetes External Metrics API with integration for CloudWatch metrics. It allows you to scale your Kubernetes deployment using the Horizontal Pod Autoscaler (HPA) with CloudWatch metrics.

This adapter requires the cloudwatch:GetMetricData permission to access metric data from Amazon CloudWatch.

You can create an IAM policy using this template, and attach it to the Service Account Role if you are using IAM roles for service accounts.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricData"
            ],
            "Resource": "*"
        }
    ]
}

Save the policy document above to a file named policy, then create the policy by running the following command.

aws iam create-policy --policy-name k8s-cloudwatch-adapter --policy-document file://policy

Create the IAM Role and service account using the following command, replacing <account-id> with your AWS account ID (the policy created above is a customer managed policy, so its ARN includes your account ID).

eksctl create iamserviceaccount \
    --name=k8s-cloudwatch-adapter \
    --namespace=custom-metrics \
    --cluster=test-cluster \
    --attach-policy-arn=arn:aws:iam::<account-id>:policy/k8s-cloudwatch-adapter \
    --approve

You can now deploy the k8s-cloudwatch-adapter to your Kubernetes cluster.

kubectl apply -f https://raw.githubusercontent.com/awslabs/k8s-cloudwatch-adapter/master/deploy/adapter.yaml
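
Verify that the adapter is up and running before proceeding:

kubectl get pods -n custom-metrics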

Testing the setup

To demonstrate the horizontal pod autoscaler in action, we will use a custom Docker image based on the php-apache image.

Deploy the application with the following command:

kubectl apply -f https://k8s.io/examples/application/php-apache.yaml
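
You can confirm that the deployment was created successfully:

kubectl get deployment php-apache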

Set up External metric and HPA

Create a file named externalmetric.yaml and paste the following contents into it.

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
    name: php-apache-cpu-utilization
spec:
    name: php-apache-cpu-utilization
    resource:
      resource: "deployment"
    queries:
      - id: php_apache_pod_cpu
        metricStat:
          metric:
            namespace: "ContainerInsights"
            metricName: "pod_cpu_utilization"
            dimensions:
              - name: PodName
                value: "php-apache"
              - name: Namespace
                value: "default"
              - name: ClusterName
                value: "hpa-cluster"
          period: 60
          stat: Average
          unit: Percent
        returnData: true

Deploy the metric by executing the following.

kubectl apply -f externalmetric.yaml

Then, verify that the metric resource has been registered successfully by querying the External Metrics API with the following command.

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
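
You can also query the value of the metric itself. External metrics are served per namespace, so the request below assumes php-apache is running in the default namespace:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/php-apache-cpu-utilization"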

Next, create a file named hpa.yaml and paste the following contents to it.

kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: php-apache-scaler
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: External
    external:
      metric:
        name: php-apache-cpu-utilization
      target:
        type: AverageValue
        averageValue: 50

Finally, apply these changes by running the following command.

kubectl apply -f hpa.yaml
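
To inspect the state of the autoscaler, including the current metric value and scaling events, run:

kubectl describe hpa php-apache-scaler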

Generate load on the service

Next, we will evaluate how the deployment responds when load increases. We will start a container and send an infinite loop of queries to the php-apache service:

kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Within a few minutes, we should see the increased CPU load reflected by executing:

kubectl get hpa
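
Alternatively, you can watch the autoscaler and follow the replica count as it changes. Once the load generator is stopped, the replica count should eventually drop back to the minimum:

kubectl get hpa php-apache-scaler --watch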

Coming back to the php-apache-cpu-utilization metric, the flow in this example works as follows -

  1. Metrics for the cluster are automatically collected by the CloudWatch agent and pushed to CloudWatch.
  2. The CloudWatch adapter obtains the collected metric data from CloudWatch and serves it as a metric named php-apache-cpu-utilization through the External Metrics API.
  3. The HPA queries the php-apache-cpu-utilization metric from the External Metrics API and uses it to autoscale the php-apache Deployment.

Finally, if the CloudWatch metrics are stored in a different account from the one where your cluster is operating, the k8s-cloudwatch-adapter supports retrieving CloudWatch metrics from another AWS account using IAM roles. You can configure cross-account metrics by following the steps here.

Tip: Configurable scaling behavior

One concern with autoscaling is thrashing - the number of pods constantly fluctuating due to short-lived changes in load. In earlier versions of Kubernetes this problem was mitigated with an upscale delay. Since v1.12, an algorithmic update removes the need for the upscale delay, while downscaling can be dampened via the --horizontal-pod-autoscaler-downscale-stabilization flag. However, that flag must be set on the kube-controller-manager, which is not configurable within EKS.

From Kubernetes v1.18, the autoscaling/v2beta2 API allows scaling behaviour to be configured through the HPA behavior field, making this tuning possible on EKS. Behaviours are defined independently for scaling up and down in the scaleUp and scaleDown sections respectively, as shown in the sketch below.
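
As an illustrative sketch, the php-apache-scaler above could be extended with a behavior block under spec. The windows and policy values below are arbitrary examples rather than recommendations.

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300  # act on the most conservative recommendation from the last 5 minutes
    policies:
    - type: Pods
      value: 1                       # remove at most one pod
      periodSeconds: 60              # per minute
  scaleUp:
    stabilizationWindowSeconds: 0    # scale up immediately
    policies:
    - type: Percent
      value: 100                     # at most double the replica count
      periodSeconds: 15              # every 15 seconds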

Finally, although only a limited set of the metrics collected by the CloudWatch agent is exposed in Container Insights, you can leverage additional integrations with StatsD and Prometheus to make more metrics available through CloudWatch.