When it comes to running production Kubernetes workloads, scalable infrastructure isn’t just a nice-to-have; it’s a necessity for any company serious about delivering a cloud-native SaaS offering. You can manually add worker nodes in the early days of light traffic, but let’s be honest: in real-world, high-demand scenarios, that’s like putting a Band-Aid on a bullet wound.
At PointFive, we embarked on our EKS journey with high hopes and a single Managed Node Group, thinking an Amazon Linux 2 AMI node could gracefully juggle both our application pods and Kubernetes add-ons. But as our pod count exploded and demands for memory and CPU grew louder, it didn’t take long for us to realize that our data plane needed a serious upgrade, or we’d risk being outpaced by our own success!
We had to figure out the smartest way to scale our Kubernetes cluster. For horizontal node scaling, that came down to weighing Kubernetes Cluster Autoscaler against Karpenter and comparing their features side by side.
While the Kubernetes Cluster Autoscaler is the more established option, its built-in restrictions, chiefly its reliance on pre-defined node groups backed by Auto Scaling groups with fixed instance types, ultimately steered us toward Karpenter.
Launched by our partner AWS in December 2021, Karpenter is an open-source Kubernetes cluster autoscaler that breaks free from the limitations of its predecessor. It dynamically provisions the ideal compute resources based on workload demands, offering flexibility by selecting from a wide array of EC2 instance types, including budget-friendly Spot instances.
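In practice, that flexibility is expressed through NodePool requirements. Here’s a minimal, illustrative sketch (the `spot-example` name is hypothetical, and the `main` EC2NodeClass it references is the one defined later in this post): when a NodePool allows both Spot and On-Demand capacity, Karpenter prioritizes Spot and falls back to On-Demand when Spot capacity is unavailable.

```yaml
# Illustrative NodePool allowing both Spot and On-Demand capacity.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-example
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: main
      requirements:
        # Karpenter prefers Spot when both capacity types are listed
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - spot
            - on-demand
```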
Once we made our decision, we needed to execute, and we did so through a few different methods tailored to our needs. At the core of our setup is a NodePool paired with an EC2NodeClass; here is the configuration we landed on:
```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: main
spec:
  disruption:
    # Consolidate nodes that sit empty or underutilized for one minute
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
  # Cap the total capacity this NodePool is allowed to provision
  limits:
    cpu: 1k
    memory: 1000Gi
  template:
    metadata: {}
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: main
      # Recycle nodes after 30 days
      expireAfter: 720h
      requirements:
        - key: topology.kubernetes.io/zone
          operator: In
          values:
            - us-east-1a
            - us-east-1b
            - us-east-1c
        - key: kubernetes.io/arch
          operator: In
          values:
            - arm64
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - on-demand
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
        # Compute-, general-, and memory-optimized families only
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values:
            - c
            - m
            - r
        # Generation 3 or newer
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values:
            - "2"
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: main
spec:
  amiFamily: AL2
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        encrypted: true
        volumeSize: 75Gi
        volumeType: gp3
  role: ${karpenter_node_iam_role_name}
  securityGroupSelectorTerms:
    - id: "${cluster_security_group_id}"
  subnetSelectorTerms:
    - tags:
        Type: Private
  amiSelectorTerms:
    - alias: al2@latest
```
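A quick way to see this configuration in action is to deploy a workload whose resource requests outstrip the cluster’s free capacity and watch Karpenter provision nodes to fit it. The sketch below is a hypothetical smoke test (the `inflate` name and pause image are placeholders for any resource-hungry workload), not part of our production setup:

```yaml
# Hypothetical test deployment: five pods requesting 1 CPU each should
# exceed free capacity and prompt Karpenter to launch new nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"
```

Scaling the deployment back down should then let the consolidation policy above reclaim the now-empty nodes once the `consolidateAfter` window elapses.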
While Karpenter made the most sense for our setup, we still needed to consider a few things to make the most of our scaler, so our team drew up a list of key considerations to keep in mind as we launched Karpenter and continued to operate it.
As we wrap up our exploration of scaling EKS clusters cost-effectively with Karpenter, remember: This journey never truly ends. Karpenter isn’t just a tool; it’s your backstage pass to a leaner, meaner, and more efficient Kubernetes environment. And a more efficient Kubernetes environment enables better cloud cost optimization for your entire infrastructure.
So, dive in, play around with the configurations, and let Karpenter handle the heavy lifting. With the right approach, you’ll scale up without scaling out of control. Keep experimenting, stay agile, and let your cluster shine, all while staying tuned to Karpenter’s continuous evolution with new versions and features.