Optimizing Kubernetes Autoscaling with Karpenter
Autoscaling is a fundamental feature of Kubernetes, ensuring that workloads receive the required compute resources dynamically. Traditionally, Kubernetes provides the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA) to scale applications based on CPU and memory metrics. However, scaling pods only helps while the existing nodes have spare capacity; once they run out, a cluster autoscaler is needed to provision additional nodes.
Cluster Autoscaler (CA) solutions are widely used across cloud providers like AWS, Azure, and GCP. However, they come with significant drawbacks, including slow provisioning times (up to 10 minutes) and rigid instance type management. Karpenter is an open-source autoscaler designed to overcome these limitations by dynamically provisioning nodes and optimizing cluster efficiency in real time.
Challenges of Traditional Cluster Autoscalers
When workloads scale dynamically, cluster autoscalers often struggle with the following challenges:
- Fixed Node Pool Constraints: Managed node groups require predefined instance types, limiting flexibility.
- Slow Scaling: Cluster Autoscaler follows a multi-step API process, significantly delaying new node provisioning.
- Inefficient Resource Utilization: CA provisions nodes based on fixed scaling groups, often leading to underutilized resources.
- Lack of Cost Optimization: Scaling is often inefficient, leading to unnecessary costs.
What is Karpenter?
Karpenter is a next-generation Kubernetes autoscaler that efficiently provisions and deprovisions nodes directly through the cloud provider’s API, eliminating the need for managed node groups. Key advantages of Karpenter include:
- Faster Scaling: Bypasses the traditional node group API calls, reducing provisioning time.
- Dynamic Instance Selection: Automatically provisions the optimal instance type based on workload requirements.
- Improved Cost Efficiency: Uses consolidation strategies to optimize node utilization and reduce waste.
- Multi-Cloud Support: Supports AWS, Azure, Alibaba Cloud, and Cluster API-based providers.
How Karpenter Works
Karpenter continuously monitors unschedulable pods and provisions nodes that meet their constraints. It operates in four key phases (the kubectl commands after this list show how to observe each one):
- Monitoring: Watches for pending pods that cannot be scheduled due to resource constraints.
- Evaluation: Matches pod requirements with available instance types, taking into account CPU, memory, affinity, and taints.
- Provisioning: Launches the most efficient compute resources directly through the cloud provider’s API.
- Deprovisioning: Detects underutilized nodes and consolidates workloads to reduce unnecessary compute usage.
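Assuming Karpenter runs in the kube-system namespace (as in the installation step below), each phase can be observed with a few kubectl commands:

# Pods Karpenter is trying to place
kubectl get pods --all-namespaces --field-selector=status.phase=Pending

# NodeClaims created in response, and the nodes they become
kubectl get nodeclaims
kubectl get nodes -l karpenter.sh/nodepool

# Controller logs showing provisioning and consolidation decisions
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter -f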
Core Karpenter Components
Karpenter introduces new Custom Resource Definitions (CRDs) to manage autoscaling:
- Node Class: Defines cloud-provider-specific node configuration (e.g., EC2NodeClass on AWS, AKSNodeClass on Azure).
- Node Pool: Defines constraints for provisioning and scaling nodes dynamically.
- Node Claim: Created dynamically by Karpenter when new nodes are required, based on workload demand.
Deploying Karpenter
1. Install Karpenter
# Log out of the public ECR registry first; stale anonymous credentials can cause 403 errors when pulling the chart
helm registry logout public.ecr.aws

# Install (or upgrade) Karpenter using Helm with the provided values.yaml file
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace kube-system \
  --create-namespace \
  --values values.yaml \
  --wait
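The chart is driven by values.yaml, which ships with the demo repository. As a rough illustration of the kind of settings it typically carries (the cluster name, queue name, and role ARN below are placeholders rather than values from the repository):

# Sketch of a Karpenter values.yaml; replace placeholders with your own values
settings:
  clusterName: my-cluster          # EKS cluster Karpenter manages
  interruptionQueue: my-cluster    # SQS queue for Spot interruption handling (optional)
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/KarpenterControllerRole  # IRSA role
controller:
  resources:
    requests:
      cpu: "1"
      memory: 1Gi

Once the release is installed, kubectl get pods -n kube-system -l app.kubernetes.io/name=karpenter should show the controller pods running before you move on to the NodePool configuration.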
2. Configure a Node Pool
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: apps
spec:
  template:
    metadata:
      labels:
        app-type: default
    spec:
      nodeClassRef:
        name: ec2nodeclass-apps
        kind: EC2NodeClass
        group: karpenter.k8s.aws
      requirements:
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values:
            - m5.4xlarge
            - c5.4xlarge
            - r5.2xlarge
        - key: "karpenter.sh/capacity-type"
          operator: In
          values:
            - on-demand
        - key: "kubernetes.io/arch"
          operator: In
          values:
            - amd64
      # In the karpenter.sh/v1 API node expiry lives on the template spec;
      # "Never" disables automatic node expiration
      expireAfter: Never
  limits:
    cpu: "32"
    memory: "128Gi"
  disruption:
    consolidateAfter: 30s
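The NodePool above refers to an EC2NodeClass named ec2nodeclass-apps, which holds the AWS-specific node configuration. The full definition is in the demo repository; a minimal sketch of such a resource could look like the following, where the AMI alias, node IAM role, and discovery tag value are placeholders to replace with your own:

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: ec2nodeclass-apps
spec:
  # AMI selection; the alias tracks the latest Amazon Linux 2023 image (placeholder choice)
  amiSelectorTerms:
    - alias: al2023@latest
  # IAM role assumed by the provisioned nodes (placeholder name)
  role: KarpenterNodeRole-my-cluster
  # Subnets and security groups are discovered by tag (placeholder tag value)
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster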
3. Deploy Workloads
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      nodeSelector:
        app-type: default
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: "node.kubernetes.io/instance-type"
                    operator: In
                    values:
                      - m5.4xlarge
                      - c5.4xlarge
                      - r5.2xlarge
                  - key: "karpenter.sh/capacity-type"
                    operator: In
                    values:
                      - on-demand
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: example-app
              topologyKey: "kubernetes.io/hostname"
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: example-app
                topologyKey: "topology.kubernetes.io/zone"
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "apps"
          effect: "NoSchedule"
      containers:
        - name: example-container
          image: nginx
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
Observability in Karpenter
Karpenter provides observability through metrics endpoints, allowing you to monitor node provisioning efficiency. By integrating with Prometheus and Grafana, you can track key metrics such as:
- Pending pods waiting for scheduling
- Node provisioning times
- Resource utilization across nodes
- Consolidation efficiency
Example queries for these signals (metric names assume kube-state-metrics and node-exporter are installed alongside Karpenter):

# Pending pods per namespace, waiting for capacity
sum(kube_pod_status_phase{phase="Pending"}) by (namespace)

# 95th-percentile node provisioning time over the last 5 minutes
histogram_quantile(0.95, rate(karpenter_provisioning_duration_seconds_bucket[5m]))

# CPU utilization (%) per node, from node-exporter
(1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) * 100

# Memory utilization (%) per node, from node-exporter
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes)
/
node_memory_MemTotal_bytes * 100

# Nodes removed by consolidation over the last 5 minutes
sum(increase(karpenter_nodes_deleted_total[5m]))
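If the Prometheus Operator is available, the same queries can drive alerts. A sketch of a PrometheusRule that fires when pods stay pending for an extended period (the threshold, namespace, and labels are illustrative):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: karpenter-capacity-alerts
  namespace: kube-system
spec:
  groups:
    - name: karpenter.capacity
      rules:
        - alert: PodsPendingTooLong
          # Pods stuck in Pending for 10 minutes usually mean Karpenter cannot
          # satisfy their constraints (instance types, limits, capacity type)
          expr: sum(kube_pod_status_phase{phase="Pending"}) > 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: Pods have been pending for more than 10 minutes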
Karpenter vs Cluster Autoscaler
| Feature | Karpenter | Cluster Autoscaler |
| --- | --- | --- |
| Provisioning Time | Seconds (direct API calls) | Minutes (multi-layer API) |
| Node Type Flexibility | Dynamic and adaptive | Predefined node pools |
| Cost Optimization | Yes (Spot instances, consolidation) | Limited |
| Multi-Cloud Support | AWS, Azure, Alibaba Cloud, Cluster API | AWS, GCP, Azure |
| Node Consolidation | Yes | No |
Advanced Use Cases for SREs
Karpenter provides additional capabilities that are useful for Site Reliability Engineers (SREs) managing large-scale infrastructure:
- Custom Node Expiration Policies: Automatically recycle nodes based on security requirements.
- Drift Detection and Automated Remediation: Detect configuration drifts and replace outdated nodes.
- Spot Instance Optimization: Configure graceful fallback to on-demand instances when spot capacity is unavailable (see the sketch after this list).
- Multi-Region Failover Strategies: Define region-based affinity rules to optimize failover mechanisms.
- Custom Scheduling Algorithms: Use Karpenter’s flexible configuration to optimize for latency, cost, or performance.
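Several of these capabilities are plain NodePool settings. For instance, Spot fallback and node recycling can both be expressed in the template spec; the values below are illustrative:

spec:
  template:
    spec:
      requirements:
        # Allowing both capacity types lets Karpenter prefer Spot pricing and
        # fall back to on-demand when Spot capacity is unavailable
        - key: "karpenter.sh/capacity-type"
          operator: In
          values:
            - spot
            - on-demand
      # Recycle nodes on a schedule, e.g. to roll out patched AMIs
      expireAfter: 720h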
Demo: Hands-on with Karpenter
To help you get started with Karpenter, I have prepared a demo repository that showcases a complete setup, including:
- IaC (via OpenTofu) configuration to provision an EKS cluster.
- Helm values for Karpenter installation with optimized settings.
- Node Pool and EC2 Node Class configurations for efficient autoscaling.
- Example workloads to test Karpenter’s dynamic provisioning.
GitHub Repository 🔗 Karpenter Demo Repository
Conclusion
Karpenter is a game-changer for Kubernetes autoscaling, addressing the shortcomings of traditional Cluster Autoscaler solutions. With its direct API calls, faster provisioning, dynamic instance selection, and intelligent workload consolidation, Karpenter enhances both performance and cost efficiency in cloud-native environments.
For SRE teams looking to improve cluster efficiency, reduce provisioning times, and optimize compute costs, Karpenter is a powerful alternative to traditional scaling solutions.