If you've ever had a pod evicted mysteriously or a node that's "full" despite showing available resources, the problem is almost certainly in your resource configuration.
Requests vs limits — the core concept
Requests are what the scheduler uses to decide where to place pods: a node is only eligible if its unreserved capacity covers the pod's requests. In that sense requests are a guarantee — the scheduler sets aside this much CPU and memory on the node for the container.
Limits are the maximum a container can use. Exceeding memory limits causes an OOM kill. Exceeding CPU limits causes throttling.
resources:
  requests:
    cpu: "250m"      # 0.25 CPU cores guaranteed
    memory: "256Mi"  # 256 MiB guaranteed
  limits:
    cpu: "500m"      # can burst up to 0.5 cores
    memory: "512Mi"  # OOM killed if usage exceeds 512 MiB
The QoS classes
Kubernetes assigns Quality of Service classes based on your resource config:
Guaranteed: Requests equal limits for all containers. These pods are the last to be evicted.
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
Burstable: At least one container has requests or limits set, but the pod doesn't meet the Guaranteed criteria (for example, limits differ from requests). These pods are evicted before Guaranteed pods.
BestEffort: No requests or limits set. These pods are evicted first when the node runs low on resources.
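For contrast with the Guaranteed example above, here's a minimal Burstable spec (BestEffort is simply the absence of any resources block, so it needs no example):

```yaml
resources:
  requests:
    cpu: "250m"      # requests are set...
    memory: "256Mi"
  limits:
    cpu: "500m"      # ...but limits differ from requests, so the pod is Burstable
    memory: "512Mi"
```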
Common mistakes
Mistake 1: No resource requests
# Don't do this in production
containers:
  - name: api
    image: my-api:latest
    # No resources specified = BestEffort
Without requests, the scheduler can pack pods onto a node beyond its capacity. Under load, pods get evicted unpredictably.
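A minimal fix is to at least set requests, even if you hold off on limits for now (the values here are illustrative, not a recommendation):

```yaml
containers:
  - name: api
    image: my-api:latest
    resources:
      requests:          # gives the scheduler something to account for
        cpu: "250m"
        memory: "256Mi"
```

With requests in place the pod is at least Burstable, so it's no longer first in line for eviction.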
Mistake 2: Requests too high
Setting CPU request to 2 cores when your app uses 200m means 1800m of CPU is reserved but unused. Multiply by 50 pods and you've wasted 90 cores of cluster capacity.
Mistake 3: Memory limits too low
Memory is incompressible — when a container exceeds its memory limit, it gets OOM killed. Set memory limits too aggressively and you'll see constant restarts.
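One mitigation is to size the memory limit with explicit headroom over the usage you've actually observed, rather than over the steady state (numbers here are hypothetical):

```yaml
resources:
  requests:
    memory: "512Mi"   # observed steady-state usage (hypothetical)
  limits:
    memory: "768Mi"   # ~50% headroom over the observed peak, to absorb spikes
```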
Mistake 4: No CPU limits (sometimes intentional)
Some teams intentionally omit CPU limits to allow bursting:
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    # No CPU limit = can use all available CPU on the node
    memory: "512Mi"
This works if your nodes have spare capacity, but can cause noisy-neighbor problems.
How to determine the right values
- Deploy without limits in a staging environment
- Observe actual usage over several days under realistic load
- Set requests to the P50 (median) usage
- Set limits to the P99 usage with 20% headroom
# Check current resource usage
kubectl top pods -n production --sort-by=cpu
kubectl top pods -n production --sort-by=memory
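Applying steps 3 and 4: if staging showed a P50 of roughly 200m CPU / 300Mi memory and a P99 of roughly 400m / 450Mi, the resulting spec would look something like this (all numbers hypothetical):

```yaml
resources:
  requests:
    cpu: "200m"      # P50 observed usage
    memory: "300Mi"  # P50 observed usage
  limits:
    cpu: "480m"      # P99 (400m) + 20% headroom
    memory: "540Mi"  # P99 (450Mi) + 20% headroom
```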
Monitoring resource usage
A visual dashboard makes resource management much easier:
- See CPU and memory usage trends over time
- Identify pods that are consistently over-provisioned
- Catch resource spikes before they cause evictions
- Compare requests vs actual usage across namespaces
Setting resources correctly isn't a one-time task — it's an ongoing process that requires visibility into how your workloads actually behave in production.