# Kubernetes cost optimization for startups: 7 patterns that cut bills in half

> The 2026 cost-optimization playbook for startups running Kubernetes on AWS or GCP. Right-sizing, spot, idle sleep, namespace quotas, image pulls, NAT routing, storage classes, and the one structural change that compounds them all.

- **Published:** 2026-05-18
- **Author:** Ownkube team
- **Category:** Engineering
- **Tags:** kubernetes-cost, cloud-cost, aws, eks, k3s, startup-infrastructure
- **Canonical URL:** https://ownkube.io/blog/kubernetes-cost-optimization-startups
- **Cover:** https://ownkube.io/blog/kubernetes-cost-optimization-startups.png

---
The first surprise about a Kubernetes bill at a startup isn't how high it is. It's how predictable the leaks are. Across the small-team AWS and GCP bills we audit, the same seven patterns drive 70 to 90% of the waste, and they're all fixable in a single quarter.

This post is the consolidated playbook. We'll cover the seven patterns, the realistic savings on each, and the one structural change that compounds them: putting the cost optimization itself on autopilot.

**Skim answer:**

- **The seven highest-leverage patterns:** right-sizing, spot capacity, idle environment sleep, namespace quotas, image-pull traffic reduction, NAT-routing audits, and storage class right-sizing.
- **Combined impact:** typically cuts a small-team Kubernetes bill by 40 to 65%.
- **Timeline:** all seven are fixable in a single quarter.

## Why startup K8s bills are usually 2x what they should be

The math behind the typical waste:

- Most teams set resource requests at 2x to 4x what the workload actually uses, "to be safe". Cluster autoscaler then provisions nodes for the inflated request, not the real usage.
- Idle preview environments and staging clusters run 24/7 even though they're used for roughly 30 of the 168 hours in a week.
- Container images get pulled from public registries through NAT gateways, racking up data-processing fees nobody sees.
- EBS volumes are provisioned on gp2 at default sizes that exceed real capacity and I/O needs by an order of magnitude.

None of these are dramatic failures. They're small consistent bleeds. Add up enough of them and the cluster bill is double what it should be.
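
To make the first bullet concrete, here is a minimal sketch of why inflated requests turn directly into nodes. The pod count, per-pod usage, and node size are hypothetical; the only point is that the autoscaler sizes the fleet from requests, not from usage.

```python
import math

def nodes_needed(pod_count, cpu_request_per_pod, allocatable_cpu_per_node):
    """Nodes required to schedule the pods, counting CPU requests only."""
    total_requested = pod_count * cpu_request_per_pod
    return math.ceil(total_requested / allocatable_cpu_per_node)

# Hypothetical fleet: 40 pods that really use about 0.25 vCPU each,
# on nodes with 3.5 vCPU allocatable after system overhead.
right_sized = nodes_needed(40, 0.25, 3.5)  # requests match usage
inflated = nodes_needed(40, 0.50, 3.5)     # requests set at 2x usage

print(right_sized, inflated)
```

With these made-up inputs the same workload needs 3 nodes when requests match usage and 6 when requests are doubled. Real schedulers also account for memory, daemonsets, and bin-packing, so treat this as a lower bound on the effect, not a forecast.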

## Pattern 1: Right-size every workload

The single largest lever. Most workloads we see have resource requests that were set during initial deployment and never revisited. A typical web pod might have a 1 vCPU / 1 GB RAM request and a 95th-percentile usage of 0.18 vCPU / 240 MB RAM. Roughly 80% of the request is reservation that is paid for and unused.

**The fix**: run [Vertical Pod Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler) in recommendation mode (`updateMode: "Off"`) for one to two weeks, then apply the recommendations. For long-tail workloads with bursty traffic, use VPA in `Auto` mode with sensible `minAllowed` / `maxAllowed` bounds, keeping in mind that VPA applies changes by evicting and recreating pods.

**Realistic savings**: 25 to 45% of cluster compute. The single biggest line-item improvement on every audit.
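
If you want to sanity-check a recommendation by hand, the arithmetic is simple. The sketch below is not how VPA computes its recommendations; it is a simplified stand-in that takes a percentile of observed usage and adds headroom. The 20% headroom and the sample values are assumptions.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def recommend_request(samples, pct=95, headroom=0.20):
    """Suggested request: the chosen percentile of usage plus a safety margin."""
    return percentile(samples, pct) * (1 + headroom)

def unused_fraction(request, observed):
    """Share of a request that is reserved but not used."""
    return 1 - observed / request

# Figures from the web pod example above (1 GB treated as 1000 MB).
cpu_unused = unused_fraction(1.0, 0.18)
mem_unused = unused_fraction(1000, 240)

# Hypothetical CPU samples in vCPU, e.g. exported from Prometheus.
suggested_cpu = recommend_request([0.10, 0.12, 0.15, 0.18])

print(f"CPU unused: {cpu_unused:.0%}, memory unused: {mem_unused:.0%}")
print(f"Suggested CPU request: {suggested_cpu:.3f} vCPU")
```

For the example pod this gives 82% unused CPU and 76% unused memory, which is where the "roughly 80%" figure comes from.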

## Pattern 2: Spot capacity for the workloads that tolerate it

Stateless web pods, queue consumers, build runners, batch jobs, and preview environments are excellent candidates for spot. Database primaries, control plane nodes, and single-replica stateful services are not. We covered the full pattern in [AWS spot instances in production](/blog/aws-spot-instances-production-guide).

**The fix**: mixed instance pools with an on-demand base plus spot for the rest. Karpenter or [Cluster Autoscaler with mixed ASGs](https://docs.aws.amazon.com/autoscaling/ec2/userguide/asg-purchase-options.html) handles the orchestration. Set a `PodDisruptionBudget` per service and a 30 to 120 second graceful shutdown window (`terminationGracePeriodSeconds`), which fits inside the two-minute interruption notice AWS gives for spot instances.

**Realistic savings**: 50 to 70% off the spot-eligible portion of the compute bill. For a typical small-team workload, that translates to roughly 30 to 50% off the total cluster compute.
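
The step from "50 to 70% off the spot-eligible portion" to "30 to 50% off total compute" is just a multiplication. A quick sketch, assuming 60 to 70% of compute spend is spot-eligible (that share is our assumption for illustration, not a measured figure):

```python
def total_compute_saving(spot_eligible_share, spot_discount):
    """Saving on the whole compute bill when only part of it moves to spot."""
    return spot_eligible_share * spot_discount

# Assumed eligible share of 60 to 70%, discount range of 50 to 70% from the text.
low = total_compute_saving(0.60, 0.50)
high = total_compute_saving(0.70, 0.70)

print(f"Total compute saving: {low:.0%} to {high:.0%}")
```

If most of your fleet is stateful or single-replica, the eligible share drops and so does the total. Measure the share before quoting a number internally.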

## Pattern 3: Sleep idle environments

Staging, preview, and developer sandbox environments sit idle the majority of the week. A preview environment created on Monday for a PR that merges Friday runs for ~120 hours while being used for maybe 4 hours. The other 116 hours are pure waste.

**The fix**: implement an idle-detection loop that scales preview deployments to zero after N hours of no traffic, then scales back up on the next request. Tools like KEDA, scaler-controller, or platform-layer features handle this.

**Realistic savings**: 50 to 75% of preview environment cost. For teams with 5 to 20 active previews at any time, this can be the second-biggest single-pattern saving.
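
The core of an idle-detection loop is a single comparison. The sketch below shows only that decision, with hypothetical function names; wiring it to real traffic metrics and to the Deployment's replica count is left to whichever tool you pick.

```python
from datetime import datetime, timedelta, timezone

def desired_replicas(last_request_at, now, idle_after, active_replicas=1):
    """Return 0 once the environment has been idle for `idle_after`, else keep it up."""
    if now - last_request_at >= idle_after:
        return 0
    return active_replicas

def idle_share(total_hours, used_hours):
    """Fraction of an environment's lifetime during which nobody used it."""
    return (total_hours - used_hours) / total_hours

now = datetime(2026, 5, 18, 12, 0, tzinfo=timezone.utc)
stale = desired_replicas(now - timedelta(hours=5), now, timedelta(hours=4))
fresh = desired_replicas(now - timedelta(hours=1), now, timedelta(hours=4))

print(stale, fresh)
print(f"Idle share for the PR example: {idle_share(120, 4):.0%}")
```

Scaling back up "on the next request" needs something in the request path that can hold or retry the request while the pod starts, so budget for a cold-start delay on the first hit.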

## Pattern 4: Namespace quotas and request governance

Without `ResourceQuota` on each namespace, any service team can ship a deployment that requests more than it needs. Quotas force the conversation: "you want 8 vCPU for this worker, defend it."

**The fix**: a `ResourceQuota` per namespace pinned to a realistic budget. A `LimitRange` to set sensible default requests for pods without explicit requests. A monthly review where the quotas are revisited.

**Realistic savings**: 10 to 20%, realized indirectly through culture change. The bigger win is preventing future drift.
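
As a starting point, the two objects can be generated per namespace rather than hand-written. This is a minimal sketch: the namespace name, object names, and every quantity are hypothetical placeholders, and it emits JSON, which `kubectl apply -f -` accepts.

```python
import json

def namespace_guardrails(namespace, cpu_requests, memory_requests,
                         default_cpu, default_memory):
    """Build a ResourceQuota and a LimitRange for one namespace as a v1 List."""
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": "compute-budget", "namespace": namespace},
        # Hard cap on the sum of requests across all pods in the namespace.
        "spec": {"hard": {"requests.cpu": cpu_requests,
                          "requests.memory": memory_requests}},
    }
    limit_range = {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": "default-requests", "namespace": namespace},
        # Default requests applied to containers that do not set their own.
        "spec": {"limits": [{"type": "Container",
                             "defaultRequest": {"cpu": default_cpu,
                                                "memory": default_memory}}]},
    }
    return {"apiVersion": "v1", "kind": "List", "items": [quota, limit_range]}

# Hypothetical budget for a hypothetical team namespace.
manifest = namespace_guardrails("team-payments", "8", "16Gi", "100m", "128Mi")
print(json.dumps(manifest, indent=2))
```

Once a quota on `requests.cpu` or `requests.memory` exists, pods without those requests are rejected, which is why the `LimitRange` defaults ship in the same manifest.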

## Pattern 5: Pull images from a private registry

Most teams pull container images from Docker Hub, GitHub Container Registry, or Quay through a NAT gateway. Each pull can be a few hundred MB. With a busy CI fleet plus rolling production deploys, that's tens of GB per day routed through NAT at $0.045 per GB.

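**The fix**: mirror your base images to a private [Amazon ECR](https://aws.amazon.com/ecr/) registry (or GCP Artifact Registry). On AWS, enable the ECR interface endpoints (`ecr.api` and `ecr.dkr`) plus the S3 gateway endpoint that serves image layers, and point your image references at the mirror. Note that `imagePullPolicy` only controls when an image is pulled, not where from, so the registry change has to happen in the image name or in the container runtime's mirror configuration. Image pulls now stay on the AWS backbone at near-zero per-GB cost.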

**Realistic savings**: $50 to $400 per month for a typical small team, more for high-deploy-rate orgs.
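
To estimate your own number, multiply pull volume by the NAT data-processing rate. The sketch below uses the $0.045 per GB figure from above; the pull count and image size are hypothetical, and it ignores layer caching on nodes, so it overstates the cost for fleets with long-lived nodes.

```python
NAT_PRICE_PER_GB = 0.045  # data-processing rate quoted in the text

def monthly_nat_cost_for_pulls(pulls_per_day, image_size_gb, days=30):
    """Monthly NAT data-processing cost if every pull crosses the NAT gateway."""
    return pulls_per_day * image_size_gb * NAT_PRICE_PER_GB * days

# Hypothetical: 100 uncached pulls per day at 0.4 GB each, i.e. 40 GB per day.
cost = monthly_nat_cost_for_pulls(100, 0.4)
print(f"${cost:.2f} per month")
```

These inputs come to $54 per month, at the low end of the range above. A CI fleet on short-lived spot nodes, where little is cached, climbs quickly from there.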

## Pattern 6: Audit NAT routing

NAT gateway data-processing fees are the most under-noticed cost on AWS bills. We covered the full pattern in our [NAT gateway cost guide](/blog/aws-nat-gateway-cost-fix). The short version:

- Enable S3 and DynamoDB gateway endpoints (free) on every VPC.
- Add interface endpoints for high-volume AWS services (Secrets Manager, STS, CloudWatch Logs).
- Dual-stack your VPC and route IPv6 traffic through a free egress-only IGW.
- Reduce the NAT topology from 3 AZs to 2 if your availability requirements tolerate it.

**Realistic savings**: $200 to $1,500 per month depending on cluster traffic.
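
A rough model of the bullets above: a NAT bill has an hourly part per gateway and a per-GB part, and gateway endpoints remove S3 and DynamoDB traffic from the per-GB part. In this sketch the per-GB rate is the $0.045 quoted earlier; the hourly rate, the 730-hour month, and the traffic volumes are assumptions you should replace with your region's pricing and your own flow logs.

```python
FREE_GATEWAY_ENDPOINTS = {"s3", "dynamodb"}  # destinations served by gateway endpoints

def nat_gb_after_gateway_endpoints(traffic_gb_by_destination):
    """GB still crossing NAT once S3 and DynamoDB use gateway endpoints."""
    return sum(gb for dest, gb in traffic_gb_by_destination.items()
               if dest not in FREE_GATEWAY_ENDPOINTS)

def nat_monthly_cost(gateways, gb_processed,
                     hourly_price=0.045,  # assumed hourly rate per gateway
                     per_gb_price=0.045,  # data-processing rate from the text
                     hours=730):
    """Hourly charges for every gateway plus data-processing charges."""
    return gateways * hourly_price * hours + gb_processed * per_gb_price

# Hypothetical monthly traffic in GB, grouped by destination.
traffic = {"s3": 2000, "dynamodb": 500, "internet": 1500}

before = nat_monthly_cost(3, sum(traffic.values()))
after = nat_monthly_cost(2, nat_gb_after_gateway_endpoints(traffic))

print(f"before ${before:.2f}, after ${after:.2f}, saved ${before - after:.2f}")
```

The model leaves out interface endpoint fees and any cross-AZ transfer introduced by running fewer gateways, both of which reduce the net saving.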

## Pattern 7: Storage class right-sizing

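EBS gp3 has replaced gp2 as the sensible default for most cluster volumes. It's cheaper per GB, and its performance is decoupled from size: every gp3 volume includes a baseline of 3,000 IOPS and 125 MB/s, and you pay extra only for IOPS or throughput provisioned above that (on gp2, IOPS scale with volume size).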

**The fix**: migrate from gp2 to gp3 across your cluster. Right-size volume capacity (many workloads provision 100 GB when they use 12 GB), keeping in mind that EBS volumes cannot be shrunk in place, so downsizing means copying data to a new, smaller volume. For workloads with very low I/O, consider [sc1 or st1](https://aws.amazon.com/ebs/cold-hdd/) for cold data.

**Realistic savings**: 20 to 35% on EBS line items.
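
The gp2 versus gp3 comparison for a single volume can be sketched as below. All prices here are assumed list prices for illustration (check the current EBS pricing page for your region); the 3,000 IOPS and 125 MB/s values are the gp3 baseline.

```python
def gp2_monthly(size_gb, price_per_gb=0.10):
    """gp2 cost: capacity only, with IOPS tied to volume size."""
    return size_gb * price_per_gb

def gp3_monthly(size_gb, iops=3000, throughput_mbps=125,
                price_per_gb=0.08,
                price_per_extra_iops=0.005,
                price_per_extra_mbps=0.04):
    """gp3 cost: capacity plus anything provisioned above the included baseline."""
    extra_iops = max(0, iops - 3000)
    extra_mbps = max(0, throughput_mbps - 125)
    return (size_gb * price_per_gb
            + extra_iops * price_per_extra_iops
            + extra_mbps * price_per_extra_mbps)

old = gp2_monthly(100)
new = gp3_monthly(100)  # same capacity, baseline performance
saving = 1 - new / old

print(f"gp2 ${old:.2f}, gp3 ${new:.2f}, saving {saving:.0%}")
```

Under these assumed prices a like-for-like migration saves 20% before any capacity right-sizing, which lines up with the low end of the range above.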

## The structural change that compounds them all

Each of the seven patterns above is a one-time engineering project. Implementing them all takes 2 to 4 weeks of focused work for a small team. The harder question is: what stops the drift from coming back six months later?

The honest answer is "nothing, unless someone owns it." Without an explicit owner, request inflation creeps back in, new services get deployed without quotas, new environments forget the sleep schedule, new container pulls go through NAT, and the cluster bill returns to its old shape within two quarters.

The structural fix is to put the cost watch on autopilot. At [Ownkube](https://ownkube.io) the Cost agent does exactly this, inside your own cloud account:

- **Right-sizing**: continuous VPA-style recommendations applied with safety thresholds. Sample output: "api-worker over-provisioned: 2GB allocated, 340MB peak. Right-sized. ~$18/mo saved."
- **Spot ratio tracking**: realized spot savings reported monthly. Sample output: "Spot ratio: 78%. Realized savings vs on-demand: $612 last month."
- **Idle sleep**: previews auto-scaled to zero after 4 hours of inactivity, scaled up on first request.
- **NAT and image-pull audits**: anomaly detection flags new patterns that drive NAT cost.

You still own the architectural decisions. The agents handle the recurring vigilance.

## A worked example

Take a typical 2026 SaaS running Kubernetes on AWS: 1 cluster, ~24 vCPU production fleet, 8 vCPU staging, 10 active preview environments, RDS Postgres, ElastiCache.

**Before optimization** (cluster bill only, excluding RDS and ElastiCache): ~$2,400/month cluster compute + $180 NAT + $90 EBS = **$2,670/month**.

**After applying all seven patterns**:

| Pattern | Saving |
|---|---|
| Right-sizing (Pattern 1) | -$840 |
| Spot capacity (Pattern 2) | -$520 |
| Idle environment sleep (Pattern 3) | -$280 |
| Namespace quotas + governance (Pattern 4) | -$120 |
| Private image registry (Pattern 5) | -$140 |
| NAT routing audit (Pattern 6) | -$110 |
| Storage class right-sizing (Pattern 7) | -$30 |
| **Total saved** | **-$2,040** |

**After**: ~$630/month. About a 76% reduction on the cluster bill.

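Numbers are illustrative for a defined workload, and the per-pattern split is approximate. This example lands above the typical 40 to 65% range quoted earlier; your savings will vary.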
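
If you want to rerun the table with your own figures, the arithmetic fits in a few lines. The values below are copied from the worked example; the dictionary keys are just labels.

```python
before = {"cluster compute": 2400, "NAT": 180, "EBS": 90}

savings = {
    "Right-sizing": 840,
    "Spot capacity": 520,
    "Idle environment sleep": 280,
    "Namespace quotas + governance": 120,
    "Private image registry": 140,
    "NAT routing audit": 110,
    "Storage class right-sizing": 30,
}

total_before = sum(before.values())
total_saved = sum(savings.values())
total_after = total_before - total_saved
reduction = total_saved / total_before

print(f"before ${total_before}, saved ${total_saved}, after ${total_after}")
print(f"reduction {reduction:.0%}")
```

Swap in your own bill lines and per-pattern estimates to get a first-pass target before committing engineering time.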

## Decision checklist

Before you start, confirm:

- [ ] Do you have a way to measure current per-workload resource usage (Prometheus, CloudWatch, Datadog)?
- [ ] Is your cluster on Kubernetes 1.27+ (so VPA and Karpenter work cleanly)?
- [ ] Do you have at least one engineer with a couple of weeks of focused time?
- [ ] Is there a named owner for ongoing cost vigilance after the initial pass (rather than needing it on autopilot)?

If you ticked all four, you're set up to do the work in-house. If you ticked three or fewer, consider a platform layer that runs these patterns by default.

## Closing

Kubernetes cost optimization at a small startup isn't a mysterious art. It's seven well-understood patterns plus one structural change to stop the drift from coming back. Implement the seven, and you'll halve the cluster bill within a quarter. Put the watch on autopilot, and it stays halved.

If you'd rather skip the initial work and start with the patterns already applied, Ownkube runs them by default inside your own AWS account, and the Cost agent watches the drift. [Connect your cloud and try it](https://app.ownkube.io/signup).