Trimming the Bill: Optimizing AWS Costs for a Growing Web App
AWS bills rarely balloon because of one bad decision. They creep. A bigger instance here, a forgotten environment there, a database sized for a traffic spike that happened once. By the time someone asks "why are we spending this much?", the cost is spread across a dozen services and nobody owns the answer.
To make this concrete, picture a fictional but very typical growing e-commerce company. It runs a web app on AWS, it's profitable, and its bill has quietly climbed to about $3,500/month. Nothing is on fire, which is exactly why it never gets looked at. Let's look at it.
Here's roughly where the money goes:
| Area | Monthly | What it is |
|---|---|---|
| Compute | ~$1,400 | App servers running the site and API |
| Database | ~$750 | Managed relational database (multi-AZ) |
| Data transfer and networking | ~$500 | Egress, NAT gateways, cross-AZ traffic |
| Storage | ~$450 | Object storage, disk volumes, snapshots, backups |
| Caching and misc | ~$400 | In-memory cache, logging, monitoring |
| Total | ~$3,500 | |
The numbers are illustrative, but the shape is real: compute and database dominate, networking and storage are bigger than people expect, and there's always a "misc" pile that's quietly growing. We'll take them one at a time, and then talk about commitment-based discounts, which is where a lot of the real money is.
Compute: usually the biggest lever
Compute is where most of the bill lives, and where the easiest wins are.
- Right-size first. Most instances are provisioned for a peak that rarely arrives. Pull the actual CPU and memory utilization over a few representative weeks; if the peak is sitting at 15%, the instance is probably two sizes too big. This alone often reclaims 20 to 30% of compute spend.
- Scale with demand, not for it. Auto-scaling (or a serverless runtime) means you pay for the Tuesday-night lull at Tuesday-night prices, instead of provisioning for Black Friday year-round.
- Move to Graviton. AWS's ARM-based instances are typically priced about 20% below comparable x86 instances. For most web workloads it's a config change, not a rewrite, provided your dependencies and container images have arm64 builds.
- Then commit, but only once it's steady. A right-sized, stable baseline is the perfect thing to put a Savings Plan against. We'll come back to commitments below, because the order matters.
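The "15% means two sizes too big" rule of thumb is just doubling arithmetic, sketched below. It assumes each step down within an instance family halves CPU and memory (so utilization doubles), and the 65% target ceiling and the `sizes_to_drop` name are hypothetical choices for illustration, not AWS guidance.

```python
def sizes_to_drop(peak_cpu: float, peak_memory: float, target: float = 0.65) -> int:
    """Count how many instance sizes could be dropped before the busier
    of CPU or memory would exceed the target utilization.

    Assumption: each step down halves capacity, so utilization doubles.
    Inputs are fractions (0.15 means 15%), ideally measured at peak.
    """
    busiest = max(peak_cpu, peak_memory)
    if not 0 < busiest <= 1:
        raise ValueError("utilization must be a fraction between 0 and 1")
    steps = 0
    # Keep stepping down while the next size would still sit under the target.
    while busiest * 2 <= target:
        busiest *= 2
        steps += 1
    return steps


# 15% CPU, 12% memory: 15% -> 30% -> 60%, so two sizes down.
print(sizes_to_drop(peak_cpu=0.15, peak_memory=0.12))
```

Note that memory can veto the move: an instance at 15% CPU but 40% memory has no room to shrink under these assumptions.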
Database: the second-biggest, and the scariest to touch
Databases get oversized "to be safe" and then never revisited.
- Right-size and modernize storage. The same utilization check applies. Switching older volume types to current-generation ones (for example gp3) often cuts storage cost with no performance loss.
- Match the engine to the load. Spiky or unpredictable traffic is a good fit for a serverless database tier that scales capacity automatically, instead of paying for a large instance around the clock.
- A production database is prime commitment territory. It runs 24/7 by definition, which, as we'll see, makes it the best Reserved Instance candidate most teams have (and the one they forget).
- Stop paying for idle replicas and old snapshots. Read replicas spun up "temporarily" and automated snapshots with no retention policy are a common, invisible drain.
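A retention policy for snapshots boils down to a date cutoff. Here is a minimal sketch, assuming you have already exported a list of snapshots; the `id` and `created` fields and the 35-day window are hypothetical, not an AWS API shape.

```python
from datetime import datetime, timedelta, timezone


def expired_snapshots(snapshots, retention_days, now=None):
    """Return the ids of snapshots older than the retention window.

    `snapshots` is a list of dicts with hypothetical keys:
    "id" (str) and "created" (timezone-aware datetime).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["id"] for s in snapshots if s["created"] < cutoff]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
snapshots = [
    {"id": "snap-old", "created": datetime(2023, 11, 1, tzinfo=timezone.utc)},
    {"id": "snap-recent", "created": datetime(2024, 5, 20, tzinfo=timezone.utc)},
]
print(expired_snapshots(snapshots, retention_days=35, now=now))
```

Reviewing the list before deleting anything is the safer habit; the point is to have a rule at all.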
Data transfer and networking: the sneaky one
This line item surprises people because nothing here is an obvious "server."
- NAT Gateways bill both hourly and per GB processed. Routing S3 and DynamoDB traffic through gateway VPC endpoints (which carry no hourly or per-GB charge) keeps it off the NAT gateway entirely.
- Egress to the internet is a real cost. Putting CloudFront in front of your app turns much of that pricey origin egress into cheaper, cached CDN delivery.
- Cross-AZ chatter adds up. Keeping chatty services in the same availability zone (without giving up real redundancy) trims it.
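To see why the NAT gateway point matters, it helps to run the two-part pricing as arithmetic. The sketch below uses illustrative rates ($0.045 per hour and $0.045 per GB) and made-up traffic volumes; check current pricing for your region before relying on the numbers.

```python
HOURS_PER_MONTH = 730  # common approximation for a billing month


def nat_gateway_monthly_cost(gb_processed, gateways=1,
                             hourly_rate=0.045, per_gb_rate=0.045):
    """Estimate monthly NAT gateway cost: a fixed hourly charge per
    gateway plus a charge for every GB that passes through.

    The default rates are assumptions for illustration only.
    """
    fixed = gateways * hourly_rate * HOURS_PER_MONTH
    variable = gb_processed * per_gb_rate
    return fixed + variable


# Hypothetical month: 5,000 GB through the NAT, of which 4,000 GB is S3 traffic.
before = nat_gateway_monthly_cost(gb_processed=5000)
after = nat_gateway_monthly_cost(gb_processed=1000)  # S3 moved to an endpoint
print(f"before: ${before:.2f}, after: ${after:.2f}, saved: ${before - after:.2f}")
```

Under these assumptions the per-GB charge dwarfs the hourly one, which is why rerouting bulk traffic helps more than removing a gateway.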
Storage: small per-gigabyte, large in aggregate
- Lifecycle policies automatically move infrequently accessed objects to cheaper tiers (Infrequent Access, then Glacier) and delete what should expire. Most buckets have never had one.
- Clean up the cruft. Orphaned disk volumes, snapshots of instances that no longer exist, incomplete multipart uploads. It's unglamorous and it's free money.
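A lifecycle rule covering the points above might look like the following sketch. The dictionary is shaped like the `LifecycleConfiguration` argument that boto3's `put_bucket_lifecycle_configuration` accepts; the helper name, rule ID, prefix, and day counts are all hypothetical, and the code only builds the configuration without calling AWS.

```python
def build_lifecycle_config(prefix, ia_after_days=30, glacier_after_days=90,
                           expire_after_days=365, abort_multipart_days=7):
    """Build an S3 lifecycle configuration: Standard -> Infrequent Access
    -> Glacier -> delete, plus cleanup of incomplete multipart uploads.

    All day counts are illustrative defaults, not recommendations.
    """
    # S3 requires objects to be at least 30 days old before moving to STANDARD_IA.
    if ia_after_days < 30:
        raise ValueError("STANDARD_IA transitions need at least 30 days")
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("expected ia < glacier < expire")
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_multipart_days
                },
            }
        ]
    }


config = build_lifecycle_config(prefix="logs/")
print(config["Rules"][0]["Transitions"])
```

Scoping the rule to a prefix such as `logs/` keeps it away from objects the app reads constantly, where a colder tier would cost more in retrieval than it saves.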
Commitment-based discounts: Savings Plans, RIs, and Spot
On-demand pricing is the rack rate. You pay it for the privilege of walking away at any moment. Once a workload is steady, you're paying that premium for flexibility you aren't using. Commitments hand back some of that flexibility in exchange for roughly 30 to 70% off. There are three tools (Savings Plans come in two flavors), and the craft is matching each to the right slice of usage.
- Compute Savings Plans: the flexible default. You commit to a steady dollars-per-hour of compute spend (say $2/hour for a year) and AWS discounts it about 25 to 40%. It applies automatically across EC2, Fargate, and Lambda, any instance family, size, or region, so it survives you re-architecting. For most companies this is the one to reach for first.
- EC2 Instance Savings Plans: deeper, but rigid. Same idea, but you lock to a specific instance family and region for a bigger discount. Worth it only when you're confident that fleet won't change.
- Reserved Instances: for what Savings Plans don't cover. This is the piece teams miss: the Savings Plans above don't apply to RDS, ElastiCache, OpenSearch, or Redshift. Those use Reserved Instances (called reserved nodes on some services). Your always-on production database is usually the single best RI you can buy.
- Spot Instances: the other extreme. Up to about 90% off for work that tolerates interruption: batch jobs, async processing, CI. Not a commitment, but part of the same "stop paying on-demand for everything" mindset.
A simple framework ties them together:
- Commit to the baseline, not the peak. Find your minimum steady usage and cover roughly 70 to 80% of it with commitments. Let the bursty top run on on-demand and auto-scaling; send fault-tolerant work to Spot. Over-committing to chase a deeper discount is how you end up paying for capacity you've stopped using.
- Right-size first, commit second. This is the most expensive mistake in the whole exercise: a three-year commitment locked onto oversized infrastructure freezes the waste in place. Optimize the architecture, then buy the commitment against the lean baseline.
- One-year vs three-year, and how much upfront. Three-year terms and All-Upfront payment unlock the deepest discounts but assume you can predict usage that far out. For most growing companies, one-year, No-Upfront captures the bulk of the savings while keeping room to change, which makes it a sensible default until your usage is genuinely predictable.
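The "commit to the baseline, not the peak" rule can be sketched as a few lines of arithmetic. This assumes you have hourly on-demand compute spend for a representative period; the 75% coverage and 30% discount are illustrative inputs, and `plan_commitment` is a hypothetical helper, not an AWS tool.

```python
HOURS_PER_MONTH = 730  # common approximation for a billing month


def plan_commitment(hourly_on_demand_spend, coverage=0.75, discount=0.30):
    """Size a Savings Plan against the minimum steady usage.

    hourly_on_demand_spend: on-demand dollars spent in each sampled hour.
    coverage: share of the baseline to cover with a commitment.
    discount: assumed Savings Plan discount off on-demand rates.
    """
    baseline = min(hourly_on_demand_spend)      # the floor, not the peak
    covered = baseline * coverage               # on-demand value covered
    commitment = covered * (1 - discount)       # what you commit to pay per hour
    savings_per_hour = covered - commitment
    return {
        "baseline_per_hour": baseline,
        "commitment_per_hour": commitment,
        "monthly_savings": savings_per_hour * HOURS_PER_MONTH,
    }


# Hypothetical sample: quiet nights at $1.00/hour, busy afternoons at $2.40/hour.
plan = plan_commitment([1.20, 1.00, 1.60, 2.40])
print(plan)
```

Because the covered amount never exceeds the quietest hour, the commitment is fully used in every sampled hour, and the bursty remainder stays on on-demand or Spot.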
The discipline that keeps it optimized
Tools don't save money; habits do.
- Make it visible. Cost Explorer plus consistent resource tagging lets you answer "what does each product, team, or environment actually cost?", which is half the battle.
- Set budgets and anomaly alerts so the next bit of creep gets caught in week one, not at the quarterly review.
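To make "caught in week one" concrete, here is a minimal sketch of an anomaly check over daily cost figures. It is a simple trailing-window threshold written for illustration, not how AWS's own anomaly detection works; the window, threshold, and sample numbers are assumptions.

```python
from statistics import mean, stdev


def cost_anomalies(daily_costs, window=7, threshold=3.0):
    """Flag the indexes of days whose cost exceeds the trailing-window
    mean by more than `threshold` standard deviations.
    """
    flagged = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        # Floor the spread at 1% of the mean so a perfectly flat
        # history does not flag every tiny wobble.
        sigma = max(stdev(history), 0.01 * mu)
        if daily_costs[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical daily spend in dollars; day 8 is the forgotten environment.
costs = [100, 102, 98, 101, 99, 100, 103, 101, 160, 104]
print(cost_anomalies(costs))
```

In practice a managed budget or anomaly alert does the same job with less upkeep; the sketch only shows what such an alert is looking for.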
Where it lands
Working through this list (right-sizing compute and database, moving to Graviton, fixing the NAT gateway routing, adding storage lifecycle rules, and committing a one-year Savings Plan on the steady baseline) brings the bill to around $2,000/month.
| | Before | After |
|---|---|---|
| Monthly AWS bill | ~$3,500 | ~$2,000 |
| Annualized | ~$42,000 | ~$24,000 |
That's a little over 40% off (about $18,000 a year), with no loss of performance or redundancy, and most of it is structural, so it keeps paying off every month instead of drifting back up.
The company is fictional; the pattern isn't. We do this for real businesses: find where the AWS bill diverges from actual usage, fix the structure, and put guardrails in place so it stays fixed. If your bill feels bigger than your usage justifies, let's talk.