Maybe you've heard that "coding is solved". It's true: I can now generate, on a whim, whatever code my heart may desire. Does that change the game for applying cost optimization to code? Could I just hand my coding agent a vague wish to "go through the codebase and optimize everything to reduce costs"?
It turns out that's a terrible idea!
I want to explain why even the best coding agents, armed with everything the internet has to offer on optimization strategies and a great understanding of your code, don't know enough to reliably fix cloud cost problems. They are missing the key information they need to do the job.
Every optimization is a trade
Every optimization that reduces cost spends something else to get there. Look at any of the standard moves and there's a column in red ink somewhere:
- Add a cache and you're trading freshness — the value is now potentially 60 seconds (or 10 minutes, or whatever you picked) out of date. You're also burning memory to hold it, and adding an invalidation path that someone has to maintain.
- Batch your writes and you're trading latency: the record doesn't actually hit storage until the batch flushes, and a crash before the flush loses it.
- Compress before storing and you're trading CPU on every read and write for fewer bytes on disk.
- Move data to a cheaper storage tier and you're trading retrieval latency (and sometimes per-fetch cost) for cheaper at-rest pricing.
- Reach for a hand-tuned data structure — a bloom filter, an in-process index, a custom queue — and you're trading engineering effort up front and code complexity forever for a smaller bill.
Under the right conditions, any of these is a steal. Choose them indiscriminately and you get every cost in that red-ink column without the savings: caches with confusing staleness behavior, CPU pegged on data nobody reads, memory burned on items nobody asks for twice, hand-built components that need babysitting to stay alive. The worst of everything.
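To make the first trade on that list concrete, here is a minimal sketch of a read-through cache with a time-to-live. `TTLCache`, its `loader` callback, and the injectable `clock` are hypothetical names invented for this illustration, not part of any particular library. The point is only to show where the staleness window and the invalidation path come from.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny read-through cache: an entry is served until ttl_seconds after it was loaded."""

    def __init__(self, loader: Callable[[Hashable], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # hypothetical: fetches the fresh value for a key
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the demo below needs no sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}  # key -> (loaded_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]        # may be up to ttl_seconds out of date
        self.misses += 1
        value = self._loader(key)  # the expensive call the cache is meant to avoid
        self._entries[key] = (now, value)  # memory spent holding the value
        return value

    def invalidate(self, key: Hashable) -> None:
        """The extra code path someone now has to call on every write."""
        self._entries.pop(key, None)


if __name__ == "__main__":
    source = {"plan": "free"}
    fake_now = [0.0]
    cache = TTLCache(source.__getitem__, ttl_seconds=60, clock=lambda: fake_now[0])

    print(cache.get("plan"))   # loads from the source
    source["plan"] = "pro"     # the source changes...
    fake_now[0] = 30.0
    print(cache.get("plan"))   # ...but the cache still serves the old value
    fake_now[0] = 61.0
    print(cache.get("plan"))   # entry expired, so the new value is loaded
```

The `hits` and `misses` counters are there on purpose: if misses track total reads, the cache is paying its costs without saving anything.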
A concrete one
Say you have a function that checks if a user exists. Two lines:
def user_exists(user_id):
    return db.query("SELECT 1 FROM users WHERE id = ?", user_id) is not None
Anyone can read it. Fine — until it lands on a hot path, traffic grows 50x, and the database it's pummeling is the bottleneck for everything else.
A bloom filter is the classic move: it tells you "definitely not a user" with certainty and "probably a user" otherwise. The "definitely not" answers skip the database entirely:
bloom = BloomFilter(capacity=20_000_000, error_rate=0.001)

def warm_bloom_filter():
    for user_id in db.query("SELECT id FROM users"):
        bloom.add(user_id)

def user_exists(user_id):
    if user_id not in bloom:
        return False  # definitely not a user: no DB call
    return db.query("SELECT 1 FROM users WHERE id = ?", user_id) is not None
Look at what came with it: every signup has to write to the filter; periodic rebuilds, because bloom filters don't handle deletes; sizing for growth; cold-start behavior; false-positive monitoring; and the next person to touch this file has to learn what all of that is for. The two-line function became a small subsystem.
If user_exists really was crushing the database, all of that is obviously worth it. If it gets called twice an hour, you just added a moving part that will break at 3am for no reason at all. Same code, same optimization — great deal or bad deal.
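For readers who haven't built one, the following is a rough, from-scratch sketch of what sits behind a `BloomFilter(capacity, error_rate)` constructor like the one above. It is not the implementation of any specific library: `TinyBloomFilter` and `bloom_bits` are names made up here, the sizing uses the standard textbook formulas, and the hash positions come from SHA-256 with double hashing, which is one common choice among several.

```python
import hashlib
import math
from typing import Iterator


def bloom_bits(capacity: int, error_rate: float) -> int:
    """Textbook sizing: m = -n * ln(p) / (ln 2)^2 bits for n items at false-positive rate p."""
    return max(8, math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2)))


class TinyBloomFilter:
    """Illustrative bloom filter: no deletes, no resizing, no persistence."""

    def __init__(self, capacity: int, error_rate: float) -> None:
        self.num_bits = bloom_bits(capacity, error_rate)
        # Textbook hash count: k = (m / n) * ln 2
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self._bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: object) -> Iterator[int]:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: object) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: object) -> bool:
        # False means "definitely not added"; True means "probably added"
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


if __name__ == "__main__":
    bloom = TinyBloomFilter(capacity=1000, error_rate=0.01)
    for user_id in range(1000):
        bloom.add(user_id)

    missing = sum(1 for user_id in range(1000) if user_id not in bloom)
    false_positives = sum(1 for user_id in range(1000, 11000) if user_id in bloom)
    print("false negatives:", missing)
    print("false-positive rate:", false_positives / 10000)
    print("MB for 20M users at 0.1%:", bloom_bits(20_000_000, 0.001) / 8 / 1e6)
```

With these formulas, the filter in the example above works out to roughly 14 bits per user, or about 36 MB held in memory by every process that keeps a copy. That is part of the price of the optimization, alongside the rebuilds and monitoring.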
Deciding if the trade is worth it
Different call sites scale on different things: one grows with user traffic, the next with data volume, the next with how often a particular feature is used. You can't read which one matters off the source. You have to look at what production is actually doing.
Two pieces of data tell you: how the underlying resource is actually being used, and how the bill is broken down. Walk back through the same list:
- A cache is worth it when the read pattern shows real reuse — the same key fetched many times in a short window. If every key gets read once and then never again, the cache is just memory you bought.
- Batching writes is worth it when the bill is being driven by request count, not byte volume. A managed service typically charges for both: when per-request charges dominate, a batch of 100 small writes can cost about the same as a single small write; when bytes dominate, batching changes nothing.
- Compression is worth it when storage or egress is the largest line on the bill and the data is actually compressible. Already-compressed video gives you nothing back. JSON or plain-text logs often shrink 5-10x.
- A cheaper storage tier is worth it when access frequency drops sharply with age. "90% of objects haven't been read in 90 days" is the data point that flips Glacier from "interesting" to "obvious." If everything is touched daily, you'll pay more in retrieval than you saved.
- A hand-tuned data structure is worth it when the call count on the hot path is huge and the resource it's hitting is the dominant cost. The bloom filter only pays back when user_exists is being called millions of times and the database it's protecting is the largest line on the bill.
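Two of the checks on this list can be run against production data before writing any optimization. The sketch below assumes you can export a read trace as `(timestamp, key)` pairs in time order and a list of days since each object was last read. Both input shapes and both function names are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def estimated_hit_rate(accesses: Iterable[Tuple[float, Hashable]], ttl_seconds: float) -> float:
    """Fraction of reads a TTL cache would have served, replaying reads in time order."""
    loaded_at: Dict[Hashable, float] = {}
    hits = 0
    total = 0
    for timestamp, key in accesses:
        total += 1
        last_load = loaded_at.get(key)
        if last_load is not None and timestamp - last_load < ttl_seconds:
            hits += 1                    # same key again inside the window: real reuse
        else:
            loaded_at[key] = timestamp   # miss: the cache would (re)load here
    return hits / total if total else 0.0


def cold_fraction(days_since_last_read: Sequence[float], threshold_days: float = 90) -> float:
    """Share of objects not read within threshold_days (candidates for a colder tier)."""
    if not days_since_last_read:
        return 0.0
    cold = sum(1 for days in days_since_last_read if days > threshold_days)
    return cold / len(days_since_last_read)


if __name__ == "__main__":
    hot_trace = [(float(t), "user:42") for t in range(10)]       # one key, read ten times
    cold_trace = [(float(t), f"user:{t}") for t in range(10)]    # ten keys, read once each

    print(estimated_hit_rate(hot_trace, ttl_seconds=60))   # 0.9
    print(estimated_hit_rate(cold_trace, ttl_seconds=60))  # 0.0
    print(cold_fraction([1, 5, 120, 200, 365]))            # 0.6
```

The two traces have the same number of reads and would look identical in a request-count dashboard. Only the key-level pattern says whether a cache pays for itself.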
Determining the right optimization to apply
Knowing a call site is worth optimizing only gets you half the way there. The right fix depends on which billing dimension is driving the cost — and different dimensions can call for completely different fixes from the same starting code.
Take Lambda. The bill has two main line items: invocations (paid per million requests) and compute (paid per GB-second of memory × duration). Two teams running the same function can have very different bills depending on which curve they've climbed.
Suppose you have this — a Lambda triggered by S3 events that processes each uploaded file:
def lambda_handler(event, context):
    for record in event["Records"]:
        process_object(record["s3"]["bucket"]["name"],
                       record["s3"]["object"]["key"])
# template.yaml
Resources:
  ProcessUpload:
    Type: AWS::Serverless::Function
    Properties:
      MemorySize: 10240
      Events:
        S3Upload:
          Type: S3
          Properties: { Bucket: !Ref UploadBucket, Events: s3:ObjectCreated:* }
Looks fine. But there are two completely different ways this could be costing you money, depending on which dimension is running away.
Scenario 1: Invocations dominate. You're getting tens of millions of small files a day. Each invocation runs for 80ms, which is cheap, but you're paying for every single trigger. The fix is upstream of the function: route S3 events through SQS and process them in batches.
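Events:
  S3Upload:
    Type: SQS
    Properties:
      Queue: !GetAtt UploadQueue.Arn
      BatchSize: 1000
      # Lambda requires a batching window of at least 1s when BatchSize > 10; 5 is illustrative
      MaximumBatchingWindowInSeconds: 5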
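import json

def lambda_handler(event, context):
    for record in event["Records"]:  # up to 1000 SQS messages per invocation
        body = json.loads(record["body"])  # assumed: a raw S3 event notification
        for s3_record in body.get("Records", []):
            process_object(s3_record["s3"]["bucket"]["name"],
                           s3_record["s3"]["object"]["key"])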
Scenario 2: GB-seconds dominate. Few invocations, but each runs for minutes at 10GB of memory you provisioned "just in case." The fix isn't routing; it's right-sizing, and maybe a tighter inner loop. Lambda allocates CPU in proportion to memory, so re-measure duration after the change:
- MemorySize: 10240
+ MemorySize: 1024
Routing events through SQS doesn't shrink GB-seconds. Cutting memory doesn't reduce invocation count. Pick the wrong one and you've spent a sprint moving things around for almost no savings.
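The arithmetic behind the two scenarios can be sketched in a few lines. Everything numeric here is an assumption for illustration: the prices are the commonly published x86 list prices in us-east-1 (check current pricing; the free tier is ignored), the traffic figures are invented, and the first scenario is modeled at 128 MB because at larger memory settings the per-GB-second charge tends to overtake the request charge even at 80ms. `LambdaUsage` and `batched` are hypothetical helpers, and `batched` assumes per-record work is unchanged by batching.

```python
from dataclasses import dataclass, replace

# Assumed list prices; verify against the current AWS Lambda pricing page.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667


@dataclass(frozen=True)
class LambdaUsage:
    invocations_per_month: int
    avg_duration_ms: float
    memory_mb: int

    def request_cost(self) -> float:
        return self.invocations_per_month / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    def compute_cost(self) -> float:
        gb_seconds = (self.invocations_per_month
                      * (self.avg_duration_ms / 1000)
                      * (self.memory_mb / 1024))
        return gb_seconds * PRICE_PER_GB_SECOND

    def dominant_dimension(self) -> str:
        return "invocations" if self.request_cost() > self.compute_cost() else "gb_seconds"


def batched(usage: LambdaUsage, batch_size: int) -> LambdaUsage:
    """Fewer, longer invocations; total work (and so GB-seconds) stays the same."""
    return replace(usage,
                   invocations_per_month=usage.invocations_per_month // batch_size,
                   avg_duration_ms=usage.avg_duration_ms * batch_size)


def report(label: str, usage: LambdaUsage) -> None:
    print(f"{label:<28} requests ${usage.request_cost():>8.2f}  "
          f"compute ${usage.compute_cost():>8.2f}  -> {usage.dominant_dimension()}")


if __name__ == "__main__":
    # Scenario 1 (invented numbers): 30M small files a day, 80ms each, 128 MB
    small_files = LambdaUsage(900_000_000, 80, 128)
    report("scenario 1", small_files)
    report("scenario 1 + batching", batched(small_files, 1000))
    report("scenario 1 + less memory", replace(small_files, memory_mb=128))

    # Scenario 2 (invented numbers): 10k runs a month, 3 minutes each, 10 GB
    long_jobs = LambdaUsage(10_000, 180_000, 10240)
    report("scenario 2", long_jobs)
    report("scenario 2 + right-sizing", replace(long_jobs, memory_mb=1024))
    report("scenario 2 + batching", batched(long_jobs, 4))
```

Under these assumptions, batching removes nearly all of the request charge in the first scenario and leaves the compute charge where it was, while right-sizing cuts the second scenario's compute charge by about 10x and leaves its (already negligible) request charge untouched. The right-sizing line also assumes duration does not grow when memory shrinks, which has to be measured rather than assumed.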
How Frugal helps
A coding agent has every known optimization in its training data and a perfect read of your code, and still can't tell which optimizations to make where. It doesn't know what's happening in production. It doesn't see the usage. It doesn't see the bill broken down. Without those, "is this trade worth it?" and "which dimension is driving this?" are unanswerable.
That's the piece Frugal builds. Not just a pipe into your bills and your traces — those exist already, and an agent staring at raw bills is no better off than you are. Frugal turns that data into cost intelligence the agent can use: which code paths are the dominant cost, which billing dimensions are running away on each one, which optimizations would pay back at the observed scale, and what trade-offs each fix would make. Actionable, mapped to specific call sites in your specific code.
Plugged into your coding agent and your PR review, that changes the conversation. Instead of "go optimize everything" — a request the agent can't answer well — the agent is handed the handful of call sites where optimization actually pays, the dimension that's driving each one, and the trade each fix would make. The optimization stops being a guess. New code that would introduce a fresh cost problem gets caught before it merges, against real production usage and real cost numbers, instead of surfacing on next month's bill.
"Optimize everything" was never the goal. Optimizing the things that matter, and leaving the rest simple, is. Frugal tells you which is which.