Cost · 9 min read

SaaS Cost-Per-User: How to Calculate It and Get It Below $0.50

Cost-per-user is the single most diagnostic metric for SaaS unit economics, and most founders calculate it wrong. Here is the right formula, the right benchmarks, and the levers that actually move the number.

Krishan K Agarwal

Senior System Architect & Fractional CTO

Updated May 2026 · Published Mar 2026

If I had to pick one number to diagnose a SaaS business in five minutes, it would be cost-per-user. Not CAC, not LTV, not gross margin in aggregate — cost-per-user, calculated monthly, broken down by infrastructure component. It is the metric that tells you whether you are running a software business or a hosting business with a marketing team.

Most founders either do not track cost-per-user (CPU) at all or track it incorrectly. Below is how to calculate it, what good looks like by segment, and the specific levers that move the number, with real examples.

The CPU formula and what to include

CPU equals total monthly variable infrastructure spend divided by monthly active users (MAU). Include every line item that scales with users: compute, database, bandwidth, CDN, transactional email, observability, LLM and other API tokens, file storage, search, variable payment processing, and any third-party fees keyed to user count or usage. Exclude fixed costs (salaries, office, accounting software), one-time setup costs, and marketing spend; those belong elsewhere on your P&L.
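
As a minimal sketch of that formula, the snippet below tags each line item as variable or fixed and divides only the variable spend by MAU. The `LineItem` type and the dollar figures are illustrative assumptions, not numbers from a real bill.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    name: str
    monthly_cost: float  # dollars for the month
    variable: bool       # True if the cost scales with users or usage


def cost_per_user(items: list[LineItem], mau: int) -> float:
    """Total monthly variable infrastructure spend divided by MAU."""
    if mau <= 0:
        raise ValueError("MAU must be positive")
    variable_spend = sum(i.monthly_cost for i in items if i.variable)
    return variable_spend / mau


# Illustrative numbers only.
items = [
    LineItem("compute", 300.0, variable=True),
    LineItem("database", 200.0, variable=True),
    LineItem("salaries", 40_000.0, variable=False),  # fixed cost: excluded
    LineItem("marketing", 5_000.0, variable=False),  # belongs elsewhere on the P&L
]
cpu = cost_per_user(items, mau=2_000)
print(f"CPU: ${cpu:.2f}")  # CPU: $0.25
```

Keeping the variable/fixed flag on each line item makes the inclusion rule explicit and easy to review month over month.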

MAU is whichever active-user definition genuinely reflects your product. For a consumer app, monthly logged-in users. For a B2B SaaS, monthly users who performed at least one core action. Pick a definition, document it, and stick with it for at least 6 months — comparisons across changing definitions are useless.
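
One way to pin the definition down is to encode it. The sketch below counts distinct users with at least one core action in a calendar month; the event tuple shape and the action names in `CORE_ACTIONS` are hypothetical and should be replaced with whatever your product logs.

```python
from datetime import datetime

# Hypothetical core actions for a B2B product; substitute your own.
CORE_ACTIONS = {"create_report", "send_invoice"}


def monthly_active_users(events, year, month, core_actions=CORE_ACTIONS):
    """Count distinct users with at least one core action in the given month.

    `events` is an iterable of (user_id, action, timestamp) tuples.
    """
    active = {
        user_id
        for user_id, action, ts in events
        if action in core_actions and ts.year == year and ts.month == month
    }
    return len(active)


events = [
    ("u1", "create_report", datetime(2026, 3, 2)),
    ("u1", "send_invoice", datetime(2026, 3, 9)),    # same user, counted once
    ("u2", "login", datetime(2026, 3, 5)),           # login is not a core action
    ("u3", "create_report", datetime(2026, 2, 27)),  # previous month
    ("u4", "send_invoice", datetime(2026, 3, 30)),
]
mau = monthly_active_users(events, 2026, 3)
print(mau)  # 2
```

Checking the set of core actions into version control is a cheap way to document the definition and notice when it changes.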

Benchmarks: what good CPU looks like by segment

These are the targets I use when auditing unit economics. They are wide enough to fit different stacks but tight enough to flag real problems. If you are 2x outside the band, something is structurally off — usually over-provisioned infrastructure, a runaway dependency, or a mispriced product.

Segment	Typical CPU range	Plan ARPU range	Healthy gross margin	What drives cost
B2C consumer SaaS	$0.10-0.50	$5-15/mo	92-97%	Bandwidth, CDN, free-tier abuse
B2B SMB tools	$1-5	$29-99/mo	90-96%	Compute, DB, integrations
B2B mid-market	$5-25	$200-2,000/mo	85-94%	Multi-tenant DB, SSO, audit logs
B2B enterprise	$50-200	$2K-50K/mo	75-88%	Dedicated infra, SLA, support
AI-native consumer	$1-5	$10-30/mo	60-85%	LLM tokens dominate
AI-native B2B	$10-100	$100-2,000/mo	70-90%	LLM tokens + retrieval infra

Cost-per-user benchmarks by SaaS segment, 2026. AI-native bands are wider because token routing matters enormously.
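
The band check can be automated. This sketch copies the CPU ranges from the table and reads "2x outside the band" as more than twice the upper bound, which is one interpretation of the rule rather than a fixed definition. The label strings are made up for illustration.

```python
# CPU bands in dollars per MAU per month, copied from the table above.
CPU_BANDS = {
    "B2C consumer SaaS": (0.10, 0.50),
    "B2B SMB tools": (1.0, 5.0),
    "B2B mid-market": (5.0, 25.0),
    "B2B enterprise": (50.0, 200.0),
    "AI-native consumer": (1.0, 5.0),
    "AI-native B2B": (10.0, 100.0),
}


def classify_cpu(segment: str, cpu: float) -> str:
    """Compare a measured CPU against the benchmark band for its segment."""
    low, high = CPU_BANDS[segment]
    if cpu > 2 * high:
        return "structurally high"  # assumed reading of "2x outside the band"
    if cpu > high:
        return "above band"
    if cpu < low:
        return "below band"
    return "in band"


print(classify_cpu("B2C consumer SaaS", 0.30))   # in band
print(classify_cpu("AI-native consumer", 7.00))  # above band
print(classify_cpu("B2B SMB tools", 12.00))      # structurally high
```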

Two patterns to notice. First, the AI-native rows are 3 to 10x more expensive than non-AI peers — this is structural, and your pricing has to absorb it. Second, gross margin compresses as you go up-market, but ARPU grows much faster, so absolute gross profit per user goes up. Enterprise SaaS at 80 percent gross margin is more profitable than consumer SaaS at 95 percent on a per-customer basis.
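
To make the second pattern concrete, here is a small calculation with ARPU values picked from inside the table's ranges. The specific figures ($10 and $2,000) are illustrative choices, not data.

```python
def gross_profit_per_user(arpu: float, gross_margin: float) -> float:
    """Monthly gross profit per user in dollars."""
    return arpu * gross_margin


consumer = gross_profit_per_user(arpu=10.0, gross_margin=0.95)
enterprise = gross_profit_per_user(arpu=2_000.0, gross_margin=0.80)

print(f"Consumer:   ${consumer:,.2f}/user/month")    # Consumer:   $9.50/user/month
print(f"Enterprise: ${enterprise:,.2f}/user/month")  # Enterprise: $1,600.00/user/month
```

Under these assumptions the lower-margin enterprise account produces far more absolute gross profit, which is the point of the comparison.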

A real example: B2B SaaS at $0.31 CPU

This is a composite of three B2B products I have audited recently, each with MRR between $40K and $80K. All three landed in similar territory.

MAU: 4,200 (paying seats and active free users combined)
Compute (Fly.io, 4 services): $310/month — $0.074/user
Database (Neon Pro plus replicas): $230/month — $0.055/user
Bandwidth (Cloudflare, free + Pro $20): $20/month — $0.005/user
Email (Postmark, ~80K emails): $50/month — $0.012/user
Sentry + Better Stack + Checkly: $90/month — $0.021/user
Stripe processing (variable): $440/month — $0.105/user
Misc (Clerk, Inngest, R2): $180/month — $0.043/user

Total: $1,320/month across 4,200 users = $0.314 CPU. On a $99/month plan with 600 paying customers, that is $59,400 MRR against $1,320 infra cost. Gross margin: 97.8 percent. The product can absorb almost any reasonable growth without unit economics deteriorating.
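
The arithmetic above can be reproduced in a few lines, which is also a reasonable starting point for a monthly script. The figures are the ones from the breakdown; only the variable names are new.

```python
mau = 4_200
line_items = {
    "Compute (Fly.io, 4 services)": 310,
    "Database (Neon Pro plus replicas)": 230,
    "Bandwidth (Cloudflare)": 20,
    "Email (Postmark)": 50,
    "Sentry + Better Stack + Checkly": 90,
    "Stripe processing (variable)": 440,
    "Misc (Clerk, Inngest, R2)": 180,
}

total = sum(line_items.values())  # total monthly variable spend in dollars
cpu = total / mau                 # cost per monthly active user
mrr = 99 * 600                    # $99/month plan, 600 paying customers
gross_margin = 1 - total / mrr

for name, cost in line_items.items():
    print(f"{name:<36} ${cost:>4}/month  ${cost / mau:.3f}/user")
print(f"Total ${total}/month, CPU ${cpu:.3f}, gross margin {gross_margin:.1%}")
# Last line printed: Total $1320/month, CPU $0.314, gross margin 97.8%
```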

When CPU goes wrong: AI-native and consumer free-tier traps

Two patterns crash CPU and tend to stay invisible until they are large. The first is the AI-native app where one power user runs an agent loop overnight and racks up $40 in token costs alone. The second is the consumer free tier where 95 percent of MAU never pay and a small percentage of those users abuse the free quota at scale.

The fix for both is the same: per-user usage caps with graceful degradation. Cap free-tier API calls, cap free-tier LLM tokens, cap free-tier storage. When the cap is hit, downgrade quality (cheaper model, smaller payload, slower delivery) rather than blocking the user — that preserves the conversion funnel while protecting margin. The AI-native architecture guide has the full pattern for AI workloads, and the AWS bill cut playbook covers the infrastructure-side equivalents.
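
A minimal sketch of a cap with graceful degradation, assuming a per-user monthly token quota and two model tiers. The model names and the `Quota` type are hypothetical placeholders, and real usage tracking would live in a shared store rather than in memory.

```python
from dataclasses import dataclass

# Hypothetical model identifiers; substitute the models you actually route to.
PREMIUM_MODEL = "premium-model"
CHEAP_MODEL = "cheap-model"


@dataclass
class Quota:
    monthly_token_cap: int
    tokens_used: int = 0


def choose_model(quota: Quota, requested_tokens: int) -> str:
    """Serve the premium model under the cap, the cheap model over it.

    The request is never rejected; only quality is downgraded.
    """
    if quota.tokens_used + requested_tokens <= quota.monthly_token_cap:
        return PREMIUM_MODEL
    return CHEAP_MODEL


def record_usage(quota: Quota, tokens: int) -> None:
    quota.tokens_used += tokens


quota = Quota(monthly_token_cap=100_000)
first = choose_model(quota, 60_000)   # under the cap
record_usage(quota, 60_000)
second = choose_model(quota, 60_000)  # would exceed the cap, so degrade
print(first, second)  # premium-model cheap-model
```

The same shape works for API calls or storage: swap the token counter for the relevant unit and the model downgrade for a smaller payload or slower delivery.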

The four levers that move CPU

After auditing 50+ products on CPU, the same four levers account for almost every meaningful reduction. Run them roughly in this order — they are listed in descending ROI per engineering hour.

1. Add caching everywhere: Redis for hot keys, a semantic cache for AI outputs, and an edge cache via Cloudflare for semi-static content. Typical impact: 30-50 percent cost cut for under a week of work.
2. Optimize the top 10 database queries: pull a slow-query log and fix the worst offenders (missing index, N+1, unbounded scan). Typical impact: 20-40 percent reduction in DB cost and latency.
3. Buy reserved capacity once spend is steady-state: Compute Savings Plans on AWS, annual commits on Render/Fly, RDS Reserved Instances for Postgres. Typical impact: 25-35 percent off the committed portion.
4. Cap and meter heavy users: hard quotas on API calls, LLM tokens, and storage, with degradation rather than blocking. Typical impact: clips the long tail that drives 30-60 percent of total spend on power-law products.
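
To illustrate the first lever, here is a sketch of a read-through cache with a time-to-live. It is an in-process stand-in for Redis so the example stays self-contained; the `TTLCache` class, the key name, and `expensive_query` are all hypothetical.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-process read-through cache; a stand-in for Redis in this sketch."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: dict[str, tuple[float, Any]] = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: pay for the expensive call once, then store it.
        self.misses += 1
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


stats = {"calls": 0}


def expensive_query() -> list[str]:
    stats["calls"] += 1  # stands in for a billed DB query or LLM call
    return ["row-a", "row-b"]


cache = TTLCache(ttl_seconds=60)
for _ in range(5):
    rows = cache.get_or_compute("dashboard:team-42", expensive_query)
print(stats["calls"], cache.hits, cache.misses)  # 1 4 1
```

Five reads cost one backend call here. The hit rate you see in production depends on key design and TTL, so measure it before assuming a particular saving.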

How CPU should change your pricing

If your CPU is $4 and your plan ARPU is $9, you are not running a SaaS — you are running a low-margin reseller of someone else's infrastructure. The math will not improve at scale because most of your CPU is variable. You have two choices: cut CPU dramatically (caching, model routing, quotas) or raise prices to restore margin. Lowering price further and 'making it up on volume' is the failed move every time.
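
The two choices can be put into numbers. This sketch compares CPU directly with ARPU, as the paragraph does, and uses an 85 percent target margin purely as an illustrative assumption.

```python
def gross_margin(cpu: float, arpu: float) -> float:
    """Infrastructure gross margin when each user costs `cpu` and pays `arpu`."""
    return 1 - cpu / arpu


def price_for_margin(cpu: float, target_margin: float) -> float:
    """ARPU needed to reach the target margin at the current CPU."""
    return cpu / (1 - target_margin)


def cpu_for_margin(arpu: float, target_margin: float) -> float:
    """CPU needed to reach the target margin at the current ARPU."""
    return arpu * (1 - target_margin)


TARGET = 0.85  # illustrative target, not a benchmark from this article

print(f"Current margin: {gross_margin(4, 9):.1%}")                   # Current margin: 55.6%
print(f"Raise price to: ${price_for_margin(4, TARGET):.2f}/month")   # Raise price to: $26.67/month
print(f"Or cut CPU to:  ${cpu_for_margin(9, TARGET):.2f}/user")      # Or cut CPU to:  $1.35/user
```

In practice the answer is usually some of both: a CPU cut that gets part of the way there, and a price or packaging change for the rest.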

AI-native pricing in 2026 has to be metered or tiered around usage, not flat per-seat. The 'unlimited AI' plan is a path to bankruptcy when 5 percent of users drive 50 percent of token cost. Price by tokens, by tasks completed, or by tiers with clear quotas, and align your pricing tiers with where users actually get value, not where you can squeeze margin.
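
A tier with a clear quota can be as simple as a base price, an included allowance, and a per-unit overage. The sketch below bills by tasks completed; the tier name, prices, and quota are hypothetical values chosen to keep the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    base_price: float        # dollars per month
    included_tasks: int      # tasks covered by the base price
    overage_per_task: float  # dollars per task beyond the quota


def monthly_bill(tier: Tier, tasks_used: int) -> float:
    """Base price plus metered overage for usage above the included quota."""
    overage_tasks = max(0, tasks_used - tier.included_tasks)
    return tier.base_price + overage_tasks * tier.overage_per_task


pro = Tier(name="Pro", base_price=49.0, included_tasks=500, overage_per_task=0.25)

print(monthly_bill(pro, 300))  # 49.0  (under quota: base price only)
print(monthly_bill(pro, 800))  # 124.0 (300 tasks over quota at $0.25 each)
```

With this shape, a heavy user's bill grows with their cost to serve, so the long tail of usage stops eroding margin.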

Putting it all together

Pick a definition of MAU. Pull every variable infrastructure line item. Divide. Run that exercise once a month, track it in a single dashboard, and compare against the segment benchmarks above. If you are inside the band, focus on growth. If you are outside, you have a unit economics problem that scaling will magnify, and you should fix it before you raise or hire.

The cost-per-user lens connects directly to most of the other engineering decisions you make. The DevOps-on-$500-a-month stack is built to keep CPU low through smart vendor choices; the AI-native architecture guide handles the LLM-token side; the AWS bill cut playbook is the post-PaaS version of the same exercise; and the technical debt framework explains how engineering inefficiency translates into the soft-cost-per-user nobody calculates. If you want a senior architect to actually run the CPU audit on your stack and produce a prioritized list of cost levers, that is what an architecture audit (from $1,499) is built for.

Frequently asked questions

What is a good cost-per-user for a SaaS in 2026?

Targets vary by segment: under $0.50 per monthly active user for B2C consumer SaaS, $1 to $5 for B2B SMB tools, $5 to $25 for B2B mid-market, and $50 to $200 for enterprise. AI-native products run 3 to 10x higher because LLM tokens are the dominant cost. Anything outside these bands either has unusual margin upside (rare) or a unit-economics problem you should investigate before raising or scaling.

How do I calculate cost-per-user correctly?

CPU = (monthly variable infrastructure spend, including third-party APIs) / (monthly active users). Include compute, database, bandwidth, CDN, email, observability, LLM tokens, payment processing on the variable line, and any per-user-keyed third-party fees. Exclude one-time setup costs, fixed marketing spend, and salaries; those belong in CAC and OpEx, not CPU. Track it monthly, by user cohort if you can.

Why does my cost-per-user spike when I add AI features?

Because LLM tokens scale linearly with usage, and the most active users hammer them harder than averages suggest. A power-law distribution means the top 5 percent of users often drive 50 percent of token spend. Without per-user metering and hard caps, one heavy user can cost you more than ten paying users earn. Cap usage and route to cheaper models for non-critical work; see the AI-native architecture guide for the patterns.
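
You can check how concentrated your own token spend is with a short calculation. The sketch below assumes you already have per-user monthly spend in a dictionary; the sample data is synthetic and only meant to show the shape of the result.

```python
def top_share(spend_by_user: dict[str, float], top_fraction: float = 0.05) -> float:
    """Share of total spend that comes from the top `top_fraction` of users."""
    costs = sorted(spend_by_user.values(), reverse=True)
    total = sum(costs)
    if not costs or total == 0:
        return 0.0
    k = max(1, round(len(costs) * top_fraction))  # always include at least one user
    return sum(costs[:k]) / total


# Synthetic data: one heavy user at $40, nineteen light users at $2 each.
spend = {"heavy": 40.0, **{f"user-{i}": 2.0 for i in range(19)}}

share = top_share(spend)
print(f"Top 5% of users drive {share:.0%} of token spend")
# Top 5% of users drive 51% of token spend
```

If this number is high for your product, per-user caps and cheaper-model routing are likely to matter more than across-the-board optimizations.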

What is the single biggest lever to cut cost-per-user?

Caching, almost always. Output caching on AI workloads, query result caching on read-heavy DB workloads, edge caching on static and semi-static content. Spending a single-digit percentage of your total bill on Redis and CDN typically returns 30 to 50 percent reductions in compute and DB cost. The ROI is lopsided enough that it is the first place to look on any cost audit.

Should I publish my cost-per-user metric?

Internally, yes — make it a board-level metric reviewed monthly. Externally, only at scale and only if it is favorable. CPU is a powerful trust signal in due diligence and partner conversations once you can claim a 90+ percent gross margin, but it can also expose unit economics problems publicly. Track it relentlessly; share it selectively.
