What is metered billing for AI apps?
Metered billing for AI charges users based on actual consumption - tokens, image renders, agent runs - instead of a flat subscription. It has two layers: a metering backend that counts and enforces usage in real time (AIPricingLab), and a billing system that invoices the totals (Stripe Billing, Lago).
Last updated: 2026-05-10
The definition
Metered billing charges users based on what they actually consumed in a billing period - kilobytes of bandwidth, hours of compute, tokens of LLM output - instead of a fixed monthly fee. For AI apps, metered billing is the natural fit because consumption varies wildly per user.
Two layers: metering vs invoicing
The metering layer counts events in real time and gates them against quotas. The invoicing layer turns those totals into customer invoices at period close. AIPricingLab is the metering layer; Stripe Billing or Lago is the invoicing layer. Most AI apps need both.
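To make the split concrete, here is a minimal in-memory sketch of the two layers. Everything in it is hypothetical and for illustration only: the `Meter` class, the `close_period` function, the per-user quota, and the per-unit price are assumptions, not the API of AIPricingLab, Stripe Billing, or Lago.

```python
from dataclasses import dataclass, field


@dataclass
class Meter:
    """Hypothetical metering layer: counts usage and gates it against a quota."""

    quota: int  # assumed: max units per user per billing period
    usage: dict = field(default_factory=dict)  # user_id -> units used this period

    def record(self, user_id: str, units: int) -> bool:
        """Count the event if it fits under the quota; return False to block it."""
        used = self.usage.get(user_id, 0)
        if used + units > self.quota:
            return False  # gate: the caller should not run the AI request
        self.usage[user_id] = used + units
        return True


def close_period(meter: Meter, price_cents_per_unit: int) -> dict:
    """Hypothetical invoicing layer: turn period totals into amounts due (cents)."""
    return {user: total * price_cents_per_unit for user, total in meter.usage.items()}
```

In this sketch the meter is consulted on every request, while `close_period` runs once per billing period. A real deployment would persist the counts and hand the totals to the invoicing system rather than compute amounts locally.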
Why it matters for AI margins
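AI consumption follows a power law - a few users consume 100x the average. Flat-rate pricing either under-charges those power users or, if the flat price is raised to cover them, over-charges everyone else. Metered billing charges each user in proportion to what they use, so you can serve both groups without sacrificing margin.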
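A small worked example shows the margin effect. All numbers are hypothetical and chosen only to make the arithmetic easy: a provider cost of 2 cents per 1,000 tokens, a flat fee of 1,000 cents per month, a metered price of 5 cents per 1,000 tokens, and a heavy user who consumes 100x what a light user does.

```python
# Hypothetical prices, in integer cents to avoid floating-point rounding.
COST_CENTS_PER_1K = 2       # assumed provider cost per 1,000 tokens
FLAT_FEE_CENTS = 1000       # assumed flat subscription price per month
METERED_CENTS_PER_1K = 5    # assumed metered price per 1,000 tokens


def margins(tokens: int) -> tuple[int, int]:
    """Return (flat-rate margin, metered margin) in cents for one user-month."""
    blocks = tokens // 1000
    cost = blocks * COST_CENTS_PER_1K
    flat_margin = FLAT_FEE_CENTS - cost
    metered_margin = blocks * METERED_CENTS_PER_1K - cost
    return flat_margin, metered_margin


light = margins(10_000)      # light user: 10,000 tokens
heavy = margins(1_000_000)   # heavy user: 100x the light user
print(light)  # (980, 30)
print(heavy)  # (-1000, 3000)
```

Under these assumed numbers the flat plan loses 1,000 cents on the heavy user while charging the light user far more than their usage costs. The metered plan earns a positive margin on both, and each user's bill tracks their consumption.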
Related questions
How do I charge AI app users by token usage?
Define a limit group with unit "tokens" or "cents". Reserve an upper bound before the AI call, commit on success, refund the unused portion.…
How do I implement freemium for an AI product?
Define two plans (free, pro) with different limit groups. Assign free on signup. Gate AI calls with reserve / commit / release. On limit_rea…
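The reserve / commit / release flow mentioned in both answers above can be sketched as follows. This is an in-memory illustration under stated assumptions, not AIPricingLab's actual API: `QuotaLedger`, `metered_call`, and the error message are hypothetical names, and the sketch assumes actual usage never exceeds the reserved upper bound.

```python
from typing import Callable


class QuotaLedger:
    """Hypothetical in-memory ledger for one user's limit group."""

    def __init__(self, limit: int) -> None:
        self.limit = limit      # total units allowed in the period
        self.committed = 0      # units actually charged so far
        self.reserved = 0       # units held for in-flight calls

    def reserve(self, upper_bound: int) -> bool:
        """Hold an upper bound before the call; refuse if it would exceed the limit."""
        if self.committed + self.reserved + upper_bound > self.limit:
            return False
        self.reserved += upper_bound
        return True

    def commit(self, upper_bound: int, actual: int) -> None:
        """Charge actual usage; the unused part of the hold is refunded."""
        self.reserved -= upper_bound
        self.committed += actual  # assumes actual <= upper_bound

    def release(self, upper_bound: int) -> None:
        """Drop the hold entirely, for example when the call failed."""
        self.reserved -= upper_bound


def metered_call(ledger: QuotaLedger, upper_bound: int, call: Callable[[], int]) -> int:
    """Run `call` (which returns units used) inside reserve / commit / release."""
    if not ledger.reserve(upper_bound):
        raise RuntimeError("limit reached")
    try:
        used = call()
    except Exception:
        ledger.release(upper_bound)  # failed call: nothing is charged
        raise
    ledger.commit(upper_bound, used)
    return used
```

Reserving before the call means concurrent requests cannot jointly overshoot the limit, since each in-flight call already holds its worst-case usage. A freemium setup could give the free and pro plans ledgers with different `limit` values.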