How do I charge AI app users by token usage?
Define a limit group with unit "tokens" or "cents". Reserve an upper bound before the AI call, commit on success, and refund the unused portion. For pre-paid billing, raise the user's custom limit when they purchase credits. For post-paid billing, push period totals to a Stripe meter at month-end.
Last updated: 2026-05-10
Tokens or cents?
Use cents if you support multiple models with different costs. Use tokens if you only call one model or charge a flat rate per token. Both work; cents is more flexible.
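To illustrate why cents copes better with several models, here is a minimal sketch that converts one call's token usage into a cost in cents. The model names, the prices, and the `costInCents` helper are all hypothetical and are not part of the vevee API; substitute your own rates.

```typescript
// Hypothetical prices in cents per 1,000,000 tokens. Replace with your real rates.
type ModelPrice = { inputCentsPerMillion: number; outputCentsPerMillion: number };

const PRICES: Record<string, ModelPrice> = {
  "small-model": { inputCentsPerMillion: 15, outputCentsPerMillion: 60 },
  "large-model": { inputCentsPerMillion: 250, outputCentsPerMillion: 1000 },
};

// Convert one call's token usage into whole cents, rounding up so that
// a call is never recorded as free.
function costInCents(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICES[model];
  if (!price) {
    throw new Error(`No price configured for model: ${model}`);
  }
  const raw =
    (inputTokens * price.inputCentsPerMillion +
      outputTokens * price.outputCentsPerMillion) / 1_000_000;
  return Math.ceil(raw);
}
```

The same 2,000-token call costs a different amount on each model, which a single "tokens" counter cannot express. Rounding up per call overcharges slightly on very small calls; metering in a smaller unit (for example hundredths of a cent) is one way to reduce that.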
Reserve an upper bound, then refund
Use a tokenizer (tiktoken for OpenAI models) to estimate prompt tokens, then reserve the estimated prompt tokens plus max_tokens. After the call, commit the actual usage and refund the unused portion of the reservation.
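The reserve, commit, and refund flow can be sketched with an in-memory ledger. This is an illustration under stated assumptions, not the vevee API: `TokenLedger`, `meteredCall`, and `estimatePromptTokens` are hypothetical names, and the 4-characters-per-token estimate is a crude stand-in for a real tokenizer such as tiktoken.

```typescript
// Crude stand-in for a real tokenizer: roughly 4 characters per token.
function estimatePromptTokens(prompt: string): number {
  return Math.ceil(prompt.length / 4);
}

// Hypothetical in-memory ledger; a real system would keep this state server-side.
class TokenLedger {
  private quota: number;
  private used = 0;     // tokens committed by finished calls
  private reserved = 0; // tokens held by in-flight calls

  constructor(quota: number) {
    this.quota = quota;
  }

  remaining(): number {
    return this.quota - this.used - this.reserved;
  }

  // Place a hold; returns false if the user cannot cover the upper bound.
  reserve(amount: number): boolean {
    if (amount > this.remaining()) return false;
    this.reserved += amount;
    return true;
  }

  // Commit actual usage, drop the hold, and return the refunded amount.
  settle(hold: number, actual: number): number {
    this.reserved -= hold;
    this.used += actual;
    return Math.max(0, hold - actual);
  }

  // Drop the hold without charging anything (for failed calls).
  release(hold: number): void {
    this.reserved -= hold;
  }
}

type Completion = { promptTokens: number; completionTokens: number };

async function meteredCall(
  ledger: TokenLedger,
  prompt: string,
  maxTokens: number,
  callModel: (prompt: string, maxTokens: number) => Promise<Completion>,
): Promise<Completion> {
  // Upper bound: estimated prompt tokens plus the completion cap.
  const hold = estimatePromptTokens(prompt) + maxTokens;
  if (!ledger.reserve(hold)) throw new Error("Quota exceeded");
  let result: Completion;
  try {
    result = await callModel(prompt, maxTokens);
  } catch (err) {
    ledger.release(hold); // the call failed, so nothing is charged
    throw err;
  }
  ledger.settle(hold, result.promptTokens + result.completionTokens);
  return result;
}
```

Holding the upper bound while the call is in flight stops concurrent requests from jointly overspending the quota, and releasing the hold on failure means the user is not charged for a call that produced nothing.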
Pre-paid credits
When a Stripe one-time checkout completes, raise the user's custom limit by the purchased amount. Their usage counter keeps counting against the new ceiling.
await vevee.upsertSubscription({
  userId,
  planId: "plan_paygo",
  customLimits: { tokens: { quota: currentQuota + addedTokens } },
});
Live balance display
Use a pk_live_ public key in the browser. vevee.usage(userId) returns the user's remaining quota. This is safe to expose because it only reads the calling user's own counters.
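Once the remaining quota is available in the browser, turning it into a display string is straightforward. The sketch below assumes a hypothetical `UsageSnapshot` shape with `quota` and `remaining` fields; the actual return shape of vevee.usage(userId) is not specified here, so adapt the field names to whatever it returns.

```typescript
// Hypothetical shape; the real vevee.usage() response may differ.
type UsageSnapshot = { quota: number; remaining: number };

type BalanceView = { label: string; low: boolean };

// Build a label for the UI and flag balances at or below a threshold
// (10% of quota by default) so the app can prompt for a top-up.
function describeBalance(usage: UsageSnapshot, lowThreshold = 0.1): BalanceView {
  const remaining = Math.max(0, usage.remaining); // never show a negative balance
  const fraction = usage.quota > 0 ? remaining / usage.quota : 0;
  return {
    label: `${remaining} of ${usage.quota} tokens left`,
    low: fraction <= lowThreshold,
  };
}
```

The `low` flag is a convenient hook for showing a "buy more credits" prompt before the user runs out.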
Related questions
How do I track LLM usage per user?
Call vevee.track(userId, "llm.tokens", tokenCount) after each LLM call. AIPricingLab counts it against the user's plan limits in real time a…
What is metered billing for AI apps?
Metered billing for AI charges users based on actual consumption - tokens, image renders, agent runs - instead of a flat subscription. Two l…