AIPricingLab Q&A

How do I track LLM usage per user?

Call vevee.track(userId, "llm.tokens", tokenCount) after each LLM call. AIPricingLab counts it against the user's plan limits in real time and exposes a per-user usage dashboard. For atomic enforcement under concurrency, use reserve / commit / release instead.
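As a rough sketch of the track-after-call pattern, the snippet below records token usage once a call has finished. The UsageTracker interface, the LlmResult type, and the completeAndTrack helper are hypothetical names used for illustration; only the track(userId, "llm.tokens", tokenCount) call shape comes from the answer above, and the real SDK's types may differ.

```typescript
// Hypothetical shape, modeled on the vevee.track call described above.
interface UsageTracker {
  track(userId: string, metric: string, amount: number): Promise<unknown>;
}

// Hypothetical result type: any LLM client that reports a token count.
interface LlmResult {
  text: string;
  totalTokens: number;
}

// Run the LLM call, then record the tokens it actually used.
async function completeAndTrack(
  tracker: UsageTracker,
  userId: string,
  callLlm: () => Promise<LlmResult>,
): Promise<LlmResult> {
  const result = await callLlm();
  await tracker.track(userId, "llm.tokens", result.totalTokens);
  return result;
}
```

Because usage is recorded only after the call returns, several concurrent calls can each start before the others have been counted. That is the gap reserve / commit / release is meant to close.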

Last updated: 2026-05-10

The short answer

Wrap your OpenAI / Anthropic / Gemini call in three steps: vevee.reserve before the call, vevee.commit after success, and vevee.release after failure. AIPricingLab tracks every token (or request, or cent) per end user and per plan, so you don't need a counter table on your side.

Minimal example

Replace your OpenAI call with this pattern: reserve an upper bound of 4000 tokens, call OpenAI, commit the reservation, then refund the unused difference. Because the reservation is atomic, two parallel calls cannot both pass against the same remaining quota.

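// Assumes `vevee` and `openai` are already-initialized clients and `userId` is in scope.
const MAX_TOKENS = 4000; // upper bound reserved for this call
const r = await vevee.reserve(userId, "llm.tokens", MAX_TOKENS, { model: "gpt-4o" });
if (!r.allowed) throw new Error("llm.tokens limit reached");
try {
  const res = await openai.chat.completions.create(/*...*/);
  // Tokens actually consumed (prompt + completion); 0 if usage is missing.
  const used = (res.usage?.prompt_tokens ?? 0) + (res.usage?.completion_tokens ?? 0);
  await vevee.commit(r.reservationId!);
  // Refund the part of the reservation that was not used.
  if (used < MAX_TOKENS) await vevee.track(userId, "llm.tokens.refund", MAX_TOKENS - used);
  return res;
} catch (e) {
  // On any failure, release the reservation, then rethrow.
  await vevee.release(r.reservationId!);
  throw e;
}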
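If the same pattern is needed around several providers, it could be factored into a helper along these lines. This is only a sketch: the QuotaClient and Reservation interfaces are assumptions inferred from the calls in the example above, and withReservation is a hypothetical name, not part of the SDK.

```typescript
// Hypothetical client shape, inferred from the calls used in the example above.
interface Reservation {
  allowed: boolean;
  reservationId?: string;
}

interface QuotaClient {
  reserve(userId: string, metric: string, amount: number): Promise<Reservation>;
  commit(reservationId: string): Promise<unknown>;
  release(reservationId: string): Promise<unknown>;
  track(userId: string, metric: string, amount: number): Promise<unknown>;
}

// Reserve an upper bound, run the work, then commit and refund the unused part.
// If anything inside the try block throws, release the reservation and rethrow.
async function withReservation<T>(
  client: QuotaClient,
  userId: string,
  upperBound: number,
  work: () => Promise<{ value: T; used: number }>,
): Promise<T> {
  const r = await client.reserve(userId, "llm.tokens", upperBound);
  if (!r.allowed || r.reservationId === undefined) {
    throw new Error("llm.tokens limit reached");
  }
  const id = r.reservationId;
  try {
    const { value, used } = await work();
    await client.commit(id);
    if (used < upperBound) {
      await client.track(userId, "llm.tokens.refund", upperBound - used);
    }
    return value;
  } catch (e) {
    await client.release(id);
    throw e;
  }
}
```

The work callback returns both the provider response and the number of tokens it used, so the helper stays independent of any one provider's response format.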

Why not a Postgres counter?

You can build it. You will need an atomic increment that holds up under concurrency, period rollover at the right anchor, refunds for failed calls, an admin UI, and webhooks that fire when usage crosses a threshold. AIPricingLab provides the same primitive in one SDK call.
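To make the first requirement concrete, here is a small in-memory illustration of why a read-then-write counter breaks under concurrency. It is not Postgres code: CounterStore and naiveConsume are hypothetical stand-ins, with async methods mimicking database round trips, and incrementIfBelow plays the role that a single conditional UPDATE statement would play in a real database.

```typescript
// In-memory stand-in for one counter row; async methods mimic round trips.
class CounterStore {
  private value = 0;

  async read(): Promise<number> {
    await Promise.resolve();
    return this.value;
  }

  async write(v: number): Promise<void> {
    await Promise.resolve();
    this.value = v;
  }

  // Check and increment in one uninterrupted step.
  async incrementIfBelow(amount: number, limit: number): Promise<boolean> {
    await Promise.resolve();
    if (this.value + amount > limit) return false;
    this.value += amount;
    return true;
  }

  peek(): number {
    return this.value;
  }
}

// Naive version: the check and the write are separate steps, so two
// concurrent callers can both read the same old value and both pass.
async function naiveConsume(store: CounterStore, amount: number, limit: number): Promise<boolean> {
  const current = await store.read();
  if (current + amount > limit) return false;
  await store.write(current + amount);
  return true;
}
```

With a limit of 100 and two concurrent requests for 60 each, the naive version lets both through, while the single-step version admits only one. The other items on the list (rollover, refunds, admin UI, webhooks) still have to be built on top of this.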

Related questions