Freemium AI SaaS: ship a free → paid funnel without a backend
Build a freemium AI product where the free plan has hard quotas, the paid plan unlocks more, and "you have used 80% of your free renders" nudges drive upgrades. Drop-in implementation, ten minutes from zero to live.
Last updated: 2026-05-10
The problem
Freemium for AI is harder than freemium for regular SaaS, because every action your free users take costs you real money - tokens, GPU time, API fees. A free tier with no enforcement bleeds margin.
You also need the conversion mechanics: track usage in real time, show users how close they are to the limit, fire upgrade nudges at 80%, hard-block at 100%, expose a billing portal at the right moment.
Building all of that in-house - counter tables, period resets, dashboards, webhooks, an end-user portal - is weeks of work that competes with the actual product.
The solution
Define two plans: free (hard caps) and pro (high or no caps). When a user signs up, call vevee.upsertSubscription({ userId, planId: "plan_free" }). When they pay, switch them to plan_pro.
Gate every AI call with vevee.reserve/commit/release. Free users get a clean limit_reached when they hit their cap; your UI catches the error code and shows an upgrade prompt.
Wire webhooks to fire at 80% threshold for upgrade nudges. Use the pk_live_ public key to render "X renders left this month" client-side without a server roundtrip.
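To make the reserve/commit/release gate concrete, here is a minimal in-memory sketch of the idea. It is not the SDK: QuotaLedger, LimitReachedError, and gated are hypothetical names, and the real vevee.reserve/commit/release signatures are not shown on this page. The point is only the ordering: claim capacity before the AI call, finalize it on success, give it back on failure.

```typescript
// In-memory stand-in for a quota service. All names here are hypothetical.
type Reservation = { id: number; userId: string; amount: number };

class LimitReachedError extends Error {
  readonly code = "limit_reached";
  constructor() {
    super("limit_reached");
  }
}

class QuotaLedger {
  private committed = new Map<string, number>(); // usage that is final
  private held = new Map<string, number>(); // usage reserved but not yet final
  private nextId = 1;
  private limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Claim capacity up front so parallel requests cannot overshoot the cap.
  reserve(userId: string, amount: number): Reservation {
    const inUse = (this.committed.get(userId) ?? 0) + (this.held.get(userId) ?? 0);
    if (inUse + amount > this.limit) throw new LimitReachedError();
    this.held.set(userId, (this.held.get(userId) ?? 0) + amount);
    return { id: this.nextId++, userId, amount };
  }

  // The AI call succeeded: turn the hold into real usage.
  commit(r: Reservation): void {
    this.held.set(r.userId, (this.held.get(r.userId) ?? 0) - r.amount);
    this.committed.set(r.userId, (this.committed.get(r.userId) ?? 0) + r.amount);
  }

  // The AI call failed: hand the capacity back without charging the user.
  release(r: Reservation): void {
    this.held.set(r.userId, (this.held.get(r.userId) ?? 0) - r.amount);
  }

  usage(userId: string): number {
    return this.committed.get(userId) ?? 0;
  }
}

// Wrap any AI call so usage is only charged when the call succeeds.
async function gated<T>(ledger: QuotaLedger, userId: string, call: () => Promise<T>): Promise<T> {
  const r = ledger.reserve(userId, 1);
  try {
    const result = await call();
    ledger.commit(r);
    return result;
  } catch (err) {
    ledger.release(r);
    throw err;
  }
}
```

Counting held capacity against the limit is what keeps two simultaneous requests from both slipping under the cap; a failed render costs the user nothing because the hold is released.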
Example
Free users hit a clean 429 at quota. The error code drives the upgrade prompt. Paid users keep going.
import { createClient, VeveeError } from "@vevee/sdk";

const vevee = createClient({ apiKey: process.env.VEVEE_KEY! });

// App-level error (illustrative) that your UI maps to the upgrade prompt
class UpgradeRequiredError extends Error {}

export async function generate(userId: string, prompt: string) {
  try {
    // Count one render; throws limit_reached once the user is at their cap
    await vevee.track(userId, "image.render", 1);
    // callFlux is your own image-generation helper, not part of the SDK
    return await callFlux(prompt);
  } catch (err) {
    if (err instanceof VeveeError && err.code === "limit_reached") {
      // Free user hit their cap - show upgrade prompt in the UI
      throw new UpgradeRequiredError("You have reached your free plan limit");
    }
    throw err;
  }
}

// On Stripe checkout success:
export async function onCheckoutComplete(userId: string) {
  await vevee.upsertSubscription({
    userId,
    planId: "plan_pro",
  });
}

Plan migration without losing counters
When a user upgrades from free to pro mid-month, you want neither to reset their counter (which would erase their history) nor to pretend they used nothing (they already consumed their free quota). AIPricingLab's default onPlanChange: "carry" keeps counters ticking; the new plan's higher limits simply give them more headroom.
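The "carry" behavior can be pictured with a small sketch. This is an illustration of the semantics described above, not the service's implementation; the plan caps (20 and 500) and the names CARRY_QUOTAS, carryOnPlanChange, and rendersRemaining are assumptions made up for the example.

```typescript
// Hypothetical monthly render caps; the real numbers are whatever you configure.
const CARRY_QUOTAS = new Map<string, number>([
  ["plan_free", 20],
  ["plan_pro", 500],
]);

type CarryCounter = { planId: string; used: number };

// "carry": usage is kept as-is, only the limit it is measured against changes.
function carryOnPlanChange(counter: CarryCounter, newPlanId: string): CarryCounter {
  return { planId: newPlanId, used: counter.used };
}

function rendersRemaining(counter: CarryCounter): number {
  const quota = CARRY_QUOTAS.get(counter.planId) ?? 0;
  return Math.max(0, quota - counter.used);
}
```

Under these assumed caps, a free user who has used all 20 renders and upgrades mid-month ends up with 20 used and 480 remaining, rather than a fresh 500.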
Closing the free → paid → cancel → free exploit
Without protection, a savvy user can churn between free and paid to keep getting fresh quota. Set onPlanChange: "block" on a few key limit groups: when they downgrade back to free, those counters pre-fill to quota and stay there until the next period rollover.
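Here is a hedged sketch of what "block" means for a counter. The caps and the names BLOCK_QUOTAS, blockOnPlanChange, and periodRollover are hypothetical; the sketch only mirrors the rule stated above (pre-fill to quota on downgrade, clear at the next rollover).

```typescript
// Hypothetical monthly render caps used only for this illustration.
const BLOCK_QUOTAS = new Map<string, number>([
  ["plan_free", 20],
  ["plan_pro", 500],
]);

type BlockCounter = { planId: string; used: number };

// "block": on a downgrade, pre-fill the counter to the new plan's quota so the
// user gets no fresh allowance until the period rolls over.
function blockOnPlanChange(counter: BlockCounter, newPlanId: string): BlockCounter {
  const oldQuota = BLOCK_QUOTAS.get(counter.planId) ?? 0;
  const newQuota = BLOCK_QUOTAS.get(newPlanId) ?? 0;
  const isDowngrade = newQuota < oldQuota;
  const used = isDowngrade ? Math.max(counter.used, newQuota) : counter.used;
  return { planId: newPlanId, used };
}

// At the start of the next period the counter starts from zero again.
function periodRollover(counter: BlockCounter): BlockCounter {
  return { planId: counter.planId, used: 0 };
}
```

A pro user who cancels after 5 renders lands on the free plan with 20 of 20 used, so cycling between plans yields nothing extra within the same period.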
Upgrade nudges at 80%
Nudge at around 80% of quota - "20% left this month, want to upgrade?" - rather than at 100%, when the user is already blocked and it is too late. AIPricingLab fires threshold webhooks; wire them to your email or in-app message system.
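If you want to reason about when a nudge should fire, the sketch below shows the threshold-crossing check and the message it could drive. The ThresholdEvent shape is an assumption for illustration, not the documented webhook payload, and crossedThresholds and nudgeMessage are hypothetical helpers.

```typescript
// Assumed event shape; check the actual webhook payload before relying on it.
type ThresholdEvent = {
  userId: string;
  metric: string;
  used: number;
  limit: number;
};

// Returns the thresholds (fractions of the limit) crossed by this usage step.
// Comparing against the previous value means each threshold fires only once.
function crossedThresholds(
  prevUsed: number,
  newUsed: number,
  limit: number,
  thresholds: number[]
): number[] {
  return thresholds.filter((t) => prevUsed < t * limit && newUsed >= t * limit);
}

// Builds the nudge copy from the remaining share of quota.
function nudgeMessage(e: ThresholdEvent): string {
  const leftPct = Math.round(((e.limit - e.used) / e.limit) * 100);
  return `${leftPct}% left this month, want to upgrade?`;
}
```

With a limit of 20, moving from 15 to 16 renders crosses the 0.8 threshold, while moving from 16 to 17 crosses nothing, so the user is not nudged twice.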
Real-time client-side usage display
A pk_live_ public key reads only the calling user's own counters and is safe in browser code. Render the "12 / 20 renders this month" display without a server roundtrip. The display updates within seconds of a track call.
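Once the counters are in the browser, turning them into UI text is plain formatting. The sketch below assumes a snapshot of the form { used, limit }; that shape and the helper names are hypothetical, and how you fetch the snapshot with the public key is left out.

```typescript
// Assumed shape of one counter as seen by the client.
type UsageSnapshot = { used: number; limit: number };
type UsageState = "ok" | "nudge" | "blocked";

// Decide which UI state to show: normal, upgrade nudge, or hard block.
function usageState(u: UsageSnapshot, nudgeAt: number = 0.8): UsageState {
  if (u.used >= u.limit) return "blocked";
  if (u.used >= nudgeAt * u.limit) return "nudge";
  return "ok";
}

// "12 / 20 renders this month"
function usageLabel(u: UsageSnapshot, unit: string): string {
  return `${u.used} / ${u.limit} ${unit} this month`;
}

// "8 renders left this month", or an upgrade call to action at zero.
function remainingLabel(u: UsageSnapshot, unit: string): string {
  const left = Math.max(0, u.limit - u.used);
  return left === 0 ? `0 ${unit} left, upgrade now` : `${left} ${unit} left this month`;
}
```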
Frequently asked questions
Do I need Stripe to use freemium with AIPricingLab?
No. You can run a free tier indefinitely with hard caps and never touch payments. Add Stripe (or Lemon Squeezy / Paddle) when you want to charge for the upgrade.
Can users get a temporary boost (e.g. "free trial of pro for 7 days")?
Yes. Use customLimits in vevee.upsertSubscription to override quotas for that user, with endsAt set to 7 days out. After expiry, they revert to their actual plan.
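As a rough sketch of the 7-day boost: compute endsAt from the start date, then treat the override as active only until that moment. The LimitOverride type and both helpers are hypothetical; the exact shape of customLimits in the SDK is not shown on this page.

```typescript
// Assumed shape of a per-user override with an expiry.
type LimitOverride = { limit: number; endsAt: Date };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// endsAt for a trial that starts at `start` and lasts `days` days.
function trialEndsAt(start: Date, days: number): Date {
  return new Date(start.getTime() + days * MS_PER_DAY);
}

// The override wins while it is active; afterwards the plan limit applies again.
function effectiveLimit(
  planLimit: number,
  override: LimitOverride | undefined,
  now: Date
): number {
  if (override !== undefined && now.getTime() < override.endsAt.getTime()) {
    return override.limit;
  }
  return planLimit;
}
```

You would pass the computed date as endsAt alongside the boosted limits; once it passes, no cleanup call is needed because the plan's own limit applies again.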
How do I show "0 renders left, upgrade now" in the UI?
Catch limit_reached errors in your AI handler and render the upgrade modal. Or proactively render "X renders left" using vevee.usage(userId) on the client.
Can I A/B test different free quotas?
Yes. Use customLimits per-user when assigning the free plan, or define plan_free_v1 / plan_free_v2 and route users to different plans based on your experiment cohort.
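For the plan-routing variant, the cohort assignment should be deterministic so a user always lands on the same free plan. One way to do that, sketched here under assumptions, is to hash the user ID together with an experiment name. The hash is an FNV-1a-style function over UTF-16 code units, and stringHash, cohortBucket, and assignFreePlan are hypothetical helpers, not SDK calls.

```typescript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function stringHash(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Stable bucket in [0, 100) for a user within one experiment.
function cohortBucket(userId: string, experiment: string): number {
  return stringHash(`${experiment}:${userId}`) % 100;
}

// Route the user to one of the two free plans; `percentV2` is the share sent to v2.
function assignFreePlan(
  userId: string,
  experiment: string,
  percentV2: number = 50
): "plan_free_v1" | "plan_free_v2" {
  return cohortBucket(userId, experiment) < percentV2 ? "plan_free_v2" : "plan_free_v1";
}
```

Pass the returned plan ID as planId when you create the subscription at signup. Including the experiment name in the hash keeps cohorts independent across experiments.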
Other use cases
LLM usage metering: track tokens per end-user, across providers
Meter LLM token usage per end-user across OpenAI, Anthropic, Gemini, Mistral, and any other provider. Composite events for prompt + completion tokens, real-time per-user limits, atomic enforcement. The drop-in pattern for AI apps.
Image generation quotas: per-user limits for DALL·E, Flux, Stable Diffusion
Enforce per-user quotas on image generation across DALL·E, Flux, Stable Diffusion, Midjourney API, and Replicate. Atomic reservation pattern stops parallel renders from overshooting. Free tier, premium tier, hard caps - drop in.
AI agent billing: meter multi-step agents and tool calls
Metering AI agents is harder than metering single LLM calls. One "agent run" can fan out into 20 tool calls and 50 LLM calls. AIPricingLab handles agent-level and step-level metering with composite events and atomic reservations.
Token-based pricing: charge users for actual AI consumption
Charge AI app users by tokens, requests, or compute seconds. Pre-paid credits, post-paid invoicing, hybrid models - implementation patterns and trade-offs from someone who has shipped all three.