Billing modes
Quota supports developer billing, user billing via OAuth, and sandbox mode. Pick the one that matches who you want to charge and how closely you want local testing to exercise production billing.
01 Developer billing (default)
Every request is charged to your developer balance. New API keys default to this mode, which is the simplest setup and works with plain API-key auth.
Best for:
- Internal tools and admin dashboards
- Free-tier features where you absorb the AI cost
- Fixed-price subscriptions with predictable usage
import OpenAI from "openai";
const client = new OpenAI({
  baseURL: "https://api.usequota.ai/v1",
  apiKey: process.env.QUOTA_API_KEY,
});
// Cost is deducted from YOUR developer balance.
await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
02 User billing via OAuth
Each end user connects their own Quota wallet via OAuth and pays for their own usage. Your app sets a markup percentage and earns revenue on every request. Quota handles the wallet, the top-up flow, and the payouts.
Best for:
- Consumer apps where users bring their own wallet
- Marketplaces and plugin ecosystems
- Anywhere you want per-use revenue rather than subscription
Wallet and payouts:
- Users top up in dollar packages ($5–$50)
- The balance is universal: it works across every Quota app
- You keep 100% of your markup (no platform fee)
- Payouts go out daily via Stripe Connect, with a 7-day delay
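The markup arithmetic can be sketched as follows. This is a minimal illustration, assuming the markup is a simple percentage of the base cost and that amounts are tracked in integer cents. The function name `priceRequest` and the rounding rule are hypothetical, not documented Quota behavior.

```javascript
// Hypothetical helper: Quota's actual pricing and rounding rules are not
// documented on this page. Assumes markup is a percentage of the base cost
// and that amounts are integer cents.
function priceRequest(baseCostCents, markupPercent) {
  if (!Number.isFinite(baseCostCents) || baseCostCents < 0) {
    throw new RangeError("baseCostCents must be a non-negative number");
  }
  if (!Number.isFinite(markupPercent) || markupPercent < 0) {
    throw new RangeError("markupPercent must be a non-negative number");
  }
  // Round to whole cents (an assumption; the real rule may differ).
  const markupCents = Math.round((baseCostCents * markupPercent) / 100);
  return {
    userChargeCents: baseCostCents + markupCents, // deducted from the user's wallet
    developerRevenueCents: markupCents, // you keep 100% of the markup
  };
}

// Example: a request with a base cost of 200 cents at a 25% markup.
const quote = priceRequest(200, 25);
console.log(quote);
```

Under these assumptions, the example charges the user 250 cents, of which 50 cents is developer revenue.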
// After the user connects via OAuth, your server holds their
// access token. Pass it through on chat requests:
await fetch("https://api.usequota.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer " + userAccessToken,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
  }),
});
// Cost (base + your markup) is deducted from the user's balance.
03 Sandbox mode
Sandbox mode is an API-key mode for local development, QA, and integration tests. It returns synthetic OpenAI, Anthropic, or Google responses without calling the upstream provider, while still running Quota's metering, balance, and ledger code paths.
Use Sandbox when you want to:
- Verify response shapes without provider credentials or cost
- Test insufficient-credit and per-user billing behavior
- Exercise webhooks, ledgers, and usage reporting end to end
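As a sketch of what an insufficient-credit test might look like once you have a sandbox key, the helper below sends one chat request and reports whether it was rejected for lack of credit. It assumes an insufficient balance surfaces as HTTP 402, which this page does not confirm. `sendChat`, `INSUFFICIENT_CREDIT_STATUS`, and the injected `fetchImpl` parameter are hypothetical names introduced for illustration.

```javascript
// Assumption: insufficient credit is reported as HTTP 402. Check the API
// reference for the actual status code and error body.
const INSUFFICIENT_CREDIT_STATUS = 402;

// fetchImpl is injected so a test can substitute a stub for the network.
async function sendChat(fetchImpl, apiKey, prompt) {
  const res = await fetchImpl("https://api.usequota.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: "Bearer " + apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (res.status === INSUFFICIENT_CREDIT_STATUS) {
    return { ok: false, reason: "insufficient_credit" };
  }
  if (!res.ok) {
    throw new Error("Unexpected status " + res.status);
  }
  return { ok: true, data: await res.json() };
}
```

In an integration test you would call `sendChat(fetch, process.env.QUOTA_API_KEY, "Hello")` with a sandbox key. Injecting `fetchImpl` also lets unit tests run against a stub without any network access.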
# Create an API key in sandbox mode by setting billing_mode to "test".
curl -X POST https://api.usequota.ai/developers/keys \
  -H "Authorization: Bearer $SESSION_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "local-sandbox",
    "billing_mode": "test"
  }'
04 Choosing between modes
| Mode | Summary |
|---|---|
| Developer billing (default) | You pay for everything. Simplest setup. Best for internal tools and apps that bundle AI into a fixed price. |
| User billing (OAuth, markup) | Users pay from their own wallet. You set a markup % and keep 100% of it. Best for consumer apps and marketplaces. |
| Sandbox mode | Mock provider responses with real Quota metering and ledger entries. Best for local development, QA, and integration tests. |
Comparison
| Consideration | Developer billing | User billing (OAuth) | Sandbox mode |
|---|---|---|---|
| Who pays | Developer | End user (own wallet) | Developer (sandbox bills the developer wallet) |
| Setup | API key only | OAuth flow | API key with billing_mode: "test" |
| Token handling | n/a | You hold the user's token | n/a |
| Developer revenue | None | 100% of markup, paid via Stripe Connect | None |
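The setup and token-handling rows can be condensed into a small helper that picks the bearer credential for each mode. This is only a sketch: `bearerFor` and the mode labels `developer`, `user`, and `sandbox` are names invented here, not identifiers from the Quota API.

```javascript
// Hypothetical helper: the mode strings are local labels, not Quota API values.
function bearerFor(mode, { apiKey, userAccessToken } = {}) {
  switch (mode) {
    case "developer": // charged to your developer balance
    case "sandbox": // expects an API key created with billing_mode: "test"
      if (!apiKey) throw new Error(mode + " mode requires an API key");
      return "Bearer " + apiKey;
    case "user": // charged to the end user's wallet
      if (!userAccessToken) {
        throw new Error("user mode requires the user's OAuth access token");
      }
      return "Bearer " + userAccessToken;
    default:
      throw new Error("Unknown billing mode: " + mode);
  }
}
```

The returned string goes in the `Authorization` header. The helper cannot tell a sandbox key from a regular one, so which wallet is billed still depends on the key you pass in.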
05 Rate limits
All billing modes share the same rate limits. The default is 100 requests per minute per API key; this is a soft cap, not a hard ceiling. Auth endpoints use separate, stricter limits (3–5 requests per minute). See Authentication for rate-limit headers and per-key overrides.
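One way to stay under the default limit from the client side is a sliding-window limiter. The sketch below is a local guard only and is not part of Quota. `createLimiter` is a hypothetical name, and its defaults of 100 requests per 60 seconds mirror the figure above.

```javascript
// Client-side sliding-window limiter (illustrative; not a Quota API).
function createLimiter(maxRequests = 100, windowMs = 60_000) {
  const timestamps = []; // send times of requests inside the current window
  return function tryAcquire(now = Date.now()) {
    // Drop entries that have aged out of the window.
    while (timestamps.length > 0 && now - timestamps[0] >= windowMs) {
      timestamps.shift();
    }
    if (timestamps.length >= maxRequests) return false; // over budget: wait
    timestamps.push(now);
    return true;
  };
}
```

Because the limit applies per API key, you would create one limiter per key and call `tryAcquire()` before each request, delaying or queueing the request when it returns `false`.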