Simple, transparent pricing

Start free. Scale with pay-as-you-go compute. No hidden fees.

Free

$0 forever

1K jobs / month

10 jobs / min

Pay-as-you-go compute
OpenAI-compatible API
Fal.ai-compatible API
Community support

Get started

Pro

$49 per month

100K jobs / month

100 jobs / min

Pay-as-you-go compute
OpenAI-compatible API
Fal.ai-compatible API
Priority support
Higher rate limits
Workflow orchestration

Get started

Enterprise

Custom pricing

Unlimited jobs / month

1,000 jobs / min

Custom compute pricing
Dedicated GPU capacity
SLA guarantees
SSO / SAML
Audit logs
Dedicated support

Contact sales

How billing works

Two components, no surprises.

Plan subscription

Your monthly plan (Free, Pro, or Enterprise) sets your rate limits and included quotas. Think of it as your capacity reservation.

Per-model compute

GPU time is metered to the second and billed at each model's rate-card price. You pay only for what you use; there is no idle cost.

Per-model rates

All rates are in USD. Compute is billed per second for continuous workloads and per request for discrete tasks.

Per-model rates are coming soon. Contact us for early-access rates.

Frequently asked questions

Is there a free tier?

Yes. The Free plan includes 1,000 jobs and 3,600 compute seconds per month at no cost. You only pay for compute beyond the included quota.
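
To make the "beyond the included quota" rule concrete, here is a minimal Python sketch. It assumes that only compute seconds above the 3,600 included on the Free plan are billable; the function and constant names are hypothetical and not part of any published API.

```python
# Included compute on the Free plan, taken from the answer above.
FREE_INCLUDED_COMPUTE_SECONDS = 3_600


def billable_seconds(used_seconds, included_seconds=FREE_INCLUDED_COMPUTE_SECONDS):
    """Return the compute seconds that exceed the included quota (never negative).

    Assumption: usage up to the included quota costs nothing, and only
    the excess is charged at the per-model rate.
    """
    return max(0, used_seconds - included_seconds)


# 3,000 seconds used: still inside the included quota.
within_quota = billable_seconds(3_000)

# 5,000 seconds used: the excess over 3,600 is billable.
over_quota = billable_seconds(5_000)
```

Multiplying the billable seconds by the relevant per-model rate would give the compute charge for the month.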

What happens when I hit my quota limits?

Once you exhaust your monthly quota, new job submissions return an HTTP 429 (Too Many Requests) error until the quota resets at the start of the next billing cycle. Upgrade to Pro for higher limits, or contact us for enterprise capacity.

How does billing work?

Your plan subscription covers your monthly quota. Compute usage is metered per second (or per request for discrete tasks) and billed separately at the per-model rates. You can set a spending cap to avoid surprise charges.

Can I set a spending cap?

Yes. You can configure a monthly spending cap in the Billing settings. Once the cap is reached, job submissions are paused until you raise the cap or the billing period resets.
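
The cap behaviour can be pictured with a small Python sketch. It is an illustration only: it assumes "reached" means that spend for the period is greater than or equal to the cap, and the function name is hypothetical rather than part of the platform.

```python
from decimal import Decimal


def submissions_paused(spend_this_period, monthly_cap):
    """Return True when spend for the period has reached the cap.

    Assumptions: "reached" means spend >= cap, and a cap of None
    means no cap has been configured.
    """
    if monthly_cap is None:
        return False
    # Decimal avoids floating-point rounding surprises with money.
    return Decimal(str(spend_this_period)) >= Decimal(str(monthly_cap))


below_cap = submissions_paused("49.99", "50")  # still under the cap
at_cap = submissions_paused("50.00", "50")     # cap reached
no_cap = submissions_paused("120.00", None)    # no cap configured
```

Raising the cap, or the start of a new billing period resetting the spend, would flip the result back to not paused.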

What is enterprise pricing?

Enterprise plans offer custom rate cards, dedicated GPU capacity, SLA guarantees, and volume discounts. Contact our sales team to discuss your requirements.

How do I estimate my costs?

Multiply the per-second rate by your expected GPU time per job, then by your monthly job volume. Most LLM inference jobs run in 2–10 seconds; video generation typically takes 2–10 minutes.
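
As a rough illustration of that arithmetic, here is a minimal Python sketch. The per-second rate is a made-up placeholder, not a published price (per-model rates are not listed yet), and the function name is hypothetical.

```python
from decimal import Decimal


def estimate_monthly_compute_cost(rate_per_second, seconds_per_job, jobs_per_month):
    """Estimate monthly compute cost in USD.

    cost = per-second rate x GPU seconds per job x jobs per month
    """
    return (
        Decimal(str(rate_per_second))
        * Decimal(str(seconds_per_job))
        * Decimal(str(jobs_per_month))
    )


# PLACEHOLDER rate for illustration only; substitute the real per-model rate.
PLACEHOLDER_RATE_PER_SECOND = "0.0005"

# Example: 20,000 LLM inference jobs per month at about 5 GPU seconds each.
llm_estimate = estimate_monthly_compute_cost(PLACEHOLDER_RATE_PER_SECOND, 5, 20_000)
```

This covers compute only; the plan subscription is billed on top of it.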

What payment methods are accepted?

We accept all major credit and debit cards via Stripe. Enterprise customers may arrange invoice-based billing.

Can I switch plans?

Yes, you can upgrade or downgrade your plan at any time. Upgrades take effect immediately; downgrades apply at the start of the next billing period.

Need dedicated capacity?

Enterprise plans include custom rate cards, dedicated GPU pools, SLA guarantees, and a dedicated support channel.

Contact sales

Start for free