# How OpenAI Handles Billing: A Complete Breakdown

> A deep analysis of OpenAI's hybrid billing model - from ChatGPT subscription tiers and API token pricing to prepaid credits and spend-based rate limits. Learn how to build the same system.
- **Author**: Ayush Agarwal
- **Published**: 2026-04-07
- **Category**: AI, Billing
- **URL**: https://dodopayments.com/blogs/openai-billing-model

---

OpenAI runs one of the most complex billing systems in the software industry. On one side, hundreds of millions of people use ChatGPT through flat-rate subscriptions. On the other, millions of developers pay per token through a prepaid credit system. These two models operate simultaneously, serving fundamentally different audiences with different expectations.

This is not a simple "pick a price and charge for it" setup. OpenAI's billing combines [subscription pricing](https://dodopayments.com/blogs/subscription-pricing-models) with [usage-based billing](https://dodopayments.com/blogs/usage-based-billing-saas), layered with prepaid credits, spend-based rate tiers, and soft usage caps. Understanding how these pieces fit together reveals a billing architecture that other AI companies can learn from and replicate.

For a deep technical walkthrough of how to rebuild this exact model, see the [OpenAI Billing Deconstruction on Dodo Payments docs](https://docs.dodopayments.com/developer-resources/billing-deconstructions/openai).

## OpenAI's Pricing at a Glance

OpenAI sells to two distinct audiences, and the pricing reflects that split.

**Consumer products** (ChatGPT, Sora) use flat-rate monthly subscriptions. You pick a plan, pay a fixed amount, and get access to a set of features with usage caps that vary by tier.

**Developer products** (the API) use a pay-as-you-go model built on prepaid credits. You load money into your account, and every API call deducts from that balance based on how many tokens you consume.

This hybrid approach is what makes OpenAI's billing interesting. Most companies pick one model or the other. OpenAI runs both, and the two sides inform each other. The subscription tiers fund model development with predictable revenue, while the API side captures the long tail of developer usage that would be impossible to serve with fixed plans.

## ChatGPT Subscription Tiers

OpenAI currently offers six ChatGPT plans, ranging from free to enterprise-grade. Here is how they break down:

| Plan       | Price             | Billing Type          | Key Features                                                      |
| ---------- | ----------------- | --------------------- | ----------------------------------------------------------------- |
| Free       | $0/month          | N/A                   | Limited GPT-5.3 access, basic features, strict daily caps         |
| Go         | $8/month          | Subscription          | 10x more messages than Free, GPT-5.3 Instant, includes ads        |
| Plus       | $20/month         | Subscription          | GPT-5.4 Thinking, deep research, agent mode, Codex, Sora          |
| Pro        | $200/month        | Subscription          | Unlimited GPT-5.4, exclusive GPT-5.4 Pro, 128K context window     |
| Business   | $25-30/user/month | Per-seat subscription | Everything in Plus, admin controls, SAML SSO, no training on data |
| Enterprise | Custom            | Invoiced              | Unlimited access, data residency, SCIM, dedicated support         |

A few things stand out about this structure.

**The Go tier is ad-supported.** OpenAI introduced the Go plan at $8/month as a bridge between Free and Plus. It gives users more capacity but includes advertisements. This is unusual for a SaaS product and signals that OpenAI is experimenting with ad revenue as a way to subsidize lower-priced tiers.

**Plus is the anchor.** At $20/month, Plus is where most paying users land. It includes access to advanced reasoning models, deep research, agent mode, and Codex. The feature gap between Go and Plus is significant, which makes the $12 price difference feel justified.

**Pro targets power users.** At $200/month, Pro is 10x the price of Plus. The exclusive access to GPT-5.4 Pro and the 128K context window (4x larger than Plus) justify the premium for researchers, developers, and professionals who need maximum capability.

**Enterprise uses invoiced billing.** Large organizations get custom pricing, volume discounts, and payment via invoice rather than credit card. This is standard for B2B SaaS but worth noting because it means OpenAI operates three distinct billing mechanisms: self-serve subscriptions, per-seat billing, and invoiced contracts.

## API Pricing: The Token Economy

The API side of OpenAI's business works entirely differently. Instead of fixed monthly fees, developers pay per token. A token is roughly three-quarters of a word in English. Both input tokens (what you send to the model) and output tokens (what the model generates) are billed, but at different rates.

Here are the current flagship model prices:

| Model        | Input (per 1M tokens) | Cached Input (per 1M tokens) | Output (per 1M tokens) |
| ------------ | --------------------- | ---------------------------- | ---------------------- |
| GPT-5.4      | $2.50                 | $0.25                        | $15.00                 |
| GPT-5.4 mini | $0.75                 | $0.075                       | $4.50                  |
| GPT-5.4 nano | $0.20                 | $0.02                        | $1.25                  |
| GPT-4o       | $2.50                 | $1.25                        | $10.00                 |
| GPT-4o-mini  | $0.15                 | $0.075                       | $0.60                  |

Several pricing mechanics are worth calling out:

**Output tokens cost 4-6x more than input tokens.** This reflects the computational reality. Generating text is more expensive than processing it. For developers building applications, this means the length of model responses has a direct and outsized impact on cost.

**Cached input pricing rewards repeat usage.** If you send the same system prompt or context repeatedly, OpenAI charges up to 10x less for cached tokens (10x on the GPT-5.4 family, 2x on the GPT-4o family in the table above). This incentivizes developers to structure their applications around consistent prompts, which also improves latency.

**The model range spans a wide price spread.** GPT-5.4 nano costs $0.20 per million input tokens while GPT-5.4 costs $2.50 - a 12.5x gap on input alone, and a 75x gap if you compare nano input ($0.20) to flagship output ($15.00). This gives developers a clear cost-performance tradeoff and encourages them to [use the right model for each task](https://dodopayments.com/blogs/ai-pricing-models) rather than defaulting to the most expensive option.

Beyond text, OpenAI also charges separately for image tokens, audio tokens, embeddings, fine-tuning, and built-in tools like web search ($10 per 1,000 calls). Each modality has its own pricing table, creating a multi-dimensional billing surface.
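To make the token mechanics concrete, here is a small cost calculator built from the price table above. It is a sketch only - the rates are copied from the table and will drift as pricing changes:

```typescript
// Per-1M-token USD prices, copied from the table above (they will drift).
type ModelPricing = { input: number; cachedInput: number; output: number };

const PRICES: Record<string, ModelPricing> = {
  "gpt-5.4": { input: 2.5, cachedInput: 0.25, output: 15.0 },
  "gpt-5.4-mini": { input: 0.75, cachedInput: 0.075, output: 4.5 },
  "gpt-5.4-nano": { input: 0.2, cachedInput: 0.02, output: 1.25 },
};

// One request's cost: uncached input, cached input, and output tokens
// are each billed at their own per-million-token rate.
function requestCostUSD(
  model: string,
  inputTokens: number,
  cachedTokens: number,
  outputTokens: number,
): number {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (
    (inputTokens * p.input +
      cachedTokens * p.cachedInput +
      outputTokens * p.output) /
    1_000_000
  );
}

// 1,500 input tokens and 800 output tokens on the flagship model:
// (1500 * 2.50 + 800 * 15.00) / 1e6 ≈ $0.0158
console.log(requestCostUSD("gpt-5.4", 1500, 0, 800));
```

Note how the 800 output tokens account for over three-quarters of that cost - which is exactly why response length dominates API bills.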

## How the Billing Model Actually Works

OpenAI's billing is a hybrid of three distinct mechanisms operating in parallel. Understanding how they interact is key to understanding why the system works.

```mermaid
flowchart LR
    subgraph Consumer ["Consumer Side"]
        A[User Signs Up] --> B[Picks Plan]
        B --> C[Monthly Charge]
        C --> D[Soft Usage Caps]
        D -->|Cap Hit| E[Downgrade to Smaller Model]
    end
    subgraph Developer ["Developer Side"]
        F[Developer Signs Up] --> G[Buys Credit Pack]
        G --> H[Credits Added to Balance]
        H --> I[API Calls Deduct Tokens]
        I --> J{Balance > 0?}
        J -->|Yes| I
        J -->|No| K[API Calls Blocked]
        K --> G
    end
```

### 1. Flat-Rate Subscriptions (ChatGPT)

ChatGPT plans charge a fixed monthly fee regardless of how much you use the product. But "unlimited" does not mean truly unlimited. OpenAI uses **soft caps** - when a user exceeds their tier's usage allocation, they are not blocked. Instead, they get downgraded to a less capable model. A Plus user who burns through their GPT-5.4 Thinking allocation gets routed to GPT-5.3 Instant for the rest of the period.

This is a smart approach. Hard caps create frustration and support tickets. Soft caps let users keep working while nudging them toward upgrading. It is a pattern that works well for [subscription-based products with variable consumption](https://dodopayments.com/blogs/subscriptions-usage-based-billing-saas).

### 2. Prepaid Credits (API)

The API uses a [credit-based billing](https://docs.dodopayments.com/features/credit-based-billing) system. Developers load money into their OpenAI account - $5, $10, $50, or more. These credits are denominated in USD, not abstract points. When you make an API call, the token cost is calculated and deducted from your balance in real time. When your balance hits zero, API calls fail with a 402 error.

> The decision to denominate credits in fiat currency instead of abstract tokens is subtle but important. When a developer sees "$47.23 remaining," they immediately understand the value. Abstract credit systems create confusion and erode trust.
>
> - Ayush Agarwal, Co-founder & CPTO at Dodo Payments

This prepaid model eliminates credit risk for OpenAI. They collect revenue before delivering the service. It also creates a low barrier to entry - a $5 top-up is enough to start experimenting - while allowing heavy users to scale without friction.
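The prepaid flow can be sketched as a minimal in-memory ledger - balances in integer cents, a deduction per call, and a 402-style refusal at zero. A production system needs atomic, persisted deductions, but the shape is the same:

```typescript
// Minimal prepaid-credit ledger, denominated in integer cents.
class CreditLedger {
  private balances = new Map<string, number>(); // customerId -> cents

  topUp(customerId: string, cents: number): void {
    this.balances.set(customerId, (this.balances.get(customerId) ?? 0) + cents);
  }

  // Deduct a call's cost, or refuse it when the balance cannot cover it,
  // mirroring the 402 the API returns at zero balance.
  charge(customerId: string, cents: number): { ok: boolean; status: number } {
    const balance = this.balances.get(customerId) ?? 0;
    if (balance < cents) return { ok: false, status: 402 };
    this.balances.set(customerId, balance - cents);
    return { ok: true, status: 200 };
  }

  balanceCents(customerId: string): number {
    return this.balances.get(customerId) ?? 0;
  }
}

const ledger = new CreditLedger();
ledger.topUp("cus_123", 500); // the $5 starter top-up
console.log(ledger.charge("cus_123", 150).ok); // true
console.log(ledger.charge("cus_123", 400).status); // 402 - only 350 cents left
```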

### 3. Spend-Based Rate Tiers

OpenAI ties API rate limits to cumulative spending. The more you have spent over your account's lifetime, the higher your rate limits. This creates a trust-based access system:

- **Tier 1** (low spend): Restrictive rate limits, suitable for testing
- **Tier 2** (moderate spend): Higher throughput for production apps
- **Tier 3+** (high spend): Maximum rate limits for scale

This is a clever monetization lever. It rewards loyal customers with better performance and creates a natural incentive to consolidate API spending on OpenAI rather than splitting across providers. For companies building similar systems, this pattern maps well to [tiered pricing models](https://dodopayments.com/blogs/tiered-pricing-model-guide).
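A tier lookup is just a function of cumulative lifetime spend. The thresholds and limits below are illustrative placeholders, not OpenAI's published numbers:

```typescript
// Illustrative spend tiers - thresholds and limits are made up for this
// sketch, not OpenAI's published numbers. Sorted highest tier first.
const TIERS = [
  { tier: 3, minLifetimeSpendUSD: 500, requestsPerMinute: 5000 },
  { tier: 2, minLifetimeSpendUSD: 50, requestsPerMinute: 500 },
  { tier: 1, minLifetimeSpendUSD: 0, requestsPerMinute: 60 },
];

// Rate limits are a function of cumulative lifetime spend:
// a customer gets the highest tier they have unlocked.
function rateLimitFor(lifetimeSpendUSD: number): number {
  const match = TIERS.find((t) => lifetimeSpendUSD >= t.minLifetimeSpendUSD)!;
  return match.requestsPerMinute;
}

console.log(rateLimitFor(10)); // 60 - Tier 1
console.log(rateLimitFor(75)); // 500 - Tier 2
console.log(rateLimitFor(900)); // 5000 - Tier 3
```

Because the lookup keys off lifetime spend rather than current balance, a customer never loses a tier once unlocked - which is what makes it a loyalty mechanism rather than a throttle.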

## What Makes OpenAI's Billing Work

Several design decisions make this system effective. These are not accidental - they reflect deliberate choices about how to balance revenue, user experience, and operational complexity.

**Credits never expire.** Unlike many [billing credit systems](https://dodopayments.com/blogs/billing-credits-pricing-cashflow) that force "use it or lose it" pressure, OpenAI's credits persist indefinitely. This encourages developers to top up larger amounts because they know the value will not disappear. It also reduces the accounting complexity of tracking expiration dates across millions of accounts.

**Input and output are priced separately.** Most API billing systems charge a single rate per request or per unit. OpenAI's decision to meter input and output tokens independently lets them price based on actual computational cost. Output generation is more expensive, so it costs more. This transparency builds trust with developers who can optimize their costs by controlling response length.

**The Batch API offers 50% savings.** For non-time-sensitive workloads, OpenAI offers a Batch API that processes requests asynchronously over 24 hours at half the standard price. This is a textbook example of [dynamic pricing for usage-based SaaS](https://dodopayments.com/blogs/dynamic-pricing-usage-based-saas) - filling idle compute capacity at a discount while maintaining premium pricing for real-time requests.

**Flex processing trades speed for cost.** Similar to the Batch API, Flex processing provides lower costs in exchange for slower response times and occasional unavailability. This gives developers another lever to optimize their [API monetization](https://dodopayments.com/blogs/api-monetization) costs based on their application's latency requirements.

## Billing Challenges OpenAI Faces

No billing system is perfect, and OpenAI's complexity creates specific challenges that any company building a similar model should anticipate.

### Cost Predictability for Developers

Token-based pricing is inherently unpredictable. A developer building a chatbot cannot easily forecast their monthly API bill because it depends on conversation length, user behavior, and model choice. This unpredictability is one of the main reasons some developers prefer [flat-fee pricing over usage-based billing](https://dodopayments.com/blogs/usage-based-billing-vs-flat-fees-ai-saas).

OpenAI mitigates this with spending limits and usage dashboards, but the fundamental tension remains. Every AI company that adopts per-token billing needs to invest in cost visibility tools for their customers.

### Multi-Dimensional Metering Complexity

OpenAI does not just meter tokens. They meter text tokens, image tokens, audio tokens, tool calls, storage, fine-tuning compute, and more. Each dimension has its own pricing, and some interact with each other (a web search call costs $0.01 plus the token cost of processing the search results).

This creates a billing surface that is difficult for developers to reason about and expensive for OpenAI to maintain. Building [accurate metered billing](https://dodopayments.com/blogs/metered-billing-accurate-billing) at this scale requires robust event ingestion, real-time aggregation, and clear reporting.

### Subscription Cannibalization

The relationship between ChatGPT subscriptions and API access creates tension. A developer on the Plus plan gets access to GPT-5.4 through the ChatGPT interface. But if they want to use the same model programmatically, they need to pay separately through the API. This dual-billing can feel redundant to users who want both interfaces.

### Rate Limit Complexity

Spend-based rate tiers add another layer of complexity. A developer's rate limits depend on their cumulative spend, their current plan, the specific model they are calling, and the time of day (during peak hours, even high-tier users may experience throttling). Communicating these dynamic limits clearly is an ongoing challenge.

## How to Build Similar Billing with Dodo Payments

If you are building an AI product and want to replicate OpenAI's billing model, you need two core capabilities: [credit-based billing](https://docs.dodopayments.com/features/credit-based-billing) for the API side and [subscription billing](https://docs.dodopayments.com/features/subscription) for the consumer side. Dodo Payments supports both natively.

Here is how to set it up.

### Step 1: Create Credit Entitlements for API Billing

Set up a fiat credit entitlement in your Dodo Payments dashboard. This acts as the prepaid balance for your API users.

- **Credit Type:** Fiat Credits (USD)
- **Credit Expiry:** Never (matching OpenAI's no-expiry policy)
- **Overage:** Disabled (API calls fail at zero balance, just like OpenAI)

Then create one-time payment products for credit packs ($5, $10, $50, $100) and attach the credit entitlement to each.
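The pack lineup can be described as plain data. This is not an SDK call - the product IDs are placeholders - but it shows the shape of the configuration, assuming one fiat credit equals one cent:

```typescript
// The credit-pack lineup as plain data (not an SDK call; product IDs are
// placeholders). With fiat credits, 1 credit = 1 cent, so a $10 pack
// grants 1,000 credits.
type CreditPack = {
  productId: string;
  priceUSD: number;
  creditsGranted: number;
};

const CREDIT_PACKS: CreditPack[] = [5, 10, 50, 100].map((usd) => ({
  productId: `prod_credits_${usd}`,
  priceUSD: usd,
  creditsGranted: usd * 100,
}));

console.log(CREDIT_PACKS.map((p) => p.creditsGranted)); // [500, 1000, 5000, 10000]
```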

### Step 2: Set Up Usage Meters for Token Tracking

Create meters to track token consumption. For an LLM billing system, you will want separate meters for input and output tokens, since they have different costs.

```typescript
import DodoPayments from "dodopayments";

const client = new DodoPayments({
  bearerToken: process.env["DODO_PAYMENTS_API_KEY"],
});

// After each LLM request, ingest usage events
await client.usageEvents.ingest({
  events: [
    {
      event_id: `req_${requestId}_input`,
      customer_id: customerId,
      event_name: "llm.input_tokens",
      timestamp: new Date().toISOString(),
      metadata: {
        model: "gpt-5.4",
        tokens: "1500",
      },
    },
    {
      event_id: `req_${requestId}_output`,
      customer_id: customerId,
      event_name: "llm.output_tokens",
      timestamp: new Date().toISOString(),
      metadata: {
        model: "gpt-5.4",
        tokens: "800",
      },
    },
  ],
});
```

Link both meters to your fiat credit entitlement. Configure the "meter units per credit" to match your pricing. For example, if you charge $2.50 per 1M input tokens, that works out to 400,000 tokens per dollar - or 4,000 tokens per credit, assuming one credit equals one cent.
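The conversion arithmetic is worth sanity-checking in code (assuming credits are denominated in cents):

```typescript
// Sanity check: $2.50 per 1M input tokens, credits denominated in cents.
const usdPerMillionTokens = 2.5;

const tokensPerDollar = 1_000_000 / usdPerMillionTokens; // 400,000
const tokensPerCredit = tokensPerDollar / 100; // 4,000 tokens per cent-credit

// tokensPerCredit is the "meter units per credit" value to configure.
console.log(tokensPerDollar, tokensPerCredit); // 400000 4000
```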

For a production-ready implementation with automatic token tracking, see the [usage-based billing integration guide](https://docs.dodopayments.com/developer-resources/usage-based-billing-guide) in the Dodo Payments docs.

### Step 3: Add Subscription Products for Consumer Plans

For the ChatGPT-style subscription side, create separate [subscription products](https://docs.dodopayments.com/features/subscription) for each tier. These do not need credit entitlements - they are straightforward recurring charges.

```typescript
// Create a checkout session for a subscription plan
const session = await client.checkoutSessions.create({
  product_cart: [{ product_id: "prod_plus_plan", quantity: 1 }],
  customer: { email: "user@example.com" },
  return_url: "https://yourapp.com/billing",
});
```

For per-seat billing (like OpenAI's Business plan), use quantity-based subscriptions where each seat is a unit.

### Step 4: Implement Soft Caps in Your Application

Soft caps are application-level logic, not billing-level. Track usage for subscription users with the same [usage-based billing](https://docs.dodopayments.com/features/usage-based-billing/introduction) meters, but instead of deducting credits, use the data to make routing decisions:

```typescript
async function getModelForUser(customerId: string) {
  const usage = await getCurrentPeriodUsage(customerId);

  if (usage.premiumModelCalls > SOFT_CAP_THRESHOLD) {
    // Route to a smaller model instead of blocking
    return "gpt-5.4-nano";
  }

  return "gpt-5.4";
}
```

This replicates OpenAI's pattern of downgrading users to a less capable model when they exceed their tier's allocation, rather than cutting them off entirely.

### Step 5: Handle Balance Depletion

Set up [webhooks](https://docs.dodopayments.com/developer-resources/webhooks) to notify users when their credit balance is running low. OpenAI sends email alerts before a user's balance hits zero, giving them time to top up without service interruption.

```typescript
// In your webhook handler
if (event.type === "credit.balance_low") {
  const { customer_id, available_balance } = event.data;
  await sendLowBalanceEmail(customer_id, available_balance);
}
```

For the complete implementation with edge case handling (race conditions, multi-model support, refund handling), see the full [OpenAI billing deconstruction](https://docs.dodopayments.com/developer-resources/billing-deconstructions/openai) on Dodo Payments docs.

## Key Takeaways for AI Founders

OpenAI's billing model is not just a pricing page. It is a revenue architecture that balances multiple competing goals: low barrier to entry, predictable revenue, cost transparency, and scalable monetization.

If you are building an AI product, the core lessons are:

1. **Hybrid billing works.** Subscriptions for consumers and usage-based billing for developers can coexist. They serve different audiences with different expectations.

2. **Prepaid credits reduce risk.** Collecting payment before delivering service eliminates bad debt and simplifies cash flow. Denominating credits in fiat currency makes the value immediately clear.

3. **Soft caps beat hard caps.** Downgrading users to a less capable model is better than blocking them entirely. It keeps users engaged and creates a natural upgrade path.

4. **Multi-dimensional metering is necessary for AI.** Input tokens, output tokens, tool calls, and compute time all have different costs. Your billing system needs to handle this granularity.

5. **Spend-based trust tiers reward loyalty.** Tying rate limits to cumulative spending creates a flywheel that rewards your best customers and encourages consolidation.

Building this kind of billing infrastructure from scratch is a significant engineering investment. Platforms like [Dodo Payments](https://dodopayments.com) provide the building blocks - [credit-based billing](https://docs.dodopayments.com/features/credit-based-billing), [usage event ingestion](https://docs.dodopayments.com/features/usage-based-billing/introduction), [subscription management](https://docs.dodopayments.com/features/subscription), and [webhook-driven automation](https://docs.dodopayments.com/developer-resources/webhooks) - so you can focus on your AI product instead of reinventing billing.

For a hands-on guide to [monetizing your AI application](https://dodopayments.com/blogs/monetize-ai), check out the [pay-as-you-go AI SaaS guide](https://dodopayments.com/blogs/pay-as-you-go-ai-saas) or the [metered billing for GPT wrapper apps](https://dodopayments.com/blogs/metered-billing-gpt-wrapper) tutorial.

## FAQ

### How does OpenAI charge for API usage?

OpenAI uses a prepaid credit system for API billing. Developers load USD-denominated credits into their account, and each API call deducts tokens from that balance. Input and output tokens are priced separately, with output tokens costing 4-6x more than input tokens. When the balance reaches zero, API calls are blocked until the user tops up.

### What is the difference between OpenAI's subscription and API billing?

ChatGPT subscriptions (Free, Go, Plus, Pro, Business, Enterprise) charge a fixed monthly fee for access to the ChatGPT interface with usage caps. The API uses pay-as-you-go token billing with prepaid credits. These are separate billing systems - a ChatGPT Plus subscription does not include API credits, and API credits do not grant ChatGPT Plus features.

### Why does OpenAI price input and output tokens differently?

Generating output tokens requires more computational resources than processing input tokens. By pricing them separately, OpenAI aligns costs with actual compute usage. This transparency lets developers optimize their spending by controlling response length, using system prompts efficiently, and choosing the right model for each task.

### Can you build an OpenAI-style billing system for your own AI product?

Yes. You need credit-based billing for the prepaid API side and subscription billing for consumer plans. Dodo Payments provides both natively, along with usage event ingestion for token tracking and webhook automation for balance alerts. See the [OpenAI billing deconstruction](https://docs.dodopayments.com/developer-resources/billing-deconstructions/openai) for a step-by-step implementation guide.

### What are OpenAI's spend-based rate tiers?

OpenAI ties API rate limits to your account's cumulative spending history. New accounts start with restrictive limits. As you spend more over time, you unlock higher rate limits and throughput. This creates a trust-based system that rewards long-term customers and encourages developers to consolidate their API usage on OpenAI's platform.

## Final Thoughts

OpenAI's billing model is a case study in how to monetize AI at scale. The combination of subscriptions, prepaid credits, per-token metering, and spend-based rate tiers creates a system that serves casual users and enterprise developers from the same platform.

The complexity is real, but so is the payoff. Every piece of the billing architecture serves a strategic purpose: subscriptions provide revenue predictability, credits eliminate credit risk, token-based pricing aligns cost with value, and rate tiers reward loyalty.

For AI founders looking to implement similar billing, the infrastructure exists today. Start with the [Dodo Payments pricing page](https://dodopayments.com/pricing) to see how the building blocks fit together, or jump straight into the [implementation guide](https://dodopayments.com/blogs/implement-usage-based-billing) to start building.

---
- [More AI articles](https://dodopayments.com/blogs/category/ai)
- [All articles](https://dodopayments.com/blogs)