# Service Tiers

> How Standard, Priority, Flex, and Batch pricing modes work in AI Stats Gateway.

Service tiers let you choose between different pricing and delivery modes when a provider supports them.

Availability varies by provider and model. If a tier is not supported for the model you requested, the Gateway will not route the request to that tier.

<Note type="warning">
  Service tiers are currently available only for supported text models on supported providers.
</Note>

`service_tier` is supported across all three text request surfaces:

* Anthropic-compatible Messages at `/v1/messages`
* OpenAI-compatible Chat Completions at `/v1/chat/completions`
* OpenAI-compatible Responses at `/v1/responses`

## Tier overview

| Tier       | How to request it                              | Typical use                                               |
| ---------- | ---------------------------------------------- | --------------------------------------------------------- |
| `Standard` | Default behavior. No extra field required.     | General production traffic.                               |
| `Priority` | Set `service_tier: "priority"` on the request. | Faster or premium routing where supported.                |
| `Flex`     | Set `service_tier: "flex"` on the request.     | Lower-cost routing where supported.                       |
| `Batch`    | Use the Batch API rather than `service_tier`.  | Large deferred workloads where latency is less important. |

## API compatibility

Use the same `service_tier` field when you call any supported synchronous text API:

* [Anthropic Messages API reference](../api-reference/endpoint/anthropic-messages.mdx)
* [Chat Completions API reference](../api-reference/endpoint/chat-completions.mdx)
* [Responses API reference](../api-reference/endpoint/responses.mdx)
* [Shared parameter reference](../api-reference/parameters.mdx)

The recognized `service_tier` values are `standard`, `priority`, `flex`, and `batch`. Omitting `service_tier` selects `Standard`. `Priority` and `Flex` are opt-in request modes. `Batch` is handled by the Batch API, so synchronous text requests reject `service_tier: "batch"`.

<Note>
  AI Stats maps the normalized gateway values to provider-native controls internally. For example, an Anthropic route may receive Anthropic-native tier fields upstream, but the client-facing request still uses the gateway values listed here.
</Note>
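As an illustration of how the same field attaches to each surface, here is a minimal Python sketch that builds the path and JSON body for one request. The helper `build_request` and the `SURFACE_PATHS` table are hypothetical names introduced for this example; the paths, body shapes, and `max_tokens` value are taken from the examples on this page. It only constructs data and sends nothing.

```python
from typing import Any, Dict, Optional, Tuple

# Paths of the three text surfaces listed on this page.
SURFACE_PATHS = {
    "messages": "/v1/messages",
    "chat": "/v1/chat/completions",
    "responses": "/v1/responses",
}


def build_request(surface: str, model: str, prompt: str,
                  service_tier: Optional[str] = None) -> Tuple[str, Dict[str, Any]]:
    """Return (path, JSON body) for one surface. Hypothetical helper."""
    if surface not in SURFACE_PATHS:
        raise ValueError(f"unknown surface: {surface!r}")

    user_turn = [{"role": "user", "content": prompt}]
    if surface == "responses":
        body: Dict[str, Any] = {"model": model, "input": prompt}
    elif surface == "messages":
        # max_tokens mirrors the Anthropic Messages example on this page.
        body = {"model": model, "max_tokens": 512, "messages": user_turn}
    else:
        body = {"model": model, "messages": user_turn}

    # Omitting the field selects Standard, so add it only when a tier is given.
    if service_tier is not None:
        body["service_tier"] = service_tier
    return SURFACE_PATHS[surface], body
```

Calling `build_request("chat", "openai/gpt-5.5", "Summarise this incident report.", "flex")` yields the Chat Completions path and a body with `"service_tier": "flex"`; leaving out the last argument yields a Standard request with no `service_tier` key.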

## Standard

`Standard` is the default routing mode. You do not need to set `service_tier` to use it.

```json
{
  "model": "openai/gpt-5.5",
  "input": "Summarise this incident report."
}
```

## Priority

Use `Priority` when you want the provider's premium or higher-priority offering.

```json
{
  "model": "openai/gpt-5.5",
  "input": "Summarise this incident report.",
  "service_tier": "priority"
}
```
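For readers who want to see the request as an HTTP call, the following Python sketch wraps the Priority body above in a `urllib.request.Request`. The base URL, the API key placeholder, and the Bearer-token `Authorization` header are assumptions made for illustration, not values confirmed by this page; substitute the ones from your own account. The request is built but not sent.

```python
import json
import urllib.request

# Assumptions: placeholder base URL and Bearer-token auth.
BASE_URL = "https://gateway.example.com"  # hypothetical
API_KEY = "YOUR_API_KEY"  # hypothetical placeholder


def priority_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a Responses request that opts in to Priority."""
    body = {"model": model, "input": prompt, "service_tier": "priority"}
    return urllib.request.Request(
        url=BASE_URL + "/v1/responses",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


req = priority_request("openai/gpt-5.5", "Summarise this incident report.")
# To send it, pass req to urllib.request.urlopen and JSON-decode the response.
```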

### Anthropic Messages example

```json
{
  "model": "anthropic/claude-sonnet-4",
  "max_tokens": 512,
  "messages": [
    { "role": "user", "content": "Summarise this incident report." }
  ],
  "service_tier": "priority"
}
```

AI Stats maps this to the appropriate provider-native control when routing to Anthropic.

## Flex

Use `Flex` when the provider exposes a lower-cost service tier and you are willing to accept that tier's trade-offs in exchange for the lower price.

```json
{
  "model": "openai/gpt-5.5",
  "input": "Summarise this incident report.",
  "service_tier": "flex"
}
```

### Chat Completions example

```json
{
  "model": "openai/gpt-5.5",
  "messages": [
    { "role": "user", "content": "Summarise this incident report." }
  ],
  "service_tier": "flex"
}
```

### Responses example

```json
{
  "model": "openai/gpt-5.5",
  "input": [
    { "role": "user", "content": "Summarise this incident report." }
  ],
  "service_tier": "flex"
}
```

## Batch

`batch` is a recognized tier value for batch execution, but synchronous text APIs reject `service_tier: "batch"` with a validation error that points to the Batch API.

Batch pricing applies to deferred batch execution, not to normal synchronous requests, so use the Batch API or batch job workflow instead.
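If you would rather catch this before a request leaves your application, a client-side check can mirror the rule. The sketch below is an assumption about how you might structure such a check; `check_sync_tier` and both constants are hypothetical names, and the Gateway still performs its own validation regardless.

```python
SYNC_TIERS = {"standard", "priority", "flex"}  # usable on synchronous text APIs
BATCH_TIER = "batch"  # recognized, but served through the Batch API


def check_sync_tier(tier=None):
    """Return the tier to send on a synchronous request, or raise ValueError."""
    if tier is None:
        return None  # omit the field, which selects Standard
    if tier == BATCH_TIER:
        raise ValueError(
            'service_tier "batch" is not valid on synchronous requests; '
            "use the Batch API"
        )
    if tier not in SYNC_TIERS:
        raise ValueError(f"unknown service_tier: {tier!r}")
    return tier
```

Running the check before building the request body means a mistaken `"batch"` value fails locally with a message that points to the Batch API, rather than costing a round trip.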

## Notes

* Tier support is provider-specific and model-specific.
* Pricing cards on model pages show the tier-specific rates when we have data for them.
* Some providers expose specialized upstream offerings, which AI Stats maps onto the unified tiers shown in the catalog.
* Client-facing `service_tier` values are normalized across supported text surfaces; provider-native names are handled inside the gateway.

## Related pages

* [Anthropic Messages API reference](../api-reference/endpoint/anthropic-messages.mdx)
* [Chat Completions API reference](../api-reference/endpoint/chat-completions.mdx)
* [Responses API reference](../api-reference/endpoint/responses.mdx)
* [Parameters](../api-reference/parameters.mdx)
* [Routing and Fallbacks](./routing-and-fallbacks.mdx)
* [Chat completions (TypeScript SDK)](../sdk-reference/typescript/chat-completions.mdx)
* [Responses (TypeScript SDK)](../sdk-reference/typescript/responses.mdx)
