Routing and Fallbacks

The AI Stats Gateway routes each request to a provider that can serve your chosen model. When a provider is slow, rate-limited, or returning errors, the Gateway can attempt fallbacks so your requests still complete.

How routing works at a high level

1. You send a request with a model id.
2. The Gateway evaluates provider health, latency, and capability coverage.
3. A provider is selected and the request is executed.

Inspect request outcomes through your activity logs and response metadata when debugging routing behavior.
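The selection steps above can be pictured with a small model. This is only an illustrative sketch: the `Provider` record, its fields, and the "lowest latency wins" rule are assumptions made for the example, not the Gateway's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Provider:
    # Hypothetical record; the Gateway's real provider state is not public.
    name: str
    healthy: bool
    latency_ms: float
    models: set[str] = field(default_factory=set)


def select_provider(model_id: str, providers: list[Provider]) -> Optional[Provider]:
    """Pick a healthy provider that serves model_id, preferring lower latency."""
    # Capability coverage and health act as filters.
    eligible = [p for p in providers if p.healthy and model_id in p.models]
    if not eligible:
        return None
    # Assumption: among eligible providers, the lowest observed latency wins.
    return min(eligible, key=lambda p: p.latency_ms)


providers = [
    Provider("provider-a", healthy=True, latency_ms=420.0, models={"example-model"}),
    Provider("provider-b", healthy=False, latency_ms=120.0, models={"example-model"}),
    Provider("provider-c", healthy=True, latency_ms=310.0, models={"example-model", "other-model"}),
]

chosen = select_provider("example-model", providers)
print(chosen.name if chosen else "no eligible provider")
```

In this toy data set the fastest provider is unhealthy, so it is skipped even though its latency is lowest.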

Public control surfaces

The current public fallback and routing controls are intentionally explicit:

1) Presets constrain the fallback pool

In Dashboard -> Settings -> Presets, you can define:

- allowed models
- provider allow lists
- provider ignore lists
- default prompt and parameter behavior

Those constraints are applied before provider selection, so a preset can intentionally narrow which providers are eligible for retries and failover.
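To make "applied before provider selection" concrete, here is a minimal sketch of preset filtering. The `Preset` class and its field names are hypothetical, and two behaviors are assumptions: an empty allow list means "no restriction", and the ignore list wins when a provider appears in both lists.

```python
from dataclasses import dataclass, field


@dataclass
class Preset:
    # Hypothetical shape; field names are illustrative, not the Gateway's schema.
    allowed_models: set[str] = field(default_factory=set)
    provider_allow: set[str] = field(default_factory=set)
    provider_ignore: set[str] = field(default_factory=set)


def eligible_providers(preset: Preset, model_id: str, candidates: list[str]) -> list[str]:
    """Apply preset constraints before any ranking or failover happens."""
    # Assumption: an empty allowed_models set places no restriction on models.
    if preset.allowed_models and model_id not in preset.allowed_models:
        return []
    pool = candidates
    # Assumption: an empty allow list places no restriction on providers.
    if preset.provider_allow:
        pool = [p for p in pool if p in preset.provider_allow]
    # Assumption: the ignore list takes precedence over the allow list.
    return [p for p in pool if p not in preset.provider_ignore]


preset = Preset(
    allowed_models={"example-model"},
    provider_allow={"provider-a", "provider-b"},
    provider_ignore={"provider-b"},
)
fallback_pool = eligible_providers(
    preset, "example-model", ["provider-a", "provider-b", "provider-c"]
)
print(fallback_pool)
```

Because the filter runs first, a narrow preset also shrinks the set of providers available for retries: in this example only one provider is left, so there is nothing to fail over to.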

2) Routing mode changes provider ranking

In Dashboard -> Settings -> Routing, workspaces can tune how the Gateway ranks compatible providers:

- balanced
- price
- latency
- throughput

The same page also exposes beta and alpha channel toggles so preview traffic can be introduced intentionally instead of appearing as an untracked routing side effect.
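One way to think about routing modes is as different sort keys over the same set of compatible providers. The sketch below is illustrative only: the metric names are hypothetical, and the scoring used for `balanced` (summing each provider's position across the three single-metric rankings) is an assumption, since the Gateway's actual weighting is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderStats:
    # Hypothetical metrics used only for this example.
    name: str
    price_per_mtok: float
    latency_ms: float
    tokens_per_sec: float


# Lower key value ranks first; throughput is negated so higher is better.
SORT_KEYS = {
    "price": lambda p: p.price_per_mtok,
    "latency": lambda p: p.latency_ms,
    "throughput": lambda p: -p.tokens_per_sec,
}


def rank_providers(providers: list[ProviderStats], mode: str) -> list[ProviderStats]:
    """Order compatible providers according to the chosen routing mode."""
    if mode in SORT_KEYS:
        return sorted(providers, key=SORT_KEYS[mode])
    if mode == "balanced":
        # Assumption: combine the three rankings by summing positions.
        score = {p.name: 0 for p in providers}
        for key in SORT_KEYS.values():
            for position, p in enumerate(sorted(providers, key=key)):
                score[p.name] += position
        return sorted(providers, key=lambda p: score[p.name])
    raise ValueError(f"unknown routing mode: {mode}")


stats = [
    ProviderStats("provider-a", price_per_mtok=3.0, latency_ms=400.0, tokens_per_sec=80.0),
    ProviderStats("provider-b", price_per_mtok=1.0, latency_ms=900.0, tokens_per_sec=40.0),
    ProviderStats("provider-c", price_per_mtok=2.0, latency_ms=300.0, tokens_per_sec=120.0),
]

for mode in ("balanced", "price", "latency", "throughput"):
    print(mode, [p.name for p in rank_providers(stats, mode)])
```

The order of the ranked list matters for fallbacks as well: under this model, the next provider in the list is the natural failover candidate.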

3) BYOK fallback is explicit

In Dashboard -> Settings -> BYOK, teams can choose whether a failed BYOK request is allowed to fall back to AI Stats credits. This is the current public control for the common “my own key failed, should the request still complete?” decision.
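The decision this setting controls can be sketched as a small function. This models the behavior described above rather than any real Gateway code: `run_with_byok`, `Result`, and the two `send_*` callables are hypothetical names, and each callable is assumed to return an HTTP status code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    ok: bool
    source: str  # "byok" or "credits"
    status: int


def run_with_byok(
    send_byok: Callable[[], int],
    send_credits: Callable[[], int],
    allow_credit_fallback: bool,
) -> Result:
    """Try the customer's own key first; use credits only if the setting allows it."""
    byok_status = send_byok()
    if 200 <= byok_status < 300:
        return Result(True, "byok", byok_status)
    if not allow_credit_fallback:
        # Fallback disabled: surface the BYOK failure instead of spending credits.
        return Result(False, "byok", byok_status)
    credit_status = send_credits()
    return Result(200 <= credit_status < 300, "credits", credit_status)


# Simulated outcome: the BYOK call is rate-limited, the credits call succeeds.
strict = run_with_byok(lambda: 429, lambda: 200, allow_credit_fallback=False)
lenient = run_with_byok(lambda: 429, lambda: 200, allow_credit_fallback=True)
print(strict)
print(lenient)
```

With fallback disabled the caller sees the original BYOK failure, which keeps spend predictable; with it enabled the request completes but is billed differently.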

Fallback behavior

If a provider returns errors or rate-limits a request, the Gateway can retry or route to another provider that supports the same model. Fallbacks are not guaranteed to succeed, so you should still handle 429 and 5xx responses with exponential backoff.
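A minimal client-side sketch of exponential backoff for 429 and 5xx responses follows. It is transport-agnostic: `send` is a hypothetical callable that performs one request and returns the HTTP status code, and the attempt count and delay values are illustrative defaults rather than recommended settings.

```python
import random
import time
from typing import Callable

# Status codes worth retrying: rate limits and server-side errors.
RETRYABLE = {429} | set(range(500, 600))


def call_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE:
            break
        # Full jitter: wait a random time up to the capped exponential delay.
        delay = random.uniform(0, min(max_delay, base_delay * 2 ** (attempt - 1)))
        sleep(delay)
        status = send()
    return status


# Simulated sequence of responses; sleep is stubbed so the example runs instantly.
responses = iter([429, 503, 200])
final_status = call_with_backoff(lambda: next(responses), sleep=lambda seconds: None)
print(final_status)
```

The jitter spreads retries from many clients over time, so they do not all hit a recovering provider at the same moment. Non-retryable statuses such as 400 are returned immediately.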

BYOK considerations

If you bring your own provider key, that provider’s limits and policies apply. Fallbacks are still attempted where possible, but upstream account limits can constrain available options.

What to log

For production workloads, log request ids, response status codes, and model ids so you can correlate failures and confirm routing behavior when debugging.
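As a rough sketch, the three fields can be emitted as one structured log line per request using Python's standard `logging` and `json` modules. The logger name, the `log_request` helper, and the example values are hypothetical; how you obtain the request id from a response depends on your client and is not shown here.

```python
import json
import logging

logger = logging.getLogger("gateway.requests")


def log_request(request_id: str, model_id: str, status_code: int) -> str:
    """Emit one JSON log line per request and return it."""
    record = json.dumps(
        {"request_id": request_id, "model_id": model_id, "status_code": status_code},
        sort_keys=True,
    )
    # Rate limits and server errors are logged at WARNING so they stand out.
    is_failure = status_code == 429 or status_code >= 500
    logger.log(logging.WARNING if is_failure else logging.INFO, record)
    return record


logging.basicConfig(level=logging.INFO)
line = log_request("req_example_123", "example-model", 429)
```

Keeping the fields in a single JSON object makes it straightforward to filter by status code and then group by model id when checking whether failures cluster on one model.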
