# Streaming in Production

> Operational patterns for robust streaming UX and backend reliability.

Use these patterns when moving from prototype to production.

## Reliability checklist

* Set a client-side timeout and give callers a way to cancel the request.
* Retry `429` and `5xx` responses with exponential backoff.
* Treat a stream that disconnects before completion as retryable.
* If correctness matters, persist generated text only after the stream completes, never partial output.
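
The retry items above can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: `isRetryableStatus`, `backoffDelayMs`, and `fetchWithRetry` are hypothetical helper names, and the 500 ms base delay, 8 s cap, and four attempts are placeholder defaults to tune for your workload.

```typescript
// Hypothetical helpers; names and default values are assumptions.

// 429 and any 5xx status are treated as retryable.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: baseMs, 2x, 4x, ... capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

function isAbortError(err: unknown): boolean {
  return (err as { name?: string } | null)?.name === "AbortError";
}

async function fetchWithRetry<T extends { status: number }>(
  doFetch: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const isLast = attempt >= maxAttempts - 1;
    try {
      const response = await doFetch();
      // Return successes, non-retryable errors, and the final attempt as-is.
      if (!isRetryableStatus(response.status) || isLast) return response;
    } catch (err) {
      // Network failures are retried; deliberate aborts are not.
      if (isLast || isAbortError(err)) throw err;
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

A call might look like `await fetchWithRetry(() => fetch(url, init))`, where `url` and `init` are your own request details. Passing a function rather than a promise lets each attempt issue a fresh request.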

## Browser/Node cancellation example

```typescript
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 30_000);

try {
  const response = await fetch("https://api.phaseo.app/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.AI_STATS_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "openai/gpt-5-nano",
      messages: [{ role: "user", content: "Stream a short summary." }],
      stream: true,
    }),
    signal: controller.signal,
  });
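  if (!response.ok || !response.body) {
    throw new Error(`Streaming request failed with status ${response.status}`);
  }
  // Parse SSE frames from response.body here, inside the try block,
  // so the 30-second abort timer also covers reading the stream.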
} finally {
  clearTimeout(timeout);
}
```
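
The example above leaves SSE parsing as a placeholder comment. One way to fill it in is sketched below. `createSseParser`, `readSse`, and `ChunkReader` are hypothetical names, not part of any SDK. The parser assumes frames are separated by a blank line and keeps only `data:` fields; it handles `\n` and `\r\n` line endings but not the rarely used bare `\r`.

```typescript
// Hypothetical helper: buffers decoded text and emits one payload per SSE frame.
function createSseParser(onData: (payload: string) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    // Normalize CRLF after appending so a "\r\n" split across chunks is handled.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let boundary: number;
    while ((boundary = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      // Keep `data:` lines only; comment lines (": ...") and other fields are dropped.
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data !== "") onData(data);
    }
  };
}

// Minimal structural type matching what `response.body.getReader()` returns.
type ChunkReader = { read(): Promise<{ done: boolean; value?: Uint8Array }> };

async function readSse(reader: ChunkReader, onData: (payload: string) => void): Promise<void> {
  const feed = createSseParser(onData);
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // `stream: true` keeps multi-byte characters intact across chunk boundaries.
    if (value) feed(decoder.decode(value, { stream: true }));
  }
}
```

Inside the `try` block this could be called as `await readSse(response.body.getReader(), handlePayload)`, where `handlePayload` is your own callback. Because the reader belongs to the aborted request, `controller.abort()` also rejects the pending `read()`.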

## UI guidance

* Render token deltas progressively.
* Show an explicit "generation ended" state when `[DONE]` arrives.
* Log the request ID and model ID so a stream can be replayed and debugged later.
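
The rendering guidance above can be modeled as a small state reducer that is fed each `data:` payload string from the stream. This is a sketch only: `StreamState` and `applyPayload` are hypothetical names, and the chunk shape (`choices[0].delta.content`) is an assumption based on the OpenAI-style `/v1/chat/completions` path, so check it against the actual response schema.

```typescript
// Hypothetical UI state for one streamed generation.
type StreamState = { text: string; status: "streaming" | "ended" | "error" };

const initialState: StreamState = { text: "", status: "streaming" };

function applyPayload(state: StreamState, payload: string): StreamState {
  // Ignore anything that arrives after the stream has ended or failed.
  if (state.status !== "streaming") return state;
  // The `[DONE]` sentinel marks an explicit end of generation.
  if (payload === "[DONE]") return { ...state, status: "ended" };
  try {
    // Assumed chunk shape; not confirmed by this page.
    const chunk = JSON.parse(payload) as {
      choices?: { delta?: { content?: string | null } }[];
    };
    const delta = chunk.choices?.[0]?.delta?.content ?? "";
    return { ...state, text: state.text + delta };
  } catch {
    // A payload that is not valid JSON moves the UI into an error state.
    return { ...state, status: "error" };
  }
}
```

Rendering `state.text` after every call gives progressive output, and `status === "ended"` is the signal to show the explicit completion state. Treating malformed payloads as errors rather than skipping them is a design choice of this sketch.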

## When not to stream

* Tool-calling flows: current request validation rejects requests that combine `stream` with `tools`.
* Workflows requiring full validation before any user-visible output.
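
Both cases above can be enforced in one place before a request is sent. The sketch below is illustrative: `ChatRequest` and `withStreamingPolicy` are hypothetical names, and it conservatively assumes that any request carrying a `tools` field, even an empty one, should not stream.

```typescript
// Hypothetical request shape covering only the fields used here.
type ChatRequest = {
  model: string;
  messages: unknown[];
  tools?: unknown[];
  stream?: boolean;
};

// Returns a copy of the request with `stream` set according to the two rules:
// no streaming with tools, and no streaming when output must be validated first.
function withStreamingPolicy(
  request: ChatRequest,
  needsFullValidation = false,
): ChatRequest {
  const hasTools = request.tools !== undefined;
  return { ...request, stream: !(hasTools || needsFullValidation) };
}
```

Centralizing the decision keeps call sites from sending a `stream` and `tools` combination that validation would reject.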
