Use these patterns when moving from prototype to production.
Reliability checklist
- Set client-side timeout and cancellation controls.
- Handle 429 and 5xx with exponential backoff.
- Treat disconnected streams as retryable.
- If correctness matters, persist streamed text only after the stream completes, not while it is still partial.
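The backoff item in the checklist above could be sketched roughly as follows. This is a minimal illustration, not the gateway's documented retry policy: `withRetry` and `backoffDelay` are hypothetical helper names, and the base delay, cap, attempt count, and use of jitter are all assumptions to tune for your workload.

```javascript
// Hypothetical helper: exponential delay capped at capMs, with full jitter.
function backoffDelay(attempt, baseMs = 500, capMs = 8000) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

// Hypothetical helper: calls send() until it returns a non-retryable response.
// send is any function returning a promise of an object with a numeric status.
async function withRetry(send, options = {}) {
  const {
    maxAttempts = 4, // assumed default
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const response = await send();
      const retryable = response.status === 429 || response.status >= 500;
      if (!retryable) return response;
      lastError = new Error(`Retryable status ${response.status}`);
    } catch (err) {
      // A deliberate cancellation should not be retried.
      if (err && err.name === "AbortError") throw err;
      // Network failures and dropped connections are treated as retryable.
      lastError = err;
    }
    if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
  }
  throw lastError;
}
```

A call might look like `await withRetry(() => fetch(url, init))`. Note that a retry restarts the generation from the beginning, so any text received from a dropped stream should be discarded rather than appended to.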
Browser/Node cancellation example
// Abort the request if it has not finished within 30 seconds.
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 30_000);

try {
  const response = await fetch("https://api.phaseo.app/v1/chat/completions", {
    method: "POST",
    headers: {
      // process.env is Node-only; do not ship API keys to the browser.
      Authorization: `Bearer ${process.env.AI_STATS_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "openai/gpt-5-nano",
      messages: [{ role: "user", content: "Stream a short summary." }],
      stream: true,
    }),
    signal: controller.signal,
  });
  // Parse SSE frames from response.body here.
} finally {
  // Always clear the timer so it cannot fire after the request settles.
  clearTimeout(timeout);
}
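The "Parse SSE frames" step left as a comment in the example above might be filled in along these lines. This is a sketch under assumptions: `createSseParser` and `readStream` are hypothetical names, and the chunk shape (`choices[0].delta.content`) is assumed to follow the common OpenAI-compatible format, which this page does not confirm.

```javascript
// Hypothetical helper: incremental parser that emits the data of each SSE frame.
function createSseParser(onData) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    // Frames end with a blank line; the last piece may be incomplete, so keep it.
    const frames = buffer.split(/\r?\n\r?\n/);
    buffer = frames.pop();
    for (const frame of frames) {
      const data = frame
        .split(/\r?\n/)
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data) onData(data);
    }
  };
}

// Hypothetical helper: reads response.body and reports text deltas.
// Resolves to true if [DONE] was seen, false if the stream ended early.
async function readStream(response, onDelta) {
  const decoder = new TextDecoder();
  let sawDone = false;
  const feed = createSseParser((data) => {
    if (data === "[DONE]") {
      sawDone = true;
      return;
    }
    const payload = JSON.parse(data);
    // Assumed chunk shape; adjust to the actual response schema.
    const delta = payload.choices?.[0]?.delta?.content;
    if (delta) onDelta(delta);
  });
  const reader = response.body.getReader();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    feed(decoder.decode(value, { stream: true }));
  }
  return sawDone;
}
```

Buffering across chunks matters because network reads do not align with frame boundaries. A `false` return value is one way to detect a disconnected stream and hand it to retry logic.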
UI guidance
- Render token deltas progressively.
- Show an explicit “generation ended” state when [DONE] arrives.
- Keep request id and model id in logs for replay/debugging.
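One way to model the UI guidance above is a small state reducer that appends deltas and switches to an explicit ended state. The event names, status values, and the `requestId` and `modelId` fields below are illustrative assumptions, not part of the API.

```javascript
// Hypothetical initial state; requestId and modelId are kept for logging.
const initialState = {
  text: "",
  status: "idle",
  error: null,
  requestId: null,
  modelId: null,
};

// Hypothetical reducer: returns the next UI state for a stream event.
function reduceStream(state, event) {
  switch (event.type) {
    case "start":
      return {
        ...initialState,
        status: "streaming",
        requestId: event.requestId,
        modelId: event.modelId,
      };
    case "delta":
      // Append each token delta so the UI can render progressively.
      return { ...state, text: state.text + event.text };
    case "done":
      // Emitted when [DONE] arrives.
      return { ...state, status: "ended" };
    case "error":
      return { ...state, status: "failed", error: event.message };
    default:
      return state;
  }
}
```

Keeping the status separate from the text lets the interface distinguish a finished answer from one that simply stopped receiving tokens.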
When not to stream
- Tool-calling flows: current request validation rejects requests that combine stream with tools.
- Workflows requiring full validation before any user-visible output.
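The two cases above can be enforced where the request body is built, so callers cannot accidentally send a combination that is rejected. A minimal sketch, assuming a hypothetical `buildRequestBody` helper and a caller-supplied `requireFullValidation` flag:

```javascript
// Hypothetical helper: enables streaming only when neither exception applies.
function buildRequestBody({ model, messages, tools, requireFullValidation = false }) {
  const hasTools = Array.isArray(tools) && tools.length > 0;
  const body = {
    model,
    messages,
    // Stream only when no tools are sent and output may be shown before validation.
    stream: !hasTools && !requireFullValidation,
  };
  if (hasTools) body.tools = tools;
  return body;
}
```

Centralizing the decision keeps the rule in one place if the validation behavior changes later.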
Last modified on February 18, 2026