# WebSocket Mode

> Gateway-first walkthrough for /v1/responses/ws with concrete connection, turn, and recovery flow.

This page explains how the AI Stats Gateway `/v1/responses/ws` endpoint currently behaves.

<Note type="warning">
  `/v1/responses/ws` is OpenAI-only and requires model IDs in `openai/<model>` format.
</Note>

<Note type="warning">
  This endpoint is still experimental on AI Stats Gateway and is currently not recommended for production workloads. For production, prefer `POST /v1/responses` (and SSE streaming when needed).
</Note>

## Endpoint and handshake

* URL: `wss://api.phaseo.app/v1/responses/ws`
* Method semantics: the WebSocket handshake starts as an HTTP `GET` with an `Upgrade: websocket` header.
* Success status: `101 Switching Protocols` (not `200`).
* Auth: `Authorization: Bearer YOUR_API_KEY`

If you call this path as a plain HTTP `GET` without WebSocket upgrade headers, the gateway returns `426 websocket_upgrade_required`.
If WebSocket mode is disabled for the current deployment, the handshake instead returns `501 responses_websocket_disabled` before any upgrade or auth processing begins.
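
The handshake statuses above can be turned into a small decision helper. This is a minimal sketch: `classifyHandshake` and the `HandshakeOutcome` labels are hypothetical names chosen for illustration, not part of the gateway or any SDK.

```ts
// Hypothetical labels for the handshake outcomes described above.
type HandshakeOutcome =
  | "connected" // 101 Switching Protocols: the socket is open
  | "missing_upgrade" // 426 websocket_upgrade_required
  | "mode_disabled" // 501 responses_websocket_disabled
  | "other_failure"; // any other status, e.g. an auth failure

// Map the HTTP status of the upgrade attempt to a next step.
function classifyHandshake(status: number): HandshakeOutcome {
  switch (status) {
    case 101:
      return "connected";
    case 426:
      return "missing_upgrade";
    case 501:
      return "mode_disabled";
    default:
      return "other_failure";
  }
}
```

With the `ws` package, a rejected upgrade surfaces through the client's `unexpected-response` event, whose response object carries the `statusCode` you could pass to a helper like this. A `mode_disabled` outcome is a reasonable signal to fall back to `POST /v1/responses`.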

## How the gateway processes this endpoint

For each socket session, the gateway enforces these rules:

* Only `type: "response.create"` client messages are accepted.
* The model must use provider/model format with the OpenAI provider: `openai/<model>`.
* Exactly one in-flight response is allowed per connection.
* The model is locked after the first valid turn (`model_mismatch` if changed later).
* `store` is always forced to `false`.
* HTTP-style flags are removed before upstream send: `stream`, `stream_options`, `background`.
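
To catch rule violations before a message leaves your client, you can mirror these checks locally. The sketch below is an assumption about how such a pre-check might look, not the gateway's actual implementation; `normalizeResponseCreate` and `lockedModel` are hypothetical names. The in-flight rule is not covered here because it depends on socket state.

```ts
type ResponseCreate = { type: string; model: string; [key: string]: unknown };

const OPENAI_MODEL = /^openai\/.+$/;
const HTTP_ONLY_FLAGS = ["stream", "stream_options", "background"];

// Validate a client message against the session rules listed above and
// return the shape the gateway would forward upstream.
// lockedModel is the model of the first valid turn, or null before it.
function normalizeResponseCreate(
  payload: ResponseCreate,
  lockedModel: string | null,
): Record<string, unknown> {
  if (payload.type !== "response.create") {
    throw new Error("invalid_response_create: only response.create is accepted");
  }
  if (!OPENAI_MODEL.test(payload.model)) {
    throw new Error("invalid_response_create: model must be openai/<model>");
  }
  if (lockedModel !== null && payload.model !== lockedModel) {
    throw new Error("model_mismatch: the model is locked after the first turn");
  }
  // store is always forced to false; HTTP-style flags are dropped.
  const out: Record<string, unknown> = { ...payload, store: false };
  for (const flag of HTTP_ONLY_FLAGS) {
    delete out[flag];
  }
  return out;
}
```

Running this before `ws.send` gives you the failure locally instead of as an `error` event on the socket.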

## Step-by-step implementation

### 1) Pick an OpenAI-routable model

Use `GET /v1/gateway/models` to discover candidate model IDs, then run a first turn over `/v1/responses/ws` to confirm routing for your team/key.

Example model IDs that work with this endpoint:

* `openai/gpt-5-nano`
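
Once you have model IDs from `GET /v1/gateway/models`, a prefix filter narrows them to candidates for this endpoint. The response shape of that route is not described on this page, so the sketch assumes you have already extracted a plain list of ID strings; `openAiCandidates` is a hypothetical helper name.

```ts
// Keep only IDs in openai/<model> format. Passing this filter does not
// guarantee routing for your team/key; a first turn still has to confirm it.
function openAiCandidates(modelIds: string[]): string[] {
  return modelIds.filter((id) => /^openai\/.+$/.test(id));
}
```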

### 2) Open one authenticated WebSocket connection

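```ts
import WebSocket from "ws";

// The API key travels as a Bearer token on the upgrade request.
const ws = new WebSocket("wss://api.phaseo.app/v1/responses/ws", {
  headers: {
    Authorization: `Bearer ${process.env.AI_STATS_API_KEY}`,
  },
});

// Wait for the handshake to finish; reject if the connection fails first.
// Top-level await requires an ES module.
await new Promise<void>((resolve, reject) => {
  ws.once("open", () => resolve());
  ws.once("error", reject);
});
```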

### 3) Send your first `response.create`

```ts
ws.send(JSON.stringify({
  type: "response.create",
  model: "openai/gpt-5-nano",
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: "Summarize this issue in 3 bullets." }],
    },
  ],
  tools: [],
}));
```

### 4) Read server events until completion

The gateway forwards OpenAI Responses WebSocket events. In practice, handle at least:

* `response.created`
* `response.output_text.delta`
* `response.completed`
* `response.failed`
* `error`

```ts
let lastResponseId: string | null = null;

ws.on("message", (raw) => {
  const msg = JSON.parse(String(raw));

  if (msg.type === "response.output_text.delta") {
    process.stdout.write(msg.delta ?? "");
    return;
  }

  if (msg.type === "response.completed") {
    lastResponseId = msg.response?.id ?? null;
    console.log("\ncompleted:", lastResponseId);
    return;
  }

  if (msg.type === "response.failed" || msg.type === "error") {
    console.error("gateway ws error:", msg.error ?? msg);
  }
});
```
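
If you would rather collect the streamed text than print it, the same event handling can be written as a pure state update. This is a minimal sketch: `TurnState`, `initialTurnState`, and `applyEvent` are hypothetical names, and the event fields used (`delta`, `response.id`, `error`) are the ones read by the handler above.

```ts
type GatewayEvent = {
  type: string;
  delta?: string;
  response?: { id?: string };
  error?: unknown;
};

type TurnState = {
  text: string;
  responseId: string | null;
  status: "streaming" | "completed" | "failed";
  error: unknown;
};

const initialTurnState: TurnState = {
  text: "",
  responseId: null,
  status: "streaming",
  error: null,
};

// Fold one server event into the state of the current turn.
function applyEvent(state: TurnState, msg: GatewayEvent): TurnState {
  switch (msg.type) {
    case "response.created":
      return { ...state, responseId: msg.response?.id ?? state.responseId };
    case "response.output_text.delta":
      return { ...state, text: state.text + (msg.delta ?? "") };
    case "response.completed":
      return {
        ...state,
        status: "completed",
        responseId: msg.response?.id ?? state.responseId,
      };
    case "response.failed":
    case "error":
      return { ...state, status: "failed", error: msg.error ?? msg };
    default:
      return state; // events this sketch does not model are ignored
  }
}
```

Because the function has no side effects, you can replay recorded events through it in tests without opening a socket.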

### 5) Continue the same conversation chain

For subsequent turns on the same chain:

* Keep the same `model`.
* Send only new turn input.
* Set `previous_response_id` to the ID of the prior completed response.

```ts
ws.send(JSON.stringify({
  type: "response.create",
  model: "openai/gpt-5-nano",
  previous_response_id: lastResponseId,
  input: [
    {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: "Now convert that to an action list." }],
    },
  ],
}));
```
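
A small builder keeps first and follow-up turns consistent. The sketch below is one possible shape, with `buildTurn` as a hypothetical helper; it omits `previous_response_id` when no prior response has completed, which is a design choice of this example rather than a documented gateway requirement.

```ts
type InputMessage = {
  type: "message";
  role: "user";
  content: { type: "input_text"; text: string }[];
};

type TurnPayload = {
  type: "response.create";
  model: string;
  input: InputMessage[];
  previous_response_id?: string;
};

// Build a response.create payload carrying only the new user input.
function buildTurn(
  model: string,
  text: string,
  previousResponseId: string | null,
): TurnPayload {
  const payload: TurnPayload = {
    type: "response.create",
    model,
    input: [
      { type: "message", role: "user", content: [{ type: "input_text", text }] },
    ],
  };
  // Chain onto the prior turn only when its ID is known.
  if (previousResponseId !== null) {
    payload.previous_response_id = previousResponseId;
  }
  return payload;
}
```

A follow-up turn then becomes `ws.send(JSON.stringify(buildTurn("openai/gpt-5-nano", "Now convert that to an action list.", lastResponseId)))`.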

### 6) Handle gateway-specific errors with explicit recovery

* `invalid_response_create`: gateway pre-validation failed (payload must be a JSON object with `type: "response.create"` and `model` in `openai/<model>` format).
* `openai_routing_failed`: gateway could not route your OpenAI model for this key/team; inspect `error.message` details (which may include `openai_provider_unavailable` or `pricing_unavailable`) and choose a routable model.
* `response_already_in_flight`: wait for the current turn to finish before sending the next turn.
* `model_mismatch`: open a new socket if you want a different model.
* `upstream_websocket_handshake_failed` / `upstream_websocket_closed`: reconnect with backoff and retry the turn.
* `previous_response_not_found`: resend without `previous_response_id` and include full context.

Gateway behavior detail: if `previous_response_not_found` occurs and your prior payload had a non-null `previous_response_id` and an array `input`, the gateway automatically retries once with `previous_response_id: null`. If the error still reaches your client, apply the manual recovery listed above.
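
The recovery rules above fit naturally into a lookup table. This is a sketch under assumptions: the `Recovery` labels and `recoveryFor` are hypothetical, and the helper takes the error code as a plain string because where the code sits inside the error event is not specified on this page.

```ts
type Recovery =
  | "fix_payload"
  | "choose_other_model"
  | "wait_for_turn"
  | "open_new_socket"
  | "reconnect_with_backoff"
  | "resend_full_context"
  | "surface_to_caller";

// One entry per gateway-specific error code listed above.
const RECOVERY_BY_CODE = new Map<string, Recovery>([
  ["invalid_response_create", "fix_payload"],
  ["openai_routing_failed", "choose_other_model"],
  ["response_already_in_flight", "wait_for_turn"],
  ["model_mismatch", "open_new_socket"],
  ["upstream_websocket_handshake_failed", "reconnect_with_backoff"],
  ["upstream_websocket_closed", "reconnect_with_backoff"],
  ["previous_response_not_found", "resend_full_context"],
]);

// Unknown or missing codes are surfaced instead of being retried blindly.
function recoveryFor(code: string | undefined): Recovery {
  if (code === undefined) return "surface_to_caller";
  return RECOVERY_BY_CODE.get(code) ?? "surface_to_caller";
}
```

Keeping the mapping in one place makes it easy to log the chosen action next to the error code.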

### 7) Evaluation checklist (non-production)

* Keep one socket per active conversation worker.
* Enforce per-socket turn queueing to avoid `response_already_in_flight`.
* Add timeouts and reconnect with exponential backoff.
* Log response IDs, error codes, and close codes.
* Rotate API keys via normal gateway key management flow.
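
Two checklist items, per-socket turn queueing and exponential backoff, can be sketched in a few lines. Everything here is illustrative: `TurnQueue`, `backoffDelayMs`, and the default delays are assumptions, not gateway-provided utilities or recommended values.

```ts
// Capped exponential backoff: 500 ms, 1 s, 2 s, ... up to capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Serializes turns so that at most one response is in flight per socket.
class TurnQueue {
  private readonly pending: string[] = [];
  private inFlight = false;
  private readonly send: (frame: string) => void;

  // send is whatever writes a text frame, e.g. (frame) => ws.send(frame).
  constructor(send: (frame: string) => void) {
    this.send = send;
  }

  // Queue a response.create payload; it is sent once the socket is idle.
  enqueue(payload: object): void {
    this.pending.push(JSON.stringify(payload));
    this.flush();
  }

  // Call when the current turn ends (response.completed, response.failed, or error).
  turnFinished(): void {
    this.inFlight = false;
    this.flush();
  }

  private flush(): void {
    if (this.inFlight) return;
    const next = this.pending.shift();
    if (next === undefined) return;
    this.inFlight = true;
    this.send(next);
  }
}
```

In a message handler you would call `queue.turnFinished()` when a terminal event arrives, and use `backoffDelayMs(attempt)` to space out reconnect attempts after an upstream close.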

## Related pages

1. [Responses WebSocket API Reference](../api-reference/endpoint/responses-ws)
2. [Responses Endpoint](../api-reference/endpoint/responses)
3. [Streaming](./streaming)
