If your app already uses LLM Gateway through an OpenAI-compatible client, you can usually keep your request payloads unchanged and migrate by replacing only the gateway boundary first.
Documentation Index
Fetch the complete documentation index at: https://docs.ai-stats.phaseo.app/llms.txt
Use this file to discover all available pages before exploring further.
Before you start
- Your existing LLM Gateway endpoint and API key configuration.
- AI_STATS_API_KEY added to local, staging, and production environments.
- A baseline sample of output quality, latency, and error rate.
1) Inventory integration points
Identify the exact files that create and configure your LLM Gateway client; a typical boundary module is sketched after this list.
- Locate all LLM_GATEWAY_* env var usage.
- Find all base URL references in runtime config.
- Capture the currently active model IDs and fallback chains.
- Note any shared prompt defaults, provider allow/deny logic, or parameter presets that should move into Gateway presets.
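For example, keeping the client in one boundary module makes the remaining steps a one-file change. This is only a sketch: the env var names (LLM_GATEWAY_BASE_URL, LLM_GATEWAY_API_KEY) and the model IDs are illustrative placeholders, not values this guide prescribes.

```ts
// llm/client.ts: hypothetical single boundary module for the gateway client.
// Env var names and model IDs below are illustrative; substitute what your app uses today.
import OpenAI from "openai";

export const llmClient = new OpenAI({
  baseURL: process.env.LLM_GATEWAY_BASE_URL, // current LLM Gateway endpoint
  apiKey: process.env.LLM_GATEWAY_API_KEY,   // current LLM Gateway key
});

// Keep the active model IDs and fallback chain here rather than scattered across callers.
export const defaultModel = "gpt-4o";
export const fallbackModels = ["gpt-4o-mini"];
```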
2) Switch endpoint and credentials
Keep payloads unchanged first. Start with a pure endpoint and key migration to reduce risk.
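As a minimal sketch of that switch, assuming an OpenAI-compatible Node client, only the base URL and key change while the request payload stays exactly as it was:

```ts
import OpenAI from "openai";

// Only the gateway boundary changes: new base URL and key, same payloads.
const client = new OpenAI({
  baseURL: "https://api.phaseo.app/v1",
  apiKey: process.env.AI_STATS_API_KEY,
});

// The existing request shape goes through the new endpoint unchanged.
const completion = await client.chat.completions.create({
  model: "gpt-4o", // left as-is for now; model IDs are revisited in step 3
  messages: [{ role: "user", content: "ping" }],
});
console.log(completion.choices[0].message.content);
```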
3) Validate model compatibility
Query the AI Stats model catalog and verify every model used in production. If your current setup uses unprefixed aliases such as gpt-4o, normalize them at one boundary layer instead of changing every caller.
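A sketch of that check, assuming the catalog is reachable through the standard OpenAI-compatible /v1/models listing; the prefixed ID openai/gpt-4o shown here is only an illustrative guess at the catalog's naming:

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.phaseo.app/v1",
  apiKey: process.env.AI_STATS_API_KEY,
});

// Illustrative alias map: normalize unprefixed aliases in one boundary layer.
// Confirm the prefixed forms against the actual catalog before relying on them.
const aliasMap: Record<string, string> = { "gpt-4o": "openai/gpt-4o" };
const productionModels = ["gpt-4o"]; // the inventory captured in step 1

const available = new Set<string>();
for await (const model of client.models.list()) {
  available.add(model.id);
}

for (const alias of productionModels) {
  const id = aliasMap[alias] ?? alias;
  console.log(`${alias} -> ${id}: ${available.has(id) ? "ok" : "MISSING"}`);
}
```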
If your current gateway layer also centralizes request defaults or provider restrictions, map that behavior into Presets and into Routing and Fallbacks during the migration instead of re-implementing it per caller.
4) LLM Gateway migration checklist
- All LLM_GATEWAY_* env vars mapped or removed.
- Base URL updated to https://api.phaseo.app/v1.
- AI_STATS_API_KEY configured in all deployment environments.
- Production model IDs verified against /v1/models.
- One non-streaming and one streaming request validated in staging.
- Invalid-key and invalid-model failure handling rechecked.
- Shared prompt/routing defaults moved into presets where appropriate.
- Generation lookups rechecked through GET /v1/generations?id=<request_id> so failed requests can be replayed from the stored replay_request payload when replay_supported=true (see the sketch after this checklist).
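A sketch of that recheck, assuming the generation record comes back as JSON with the replay_supported and replay_request fields mentioned above; the exact response envelope and the route used for the replay are assumptions to confirm against the API reference:

```ts
// Look up a stored generation and, if supported, replay its stored request payload.
const baseUrl = "https://api.phaseo.app/v1";
const headers = { Authorization: `Bearer ${process.env.AI_STATS_API_KEY}` };

async function replayGeneration(requestId: string) {
  const res = await fetch(`${baseUrl}/generations?id=${requestId}`, { headers });
  const generation = await res.json(); // assumed shape: { replay_supported, replay_request, ... }

  if (!generation.replay_supported) {
    console.log("No stored replay payload for", requestId);
    return;
  }

  // Re-sending the stored body through chat completions is one plausible replay path.
  const replay = await fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: { ...headers, "Content-Type": "application/json" },
    body: JSON.stringify(generation.replay_request),
  });
  console.log("Replay status:", replay.status);
}
```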
5) Validate and roll out
- Run your golden prompt suite and compare quality, latency, and cost metrics to baseline.
- Confirm failed staging requests can be recovered through the replay payload returned by GET /v1/generations.
- Release behind a canary flag and move from a low percentage of traffic to full traffic once stable.
- Observe production metrics for at least one release cycle before removing old configuration.
Validation commands
- Run one non-streaming and one streaming request in staging (see the sketch below).
- Replay your golden prompts and compare to baseline.
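A minimal staging script covering both checks could look like the following, assuming an OpenAI-compatible Node client and the staging key in AI_STATS_API_KEY; the golden-prompt comparison is left as a placeholder for your own suite:

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.phaseo.app/v1",
  apiKey: process.env.AI_STATS_API_KEY,
});

// 1) One non-streaming request.
const single = await client.chat.completions.create({
  model: "gpt-4o", // use a model ID verified in step 3
  messages: [{ role: "user", content: "Reply with the single word: ready" }],
});
console.log("non-streaming:", single.choices[0].message.content);

// 2) One streaming request.
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  stream: true,
  messages: [{ role: "user", content: "Count from 1 to 5." }],
});
let streamed = "";
for await (const chunk of stream) {
  streamed += chunk.choices[0]?.delta?.content ?? "";
}
console.log("streaming:", streamed);

// 3) Golden prompts: replay each prompt and compare quality, latency, and cost to baseline.
// (Placeholder: plug in your own prompt suite and metrics collection here.)
```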