Adopting the Gateway across production workloads is straightforward when you establish strong foundations. Use the recommendations below to improve resilience, security, and maintainability.

Architecture

  • Isolate provider logic. Wrap Gateway calls in a dedicated service or SDK so changes stay confined.
  • Stream responses when possible. Reduce latency for assistants by streaming tokens to the UI as they arrive.
  • Batch non-critical jobs. Queue background generations or evaluations to smooth out traffic spikes.
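As an illustration of the first two points, the sketch below keeps every Gateway call behind one small class and exposes responses as a token stream. All names here (`GatewayClient`, `transport`, `fake_transport`, `"example-model"`) are hypothetical and are not part of any Gateway SDK. The real transport would be whatever HTTP or SDK call your application already uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, Optional


@dataclass
class GatewayClient:
    """The one place where application code talks to the Gateway.

    `transport` is an injected callable (hypothetical) that takes a request
    payload and yields response tokens as they arrive.
    """

    transport: Callable[[Dict[str, object]], Iterator[str]]
    default_model: str = "example-model"  # placeholder, not a real model name

    def stream(self, prompt: str, model: Optional[str] = None) -> Iterator[str]:
        """Yield tokens one by one so a UI can render them immediately."""
        payload = {
            "model": model or self.default_model,
            "prompt": prompt,
            "stream": True,
        }
        yield from self.transport(payload)

    def complete(self, prompt: str, model: Optional[str] = None) -> str:
        """Collect the whole stream for callers that do not need streaming."""
        return "".join(self.stream(prompt, model))


def fake_transport(payload: Dict[str, object]) -> Iterator[str]:
    """Stand-in transport for tests: echoes the prompt back in upper case."""
    for word in str(payload["prompt"]).split():
        yield word.upper() + " "


client = GatewayClient(transport=fake_transport)
for token in client.stream("hello from the gateway"):
    print(token, end="", flush=True)
print()
```

Because the transport is injected, swapping providers or stubbing the Gateway in tests only touches this one class.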

Security

  • Store API keys in secret managers (AWS Secrets Manager, Doppler, 1Password) instead of .env files in production.
  • Keep Gateway keys on your servers: have client applications call your own APIs, authenticated with request signing, so the keys are never exposed client-side.
  • Log only truncated keys (for example sk_prod_abcd…) to avoid leaking credentials.
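One possible way to produce the truncated form before a key reaches a log line is sketched below. The helper name `truncate_key`, the four-character default, and the assumption that keys use an underscore-separated prefix (as in the `sk_prod_abcd…` example) are illustrative only. The key in the usage line is made up.

```python
def truncate_key(key: str, visible: int = 4) -> str:
    """Return a log-safe form of an API key.

    Keeps everything up to the last underscore plus the first `visible`
    characters of the secret part, then appends an ellipsis.
    """
    prefix, sep, secret = key.rpartition("_")
    if len(secret) <= visible:
        # Too short to reveal any characters without exposing the whole secret.
        return f"{prefix}{sep}…"
    return f"{prefix}{sep}{secret[:visible]}…"


# Made-up key for illustration only.
print(truncate_key("sk_prod_abcd1234efgh5678"))
```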

Reliability

  • Implement exponential backoff with jitter for all retryable errors (HTTP 429 or 5xx).
  • Respect the Retry-After header before retrying rate-limited requests.
  • Monitor provider metadata to detect automatic failovers from one vendor to another.
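The first two points can be combined in a small retry loop. This is a minimal sketch, not a Gateway API: `send` is a hypothetical callable returning `(status, headers, body)`, the base delay and cap are arbitrary defaults, and the headers mapping is assumed to expose `Retry-After` under exactly that key. Only the delay-in-seconds form of `Retry-After` is handled; the HTTP-date form falls back to the computed backoff.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def is_retryable(status: int) -> bool:
    """HTTP 429 and any 5xx response are treated as retryable."""
    return status == 429 or 500 <= status < 600


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt))."""
    return random.random() * min(cap, base * (2 ** attempt))


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Parse a numeric Retry-After value; return None if absent or not numeric."""
    value = headers.get("Retry-After")
    if value is None:
        return None
    try:
        return max(0.0, float(value))
    except ValueError:
        return None  # HTTP-date form is not handled in this sketch


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it succeeds, fails permanently, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        status, headers, _body = response
        if not is_retryable(status) or attempt == max_attempts - 1:
            return response
        # Prefer the server's hint; otherwise fall back to jittered backoff.
        delay = retry_after_seconds(headers)
        if delay is None:
            delay = backoff_delay(attempt)
        sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.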

Observability

Signal       | Recommended action
Latency      | Track p95 and p99 timings to catch regressions early.
Token usage  | Correlate costs with business metrics to evaluate ROI.
Error rates  | Alert when failures exceed 1% for a sustained period.
Provider mix | Ensure traffic is distributed as expected across vendors.
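For teams computing these signals themselves rather than in a metrics backend, the latency and error-rate rows could be sketched as follows. The function names are hypothetical, the percentile uses the nearest-rank method (one of several common definitions), and the 1% default mirrors the table. What counts as a "sustained period" is left to the caller, who would run the check over a time window of their choosing.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of `samples` for 0 < pct <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate_exceeded(total: int, failed: int, threshold: float = 0.01) -> bool:
    """True when the share of failed requests in a window is above `threshold`."""
    return total > 0 and failed / total > threshold


# Illustrative latencies in milliseconds, not real measurements.
latencies_ms = [120, 135, 110, 480, 125, 130, 140, 118, 122, 900]
print("p95:", percentile(latencies_ms, 95))
print("p99:", percentile(latencies_ms, 99))
print("alert:", error_rate_exceeded(total=1000, failed=14))
```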

Collaboration

  • Document prompt templates, input parameters, and output handling in your internal wiki.
  • Share dashboards that highlight benchmark movements and how they affect your product.
  • Encourage regular reviews with product, research, and support teams to keep behaviour aligned.

Further reading