Use this recipe when the model stays the same but you want the gateway to rank provider offers differently for one request.Documentation Index
Fetch the complete documentation index at: https://docs.ai-stats.phaseo.app/llms.txt
Use this file to discover all available pages before exploring further.
Goal
- Prefer the cheapest provider for one request.
- Prefer the lowest-latency provider for interactive paths.
- Prefer the highest-throughput provider for bulk generation.
1. Add provider.sort to the request
For text requests, set the sort you want directly on the provider object.
pricelatencythroughput
2. Pick the sort that matches the workload
Use:pricewhen cost matters more than tail latencylatencyfor chat, copilots, and human-in-the-loop toolsthroughputfor high-volume generation or backfills
3. Understand what the gateway compares
When you send an explicit request-level sort, the gateway ranks candidates deterministically instead of using the normal balanced weighted shuffle. For text models:pricecompares a common price basis across the eligible providers- if shared text meters exist, the gateway prefers matching
input_text_tokensandoutput_text_tokens latencyuses the latest provider latency datathroughputuses the latest throughput measurements
4. Keep the provider pool realistic
Sorting works best after you narrow the candidate pool when needed. For example, sort only among one approved set of providers:5. Verify the ranked outcome
When debugging, inspect the request in Gateway -> Usage and look for:- providers considered
- ranked providers
- the routing score factors for price, latency, or throughput