Goal
- keep the Python caller small
- route through a preset slug instead of one hard-coded model
- request strict structured output
- retain enough response metadata to debug routing or plugin behavior
1. Start with one shared client
2. Move stable defaults into a preset
Create a preset in Dashboard -> Settings -> Presets when these should stay stable across multiple callers:- system prompt
- model or model allowlist
- provider preferences
- reasoning config
- temperature and related generation parameters
- response caching policy when deterministic replay matters
3. Request one strict JSON shape
presetkeeps routing and prompt defaults outside application coderesponse_formatkeeps the contract explicitpluginscan recover near-valid malformed JSON when that workflow allows itmetapreserves routing and plugin execution detail for debugging
4. Parse the JSON, then log the operational identifiers
- the request detail dialog in the dashboard
- routing diagnostics
- plugin execution metadata
5. Debug before you override
If one request routes differently from what you expected:- open the request in Gateway -> Usage
- inspect routing diagnostics and provider candidates
- inspect plugin execution metadata if structured JSON was involved
- change the preset only after the logs show what actually happened
6. Keep caching compatible when you want reuse
If the preset enables response caching:- keep the prompt wording stable
- keep the response schema stable
- avoid unnecessary per-request provider overrides
- avoid volatile tool lists