Mistral: Mistral Small 3.2 to Mistral Small 4
Use this guide when migrating Mistral Small routes to:
mistral/mistral-small-4-2026-03-16
Previous common route:
mistral/mistral-small-3-2-2025-06-20
What changed
- Mistral Small 4 exposes a reasoning mode control via reasoning_effort.
- Gateway sends only two upstream values for this model route:
  - none (reasoning off)
  - high (reasoning on)
- Gateway normalizes Mistral thinking blocks back into standard response fields:
  - message.content for assistant-visible text
  - message.reasoning_content and message.reasoning_details for reasoning text
Required request changes
Reasoning defaults to off for this route.
Gateway mapping rules:
- if you omit reasoning controls, Gateway sends reasoning_effort: "none"
- if you pass reasoning.enabled: true, Gateway sends reasoning_effort: "high"
- if you pass reasoning.effort: "none", Gateway sends reasoning_effort: "none"
- if you pass any other reasoning.effort value, Gateway sends reasoning_effort: "high"
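The mapping rules above can be sketched as a small pure function. This is an illustration only, not Gateway's actual implementation: the function name upstream_reasoning_effort is hypothetical, and the handling of a request that sets both reasoning.effort and reasoning.enabled is an assumption, since the rules do not state which one takes precedence.

```python
from typing import Any, Mapping, Optional


def upstream_reasoning_effort(reasoning: Optional[Mapping[str, Any]] = None) -> str:
    """Return the reasoning_effort value sent upstream (illustrative sketch)."""
    if not reasoning:
        # No reasoning controls supplied: reasoning stays off.
        return "none"

    effort = reasoning.get("effort")
    if effort is not None:
        # Assumption: an explicit effort wins over "enabled" when both are set.
        # "none" passes through; every other effort value collapses to "high".
        return "none" if effort == "none" else "high"

    if reasoning.get("enabled") is True:
        return "high"

    # Assumption: "enabled" false or absent behaves like omitted controls.
    return "none"
```

For instance, under these assumptions a client that previously sent reasoning.effort: "low" or "medium" would see the same upstream value as one sending "high".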
Recommended usage:
- use reasoning.effort: "none" for lightweight behavior
- use reasoning.effort: "high" for deep reasoning
Example:
{
  "model": "mistral/mistral-small-4-2026-03-16",
  "messages": [
    { "role": "user", "content": "Why is fast inference important?" }
  ],
  "reasoning": {
    "effort": "high"
  }
}
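If you build request bodies in code, a helper along these lines keeps callers restricted to the two effort values this route uses. It is a minimal sketch: build_chat_request and its deep_reasoning flag are hypothetical names, and the snippet only constructs and prints the JSON body. How you send it (endpoint URL, authentication) depends on your Gateway setup and is not shown.

```python
import json

MODEL_ID = "mistral/mistral-small-4-2026-03-16"


def build_chat_request(prompt: str, deep_reasoning: bool = False) -> dict:
    """Build a request body shaped like the JSON example above."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        # Only "none" and "high" are emitted, matching the two upstream values.
        "reasoning": {"effort": "high" if deep_reasoning else "none"},
    }


body = build_chat_request("Why is fast inference important?", deep_reasoning=True)
print(json.dumps(body, indent=2))
```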
Response handling notes
Mistral may return mixed assistant content blocks (for example, thinking plus text). Gateway maps them into a stable shape so you can read:
- the final user-visible answer from choices[0].message.content
- the reasoning text from choices[0].message.reasoning_content
This lets you preserve downstream behavior that expects OpenAI-style chat completion fields.
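Reading the two fields from a parsed response might look like the sketch below. The function name and the sample response are hypothetical, made up for illustration. The code also assumes reasoning_content may be missing or null when reasoning is off, which this guide does not confirm, so it reads that field defensively.

```python
from typing import Any, Mapping, Optional, Tuple


def split_answer_and_reasoning(response: Mapping[str, Any]) -> Tuple[str, Optional[str]]:
    """Return (answer, reasoning) from an OpenAI-style chat completion dict."""
    message = response["choices"][0]["message"]
    answer = message.get("content") or ""
    # Assumption: the field may be absent or null when reasoning is off.
    reasoning = message.get("reasoning_content")
    return answer, reasoning


# Hypothetical response body, for illustration only.
sample = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "Fast inference keeps interactive apps responsive.",
                "reasoning_content": "The user asks about latency trade-offs.",
            }
        }
    ]
}

answer, reasoning = split_answer_and_reasoning(sample)
```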
Rollout checklist
- Update model ID to mistral/mistral-small-4-2026-03-16.
- Validate reasoning controls using only none and high.
- Re-run output quality and schema tests on your high-effort routes.
- Compare latency/cost for none vs high before full cutover.
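For the latency comparison in the last checklist item, a rough timing harness could look like this. It is a sketch under assumptions: compare_latency is a hypothetical helper, and send stands for whatever function your application already uses to post a request body to Gateway. It measures wall-clock latency only. Cost has to come from your own usage or billing data.

```python
import statistics
import time
from typing import Callable, Dict

MODEL_ID = "mistral/mistral-small-4-2026-03-16"


def compare_latency(
    send: Callable[[dict], object], prompt: str, runs: int = 5
) -> Dict[str, float]:
    """Return the median latency in seconds for effort "none" and "high"."""
    results: Dict[str, float] = {}
    for effort in ("none", "high"):
        timings = []
        for _ in range(runs):
            body = {
                "model": MODEL_ID,
                "messages": [{"role": "user", "content": prompt}],
                "reasoning": {"effort": effort},
            }
            start = time.perf_counter()
            send(body)  # caller-supplied; expected to block until the response arrives
            timings.append(time.perf_counter() - start)
        results[effort] = statistics.median(timings)
    return results
```

Running it on a sample of real prompts from your traffic gives a more representative comparison than a single synthetic prompt.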
Last modified on March 17, 2026