Tool calling lets models request structured actions (for example, database lookups, weather checks, or internal API calls) instead of guessing answers.
The Gateway supports tool payloads across these text endpoints:
- /v1/chat/completions (OpenAI-style tools and tool_calls)
- /v1/responses (Responses-style function_call output items)
- /v1/messages (Anthropic-style tool_use blocks)
Request
curl https://api.phaseo.app/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5-nano",
    "messages": [
      { "role": "user", "content": "What is the weather in London?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather by city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": {
      "type": "function",
      "function": { "name": "get_weather" }
    },
    "stream": false
  }'
Response
{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"London\"}"
            }
          }
        ]
      }
    }
  ]
}
Run your tool, then send the tool result back in the next request so the assistant can finish the answer.
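A minimal Python sketch of that round trip, assuming the gateway accepts the OpenAI-style `tool` role message (with a `tool_call_id`) on /v1/chat/completions. The `get_weather` implementation, its return fields, and the `LOCAL_TOOLS` registry are hypothetical stand-ins; only the request and response shapes come from the example above.

```python
import json


def get_weather(city: str) -> dict:
    # Hypothetical local tool; a real one would query a weather service.
    return {"city": city, "temperature_c": 14, "conditions": "cloudy"}


# Hypothetical registry mapping tool names to local callables.
LOCAL_TOOLS = {"get_weather": get_weather}


def build_follow_up_messages(messages: list, response: dict) -> list:
    """Append the assistant tool-call turn plus one tool result per call."""
    assistant = response["choices"][0]["message"]
    follow_up = list(messages) + [assistant]
    for call in assistant.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON string
        result = LOCAL_TOOLS[fn["name"]](**args)
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the call
            "content": json.dumps(result),
        })
    return follow_up


messages = [{"role": "user", "content": "What is the weather in London?"}]
response = {
    "choices": [{
        "index": 0,
        "finish_reason": "tool_calls",
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_123",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"city\":\"London\"}",
                },
            }],
        },
    }]
}
follow_up = build_follow_up_messages(messages, response)
```

Sending `follow_up` as the `messages` array of the next request (with the same `tools` definition) should let the assistant produce its final answer.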
The gateway currently exposes these built-in server tools:
- gateway:datetime
- ai-stats:web_search
- ai-stats:web_fetch
- ai-stats:advisor
- ai-stats:image_generation
- ai-stats:apply_patch
These tools run on the gateway side (no client-side executor required). The gateway rewrites each one into an upstream tool/function call, executes it, and feeds the tool result back into the model loop.
For full configuration, usage, and pricing details, see Server Tools.
Supported request shape for gateway:datetime:
{
  "tools": [
    {
      "type": "gateway:datetime",
      "parameters": {
        "timezone": "Europe/London"
      }
    }
  ]
}
Notes:
- parameters.timezone is optional and must be a valid IANA timezone.
- timezone can also be provided as a top-level shortcut on the tool object.
- The result contains an ISO datetime plus the resolved timezone.
- Usage includes usage.server_tool_use.datetime_requests.
- Prefer tool_choice: "auto" so the model can decide when to call it.
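The two accepted shapes for the timezone can be sketched with a small Python builder. The helper names `datetime_tool` and `datetime_request` are illustrative only, and the sketch assumes the tool is sent on a chat-style request with a `messages` array.

```python
from typing import Optional


def datetime_tool(timezone: Optional[str] = None, shortcut: bool = False) -> dict:
    """Build a gateway:datetime tool entry; the timezone is optional."""
    tool = {"type": "gateway:datetime"}
    if timezone is None:
        return tool
    if shortcut:
        tool["timezone"] = timezone  # top-level shortcut form
    else:
        tool["parameters"] = {"timezone": timezone}  # nested form
    return tool


def datetime_request(model: str, prompt: str, timezone: Optional[str] = None) -> dict:
    """Request body that leaves the decision to call the tool to the model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [datetime_tool(timezone)],
        "tool_choice": "auto",
    }


body = datetime_request("openai/gpt-5-nano", "What time is it in London?", "Europe/London")
```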
Web search example
{
  "tools": [
    {
      "type": "ai-stats:web_search",
      "parameters": {
        "engine": "exa",
        "max_results": 5,
        "max_total_results": 15,
        "search_context_size": "medium",
        "max_characters": 2048,
        "allowed_domains": ["arxiv.org", "nature.com"],
        "include_highlights": true
      }
    }
  ]
}
Notes:
- The model supplies the search query when it calls the tool.
- engine: "auto" resolves to managed Exa search. engine: "exa", engine: "parallel", and engine: "firecrawl" run managed gateway search when the matching provider key is configured.
- engine: "native" on ai-stats:web_search is converted to the provider-native web search tool for the request surface, such as OpenAI web_search_preview or Anthropic web_search_20250305.
- max_results caps each search call; max_total_results caps cumulative results across the server-tool loop.
- Managed search supports allowed_domains / excluded_domains, search_context_size, and max_characters where the selected engine exposes matching controls.
- Usage includes usage.server_tool_use.web_search_requests, usage.server_tool_use.web_search_results, and usage.server_tool_use.web_search_extra_results.
- Pricing can bill managed Exa search with the server_tool_web_search_requests and server_tool_web_search_extra_results meters.
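To track search activity per request, the usage counters listed above can be read defensively from the response body. This is a sketch that assumes `usage.server_tool_use` is a plain JSON object on the response and that counters are simply absent when no search ran; the helper name `web_search_usage` is made up for illustration.

```python
WEB_SEARCH_COUNTERS = (
    "web_search_requests",
    "web_search_results",
    "web_search_extra_results",
)


def web_search_usage(response: dict) -> dict:
    """Return the web search counters, defaulting missing ones to 0."""
    usage = response.get("usage") or {}
    counters = usage.get("server_tool_use") or {}
    return {name: int(counters.get(name, 0)) for name in WEB_SEARCH_COUNTERS}


# Hypothetical response fragment used only to exercise the helper.
sample = {
    "usage": {
        "server_tool_use": {"web_search_requests": 2, "web_search_results": 10}
    }
}
summary = web_search_usage(sample)
```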
Web fetch example
{
  "tools": [
    {
      "type": "ai-stats:web_fetch",
      "parameters": {
        "engine": "direct",
        "max_chars": 12000,
        "allowed_domains": ["docs.example.com"],
        "blocked_domains": ["internal.example.com"]
      }
    }
  ]
}
Notes:
- The model supplies the target url when it calls the tool.
- Only HTTP(S) URLs and text-like content types are supported.
- engine: "auto" uses native fetch on the Anthropic Messages surface, otherwise Exa when EXA_API_KEY is configured, otherwise direct gateway HTTP fetch.
- engine: "direct" uses direct gateway HTTP fetch. engine: "exa" uses Exa content extraction when EXA_API_KEY is configured.
- engine: "parallel" uses Parallel Extract when PARALLEL_API_KEY is configured. engine: "firecrawl" uses Firecrawl Scrape when FIRECRAWL_API_KEY is configured.
- engine: "native" on the Anthropic Messages surface is converted to Anthropic’s native web_fetch_20260209 tool. Other request surfaces should use engine: "direct" or a managed extraction engine.
- max_content_tokens is accepted as a token-style bounded fetch size alias when max_chars is omitted.
- allowed_domains and blocked_domains constrain which URLs can be fetched.
- HTML content is reduced to bounded plain text before being injected back into the model loop.
- Usage includes usage.server_tool_use.web_fetch_requests.
- Pricing can bill managed fetch with the server_tool_web_fetch_requests meter. Provider-native fetch/search usage is priced with native_web_fetch_requests and native_web_search_requests; model price cards can override the built-in provider defaults.
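The engine selection rules in these notes can be restated as a small decision function. This sketch is not the gateway's implementation: `resolve_fetch_engine` is a hypothetical name, the request surface is represented by its endpoint path, and raising `ValueError` when a key is missing or when `native` is used off the Messages surface is an assumption, since the notes do not say how the gateway reports those cases.

```python
from typing import Mapping

ANTHROPIC_MESSAGES = "/v1/messages"

# Environment key each managed engine needs, per the notes above.
REQUIRED_KEYS = {
    "exa": "EXA_API_KEY",
    "parallel": "PARALLEL_API_KEY",
    "firecrawl": "FIRECRAWL_API_KEY",
}
KNOWN_ENGINES = {"auto", "direct", "native", *REQUIRED_KEYS}


def resolve_fetch_engine(engine: str, surface: str, env: Mapping[str, str]) -> str:
    """Return the concrete fetch engine for a requested engine and surface."""
    if engine not in KNOWN_ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    if engine == "auto":
        # Order from the notes: native on Messages, then Exa, then direct.
        if surface == ANTHROPIC_MESSAGES:
            return "native"
        return "exa" if env.get("EXA_API_KEY") else "direct"
    if engine == "native" and surface != ANTHROPIC_MESSAGES:
        # Assumption: surfaced here as an error for illustration.
        raise ValueError("native fetch applies to the Anthropic Messages surface")
    key = REQUIRED_KEYS.get(engine)
    if key and not env.get(key):
        # Assumption: a missing provider key is treated as an error.
        raise ValueError(f"{engine} requires {key}")
    return engine
```

For example, `resolve_fetch_engine("auto", "/v1/responses", {})` falls through to `"direct"` because neither the Messages surface nor an Exa key is present.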
Native Anthropic fetch example:
{
  "tools": [
    {
      "type": "ai-stats:web_fetch",
      "parameters": {
        "engine": "native",
        "max_content_tokens": 9000,
        "allowed_domains": ["docs.example.com"]
      }
    }
  ],
  "tool_choice": "ai-stats:web_fetch"
}
Advisor example
{
  "tools": [
    {
      "type": "ai-stats:advisor",
      "parameters": {
        "name": "reviewer",
        "model": "claude-opus-4-8",
        "instructions": "Review plans for correctness, missing edge cases, and implementation risk.",
        "forward_transcript": true,
        "max_uses": 2,
        "max_completion_tokens": 1400,
        "temperature": 0.2
      }
    }
  ],
  "tool_choice": "ai-stats:advisor"
}
Notes:
- Advisor is gateway-managed and works across supported text models. The calling model receives an ai_stats_advisor tool, or a named variant such as ai_stats_advisor_reviewer, and the gateway executes the Advisor request.
- parameters.name is optional. Use unique names to expose multiple advisors; names may contain letters, numbers, spaces, underscores, and dashes.
- parameters.model pins the Advisor model. If omitted, the tool call can provide model; otherwise the gateway falls back to the outer request model.
- parameters.forward_transcript defaults to false. Set it to true when the Advisor should receive the current conversation transcript.
- The model normally supplies the Advisor prompt when it calls the tool. When forward_transcript is true, the gateway can execute a transcript-only Advisor call if no prompt is supplied.
- max_tokens is accepted as a legacy alias for max_completion_tokens.
- Usage includes usage.server_tool_use.advisor_requests.
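Two of these rules lend themselves to a short sketch: the allowed characters in an advisor name and the fallback order for the Advisor model. Both helpers are hypothetical, and the pattern assumes "letters" means ASCII letters, which the notes do not state.

```python
import re
from typing import Optional

# Assumption: ASCII letters and digits, plus space, underscore, and dash.
ADVISOR_NAME = re.compile(r"[A-Za-z0-9 _-]+")


def is_valid_advisor_name(name: str) -> bool:
    """Check a name against the allowed character set."""
    return ADVISOR_NAME.fullmatch(name) is not None


def resolve_advisor_model(
    pinned: Optional[str], call_model: Optional[str], request_model: str
) -> str:
    """Apply the documented order: parameters.model, tool-call model, outer model."""
    return pinned or call_model or request_model
```

With the example above, `resolve_advisor_model("claude-opus-4-8", None, "openai/gpt-5-nano")` keeps the pinned model, while passing `None` for both of the first two arguments falls back to the outer request model.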
Image generation example
{
  "tools": [
    {
      "type": "ai-stats:image_generation",
      "parameters": {
        "model": "openai/gpt-image-2",
        "quality": "high",
        "aspect_ratio": "16:9",
        "output_format": "png"
      }
    }
  ]
}
Notes:
- The model supplies the image prompt when it calls the tool. description is also accepted as a prompt alias.
- parameters.model pins the image model. If omitted, the tool call can provide model; otherwise AI Stats uses the default image model.
- The tool result contains either imageUrl or base64 image data, depending on the provider response.
- Usage includes usage.server_tool_use.image_generation_requests; image-model token usage is merged into the parent request.
Apply patch example
{
  "tools": [
    {
      "type": "ai-stats:apply_patch"
    }
  ],
  "tool_choice": "auto"
}
Notes:
- ai-stats:apply_patch is supported on the Responses API.
- AI Stats validates patch operations and returns them in the tool result. Your client decides whether to apply or reject the patch.
- Supported operation types are create_file, update_file, and delete_file.
- Usage includes usage.server_tool_use.apply_patch_requests.
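Because the client makes the final apply-or-reject decision, a local policy check is a natural place to start. The sketch below assumes each returned operation is a JSON object with a `type` and a relative `path` field; that shape is a guess for illustration, not something this page specifies, and `review_operation` is a hypothetical helper.

```python
from pathlib import PurePosixPath

SUPPORTED_TYPES = {"create_file", "update_file", "delete_file"}


def review_operation(op: dict, allow_delete: bool = False) -> bool:
    """Return True if the client should apply this patch operation."""
    op_type = op.get("type")
    if op_type not in SUPPORTED_TYPES:
        return False
    if op_type == "delete_file" and not allow_delete:
        return False  # deletions are opt-in in this sketch
    raw_path = op.get("path")  # assumed field name
    if not isinstance(raw_path, str) or not raw_path:
        return False
    path = PurePosixPath(raw_path)
    # Reject absolute paths and anything that climbs out of the workspace.
    if path.is_absolute() or ".." in path.parts:
        return False
    return True
```

A client might run every operation through a check like this and apply the patch only when all of them pass.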
Streaming behavior
Tool-calling requests can also use stream: true.
For gateway-managed server tools, the gateway may:
- materialize the upstream tool-call turn
- execute the server tool
- continue the model loop
- re-emit a synthetic stream back to the client
That keeps the client-side contract streaming-friendly even when the gateway executes part of the tool loop itself.
Next guides
- Tool Calling Patterns
- Tool Calling Safety and Validation
- Structured Outputs
Last modified on June 11, 2026