Tool calling lets models request structured actions (for example, database lookups, weather checks, or internal API calls) instead of guessing answers. The gateway supports tool payloads across these text endpoints:
  • /v1/chat/completions (OpenAI-style tools and tool_calls)
  • /v1/responses (Responses-style function_call output items)
  • /v1/messages (Anthropic-style tool_use blocks)
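Because each endpoint reports tool calls in a different shape, a client that talks to more than one of them may want a single normalized view. The following is a minimal sketch, assuming the gateway returns the upstream shapes unchanged (tool_calls on the assistant message, function_call items in an output array, and tool_use blocks in a content array). extract_tool_calls is a hypothetical helper, not part of any SDK.

```python
import json
from typing import Any


def extract_tool_calls(endpoint: str, body: dict[str, Any]) -> list[dict[str, Any]]:
    """Return tool calls as {"id", "name", "arguments"} dicts for one response body."""
    calls: list[dict[str, Any]] = []
    if endpoint == "/v1/chat/completions":
        # OpenAI-style: tool_calls on the assistant message, arguments as a JSON string.
        for choice in body.get("choices", []):
            for call in choice.get("message", {}).get("tool_calls") or []:
                fn = call["function"]
                calls.append({"id": call["id"], "name": fn["name"],
                              "arguments": json.loads(fn["arguments"])})
    elif endpoint == "/v1/responses":
        # Responses-style: function_call items in the output array.
        for item in body.get("output", []):
            if item.get("type") == "function_call":
                calls.append({"id": item["call_id"], "name": item["name"],
                              "arguments": json.loads(item["arguments"])})
    elif endpoint == "/v1/messages":
        # Anthropic-style: tool_use content blocks, input is already a parsed object.
        for block in body.get("content", []):
            if block.get("type") == "tool_use":
                calls.append({"id": block["id"], "name": block["name"],
                              "arguments": block["input"]})
    else:
        raise ValueError(f"unsupported endpoint: {endpoint}")
    return calls
```

With this in place, the rest of a client can dispatch on name and arguments without caring which endpoint produced the call.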

Request

curl https://api.phaseo.app/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5-nano",
    "messages": [
      { "role": "user", "content": "What is the weather in London?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather by city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": {
      "type": "function",
      "function": { "name": "get_weather" }
    },
    "stream": false
  }'

Response

{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"London\"}"
            }
          }
        ]
      }
    }
  ]
}
Run your tool, then send the tool result back in the next request so the assistant can finish the answer.
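The follow-up step can be sketched as below for the /v1/chat/completions shape shown above. It assumes the standard OpenAI-style convention of replaying the assistant message and then adding one message with role "tool" per call, keyed by tool_call_id. build_follow_up and the stubbed get_weather are hypothetical names for illustration, and the weather values are placeholders.

```python
import json
from typing import Any, Callable


def build_follow_up(
    messages: list[dict[str, Any]],
    assistant_message: dict[str, Any],
    tools: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Append the assistant tool-call turn and one tool message per call."""
    follow_up = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        # Arguments arrive as a JSON string and must be parsed before use.
        args = json.loads(call["function"]["arguments"])
        result = tools[name](**args)
        follow_up.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the originating call
            "content": json.dumps(result),
        })
    return follow_up


def get_weather(city: str) -> dict[str, Any]:
    # Placeholder data: a real implementation would query a weather service.
    return {"city": city, "temperature_c": 14, "conditions": "cloudy"}
```

Send the returned list as the messages field of the next request, keeping the same tools array, so the model can turn the tool result into a final answer.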

Built-in server tools

The gateway currently exposes these built-in server tools:
  • gateway:datetime
  • ai-stats:web_search
  • ai-stats:web_fetch
  • ai-stats:advisor
  • ai-stats:image_generation
  • ai-stats:apply_patch
These tools run on the gateway side (no client-side executor required). The gateway rewrites each one into an upstream tool/function call, executes it, and feeds the tool result back into the model loop. For full configuration, usage, and pricing details, see Server Tools. Supported request shape for gateway:datetime:
{
  "tools": [
    {
      "type": "gateway:datetime",
      "parameters": {
        "timezone": "Europe/London"
      }
    }
  ]
}
Notes:
  • parameters.timezone is optional and must be a valid IANA timezone.
  • timezone can also be provided as a top-level shortcut on the tool object.
  • The result contains an ISO datetime plus the resolved timezone.
  • Usage includes usage.server_tool_use.datetime_requests.
  • Prefer tool_choice: "auto" so the model can decide when to call it.
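The usage counter mentioned in the notes above can be read defensively, since it is only expected when a server tool actually ran. This is a small sketch that assumes the counters sit under usage.server_tool_use in the response body, as the field paths in the notes suggest. server_tool_usage is a hypothetical helper.

```python
from typing import Any


def server_tool_usage(response: dict[str, Any]) -> dict[str, int]:
    """Return the integer counters under usage.server_tool_use, or {} if absent."""
    usage = response.get("usage") or {}
    counters = usage.get("server_tool_use") or {}
    # Keep only integer counters so callers can sum or log them safely.
    return {key: value for key, value in counters.items() if isinstance(value, int)}
```

For example, server_tool_usage(response).get("datetime_requests", 0) gives the number of datetime calls for a response without raising when the block is missing.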

Web search example

{
  "tools": [
    {
      "type": "ai-stats:web_search",
      "parameters": {
        "engine": "exa",
        "max_results": 5,
        "max_total_results": 15,
        "search_context_size": "medium",
        "max_characters": 2048,
        "allowed_domains": ["arxiv.org", "nature.com"],
        "include_highlights": true
      }
    }
  ]
}
Notes:
  • The model supplies the search query when it calls the tool.
  • engine: "auto" resolves to managed Exa search. engine: "exa", engine: "parallel", and engine: "firecrawl" run managed gateway search when the matching provider key is configured.
  • engine: "native" on ai-stats:web_search is converted to the provider-native web search tool for the request surface, such as OpenAI web_search_preview or Anthropic web_search_20250305.
  • max_results caps each search call; max_total_results caps cumulative results across the server-tool loop.
  • Managed search supports allowed_domains / excluded_domains, search_context_size, and max_characters where the selected engine exposes matching controls.
  • Usage includes usage.server_tool_use.web_search_requests, usage.server_tool_use.web_search_results, and usage.server_tool_use.web_search_extra_results.
  • Pricing can bill managed Exa search with server_tool_web_search_requests and server_tool_web_search_extra_results meters.
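The interaction between max_results and max_total_results can be illustrated with a short sketch. It shows one plausible reading of the notes above (a per-call cap plus a cumulative budget across the loop). The gateway's actual accounting, including how extra results are metered, may differ, and apply_result_caps is a hypothetical function.

```python
def apply_result_caps(
    results_per_call: list[int],
    max_results: int,
    max_total_results: int,
) -> list[int]:
    """Return how many results each search call keeps under both caps."""
    kept: list[int] = []
    remaining = max_total_results  # cumulative budget across the whole tool loop
    for returned in results_per_call:
        # Each call is limited by its own cap and by what is left of the budget.
        take = min(returned, max_results, remaining)
        kept.append(take)
        remaining -= take
    return kept
```

With the example values above (max_results of 5 and max_total_results of 15), three full searches would use up the budget, and a fourth search in the same loop would contribute no further results under this reading.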

Web fetch example

{
  "tools": [
    {
      "type": "ai-stats:web_fetch",
      "parameters": {
        "engine": "direct",
        "max_chars": 12000,
        "allowed_domains": ["docs.example.com"],
        "blocked_domains": ["internal.example.com"]
      }
    }
  ]
}
Notes:
  • The model supplies the target url when it calls the tool.
  • Only HTTP(S) URLs and text-like content types are supported.
  • engine: "auto" uses native fetch on the Anthropic Messages surface, otherwise Exa when EXA_API_KEY is configured, otherwise direct gateway HTTP fetch.
  • engine: "direct" uses direct gateway HTTP fetch. engine: "exa" uses Exa content extraction when EXA_API_KEY is configured.
  • engine: "parallel" uses Parallel Extract when PARALLEL_API_KEY is configured. engine: "firecrawl" uses Firecrawl Scrape when FIRECRAWL_API_KEY is configured.
  • engine: "native" on the Anthropic Messages surface is converted to Anthropic’s native web_fetch_20260209 tool. Other request surfaces should use engine: "direct" or a managed extraction engine.
  • max_content_tokens is accepted as a token-style alias for the fetch size limit when max_chars is omitted.
  • allowed_domains and blocked_domains constrain which URLs can be fetched.
  • HTML content is reduced to bounded plain text before being injected back into the model loop.
  • Usage includes usage.server_tool_use.web_fetch_requests.
  • Pricing can bill managed fetch with the server_tool_web_fetch_requests meter. Provider-native fetch/search usage is priced with native_web_fetch_requests and native_web_search_requests; model price cards can override the built-in provider defaults.
Native Anthropic fetch example:
{
  "tools": [
    {
      "type": "ai-stats:web_fetch",
      "parameters": {
        "engine": "native",
        "max_content_tokens": 9000,
        "allowed_domains": ["docs.example.com"]
      }
    }
  ],
  "tool_choice": "ai-stats:web_fetch"
}
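The engine: "auto" resolution order described in the web fetch notes can be written out as a small decision function. This is a sketch of the documented order only, assuming the Anthropic Messages surface corresponds to the /v1/messages endpoint and that key configuration is visible as an environment-style mapping. resolve_auto_fetch_engine is a hypothetical name, not a gateway API.

```python
from typing import Mapping


def resolve_auto_fetch_engine(surface: str, env: Mapping[str, str]) -> str:
    """Mirror the documented engine: "auto" order for ai-stats:web_fetch."""
    if surface == "/v1/messages":
        # Anthropic Messages surface: provider-native fetch comes first.
        return "native"
    if env.get("EXA_API_KEY"):
        # Managed Exa extraction when a key is configured.
        return "exa"
    # Fallback: direct gateway HTTP fetch.
    return "direct"
```

Pinning engine explicitly, as in the examples above, bypasses this order and makes billing and behavior easier to predict.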

Advisor example

{
  "tools": [
    {
      "type": "ai-stats:advisor",
      "parameters": {
        "name": "reviewer",
        "model": "claude-opus-4-8",
        "instructions": "Review plans for correctness, missing edge cases, and implementation risk.",
        "forward_transcript": true,
        "max_uses": 2,
        "max_completion_tokens": 1400,
        "temperature": 0.2
      }
    }
  ],
  "tool_choice": "ai-stats:advisor"
}
Notes:
  • Advisor is gateway-managed and works across supported text models. The calling model receives an ai_stats_advisor tool, or a named variant such as ai_stats_advisor_reviewer, and the gateway executes the Advisor request.
  • parameters.name is optional. Use unique names to expose multiple advisors; names may contain letters, numbers, spaces, underscores, and dashes.
  • parameters.model pins the Advisor model. If omitted, the tool call can provide model; otherwise the gateway falls back to the outer request model.
  • parameters.forward_transcript defaults to false. Set it to true when the Advisor should receive the current conversation transcript.
  • The model normally supplies the Advisor prompt when it calls the tool. When forward_transcript is true, the gateway can execute a transcript-only Advisor call if no prompt is supplied.
  • max_tokens is accepted as a legacy alias for max_completion_tokens.
  • Usage includes usage.server_tool_use.advisor_requests.
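The naming rule and the model fallback order from the Advisor notes can be checked client-side before sending a request. The sketch below assumes "letters" means ASCII letters, which the notes do not state. Both function names are hypothetical.

```python
import re
from typing import Optional

# Assumption: ASCII letters and digits, plus spaces, underscores, and dashes.
_ADVISOR_NAME = re.compile(r"[A-Za-z0-9 _-]+")


def is_valid_advisor_name(name: str) -> bool:
    """Check parameters.name against the documented character set."""
    return _ADVISOR_NAME.fullmatch(name) is not None


def resolve_advisor_model(
    pinned: Optional[str],
    call_model: Optional[str],
    request_model: str,
) -> str:
    """Documented order: parameters.model, then the tool call's model, then the outer request model."""
    return pinned or call_model or request_model
```

In the example above, parameters.model is set, so the pinned model would be used regardless of what the calling model passes in its tool call.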

Image generation example

{
  "tools": [
    {
      "type": "ai-stats:image_generation",
      "parameters": {
        "model": "openai/gpt-image-2",
        "quality": "high",
        "aspect_ratio": "16:9",
        "output_format": "png"
      }
    }
  ]
}
Notes:
  • The model supplies the image prompt when it calls the tool. description is also accepted as a prompt alias.
  • parameters.model pins the image model. If omitted, the tool call can provide model; otherwise AI Stats uses the default image model.
  • The tool result contains either imageUrl or base64 image data, depending on the provider response.
  • Usage includes usage.server_tool_use.image_generation_requests; image-model token usage is merged into the parent request.
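Since the tool result can carry either a URL or base64 data, client code that inspects it needs to handle both. The following is a hedged sketch: imageUrl is the field named in the notes above, but the notes do not name the base64 field, so the default key used here ("b64_json") is purely an assumption and is exposed as a parameter. read_image_result is a hypothetical helper.

```python
import base64
from typing import Any, Union


def read_image_result(
    result: dict[str, Any],
    base64_key: str = "b64_json",  # assumed field name; not specified in the notes
) -> tuple[str, Union[str, bytes]]:
    """Return ("url", url) or ("bytes", decoded image data) from a tool result."""
    if result.get("imageUrl"):
        return ("url", result["imageUrl"])
    if result.get(base64_key):
        # Decode the base64 payload into raw image bytes.
        return ("bytes", base64.b64decode(result[base64_key]))
    raise ValueError("tool result contains neither an image URL nor base64 data")
```

Check an actual tool result to confirm the base64 field name before relying on the default.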

Apply patch example

{
  "tools": [
    {
      "type": "ai-stats:apply_patch"
    }
  ],
  "tool_choice": "auto"
}
Notes:
  • ai-stats:apply_patch is supported on the Responses API.
  • AI Stats validates patch operations and returns them in the tool result. Your client decides whether to apply or reject the patch.
  • Supported operation types are create_file, update_file, and delete_file.
  • Usage includes usage.server_tool_use.apply_patch_requests.
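Because the client decides whether to apply or reject a patch, a small review step before touching the file system is useful. This sketch assumes each operation is an object with a type field holding one of the three documented operation types. The path field used in the usage note is an assumption about the operation shape, and review_patch is a hypothetical helper.

```python
from typing import Any, Callable

SUPPORTED_OPERATIONS = {"create_file", "update_file", "delete_file"}


def review_patch(
    operations: list[dict[str, Any]],
    allow: Callable[[dict[str, Any]], bool],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split operations into (accepted, rejected) using a client-side policy."""
    accepted: list[dict[str, Any]] = []
    rejected: list[dict[str, Any]] = []
    for op in operations:
        # Unknown operation types are rejected before the policy is consulted.
        if op.get("type") in SUPPORTED_OPERATIONS and allow(op):
            accepted.append(op)
        else:
            rejected.append(op)
    return accepted, rejected
```

A conservative policy might be lambda op: op["type"] != "delete_file" and op.get("path", "").startswith("src/"), which accepts only creates and updates under one directory.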

Streaming behavior

Tool-calling requests can also use stream: true. For gateway-managed server tools, the gateway may:
  • materialize the upstream tool-call turn
  • execute the server tool
  • continue the model loop
  • re-emit a synthetic stream back to the client
That keeps the client-side contract streaming-friendly even when the gateway executes part of the tool loop itself.
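On the client side, the re-emitted stream can be consumed like any other streamed response. The sketch below assumes OpenAI-style server-sent events on /v1/chat/completions (lines of the form data: {...} carrying choices[].delta, terminated by data: [DONE]). This page does not specify the wire format, so treat that as an assumption. collect_stream_text is a hypothetical helper that works on already-decoded lines.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content pieces from OpenAI-style SSE lines."""
    parts: list[str] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)
```

Under this assumption, the client only sees the final assistant text, and any tool turns the gateway executed itself never need to be handled in the stream.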

Next guides

  1. Tool Calling Patterns
  2. Tool Calling Safety and Validation
  3. Structured Outputs
Last modified on June 11, 2026