> ## Documentation Index
> Fetch the complete documentation index at: https://docs.ai-stats.phaseo.app/llms.txt
> Use this file to discover all available pages before exploring further.

# Tool Calling

> Use model-driven function calls safely through the Gateway.

Tool calling lets models request structured actions (for example, database lookups, weather checks, or internal API calls) instead of guessing answers.

The Gateway supports tool payloads across these text endpoints:

* `/v1/chat/completions` (OpenAI-style `tools` and `tool_calls`)
* `/v1/responses` (Responses-style `function_call` output items)
* `/v1/messages` (Anthropic-style `tool_use` blocks)
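
Because each endpoint returns tool calls in a different shape, client code often normalizes them before dispatching. The sketch below is one way to do that. The chat completions fields match the response shown later on this page. The `call_id`, `name`, `arguments`, `id`, and `input` fields for the other two endpoints follow the public OpenAI Responses and Anthropic Messages formats, and it is an assumption that the Gateway mirrors them exactly. `extract_tool_calls` is a hypothetical helper name.

```python
import json


def extract_tool_calls(endpoint: str, response: dict) -> list:
    """Return tool calls as {"id", "name", "arguments"} dicts for any of the three shapes."""
    calls = []
    if endpoint == "/v1/chat/completions":
        # OpenAI-style: tool_calls live on the assistant message, arguments are a JSON string.
        message = response["choices"][0]["message"]
        for call in message.get("tool_calls") or []:
            calls.append({
                "id": call["id"],
                "name": call["function"]["name"],
                "arguments": json.loads(call["function"]["arguments"]),
            })
    elif endpoint == "/v1/responses":
        # Responses-style: function_call items sit in the output list (assumed field names).
        for item in response.get("output", []):
            if item.get("type") == "function_call":
                calls.append({
                    "id": item["call_id"],
                    "name": item["name"],
                    "arguments": json.loads(item["arguments"]),
                })
    elif endpoint == "/v1/messages":
        # Anthropic-style: tool_use blocks carry already-parsed input (assumed field names).
        for block in response.get("content", []):
            if block.get("type") == "tool_use":
                calls.append({
                    "id": block["id"],
                    "name": block["name"],
                    "arguments": block["input"],
                })
    else:
        raise ValueError("unsupported endpoint: " + endpoint)
    return calls
```

Normalizing early lets the same tool executor serve all three endpoints.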

## Request

```bash
curl https://api.phaseo.app/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5-nano",
    "messages": [
      { "role": "user", "content": "What is the weather in London?" }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather by city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": { "type": "string" }
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": {
      "type": "function",
      "function": { "name": "get_weather" }
    },
    "stream": false
  }'
```

## Response

```json
{
  "id": "chatcmpl_...",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "tool_calls",
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\":\"London\"}"
            }
          }
        ]
      }
    }
  ]
}
```

Run your tool, then send the tool result back in the next request so the assistant can finish the answer.
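
A minimal sketch of that follow-up step, assuming the Gateway accepts the OpenAI Chat Completions convention of a `role: "tool"` message keyed by `tool_call_id`. The `get_weather` implementation and its return values are placeholders, and `build_followup_messages` is a hypothetical helper name.

```python
import json


def get_weather(city: str) -> dict:
    # Placeholder tool: a real implementation would call a weather service.
    return {"city": city, "condition": "cloudy", "temp_c": 14}


# Map tool names from the request's `tools` array to local callables.
TOOLS = {"get_weather": get_weather}


def build_followup_messages(messages: list, assistant_message: dict) -> list:
    """Append the assistant's tool-call turn plus one tool result per call."""
    followup = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        # `arguments` arrives as a JSON string and must be parsed before use.
        args = json.loads(call["function"]["arguments"])
        result = TOOLS[call["function"]["name"]](**args)
        followup.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the originating call
            "content": json.dumps(result),
        })
    return followup
```

Send the returned list as `messages` in the next `/v1/chat/completions` request, keeping the same `tools` array.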

## Built-in server tools

The gateway currently exposes these built-in server tools:

* `gateway:datetime`
* `ai-stats:web_search`
* `ai-stats:web_fetch`
* `ai-stats:advisor`
* `ai-stats:image_generation`
* `ai-stats:apply_patch`

These tools run on the gateway side (no client-side executor required). The gateway rewrites each one into an upstream tool/function call, executes it, and feeds the tool result back into the model loop.

For full configuration, usage, and pricing details, see [Server Tools](./server-tools).

### Datetime example

```json
{
  "tools": [
    {
      "type": "gateway:datetime",
      "parameters": {
        "timezone": "Europe/London"
      }
    }
  ]
}
```

Notes:

* `parameters.timezone` is optional and must be a valid IANA timezone.
* `timezone` can also be provided as a top-level shortcut on the tool object.
* The result contains an ISO datetime plus the resolved timezone.
* Usage includes `usage.server_tool_use.datetime_requests`.
* Prefer `tool_choice: "auto"` so the model can decide when to call it.
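
The two accepted placements of `timezone` and the usage counter can be sketched as follows. Both helper names are hypothetical, and the code assumes `usage` sits at the top level of the response body.

```python
from typing import Optional


def datetime_tool(timezone: Optional[str] = None, shortcut: bool = False) -> dict:
    """Build a gateway:datetime tool entry, nested or with the top-level shortcut."""
    tool = {"type": "gateway:datetime"}
    if timezone is None:
        return tool  # timezone is optional
    if shortcut:
        tool["timezone"] = timezone  # top-level shortcut form
    else:
        tool["parameters"] = {"timezone": timezone}  # nested form
    return tool


def datetime_requests(response: dict) -> int:
    """Read usage.server_tool_use.datetime_requests, defaulting to 0 when absent."""
    usage = response.get("usage") or {}
    server_tool_use = usage.get("server_tool_use") or {}
    return server_tool_use.get("datetime_requests", 0)
```

For example, `{"tools": [datetime_tool("Europe/London")], "tool_choice": "auto"}` reproduces the request shape above with the recommended tool choice.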

### Web search example

```json
{
  "tools": [
    {
      "type": "ai-stats:web_search",
      "parameters": {
        "engine": "exa",
        "max_results": 5,
        "max_total_results": 15,
        "search_context_size": "medium",
        "max_characters": 2048,
        "allowed_domains": ["arxiv.org", "nature.com"],
        "include_highlights": true
      }
    }
  ]
}
```

Notes:

* The model supplies the search query when it calls the tool.
* `engine: "auto"` resolves to managed Exa search. `engine: "exa"`, `engine: "parallel"`, and `engine: "firecrawl"` run managed gateway search when the matching provider key is configured.
* `engine: "native"` on `ai-stats:web_search` is converted to the provider-native web search tool for the request surface, such as OpenAI `web_search_preview` or Anthropic `web_search_20250305`.
* `max_results` caps each search call; `max_total_results` caps cumulative results across the server-tool loop.
* Managed search supports `allowed_domains` / `excluded_domains`, `search_context_size`, and `max_characters` where the selected engine exposes matching controls.
* Usage includes `usage.server_tool_use.web_search_requests`, `usage.server_tool_use.web_search_results`, and `usage.server_tool_use.web_search_extra_results`.
* Pricing can bill managed Exa search with `server_tool_web_search_requests` and `server_tool_web_search_extra_results` meters.
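
The interaction between the two caps can be illustrated with a small model. This is a reading of the notes above, not the gateway's implementation: it assumes results beyond either cap are truncated, whereas the gateway may instead stop issuing searches. `apply_search_caps` is a hypothetical name.

```python
def apply_search_caps(results_per_call: list, max_results: int, max_total_results: int) -> list:
    """Return how many results each search call may keep under both caps."""
    kept = []
    remaining = max_total_results  # budget shared across the whole server-tool loop
    for found in results_per_call:
        # Each call is limited by its own cap and by what is left of the total budget.
        allowed = min(found, max_results, remaining)
        kept.append(allowed)
        remaining -= allowed
    return kept


# Four searches returning 7, 8, 9, and 4 hits with the example's caps of 5 and 15.
print(apply_search_caps([7, 8, 9, 4], max_results=5, max_total_results=15))  # [5, 5, 5, 0]
```

With the example's values, three full searches exhaust the cumulative budget, so a fourth search would contribute nothing under this model.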

### Web fetch example

```json
{
  "tools": [
    {
      "type": "ai-stats:web_fetch",
      "parameters": {
        "engine": "direct",
        "max_chars": 12000,
        "allowed_domains": ["docs.example.com"],
        "blocked_domains": ["internal.example.com"]
      }
    }
  ]
}
```

Notes:

* The model supplies the target `url` when it calls the tool.
* Only HTTP(S) URLs and text-like content types are supported.
* `engine: "auto"` uses native fetch on the Anthropic Messages surface, otherwise Exa when `EXA_API_KEY` is configured, otherwise direct gateway HTTP fetch.
* `engine: "direct"` uses direct gateway HTTP fetch. `engine: "exa"` uses Exa content extraction when `EXA_API_KEY` is configured.
* `engine: "parallel"` uses Parallel Extract when `PARALLEL_API_KEY` is configured. `engine: "firecrawl"` uses Firecrawl Scrape when `FIRECRAWL_API_KEY` is configured.
* `engine: "native"` on the Anthropic Messages surface is converted to Anthropic's native `web_fetch_20260209` tool. Other request surfaces should use `engine: "direct"` or a managed extraction engine.
* `max_content_tokens` is accepted as a token-based alias for the fetch size limit when `max_chars` is omitted.
* `allowed_domains` and `blocked_domains` constrain which URLs can be fetched.
* HTML content is reduced to bounded plain text before being injected back into the model loop.
* Usage includes `usage.server_tool_use.web_fetch_requests`.
* Pricing can bill managed fetch with the `server_tool_web_fetch_requests` meter. Provider-native fetch/search usage is priced with `native_web_fetch_requests` and `native_web_search_requests`; model price cards can override the built-in provider defaults.
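
The engine selection rules above can be summarized as a small decision function. This is a sketch of the documented order only. The surface labels (`"messages"`, `"chat"`, `"responses"`) are hypothetical, and raising an error for a missing key or an unsupported `native` request is an assumption, since the notes do not say how the gateway reacts in those cases.

```python
# Provider keys named in the notes above, per managed engine.
REQUIRED_KEYS = {
    "exa": "EXA_API_KEY",
    "parallel": "PARALLEL_API_KEY",
    "firecrawl": "FIRECRAWL_API_KEY",
}


def resolve_fetch_engine(engine: str, surface: str, env: dict) -> str:
    """Return the engine that would serve an ai-stats:web_fetch call."""
    if engine == "auto":
        if surface == "messages":
            return "native"  # Anthropic Messages surface prefers native fetch
        return "exa" if env.get("EXA_API_KEY") else "direct"
    if engine == "native":
        if surface != "messages":
            # Assumption: other surfaces are expected to pick direct or a managed engine.
            raise ValueError("native fetch is only available on the Messages surface")
        return "native"
    if engine == "direct":
        return "direct"
    if engine in REQUIRED_KEYS:
        if not env.get(REQUIRED_KEYS[engine]):
            raise ValueError(REQUIRED_KEYS[engine] + " is not configured")
        return engine
    raise ValueError("unknown engine: " + engine)
```

Passing `os.environ` as `env` would mirror a deployment where the provider keys are set as environment variables.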

Native Anthropic fetch example:

```json
{
  "tools": [
    {
      "type": "ai-stats:web_fetch",
      "parameters": {
        "engine": "native",
        "max_content_tokens": 9000,
        "allowed_domains": ["docs.example.com"]
      }
    }
  ],
  "tool_choice": "ai-stats:web_fetch"
}
```

### Advisor example

```json
{
  "tools": [
    {
      "type": "ai-stats:advisor",
      "parameters": {
        "name": "reviewer",
        "model": "claude-opus-4-8",
        "instructions": "Review plans for correctness, missing edge cases, and implementation risk.",
        "forward_transcript": true,
        "max_uses": 2,
        "max_completion_tokens": 1400,
        "temperature": 0.2
      }
    }
  ],
  "tool_choice": "ai-stats:advisor"
}
```

Notes:

* Advisor is gateway-managed and works across supported text models. The calling model receives an `ai_stats_advisor` tool, or a named variant such as `ai_stats_advisor_reviewer`, and the gateway executes the Advisor request.
* `parameters.name` is optional. Use unique names to expose multiple advisors; names may contain letters, numbers, spaces, underscores, and dashes.
* `parameters.model` pins the Advisor model. If omitted, the tool call can provide `model`; otherwise the gateway falls back to the outer request model.
* `parameters.forward_transcript` defaults to `false`. Set it to `true` when the Advisor should receive the current conversation transcript.
* The model normally supplies the Advisor `prompt` when it calls the tool. When `forward_transcript` is `true`, the gateway can execute a transcript-only Advisor call if no prompt is supplied.
* `max_tokens` is accepted as a legacy alias for `max_completion_tokens`.
* Usage includes `usage.server_tool_use.advisor_requests`.
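
The name rule and the model fallback order described above can be checked client-side before sending a request. This is a sketch with hypothetical helper names. It assumes "letters" means ASCII letters, which the notes do not specify.

```python
import re
from typing import Optional

# Letters, numbers, spaces, underscores, and dashes (ASCII-only is an assumption).
_ADVISOR_NAME = re.compile(r"[A-Za-z0-9 _-]+")


def is_valid_advisor_name(name: str) -> bool:
    """Check a parameters.name value against the documented character set."""
    return _ADVISOR_NAME.fullmatch(name) is not None


def resolve_advisor_model(pinned: Optional[str], call_model: Optional[str], request_model: str) -> str:
    """Apply the documented order: parameters.model, then the tool call's model, then the outer model."""
    return pinned or call_model or request_model
```

With the example above, `resolve_advisor_model("claude-opus-4-8", None, "openai/gpt-5-nano")` returns the pinned Advisor model.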

### Image generation example

```json
{
  "tools": [
    {
      "type": "ai-stats:image_generation",
      "parameters": {
        "model": "openai/gpt-image-2",
        "quality": "high",
        "aspect_ratio": "16:9",
        "output_format": "png"
      }
    }
  ]
}
```

Notes:

* The model supplies the image `prompt` when it calls the tool. `description` is also accepted as a prompt alias.
* `parameters.model` pins the image model. If omitted, the tool call can provide `model`; otherwise AI Stats uses the default image model.
* The tool result contains either `imageUrl` or base64 image data, depending on the provider response.
* Usage includes `usage.server_tool_use.image_generation_requests`; image-model token usage is merged into the parent request.
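
Since the tool result can carry either form, client code needs to handle both. The sketch below does so under one stated assumption: the notes name `imageUrl` but not the field that holds base64 data, so the caller passes that key explicitly. `load_image` is a hypothetical helper name.

```python
import base64


def load_image(result: dict, base64_key: str) -> tuple:
    """Return ("url", str) or ("bytes", bytes) from an image generation tool result.

    base64_key is whatever field the provider response uses for base64 data;
    this page does not name it, so it is supplied by the caller.
    """
    if result.get("imageUrl"):
        return ("url", result["imageUrl"])
    if result.get(base64_key):
        return ("bytes", base64.b64decode(result[base64_key]))
    raise ValueError("tool result contains neither imageUrl nor base64 image data")
```

Returning a tagged tuple keeps the download step (for URLs) separate from the decode step (for inline data).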

### Apply patch example

```json
{
  "tools": [
    {
      "type": "ai-stats:apply_patch"
    }
  ],
  "tool_choice": "auto"
}
```

Notes:

* `ai-stats:apply_patch` is supported on the Responses API.
* AI Stats validates patch operations and returns them in the tool result. Your client decides whether to apply or reject the patch.
* Supported operation types are `create_file`, `update_file`, and `delete_file`.
* Usage includes `usage.server_tool_use.apply_patch_requests`.
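
Because the client makes the final call on each patch, a small review policy is useful. The following is a sketch of one such policy, not part of the Gateway. It assumes each operation is a dict with `type` and `path` fields, which this page does not confirm, and `review_operations` is a hypothetical helper name.

```python
import posixpath

SUPPORTED_TYPES = {"create_file", "update_file", "delete_file"}


def review_operations(operations: list, allow_delete: bool = False) -> tuple:
    """Split patch operations into (accepted, rejected) under a conservative policy."""
    accepted, rejected = [], []
    for op in operations:
        path = op.get("path", "")
        normalized = posixpath.normpath(path)
        # Reject empty, absolute, or parent-escaping paths.
        escapes = (
            path == ""
            or posixpath.isabs(path)
            or normalized == ".."
            or normalized.startswith("../")
        )
        if op.get("type") not in SUPPORTED_TYPES:
            rejected.append((op, "unsupported operation type"))
        elif escapes:
            rejected.append((op, "path escapes the workspace"))
        elif op["type"] == "delete_file" and not allow_delete:
            rejected.append((op, "deletes are disabled"))
        else:
            accepted.append(op)
    return accepted, rejected
```

Only the accepted operations would then be applied to the working tree; the rejection reasons can be logged or returned to the model.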

## Streaming behavior

Tool-calling requests can also use `stream: true`.

For gateway-managed server tools, the gateway may:

* materialize the upstream tool-call turn
* execute the server tool
* continue the model loop
* re-emit a synthetic stream back to the client

This keeps the client-facing contract streaming-friendly even when the gateway executes part of the tool loop itself.
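
The four steps above can be modeled with a toy loop. This is an illustration only, not the gateway's implementation: every name, the turn shape (`tool_call` or `text`), the chunking by whitespace, and the fixed timestamp are hypothetical.

```python
from typing import Callable, Iterator


def run_server_tool_loop(model: Callable, execute_tool: Callable, messages: list,
                         max_steps: int = 4) -> Iterator[str]:
    """Yield text chunks to the client while tool turns are handled internally."""
    messages = list(messages)
    for _ in range(max_steps):
        turn = model(messages)  # 1. materialize the upstream turn in full
        if "tool_call" not in turn:
            for chunk in turn["text"].split():  # 4. re-emit the final text as a stream
                yield chunk
            return
        call = turn["tool_call"]
        result = execute_tool(call["name"], call["arguments"])  # 2. execute the server tool
        # 3. continue the model loop with the tool result appended
        messages.append({"role": "tool", "name": call["name"], "content": result})
    raise RuntimeError("server-tool loop did not finish within max_steps")


def toy_model(messages: list) -> dict:
    # Stand-in model: asks for the time once, then answers using the tool result.
    results = [m for m in messages if m["role"] == "tool"]
    if not results:
        return {"tool_call": {"name": "gateway:datetime",
                              "arguments": {"timezone": "Europe/London"}}}
    return {"text": "It is " + results[-1]["content"]}


def toy_datetime(name: str, arguments: dict) -> str:
    # Placeholder result with a fixed timestamp; a real tool would read the clock.
    return "2025-01-01T12:00:00+00:00 (" + arguments["timezone"] + ")"
```

In this model the client only ever sees text chunks; the tool-call turn and its result stay inside the loop.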

## Next guides

1. [Tool Calling Patterns](./tool-calling-patterns)
2. [Tool Calling Safety and Validation](./tool-calling-safety)
3. [Structured Outputs](./structured-outputs)