Update index.mdx

This commit is contained in:
Classic298
2026-01-07 17:06:36 +01:00
committed by GitHub
parent 344b1d8ee1
commit 8079e02fe6


You can also let your LLM auto-select the right Tools using the **AutoTool Filter**.
---
## Tool Calling Modes: Default vs. Native (Agentic Mode)
Open WebUI offers two distinct ways for models to interact with tools. Choosing the right mode depends on your model's capabilities and your performance requirements.
### 🟡 Default Mode (Prompt-based)
In Default Mode, Open WebUI manages tool selection by injecting a specific prompt template that guides the model to output a tool request.
- **Compatibility**: Works with **practically any model**, including older or smaller local models that lack native function-calling support.
- **Flexibility**: Highly customizable via prompt templates.
- **Caveat**: Can be slower (requires extra tokens) and less reliable for complex, multi-step tool chaining.
### 🟢 Native Mode (Agentic Mode / System Function Calling)
Native Mode (also called **Agentic Mode**) leverages the model's built-in capability to handle tool definitions and return structured tool calls (JSON). This is the **recommended mode** for high-performance agentic workflows.
:::warning Model Quality Matters
**Agentic tool calling requires high-quality models to work reliably.** While small local models may technically support function calling, they often struggle with the complex reasoning required for multi-step tool usage. For best results, use frontier models like **GPT-5**, **Claude 4.5 Sonnet**, **Gemini 3 Flash**, or **MiniMax M2.1**. Small local models may produce malformed JSON or fail to follow the strict state management required for agentic behavior.
:::
#### Why use Native Mode (Agentic Mode)?
- **Speed & Efficiency**: Lower latency as it avoids bulky prompt-based tool selection.
- **Reliability**: Higher accuracy in following tool schemas (with quality models).
- **Multi-step Chaining**: Essential for **Agentic Research** and **Interleaved Thinking** where a model needs to call multiple tools in succession.
- **Autonomous Decision-Making**: Models can decide when to search, which tools to use, and how to combine results.
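In Native Mode the tool contract is structured data rather than prose. A minimal sketch in the OpenAI-compatible Chat Completions format (the `get_weather` tool is a made-up example) shows the difference: the tool is described as a JSON schema, and the model replies with machine-readable `tool_calls` instead of free text.

```python
import json

# Tool definition in the OpenAI-compatible "tools" format; the weather tool
# itself is an illustrative example, not a built-in.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# A structured assistant message as an API would return it: no prose to parse,
# just tool_calls entries the runtime can execute directly.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'},
    }],
}

for call in assistant_message["tool_calls"]:
    args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    print(call["function"]["name"], args["city"])
```

Because the schema and the call format are enforced by the API rather than by a prompt, there is no template overhead and far less room for malformed output, which is where the speed and reliability gains come from.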
#### How to Enable Native Mode (Agentic Mode)
Native Mode can be enabled at two levels:
1. **Global/Administrator Level (Recommended)**:
#### Model Requirements & Caveats
:::tip Recommended Models for Agentic Mode
For reliable agentic tool calling, use high-tier frontier models:
- **GPT-5** (OpenAI)
- **Claude 4.5 Sonnet** (Anthropic)
- **Gemini 3 Flash** (Google)
- **MiniMax M2.1**
These models excel at multi-step reasoning, proper JSON formatting, and autonomous tool selection.
:::
- **Large Local Models**: Some large local models (e.g., Qwen 3 32B, Llama 3.3 70B) can work with Native Mode, but results vary significantly by model quality.
- **Small Local Models Warning**: **Small local models** (under 30B parameters) often struggle with Native Mode. They may produce malformed JSON, fail to follow strict state management, or make poor tool selection decisions. For these models, **Default Mode** is usually more reliable.
| Feature | Default Mode | Native Mode |
|:---|:---|:---|
| **Logic** | Prompt-based (Open WebUI) | Model-native (API/Ollama) |
| **Complex Chaining** | ⚠️ Limited | ✅ Excellent |
### Built-in System Tools (Native/Agentic Mode)
🛠️ When **Native Mode (Agentic Mode)** is enabled, Open WebUI automatically injects powerful system tools based on the features toggled for the chat. This unlocks truly agentic behaviors where capable models (like GPT-5, Claude 4.5 Sonnet, Gemini 3 Flash, or MiniMax M2.1) can perform multi-step research, explore knowledge bases, or manage user memory autonomously.
| Tool | Purpose | Requirements |
|------|---------|--------------|
| **Search & Web** | | |
| `search_web` | Search the public web for information. Best for current events, external references, or topics not covered in internal documents. | `ENABLE_WEB_SEARCH` enabled. |
| `fetch_url` | Visits a URL and extracts text content via the Web Loader. | Part of Web Search feature. |
| **Knowledge Base** | | |
| `list_knowledge_bases` | List the user's accessible knowledge bases with file counts. | Always available. |
| `search_knowledge_bases` | Search knowledge bases by name and description. | Always available. |
| `search_knowledge_files` | Search files across accessible knowledge bases by filename. | Always available. |
| `view_knowledge_file` | Get the full content of a file from a knowledge base. | Always available. |
| `query_knowledge_bases` | Search internal knowledge bases using semantic/vector search. Should be your first choice for finding information before searching the web. | Always available. |
| **Image Gen** | | |
| `generate_image` | Generates a new image based on a prompt (supports `steps`). | `ENABLE_IMAGE_GENERATION` enabled. |
| `edit_image` | Edits an existing image based on a prompt and URL. | `ENABLE_IMAGE_EDIT` enabled. |
| `get_current_timestamp` | Get the current UTC Unix timestamp and ISO date. | Always available. |
| `calculate_timestamp` | Calculate relative timestamps (e.g., "3 days ago"). | Always available. |
**Why use these?** They enable **Deep Research** (searching the web multiple times, or querying knowledge bases), **Contextual Awareness** (looking up previous chats or notes), **Dynamic Personalization** (saving facts), and **Precise Automation** (generating content based on existing notes or documents).
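The "Requirements" column above follows a simple pattern: each tool is injected only when its feature flag is on, while flagless tools are always available. A hedged sketch of that gating (the handlers and flag values here are stubs for illustration, not Open WebUI code):

```python
import time

# Illustrative feature-flag state; in Open WebUI these come from the environment.
FEATURES = {"ENABLE_WEB_SEARCH": True, "ENABLE_IMAGE_GENERATION": False}

def get_current_timestamp() -> dict:
    """Stub for the always-available timestamp tool."""
    return {"unix": int(time.time())}

# Registry mapping tool names from the table to a gating flag and a stub handler.
REGISTRY = {
    "search_web": {"flag": "ENABLE_WEB_SEARCH", "handler": lambda q: f"results for {q!r}"},
    "generate_image": {"flag": "ENABLE_IMAGE_GENERATION", "handler": lambda p: "image bytes"},
    "get_current_timestamp": {"flag": None, "handler": get_current_timestamp},  # no flag: always on
}

def available_tools() -> list[str]:
    """Only inject tools whose feature flag is enabled (or that have no flag)."""
    return [n for n, t in REGISTRY.items() if t["flag"] is None or FEATURES.get(t["flag"])]

print(available_tools())  # -> ['search_web', 'get_current_timestamp']
```

With `ENABLE_IMAGE_GENERATION` off, `generate_image` is simply never offered to the model, which is why toggling chat features changes the set of tools an agentic model can call.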
### Interleaved Thinking {#interleaved-thinking}
🧠 When using **Native Mode (Agentic Mode)**, high-tier models can engage in **Interleaved Thinking**. This is a powerful "Thought → Action → Thought → Action → Thought → ..." loop where the model can reason about a task, execute one or more tools, evaluate the results, and then decide on its next move.
:::info Quality Models Required
Interleaved thinking requires models with strong reasoning capabilities. This feature works best with frontier models (GPT-5, Claude 4.5+, Gemini 3+) that can maintain context across multiple tool calls and make intelligent decisions about which tools to use when.
:::
This is fundamentally different from a single-shot tool call. In an interleaved workflow, the model follows a cycle:
1. **Reason**: Analyze the user's intent and identify information gaps.
2. **Act**: Call a tool (e.g., `query_knowledge_bases` for internal docs or `search_web` and `fetch_url` for web research).
3. **Think**: Read the tool's output and update its internal understanding.
4. **Iterate**: If the answer isn't clear, call another tool (e.g., `view_knowledge_file` to read a specific document or `fetch_url` to read a specific page) or refine the search.
5. **Finalize**: Only after completing this "Deep Research" cycle does the model provide a final, grounded answer.
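The five-step cycle above can be sketched as a bounded loop. This is a hedged illustration: the "model" here is a scripted stand-in that requests one (fake) web search and then answers, whereas in a real agentic run the LLM itself decides each step.

```python
def fake_model(history: list[dict]) -> dict:
    """Stand-in for the LLM: request a tool until a tool result exists, then answer."""
    if any(m["role"] == "tool" for m in history):
        return {"type": "answer", "text": "Grounded answer based on tool output."}
    return {"type": "tool_call", "name": "search_web", "args": {"query": "open webui"}}

def run_tool(name: str, args: dict) -> str:
    return f"stub result for {name}({args})"  # stand-in for a real tool execution

def agent_loop(user_msg: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):          # cap iterations so the loop always terminates
        step = fake_model(history)      # 1. Reason: model inspects the history
        if step["type"] == "answer":    # 5. Finalize once no more tools are needed
            return step["text"]
        result = run_tool(step["name"], step["args"])        # 2. Act
        history.append({"role": "tool", "content": result})  # 3./4. Think & iterate
    return "step budget exhausted"

print(agent_loop("What is Open WebUI?"))
```

Note the step budget: because the model, not the application, decides when to stop, practical agent loops cap the number of iterations to guarantee termination.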
This behavior is what transforms a standard chatbot into an **Agentic AI** capable of solving complex, multi-step problems autonomously.