mirror of
https://github.com/open-webui/docs.git
synced 2026-01-03 10:20:09 +07:00
Merge pull request #921 from Classic298/dev
This commit is contained in:
70
docs/features/chat-features/multi-model-chats.mdx
Normal file
@@ -0,0 +1,70 @@
---
sidebar_position: 15
title: "Multi-Model Chats"
---

# Multi-Model Chats

Open WebUI allows you to interact with **multiple models simultaneously** within a single chat interface. This powerful feature enables you to compare responses, verify facts, and leverage the unique strengths of different LLMs side-by-side.

## Overview

In a Multi-Model Chat, your prompt is sent to two or more selected models at the same time. Their responses are displayed in parallel columns (or stacked, depending on screen size), giving you immediate insight into how different AI architectures approach the same problem.

## How to Use

1. **Select Models**: In the chat header (Model Selector), click the **+ (Plus)** button to add more models to your current session.
   * *Example Setup*: Select **GPT-5.1 Thinking** (for reasoning), **Gemini 3** (for creative writing), and **Claude Sonnet 4.5** (for overall performance).
2. **Send Prompt**: Type your question as usual.
3. **View Results**: Watch as all models generate their responses simultaneously in the chat window.
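
Conceptually, the steps above fan the same prompt out to every selected model at once. A minimal sketch of that fan-out, with a pluggable `ask` callable standing in for the actual model call (the function names and structure are illustrative, not Open WebUI's internals):

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(prompt, models, ask):
    """Send the same prompt to every model concurrently.

    `ask(model, prompt)` stands in for whatever client call reaches
    the model. Returns a {model: response} mapping.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {m: pool.submit(ask, m, prompt) for m in models}
        return {m: f.result() for m, f in futures.items()}

# Example with a stub in place of a real model call:
responses = fan_out(
    "Explain CORS in one sentence.",
    ["model-a", "model-b"],
    lambda model, prompt: f"{model} says: ...",
)
```

The UI then simply renders one column per entry in the result mapping.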

## Usage Scenarios

* **Model Comparison/Benchmarking**: Test which model writes better Python code or which one hallucinates less on niche topics.
* **Fact Validation**: "Cross-examine" models. If two models say X and one says Y, you can investigate further.
* **Diverse Perspectives**: Get a "Creative" take from one model and a "Technical" take from another for the same query.

## Permissions

Admins can control access to Multi-Model Chats on a per-role or per-group basis.

* **Location**: Admin Panel > Settings > General > User Permissions > Chat > **Multiple Models**
* **Environment Variable**: `USER_PERMISSIONS_CHAT_MULTIPLE_MODELS` (Default: `True`)

If disabled, users will not see the "plus" button in the model selector and cannot initiate multi-model sessions.

---

## Merging Responses (Mixture of Agents)

Once you have responses from multiple models, Open WebUI offers an advanced capability to **Merge** them into a single, superior answer. This implements a **Mixture of Agents (MOA)** workflow.

### What is Merging?

Merging takes the outputs from all your active models and sends them, along with your original prompt, to a "Synthesizer Model." This Synthesizer Model reads all the draft answers and combines them into one final, polished response.
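
The synthesis step can be pictured as prompt assembly: the drafts are packed into a single instruction for the synthesizer. A sketch under that assumption (the instruction wording below is illustrative, not Open WebUI's actual prompt template):

```python
def build_merge_prompt(user_prompt, drafts):
    """Assemble a synthesizer prompt from several model drafts.

    `drafts` maps model name -> draft answer. The instruction text is
    illustrative; Open WebUI uses its own internal template.
    """
    sections = "\n\n".join(
        f"### Draft from {model}\n{answer}" for model, answer in drafts.items()
    )
    return (
        "You are a synthesizer. Combine the drafts below into one "
        "accurate, polished answer to the original question.\n\n"
        f"Original question: {user_prompt}\n\n{sections}"
    )

prompt = build_merge_prompt(
    "What is CORS?",
    {"model-a": "CORS is ...", "model-b": "Cross-Origin Resource Sharing ..."},
)
```

The assembled prompt is then sent to the configured Task Model like any other completion request.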

### How to Merge

1. Start a **Multi-Model Chat** and get responses from your selected models.
2. Look for the **Merge** (or "Synthesize") button in the response controls area (often near the regeneration controls).
3. Open WebUI will generate a **new response** that aggregates the best parts of the previous outputs.

### Advantages of Merging

* **Higher Accuracy**: Research suggests that aggregating outputs from multiple models often outperforms any single model acting alone.
* **Best of Both Worlds**: You might get the code accuracy of Model A combined with the clear explanations of Model B.
* **Reduced Hallucinations**: The synthesizer model can filter out inconsistencies found in individual responses.

### Configuration

The merging process relies on the backend **Tasks** system.

* **Task Model**: The specific model used to perform the merge can be configured in **Admin Panel > Settings > Tasks**. We recommend using a highly capable model (like GPT-5.1 or Claude Sonnet 4.5) as the task model for the best results.
* **Prompt Template**: The system uses a specialized prompt template to instruct the AI on how to synthesize the answers.

:::info Experimental
The Merging/MOA feature is an advanced capability. While powerful, it requires a capable Task Model to work effectively.
:::
37
docs/features/experimental/direct-connections.mdx
Normal file
@@ -0,0 +1,37 @@
---
sidebar_position: 1510
title: "Direct Connections"
---

**Direct Connections** is a feature that allows users to connect their Open WebUI client directly to OpenAI-compatible API endpoints, bypassing the Open WebUI backend for inference requests.

## Overview

In a standard deployment, Open WebUI acts as a proxy: the browser sends the prompt to the Open WebUI backend, which then forwards it to the LLM provider (Ollama, OpenAI, etc.).

With **Direct Connections**, the browser communicates directly with the API provider.

## Benefits

* **Privacy & Control**: Users can use their own personal API keys without storing them on the Open WebUI server (keys are stored in the browser's local storage).
* **Reduced Latency**: Removes the "middleman" hop through the Open WebUI backend, potentially speeding up response times.
* **Server Load Reduction**: Offloads the network traffic and connection management from the Open WebUI server to the individual client browsers.

## Prerequisites

1. **Admin Enablement**: The administrator must enable this feature globally.
   * **Admin Panel > Settings > Connections > Direct Connections**: Toggle **On**.
   * Alternatively, set the environment variable: `ENABLE_DIRECT_CONNECTIONS=true`.
2. **CORS Configuration**: Since the browser is making the request, the API provider must have **Cross-Origin Resource Sharing (CORS)** configured to allow requests from your Open WebUI domain.
   * *Note: Many strict providers (like official OpenAI) might block direct browser requests due to CORS policies. This feature is often best used with flexible providers or internal API gateways.*
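
Whether a direct connection works largely comes down to the provider's CORS response headers. A simplified sketch of the origin check a browser effectively performs (real CORS evaluation also covers allowed methods, headers, and credentials; the domain below is a placeholder):

```python
def cors_allows(origin, response_headers):
    """Rough check of whether a CORS response permits `origin`.

    Simplified: browsers also evaluate Access-Control-Allow-Methods,
    -Headers, and credential rules. `response_headers` maps header
    name -> value.
    """
    allowed = response_headers.get("Access-Control-Allow-Origin", "")
    return allowed == "*" or allowed == origin

# A provider that echoes your Open WebUI domain back would pass:
ok = cors_allows(
    "https://webui.example.com",
    {"Access-Control-Allow-Origin": "https://webui.example.com"},
)
# A provider that sends no CORS headers blocks the browser request:
blocked = cors_allows("https://webui.example.com", {})
```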

## User Configuration

Once enabled by the admin, users can configure their own connections:

1. Go to **User Settings > Connections**.
2. Click **+ (Add Connection)**.
3. Enter the **Base URL** (e.g., `https://api.groq.com/openai/v1`) and your **API Key**.
4. Click **Save**.

The models from this direct connection will now appear in your model list, often indistinguishable from backend-provided models, but requests will flow directly from your machine to the provider.
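
Under the hood, the browser issues a standard OpenAI-style request against the configured Base URL. A sketch of what that request looks like, built with Python's `urllib` for illustration (the endpoint path and payload follow the OpenAI-compatible convention; the API key and model id are placeholders):

```python
import json
import urllib.request

base_url = "https://api.groq.com/openai/v1"  # the Base URL from the connection form
api_key = "sk-placeholder"                   # placeholder, never a real key

# Chat completions request as an OpenAI-compatible client would send it.
req = urllib.request.Request(
    f"{base_url}/chat/completions",
    data=json.dumps({
        "model": "some-model",  # placeholder model id
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode(),
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# Not sent here; urllib.request.urlopen(req) would perform the call,
# which a browser does directly from the client machine.
```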
@@ -5049,6 +5049,12 @@ For configuration using individual parameters or encrypted SQLite, see the relev

:::

#### `ENABLE_DB_MIGRATIONS`

- Type: `bool`
- Default: `True`
- Description: Controls whether database migrations are automatically run on startup. In multi-pod or multi-worker deployments, set this to `False` on all pods except one to designate a "master" pod responsible for migrations, preventing race conditions or schema corruption.

:::warning

**Required for Multi-Replica Setups**

@@ -5056,6 +5062,8 @@ For multi-replica or high-availability deployments (Kubernetes, Docker Swarm), y

:::

#### `DATABASE_TYPE`

- Type: `str`

@@ -5469,16 +5477,19 @@ If you use UVICORN_WORKERS, you also need to ensure that related environment var

:::

:::warning Database Migrations with Multiple Workers / Multi-Pod Deployments
When `UVICORN_WORKERS > 1` or when running multiple replicas, starting the application can trigger concurrent database migrations from multiple processes, potentially causing database schema corruption or inconsistent states.

**Recommendation:**
To handle migrations safely in multi-process/multi-pod environments, you can:
1. **Designate a Master (Recommended):** Set `ENABLE_DB_MIGRATIONS=False` on all but one instance/worker. The instance with `ENABLE_DB_MIGRATIONS=True` (default) will handle the migration, while others will wait or skip it.
2. **Scale Down:** Temporarily scale down to a single instance/worker to let migrations finish before scaling back up.

**For Kubernetes, Helm, and Orchestrated Setups:**
It is recommended to use the `ENABLE_DB_MIGRATIONS` variable to designate a specific pod for migrations, or use an init container/job to handle migrations before scaling up the main application pods. This ensures schema updates are applied exactly once.
:::

### Cache Settings

#### `CACHE_CONTROL`

@@ -14,7 +14,10 @@ Keeping Open WebUI updated ensures you have the latest features, security patche
- **Backup your data** before major version updates
- **Check release notes** at https://github.com/open-webui/open-webui/releases for breaking changes
- **Clear browser cache** after updating to ensure the latest web interface loads
- **Running Multiple Workers?** If you use `UVICORN_WORKERS > 1`, you **MUST** ensure migrations run safely by either:
  1. Running the updated container with `UVICORN_WORKERS=1` first.
  2. Designating a master worker using `ENABLE_DB_MIGRATIONS=True` (default) on one instance and `False` on others.

  Once migrations complete, you can run with multiple workers normally.
:::
@@ -123,16 +123,27 @@ The `/app/backend/data` directory is not shared or is not consistent across repl

### Updates and Migrations

:::danger Critical: Avoid Concurrent Migrations
**Always ensure only one process is running database migrations when upgrading Open WebUI versions.**
:::

Database migrations run automatically on startup. If multiple replicas (or multiple workers within a single container) start simultaneously with a new version, they may try to run migrations concurrently, potentially leading to race conditions or database schema corruption.

There are two ways to safely handle migrations in a multi-replica environment:

#### Option 1: Designate a Master Migration Pod (Recommended)

1. Identify one pod/replica as the "master" for migrations.
2. Set `ENABLE_DB_MIGRATIONS=True` (default) on the master pod.
3. Set `ENABLE_DB_MIGRATIONS=False` on all other pods.
4. When updating, the master pod will handle the database schema update while other pods skip the migration step.
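
As a sketch, the master/replica split described above might look like this in a Docker Compose file (the service names are illustrative; the image tag should match your deployment):

```yaml
services:
  open-webui-master:                  # the one instance allowed to migrate
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - ENABLE_DB_MIGRATIONS=True     # default; runs migrations on startup
  open-webui-replica:                 # scaled-out replicas skip migrations
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - ENABLE_DB_MIGRATIONS=False
```

In Kubernetes the same effect is achieved by setting the env var on a dedicated Deployment or an init job.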

#### Option 2: Scale Down During Update

1. **Scale Down:** Set replicas to `1` (and ensure `UVICORN_WORKERS=1`).
2. **Update Image:** Update the image or version.
3. **Wait for Health Check:** Wait for the single instance to start fully and complete migrations.
4. **Scale Up:** Increase replicas back to your desired count.

### Session Affinity (Sticky Sessions)

While Open WebUI is designed to be stateless with proper Redis configuration, enabling **Session Affinity** (Sticky Sessions) at your Load Balancer / Ingress level can improve performance and reduce occasional jitter in WebSocket connections.