refac: reorganisation

This commit is contained in:
Timothy Jaeryang Baek
2025-11-13 18:39:14 -05:00
parent 374be96e1d
commit 365e40a9d5
52 changed files with 59 additions and 47 deletions

View File

@@ -0,0 +1,7 @@
{
"label": "Speech-to-Text & Text-to-Speech",
"position": 500,
"link": {
"type": "generated-index"
}
}

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 7
sidebar_position: 1000
title: "Channels"
---

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 1
sidebar_position: 800
title: "Chat Features"
---

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 6
sidebar_position: 1100
title: "Evaluation"
---

View File

@@ -1,6 +1,6 @@
{
"label": "Image Generation and Editing",
"position": 5,
"label": "Create & Edit Images",
"position": 400,
"link": {
"type": "generated-index"
}

View File

@@ -1,6 +1,6 @@
{
"label": "Interface",
"position": 6,
"position": 900,
"link": {
"type": "generated-index"
}

View File

@@ -1,5 +1,6 @@
---
title: Model Context Protocol (MCP)
sidebar_position: 1200
---
Open WebUI natively supports **MCP (Model Context Protocol)** starting in **v0.6.31**. This page shows how to enable it quickly, harden it for production, and troubleshoot common snags.

View File

@@ -1,4 +1,4 @@
{
"label": "Pipelines",
"position": 9000
"position": 999999
}

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 3
sidebar_position: 300
title: "Tools & Functions (Plugins)"
---

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 2
sidebar_position: 200
title: "Retrieval Augmented Generation (RAG)"
---
@@ -7,11 +7,6 @@ title: "Retrieval Augmented Generation (RAG)"
If you're using **Ollama**, note that it **defaults to a 2048-token context length**. This severely limits **Retrieval-Augmented Generation (RAG) performance**, especially for web search, because retrieved data may **not be used at all** or only partially processed.
**Why This Is Critical for Web Search:**
Web pages typically contain 4,000-8,000+ tokens even after content extraction, including main content, navigation elements, headers, footers, and metadata. With only 2048 tokens available, you're getting less than half the page content, often missing the most relevant information. Even 4096 tokens is frequently insufficient for comprehensive web content analysis.
**To Fix This:** Navigate to **Admin Panel > Models > Settings** (of your Ollama model) > **Advanced Parameters** and **increase the context length to at least 8192 tokens (ideally 16,000 or more)**. This setting applies specifically to Ollama models. For OpenAI and other integrated models, ensure you're using a model with sufficient built-in context length (e.g., GPT-4 Turbo with 128k tokens).
:::
Retrieval Augmented Generation (RAG) is a cutting-edge technology that enhances the conversational capabilities of chatbots by incorporating context from diverse sources. It works by retrieving relevant information from a wide range of sources such as local and remote documents, web content, and even multimedia sources like YouTube videos. The retrieved text is then combined with a predefined RAG template and prefixed to the user's prompt, providing a more informed and contextually relevant response.
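The assembly step described here can be pictured with a short hypothetical sketch (the template text and retrieved chunks are illustrative, not Open WebUI's actual internals):

```python
# Hypothetical RAG prompt assembly: retrieved text is prefixed to the user's prompt
RAG_TEMPLATE = (
    "Use the following context to answer the question.\n\n"
    "Context:\n{context}\n\n"
    "Question: {query}"
)

retrieved_chunks = [
    "Open WebUI retrieves text from local documents, web pages, and videos.",
    "Retrieved text is combined with a template before being sent to the model.",
]

prompt = RAG_TEMPLATE.format(
    context="\n---\n".join(retrieved_chunks),
    query="How does RAG work here?",
)
print(prompt)
```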
@@ -26,6 +21,12 @@ You can also load documents into the workspace area with their access by startin
## Web Search for RAG
:::warning
**Context Length Warning for Ollama Users:** Web pages typically contain 4,000-8,000+ tokens even after content extraction, including main content, navigation elements, headers, footers, and metadata. With only 2048 tokens available, you're getting less than half the page content, often missing the most relevant information. Even 4096 tokens is frequently insufficient for comprehensive web content analysis.
**To Fix This:** Navigate to **Admin Panel > Models > Settings** (of your Ollama model) > **Advanced Parameters** and **increase the context length to at least 8192 tokens (ideally 16,000 or more)**. This setting applies specifically to Ollama models. For OpenAI and other integrated models, ensure you're using a model with sufficient built-in context length (e.g., GPT-4 Turbo with 128k tokens).
:::
For web content integration, start a query in a chat with `#`, followed by the target URL. Click on the formatted URL in the box that appears above the chat box. Once selected, a document icon appears above `Send a message`, indicating successful retrieval. Open WebUI fetches and parses information from the URL if it can.
:::tip

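The context-length fix above can also be exercised per request when talking to Ollama directly. A minimal sketch, assuming a local Ollama instance on its default port and an illustrative model name, that raises `num_ctx` through Ollama's generate API:

```python
import requests

# Model name and prompt are illustrative; "num_ctx" overrides Ollama's 2048-token default
payload = {
    "model": "llama3.1",
    "prompt": "Summarize the key points of the retrieved web page.",
    "options": {"num_ctx": 16384},
    "stream": False,
}

response = requests.post("http://localhost:11434/api/generate", json=payload, timeout=300)
print(response.json()["response"])
```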
View File

@@ -1,5 +1,5 @@
---
sidebar_position: 1
sidebar_position: 100
title: "Role-Based Access Control (RBAC)"
---

View File

@@ -1,6 +1,6 @@
{
"label": "Web Search",
"position": 6,
"position": 600,
"link": {
"type": "generated-index"
}

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 0
sidebar_position: 700
title: "Workspace"
---

View File

@@ -1101,7 +1101,7 @@ If `OFFLINE_MODE` is enabled, this `ENABLE_VERSION_UPDATE_CHECK` flag is always
- OAuth authentication providers
- Web search and RAG with external APIs
Read more about `offline mode` in the [offline mode guide](/docs/tutorials/offline-mode.md).
Read more about `offline mode` in the [offline mode guide](/tutorials/offline-mode).
:::

View File

@@ -2,8 +2,6 @@
Using Docker Compose simplifies the management of multi-container Docker applications.
If you don't have Docker installed, check out our [Docker installation tutorial](docs/tutorials/docker-install.md).
Docker Compose requires an additional package, `docker-compose-v2`.
:::warning

View File

@@ -3,6 +3,13 @@ sidebar_position: 310
title: "Exporting and Importing Database"
---
:::warning
This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.
:::
If you need to migrate your **Open WebUI** data (e.g., chat histories, configurations, etc.) from one server to another or back it up for later use, you can export and import the database. This guide assumes you're running Open WebUI using the internal SQLite database (not PostgreSQL).
Follow the steps below to export and import the `webui.db` file, which contains your database.
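Because `webui.db` is a plain SQLite file, one safe way to export it is SQLite's online backup API. A minimal sketch, assuming the default Docker data path (adjust both paths to your deployment):

```python
import sqlite3

# Source path matches the default container data directory; both paths are illustrative
src = sqlite3.connect("/app/backend/data/webui.db")
dst = sqlite3.connect("webui-backup.db")

with dst:
    src.backup(dst)  # page-by-page copy yields a consistent snapshot

src.close()
dst.close()
```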

View File

@@ -1,8 +1,12 @@
---
sidebar_position: 24
sidebar_position: 300
title: "Offline Mode"
---
import { TopBanners } from "@site/src/components/TopBanners";
<TopBanners />
:::warning
This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the [contributing tutorial](../contributing.mdx).
@@ -49,7 +53,7 @@ Consider if you need to start the application offline from the beginning of your
### I: Speech-To-Text
The local `whisper` installation does not include the model by default. In this regard, you can follow the [guide](/docs/tutorials/speech-to-text/stt-config.md) only partially if you want to use an external model/provider. To use the local `whisper` application, you must first download the model of your choice (e.g. [Huggingface - Systran](https://huggingface.co/Systran)).
The local `whisper` installation does not include the model by default. In this regard, you can follow the [guide](/features/audio/speech-to-text/stt-config.md) only partially if you want to use an external model/provider. To use the local `whisper` application, you must first download the model of your choice (e.g. [Huggingface - Systran](https://huggingface.co/Systran)).
```python
from faster_whisper import WhisperModel
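For reference, a fuller sketch of such a pre-download, assuming faster-whisper's `WhisperModel` with an illustrative Systran model name and cache directory:

```python
from faster_whisper import WhisperModel

# Model name and download_root are illustrative; instantiating fetches the weights once
model = WhisperModel(
    "Systran/faster-whisper-medium",
    device="cpu",
    compute_type="int8",
    download_root="./whisper-models",
)
```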
@@ -88,14 +92,6 @@ The contents of the download directory must be copied to `/app/backend/data/cach
This is the easiest approach to achieving the offline setup with almost all features available in the online version. Apply only the features you want to use for your deployment.
### II: Speech-To-Text
Follow the [guide](./speech-to-text/stt-config.md).
### II: Text-To-Speech
Follow one of the [guides](https://docs.openwebui.com/category/%EF%B8%8F-text-to-speech).
### II: Embedding Model
In your Open WebUI installation, navigate to `Admin Settings` > `Settings` > `Documents` and select the embedding model you would like to use (e.g. [sentence-transformer/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)). After the selection, click the download button next to it.
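If you prefer to fetch the embedding model ahead of time instead of through the UI, a minimal sketch using the `sentence-transformers` package (the cache path is illustrative):

```python
from sentence_transformers import SentenceTransformer

# Downloads on first run and caches at the given (illustrative) path
model = SentenceTransformer(
    "sentence-transformers/all-MiniLM-L6-v2",
    cache_folder="./embedding-models",
)
print(model.encode("hello world").shape)  # (384,) for this model
```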

View File

@@ -1,6 +1,6 @@
{
"label": "Tips & Tricks",
"position": 900,
"position": 0,
"link": {
"type": "generated-index"
}

View File

@@ -15,7 +15,7 @@ We appreciate your interest in contributing tutorials to the Open WebUI document
## Contributing Steps
1. **Fork the `openwebui/docs` GitHub Repository**
1. **Fork the `open-webui/docs` GitHub Repository**
- Navigate to the [Open WebUI Docs Repository](https://github.com/open-webui/docs) on GitHub.
- Click the **Fork** button at the top-right corner to create a copy under your GitHub account.

View File

@@ -11,21 +11,23 @@ This guide explains how to optimize your setup by configuring a dedicated, light
---
> [!TIP]
>
>## Why Does Open-WebUI Feel Slow?
>
>By default, Open-WebUI has several background tasks that can make it feel like magic but can also place a heavy load on local resources:
>
>- **Title Generation**
>- **Tag Generation**
>- **Autocomplete Generation** (this function triggers on every keystroke)
>- **Search Query Generation**
>
>Each of these features makes asynchronous requests to your model. For example, continuous calls from the autocomplete feature can significantly delay responses on devices with limited memory or processing power, such as a Mac with 32GB of RAM running a 32B quantized model.
>
>Optimizing the task model can help isolate these background tasks from your main chat application, improving overall responsiveness.
>
:::tip
## Why Does Open-WebUI Feel Slow?
By default, Open-WebUI has several background tasks that can make it feel like magic but can also place a heavy load on local resources:
- **Title Generation**
- **Tag Generation**
- **Autocomplete Generation** (this function triggers on every keystroke)
- **Search Query Generation**
Each of these features makes asynchronous requests to your model. For example, continuous calls from the autocomplete feature can significantly delay responses on devices with limited memory or processing power, such as a Mac with 32GB of RAM running a 32B quantized model.
Optimizing the task model can help isolate these background tasks from your main chat application, improving overall responsiveness.
:::
---
## ⚡ How to Optimize Task Model Performance