diff --git a/docs/enterprise/architecture.md b/docs/enterprise/architecture.md
index 1761d93b..8c524a51 100644
--- a/docs/enterprise/architecture.md
+++ b/docs/enterprise/architecture.md
@@ -28,7 +28,9 @@ For organizations with demanding uptime requirements, Open WebUI supports produc
 | Component | Capability |
 | :--- | :--- |
 | **Load Balancing** | Multiple container instances behind a load balancer for resilience and optimal performance. |
-| **External Databases** | Scalable, reliable data storage separate from application instances. |
+| **External Databases** | PostgreSQL for the main database (SQLite is not supported for multi-instance). |
+| **External Vector Database** | A client-server vector database (PGVector, Milvus, Qdrant) or ChromaDB in HTTP server mode. The default ChromaDB local mode uses SQLite which is not safe for multi-process access. |
+| **Redis** | Required for session management, WebSocket coordination, and configuration sync across instances. |
 | **Persistent Storage** | Flexible storage backends to meet your data residency and performance requirements. |
 | **Observability** | Integration with modern logging and metrics tools for proactive monitoring. |

diff --git a/docs/getting-started/quick-start/tab-docker/DockerSwarm.md b/docs/getting-started/quick-start/tab-docker/DockerSwarm.md
index 62c32c60..74a1b689 100644
--- a/docs/getting-started/quick-start/tab-docker/DockerSwarm.md
+++ b/docs/getting-started/quick-start/tab-docker/DockerSwarm.md
@@ -5,6 +5,16 @@ This installation method requires knowledge on Docker Swarms, as it utilizes a s
 It includes isolated containers of ChromaDB, Ollama, and OpenWebUI.
 Additionally, there are pre-filled [Environment Variables](https://docs.openwebui.com/reference/env-configuration) to further illustrate the setup.

+:::info Why ChromaDB Runs as a Separate Container
+
+This stack correctly deploys ChromaDB as a **separate HTTP server** container, with Open WebUI connecting to it via `CHROMA_HTTP_HOST` and `CHROMA_HTTP_PORT`. This is **required** for any multi-worker or multi-replica deployment.
+
+The default ChromaDB mode (without `CHROMA_HTTP_HOST`) uses a local SQLite-backed `PersistentClient` that is **not fork-safe** — concurrent writes from multiple worker processes will crash workers instantly. Running ChromaDB as a separate server avoids this by using HTTP connections instead of direct SQLite access.
+
+If you plan to scale the `openWebUI` service to multiple replicas, you should also switch to PostgreSQL for the main database and set up Redis. See the [Scaling & HA guide](https://docs.openwebui.com/troubleshooting/multi-replica) for full requirements.
+
+:::
+
 Choose the appropriate command based on your hardware setup:

 - **Before Starting**:
diff --git a/docs/getting-started/quick-start/tab-kubernetes/Helm.md b/docs/getting-started/quick-start/tab-kubernetes/Helm.md
index 811c513c..b32777d6 100644
--- a/docs/getting-started/quick-start/tab-kubernetes/Helm.md
+++ b/docs/getting-started/quick-start/tab-kubernetes/Helm.md
@@ -31,9 +31,11 @@ Helm helps you manage Kubernetes applications.

 :::warning

-If you intend to scale Open WebUI using multiple nodes/pods/workers in a clustered environment, you need to setup a NoSQL key-value database.
+If you intend to scale Open WebUI using multiple nodes/pods/workers in a clustered environment, you need to setup a NoSQL key-value database (Redis).
 There are some [environment variables](https://docs.openwebui.com/reference/env-configuration/) that need to be set to the same value for all service-instances, otherwise consistency problems, faulty sessions and other issues will occur!

+**Important:** The default vector database (ChromaDB) uses a local SQLite-backed client that is **not safe for multi-replica or multi-worker deployments**. SQLite connections are not fork-safe, and concurrent writes from multiple processes will crash workers instantly. You **must** switch to an external vector database (PGVector, Milvus, Qdrant) via [`VECTOR_DB`](https://docs.openwebui.com/reference/env-configuration#vector_db), or run ChromaDB as a separate HTTP server via [`CHROMA_HTTP_HOST`](https://docs.openwebui.com/reference/env-configuration#chroma_http_host). See the [Scaling & HA guide](https://docs.openwebui.com/troubleshooting/multi-replica) for full requirements.
+
 :::

 :::danger Critical for Updates
diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx
index 063b692c..69c97475 100644
--- a/docs/reference/env-configuration.mdx
+++ b/docs/reference/env-configuration.mdx
@@ -1715,6 +1715,20 @@ modeling files for reranking.
 - Default: `chroma`
 - Description: Specifies which vector database system to use. This setting determines which vector storage system will be used for managing embeddings.

+:::danger ChromaDB (Default) Is Not Safe for Multi-Worker or Multi-Replica Deployments
+
+The default ChromaDB configuration uses a **local `PersistentClient`** backed by **SQLite**. SQLite is not fork-safe — when uvicorn forks multiple worker processes (`UVICORN_WORKERS > 1`), each worker inherits a copy of the same SQLite connection. Concurrent writes from these forked processes cause immediate crashes (`Child process died`) or database corruption.
+
+**This also applies to multi-replica deployments** (Kubernetes, Docker Swarm) where multiple containers point at the same ChromaDB data directory.
+
+If you need multiple workers or replicas, you **must** either:
+1. **Switch to a client-server vector database** such as [PGVector](/reference/env-configuration#pgvector_db_url), Milvus, or Qdrant (recommended).
+2. **Run ChromaDB as a separate HTTP server** and configure [`CHROMA_HTTP_HOST`](/reference/env-configuration#chroma_http_host) / [`CHROMA_HTTP_PORT`](/reference/env-configuration#chroma_http_port) so that Open WebUI uses an `HttpClient` instead of the local `PersistentClient`.
+
+See the [Scaling & HA guide](/troubleshooting/multi-replica) for full details.
+
+:::
+
 :::note

 PostgreSQL Dependencies
@@ -1735,6 +1749,14 @@ The other vector stores are community-added vector databases.

 ### ChromaDB

+:::warning Local vs. HTTP Mode
+
+By default (when `CHROMA_HTTP_HOST` is **not** set), ChromaDB runs as a local `PersistentClient` using SQLite for storage. This mode is **only safe for single-worker, single-instance deployments** (`UVICORN_WORKERS=1`, one replica).
+
+For multi-worker or multi-replica setups, you **must** configure `CHROMA_HTTP_HOST` and `CHROMA_HTTP_PORT` to point to a standalone ChromaDB server, or switch to a different vector database entirely. See the [VECTOR_DB](#vector_db) warning above.
+
+:::
+
 #### `CHROMA_TENANT`

 - Type: `str`
@@ -1750,7 +1772,7 @@ The other vector stores are community-added vector databases.
 #### `CHROMA_HTTP_HOST`

 - Type: `str`
-- Description: Specifies the hostname of a remote ChromaDB Server. Uses a local ChromaDB instance if not set.
+- Description: Specifies the hostname of a remote ChromaDB Server. Uses a local ChromaDB instance if not set. **Setting this variable is required for multi-worker or multi-replica deployments** — it switches ChromaDB from the local SQLite-backed `PersistentClient` to a fork-safe `HttpClient`.

 #### `CHROMA_HTTP_PORT`

@@ -6285,6 +6307,14 @@ If you use UVICORN_WORKERS, you also need to ensure that related environment var

 :::

+:::danger ChromaDB Not Compatible with Multiple Workers
+
+The default vector database (ChromaDB) uses a local SQLite-backed `PersistentClient`. SQLite connections are **not fork-safe** — when uvicorn forks multiple workers, each process inherits the same SQLite connection, and concurrent writes will crash workers instantly (`Child process died`).
+
+If you set `UVICORN_WORKERS > 1`, you **must** either switch to a client-server vector database (PGVector, Milvus, Qdrant) via [`VECTOR_DB`](/reference/env-configuration#vector_db), or run ChromaDB as a separate server and set [`CHROMA_HTTP_HOST`](/reference/env-configuration#chroma_http_host).
+
+:::
+
 :::warning Database Migrations with Multiple Workers / Multi-Pod Deployments

 When `UVICORN_WORKERS > 1` or when running multiple replicas, starting the application can trigger concurrent database migrations from multiple processes, potentially causing database schema corruption or inconsistent states.
diff --git a/docs/troubleshooting/multi-replica.mdx b/docs/troubleshooting/multi-replica.mdx
index 5be15724..312bcb0f 100644
--- a/docs/troubleshooting/multi-replica.mdx
+++ b/docs/troubleshooting/multi-replica.mdx
@@ -15,7 +15,7 @@ Before troubleshooting specific errors, ensure your deployment meets these **abs
 2. **External Database:** You **MUST** use an external PostgreSQL database (see [`DATABASE_URL`](/reference/env-configuration#database_url)). SQLite is **NOT** supported for multiple instances.
 3. **Redis for WebSockets:** [`ENABLE_WEBSOCKET_SUPPORT=True`](/reference/env-configuration#enable_websocket_support) and [`WEBSOCKET_MANAGER=redis`](/reference/env-configuration#websocket_manager) with a valid [`WEBSOCKET_REDIS_URL`](/reference/env-configuration#websocket_redis_url) are required.
 4. **Shared Storage:** A persistent volume (RWX / ReadWriteMany if possible, or ensuring all replicas map to the same underlying storage for `data/`) is critical for RAG (uploads/vectors) and generated images.
-5. **External Vector Database (Recommended):** While embedded Chroma works with shared storage, using a dedicated external Vector DB (e.g., [PGVector](/reference/env-configuration#pgvector_db_url), Milvus, Qdrant) is **highly recommended** to avoid file locking issues and improve performance.
+5. **External Vector Database (Required):** The default ChromaDB uses a local SQLite-backed `PersistentClient` that is **not safe for multi-worker or multi-replica deployments**. SQLite connections are not fork-safe, and concurrent writes from multiple processes will crash workers instantly. You **must** use a dedicated external Vector DB (e.g., [PGVector](/reference/env-configuration#pgvector_db_url), Milvus, Qdrant) via [`VECTOR_DB`](/reference/env-configuration#vector_db), or run ChromaDB as a [separate HTTP server](/reference/env-configuration#chroma_http_host).
 6. **Database Session Sharing (Optional):** For PostgreSQL deployments with adequate resources, consider enabling [`DATABASE_ENABLE_SESSION_SHARING=True`](/reference/env-configuration#database_enable_session_sharing) to improve performance under high concurrency.

 ---
@@ -117,7 +117,47 @@ The `/app/backend/data` directory is not shared or is not consistent across repl
 - **Kubernetes:** Use a `PersistentVolumeClaim` with `ReadWriteMany` (RWX) access mode if your storage provider supports it (e.g., NFS, CephFS, AWS EFS).
 - **Docker Swarm/Compose:** Mount a shared volume (e.g., NFS mount) to `/app/backend/data` on all containers.

-### 6. Slow Performance in Cloud vs. Local Kubernetes
+### 6. Worker Crashes During Document Upload (ChromaDB + Multi-Worker)
+
+**Symptoms:**
+- Logs show the following sequence, all within the same second:
+  ```
+  save_docs_to_vector_db:1619 - adding to collection file-id
+  INFO: Waiting for child process [pid]
+  INFO: Child process [pid] died
+  ```
+- Workers die immediately during RAG document ingestion.
+- The crash is instant (not a timeout).
+
+**Cause:**
+The default ChromaDB configuration uses a local `PersistentClient` backed by **SQLite**. When uvicorn forks multiple workers (`UVICORN_WORKERS > 1`), each worker process inherits a copy of the same SQLite database connection — all pointing at the same file on disk (`data/vector_db/`).
+
+When two workers attempt to write to the collection simultaneously (e.g., during document upload), SQLite's file-level locking fails across forked processes. The result is either a database lock error or a segfault from corrupted internal state inherited across the `fork()` call, which kills the worker process instantly.
+
+This is a [well-known SQLite limitation](https://www.sqlite.org/howtocorrupt.html#_carrying_an_open_database_connection_across_a_fork_): open database connections must not be carried across a `fork()`.
+
+**Solution:**
+You **must** stop using the default local ChromaDB with multiple workers. Pick one of these options:
+
+| Option | Change | Tradeoff |
+|--------|--------|----------|
+| **Keep 1 worker** | Set `UVICORN_WORKERS=1` (the default) | Simplest, but limits concurrency |
+| **Use ChromaDB HTTP mode** | Set [`CHROMA_HTTP_HOST`](/reference/env-configuration#chroma_http_host) / [`CHROMA_HTTP_PORT`](/reference/env-configuration#chroma_http_port) to point to a separate Chroma server | Each worker connects via HTTP instead of SQLite — fully fork-safe |
+| **Switch vector DB** | Set [`VECTOR_DB`](/reference/env-configuration#vector_db) to `pgvector`, `milvus`, `qdrant`, etc. | These are client-server databases, inherently multi-process safe |
+
+**Recommended fix** — run ChromaDB as a separate server:
+
+```bash
+# Run chroma server separately
+chroma run --host 0.0.0.0 --port 8000 --path /data/vector_db
+
+# Then set these env vars for Open WebUI
+CHROMA_HTTP_HOST=localhost
+CHROMA_HTTP_PORT=8000
+UVICORN_WORKERS=4
+```
+
+### 7. Slow Performance in Cloud vs. Local Kubernetes

 **Symptoms:**
 - Open WebUI performs well locally but experiences significant degradation or timeouts when deployed to cloud providers (AKS, EKS, GKE).
@@ -131,7 +171,7 @@ Refer to the **[Cloud Infrastructure Latency](/troubleshooting/performance#%EF%B

 If you need more tips for performance improvements, check out the full [Optimization & Performance Guide](/troubleshooting/performance).

-### 7. Optimizing Database Performance
+### 8. Optimizing Database Performance

 For PostgreSQL deployments with adequate resources, consider these optimizations:

diff --git a/docs/troubleshooting/performance.md b/docs/troubleshooting/performance.md
index 089f4003..2d5b1058 100644
--- a/docs/troubleshooting/performance.md
+++ b/docs/troubleshooting/performance.md
@@ -125,10 +125,11 @@ For high-concurrency PostgreSQL deployments, the default connection pool setting
 ### Vector Database (RAG)

 For multi-user setups, the choice of Vector DB matters.
-- **ChromaDB**: **NOT RECOMMENDED** for multi-user environments due to performance limitations and locking issues.
+- **ChromaDB (Default)**: **NOT SAFE** for multi-worker (`UVICORN_WORKERS > 1`) or multi-replica deployments. The default ChromaDB configuration uses a local `PersistentClient` backed by **SQLite**. SQLite connections are not fork-safe — when uvicorn forks multiple workers, each process inherits the same database connection, and concurrent writes cause instant worker crashes (`Child process died`) or database corruption. This is a fundamental SQLite limitation, not a bug. See the [Scaling & HA troubleshooting guide](/troubleshooting/multi-replica#6-worker-crashes-during-document-upload-chromadb--multi-worker) for the full crash sequence and solutions.
 - **Recommendations**:
-  * **Milvus** or **Qdrant**: Best for improved scale and performance.
-  * **PGVector**: Excellent choice if you are already using PostgreSQL.
+  * **Milvus** or **Qdrant**: Best for improved scale and performance. These are client-server databases, inherently safe for multi-process access.
+  * **PGVector**: Excellent choice if you are already using PostgreSQL. Also fully multi-process safe.
+  * **ChromaDB HTTP mode**: If you want to keep using ChromaDB, run it as a [separate server](/reference/env-configuration#chroma_http_host) so Open WebUI connects via HTTP instead of local SQLite.
 - **Multitenancy**: If using Milvus or Qdrant, enabling multitenancy offers better resource sharing.
   * `ENABLE_MILVUS_MULTITENANCY_MODE=True`
   * `ENABLE_QDRANT_MULTITENANCY_MODE=True`
@@ -359,7 +360,7 @@ If resource usage is critical, disable automated features that constantly trigge
 2. **Workers**: `THREAD_POOL_SIZE=2000` (Prevent timeouts).
 3. **Streaming**: `CHAT_RESPONSE_STREAM_DELTA_CHUNK_SIZE=7` (Reduce CPU/Net/DB writes).
 4. **Chat Saving**: `ENABLE_REALTIME_CHAT_SAVE=False`.
-5. **Vector DB**: **Milvus**, **Qdrant**, or **PGVector**. **Avoid ChromaDB.**
+5. **Vector DB**: **Milvus**, **Qdrant**, or **PGVector**. **Do not use ChromaDB's default local mode** — its SQLite backend will crash under multi-worker/multi-replica access.
 6. **Task Model**: External/Hosted (Offload compute).
 7. **Caching**: `ENABLE_BASE_MODELS_CACHE=True`, `MODELS_CACHE_TTL=300`, `ENABLE_QUERIES_CACHE=True`.

diff --git a/docs/troubleshooting/rag.mdx b/docs/troubleshooting/rag.mdx
index 5488c75f..2086eb80 100644
--- a/docs/troubleshooting/rag.mdx
+++ b/docs/troubleshooting/rag.mdx
@@ -309,26 +309,48 @@ If PDFs containing images with text are returning empty content:
 | 📄 API returns "empty content" error | Wait for file processing to complete before adding to knowledge base |
 | 💥 CUDA OOM during embedding | Reduce batch size, isolate GPU, or restart container |
 | 📷 PDF images not extracted | Use Tika/Docling, enable OCR, or update pypdf |
-| 💀 Worker dies during upload | Update Open WebUI, or increase `--timeout-worker-healthcheck` |
+| 💀 Worker dies during upload (instant) | Switch away from default ChromaDB (SQLite) in multi-worker setups |
+| 💀 Worker dies during upload (timeout) | Update Open WebUI, or increase `--timeout-worker-healthcheck` |

 ---

-### 12. Worker Dies During Document Upload (SentenceTransformers) 💀
+### 12. Worker Dies During Document Upload 💀

-When uploading documents with the default SentenceTransformers embedding engine in a **multi-worker** deployment, you may see:
+When uploading documents in a **multi-worker** deployment, you may see:

 ```
 INFO: Waiting for child process [12]
 INFO: Child process [12] died
 ```

-**The Problem**: When using **multiple uvicorn workers** (`--workers 2` or more), uvicorn monitors worker health via periodic pings. The default health check timeout is just **5 seconds**. Local SentenceTransformers embedding operations can take longer than this, and in older versions of Open WebUI, the embedding call blocked the event loop entirely — preventing the worker from responding to health checks. Uvicorn then killed the worker as unresponsive.
+There are **two distinct causes** for this in multi-worker setups:
+
+#### Cause A: ChromaDB SQLite + Fork (Instant Crash)
+
+If you are using the **default ChromaDB** vector database (which uses a local SQLite-backed `PersistentClient`) with `UVICORN_WORKERS > 1`, the crash is caused by SQLite being **not fork-safe**. When uvicorn forks multiple workers, each process inherits the same SQLite database connection. Concurrent writes to the vector database from multiple workers cause an immediate crash — not a timeout, but an instant fatal error.
+
+You will typically see this pattern all within the same second:
+```
+save_docs_to_vector_db:1619 - adding to collection file-id
+INFO: Waiting for child process [pid]
+INFO: Child process [pid] died
+```
+
+**Solution:** You **must** switch away from the default local ChromaDB when using multiple workers:
+- Set [`VECTOR_DB`](/reference/env-configuration#vector_db) to `pgvector`, `milvus`, or `qdrant`
+- Or run ChromaDB as a separate HTTP server and set [`CHROMA_HTTP_HOST`](/reference/env-configuration#chroma_http_host) / [`CHROMA_HTTP_PORT`](/reference/env-configuration#chroma_http_port)
+
+See the [Scaling & HA guide](/troubleshooting/multi-replica#6-worker-crashes-during-document-upload-chromadb--multi-worker) for full details.
+
+#### Cause B: SentenceTransformers Health Check Timeout (Older Versions)
+
+When using the **default SentenceTransformers** embedding engine (local embeddings) with multiple workers, uvicorn monitors worker health via periodic pings. The default health check timeout is just **5 seconds**. In older versions of Open WebUI, the embedding call blocked the event loop entirely — preventing the worker from responding to health checks. Uvicorn then killed the worker as unresponsive.

 :::note

 This issue was **fixed** in Open WebUI. The embedding system now uses `run_coroutine_threadsafe` to keep the main event loop responsive during embedding operations, so workers will no longer be killed during uploads regardless of how long embeddings take.

-If you are running a version with this fix and still experiencing worker death, ensure your Open WebUI is up to date.
+If you are running a version with this fix and still experiencing worker death, check **Cause A** above (ChromaDB SQLite) first, then ensure your Open WebUI is up to date.

 :::

diff --git a/docs/tutorials/integrations/redis.md b/docs/tutorials/integrations/redis.md
index afd199aa..fa502963 100644
--- a/docs/tutorials/integrations/redis.md
+++ b/docs/tutorials/integrations/redis.md
@@ -419,6 +419,20 @@ If you set `UVICORN_WORKERS` to any value greater than 1, you **must** configure

 :::

+:::danger
+
+**Critical: Default ChromaDB (SQLite) Not Compatible with Multiple Workers**
+
+In addition to Redis, you must also address the **vector database**. The default ChromaDB uses a local SQLite-backed `PersistentClient` that is **not fork-safe**. When uvicorn forks multiple workers, concurrent writes to the same SQLite file will crash workers instantly during document uploads (`Child process died`).
+
+You must either:
+- Switch to a client-server vector database (`VECTOR_DB=pgvector`, `milvus`, or `qdrant`)
+- Run ChromaDB as a separate HTTP server and set `CHROMA_HTTP_HOST` / `CHROMA_HTTP_PORT`
+
+See the [Scaling & HA guide](/troubleshooting/multi-replica#6-worker-crashes-during-document-upload-chromadb--multi-worker) for details.
+
+:::
+
 ## Verification

 ### Verify Redis Connection