complete docs overhaul

DrMelone
2026-02-14 01:02:31 +01:00
parent 6e35ee70ff
commit c10c8d15ec
186 changed files with 1427 additions and 1348 deletions


@@ -0,0 +1,7 @@
{
  "label": "📖 Reference",
  "position": 150,
  "link": {
    "type": "generated-index"
  }
}


@@ -0,0 +1,601 @@
---
slug: /getting-started/api-endpoints
sidebar_position: 400
title: "API Endpoints"
---
This guide explains how to interact with the Open WebUI API endpoints for integration and automation with our models. Please note that this is an experimental setup and may change in future updates.
## Authentication
To ensure secure access to the API, authentication is required 🛡️. You can authenticate your API requests using the Bearer Token mechanism. Obtain your API key from **Settings > Account** in the Open WebUI, or alternatively, use a JWT (JSON Web Token) for authentication.
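Whichever credential you use, it is sent the same way in the `Authorization` header. As a minimal sketch (assuming the default `http://localhost:3000` address used throughout this guide):

```python
import requests

def auth_headers(token: str) -> dict:
    # `token` can be an API key (sk-...) from Settings > Account or a JWT;
    # both are sent as a Bearer token
    return {'Authorization': f'Bearer {token}'}

def list_models(token: str, base_url: str = 'http://localhost:3000'):
    # Any authenticated endpoint is called the same way, e.g. GET /api/models
    response = requests.get(f'{base_url}/api/models', headers=auth_headers(token))
    response.raise_for_status()
    return response.json()
```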
## Swagger Documentation Links
:::important
Make sure to set the `ENV` environment variable to `dev` in order to access the Swagger documentation for any of these services. Without this configuration, the documentation will not be available.
:::
Access detailed API documentation for different services provided by Open WebUI:
| Application | Documentation Path |
|-------------|-------------------------|
| Main | `/docs` |
## Notable API Endpoints
### 📜 Retrieve All Models
- **Endpoint**: `GET /api/models`
- **Description**: Fetches all models created or added via Open WebUI.
- **Example**:
```bash
curl -H "Authorization: Bearer YOUR_API_KEY" http://localhost:3000/api/models
```
### 💬 Chat Completions
- **Endpoint**: `POST /api/chat/completions`
- **Description**: An OpenAI API-compatible chat completion endpoint for models on Open WebUI, including Ollama models, OpenAI models, and Open WebUI Function models.
- **Curl Example**:
```bash
curl -X POST http://localhost:3000/api/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.1",
"messages": [
{
"role": "user",
"content": "Why is the sky blue?"
}
]
}'
```
- **Python Example**:
```python
import requests

def chat_with_model(token):
    url = 'http://localhost:3000/api/chat/completions'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    data = {
        "model": "granite3.1-dense:8b",
        "messages": [
            {
                "role": "user",
                "content": "Why is the sky blue?"
            }
        ]
    }
    response = requests.post(url, headers=headers, json=data)
    return response.json()
```
### 🔄 OpenResponses
- **Endpoint**: `POST /api/responses`
- **Description**: Proxies requests to the [OpenResponses API](https://openresponses.org). Automatically routes to the correct upstream backend based on the `model` field, including Azure OpenAI deployments. Supports both streaming (SSE) and non-streaming responses.
- **Curl Example**:
```bash
curl -X POST http://localhost:3000/api/responses \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4.1",
"input": "Explain quantum entanglement in simple terms."
}'
```
- **Python Example**:
```python
import requests

def create_response(token, model, input_text):
    url = 'http://localhost:3000/api/responses'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    data = {
        "model": model,
        "input": input_text
    }
    response = requests.post(url, headers=headers, json=data)
    return response.json()
```
- **Streaming Example**:
```python
import requests

def stream_response(token, model, input_text):
    url = 'http://localhost:3000/api/responses'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    data = {
        "model": model,
        "input": input_text,
        "stream": True
    }
    response = requests.post(url, headers=headers, json=data, stream=True)
    for line in response.iter_lines():
        if line:
            print(line.decode('utf-8'))
```
:::info
The Responses API endpoint supports the same model-based routing as Chat Completions. If you have multiple OpenAI-compatible backends configured, the request is automatically routed based on which backend hosts the specified model. Azure OpenAI deployments are also supported with appropriate API version handling.
:::
### 🔧 Filter and Function Behavior with API Requests
When using the API endpoints directly, filters (Functions) behave differently than when requests come from the web interface.
:::info Authentication Note
Open WebUI accepts both **API keys** (prefixed with `sk-`) and **JWT tokens** for API authentication. This is intentional—the web interface uses JWT tokens internally for the same API endpoints. Both authentication methods provide equivalent API access.
:::
#### Filter Execution
| Filter Function | WebUI Request | Direct API Request |
|----------------|--------------|-------------------|
| `inlet()` | ✅ Runs | ✅ Runs |
| `stream()` | ✅ Runs | ✅ Runs |
| `outlet()` | ✅ Runs | ❌ **Does NOT run** |
The `inlet()` function always executes, making it ideal for:
- **Rate limiting** - Track and limit requests per user
- **Request logging** - Log all API usage for monitoring
- **Input validation** - Reject invalid requests before they reach the model
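As an illustration of the rate-limiting use case, here is a minimal sketch of an `inlet()`-based limiter. It assumes the usual Open WebUI filter shape (a `Filter` class whose `inlet()` receives the request body and the `__user__` dict); the limit of 10 requests per minute is illustrative:

```python
import time

class Filter:
    def __init__(self):
        self.history = {}            # user id -> recent request timestamps
        self.max_per_minute = 10     # illustrative limit

    def inlet(self, body: dict, __user__: dict = None) -> dict:
        user_id = (__user__ or {}).get('id', 'anonymous')
        now = time.time()
        # Keep only requests from the last 60 seconds
        recent = [t for t in self.history.get(user_id, []) if now - t < 60]
        if len(recent) >= self.max_per_minute:
            # Raising here rejects the request before it reaches the model
            raise Exception('Rate limit exceeded: try again in a minute.')
        recent.append(now)
        self.history[user_id] = recent
        return body
```

Because `inlet()` runs for both WebUI and direct API requests, this limiter applies uniformly to both traffic sources.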
#### Triggering Outlet Processing
The `outlet()` function only runs when the WebUI calls `/api/chat/completed` after a chat finishes. For direct API requests, you must call this endpoint yourself if you need outlet processing:
- **Endpoint**: `POST /api/chat/completed`
- **Description**: Triggers outlet filter processing for a completed chat
- **Curl Example**:
```bash
curl -X POST http://localhost:3000/api/chat/completed \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.1",
"messages": [
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi! How can I help you today?"}
],
"chat_id": "optional-uuid",
"session_id": "optional-session-id"
}'
```
- **Python Example**:
```python
import requests

def complete_chat_with_outlet(token, model, messages, chat_id=None):
    """
    Call after receiving the full response from /api/chat/completions
    to trigger outlet filter processing.
    """
    url = 'http://localhost:3000/api/chat/completed'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    payload = {
        'model': model,
        'messages': messages  # Include the full conversation with assistant response
    }
    if chat_id:
        payload['chat_id'] = chat_id
    response = requests.post(url, headers=headers, json=payload)
    return response.json()
```
:::tip
For more details on writing filters that work with API requests, see the [Filter Function documentation](/features/plugin/functions/filter#-filter-behavior-with-api-requests).
:::
### 🦙 Ollama API Proxy Support
If you want to interact directly with Ollama models—including for embedding generation or raw prompt streaming—Open WebUI offers a transparent passthrough to the native Ollama API via a proxy route.
- **Base URL**: `/ollama/<api>`
- **Reference**: [Ollama API Documentation](https://github.com/ollama/ollama/blob/main/docs/api.md)
#### 🔁 Generate Completion (Streaming)
```bash
curl http://localhost:3000/ollama/api/generate \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.2",
"prompt": "Why is the sky blue?"
}'
```
#### 📦 List Available Models
```bash
curl http://localhost:3000/ollama/api/tags \
-H "Authorization: Bearer YOUR_API_KEY"
```
#### 🧠 Generate Embeddings
```bash
curl -X POST http://localhost:3000/ollama/api/embed \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.2",
"input": ["Open WebUI is great!", "Let'\''s generate embeddings."]
}'
```
:::info
When using the Ollama Proxy endpoints, you **must** include the `Content-Type: application/json` header for POST requests, or the API may fail to parse the body. Authorization headers are also required if your instance is secured.
:::
This is ideal for building search indexes, retrieval systems, or custom pipelines using Ollama models behind the Open WebUI.
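As a sketch of how the embeddings endpoint can feed such a pipeline, the snippet below requests vectors via the proxy and compares texts with cosine similarity. The response field name `embeddings` follows the native Ollama API; adjust it if your version differs:

```python
import math
import requests

def get_embeddings(token, texts, model='llama3.2',
                   base_url='http://localhost:3000'):
    # Proxied call to Ollama's /api/embed through Open WebUI
    response = requests.post(
        f'{base_url}/ollama/api/embed',
        headers={'Authorization': f'Bearer {token}',
                 'Content-Type': 'application/json'},
        json={'model': model, 'input': texts},
    )
    response.raise_for_status()
    return response.json()['embeddings']

def cosine_similarity(a, b):
    # Standard cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Ranking stored document vectors by `cosine_similarity` against a query vector is the core of a simple retrieval index.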
### 🧩 Retrieval Augmented Generation (RAG)
The Retrieval Augmented Generation (RAG) feature allows you to enhance responses by incorporating data from external sources. Below, you will find the methods for managing files and knowledge collections via the API, and how to use them in chat completions effectively.
#### Uploading Files
To utilize external data in RAG responses, you first need to upload the files. The content of the uploaded file is automatically extracted and stored in a vector database.
- **Endpoint**: `POST /api/v1/files/`
- **Query Parameters**:
- `process` (boolean, default: `true`): Whether to extract content and compute embeddings
- `process_in_background` (boolean, default: `true`): Whether to process asynchronously
- **Curl Example**:
```bash
curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Accept: application/json" \
-F "file=@/path/to/your/file" http://localhost:3000/api/v1/files/
```
- **Python Example**:
```python
import requests

def upload_file(token, file_path):
    url = 'http://localhost:3000/api/v1/files/'
    headers = {
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json'
    }
    # Use a context manager so the file handle is closed after upload
    with open(file_path, 'rb') as f:
        response = requests.post(url, headers=headers, files={'file': f})
    return response.json()
```
:::warning Async Processing and Race Conditions
By default, file uploads are processed **asynchronously**. The upload endpoint returns immediately with a file ID, but content extraction and embedding computation continue in the background.
If you attempt to add the file to a knowledge base before processing completes, you will receive a `400` error:
```
The content provided is empty. Please ensure that there is text or data present before proceeding.
```
**You must wait for file processing to complete before adding files to knowledge bases.** See the [Checking File Processing Status](#checking-file-processing-status) section below.
:::
#### Checking File Processing Status
Before adding a file to a knowledge base, verify that processing has completed using the status endpoint.
- **Endpoint**: `GET /api/v1/files/{id}/process/status`
- **Query Parameters**:
- `stream` (boolean, default: `false`): If `true`, returns a Server-Sent Events (SSE) stream
**Status Values:**
| Status | Description |
|--------|-------------|
| `pending` | File is still being processed |
| `completed` | Processing finished successfully |
| `failed` | Processing failed (check `error` field for details) |
- **Python Example** (Polling):
```python
import requests
import time

def wait_for_file_processing(token, file_id, timeout=300, poll_interval=2):
    """
    Wait for a file to finish processing.

    Returns:
        dict: Final status with 'status' key ('completed' or 'failed')
    Raises:
        TimeoutError: If processing doesn't complete within timeout
    """
    url = f'http://localhost:3000/api/v1/files/{file_id}/process/status'
    headers = {'Authorization': f'Bearer {token}'}
    start_time = time.time()
    while time.time() - start_time < timeout:
        response = requests.get(url, headers=headers)
        result = response.json()
        status = result.get('status')
        if status == 'completed':
            return result
        elif status == 'failed':
            raise Exception(f"File processing failed: {result.get('error')}")
        time.sleep(poll_interval)
    raise TimeoutError(f"File processing did not complete within {timeout} seconds")
```
- **Python Example** (SSE Streaming):
```python
import requests
import json

def wait_for_file_processing_stream(token, file_id):
    """
    Wait for file processing using Server-Sent Events stream.
    More efficient than polling for long-running operations.
    """
    url = f'http://localhost:3000/api/v1/files/{file_id}/process/status?stream=true'
    headers = {'Authorization': f'Bearer {token}'}
    with requests.get(url, headers=headers, stream=True) as response:
        for line in response.iter_lines():
            if line:
                line = line.decode('utf-8')
                if line.startswith('data: '):
                    data = json.loads(line[6:])
                    status = data.get('status')
                    if status == 'completed':
                        return data
                    elif status == 'failed':
                        raise Exception(f"File processing failed: {data.get('error')}")
    raise Exception("Stream ended unexpectedly")
```
#### Adding Files to Knowledge Collections
After uploading, you can group files into a knowledge collection or reference them individually in chats.
:::important
**Always wait for file processing to complete before adding files to a knowledge base.** Files that are still processing will have empty content, causing a `400` error. Use the status endpoint described above to verify the file status is `completed`.
:::
- **Endpoint**: `POST /api/v1/knowledge/{id}/file/add`
- **Curl Example**:
```bash
curl -X POST http://localhost:3000/api/v1/knowledge/{knowledge_id}/file/add \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"file_id": "your-file-id-here"}'
```
- **Python Example**:
```python
import requests

def add_file_to_knowledge(token, knowledge_id, file_id):
    url = f'http://localhost:3000/api/v1/knowledge/{knowledge_id}/file/add'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    data = {'file_id': file_id}
    response = requests.post(url, headers=headers, json=data)
    return response.json()
```
#### Complete Workflow Example
Here's a complete example that uploads a file, waits for processing, and adds it to a knowledge base:
```python
import requests
import time

WEBUI_URL = 'http://localhost:3000'
TOKEN = 'your-api-key-here'

def upload_and_add_to_knowledge(file_path, knowledge_id, timeout=300):
    """
    Upload a file and add it to a knowledge base.
    Properly waits for processing to complete before adding.
    """
    headers = {
        'Authorization': f'Bearer {TOKEN}',
        'Accept': 'application/json'
    }

    # Step 1: Upload the file
    with open(file_path, 'rb') as f:
        response = requests.post(
            f'{WEBUI_URL}/api/v1/files/',
            headers=headers,
            files={'file': f}
        )
    if response.status_code != 200:
        raise Exception(f"Upload failed: {response.text}")
    file_data = response.json()
    file_id = file_data['id']
    print(f"File uploaded with ID: {file_id}")

    # Step 2: Wait for processing to complete
    print("Waiting for file processing...")
    start_time = time.time()
    while time.time() - start_time < timeout:
        status_response = requests.get(
            f'{WEBUI_URL}/api/v1/files/{file_id}/process/status',
            headers=headers
        )
        status_data = status_response.json()
        status = status_data.get('status')
        if status == 'completed':
            print("File processing completed!")
            break
        elif status == 'failed':
            raise Exception(f"Processing failed: {status_data.get('error')}")
        time.sleep(2)  # Poll every 2 seconds
    else:
        raise TimeoutError("File processing timed out")

    # Step 3: Add to knowledge base
    add_response = requests.post(
        f'{WEBUI_URL}/api/v1/knowledge/{knowledge_id}/file/add',
        headers={**headers, 'Content-Type': 'application/json'},
        json={'file_id': file_id}
    )
    if add_response.status_code != 200:
        raise Exception(f"Failed to add to knowledge: {add_response.text}")
    print("File successfully added to knowledge base!")
    return add_response.json()

# Usage
result = upload_and_add_to_knowledge('/path/to/document.pdf', 'your-knowledge-id')
```
#### Using Files and Collections in Chat Completions
You can reference either individual files or entire collections in your RAG queries for enriched responses.
##### Using an Individual File in Chat Completions
This method is beneficial when you want to focus the chat model's response on the content of a specific file.
- **Endpoint**: `POST /api/chat/completions`
- **Curl Example**:
```bash
curl -X POST http://localhost:3000/api/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4-turbo",
"messages": [
{"role": "user", "content": "Explain the concepts in this document."}
],
"files": [
{"type": "file", "id": "your-file-id-here"}
]
}'
```
- **Python Example**:
```python
import requests

def chat_with_file(token, model, query, file_id):
    url = 'http://localhost:3000/api/chat/completions'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    payload = {
        'model': model,
        'messages': [{'role': 'user', 'content': query}],
        'files': [{'type': 'file', 'id': file_id}]
    }
    response = requests.post(url, headers=headers, json=payload)
    return response.json()
```
##### Using a Knowledge Collection in Chat Completions
Leverage a knowledge collection to enhance the response when the inquiry may benefit from a broader context or multiple documents.
- **Endpoint**: `POST /api/chat/completions`
- **Curl Example**:
```bash
curl -X POST http://localhost:3000/api/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4-turbo",
"messages": [
{"role": "user", "content": "Provide insights on the historical perspectives covered in the collection."}
],
"files": [
{"type": "collection", "id": "your-collection-id-here"}
]
}'
```
- **Python Example**:
```python
import requests

def chat_with_collection(token, model, query, collection_id):
    url = 'http://localhost:3000/api/chat/completions'
    headers = {
        'Authorization': f'Bearer {token}',
        'Content-Type': 'application/json'
    }
    payload = {
        'model': model,
        'messages': [{'role': 'user', 'content': query}],
        'files': [{'type': 'collection', 'id': collection_id}]
    }
    response = requests.post(url, headers=headers, json=payload)
    return response.json()
```
These methods let you bring external knowledge into chats through uploaded files and curated knowledge collections via the Open WebUI API. Whether you reference files individually or group them into collections, you can tailor the integration to your needs.
## Advantages of Using Open WebUI as a Unified LLM Provider
Open WebUI offers several advantages for developers and businesses alike:
- **Unified Interface**: Simplify your interactions with different LLMs through a single, integrated platform.
- **Ease of Implementation**: Quick start integration with comprehensive documentation and community support.
By following these guidelines, you can swiftly integrate and begin utilizing the Open WebUI API. Should you encounter any issues or have questions, feel free to reach out through our Discord Community or consult the FAQs. Happy coding! 🌟



@@ -0,0 +1,64 @@
---
slug: /getting-started/advanced-topics/https-encryption
sidebar_position: 6
title: "Enabling HTTPS Encryption"
---
# Secure Your Open WebUI with HTTPS 🔒
While **HTTPS is not strictly required** for basic local operation, it is **highly recommended** for all deployments and **mandatory** for enabling specific features like Voice Calls.
:::warning Critical Feature Dependency
Modern browsers require a **Secure Context** (HTTPS) to access the microphone.
**Voice Calls will NOT work** if you access Open WebUI via `http://` (unless using `localhost`).
:::
## Why HTTPS Matters 🛡️
Enabling HTTPS encryption provides essential benefits:
1. **🔒 Privacy & Security**: Encrypts all data between the user and the server, protecting chat history and credentials.
2. **🎤 Feature Unlocking**: Satisfies the browser's secure-context requirement for Microphone (Voice Mode) and Camera access.
3. **💪 Integrity**: Ensures data is not tampered with in transit.
4. **✅ Trust**: Displays the padlock icon, reassuring users that the service is secure.
## Choosing Your Solution 🛠️
The best method depends on your infrastructure.
### 🏠 For Local/Docker Users
If you are running Open WebUI with Docker, the standard approach is to use a **Reverse Proxy**. This sits in front of Open WebUI and handles the SSL encryption.
* **[Nginx](/tutorials/https/nginx)**: The industry standard. Highly configurable, great performance.
* **[Caddy](/tutorials/https/caddy)**: **Easiest option**. Automatically obtains and renews Let's Encrypt certificates with minimal config.
* **[HAProxy](/tutorials/https/haproxy)**: Robust choice for advanced load balancing needs.
### ☁️ For Cloud Deployments
* **Cloud Load Balancers**: (AWS ALB, Google Cloud Load Balancing) often handle SSL termination natively.
* **Cloudflare Tunnel**: Excellent for exposing localhost to the web securely without opening ports.
### 🧪 For Development
* **Ngrok**: Good for quickly testing Voice features locally. *Not for production.*
## 📚 Implementation Guides
Ready to set it up? Check out our dedicated tutorial category for step-by-step configurations:
<div className="card-grid">
<a className="card" href="/tutorials/https/nginx">
<h3>Nginx Setup</h3>
<p>Manual control and high performance.</p>
</a>
<a className="card" href="/tutorials/https/caddy">
<h3>Caddy Setup</h3>
<p>Zero-config automatic HTTPS.</p>
</a>
<a className="card" href="/tutorials/https/">
<h3>📂 View All HTTPS Tutorials</h3>
<p>Browse the full category of guides.</p>
</a>
</div>


@@ -0,0 +1,251 @@
---
slug: /getting-started/advanced-topics/monitoring
sidebar_position: 6
title: "API Keys & Monitoring"
---
# API Keys & Monitoring Your Open WebUI 🔑🩺
This guide covers two essential topics: setting up API keys for programmatic access to Open WebUI, and monitoring your instance to ensure reliability and performance.
**Why Monitor?**
- **Ensure Uptime:** Proactively detect outages and service disruptions.
- **Performance Insights:** Track response times and identify potential bottlenecks.
- **Early Issue Detection:** Catch problems before they impact users significantly.
- **Peace of Mind:** Gain confidence that your Open WebUI instance is running smoothly.
## 🚦 Levels of Monitoring
We'll cover three levels of monitoring, progressing from basic to more comprehensive:
1. **Basic Health Check:** Verifies if the Open WebUI service is running and responding.
2. **Model Connectivity Check:** Confirms that Open WebUI can connect to and list your configured models.
3. **Model Response Testing (Deep Health Check):** Ensures that models can actually process requests and generate responses.
## Level 1: Basic Health Check Endpoint ✅
The simplest level of monitoring is checking the `/health` endpoint. This endpoint is publicly accessible (no authentication required) and returns a `200 OK` status code when the Open WebUI service is running correctly.
**How to Test:**
You can use `curl` or any HTTP client to check this endpoint:
```bash
# Basic health check - no authentication needed
curl https://your-open-webui-instance/health
```
**Expected Output:** A successful health check will return a `200 OK` HTTP status code. The content of the response body is usually not important for a basic health check.
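If you prefer scripting your own probe over a dedicated tool, the same check fits in one small function (a sketch; pass your instance's base URL):

```python
import requests

def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    # The /health endpoint needs no authentication; any 200 means "up"
    try:
        response = requests.get(f'{base_url}/health', timeout=timeout)
        return response.status_code == 200
    except requests.RequestException:
        return False
```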
### Using Uptime Kuma for Basic Health Checks 🐻
[Uptime Kuma](https://github.com/louislam/uptime-kuma) is a fantastic, open-source, and easy-to-use self-hosted uptime monitoring tool. It's highly recommended for monitoring Open WebUI.
**Steps to Set Up in Uptime Kuma:**
1. **Add a New Monitor:** In your Uptime Kuma dashboard, click "Add New Monitor".
2. **Configure Monitor Settings:**
- **Monitor Type:** Select "HTTP(s)".
- **Name:** Give your monitor a descriptive name, e.g., "Open WebUI Health Check".
- **URL:** Enter the health check endpoint URL: `http://your-open-webui-instance:8080/health` (Replace `your-open-webui-instance:8080` with your actual Open WebUI address and port).
- **Monitoring Interval:** Set the frequency of checks (e.g., `60 seconds` for every minute).
- **Retry Count:** Set the number of retries before considering the service down (e.g., `3` retries).
**What This Check Verifies:**
- **Web Server Availability:** Ensures the web server (e.g., Nginx, Uvicorn) is responding to requests.
- **Application Running:** Confirms that the Open WebUI application itself is running and initialized.
- **Basic Database Connectivity:** Typically includes a basic check to ensure the application can connect to the database.
## Level 2: Open WebUI Model Connectivity 🔗
To go beyond basic availability, you can monitor the `/api/models` endpoint. This endpoint **requires authentication** and verifies that Open WebUI can successfully communicate with your configured model providers (e.g., Ollama, OpenAI) and retrieve a list of available models.
**Why Monitor Model Connectivity?**
- **Model Provider Issues:** Detect problems with your model provider services (e.g., API outages, authentication failures).
- **Configuration Errors:** Identify misconfigurations in your model provider settings within Open WebUI.
- **Ensure Model Availability:** Confirm that the models you expect to be available are actually accessible to Open WebUI.
**API Endpoint Details:**
See the [Open WebUI API documentation](https://docs.openwebui.com/getting-started/api-endpoints/#-retrieve-all-models) for full details about the `/api/models` endpoint and its response structure.
**How to Test with `curl` (Authenticated):**
You'll need an API key to access this endpoint. See the "Authentication Setup" section below for instructions on generating an API key.
```bash
# Authenticated model connectivity check
curl -H "Authorization: Bearer YOUR_API_KEY" https://your-open-webui-instance/api/models
```
*(Replace `YOUR_API_KEY` with your actual API key and `your-open-webui-instance` with your Open WebUI address.)*
**Expected Output:** A successful request will return a `200 OK` status code and a JSON response containing a list of models.
### Authentication Setup for API Key 🔑
Before you can monitor the `/api/models` endpoint, you need to configure API keys in Open WebUI. **API key access now requires a two-level permission structure**: first, the global API keys feature must be enabled, and second, individual users or groups must be granted API key creation permissions.
#### Step 1: Enable API Keys Globally (Admin Required)
1. Log in to Open WebUI as an **administrator**.
2. Click on your **profile icon** in the bottom-left corner of the sidebar, then select **Admin Panel**.
3. Navigate to **Settings** > **General**.
4. Scroll down to the **Authentication** section.
5. Find the **"Enable API Keys"** toggle and **turn it ON**.
6. *(Optional)* Configure additional API key restrictions:
- **API Key Endpoint Restrictions**: Enable this to limit which endpoints can be accessed via API keys.
- **Allowed Endpoints**: Specify a comma-separated list of allowed endpoints (e.g., `/api/v1/models,/api/v1/chat/completions`).
7. Click **Save** at the bottom of the page.
:::info
This enables the API key feature globally but does not automatically grant users permission to create API keys. You must also configure user or group permissions in Step 2.
:::
#### Step 2: Grant API Key Permission (Admin Required)
**API key creation is disabled by default for all users, including administrators.** Admins are **not** exempt from this permission requirement—to use API keys, they must also grant themselves the permission. Administrators can grant API key permissions using one of the following methods:
##### Option A: Grant Permission via Default Permissions
This grants the API Keys permission to **all users with the "user" role**:
1. In the **Admin Panel**, navigate to **Users** > **Groups**.
2. At the bottom of the Groups page, click on **"Default permissions"** (this applies to all users with the "user" role).
3. In the modal that opens, scroll to the **Features Permissions** section.
4. Find **"API Keys"** and **toggle it ON**.
5. Click **Save**.
:::info
**Note for Administrators:** "Default permissions" only applies to accounts with the "user" role. If you are an admin and need API key access, you must use **Option B** (User Groups)—create or select a group with API Keys enabled and add yourself to that group.
:::
:::warning
Enabling API Keys for all users means any user can generate API keys that provide programmatic access to Open WebUI with their account's permissions. Consider using User Groups (Option B) for more restrictive access control.
:::
##### Option B: Grant Permission via User Groups
For more granular control, you can grant API key permissions to specific user groups only:
1. In the **Admin Panel**, navigate to **Users** > **Groups**.
2. Select the group you want to grant API key permissions to (or click the **+ button** to create a new group).
3. In the group edit modal, click on the **Permissions** tab.
4. Scroll to **Features Permissions**.
5. Find **"API Keys"** and **toggle it ON**.
6. Click **Save**.
:::tip
Create a dedicated monitoring group (e.g., "Monitoring Users") and add only the accounts that need API key access for monitoring purposes. This follows the principle of least privilege.
:::
#### Step 3: Generate an API Key (User Action)
Once global API keys are enabled and the user has been granted permission:
1. Log in to Open WebUI with a user account that has API key permissions.
2. Click on your **profile icon** in the bottom-left corner of the sidebar.
3. Select **Settings** > **Account**.
4. In the **API Keys** section, click **Generate New API Key**.
5. Give the API key a descriptive name (e.g., "Monitoring API Key").
6. **Copy the generated API key** immediately and store it securely—you won't be able to view it again.
:::warning
Treat your API key like a password! Store it securely and never share it publicly. If you suspect an API key has been compromised, delete it immediately and generate a new one.
:::
#### Recommended: Create a Dedicated Monitoring Account
For production monitoring, we recommend:
1. Create a **non-administrator user account** specifically for monitoring (e.g., "monitoring-bot").
2. Add this account to a group with API key permissions (or ensure default permissions allow API key creation).
3. Generate an API key from this account.
This approach limits the potential impact if the monitoring API key is compromised—the attacker would only have access to the permissions granted to that specific monitoring account, not administrator privileges.
#### Troubleshooting
If you don't see the API key generation option in your account settings:
- **Check global setting**: Verify that an administrator has enabled API keys globally under **Admin Panel** > **Settings** > **General** > **Enable API Keys**. See [`ENABLE_API_KEYS`](/getting-started/env-configuration#enable_api_keys).
- **Check your permissions**: Verify that your user account or group has been granted the "API Keys" feature permission under **Features Permissions**. See [`USER_PERMISSIONS_FEATURES_API_KEYS`](/getting-started/env-configuration#user_permissions_features_api_keys).
### Using Uptime Kuma for Model Connectivity Monitoring 🐻
1. **Create a New Monitor in Uptime Kuma:**
- Monitor Type: "HTTP(s) - JSON Query".
- Name: "Open WebUI Model Connectivity Check".
- URL: `http://your-open-webui-instance:8080/api/models` (Replace with your URL).
- Method: "GET".
- Expected Status Code: `200`.
2. **Configure JSON Query (Verify Model List):**
- **JSON Query:** `$count(data[*])>0`
- **Explanation:** This JSONata query checks if the `data` array in the API response (which contains the list of models) has a count greater than 0. In other words, it verifies that at least one model is returned.
- **Expected Value:** `true` (The query should return `true` if models are listed).
3. **Add Authentication Headers:**
- In the "Headers" section of the Uptime Kuma monitor configuration, click "Add Header".
- **Header Name:** `Authorization`
- **Header Value:** `Bearer YOUR_API_KEY` (Replace `YOUR_API_KEY` with the API key you generated).
4. **Set Monitoring Interval:** Recommended interval: `300 seconds` (5 minutes) or longer, as model lists don't typically change very frequently.
**Alternative JSON Queries (Advanced):**
You can use more specific JSONata queries to check for particular models or providers. Here are some examples:
- **Check for at least one Ollama model:** `$count(data[owned_by='ollama'])>0`
- **Check if a specific model exists (e.g., 'gpt-4o'):** `$exists(data[id='gpt-4o'])`
- **Check if multiple specific models exist (e.g., 'gpt-4o' and 'gpt-4o-mini'):** `$count(data[id in ['gpt-4o', 'gpt-4o-mini']]) = 2`
You can test and refine your JSONata queries at [jsonata.org](https://try.jsonata.org/) using a sample API response to ensure they work as expected.
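The same first-level check can also be scripted outside Uptime Kuma. A minimal sketch using `jq` (assuming `jq` is installed; the filter mirrors the JSONata query above, and the sample response below is illustrative):

```bash
# Dry-run the filter against a sample /api/models response
# (model entries here are illustrative):
SAMPLE='{"data":[{"id":"llama3.1","owned_by":"ollama"},{"id":"gpt-4o","owned_by":"openai"}]}'
echo "$SAMPLE" | jq -e '.data | length > 0'   # prints: true

# Against a live instance (placeholders: URL and API key):
# curl -s -H "Authorization: Bearer YOUR_API_KEY" \
#   http://localhost:3000/api/models | jq -e '.data | length > 0'
```

Because `jq -e` exits non-zero when the result is `false` or `null`, the pipeline can be dropped directly into cron jobs or other shell-based health checks.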
## Level 3: Model Response Testing (Deep Health Check) 🤖
For the most comprehensive monitoring, you can test if models are actually capable of processing requests and generating responses. This involves sending a simple chat completion request to the `/api/chat/completions` endpoint.
**Why Test Model Responses?**
- **End-to-End Verification:** Confirms that the entire model pipeline is working, from API request to model response.
- **Model Loading Issues:** Detects problems with specific models failing to load or respond.
- **Backend Processing Errors:** Catches errors in the backend logic that might prevent models from generating completions.
**How to Test with `curl` (Authenticated POST Request):**
This test requires an API key and sends a POST request with a simple message to the chat completions endpoint.
```bash
# Test model response - authenticated POST request.
# Replace "llama3.1" with a model you expect to be available;
# temperature 0 keeps the response deterministic.
curl -X POST https://your-open-webui-instance/api/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Respond with the word HEALTHY"}],
    "model": "llama3.1",
    "temperature": 0
  }'
```
*(Replace `YOUR_API_KEY`, `your-open-webui-instance`, and `llama3.1` with your actual values.)*
**Expected Output:** A successful request will return a `200 OK` status code and a JSON response containing a chat completion. You can verify that the response includes the word "HEALTHY" (or a similar expected response based on your prompt).
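To automate this deep check outside Uptime Kuma, the request and the verification can be wrapped in a small script. This is a sketch, assuming `jq` is installed; the URL, API key, and model name are placeholders, and the canned `RESPONSE` shows the OpenAI-compatible shape the check relies on:

```bash
#!/usr/bin/env bash
# Deep health check sketch (placeholders: URL, key, model).
# Uncomment the curl call to query a live instance; the canned
# RESPONSE below stands in for a real completion.
set -euo pipefail

# RESPONSE=$(curl -s -X POST https://your-open-webui-instance/api/chat/completions \
#   -H "Authorization: Bearer YOUR_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d '{"messages":[{"role":"user","content":"Respond with the word HEALTHY"}],"model":"llama3.1","temperature":0}')
RESPONSE='{"choices":[{"message":{"role":"assistant","content":"HEALTHY"}}]}'

# OpenAI-compatible completions put the text at .choices[0].message.content;
# grep -q makes the script exit non-zero if HEALTHY is missing.
echo "$RESPONSE" | jq -er '.choices[0].message.content' | grep -q "HEALTHY" \
  && echo "model OK"
```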
Setting up Level 3 monitoring in Uptime Kuma involves configuring an HTTP(s) monitor with a POST request, a JSON request body, authentication headers, and optionally a JSON query that validates the response content. This is a more advanced setup and can be customized to your specific needs.
By implementing these monitoring levels, you can proactively ensure the health, reliability, and performance of your Open WebUI instance, providing a consistently positive experience for users.
---
slug: /getting-started/advanced-topics/monitoring/opentelemetry
sidebar_position: 7
title: "OpenTelemetry"
---
Open WebUI supports **distributed tracing and metrics** export via the OpenTelemetry (OTel) protocol (OTLP). This enables integration with modern observability stacks such as **Grafana LGTM (Loki, Grafana, Tempo, Mimir)**, as well as **Jaeger**, **Tempo**, and **Prometheus** to monitor requests, database/Redis queries, response times, and more in real-time.
:::warning Additional Dependencies
If you are running Open WebUI from source or via `pip` (outside of the official Docker images), OpenTelemetry dependencies **may not be installed by default**. You may need to install them manually:
```bash
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
```
:::
## 🚀 Quick Start with Docker Compose
The fastest way to get started with observability is with the pre-configured Docker Compose:
```bash
# Spin up Open WebUI and the latest Grafana LGTM stack, all-in-one
docker compose -f docker-compose.otel.yaml up -d
```
The `docker-compose.otel.yaml` file sets up these components:
| Service | Port(s) | Description |
|-------------|------------------------------------------|------------------------------------------------------|
| **grafana** | 3000 (UI), 4317 (OTLP/gRPC), 4318 (HTTP) | Grafana LGTM (Loki+Grafana+Tempo+Mimir) all-in-one |
| **open-webui** | 8088 (default) → 8080 | WebUI, OTEL enabled, exposes on host port 8088 |
After startup, access the Grafana dashboard at [http://localhost:3000](http://localhost:3000)
Login: `admin` / `admin`
## ⚙️ Environment Variables
You can configure OpenTelemetry in Open WebUI with these environment variables (as used in the Compose file):
| Variable | Default | Description |
|--------------------------------------|---------------------------------|-----------------------------------------------------|
| `ENABLE_OTEL` | **true** in Compose | Master switch to enable OpenTelemetry setup |
| `ENABLE_OTEL_TRACES` | **true** in Compose | Enable distributed tracing export |
| `ENABLE_OTEL_METRICS` | **true** in Compose | Enable FastAPI HTTP metrics export |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | `http://grafana:4317` in Compose| OTLP gRPC/HTTP Collector endpoint URL |
| `OTEL_EXPORTER_OTLP_INSECURE` | **true** in Compose | Insecure (no TLS) connection for OTLP |
| `OTEL_SERVICE_NAME` | `open-webui` | Service name (tagged in traces and metrics) |
| `OTEL_BASIC_AUTH_USERNAME` / `OTEL_BASIC_AUTH_PASSWORD` | *(empty)* | Basic Auth credentials if Collector requires them |
:::tip
Override defaults in your `.env` file or Compose file as needed.
:::
```yaml
open-webui:
environment:
- ENABLE_OTEL=true
- ENABLE_OTEL_TRACES=true
- ENABLE_OTEL_METRICS=true
- OTEL_EXPORTER_OTLP_INSECURE=true # Use insecure connection for OTLP, you may want to remove this in production
- OTEL_EXPORTER_OTLP_ENDPOINT=http://grafana:4317
- OTEL_SERVICE_NAME=open-webui
# You may set OTEL_BASIC_AUTH_USERNAME/PASSWORD here if needed
```
## 📊 Data Collection
### Distributed Tracing
The Open WebUI backend automatically instruments:
- **FastAPI** (routes)
- **SQLAlchemy** (database queries)
- **Redis**
- **requests**, **httpx**, **aiohttp** (external calls)
Each trace span includes rich data such as:
- `db.instance`, `db.statement`, `redis.args`
- `http.url`, `http.method`, `http.status_code`
- Error details (`error.message`, `error.kind`) on exceptions
### Metrics Collection
WebUI exports the following metrics via OpenTelemetry:
| Instrument | Type | Unit | Labels |
|------------------------|-----------|------|--------------------------------------|
| `http.server.requests` | Counter | 1 | `http.method`, `http.route`, `http.status_code` |
| `http.server.duration` | Histogram | ms | (same as above) |
Metrics are sent via OTLP (default every 10 seconds) and can be visualized in **Grafana** (via Prometheus/Mimir).
## 🔧 Custom Collector Setup
To use a different (external) OpenTelemetry Collector/Stack:
```bash
docker run -d --name open-webui \
-p 8088:8080 \
-e ENABLE_OTEL=true \
-e ENABLE_OTEL_TRACES=true \
-e ENABLE_OTEL_METRICS=true \
-e OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4317 \
-e OTEL_EXPORTER_OTLP_INSECURE=true \
-e OTEL_SERVICE_NAME=open-webui \
-v open-webui:/app/backend/data \
ghcr.io/open-webui/open-webui:main
```
## 🚨 Troubleshooting
**Traces/metrics not appearing in Grafana?**
- Double-check `ENABLE_OTEL`, `ENABLE_OTEL_TRACES`, and `ENABLE_OTEL_METRICS` are all set to `true`
- Is the endpoint correct? (`OTEL_EXPORTER_OTLP_ENDPOINT`)
- Inspect logs from Open WebUI (`docker logs open-webui`) for OTLP errors
- The Collector's OTLP port (`4317`) should be open and reachable. Note that `4317` speaks gRPC rather than plain HTTP, so test raw TCP reachability instead, e.g.:
  `nc -zv localhost 4317` (replace host as needed)
**Authentication required?**
- Set `OTEL_BASIC_AUTH_USERNAME` and `OTEL_BASIC_AUTH_PASSWORD` for auth-protected collectors
- If using SSL/TLS, adjust or remove `OTEL_EXPORTER_OTLP_INSECURE` as appropriate
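For example, a collector behind TLS and basic auth could be wired up like this (the endpoint and credentials are illustrative; `OTEL_EXPORTER_OTLP_INSECURE` is omitted so TLS stays enabled):

```bash
docker run -d --name open-webui \
  -p 8088:8080 \
  -e ENABLE_OTEL=true \
  -e ENABLE_OTEL_TRACES=true \
  -e ENABLE_OTEL_METRICS=true \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=https://your-collector:4317 \
  -e OTEL_BASIC_AUTH_USERNAME=otel-user \
  -e OTEL_BASIC_AUTH_PASSWORD=change-me \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```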
---
slug: /getting-started/advanced-topics/network-diagrams
sidebar_position: 3
title: "Network Diagrams"
---
Here we provide clear, structured diagrams to help you understand how the various network components interact in different setups. This documentation covers both macOS/Windows and Linux users; each scenario is illustrated with a Mermaid diagram showing how the interactions depend on the system configuration and deployment strategy.
## Mac OS/Windows Setup Options 🖥️
### Ollama on Host, Open WebUI in Container
In this scenario, `Ollama` runs directly on the host machine while `Open WebUI` operates within a Docker container.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Mac OS/Windows") {
Person(user, "User")
Boundary(b1, "Docker Desktop's Linux VM") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
}
Component(ollama, "Ollama", "Listening on port 11434")
}
Rel(openwebui, ollama, "Makes API calls via Docker proxy", "http://host.docker.internal:11434")
Rel(user, openwebui, "Makes requests via exposed port -p 3000:8080", "http://localhost:3000")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
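The `docker run` that produces this layout might look like the following sketch — the `-p 3000:8080` mapping and the `host.docker.internal` URL are the ones shown in the diagram:

```bash
# Open WebUI in a container, Ollama on the host (macOS/Windows):
docker run -d --name open-webui \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```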
### Ollama and Open WebUI in Compose Stack
Both `Ollama` and `Open WebUI` are configured within the same Docker Compose stack, simplifying network communications.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Mac OS/Windows") {
Person(user, "User")
Boundary(b1, "Docker Desktop's Linux VM") {
Boundary(b2, "Compose Stack") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
Component(ollama, "Ollama", "Listening on port 11434")
}
}
}
Rel(openwebui, ollama, "Makes API calls via inter-container networking", "http://ollama:11434")
Rel(user, openwebui, "Makes requests via exposed port -p 3000:8080", "http://localhost:3000")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
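A minimal Compose sketch matching this diagram — the service name `ollama` is what the inter-container URL `http://ollama:11434` resolves against (image tags and volume names are illustrative):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
volumes:
  ollama:
  open-webui:
```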
### Ollama and Open WebUI, Separate Networks
Here, `Ollama` and `Open WebUI` are deployed in separate Docker networks, potentially leading to connectivity issues.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Mac OS/Windows") {
Person(user, "User")
Boundary(b1, "Docker Desktop's Linux VM") {
Boundary(b2, "Network A") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
}
Boundary(b3, "Network B") {
Component(ollama, "Ollama", "Listening on port 11434")
}
}
}
Rel(openwebui, ollama, "Unable to connect")
Rel(user, openwebui, "Makes requests via exposed port -p 3000:8080", "http://localhost:3000")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
### Open WebUI in Host Network
In this configuration, `Open WebUI` uses the host network. On macOS/Windows this breaks connectivity, because the "host" network belongs to Docker Desktop's Linux VM rather than to the machine itself.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Mac OS/Windows") {
Person(user, "User")
Boundary(b1, "Docker Desktop's Linux VM") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
}
}
Rel(user, openwebui, "Unable to connect, host network is the VM's network")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
## Linux Setup Options 🐧
### Ollama on Host, Open WebUI in Container (Linux)
This diagram is specific to the Linux platform, with `Ollama` running on the host and `Open WebUI` deployed inside a Docker container.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Linux") {
Person(user, "User")
Boundary(b1, "Container Network") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
}
Component(ollama, "Ollama", "Listening on port 11434")
}
Rel(openwebui, ollama, "Makes API calls via Docker proxy", "http://host.docker.internal:11434")
Rel(user, openwebui, "Makes requests via exposed port -p 3000:8080", "http://localhost:3000")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
### Ollama and Open WebUI in Compose Stack (Linux)
A set-up where both `Ollama` and `Open WebUI` reside within the same Docker Compose stack, allowing for straightforward networking on Linux.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Linux") {
Person(user, "User")
Boundary(b1, "Container Network") {
Boundary(b2, "Compose Stack") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
Component(ollama, "Ollama", "Listening on port 11434")
}
}
}
Rel(openwebui, ollama, "Makes API calls via inter-container networking", "http://ollama:11434")
Rel(user, openwebui, "Makes requests via exposed port -p 3000:8080", "http://localhost:3000")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
### Ollama and Open WebUI, Separate Networks (Linux)
A scenario in which `Ollama` and `Open WebUI` are in different Docker networks under a Linux environment, which could hinder connectivity.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Linux") {
Person(user, "User")
Boundary(b2, "Container Network A") {
Component(openwebui, "Open WebUI", "Listening on port 8080")
}
Boundary(b3, "Container Network B") {
Component(ollama, "Ollama", "Listening on port 11434")
}
}
Rel(openwebui, ollama, "Unable to connect")
Rel(user, openwebui, "Makes requests via exposed port -p 3000:8080", "http://localhost:3000")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
### Open WebUI in Host Network, Ollama on Host (Linux)
An optimal layout where both `Open WebUI` and `Ollama` use the host's network, enabling seamless interaction on Linux systems.
```mermaid
C4Context
Boundary(b0, "Hosting Machine - Linux") {
Person(user, "User")
Component(openwebui, "Open WebUI", "Listening on port 8080")
Component(ollama, "Ollama", "Listening on port 11434")
}
Rel(openwebui, ollama, "Makes API calls via localhost", "http://localhost:11434")
Rel(user, openwebui, "Makes requests via listening port", "http://localhost:8080")
UpdateRelStyle(user, openwebui, $offsetX="-100", $offsetY="-50")
```
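On Linux this layout can be reproduced with `--network=host`, after which Open WebUI is reachable directly on port 8080. A sketch (note that `--network=host` does not behave this way inside Docker Desktop's VM on macOS/Windows):

```bash
# Both services share the host network (Linux only):
docker run -d --name open-webui \
  --network=host \
  -e OLLAMA_BASE_URL=http://localhost:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
```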
Each setup addresses different deployment strategies and networking configurations to help you choose the best layout for your requirements.