open-webui-docs/docs/features/chat-features/autocomplete.md
DrMelone 227808f251 update
2025-12-20 20:57:29 +01:00


---
sidebar_position: 10
title: "Autocomplete"
---
# ✨ Autocomplete
Open WebUI offers an **AI-powered Autocomplete** feature that suggests text completions in real time as you type your prompt. It acts like a "Copilot" for your chat input, helping you craft prompts faster using your configured task model.
## How It Works
When enabled, Open WebUI monitors your input in the chat box. When you pause typing, it sends your current text to a lightweight **Task Model**. This model predicts the likely next words or sentences, which appear as "ghost text" overlaying your input.
- **Accept Suggestion**: Press `Tab` (or the `Right Arrow` key) to accept the suggestion.
- **Reject/Ignore**: Simply keep typing to overwrite the suggestion.
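
The pause-then-request flow and the accept step can be sketched as a small simulation. This is illustrative only and is not Open WebUI's implementation: the function names, the keystroke timings, and the 500 ms pause are assumptions made for the example, and the actual debounce interval is not stated here.

```python
from typing import List


def debounce_fire_times(keystrokes_ms: List[int], pause_ms: int) -> List[int]:
    """Return the times (ms) at which a debounced request would be sent.

    A request fires pause_ms after a keystroke only if no newer keystroke
    arrives within that window; a newer keystroke cancels the pending one.
    Assumes keystrokes_ms is sorted in ascending order.
    """
    fire_times = []
    for i, t in enumerate(keystrokes_ms):
        is_last = i == len(keystrokes_ms) - 1
        if is_last or keystrokes_ms[i + 1] - t >= pause_ms:
            fire_times.append(t + pause_ms)
    return fire_times


def accept_suggestion(typed: str, ghost_text: str) -> str:
    """Model pressing Tab: the ghost text is appended to the typed input."""
    return typed + ghost_text


# Hypothetical timings: a burst of three keystrokes, a pause, then two more.
keystrokes = [0, 100, 200, 1000, 1100]
fires = debounce_fire_times(keystrokes, pause_ms=500)  # 500 ms is an assumed value
print(fires)  # one request per pause, not one per keystroke
print(accept_suggestion("Summarize this", " article in three bullet points"))
```

Five keystrokes produce only two requests here, which is the point of debouncing: the model is queried once per pause rather than once per character.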
:::info
**Performance Recommendation**
Autocomplete functionality relies heavily on the response speed of your **Task Model**. We recommend using a small, fast, **non-reasoning** model to ensure suggestions appear instantly.
**Recommended Models:**
- **Llama 3.2** (1B or 3B)
- **Qwen 3** (0.6B or 1.7B)
- **Gemma 3** (1B or 4B)
- **GPT-5 Nano** (Optimized for low latency)
Avoid using "Reasoning" models (e.g., o1, o3) or heavy Chain-of-Thought models for this feature, as the latency will make the autocomplete experience sluggish.
:::
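One way to compare candidate Task Models is to time a handful of short completions and look at the median. The sketch below is a minimal, assumption-laden example: `fake_task_model` is a hypothetical stand-in that only sleeps, and you would replace it with a real call to whichever backend serves your Task Model.

```python
import statistics
import time
from typing import Callable, List


def median_latency_ms(complete: Callable[[str], str], prompts: List[str]) -> float:
    """Time each call to `complete` and return the median latency in ms."""
    samples = []
    for prompt in prompts:
        start = time.perf_counter()
        complete(prompt)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def fake_task_model(prompt: str) -> str:
    """Hypothetical stand-in; replace with a real call to your Task Model."""
    time.sleep(0.01)  # simulate roughly 10 ms of inference
    return prompt + " ..."


latency = median_latency_ms(fake_task_model, ["Write a", "Explain how", "Summarize"])
print(f"median latency: {latency:.1f} ms")
```

The median is used rather than the mean so that a single slow first call (for example, a model being loaded into memory) does not dominate the result.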
## Configuration
The Autocomplete feature is controlled by a two-layer system: **Global** availability and **User** preference.
### 1. Global Configuration (Admin)
Admins control whether the autocomplete feature is available on the server.
**Admin Panel Settings:**
Go to **Admin Settings > Interface > Task Model** and toggle **Autocomplete Generation**.
### 2. User Configuration (Personal)
Even if enabled globally, individual users can turn it off for themselves if they find it distracting.
- Go to **Settings > Interface**.
- Toggle **Autocomplete Generation**.
:::note
If the Admin has disabled Autocomplete globally, users will **not** be able to enable it in their personal settings.
:::
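The two-layer rule amounts to a logical AND of the two toggles. The snippet below is a sketch of that rule only; the function and parameter names are hypothetical and do not correspond to Open WebUI's internal code.

```python
def autocomplete_active(global_enabled: bool, user_enabled: bool) -> bool:
    """Autocomplete runs only when both the admin and the user allow it."""
    return global_enabled and user_enabled


# Print the full truth table for the two toggles.
for global_enabled in (True, False):
    for user_enabled in (True, False):
        state = "on" if autocomplete_active(global_enabled, user_enabled) else "off"
        print(f"global={global_enabled!s:<5} user={user_enabled!s:<5} -> {state}")
```

Only the first row (both toggles enabled) yields suggestions; the user setting can narrow what the admin allows but never widen it.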
## Performance & Troubleshooting
### Why aren't suggestions appearing?
1. **Check Settings**: Ensure **Autocomplete Generation** is enabled in **both** Admin and User settings.
2. **Task Model**: Go to **Admin Settings > Interface** and verify a **Task Model** is selected. If no model is selected, the feature cannot generate predictions.
3. **Latency**: If your Task Model is large or running on slow hardware, predictions might arrive too late to be useful. Switch to a smaller model.
4. **Reasoning Models**: Ensure you are **not** using a "Reasoning" model (like o1 or o3), as their internal thought process creates excessive latency that breaks real-time autocomplete.
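
The checklist above can be expressed as a small diagnostic helper. This is a sketch under stated assumptions: the function, the 1000 ms latency budget, and the name-prefix heuristic for spotting reasoning models are all hypothetical, and the prefixes simply reuse the o1 and o3 examples from the list.

```python
from typing import List, Optional

# Hypothetical name prefixes, taken from the examples in the checklist.
REASONING_MODEL_PREFIXES = ("o1", "o3")


def diagnose(
    admin_enabled: bool,
    user_enabled: bool,
    task_model: Optional[str],
    median_latency_ms: float,
    latency_budget_ms: float = 1000.0,  # assumed budget, tune to taste
) -> List[str]:
    """Return the checklist items that currently fail, in checklist order."""
    issues = []
    if not (admin_enabled and user_enabled):
        issues.append("Autocomplete is not enabled in both Admin and User settings")
    if not task_model:
        issues.append("No Task Model is selected")
        return issues  # the remaining checks need a model
    if median_latency_ms > latency_budget_ms:
        issues.append("Task Model is too slow; try a smaller model")
    if task_model.lower().startswith(REASONING_MODEL_PREFIXES):
        issues.append("Task Model looks like a reasoning model")
    return issues


print(diagnose(True, True, "llama3.2:1b", 200.0))
print(diagnose(True, True, "o3-mini", 5000.0))
```

An empty list means none of the four checks failed; the prefix match is a crude heuristic and will not recognize every reasoning model.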
### Performance Impact
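Autocomplete sends a request to your Task Model each time you pause typing. Requests are debounced, so a burst of keystrokes triggers one request rather than one per character, but the volume still adds up quickly.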
- **Local Models**: This can consume significant GPU/CPU resources on the host machine.
- **API Providers**: This will generate a high volume of API calls (though usually with very short token counts). Be mindful of your provider's **Rate Limits** (Requests Per Minute/RPM and Tokens Per Minute/TPM) to avoid interruptions.
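
A back-of-the-envelope estimate can show whether autocomplete traffic fits within a provider's limits. Every number below (user count, pauses per minute, tokens per request, and the limits themselves) is a hypothetical placeholder; substitute your own usage figures and your provider's published limits.

```python
def estimated_rpm(active_users: int, pauses_per_user_per_minute: int) -> int:
    """Each typing pause triggers roughly one autocomplete request."""
    return active_users * pauses_per_user_per_minute


def estimated_tpm(rpm: int, tokens_per_request: int) -> int:
    """Tokens per minute, counting prompt and completion tokens together."""
    return rpm * tokens_per_request


# Hypothetical workload: 20 users, 6 pauses per minute each, 150 tokens per call.
rpm = estimated_rpm(active_users=20, pauses_per_user_per_minute=6)
tpm = estimated_tpm(rpm, tokens_per_request=150)

# Hypothetical provider limits.
RPM_LIMIT = 500
TPM_LIMIT = 30_000

print(f"requests/min: {rpm} (limit {RPM_LIMIT})")
print(f"tokens/min:   {tpm} (limit {TPM_LIMIT})")
print("within limits:", rpm <= RPM_LIMIT and tpm <= TPM_LIMIT)
```

Remember that regular chat traffic counts against the same limits, so leave headroom rather than sizing autocomplete to the full quota.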
:::warning
For multi-user instances running on limited local hardware, we recommend **disabling** Autocomplete to prioritize resources for actual chat generation.
:::