---
sidebar_position: 10
title: "Autocomplete"
---

# ✨ Autocomplete
Open WebUI offers an **AI-powered Autocomplete** feature that suggests text completions in real time as you type your prompt. It acts like a "Copilot" for your chat input, helping you craft prompts faster using your configured task model.

## How It Works

When enabled, Open WebUI monitors your input in the chat box. When you pause typing, it sends your current text to a lightweight **Task Model**. This model predicts the likely next words or sentences, which appear as "ghost text" overlaying your input (the pattern is sketched after the list below).

- **Accept Suggestion**: Press `Tab` (or the `Right Arrow` key) to accept the suggestion.
- **Reject/Ignore**: Simply keep typing to overwrite the suggestion.
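
For intuition, here is a minimal sketch of the debounce-and-accept pattern described above. It is illustrative only: the `/api/tasks/autocomplete` endpoint, the payload shape, and the 300 ms pause are assumptions, not Open WebUI's actual internals.

```typescript
// Minimal sketch of debounced "ghost text" autocomplete (illustrative only;
// the endpoint, payload, and timing below are assumptions, not Open WebUI's
// actual implementation).

let debounceTimer: ReturnType<typeof setTimeout> | undefined;
let ghostText = ""; // suggestion rendered as grey overlay text

function onUserTyped(currentInput: string): void {
  ghostText = ""; // typing discards any pending suggestion
  if (debounceTimer !== undefined) clearTimeout(debounceTimer);

  // Only query the Task Model once the user pauses for ~300 ms.
  debounceTimer = setTimeout(async () => {
    const res = await fetch("/api/tasks/autocomplete", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt: currentInput }),
    });
    const { completion } = await res.json();
    ghostText = completion ?? ""; // shown overlaying the input box
  }, 300);
}

// Tab (or Right Arrow) accepts the ghost text; any other key simply
// keeps typing over it.
function onKeyDown(event: KeyboardEvent, input: HTMLTextAreaElement): void {
  if ((event.key === "Tab" || event.key === "ArrowRight") && ghostText) {
    event.preventDefault();
    input.value += ghostText;
    ghostText = "";
  }
}
```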
:::info
**Performance Recommendation**

Autocomplete functionality relies heavily on the response speed of your **Task Model**. We recommend using a small, fast, **non-reasoning** model to ensure suggestions appear instantly.

**Recommended Models:**

- **Llama 3.2** (1B or 3B)
- **Qwen 3** (0.6B or 3B)
- **Gemma 3** (1B or 4B)
- **GPT-5 Nano** (Optimized for low latency)

Avoid using "Reasoning" models (e.g., o1, o3) or heavy Chain-of-Thought models for this feature, as the latency will make the autocomplete experience sluggish.
:::
## Configuration

The Autocomplete feature is controlled by a two-layer system: **Global** availability and **User** preference.

### 1. Global Configuration (Admin)

Admins control whether the autocomplete feature is available on the server.

**Admin Panel Settings:**
Go to **Admin Settings > Interface > Task Model** and toggle **Autocomplete Generation**.

### 2. User Configuration (Personal)

Even if enabled globally, individual users can turn it off for themselves if they find it distracting.

- Go to **Settings > Interface**.
- Toggle **Autocomplete Generation**.

:::note
If the Admin has disabled Autocomplete globally, users will **not** be able to enable it in their personal settings.
:::
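
In effect, the feature is active only when both toggles agree. A minimal sketch of the two-layer check (the setting names below are hypothetical, chosen for illustration; they are not Open WebUI's real configuration keys):

```typescript
// Sketch of the two-layer toggle logic. The field names are hypothetical,
// chosen for illustration; they are not Open WebUI's real setting keys.
interface AutocompleteSettings {
  globalAutocompleteEnabled: boolean; // Admin Settings > Interface
  userAutocompleteEnabled: boolean;   // Settings > Interface (per user)
}

function isAutocompleteActive(s: AutocompleteSettings): boolean {
  // The admin toggle gates everything: when it is off, the user's
  // personal preference is ignored and the feature stays disabled.
  return s.globalAutocompleteEnabled && s.userAutocompleteEnabled;
}
```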
## Performance & Troubleshooting

### Why aren't suggestions appearing?

1. **Check Settings**: Ensure the feature is enabled in **both** Admin and User settings.
2. **Task Model**: Go to **Admin Settings > Interface** and verify a **Task Model** is selected. If no model is selected, the feature cannot generate predictions.
3. **Latency**: If your Task Model is large or running on slow hardware, predictions might arrive too late to be useful. Switch to a smaller model.
4. **Reasoning Models**: Ensure you are **not** using a "Reasoning" model (like o1 or o3), as their internal thought process creates excessive latency that breaks real-time autocomplete.

### Performance Impact

Autocomplete sends a request to your LLM essentially every time you pause typing (debounced).

- **Local Models**: This can consume significant GPU/CPU resources on the host machine.
- **API Providers**: This will generate a high volume of API calls (though usually with very short token counts). Be mindful of your provider's **Rate Limits** (Requests Per Minute/RPM and Tokens Per Minute/TPM) to avoid interruptions; the sketch below gives a rough sense of the volume.
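
As a rough back-of-envelope check (every figure below is an assumed example, not a measurement):

```typescript
// Back-of-envelope estimate of autocomplete API load. Every number here
// is an assumed example, not a measured value.
const activeUsers = 10;        // users typing at the same time
const pausesPerMinute = 6;     // debounced pauses per user per minute
const tokensPerRequest = 150;  // short prompt plus a short completion

const requestsPerMinute = activeUsers * pausesPerMinute;      // 60 RPM
const tokensPerMinute = requestsPerMinute * tokensPerRequest; // 9,000 TPM

console.log(`~${requestsPerMinute} RPM, ~${tokensPerMinute} TPM`);
// Compare these figures against your provider's RPM/TPM limits before
// enabling Autocomplete for a whole team.
```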
:::warning
For multi-user instances running on limited local hardware, we recommend **disabling** Autocomplete to prioritize resources for actual chat generation.
:::