Mirror of https://github.com/open-webui/docs.git (synced 2025-12-12 07:29:49 +07:00)
Update env-configuration.mdx
@@ -2167,25 +2167,94 @@ Note: this configuration assumes that AWS credentials will be available to your
 
 - Type: `str`
 - Default: `http://docling:5001`
-- Description: Specifies the URL for the Docling server. Requires Docling version 1.0.0 or later.
+- Description: Specifies the URL for the Docling server. Requires Docling version 2.0.0 or later for full compatibility with the new parameter-based configuration system.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
-#### `DOCLING_OCR_ENGINE`
+:::warning
+
+**Docling 2.0.0+ Required**
+
+The Docling integration has been refactored to use server-side parameter passing. If you are using Docling:
+
+1. Upgrade to Docling server version 2.0.0 or later
+2. Migrate all individual `DOCLING_*` configuration variables to the `DOCLING_PARAMS` JSON object
+3. Remove all deprecated `DOCLING_*` environment variables from your configuration
+4. Add `DOCLING_API_KEY` if your server requires authentication
+
+The old individual environment variables (`DOCLING_OCR_ENGINE`, `DOCLING_OCR_LANG`, etc.) are no longer supported and will be ignored.
+
+:::
+
+#### `DOCLING_API_KEY`
 
 - Type: `str`
-- Default: `tesseract`
-- Description: Specifies the OCR engine used by Docling.
-  Supported values include: `tesseract` (default), `easyocr`, `ocrmac`, `rapidocr`, and `tesserocr`.
+- Default: `None`
+- Description: Sets the API key for authenticating with the Docling server. Required when the Docling server has authentication enabled.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
-#### `DOCLING_OCR_LANG`
+#### `DOCLING_PARAMS`
 
+- Type: `str` (JSON)
+- Default: `{}`
+- Description: Specifies all Docling processing parameters in JSON format. This is the primary configuration method for Docling processing options. All previously individual Docling settings are now configured through this single JSON object.
+
+**Supported Parameters:**
+- `do_ocr` (bool): Enable OCR processing
+- `force_ocr` (bool): Force OCR even when text layer exists
+- `ocr_engine` (str): OCR engine to use (`tesseract`, `easyocr`, `ocrmac`, `rapidocr`, `tesserocr`)
+- `ocr_lang` (str): OCR language codes (e.g., `eng,fra,deu,spa`)
+- `pdf_backend` (str): PDF processing backend
+- `table_mode` (str): Table extraction mode
+- `pipeline` (str): Processing pipeline to use
+- `do_picture_description` (bool): Enable image description generation
+- `picture_description_mode` (str): Mode for picture descriptions
+- `picture_description_local` (str): Local model for picture descriptions
+- `picture_description_api` (str): API endpoint for picture descriptions
+- `vlm_pipeline_model_api` (str): Vision-language model API configuration
+
+- Example:
+```json
+{
+  "do_ocr": true,
+  "ocr_engine": "tesseract",
+  "ocr_lang": "eng,fra,deu,spa",
+  "force_ocr": false,
+  "do_picture_description": true,
+  "picture_description_mode": "api",
+  "vlm_pipeline_model_api": "openai://gpt-4o"
+}
+```
+
-- Type: `str`
-- Default: `eng,fra,deu,spa` (when using the default `tesseract` engine)
-- Description: Specifies the OCR language(s) to be used with the configured `DOCLING_OCR_ENGINE`.
-  The format and available language codes depend on the selected OCR engine.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
+:::info
+
+**Migration from Individual Docling Variables**
+
+If you were previously using individual `DOCLING_*` environment variables (such as `DOCLING_OCR_ENGINE`, `DOCLING_OCR_LANG`, etc.), these are now deprecated. You must migrate to using `DOCLING_PARAMS` as a single JSON configuration object.
+
+**Example Migration:**
+```bash
+# Old configuration (deprecated)
+DOCLING_OCR_ENGINE=tesseract
+DOCLING_OCR_LANG=eng,fra
+DOCLING_DO_OCR=true
+
+# New configuration (required)
+DOCLING_PARAMS='{"do_ocr": true, "ocr_engine": "tesseract", "ocr_lang": "eng,fra"}'
+```
+
+:::
+
+:::warning
+
+When setting this environment variable in a `.env` file, ensure proper JSON formatting and escape quotes as needed:
+```
+DOCLING_PARAMS="{\"do_ocr\": true, \"ocr_engine\": \"tesseract\", \"ocr_lang\": \"eng,fra,deu,spa\"}"
+```
+
+:::
+
 ## Retrieval Augmented Generation (RAG)
 
 ### Core Configuration
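
The migration this diff describes (folding the removed `DOCLING_*` variables into one `DOCLING_PARAMS` JSON object) can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the commit or of Open WebUI. The function names are hypothetical. The legacy-to-key mapping covers only the three variables shown in the migration example. The "unknown parameter" check only compares keys against the list in the diff; whether the server rejects other keys is not stated in the source.

```python
import json
import shlex

# Parameter names and types copied from the "Supported Parameters" list in the diff.
SUPPORTED_PARAMS = {
    "do_ocr": bool,
    "force_ocr": bool,
    "ocr_engine": str,
    "ocr_lang": str,
    "pdf_backend": str,
    "table_mode": str,
    "pipeline": str,
    "do_picture_description": bool,
    "picture_description_mode": str,
    "picture_description_local": str,
    "picture_description_api": str,
    "vlm_pipeline_model_api": str,
}

# Assumption: mapping inferred from the three variables in the migration example.
LEGACY_TO_PARAM = {
    "DOCLING_OCR_ENGINE": "ocr_engine",
    "DOCLING_OCR_LANG": "ocr_lang",
    "DOCLING_DO_OCR": "do_ocr",
}


def migrate_docling_env(env):
    """Collect legacy DOCLING_* values from a dict of env vars into a params dict."""
    params = {}
    for name, key in LEGACY_TO_PARAM.items():
        if name in env:
            value = env[name]
            if SUPPORTED_PARAMS[key] is bool:
                # Env values are strings; treat "true" (any case) as True.
                value = value.strip().lower() == "true"
            params[key] = value
    return params


def check_docling_params(raw):
    """Parse a DOCLING_PARAMS string and return a list of problems (empty if none)."""
    try:
        params = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(params, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for key, value in params.items():
        expected = SUPPORTED_PARAMS.get(key)
        if expected is None:
            problems.append(f"unknown parameter: {key}")
        elif not isinstance(value, expected):
            problems.append(f"{key} should be {expected.__name__}")
    return problems


def docling_params_line(params):
    """Render a DOCLING_PARAMS assignment quoted for a POSIX shell."""
    return "DOCLING_PARAMS=" + shlex.quote(json.dumps(params, sort_keys=True))


if __name__ == "__main__":
    old_env = {
        "DOCLING_OCR_ENGINE": "tesseract",
        "DOCLING_OCR_LANG": "eng,fra",
        "DOCLING_DO_OCR": "true",
    }
    print(docling_params_line(migrate_docling_env(old_env)))
```

Generating the JSON with `json.dumps` avoids hand-escaping quotes. The rendered line is quoted for a POSIX shell; `.env` parsers differ in how they treat quotes, so check the result against the loader you actually use.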