diff --git a/en/use-dify/knowledge/create-knowledge/import-text-data/readme.mdx b/en/use-dify/knowledge/create-knowledge/import-text-data/readme.mdx index f29c9281..fc0136ec 100644 --- a/en/use-dify/knowledge/create-knowledge/import-text-data/readme.mdx +++ b/en/use-dify/knowledge/create-knowledge/import-text-data/readme.mdx @@ -54,5 +54,5 @@ When quick-creating a knowledge base, you can upload local files as its data sou - `![alt text](image_url "optional title")` - If you select a multimodal embedding model (indicated by the **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval. + If you select a multimodal embedding model (marked with a **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval. \ No newline at end of file diff --git a/en/use-dify/knowledge/create-knowledge/setting-indexing-methods.mdx b/en/use-dify/knowledge/create-knowledge/setting-indexing-methods.mdx index 8c9a0a84..539aca28 100644 --- a/en/use-dify/knowledge/create-knowledge/setting-indexing-methods.mdx +++ b/en/use-dify/knowledge/create-knowledge/setting-indexing-methods.mdx @@ -22,7 +22,7 @@ The knowledge base offers two index methods: **High-Quality** and **Economical** Think of these vectors as coordinates in a multi-dimensional space—the closer two points are, the more similar their meanings. This allows the system to find relevant information based on semantic similarity, not just exact keyword matches. - To enable cross-modal retrieval—retrieving both text and images based on semantic relevance—select a multimodal embedding model (marked with the **Vision** icon). Images extracted from documents will then be embedded and indexed for retrieval. + To enable cross-modal retrieval—retrieving both text and images based on semantic relevance—select a multimodal embedding model (marked with a **Vision** icon). Images extracted from documents will then be embedded and indexed for retrieval.
Knowledge bases using such embedding models are labeled **Multimodal** on their cards. @@ -95,7 +95,7 @@ Both retrieval methods are supported in Dify’s knowledge base. The specific re **Rerank Model**: Disabled by default. When enabled, a third-party Rerank model will sort the text chunks returned by Vector Search to optimize results. This helps the LLM access more precise information and improve output quality. Before enabling this option, go to **Settings** → **Model Providers** and configure the Rerank model’s API key. - If the selected embedding model is multimodal, select a multimodal rerank model (indicated by the **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. + If the selected embedding model is multimodal, select a multimodal rerank model (marked with a **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. > Enabling this feature will consume tokens from the Rerank model. For more details, refer to the associated model’s pricing page. @@ -117,7 +117,7 @@ Both retrieval methods are supported in Dify’s knowledge base. The specific re **Rerank Model**: Disabled by default. When enabled, a third-party Rerank model will sort the text chunks returned by Full-Text Search to optimize results. This helps the LLM access more precise information and improve output quality. Before enabling this option, go to **Settings** → **Model Providers** and configure the Rerank model’s API key. - If the selected embedding model is multimodal, select a multimodal rerank model (indicated by the **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. + If the selected embedding model is multimodal, select a multimodal rerank model (marked with a **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results.
> Enabling this feature will consume tokens from the Rerank model. For more details, refer to the associated model’s pricing page. @@ -158,7 +158,7 @@ In this mode, you can specify **"Weight settings"** without needing to configure Disabled by default. When enabled, a third-party Rerank model will sort the text chunks returned by Hybrid Search to optimize results. This helps the LLM access more precise information and improve output quality. Before enabling this option, go to **Settings** → **Model Providers** and configure the Rerank model’s API key. - If the selected embedding model is multimodal, select a multimodal rerank model (indicated by the **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. + If the selected embedding model is multimodal, select a multimodal rerank model (marked with a **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. > Enabling this feature will consume tokens from the Rerank model. For more details, refer to the associated model’s pricing page. diff --git a/en/use-dify/knowledge/knowledge-pipeline/knowledge-pipeline-orchestration.mdx b/en/use-dify/knowledge/knowledge-pipeline/knowledge-pipeline-orchestration.mdx index e786019f..56973713 100644 --- a/en/use-dify/knowledge/knowledge-pipeline/knowledge-pipeline-orchestration.mdx +++ b/en/use-dify/knowledge/knowledge-pipeline/knowledge-pipeline-orchestration.mdx @@ -213,7 +213,7 @@ If no images are extracted by the selected processor, Dify will automatically ex - Maximum number of attachments per chunk: `SINGLE_CHUNK_ATTACHMENT_LIMIT` -If you select a multimodal embedding model (indicated by the **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval.
+If you select a multimodal embedding model (marked with a **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval. @@ -398,7 +398,7 @@ The knowledge base provides two index methods: **High Quality** and **Economical The High Quality method uses embedding models to convert chunks into numerical vectors, helping to compress and store large amounts of information more effectively. This enables the system to find semantically relevant accurate answers even when the user's question wording doesn't exactly match the document. - To enable cross-modal retrieval—retrieving both text and images based on semantic relevance—select a multimodal embedding model (marked with the **Vision** icon). Images extracted from documents will then be embedded and indexed for retrieval. + To enable cross-modal retrieval—retrieving both text and images based on semantic relevance—select a multimodal embedding model (marked with a **Vision** icon). Images extracted from documents will then be embedded and indexed for retrieval. Knowledge bases using such embedding models are labeled **Multimodal** on their cards. @@ -419,7 +419,7 @@ In the Economical method, each block uses 10 keywords for retrieval without call | Economical | Inverted Index | Common search engine retrieval method, matches queries with key content | - If the selected embedding model is multimodal, select a multimodal rerank model (indicated by the **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. + If the selected embedding model is multimodal, select a multimodal rerank model (marked with a **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the retrieval results. You can also refer to the table below for information on configuring chunk structure, index methods, parameters, and retrieval settings. 
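The "vectors as coordinates in a multi-dimensional space" description in the setting-indexing-methods hunk above can be sketched in a few lines. This is an illustrative example only: the three-dimensional "embeddings" below are made-up toy values, not real model output (Dify's embedding models produce vectors with hundreds or thousands of dimensions), but the similarity measure works the same way.

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors: values near 1.0 mean the
    # vectors point in nearly the same direction (similar meaning).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (hypothetical values for illustration).
query   = [0.9, 0.1, 0.2]
chunk_1 = [0.8, 0.2, 0.1]   # semantically close to the query
chunk_2 = [0.1, 0.9, 0.8]   # semantically distant from the query

print(cosine_similarity(query, chunk_1) > cosine_similarity(query, chunk_2))  # True
```

Ranking chunks by a score like this is how vector search can surface relevant content even when the query shares no exact keywords with the document; a rerank model then re-scores the top candidates with a more expensive, query-aware comparison.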
diff --git a/en/use-dify/knowledge/manage-knowledge/maintain-knowledge-documents.mdx b/en/use-dify/knowledge/manage-knowledge/maintain-knowledge-documents.mdx index 5b81a7ce..470f9cd1 100644 --- a/en/use-dify/knowledge/manage-knowledge/maintain-knowledge-documents.mdx +++ b/en/use-dify/knowledge/manage-knowledge/maintain-knowledge-documents.mdx @@ -44,7 +44,7 @@ From the chunk list within a document, you can view and manage all its chunks to | Enable / Disable | Temporarily include or exclude a chunk from retrieval. Disabled chunks cannot be edited.| | Edit | Modify the content of a chunk. Edited chunks are marked **Edited**.

For documents chunked with Parent-child mode: When images in documents are extracted as chunk attachments, their URLs remain in the chunk text. Deleting these URLs won't affect the extracted image attachments.| | Add / Edit / Delete Keywords | In knowledge bases using the Economical index method, you can add or modify keywords for each chunk to improve its retrievability.

Each chunk can have up to 10 keywords.| -| Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.

Image attachments and their chunks can be edited independently without affecting each other. Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.

For self-hosted deployments, you can adjust this limit via the environment variable `SINGLE_CHUNK_ATTACHMENT_LIMIT`.
To enable cross-modal retrieval—retrieving both text and images based on semantic relevance, choose a multimodal embedding model (indicated by the **Vision** icon) for the knowledge base.

Image attachments will then be embedded and indexed for retrieval.
| +| Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.

Image attachments and their chunks can be edited independently without affecting each other. Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.

For self-hosted deployments, you can adjust this limit via the environment variable `SINGLE_CHUNK_ATTACHMENT_LIMIT`.
To enable cross-modal retrieval—retrieving both text and images based on semantic relevance—choose a multimodal embedding model (marked with a **Vision** icon) for the knowledge base.

Image attachments will then be embedded and indexed for retrieval.
| ## Best Practices diff --git a/en/use-dify/nodes/knowledge-retrieval.mdx b/en/use-dify/nodes/knowledge-retrieval.mdx index 947ec8d6..834edb2f 100644 --- a/en/use-dify/nodes/knowledge-retrieval.mdx +++ b/en/use-dify/nodes/knowledge-retrieval.mdx @@ -54,7 +54,7 @@ Provide the query content that the node should search for in the selected knowle The **Query Images** option is available only when at least one multimodal knowledge base is added. - Such knowledge bases are marked with the **Vision** icon, indicating that they are using a multimodal embedding model. + Such knowledge bases are marked with a **Vision** icon, indicating that they use a multimodal embedding model. ### Select Knowledge to Search @@ -64,7 +64,7 @@ Add one or more existing knowledge bases for the node to search for content rele When multiple knowledge bases are added, knowledge is first retrieved from all of them simultaneously, then combined and processed according to the [node-level retrieval settings](#configure-node-level-retrieval-settings). - Knowledge bases marked with the **Vision** icon support cross-modal retrieval—retrieving both text and images based on semantic relevance. + Knowledge bases marked with a **Vision** icon support cross-modal retrieval—retrieving both text and images based on semantic relevance. @@ -94,7 +94,7 @@ Further fine-tune how the node processes retrieval results after they are fetche - **Rerank Model**: The rerank model to re-score and reorder all the results based on their relevance to the query. - If any multimodal knowledge bases are added, select a multimodal rerank model (indicated by the **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the final output. + If any multimodal knowledge bases are added, select a multimodal rerank model (marked with a **Vision** icon) as well. Otherwise, retrieved images will be excluded from reranking and the final output.
- **Top K**: The maximum number of top results to return after reranking. When a rerank model is selected, this value will be automatically adjusted based on the model's maximum input capacity (how much text the model can process at once). @@ -125,7 +125,7 @@ To use the retrieval results as context to answer user questions in an LLM node: 2. In the prompt field, reference both the `Context` variable and the user input variable (e.g., `userinput.query` in Chatflows). -3. (Optional) If the LLM supports vision capabilities (indicated by a **Vision** tag), enable **Vision** to let it interpret the retrieved images. +3. (Optional) If the LLM supports vision capabilities (marked with a **Vision** icon), enable **Vision** to let it interpret the retrieved images. Once **Vision** is enabled, the LLM automatically processes the retrieved images. You don't need to manually reference the `Context` variable again in the **Vision** input field.
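The `SINGLE_CHUNK_ATTACHMENT_LIMIT` variable mentioned in the hunks above is set like any other Dify environment variable in a self-hosted deployment. A minimal sketch, assuming the standard Docker Compose setup where configuration lives in the `.env` file of the `docker/` directory (the value `20` is a hypothetical example; the documented default per-chunk limit is 10):

```shell
# docker/.env of a self-hosted Dify deployment
# Raise the per-chunk image attachment cap from the default of 10 to 20.
SINGLE_CHUNK_ATTACHMENT_LIMIT=20
```

After editing `.env`, recreate the containers (for example `docker compose down && docker compose up -d`) so the new limit takes effect.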