Add more notes about the extracted image url

Riskey
2025-12-26 11:48:58 +08:00
parent 47b663d99e
commit a82cdffc83
3 changed files with 15 additions and 13 deletions


@@ -30,6 +30,10 @@ When quick-creating a knowledge base, you can upload local files as its data sou
 JPG, JPEG, PNG, and GIF images under 2 MB are automatically extracted as attachments to their corresponding chunks. These images can be managed independently and are returned alongside their chunks during retrieval.
+URLs of extracted images remain in the chunk text, but you can safely remove these URLs to keep the text clean—this won't affect the extracted images.
+If you select a multimodal embedding model (marked with a **Vision** icon) in index settings, the extracted images will also be embedded and indexed for retrieval.
 Each chunk supports up to 10 image attachments; images beyond this limit will not be extracted.
 <Tip>
@@ -44,15 +48,11 @@ When quick-creating a knowledge base, you can upload local files as its data sou
 - Images embedded in DOCX files
-<Note>
-Images embedded in other file types (e.g., PDF) can only be extracted by using appropriate document extraction plugins in [knowledge pipelines](/en/use-dify/knowledge/knowledge-pipeline/readme).
-</Note>
+<Tip>
+Images embedded in other file types (e.g., PDF) can be extracted by using appropriate document extraction plugins in [knowledge pipelines](/en/use-dify/knowledge/knowledge-pipeline/readme).
+</Tip>
 - Images referenced via accessible URLs using the following Markdown syntax in any file type:
 - `![alt text](image_url)`
 - `![alt text](image_url "optional title")`
-<Tip>
-If you select a multimodal embedding model (marked with a **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval.
-</Tip>
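
For readers who post-process chunk text themselves, the two Markdown image forms listed in this hunk can be matched with a regular expression. The following is a minimal sketch only: the pattern and the helper names `extract_image_urls` and `strip_image_references` are illustrative assumptions and are not taken from Dify's code. It shows how image URLs might be collected and then removed from a chunk's text, which the docs say is safe once the images have been extracted as attachments.

```python
import re

# Illustrative pattern (an assumption, not Dify's implementation).
# Matches ![alt text](image_url) and ![alt text](image_url "optional title").
MD_IMAGE = re.compile(
    r'!\[(?P<alt>[^\]]*)\]'          # ![alt text]
    r'\((?P<url>[^\s)]+)'            # (image_url
    r'(?:\s+"(?P<title>[^"]*)")?\)'  # optional "title", then closing )
)


def extract_image_urls(chunk_text: str) -> list[str]:
    """Return every image URL referenced with Markdown image syntax."""
    return [m.group("url") for m in MD_IMAGE.finditer(chunk_text)]


def strip_image_references(chunk_text: str) -> str:
    """Remove Markdown image references, leaving the remaining text as-is."""
    return MD_IMAGE.sub("", chunk_text)


sample = (
    'Intro ![a](https://example.com/a.png) mid '
    '![b](https://example.com/b.jpg "Title") end'
)
urls = extract_image_urls(sample)
cleaned = strip_image_references(sample)
```

The pattern is deliberately simple: it does not handle URLs containing spaces or parentheses, so treat it as a starting point rather than a full Markdown parser.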


@@ -196,7 +196,9 @@ You can choose Dify's Doc Extractor to process files, or select tools based on y
 <Accordion title="For images in documents">
-Images in documents can be extracted using appropriate doc processors. Extracted images are attached to their corresponding chunks, can be managed independently, and are returned alongside those chunks during retrieval.
+Images in documents can be extracted using appropriate document processors. Extracted images are attached to their corresponding chunks, can be managed independently, and are returned alongside those chunks during retrieval.
+URLs of extracted images remain in the chunk text, but you can safely remove these URLs to keep the text clean—this won't affect the extracted images.
 Each chunk supports up to 10 image attachments; images beyond this limit will not be extracted.
@@ -213,7 +215,7 @@ If no images are extracted by the selected processor, Dify will automatically ex
 - Maximum number of attachments per chunk: `SINGLE_CHUNK_ATTACHMENT_LIMIT`
 </Tip>
-If you select a multimodal embedding model (marked with a **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval.
+If you select a multimodal embedding model (marked with a **Vision** icon) in index settings, the extracted images will also be embedded and indexed for retrieval.
 </Accordion>
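
The per-chunk attachment cap described in this hunk can be pictured as follows. This is a hedged sketch, not Dify's implementation: only the variable name `SINGLE_CHUNK_ATTACHMENT_LIMIT` and the documented default of 10 come from the text above, while the helper names and the choice to keep the first images in document order are assumptions made for illustration.

```python
import os

DEFAULT_ATTACHMENT_LIMIT = 10  # default stated in the docs


def attachment_limit(env=None) -> int:
    """Read the per-chunk cap from the environment, falling back to 10.

    `env` is a hypothetical parameter that lets callers pass a plain dict
    instead of os.environ (useful in tests).
    """
    env = os.environ if env is None else env
    return int(env.get("SINGLE_CHUNK_ATTACHMENT_LIMIT", DEFAULT_ATTACHMENT_LIMIT))


def select_attachments(image_urls, limit):
    """Keep at most `limit` images for one chunk.

    Assumption: the first images in document order are the ones kept;
    the docs only say that images beyond the limit are not extracted.
    """
    return list(image_urls)[:limit]


found = [f"https://example.com/img_{i}.png" for i in range(12)]
kept = select_attachments(found, attachment_limit({}))
```

With no override set, 12 candidate images yield 10 attachments; a self-hosted deployment that sets the variable to a different value would see the cap change accordingly.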


@@ -32,7 +32,7 @@ According to its chunk settings, every document is split into content chunks—t
 From the chunk list within a document, you can view and manage all its chunks to improve the retrieval efficiency and accuracy.
 <Tip>
-Click the document name in the upperleft corner to quickly switch between documents.
+Click the document name in the upper-left corner to quickly switch between documents.
 </Tip>
 ![Manage Knowledge Chunks](/images/manage_document_chunks.png)
@@ -42,9 +42,9 @@ From the chunk list within a document, you can view and manage all its chunks to
 | Add | Add one or batch add multiple new chunks. <br/><br/>For documents chunked with Parent-child mode, both new parent and child chunks can be added. <Info>*Add chunks* is a paid feature on Dify Cloud. [Upgrade to Professional or Team](https://dify.ai/pricing) to use it.</Info>|
 | Delete | Permanently remove a chunk. **Deletion cannot be undone**.|
 | Enable / Disable | Temporarily include or exclude a chunk from retrieval. Disabled chunks cannot be edited.|
-| Edit | Modify the content of a chunk. Edited chunks are marked **Edited**.<br/><br/>For documents chunked with Parent-child mode: <ul><li>When editing a parent chunk, you can choose to regenerate its child chunks or keep them unchanged.</li><li>Editing a child chunk does not update its parent chunk. </li></ul><Tip>When images in documents are extracted as chunk attachments, their URLs remain in the chunk text. Deleting these URLs won't affect the extracted image attachments.</Tip>|
+| Edit | Modify the content of a chunk. Edited chunks are marked **Edited**.<br/><br/>For documents chunked with Parent-child mode: <ul><li>When editing a parent chunk, you can choose to regenerate its child chunks or keep them unchanged.</li><li>Editing a child chunk does not update its parent chunk. </li></ul>|
 | Add / Edit / Delete Keywords | In knowledge bases using the Economical index method, you can add or modify keywords for each chunk to improve its retrievability. <br/><br/>Each chunk can have up to 10 keywords.|
-| Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.<br/><br/>Image attachments and their chunks can be edited independently without affecting each other. <Note> Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.<br/><br/>For self-hosted deployments, you can adjust this limit via the environment variable `SINGLE_CHUNK_ATTACHMENT_LIMIT`.</Note><Tip>To enable cross-modal retrieval—retrieving both text and images based on semantic relevance, choose a multimodal embedding model (marked with a **Vision** icon) for the knowledge base. <br/><br/>Image attachments will then be embedded and indexed for retrieval.</Tip>|
+| Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.<br/><br/>URLs of extracted images remain in the chunk text, but you can safely remove these URLs to keep the text clean—this won't affect the extracted images. <Note> Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.<br/><br/>For self-hosted deployments, you can adjust this limit via the environment variable `SINGLE_CHUNK_ATTACHMENT_LIMIT`.</Note><Tip>If you select a multimodal embedding model (marked with a **Vision** icon), the extracted images will also be embedded and indexed for retrieval.</Tip>|
 ## Best Practices