Mirror of https://github.com/langgenius/dify-docs.git (synced 2026-03-27 13:28:32 +07:00)
Add more notes about the extracted image url
@@ -30,6 +30,10 @@ When quick-creating a knowledge base, you can upload local files as its data sou
 
 JPG, JPEG, PNG, and GIF images under 2 MB are automatically extracted as attachments to their corresponding chunks. These images can be managed independently and are returned alongside their chunks during retrieval.
 
+URLs of extracted images remain in the chunk text, but you can safely remove these URLs to keep the text clean—this won't affect the extracted images.
+
+If you select a multimodal embedding model (marked with a **Vision** icon) in index settings, the extracted images will also be embedded and indexed for retrieval.
+
 Each chunk supports up to 10 image attachments; images beyond this limit will not be extracted.
 
 <Tip>
@@ -44,15 +48,11 @@ When quick-creating a knowledge base, you can upload local files as its data sou
 
 - Images embedded in DOCX files
 
-<Note>
-Images embedded in other file types (e.g., PDF) can only be extracted by using appropriate document extraction plugins in [knowledge pipelines](/en/use-dify/knowledge/knowledge-pipeline/readme).
-</Note>
+<Tip>
+Images embedded in other file types (e.g., PDF) can be extracted by using appropriate document extraction plugins in [knowledge pipelines](/en/use-dify/knowledge/knowledge-pipeline/readme).
+</Tip>
 
 - Images referenced via accessible URLs using the following Markdown syntax in any file type:
 
 - ``
 - ``
-
-<Tip>
-If you select a multimodal embedding model (marked with a **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval.
-</Tip>
@@ -196,7 +196,9 @@ You can choose Dify's Doc Extractor to process files, or select tools based on y
 
 <Accordion title="For images in documents">
 
-Images in documents can be extracted using appropriate doc processors. Extracted images are attached to their corresponding chunks, can be managed independently, and are returned alongside those chunks during retrieval.
+Images in documents can be extracted using appropriate document processors. Extracted images are attached to their corresponding chunks, can be managed independently, and are returned alongside those chunks during retrieval.
 
+URLs of extracted images remain in the chunk text, but you can safely remove these URLs to keep the text clean—this won't affect the extracted images.
+
 Each chunk supports up to 10 image attachments; images beyond this limit will not be extracted.
 
@@ -213,7 +215,7 @@ If no images are extracted by the selected processor, Dify will automatically ex
 - Maximum number of attachments per chunk: `SINGLE_CHUNK_ATTACHMENT_LIMIT`
 </Tip>
 
-If you select a multimodal embedding model (marked with a **Vision** icon) in subsequent index settings, the extracted images will be embedded and indexed for retrieval.
+If you select a multimodal embedding model (marked with a **Vision** icon) in index settings, the extracted images will also be embedded and indexed for retrieval.
 
 </Accordion>
 
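The extraction rules documented in the hunks above (supported formats, the 2 MB size cap, and the per-chunk attachment limit) can be illustrated with a minimal Python sketch. This is not Dify's implementation: `ImageRef`, `attachment_limit`, and `select_attachments` are hypothetical names, reading "2 MB" as 2 × 1024 × 1024 bytes is an assumption, and keeping the first eligible images up to the limit is only one plausible reading of "images beyond this limit will not be extracted". Only the environment variable name `SINGLE_CHUNK_ATTACHMENT_LIMIT` and the default of 10 come from the documentation.

```python
import os
from dataclasses import dataclass

# Formats and size cap as stated in the docs; the byte interpretation is an assumption.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}
MAX_IMAGE_BYTES = 2 * 1024 * 1024


@dataclass
class ImageRef:
    """Hypothetical record describing one image found in a chunk."""
    name: str
    size_bytes: int


def attachment_limit(default: int = 10) -> int:
    """Read the per-chunk limit from the environment, falling back to the documented default."""
    raw = os.environ.get("SINGLE_CHUNK_ATTACHMENT_LIMIT", "")
    return int(raw) if raw.isdigit() else default


def select_attachments(images, limit=None):
    """Keep images with a supported extension and size under the cap, up to the limit."""
    if limit is None:
        limit = attachment_limit()
    eligible = []
    for img in images:
        ext = os.path.splitext(img.name)[1].lstrip(".").lower()
        if ext in ALLOWED_EXTENSIONS and img.size_bytes < MAX_IMAGE_BYTES:
            eligible.append(img)
    # Assumption: images past the limit are simply dropped, in document order.
    return eligible[:limit]
```

For a self-hosted deployment, setting `SINGLE_CHUNK_ATTACHMENT_LIMIT=3` in the environment would make `attachment_limit()` in this sketch return 3 instead of 10.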
@@ -32,7 +32,7 @@ According to its chunk settings, every document is split into content chunks—t
 From the chunk list within a document, you can view and manage all its chunks to improve the retrieval efficiency and accuracy.
 
 <Tip>
-Click the document name in the upper—left corner to quickly switch between documents.
+Click the document name in the upper-left corner to quickly switch between documents.
 </Tip>
 
 
@@ -42,9 +42,9 @@ From the chunk list within a document, you can view and manage all its chunks to
 | Add | Add one or batch add multiple new chunks. <br/><br/>For documents chunked with Parent-child mode, both new parent and child chunks can be added. <Info>*Add chunks* is a paid feature on Dify Cloud. [Upgrade to Professional or Team](https://dify.ai/pricing) to use it.</Info>|
 | Delete | Permanently remove a chunk. **Deletion cannot be undone**.|
 | Enable / Disable | Temporarily include or exclude a chunk from retrieval. Disabled chunks cannot be edited.|
-| Edit | Modify the content of a chunk. Edited chunks are marked **Edited**.<br/><br/>For documents chunked with Parent-child mode: <ul><li>When editing a parent chunk, you can choose to regenerate its child chunks or keep them unchanged.</li><li>Editing a child chunk does not update its parent chunk. </li></ul><Tip>When images in documents are extracted as chunk attachments, their URLs remain in the chunk text. Deleting these URLs won't affect the extracted image attachments.</Tip>|
+| Edit | Modify the content of a chunk. Edited chunks are marked **Edited**.<br/><br/>For documents chunked with Parent-child mode: <ul><li>When editing a parent chunk, you can choose to regenerate its child chunks or keep them unchanged.</li><li>Editing a child chunk does not update its parent chunk. </li></ul>|
 | Add / Edit / Delete Keywords | In knowledge bases using the Economical index method, you can add or modify keywords for each chunk to improve its retrievability. <br/><br/>Each chunk can have up to 10 keywords.|
-| Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.<br/><br/>Image attachments and their chunks can be edited independently without affecting each other. <Note> Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.<br/><br/>For self-hosted deployments, you can adjust this limit via the environment variable `SINGLE_CHUNK_ATTACHMENT_LIMIT`.</Note><Tip>To enable cross-modal retrieval—retrieving both text and images based on semantic relevance, choose a multimodal embedding model (marked with a **Vision** icon) for the knowledge base. <br/><br/>Image attachments will then be embedded and indexed for retrieval.</Tip>|
+| Add / Delete Image Attachments | Delete images extracted from documents or upload new ones within their corresponding chunk.<br/><br/>URLs of extracted images remain in the chunk text, but you can safely remove these URLs to keep the text clean—this won't affect the extracted images. <Note> Each chunk can have up to 10 image attachments, which are returned alongside it during retrieval; images beyond this limit will not be extracted.<br/><br/>For self-hosted deployments, you can adjust this limit via the environment variable `SINGLE_CHUNK_ATTACHMENT_LIMIT`.</Note><Tip>If you select a multimodal embedding model (marked with a **Vision** icon), the extracted images will also be embedded and indexed for retrieval.</Tip>|
 
 ## Best Practices
 
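The note added in this commit says that image URLs left in the chunk text can be removed without affecting the extracted attachments. A rough Python sketch of that cleanup follows. It assumes the leftover references use the standard Markdown image form (an exclamation mark, bracketed alt text, then a parenthesized URL with an optional quoted title). `strip_image_references` and `IMAGE_REF` are hypothetical names, and nothing here reflects how Dify itself edits chunks.

```python
import re

# Assumption: leftover references use standard Markdown image syntax,
# with an optional double-quoted title after the URL.
IMAGE_REF = re.compile(r'!\[[^\]]*\]\(\s*[^)\s]+(?:\s+"[^"]*")?\s*\)')


def strip_image_references(chunk_text: str) -> str:
    """Remove Markdown image references and tidy the spaces they leave behind."""
    without_images = IMAGE_REF.sub("", chunk_text)
    # Collapse runs of spaces or tabs created by the removal; newlines are kept.
    return re.sub(r"[ \t]{2,}", " ", without_images).strip()


if __name__ == "__main__":
    sample = "Intro ![diagram](https://example.com/a.png) outro"
    print(strip_image_references(sample))
```

Ordinary links (bracketed text followed by a parenthesized URL, with no leading exclamation mark) are left untouched, since the pattern requires the `!` prefix.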