Compare commits

27 Commits

Author | SHA1 | Message | Date
Xiaomeng Zhao | 45f8ad1d5c | Merge pull request #4305 from opendatalab/release-2.7.1 (Release 2.7.1) | 2026-01-06 14:47:23 +08:00
Xiaomeng Zhao | b69191ba2b | Merge pull request #4304 from opendatalab/dev (Dev) | 2026-01-06 14:46:18 +08:00
Xiaomeng Zhao | 0028514ced | Merge pull request #4303 from myhloli/dev (Dev) | 2026-01-06 14:45:35 +08:00
myhloli | 8d8daf6851 | fix: add qwen-vl-utils dependency to pyproject.toml | 2026-01-06 14:44:53 +08:00
myhloli | 815280dd23 | fix: update pdfminer.six dependency to resolve CVE-2025-64512 and improve EXIF handling | 2026-01-06 14:42:48 +08:00
myhloli | 7b52f92aea | fix: update pdfminer.six dependency to resolve CVE-2025-64512 and improve EXIF handling | 2026-01-06 14:41:47 +08:00
Xiaomeng Zhao | 33543b76c9 | Merge pull request #4301 from myhloli/dev (Dev) | 2026-01-06 14:10:08 +08:00
myhloli | ea5f8e98dd | fix: update pdfminer.six version to 20251230 in pyproject.toml | 2026-01-06 11:54:17 +08:00
myhloli | 8996e06448 | fix: restore hybrid analyze imports in common.py for backend processing | 2026-01-06 11:51:31 +08:00
myhloli | bfb304ef1f | fix: improve EXIF handling and save PDF logic in pdf_image_tools.py | 2026-01-05 00:27:01 +08:00
Xiaomeng Zhao | 17e6016b58 | Merge pull request #4283 from kingdomad/fix/image-exif-rotation (fix: add EXIF orientation handling for image inputs) | 2026-01-04 18:31:06 +08:00
Xiaomeng Zhao | ba06cd14ef | Update pdf_image_tools.py | 2026-01-04 18:29:51 +08:00
Xiaomeng Zhao | 0209ada8d0 | Merge pull request #4287 from myhloli/dev (Dev) | 2026-01-04 15:26:16 +08:00
myhloli | e2140222bc | docs: update VastAI.md with new version numbers and improved instructions | 2026-01-04 15:24:23 +08:00
myhloli | d679d99192 | docs: update heading from '快速开始' to '快速入门' for consistency | 2026-01-04 15:16:15 +08:00
Xiaomeng Zhao | 4bfcc0b808 | Merge pull request #4286 from opendatalab/master (master->dev) | 2026-01-04 15:12:00 +08:00
Xiaomeng Zhao | ead29489ff | Merge pull request #4285 from myhloli/dev (docs: update navigation and terminology in documentation for clarity) | 2026-01-04 15:11:29 +08:00
myhloli | c01e35b4c6 | docs: update navigation and terminology in documentation for clarity | 2026-01-04 15:10:37 +08:00
Xiaomeng Zhao | a89249069c | Merge pull request #4284 from myhloli/dev (Dev) | 2026-01-04 14:34:15 +08:00
myhloli | 2fc395bcff | docs: add reference section to mkdocs.yml for improved documentation structure | 2026-01-04 14:33:32 +08:00
史提芬达 | 0ca244ad62 | fix: add EXIF orientation handling for image inputs | 2026-01-04 13:41:55 +08:00
myhloli | 8acc7dd326 | Merge remote-tracking branch 'origin/dev' into dev | 2025-12-31 16:57:13 +08:00
myhloli | 1cde3fe5ad | fix: add additional continuation markers for improved table merging | 2025-12-31 16:57:00 +08:00
Xiaomeng Zhao | 0a4c87fc22 | Merge pull request #4273 from myhloli/dev (fix: update table rows for mineru, mineru-api, and mineru-gradio to reflect correct engine names) | 2025-12-30 18:52:41 +08:00
myhloli | 12d803079f | fix: update table rows for mineru, mineru-api, and mineru-gradio to reflect correct engine names | 2025-12-30 18:49:52 +08:00
myhloli | 8c4b3ef3a2 | Update version.py with new version | 2025-12-30 10:21:16 +00:00
Xiaomeng Zhao | ed6894c178 | Merge pull request #4272 from opendatalab/release-2.7.0 (Release 2.7.0) | 2025-12-30 18:08:29 +08:00
14 changed files with 167 additions and 161 deletions

@@ -45,6 +45,11 @@
# Changelog
- 2026/01/06 2.7.1 Release
- fix bug: #4300
- Updated pdfminer.six dependency version to resolve [CVE-2025-64512](https://github.com/advisories/GHSA-wf5f-4jwr-ppcp)
- Support automatic correction of input image exif orientation to improve OCR recognition accuracy #4283
- 2025/12/30 2.7.0 Release
- Simplified installation process. No need to separately install `vlm` acceleration engine dependencies. Using `uv pip install mineru[all]` during installation will install all optional backend dependencies.
- Added new `hybrid` backend, which combines the advantages of `pipeline` and `vlm` backends. Built on vlm, it integrates some capabilities of pipeline, adding extra extensibility on top of high accuracy:

@@ -45,6 +45,11 @@
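# Changelog
- 2026/01/06 2.7.1 released
- fix bug: #4300
- Updated the pdfminer.six dependency version to resolve [CVE-2025-64512](https://github.com/advisories/GHSA-wf5f-4jwr-ppcp)
- Added automatic correction of the EXIF orientation of input images, improving OCR results #4283
- 2025/12/30 2.7.0 released
- Simplified the installation process: the `vlm` acceleration engine dependencies no longer need to be installed separately, and `uv pip install mineru[all]` installs the dependencies for all optional backends.
- Added a brand-new `hybrid` backend that combines the strengths of the `pipeline` and `vlm` backends. Built on vlm, it incorporates some pipeline capabilities, adding extra extensibility on top of high accuracy: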

@@ -1,4 +1,4 @@
# 快速开始
# 快速入门
If you run into any installation problems, check the [FAQ](../faq/index.md) first.

@@ -105,65 +105,50 @@ docker run -u root --name mineru_docker --privileged=true \
</thead>
<tbody>
<tr>
<td rowspan="4">Command-line tool (mineru)</td>
<td rowspan="3">Command-line tool (mineru)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td rowspan="4">FastAPI service (mineru-api)</td>
<td rowspan="3">FastAPI service (mineru-api)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td rowspan="4">Gradio UI (mineru-gradio)</td>
<td rowspan="3">Gradio UI (mineru-gradio)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>

@@ -82,65 +82,50 @@ docker run --ipc host \
</thead>
<tbody>
<tr>
<td rowspan="4">Command-line tool (mineru)</td>
<td rowspan="3">Command-line tool (mineru)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>🟡</td>
<td>🟡</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td rowspan="4">FastAPI service (mineru-api)</td>
<td rowspan="3">FastAPI service (mineru-api)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>🟡</td>
<td>🟡</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td rowspan="4">Gradio UI (mineru-gradio)</td>
<td rowspan="3">Gradio UI (mineru-gradio)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>🟡</td>
<td>🟡</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>

@@ -73,65 +73,50 @@ docker run --privileged=true \
</thead>
<tbody>
<tr>
<td rowspan="4">Command-line tool (mineru)</td>
<td rowspan="3">Command-line tool (mineru)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td rowspan="4">FastAPI service (mineru-api)</td>
<td rowspan="3">FastAPI service (mineru-api)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td rowspan="4">Gradio UI (mineru-gradio)</td>
<td rowspan="3">Gradio UI (mineru-gradio)</td>
<td>pipeline</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>&lt;vlm/hybrid&gt;-auto-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-&lt;engine_name&gt;-engine</td>
<td>🟢</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>&lt;vlm/hybrid&gt;-http-client</td>
<td>🟢</td>
<td>🟢</td>
</tr>

@@ -14,21 +14,21 @@
cpu: Hygon C86-4G
gpu: VA16 / VA1L / VA10L
torch: 2.8.0+cpu
torch-vacc: 1.3.3.626
torch-vacc: 1.3.3.777
vllm: 0.11.1.dev0+gb8b302cde.d20251030.cpu
vllm-vacc: 0.11.0.626
driver: 00.25.12.02 d3_3_v2_9_a3_1 3ef7cf3 20251202
vllm-vacc: 0.11.0.777
driver: 00.25.12.30 d3_3_v2_9_a3_1 a76bf37 20251230
docker: 28.1.1
```
## 3. Environment Setup
- Pull the Docker image
- Pull the vllm_vacc base image
```bash
sudo docker pull harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP1
sudo docker pull harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2
```
- Start the Docker container
- Start the container
```bash
sudo docker run -it \
--privileged=true \
@@ -36,23 +36,7 @@
--name vllm_service \
--ipc=host \
--network=host \
harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP1 bash
```
>[!TIP]
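> - The image already includes `torch/vllm` and related dependencies
> - Similar to `CUDA_VISIBLE_DEVICES` on `NVIDIA` hardware, on `VastAI` hardware you can use `VACC_VISIBLE_DEVICES` to specify the visible accelerator card IDs, e.g. `-e VACC_VISIBLE_DEVICES=0,1,2,3`
> - Specify an appropriate `--shm-size` for shared memory
## 4. MinerU Features
>[!NOTE]
> - `VastAI` accelerator cards only support accelerated `VLM` model inference via `vlm-vllm-engine` and `vlm-http-client`
- Enter the container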
```bash
sudo docker exec -it vllm_service bash
harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2 bash
```
- Install MinerU
@@ -60,6 +44,10 @@
- Install by following the official documentation: [README_zh-CN.md#安装-mineru](https://github.com/opendatalab/MinerU/blob/master/README_zh-CN.md#安装-mineru)
```bash
# Start the container
# sudo docker exec -it vllm_service bash
# Optional PyPI mirrors
# https://mirrors.163.com/pypi/simple/
# https://mirrors.aliyun.com/pypi/simple/
# https://pypi.mirrors.ustc.edu.cn/simple/
@@ -68,26 +56,42 @@
# Install MinerU from source
git clone https://github.com/opendatalab/MinerU.git
git checkout eed479eb56bba93ee99c1a8c255d509bd2f837e5
git checkout 8c4b3ef3a20b11ddac9903f25124d24ea82639b5
pip install -e .[core] -i https://mirrors.aliyun.com/pypi/simple
# Install MinerU with pip
pip install -U "mineru[core]" -i https://mirrors.aliyun.com/pypi/simple
# Install MinerU with pip
pip install -U "mineru[core]==2.7.0" -i https://mirrors.aliyun.com/pypi/simple
```
> [!NOTE]
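> - The `vllm_vacc` base image already includes `torch/vllm` and related dependencies
> - As of `2025/12/31`, `VastAI` supports `MinerU` up to the latest version `2.7.0` (`master` branch `8c4b3ef3`)
> - Similar to `CUDA_VISIBLE_DEVICES` on `NVIDIA` hardware, on `VastAI` hardware you can use `VACC_VISIBLE_DEVICES` to specify the visible accelerator card IDs, e.g. `-e VACC_VISIBLE_DEVICES=0,1,2,3`
> - Specify an appropriate `--shm-size` for shared memory
## 4. MinerU Features
> [!NOTE]
> - `VastAI` accelerator cards only support accelerated `VLM` model inference via `vlm-auto-engine` and `vlm-http-client`
- Enter the container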
```bash
sudo docker exec -it vllm_service bash
```
- Use MinerU
- Model preparation: see the official guide, [model_source.md](https://github.com/opendatalab/MinerU/blob/master/docs/zh/usage/model_source.md)
- Option 1: `vlm-vllm-engine`
- Option 1: `vlm-auto-engine`
```bash
export MINERU_MODEL_SOURCE=modelscope
# step1: start a MinerU parsing job in `vlm-vllm-engine` mode
mineru -p /path/to/demo/pdfs/demo1.pdf \
# step1: start a MinerU parsing job in `vlm-auto-engine` mode
mineru -p image.png \
-o ./output \
-b vlm-vllm-engine \
-b vlm-auto-engine \
--http-timeout 1200 \
--tensor-parallel-size 2 \
--enforce_eager \
@@ -108,7 +112,7 @@
--served-model-name MinerU2.5-2509-1.2B
# step2: start a MinerU parsing job in `vlm-http-client` mode
mineru -p /path/to/demo/pdfs/demo1.pdf \
mineru -p demo/pdfs/demo1.pdf \
-o ./output \
-b vlm-http-client \
-u http://127.0.0.1:8090 \
@@ -116,8 +120,7 @@
```
>[!NOTE]
> - As of `2025/12/23`, `VastAI` supports `MinerU` up to the latest version `2.6.8` (`master` branch `eed479eb`)
> [!NOTE]
> - Note: append the `--enforce_eager` argument when running any `vllm`-related command
@@ -140,17 +143,17 @@
<td>🔴</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>hybrid-http-client</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-vllm-engine</td>
<td>hybrid-auto-engine</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-auto-engine</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-lmdeploy-client</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
@@ -161,17 +164,17 @@
<td>🔴</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>hybrid-http-client</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-vllm-engine</td>
<td>hybrid-auto-engine</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-auto-engine</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-lmdeploy-engine</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
@@ -182,17 +185,17 @@
<td>🔴</td>
</tr>
<tr>
<td>vlm-transformers</td>
<td>hybrid-http-client</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-vllm-engine</td>
<td>hybrid-auto-engine</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-auto-engine</td>
<td>🟢</td>
</tr>
<tr>
<td>vlm-lmdeploy-engine</td>
<td>🔴</td>
</tr>
<tr>
<td>vlm-http-client</td>
<td>🟢</td>
@@ -202,18 +205,19 @@
<td>🟢</td>
</tr>
<tr>
<td colspan="2">Tensor parallelism (--tensor-parallel-size/--tp)</td>
<td>🟢 tp1/tp2 only</td>
<td colspan="2">Tensor parallelism (--tensor-parallel-size)</td>
<td>🟢</td>
</tr>
<tr>
<td colspan="2">Data parallelism (--data-parallel-size/--dp)</td>
<td colspan="2">Data parallelism (--data-parallel-size)</td>
<td>🔴</td>
</tr>
</tbody>
</table>
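Notes:
🟢: Supported; runs fairly stably, with accuracy broadly matching Nvidia GPUs
🟡: Supported but less stable; may fail in some scenarios, or accuracy may differ somewhat
🔴: Not supported; cannot run, or accuracy differs significantly
> [!NOTE]
> - 🟢: Supported; runs fairly stably, with accuracy broadly matching NVIDIA GPUs
> - 🟡: Supported but less stable; may fail in some scenarios, or accuracy may differ somewhat
> - 🔴: Not supported; cannot run, or accuracy differs significantly
> - `vlm-auto-engine`: VastAI only supports the vLLM backend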

@@ -4,10 +4,10 @@
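## Contents
- Local deployment
* [Quick Usage](./quick_usage.md) - Getting started and basic usage
* [Basic Usage](./quick_usage.md) - Getting started and basic usage
* [Model Source Configuration](./model_source.md) - Detailed model source configuration
* [CLI Tools](./cli_tools.md) - Detailed command-line tool parameters
* [Advanced Optimization Parameters](./advanced_cli_parameters.md) - Advanced parameters for the command-line tools
* [Advanced CLI Parameters](./advanced_cli_parameters.md) - Advanced parameters for the command-line tools
- Other accelerator card support (🚀 officially supported / ❤️ community contributed)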
* [昇腾 Ascend](acceleration_cards/Ascend.md) 🚀
* [平头哥 T-Head](acceleration_cards/THead.md) 🚀

@@ -17,8 +17,6 @@ from mineru.utils.pdf_image_tools import images_bytes_to_pdf_bytes
from mineru.backend.vlm.vlm_middle_json_mkcontent import union_make as vlm_union_make
from mineru.backend.vlm.vlm_analyze import doc_analyze as vlm_doc_analyze
from mineru.backend.vlm.vlm_analyze import aio_doc_analyze as aio_vlm_doc_analyze
from mineru.backend.hybrid.hybrid_analyze import doc_analyze as hybrid_doc_analyze
from mineru.backend.hybrid.hybrid_analyze import aio_doc_analyze as aio_hybrid_doc_analyze
from mineru.utils.pdf_page_id import get_end_page_id
if os.getenv("MINERU_LMDEPLOY_DEVICE", "") == "maca":
@@ -326,6 +324,7 @@ def _process_hybrid(
server_url=None,
**kwargs,
):
from mineru.backend.hybrid.hybrid_analyze import doc_analyze as hybrid_doc_analyze
"""Synchronous processing logic for the hybrid backend."""
if not backend.endswith("client"):
server_url = None
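A note on the deferred import in the hunk above, offered as a minimal sketch rather than a proposed patch: Python only treats a string literal as a docstring when it is the first statement in the function body, so a string that follows an import is an ordinary expression. The function names below are hypothetical, and `json` stands in for `mineru.backend.hybrid.hybrid_analyze`.

```python
def import_before_string():
    import json  # stand-in for the deferred hybrid_analyze import
    """Synchronous hybrid backend processing."""
    return json.dumps({"ok": True})


def string_before_import():
    """Synchronous hybrid backend processing."""
    import json  # still deferred: runs only when the function is called
    return json.dumps({"ok": True})


print(import_before_string.__doc__)  # None
print(string_before_import.__doc__)  # Synchronous hybrid backend processing.
```

Both functions behave identically at runtime; only introspection (`help()`, `__doc__`, documentation generators) sees the difference.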
@@ -378,8 +377,8 @@ async def _async_process_hybrid(
server_url=None,
**kwargs,
):
from mineru.backend.hybrid.hybrid_analyze import aio_doc_analyze as aio_hybrid_doc_analyze
"""Asynchronous processing logic for the hybrid backend."""
if not backend.endswith("client"):
server_url = None
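The `backend.endswith("client")` check above, read together with the `<vlm/hybrid>-auto-engine` and `<vlm/hybrid>-http-client` names in the documentation tables, suggests the following behavior. This is a sketch under assumptions: `resolve_server_url` is a hypothetical helper written for illustration, not a function in MinerU.

```python
from typing import Optional


def resolve_server_url(backend: str, server_url: Optional[str]) -> Optional[str]:
    """Keep server_url only for client-style backends (hypothetical helper)."""
    if not backend.endswith("client"):
        # Non-client backends are assumed to run locally, so the URL is dropped.
        return None
    return server_url


for name in ("hybrid-auto-engine", "hybrid-http-client", "vlm-http-client"):
    print(name, resolve_server_url(name, "http://127.0.0.1:8090"))
```

Under this reading, passing a URL alongside an `-auto-engine` backend has no effect, because the value is discarded before any request is made.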

@@ -5,7 +5,7 @@ from io import BytesIO
import numpy as np
import pypdfium2 as pdfium
from loguru import logger
from PIL import Image
from PIL import Image, ImageOps
from mineru.data.data_reader_writer import FileBasedDataWriter
from mineru.utils.check_sys_env import is_windows_environment
@@ -41,19 +41,23 @@ def pdf_page_to_image(page: pdfium.PdfPage, dpi=200, image_type=ImageType.PIL) -
return image_dict
def _load_images_from_pdf_worker(pdf_bytes, dpi, start_page_id, end_page_id, image_type):
def _load_images_from_pdf_worker(
pdf_bytes, dpi, start_page_id, end_page_id, image_type
):
"""Wrapper function for use with the process pool."""
return load_images_from_pdf_core(pdf_bytes, dpi, start_page_id, end_page_id, image_type)
return load_images_from_pdf_core(
pdf_bytes, dpi, start_page_id, end_page_id, image_type
)
def load_images_from_pdf(
pdf_bytes: bytes,
dpi=200,
start_page_id=0,
end_page_id=None,
image_type=ImageType.PIL,
timeout=None,
threads=4,
pdf_bytes: bytes,
dpi=200,
start_page_id=0,
end_page_id=None,
image_type=ImageType.PIL,
timeout=None,
threads=4,
):
"""Convert a PDF to images with timeout control; supports multi-process acceleration
@@ -77,7 +81,7 @@ def load_images_from_pdf(
dpi,
start_page_id,
get_end_page_id(end_page_id, len(pdf_doc)),
image_type
image_type,
), pdf_doc
else:
if timeout is None:
@@ -116,7 +120,7 @@ def load_images_from_pdf(
dpi,
range_start,
range_end,
image_type
image_type,
)
futures.append((range_start, future))
@@ -163,7 +167,14 @@ def load_images_from_pdf_core(
return images_list
def cut_image(bbox: tuple, page_num: int, page_pil_img, return_path, image_writer: FileBasedDataWriter, scale=2):
def cut_image(
bbox: tuple,
page_num: int,
page_pil_img,
return_path,
image_writer: FileBasedDataWriter,
scale=2,
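):
"""Crop a JPG from page number page_num according to bbox and return the image path. save_path must support both S3 and local storage;
the image is stored under save_path with the file name:
{page_num}_{bbox[0]}_{bbox[1]}_{bbox[2]}_{bbox[3]}.jpg, with the bbox values rounded to integers."""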
@@ -197,7 +208,6 @@ def get_crop_img(bbox: tuple, pil_img, scale=2):
def get_crop_np_img(bbox: tuple, input_img, scale=2):
if isinstance(input_img, Image.Image):
np_img = np.asarray(input_img)
elif isinstance(input_img, np.ndarray):
@@ -212,17 +222,27 @@ def get_crop_np_img(bbox: tuple, input_img, scale=2):
int(bbox[3] * scale),
)
return np_img[scale_bbox[1]:scale_bbox[3], scale_bbox[0]:scale_bbox[2]]
return np_img[scale_bbox[1] : scale_bbox[3], scale_bbox[0] : scale_bbox[2]]
def images_bytes_to_pdf_bytes(image_bytes):
# In-memory buffer
pdf_buffer = BytesIO()
# Load all images and convert them to RGB mode
image = Image.open(BytesIO(image_bytes)).convert("RGB")
image = Image.open(BytesIO(image_bytes))
# Auto-rotate upright based on EXIF data (handles phone photos carrying an Orientation tag)
image = ImageOps.exif_transpose(image) or image
# Convert only when necessary
if image.mode != "RGB":
image = image.convert("RGB")
# Save the first image as a PDF and append the rest
image.save(pdf_buffer, format="PDF", save_all=True)
image.save(
pdf_buffer,
format="PDF",
# save_all=True
)
# Get the PDF bytes and reset the pointer (optional)
pdf_bytes = pdf_buffer.getvalue()
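To illustrate what the EXIF step above does, here is a self-contained sketch using Pillow. `make_tagged_jpeg` and `load_upright` are hypothetical helpers for this example; only `Image.open`, `ImageOps.exif_transpose`, and the conditional `convert("RGB")` come from the diff.

```python
from io import BytesIO

from PIL import Image, ImageOps

ORIENTATION_TAG = 0x0112  # standard EXIF Orientation tag


def make_tagged_jpeg(width: int = 40, height: int = 20, orientation: int = 6) -> bytes:
    """Build a small JPEG whose EXIF data asks viewers to rotate it (test fixture)."""
    exif = Image.Exif()
    exif[ORIENTATION_TAG] = orientation
    buffer = BytesIO()
    Image.new("RGB", (width, height), "white").save(buffer, format="JPEG", exif=exif)
    return buffer.getvalue()


def load_upright(image_bytes: bytes) -> Image.Image:
    """Open an image, apply its EXIF orientation, and ensure RGB mode."""
    image = Image.open(BytesIO(image_bytes))
    # Fall back to the original if exif_transpose returns None.
    image = ImageOps.exif_transpose(image) or image
    if image.mode != "RGB":
        image = image.convert("RGB")
    return image


raw = make_tagged_jpeg()
print(Image.open(BytesIO(raw)).size, "->", load_upright(raw).size)  # (40, 20) -> (20, 40)
```

Orientation value 6 marks a photo stored sideways, so applying it swaps width and height; without this step the stored, unrotated pixels would be passed on as-is.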

@@ -9,7 +9,14 @@ from mineru.utils.char_utils import full_to_half
from mineru.utils.enum_class import BlockType, SplitFlag
CONTINUATION_MARKERS = ["(续)", "(续表)", "(continued)", "(cont.)"]
CONTINUATION_MARKERS = [
"(续)",
"(续表)",
"(续上表)",
"(continued)",
"(cont.)",
"(contd)",
]
def calculate_table_total_columns(soup):
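How a marker list like this might be consulted when deciding whether a table continues the previous one, as a hedged sketch: `is_continuation_caption` and its normalization steps are assumptions made for illustration, not the module's actual logic. Only the marker strings come from the diff.

```python
CONTINUATION_MARKERS = [
    "(续)",
    "(续表)",
    "(续上表)",
    "(continued)",
    "(cont.)",
    "(contd)",
]

# Map full-width parentheses to ASCII so both spellings match (assumed normalization).
_FULL_TO_HALF = str.maketrans({"(": "(", ")": ")"})


def is_continuation_caption(caption: str) -> bool:
    """Return True when the caption ends with a known continuation marker."""
    text = caption.translate(_FULL_TO_HALF).strip().lower()
    return any(text.endswith(marker) for marker in CONTINUATION_MARKERS)


print(is_continuation_caption("Table 3 (Continued)"))  # True
print(is_continuation_caption("表3 (续上表)"))  # True
print(is_continuation_caption("Table 4: Results"))  # False
```

With a check of this kind, the two markers added in this change, `(续上表)` and `(contd)`, would extend matching to captions that previously went unrecognized.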

@@ -1 +1 @@
__version__ = "2.6.8"
__version__ = "2.7.0"

@@ -93,7 +93,18 @@ nav:
- FAQ: faq/index.md
- Demo:
- Demo: demo/index.md
- Quick Start:
- Quick Start: quick_start/index.md
- Extension Modules: quick_start/extension_modules.md
- Docker Deployment: quick_start/docker_deployment.md
- Usage:
- Usage: usage/index.md
- Quick Usage: usage/quick_usage.md
- Model Source: usage/model_source.md
- CLI Tools: usage/cli_tools.md
- Advanced CLI Parameters: usage/advanced_cli_parameters.md
- Reference:
- Reference: reference/index.md
- Output File Format: reference/output_files.md
- Changelog: reference/changelog.md
- FAQ:
@@ -119,13 +130,13 @@ plugins:
build: true
nav_translations:
Home: 主页
Quick Start: 快速开始
Quick Start: 快速入门
Extension Modules: 扩展模块安装
Docker Deployment: Docker部署
Usage: 使用方法
Quick Usage: 快速使用
Usage: 使用指南
Quick Usage: 基础使用
CLI Tools: 命令行工具
Model Source: 模型源
Model Source: 模型源配置
Advanced CLI Parameters: 命令行进阶参数
FAQ: 常见问题解答
Reference: 参考资料

View File

@@ -21,7 +21,7 @@ dependencies = [
"click>=8.1.7",
"loguru>=0.7.2",
"numpy>=1.21.6",
"pdfminer.six==20250506",
"pdfminer.six>=20251230",
"tqdm>=4.67.1",
"requests",
"httpx",
@@ -94,10 +94,10 @@ core = [
"mineru[pipeline]",
"mineru[api]",
"mineru[gradio]",
"mineru[mlx] ; sys_platform == 'darwin'",
]
all = [
"mineru[core]",
"mineru[mlx] ; sys_platform == 'darwin'",
"mineru[vllm] ; sys_platform == 'linux'",
"mineru[lmdeploy] ; sys_platform == 'windows'",
]
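The first hunk above replaces an exact pin on `pdfminer.six` with a lower bound. As a rough sketch of what that bound accepts, assuming the date-style version strings shown in the diff are purely numeric (`meets_minimum` is a hypothetical helper; a real resolver applies full PEP 440 rules):

```python
def meets_minimum(installed: str, minimum: str = "20251230") -> bool:
    """Compare date-style versions such as '20250506' numerically."""
    return int(installed) >= int(minimum)


print(meets_minimum("20250506"))  # False: the old pin is below the new floor
print(meets_minimum("20251230"))  # True
```

Unlike the previous `==` pin, a `>=` bound also admits any later release, so installs will pick up future versions without another edit to `pyproject.toml`.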