Merge pull request #4271 from myhloli/dev

Dev
Xiaomeng Zhao
2025-12-30 17:58:45 +08:00
committed by GitHub
9 changed files with 74 additions and 55 deletions

@@ -51,6 +51,6 @@ mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:3
### Installing a Lightweight Client to Connect to OpenAI-compatible Servers (for hybrid-http-client mode)
If you need a lightweight client on an edge device that connects to an OpenAI-compatible server in hybrid mode, install the mineru pipeline extension package. It is relatively lightweight, works on devices with only a CPU and network connectivity, and runs faster on devices that support GPU acceleration.
```bash
uv pip install mineru[pipeline]
uv pip install "mineru[pipeline]"
mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
```
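The quoted form `"mineru[pipeline]"` matters because square brackets are glob characters in most shells: zsh rejects an unmatched pattern with a "no matches found" error, and bash or sh silently substitutes a matching filename if one happens to exist. The following is a minimal sketch of the bash/sh behavior. It installs nothing, and the file name `minerup` is hypothetical, chosen only to trigger the match.

```bash
# Work in a scratch directory so the demo leaves no files behind.
tmpdir=$(mktemp -d)
cd "$tmpdir"

# Hypothetical file whose name matches the glob mineru[pipeline]
# ("mineru" followed by one of the characters p, i, e, l, n).
touch minerup

# Unquoted: the shell expands the pattern before the command sees it.
unquoted=$(echo mineru[pipeline])

# Quoted: the requirement string is passed through unchanged.
quoted=$(echo "mineru[pipeline]")

echo "unquoted -> $unquoted"
echo "quoted   -> $quoted"

# Return to the previous directory and clean up.
cd - >/dev/null
rm -rf "$tmpdir"
```

With the unquoted form, an installer would receive whatever the pattern expanded to rather than the intended package-plus-extra requirement, which is why quoting is the safer default.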

@@ -129,9 +129,13 @@ You can get the [Docker Deployment Instructions](./docker_deployment.md) in the
### Using MinerU
The simplest command line invocation is:
If your device meets the GPU acceleration requirements in the table above, you can use a simple command line for document parsing:
```bash
mineru -p <input_path> -o <output_path>
```
If your device does not meet the GPU acceleration requirements, you can specify the backend as `pipeline` to run in a pure CPU environment:
```bash
mineru -p <input_path> -o <output_path> -b pipeline
```
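The choice between the two invocations above can be scripted. The sketch below is an illustration under stated assumptions: the helper name `backend_args` and the `nvidia-smi` probe are not part of MinerU, and finding an NVIDIA driver does not verify the GPU requirements from the table (memory, for example). It only prints the command it would run.

```bash
# backend_args: hypothetical helper. Prints the extra CLI arguments for a
# yes/no answer to "does this device meet the GPU acceleration requirements?".
backend_args() {
  if [ "$1" = "yes" ]; then
    printf '%s' ""             # default backend, no extra flag
  else
    printf '%s' "-b pipeline"  # CPU-only fallback described above
  fi
}

# Assumption: a working nvidia-smi serves as a rough GPU probe.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
  has_gpu=yes
else
  has_gpu=no
fi

# Print rather than execute, so the sketch runs without MinerU installed.
echo "mineru -p <input_path> -o <output_path> $(backend_args "$has_gpu")"
```

Keeping the decision in a small function makes it easy to replace the probe with a check that matches the actual hardware requirements.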
You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](../usage/index.md).

@@ -103,9 +103,6 @@ Additionally, with the release of vlm 2.5, we have made some adjustments to the
## 2.2 - 2.4 Series Versions
<details>
<summary>View 2.2.x - 2.4.x version history</summary>
### 2.2.2 (2025/09/10)
- Fixed the issue where the new table recognition model would affect the overall parsing task when some table parsing failed
@@ -128,15 +125,10 @@ Additionally, with the release of vlm 2.5, we have made some adjustments to the
- Added `bbox` field (mapped to 0-1000 range) in the output `content_list.json`, making it convenient for users to directly obtain position information for each content block
- Removed the `pipeline_old_linux` installation option, no longer supporting legacy Linux systems such as `CentOS 7`, to provide better support for `uv`'s `sync`/`run` commands
</details>
---
## 2.1 Series Versions
<details>
<summary>View 2.1.x version history</summary>
### 2.1.10 (2025/08/01)
- Fixed an issue in the `pipeline` backend where block overlap caused the parsing results to deviate from expectations #3232
@@ -204,15 +196,10 @@ This is the first major update of MinerU 2, which includes a large number of new
- Updated the `pipeline` backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
- Introduced limited support for vertical text layout in the `pipeline` backend.
</details>
---
## 2.0 Series Versions
<details>
<summary>View 2.0.x version history</summary>
### 2.0.6 (2025/06/20)
- Fixed occasional parsing interruptions caused by invalid block content in `vlm` mode
@@ -260,15 +247,10 @@ To improve overall architectural rationality and long-term maintainability, this
- Python package name changed from `magic-pdf` to `mineru`, and the command-line tool changed from `magic-pdf` to `mineru`. Please update your scripts and command calls accordingly.
- For modular system design and ecosystem consistency considerations, MinerU 2.0 no longer includes the LibreOffice document conversion module. If you need to process Office documents, we recommend converting them to PDF format through an independently deployed LibreOffice service before proceeding with subsequent parsing operations.
</details>
---
## 1.x Series Historical Versions
<details>
<summary>View 1.x version history</summary>
### 1.3.12 (2025/05/24)
Added support for PPOCRv5 models, updated `ch_server` model to `PP-OCRv5_rec_server`, and `ch_lite` model to `PP-OCRv5_rec_mobile` (model update required)
@@ -418,15 +400,10 @@ This is our first official release, where we have introduced a completely new AP
- By introducing a new language recognition model, setting the `lang` configuration to `auto` during document parsing will automatically select the appropriate OCR language model, improving the accuracy of scanned document parsing.
</details>
---
## 0.x Series Historical Versions
<details>
<summary>View 0.x version history</summary>
### 0.10.0 (2024/11/22)
Introducing hybrid OCR text extraction capabilities:
@@ -482,5 +459,3 @@ Optimized dependency conflict issues and installation documentation
MinerU project's first open-source release
</details>

@@ -0,0 +1,27 @@
# Reference Documentation
This section provides detailed reference materials for the MinerU project. Here you can find technical specifications, API documentation, output file formats, and version history.
## Table of Contents
- [Output Files Documentation](./output_files.md) - Detailed explanation of all output files and their formats
- [Changelog](./changelog.md) - Version update history and release notes
## Documentation Overview
### Output Files Documentation
Understanding the output files generated by MinerU is crucial for effective use of the tool. The output files documentation provides:
- **Visual debugging files**: Help you understand the document parsing process
- **Structured data files**: Contain detailed parsing results for further processing
- **File format specifications**: Detailed descriptions of each output file type
### Changelog
The changelog documents the evolution of MinerU, including:
- **Version updates**: New features and improvements for each release
- **Bug fixes**: Issues resolved in each version
- **Breaking changes**: Important changes that may affect your usage
- **Deprecations**: Features that are being phased out

@@ -49,6 +49,6 @@ mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:3
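### Installing a Lightweight Client to Connect to OpenAI-compatible Servers (for hybrid-http-client mode)
If you need to install a lightweight client on an edge device to connect to a server compatible with the OpenAI interface and use hybrid mode, you can install mineru's pipeline extension package. It is relatively lightweight, can be used on devices with only a CPU and network connectivity, and runs faster on devices that support GPU acceleration.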
```bash
uv pip install mineru[pipeline]
uv pip install "mineru[pipeline]"
mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
```

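@@ -134,9 +134,19 @@ MinerU provides a convenient docker deployment method, which helps to quickly set up the environment and
### Using MinerU
The simplest command line invocation is:
>[!TIP]
>By default, models hosted on `huggingface` are used for parsing. The required model files are downloaded automatically on first use, and later runs load the locally cached models. If you cannot access `huggingface`, you can switch to the mirror source in China with the following command: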
>```bash
>export MINERU_MODEL_SOURCE=modelscope
>```
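
`export` changes the model source for the rest of the shell session. As a standard shell alternative (nothing specific to MinerU), the variable can also be set for a single command only. The sketch below uses `sh -c` as a stand-in for the `mineru` command so that it runs without MinerU installed.

```bash
# Start from a clean state for the demo.
unset MINERU_MODEL_SOURCE

# Prefix assignment: the variable exists only in the environment of this one command.
# `sh -c` stands in for `mineru -p <input_path> -o <output_path>`.
scoped=$(MINERU_MODEL_SOURCE=modelscope sh -c 'printf %s "$MINERU_MODEL_SOURCE"')

# The calling shell is unaffected by the prefix assignment.
after=${MINERU_MODEL_SOURCE:-unset}

echo "inside command: $scoped"
echo "after command:  $after"
```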
If your device meets the GPU acceleration requirements in the table above, you can use a simple command line for document parsing:
```bash
mineru -p <input_path> -o <output_path>
```
If your device does not meet the GPU acceleration requirements, you can specify the backend as `pipeline` to run in a pure CPU environment:
```bash
mineru -p <input_path> -o <output_path> -b pipeline
```
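You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the [Usage Guide](../usage/index.md).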

@@ -109,9 +109,6 @@
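## 2.2 - 2.4 Series Versions
<details>
<summary>View 2.2.x - 2.4.x version history</summary>
### 2.2.2 (2025/09/10)
- Fixed the issue where the new table recognition model would affect the overall parsing task when some table parsing failed
@@ -134,15 +131,10 @@
- Added a `bbox` field (mapped to the 0-1000 range) to the output `content_list.json`, making it convenient for users to directly obtain position information for each content block
- Removed the `pipeline_old_linux` installation option, no longer supporting legacy Linux systems such as `CentOS 7`, to provide better support for `uv`'s `sync`/`run` commands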
</details>
---
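## 2.1 Series Versions
<details>
<summary>View 2.1.x version history</summary>
### 2.1.10 (2025/08/01)
- Fixed an issue in the `pipeline` backend where block overlap caused the parsing results to deviate from expectations #3232
@@ -210,15 +202,10 @@
- Updated the `pipeline` backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
- Added limited support for vertical text layout in the `pipeline` backend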
</details>
---
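## 2.0 Series Versions
<details>
<summary>View 2.0.x version history</summary>
### 2.0.6 (2025/06/20)
- Fixed occasional parsing interruptions caused by invalid block content in `vlm` mode
@@ -266,15 +253,10 @@ MinerU 2.0 integrates our latest small-parameter, high-performance multimodal document
- Python package name changed from `magic-pdf` to `mineru`, and the command-line tool changed from `magic-pdf` to `mineru`. Please update your scripts and command calls accordingly.
- For reasons of modular system design and ecosystem consistency, MinerU 2.0 no longer includes the LibreOffice document conversion module. If you need to process Office documents, we recommend converting them to PDF format through an independently deployed LibreOffice service before proceeding with subsequent parsing operations.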
</details>
---
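## 1.x Series Historical Versions
<details>
<summary>View 1.x version history</summary>
### 1.3.12 (2025/05/24)
Added support for PPOCRv5 models, updated `ch_server` model to `PP-OCRv5_rec_server`, and `ch_lite` model to `PP-OCRv5_rec_mobile` (model update required)
@@ -424,15 +406,10 @@ MinerU 2.0 integrates our latest small-parameter, high-performance multimodal document
- By introducing a new language recognition model, setting the `lang` configuration to `auto` during document parsing will automatically select the appropriate OCR language model, improving the accuracy of scanned document parsing.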
</details>
---
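## 0.x Series Historical Versions
<details>
<summary>View 0.x version history</summary>
### 0.10.0 (2024/11/22)
Introducing hybrid OCR text extraction capabilities:
@@ -487,6 +464,3 @@ MinerU 2.0 integrates our latest small-parameter, high-performance multimodal document
### First open-source release (2024/07/05)
MinerU project's first open-source release
</details>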

@@ -0,0 +1,27 @@
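# Reference Documentation
This section provides detailed reference materials for the MinerU project. Here you can find technical specifications, API documentation, output file format descriptions, and version history.
## Table of Contents
- [Output Files Documentation](./output_files.md) - Detailed explanation of all output files and their formats
- [Changelog](./changelog.md) - Version update history and release notes
## Documentation Overview
### Output Files Documentation
Understanding the output files generated by MinerU is crucial for effective use of the tool. The output files documentation provides:
- **Visual debugging files**: Help you understand the document parsing process
- **Structured data files**: Contain detailed parsing results that can be used for further processing
- **File format specifications**: Detailed descriptions of each output file type
### Changelog
The changelog documents the evolution of MinerU, including:
- **Version updates**: New features and improvements for each release
- **Bug fixes**: Issues resolved in each version
- **Breaking changes**: Important changes that may affect your usage
- **Deprecations**: Features that are being phased out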

@@ -86,7 +86,9 @@ nav:
    - CLI Tools: usage/cli_tools.md
    - Advanced CLI Parameters: usage/advanced_cli_parameters.md
  - Reference:
    - Reference: reference/index.md
    - Output File Format: reference/output_files.md
    - Changelog: reference/changelog.md
  - FAQ:
    - FAQ: faq/index.md
  - Demo: