Merge pull request #3533 from myhloli/dev

Update ModelScope link in README_zh-CN.md for MinerU2.5 release
2026-03-27 11:08:32 +07:00 · 2025-09-19 16:39:40 +08:00 · 2025-09-19 16:36:42 +08:00 · 2025-09-19 16:34:29 +08:00 · 2025-09-19 16:31:30 +08:00 · 2025-09-19 16:30:43 +08:00
3 changed files with 4 additions and 4 deletions
--- a/README.md
+++ b/README.md
@@ -44,7 +44,7 @@

 # Changelog

-- 2025/09/19 2.5.1 Released
+- 2025/09/19 2.5.2 Released

  We are officially releasing MinerU2.5, currently the most powerful multimodal large model for document parsing.
  With only 1.2B parameters, MinerU2.5's accuracy on the OmniDocBench benchmark comprehensively surpasses top-tier multimodal models like Gemini 2.5 Pro, GPT-4o, and Qwen2.5-VL-72B. It also significantly outperforms leading specialized models such as dots.ocr, MonkeyOCR, and PP-StructureV3.
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -44,9 +44,9 @@

 # 更新记录

-- 2025/09/19 2.5.1 发布
+- 2025/09/19 2.5.2 发布
  我们正式发布 MinerU2.5，当前最强文档解析多模态大模型。仅凭 1.2B 参数，MinerU2.5 在 OmniDocBench 文档解析评测中，精度已全面超越 Gemini2.5-Pro、GPT-4o、Qwen2.5-VL-72B等顶级多模态大模型，并显著领先于主流文档解析专用模型（如 dots.ocr, MonkeyOCR, PP-StructureV3 等）。
-  模型已发布至[HuggingFace](https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B)和[ModelScope](https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B)平台，欢迎大家下载使用！
+  模型已发布至[HuggingFace](https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B)和[ModelScope](https://modelscope.cn/models/opendatalab/MinerU2.5-2509-1.2B)平台，欢迎大家下载使用！
  - 核心亮点
    - 极致能效，性能SOTA: 以 1.2B 的轻量化规模，实现了超越百亿乃至千亿级模型的SOTA性能，重新定义了文档解析的能效比。
    - 先进架构，全面领先: 通过 “两阶段推理” (解耦布局分析与内容识别) 与 原生高分辨率架构 的结合，在布局分析、文本识别、公式识别、表格识别及阅读顺序五大方面均达到 SOTA 水平。
--- a/mineru/backend/vlm/vlm_middle_json_mkcontent.py
+++ b/mineru/backend/vlm/vlm_middle_json_mkcontent.py
@@ -54,7 +54,7 @@ def mk_blocks_to_markdown(para_blocks, make_mode, formula_enable, table_enable,
        elif para_type == BlockType.LIST:
            for block in para_block['blocks']:
                item_text = merge_para_with_text(block, formula_enable=formula_enable, img_buket_path=img_buket_path)
-                para_text += f"{item_text}\n"
+                para_text += f"{item_text}  \n"
        elif para_type == BlockType.TITLE:
            title_level = get_title_level(para_block)
            para_text = f'{"#" * title_level} {merge_para_with_text(para_block)}'
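Note on the `BlockType.LIST` change above: in CommonMark, a bare newline inside a paragraph is a soft break and usually renders as a space, while two trailing spaces before the newline produce a hard line break. The sketch below illustrates only that string-level difference. `join_list_items` and the sample items are hypothetical and not part of MinerU; it also assumes the item texts carry no Markdown list markers of their own.

```python
def join_list_items(items, hard_break=True):
    """Concatenate item strings, one per line (hypothetical helper, not MinerU code).

    With hard_break=True each line ends in two spaces plus a newline, which
    CommonMark renderers treat as a hard line break. With hard_break=False
    the lines end in a bare newline (a soft break), so a renderer may flow
    consecutive items into a single paragraph.
    """
    line_end = "  \n" if hard_break else "\n"
    return "".join(f"{item}{line_end}" for item in items)


# Sample items without Markdown list markers (an assumption for illustration).
items = ["(a) first item", "(b) second item"]

soft = join_list_items(items, hard_break=False)  # behaviour before the change
hard = join_list_items(items, hard_break=True)   # behaviour after the change

print(repr(soft))
print(repr(hard))
```

The trailing spaces are invisible in the raw text, so comparing the `repr` of both strings is the easiest way to see the difference.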
Author	SHA1	Message	Date
Xiaomeng Zhao	6aac639686	Merge pull request #3533 from myhloli/dev: Update ModelScope link in README_zh-CN.md for MinerU2.5 release	2025-09-19 16:39:40 +08:00
myhloli	82f94a9a84	Update ModelScope link in README_zh-CN.md for MinerU2.5 release	2025-09-19 16:36:42 +08:00
Xiaomeng Zhao	d928334c61	Merge pull request #3532 from myhloli/dev: Fix formatting in vlm_middle_json_mkcontent.py to ensure proper line breaks in list items	2025-09-19 16:34:29 +08:00
myhloli	ebad82bd8c	Update version in README to 2.5.2 for MinerU2.5 release	2025-09-19 16:31:30 +08:00
myhloli	b03c5fb449	Fix formatting in vlm_middle_json_mkcontent.py to ensure proper line breaks in list items	2025-09-19 16:30:43 +08:00