Merge pull request #2355 from opendatalab/master

update version
Merge remote-tracking branch 'origin/master'
2026-03-27 19:18:34 +07:00 · 2025-04-23 18:50:01 +08:00 · 2025-04-23 18:48:46 +08:00 · 2025-04-23 18:48:32 +08:00 · 2025-04-23 10:41:07 +00:00 · 2025-04-23 18:38:42 +08:00
22 changed files with 15781 additions and 121 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -1,4 +1,4 @@
-name: Bug Report | 反馈 Bug
+name: 🐛 Bug Report
 description: Create a bug report for MinerU | MinerU 的 Bug 反馈
 labels: bug

@@ -6,14 +6,32 @@ labels: bug
 # empty string, Github seems to reject this .yml file.

 body:
+  - type: markdown
+    attributes:
+      value: |
+        Thank you for submitting a MinerU 🐛 Bug Report! | 感谢您提交 MinerU 🐛 Bug 反馈！
+
+  - type: checkboxes
+    attributes:
+      label: 🔎 Search before asking | 提交之前请先搜索
+      description: >
+        Please search the MinerU [Readme](https://github.com/opendatalab/MinerU), [Issues](https://github.com/opendatalab/MinerU/issues) and [Discussions](https://github.com/opendatalab/MinerU/discussions) to see if a similar bug report already exists.
+      options:
+        - label: I have searched the MinerU [Readme](https://github.com/opendatalab/MinerU) and found no similar bug report.
+          required: true
+        - label: I have searched the MinerU [Issues](https://github.com/opendatalab/MinerU/issues) and found no similar bug report.
+          required: true
+        - label: I have searched the MinerU [Discussions](https://github.com/opendatalab/MinerU/discussions) and found no similar bug report.
+          required: true

  - type: textarea
    id: description
    attributes:
      label: Description of the bug | 错误描述
      description: |
-        A clear and concise description of the bug. | 简单描述遇到的问题  
-        
+        Provide console output with error messages and/or screenshots of the bug. | 请提供详细报错信息或者截图
+      placeholder: |
+        💡 ProTip! Include as much information as possible (screenshots, logs, tracebacks etc.) to receive the most helpful response.
    validations:
      required: true
  
@@ -24,11 +42,12 @@ body:
      
      # Should not word-wrap this description here.
      description: |
-        * Explain the steps required to reproduce the bug. | 说明复现此错误所需的步骤。
-        * Include required code snippets, example files, etc. | 包含必要的代码片段、示例文件等。
-        * Describe what you expected to happen (if not obvious). | 描述你期望发生的情况。
-        * If applicable, add screenshots to help explain the problem. | 添加截图以帮助解释问题。
-        * Include any other information that could be relevant, for example information about the Python environment. | 包括任何其他可能相关的信息。
+        If you have questions about the parsing results or encounter errors during execution: | 如对解析结果有疑问或在运行中出现报错等异常:
+        * Provide a minimal reproducible example. | 请提供一个最小可复现的demo。 
+        * The demo should include the complete steps, code, and the PDF file to be parsed. | demo需要包含完整的操作步骤，代码，以及需要解析的PDF文件。
+        * When reporting parsing result anomalies and runtime errors, reproducible PDF files are essential. If the document is too large or confidential, you can print the problematic page(s) via the browser and submit the corresponding example file.
+        * 在反馈解析结果异常和运行时报错时，可复现的PDF文件是必不可少的，如文档过大或涉密，您可通过浏览器打印出出现问题的某一页或某几页再提交相应的示例文件。
+        
        
        For problems when building or installing MinerU: | 在构建或安装 MinerU 时遇到的问题:
        * Give the **exact** build/install commands that were run. | 提供**确切**的构建/安装命令。
@@ -44,9 +63,9 @@ body:


  - type: dropdown
-    id: os_name
+    id: os_mode
    attributes:
-      label: Operating system | 操作系统
+      label: Operating System Mode | 操作系统类型
      #multiple: true
      options:
        -
@@ -56,6 +75,22 @@ body:
    validations:
      required: true

+  - type: textarea
+    id: os_name_version
+    attributes:
+      label: Operating System Version| 操作系统版本
+      #multiple: true
+      description: |
+        * 如果您使用的是Linux系统，请提供Linux系统的**发行版名称**和**版本号**来帮助开发人员排查问题。 
+        * If you are using a Linux system, please provide the Linux distribution and version number to help developers troubleshoot the issue.
+        * 如果您使用的是Windows或MacOS系统，请提供操作系统的**版本号**来帮助开发人员排查问题。
+        * If you are using a Windows or MacOS system, please provide the version number of the operating system to help developers troubleshoot the issue.
+        * 例如：Ubuntu 22.04, CentOS 7.9, MacOS 15.1, Windows 11
+        * For example: Ubuntu 22.04, CentOS 7.9, MacOS 15.1, Windows 11.
+
+    validations:
+      required: true
+
  - type: dropdown
    id: python_version
    attributes:
@@ -94,6 +129,7 @@ body:
        -
        - cpu
        - cuda
+        - mps
        - npu
    validations:
      required: true
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,11 @@
+blank_issues_enabled: false
+contact_links:
+  - name: 🙏 Q&A
+    url: https://github.com/opendatalab/MinerU/discussions/categories/q-a
+    about: Ask the community for help
+  - name: 💡 Feature requests and ideas
+    url: https://github.com/opendatalab/MinerU/discussions/categories/ideas
+    about: Share ideas for new features
+  - name: 🙌 Show and tell
+    url: https://github.com/opendatalab/MinerU/discussions/categories/show-and-tell
+    about: Show off something you've made
--- a/.github/ISSUE_TEMPLATE/feature_request.md
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -1,28 +0,0 @@
----
-name: Feature request | 功能需求
-about: Suggest an idea for this project | 提出一个有价值的idea
-title: ''
-labels: enhancement
-assignees: ''
-
----
-
-**Is your feature request related to a problem? Please describe.**
-**您的特性请求是否与某个问题相关？请描述。**
-A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
-对存在的问题进行清晰且简洁的描述。例如：我一直很困扰的是 [...]
-
-**Describe the solution you'd like**
-**描述您期望的解决方案**
-A clear and concise description of what you want to happen.
-清晰且简洁地描述您希望实现的内容。
-
-**Describe alternatives you've considered**
-**描述您已考虑的替代方案**
-A clear and concise description of any alternative solutions or features you've considered.
-清晰且简洁地描述您已经考虑过的任何替代解决方案。
-
-**Additional context**
-**提供更多细节**
-Add any other context or screenshots about the feature request here.
-请附上任何相关截图、链接或文件，以帮助我们更好地理解您的请求。
--- a/README.md
+++ b/README.md
@@ -48,6 +48,15 @@ Easier to use: Just grab MinerU Desktop. No coding, no login, just a simple inte
 </div>

 # Changelog
+- 2025/04/23 1.3.8 Released
+  - The default `ocr` model (`ch`) has been updated to `PP-OCRv4_server_rec_doc` (model update required)
+    - `PP-OCRv4_server_rec_doc` is trained on a mix of more Chinese document data and PP-OCR training data, enhancing recognition capabilities for some traditional Chinese characters, Japanese, and special characters. It supports over 15,000 recognizable characters, improving text recognition in documents while also boosting general text recognition.
+    - [Performance comparison between PP-OCRv4_server_rec_doc, PP-OCRv4_server_rec, and PP-OCRv4_mobile_rec](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/text_recognition.html#ii-supported-model-list)
+    - Verified results show that the `PP-OCRv4_server_rec_doc` model significantly improves accuracy in both single-language (`Chinese`, `English`, `Japanese`, `Traditional Chinese`) and mixed-language scenarios, with speed comparable to `PP-OCRv4_server_rec`, making it suitable for most use cases.
+    - In a small number of pure English scenarios, the `PP-OCRv4_server_rec_doc` model may encounter word concatenation issues, whereas `PP-OCRv4_server_rec` performs better in such cases. Therefore, we have retained the `PP-OCRv4_server_rec` model, which users can invoke by passing the parameter `lang='ch_server'`(python api) or `--lang ch_server`(cli).
+- 2025/04/22 1.3.7 Released
+  - Fixed the issue where the `lang` parameter was ineffective during table parsing model initialization.
+  - Fixed the significant slowdown in OCR and table parsing speed in `cpu` mode.
 - 2025/04/16 1.3.4 Released
  - Slightly improved the speed of OCR detection by removing some unused blocks.
  - Fixed page-level sorting errors caused by footnotes in certain cases.
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -47,6 +47,15 @@
 </div>

 # 更新记录
+- 2025/04/23 1.3.8 发布
+  - `ocr`默认模型(`ch`)更新为`PP-OCRv4_server_rec_doc`（需更新模型）
+    - `PP-OCRv4_server_rec_doc`是在`PP-OCRv4_server_rec`的基础上，在更多中文文档数据和PP-OCR训练数据的混合数据训练而成，增加了部分繁体字、日文、特殊字符的识别能力，可支持识别的字符为1.5万+，除文档相关的文字识别能力提升外，也同时提升了通用文字的识别能力。
+    - [PP-OCRv4_server_rec_doc/PP-OCRv4_server_rec/PP-OCRv4_mobile_rec 性能对比](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/text_recognition.html#_3)
+    - 经验证，`PP-OCRv4_server_rec_doc`模型在`中英日繁`单种语言或多种语言混合场景均有明显精度提升，且速度与`PP-OCRv4_server_rec`相当，适合绝大部分场景使用。
+    - `PP-OCRv4_server_rec_doc`在小部分纯英文场景可能会发生单词粘连问题，`PP-OCRv4_server_rec`则在此场景下表现更好，因此我们保留了`PP-OCRv4_server_rec`模型，用户可通过增加参数`lang='ch_server'`(python api)或`--lang ch_server`(命令行)调用。
+- 2025/04/22 1.3.7 发布
+  - 修复表格解析模型初始化时lang参数失效的问题
+  - 修复在`cpu`模式下ocr和表格解析速度大幅下降的问题
 - 2025/04/16 1.3.4 发布
  - 通过移除一些无用的块，小幅提升了ocr-det的速度
  - 修复部分情况下由footnote导致的页面内排序错误
--- a/docker/ascend_npu/Dockerfile
+++ b/docker/ascend_npu/Dockerfile
@@ -36,7 +36,7 @@ RUN /bin/bash -c "wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/m
    source /opt/mineru_venv/bin/activate && \
    pip3 install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple && \
    pip3 install torch==2.3.1 torchvision==0.18.1 -i https://mirrors.aliyun.com/pypi/simple && \
-    pip3 install -U magic-pdf[full] -i https://mirrors.aliyun.com/pypi/simple && \
+    pip3 install -U magic-pdf[full] 'numpy<2' decorator attrs absl-py cloudpickle ml-dtypes tornado einops -i https://mirrors.aliyun.com/pypi/simple && \
    wget https://gitee.com/ascend/pytorch/releases/download/v6.0.rc2-pytorch2.3.1/torch_npu-2.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl && \
    pip3 install torch_npu-2.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl"

--- a/magic_pdf/data/read_api.py
+++ b/magic_pdf/data/read_api.py
@@ -116,7 +116,7 @@ def read_local_office(path: str) -> list[PymuDocDataset]:
    shutil.rmtree(temp_dir)
    return ret

-def read_local_images(path: str, suffixes: list[str]=['.png', '.jpg']) -> list[ImageDataset]:
+def read_local_images(path: str, suffixes: list[str]=['.png', '.jpg', '.jpeg']) -> list[ImageDataset]:
    """Read images from path or directory.

    Args:
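Review note: the hunk above only widens the default suffix list to include `.jpeg`; the body of `read_local_images` is not shown. The sketch below illustrates suffix-based filtering for a file or directory under stated assumptions: `list_image_files` and `DEFAULT_SUFFIXES` are hypothetical names, the case-insensitive match is an assumption rather than confirmed behaviour of magic_pdf, and a tuple default is used here only to avoid a shared mutable default.

```python
from pathlib import Path

# Hypothetical constant mirroring the new default in the diff.
DEFAULT_SUFFIXES = (".png", ".jpg", ".jpeg")


def list_image_files(path, suffixes=DEFAULT_SUFFIXES):
    """Return image files at `path` (a single file or a directory), sorted by name."""
    root = Path(path)
    wanted = {s.lower() for s in suffixes}
    # A single file is treated as a one-element listing.
    candidates = [root] if root.is_file() else sorted(root.iterdir())
    # Assumption: suffixes are compared case-insensitively.
    return [p for p in candidates if p.is_file() and p.suffix.lower() in wanted]
```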
--- a/magic_pdf/libs/version.py
+++ b/magic_pdf/libs/version.py
@@ -1 +1 @@
-__version__ = "1.3.3"
+__version__ = "1.3.8"
--- a/magic_pdf/model/batch_analyze.py
+++ b/magic_pdf/model/batch_analyze.py
@@ -161,20 +161,13 @@ class BatchAnalyze:
            for table_res_dict in tqdm(table_res_list_all_page, desc="Table Predict"):
                _lang = table_res_dict['lang']
                atom_model_manager = AtomModelSingleton()
-                ocr_engine = atom_model_manager.get_atom_model(
-                    atom_model_name='ocr',
-                    ocr_show_log=False,
-                    det_db_box_thresh=0.5,
-                    det_db_unclip_ratio=1.6,
-                    lang=_lang
-                )
                table_model = atom_model_manager.get_atom_model(
                    atom_model_name='table',
                    table_model_name='rapid_table',
                    table_model_path='',
                    table_max_time=400,
                    device='cpu',
-                    ocr_engine=ocr_engine,
+                    lang=_lang,
                    table_sub_model_name='slanet_plus'
                )
                html_code, table_cell_bboxes, logic_points, elapse = table_model.predict(table_res_dict['table_img'])
--- a/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorch_paddle.py
+++ b/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorch_paddle.py
@@ -53,6 +53,12 @@ class PytorchPaddleOCR(TextSystem):
        args = parser.parse_args(args)

        self.lang = kwargs.get('lang', 'ch')
+
+        device = get_device()
+        if device == 'cpu' and self.lang in ['ch', 'ch_server']:
+            logger.warning("The current device in use is CPU. To ensure the speed of parsing, the language is automatically switched to ch_lite.")
+            self.lang = 'ch_lite'
+
        if self.lang in latin_lang:
            self.lang = 'latin'
        elif self.lang in arabic_lang:
@@ -74,7 +80,7 @@ class PytorchPaddleOCR(TextSystem):
        kwargs['rec_char_dict_path'] = os.path.join(root_dir, 'pytorchocr', 'utils', 'resources', 'dict', dict_file)
        # kwargs['rec_batch_num'] = 8

-        kwargs['device'] = get_device()
+        kwargs['device'] = device

        default_args = vars(args)
        default_args.update(kwargs)
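Review note: the first hunk above switches `ch` and `ch_server` to `ch_lite` whenever the detected device is `cpu`. A minimal standalone sketch of that selection rule follows. `resolve_ocr_lang` is a hypothetical helper name, not a function in magic_pdf, and the script-family remapping (`latin`, `arabic`, and so on) that follows in the real constructor is omitted.

```python
def resolve_ocr_lang(lang: str = "ch", device: str = "cpu") -> str:
    """Return the OCR language key to load for a given device.

    Mirrors the rule in the diff: on CPU, the heavier Chinese models
    ('ch', 'ch_server') fall back to 'ch_lite'; everything else is unchanged.
    """
    if device == "cpu" and lang in ("ch", "ch_server"):
        return "ch_lite"
    return lang


if __name__ == "__main__":
    for device in ("cpu", "cuda"):
        for lang in ("ch", "ch_server", "en"):
            print(device, lang, "->", resolve_ocr_lang(lang, device))
```

As the hunk is written, an explicit `ch_server` request is also overridden on CPU, and the only signal to the user is the logged warning.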
--- a/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorchocr/utils/resources/arch_config.yaml
+++ b/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorchocr/utils/resources/arch_config.yaml
@@ -171,6 +171,31 @@ ch_PP-OCRv4_rec_server_infer:
          nrtr_dim: 384
          max_text_length: 25

+ch_PP-OCRv4_rec_server_doc_infer:
+  model_type: rec
+  algorithm: SVTR_HGNet
+  Transform:
+  Backbone:
+    name: PPHGNet_small
+  Head:
+    name: MultiHead
+    out_channels_list:
+      CTCLabelDecode: 15631
+    head_list:
+      - CTCHead:
+          Neck:
+            name: svtr
+            dims: 120
+            depth: 2
+            hidden_dims: 120
+            kernel_size: [ 1, 3 ]
+            use_guide: True
+          Head:
+            fc_decay: 0.00001
+      - NRTRHead:
+          nrtr_dim: 384
+          max_text_length: 25
+
 chinese_cht_PP-OCRv3_rec_infer:
  model_type: rec
  algorithm: SVTR
--- a/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorchocr/utils/resources/dict/ppocrv4_doc_dict.txt
+++ b/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorchocr/utils/resources/dict/ppocrv4_doc_dict.txt
--- a/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorchocr/utils/resources/models_config.yml
+++ b/magic_pdf/model/sub_modules/ocr/paddleocr2pytorch/pytorchocr/utils/resources/models_config.yml
@@ -3,10 +3,14 @@ lang:
    det: ch_PP-OCRv3_det_infer.pth
    rec: ch_PP-OCRv4_rec_infer.pth
    dict: ppocr_keys_v1.txt
-  ch:
+  ch_server:
    det: ch_PP-OCRv3_det_infer.pth
    rec: ch_PP-OCRv4_rec_server_infer.pth
    dict: ppocr_keys_v1.txt
+  ch:
+    det: ch_PP-OCRv3_det_infer.pth
+    rec: ch_PP-OCRv4_rec_server_doc_infer.pth
+    dict: ppocrv4_doc_dict.txt
  en:
    det: en_PP-OCRv3_det_infer.pth
    rec: en_PP-OCRv4_rec_infer.pth
--- a/magic_pdf/utils/office_to_pdf.py
+++ b/magic_pdf/utils/office_to_pdf.py
@@ -4,6 +4,8 @@ import platform
 from pathlib import Path
 import shutil

+from loguru import logger
+

 class ConvertToPdfError(Exception):
    def __init__(self, msg):
@@ -11,35 +13,24 @@ class ConvertToPdfError(Exception):
        super().__init__(self.msg)


-# Chinese font list
-REQUIRED_CHS_FONTS = ['SimSun', 'Microsoft YaHei', 'Noto Sans CJK SC']
-
-
 def check_fonts_installed():
    """Check if required Chinese fonts are installed."""
    system_type = platform.system()

-    if system_type == 'Windows':
-        # Windows: check fonts via registry or system font folder
-        font_dir = Path("C:/Windows/Fonts")
-        installed_fonts = [f.name for f in font_dir.glob("*.ttf")]
-        if any(font for font in REQUIRED_CHS_FONTS if any(font in f for f in installed_fonts)):
-            return True
-        raise EnvironmentError(
-            f"Missing Chinese font. Please install at least one of: {', '.join(REQUIRED_CHS_FONTS)}"
-        )
+    if system_type in ['Windows', 'Darwin']:
+        pass
    else:
-        # Linux/macOS: use fc-list
+        # Linux: use fc-list
        try:
            output = subprocess.check_output(['fc-list', ':lang=zh'], encoding='utf-8')
-            for font in REQUIRED_CHS_FONTS:
-                if font in output:
-                    return True
-            raise EnvironmentError(
-                f"Missing Chinese font. Please install at least one of: {', '.join(REQUIRED_CHS_FONTS)}"
-            )
-        except Exception as e:
-            raise EnvironmentError(f"Font detection failed. Please install 'fontconfig' and fonts: {str(e)}")
+            if output.strip():  # any output at all (non-empty) counts
+                return True
+            else:
+                logger.warning(
+                    f"No Chinese fonts were detected, the converted document may not display Chinese content properly."
+                )
+        except Exception:
+            pass


 def get_soffice_command():
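Review note: after this change `check_fonts_installed` returns `True` only when `fc-list` reports at least one Chinese font, and otherwise falls through and returns `None` (on Windows and macOS, when no font is found, and when `fc-list` raises). The sketch below separates those outcomes explicitly. `chinese_fonts_available` and its injectable `run` parameter are hypothetical, added here so the logic can be exercised without fontconfig installed; this is not the project's implementation.

```python
import subprocess
from typing import Callable, Optional


def chinese_fonts_available(
    run: Callable[..., str] = subprocess.check_output,
) -> Optional[bool]:
    """Return True/False if fc-list answered, or None if it could not be run."""
    try:
        output = run(["fc-list", ":lang=zh"], encoding="utf-8")
    except (OSError, subprocess.CalledProcessError):
        # fontconfig is missing or fc-list failed: the result is unknown.
        return None
    # Any non-empty output means at least one font matched :lang=zh.
    return bool(output.strip())
```

A caller could then log a warning only for `False` and stay silent for `None`, which matches the non-fatal intent of the refactor.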
--- a/projects/README.md
+++ b/projects/README.md
@@ -4,6 +4,6 @@

 - [llama_index_rag](./llama_index_rag/README.md): Build a lightweight RAG system based on llama_index
 - [gradio_app](./gradio_app/README.md): Build a web app based on gradio
-- [web_demo](./web_demo/README.md): MinerU online [demo](https://opendatalab.com/OpenSourceTools/Extractor/PDF/) localized deployment version
+- ~~[web_demo](./web_demo/README.md): MinerU online [demo](https://opendatalab.com/OpenSourceTools/Extractor/PDF/) localized deployment version~~(Deprecated)
 - [web_api](./web_api/README.md): Web API Based on FastAPI
 - [multi_gpu](./multi_gpu/README.md): Multi-GPU parallel processing based on LitServe
--- a/projects/README_zh-CN.md
+++ b/projects/README_zh-CN.md
@@ -4,6 +4,6 @@

 - [llama_index_rag](./llama_index_rag/README_zh-CN.md): 基于 llama_index 构建轻量级 RAG 系统
 - [gradio_app](./gradio_app/README_zh-CN.md): 基于 Gradio 的 Web 应用
-- [web_demo](./web_demo/README_zh-CN.md): MinerU在线[demo](https://opendatalab.com/OpenSourceTools/Extractor/PDF/)本地化部署版本
+- ~~[web_demo](./web_demo/README_zh-CN.md): MinerU在线[demo](https://opendatalab.com/OpenSourceTools/Extractor/PDF/)本地化部署版本~~(已过时)
 - [web_api](./web_api/README.md): 基于 FastAPI 的 Web API
 - [multi_gpu](./multi_gpu/README.md): 基于 LitServe 的多 GPU 并行处理
--- a/projects/gradio_app/app.py
+++ b/projects/gradio_app/app.py
@@ -158,7 +158,7 @@ devanagari_lang = [
        'hi', 'mr', 'ne', 'bh', 'mai', 'ang', 'bho', 'mah', 'sck', 'new', 'gom',  # noqa: E126
        'sa', 'bgc'
 ]
-other_lang = ['ch', 'en', 'korean', 'japan', 'chinese_cht', 'ta', 'te', 'ka']
+other_lang = ['ch', 'ch_lite', 'ch_server', 'en', 'korean', 'japan', 'chinese_cht', 'ta', 'te', 'ka']
 add_lang = ['latin', 'arabic', 'cyrillic', 'devanagari']

 # all_lang = ['', 'auto']
--- a/projects/gradio_app/examples/complex_layout.pdf
+++ b/projects/gradio_app/examples/complex_layout.pdf
--- a/projects/web_api/app.py
+++ b/projects/web_api/app.py
@@ -28,7 +28,7 @@ app = FastAPI()

 pdf_extensions = [".pdf"]
 office_extensions = [".ppt", ".pptx", ".doc", ".docx"]
-image_extensions = [".png", ".jpg"]
+image_extensions = [".png", ".jpg", ".jpeg"]

 class MemoryDataWriter(DataWriter):
    def __init__(self):
@@ -128,7 +128,7 @@ def process_file(
        Tuple[InferenceResult, PipeResult]: Returns inference result and pipeline result
    """

-    ds = Union[PymuDocDataset, ImageDataset]
+    ds: Union[PymuDocDataset, ImageDataset] = None
    if file_extension in pdf_extensions:
        ds = PymuDocDataset(file_bytes)
    elif file_extension in office_extensions:
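Review note: the old line `ds = Union[PymuDocDataset, ImageDataset]` was an assignment, so `ds` held the `typing` object itself until one of the branches overwrote it. The new line is an annotation with a `None` default. The sketch below shows the difference using placeholder classes in place of the real dataset types; using `Optional[...]` is a suggestion for strict type checkers, not what the diff does.

```python
from typing import Optional, Union


class PymuDocDataset:  # placeholder for the real class
    pass


class ImageDataset:  # placeholder for the real class
    pass


def old_style():
    # Assignment: binds the Union type object to ds.
    ds = Union[PymuDocDataset, ImageDataset]
    return ds


def new_style():
    # Annotation plus default: ds is None until a branch assigns a dataset.
    ds: Optional[Union[PymuDocDataset, ImageDataset]] = None
    return ds
```

With the old form, an unsupported file extension would leave `ds` bound to a type rather than to a dataset or `None`, which makes a later "nothing was loaded" check harder to write.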
--- a/signatures/version1/cla.json
+++ b/signatures/version1/cla.json
@@ -239,6 +239,14 @@
      "created_at": "2025-04-14T10:40:54Z",
      "repoId": 765083837,
      "pullRequestNo": 2226
+    },
+    {
+      "name": "vloum",
+      "id": 75369577,
+      "comment_id": 2811669681,
+      "created_at": "2025-04-17T03:54:59Z",
+      "repoId": 765083837,
+      "pullRequestNo": 2267
    }
  ]
 }
--- a/tests/test_cli/test_cli_sdk.py
+++ b/tests/test_cli/test_cli_sdk.py
@@ -323,44 +323,6 @@ class TestCli:
        logging.info(cmd)
        os.system(cmd)
    
-
-    @pytest.mark.P1
-    def test_local_magic_pdf_open_st_table(self):
-        """magic pdf cli open st table."""
-        time.sleep(2)
-        #pre_cmd = "cp ~/magic_pdf_st.json ~/magic-pdf.json"
-        value = {
-        "model": "struct_eqtable",
-        "enable": True,
-        "max_time": 400
-        }   
-        common.update_config_file(magic_pdf_config, "table-config", value)
-        pdf_path = os.path.join(pdf_dev_path, "pdf", "test_rearch_report.pdf")
-        common.delete_file(pdf_res_path)
-        cli_cmd = "magic-pdf -p %s -o %s" % (pdf_path, pdf_res_path)
-        os.system(cli_cmd)
-        res = common.check_html_table_exists(os.path.join(pdf_res_path, "test_rearch_report", "auto", "test_rearch_report.md"))
-        assert res is True
-  
-    @pytest.mark.P1
-    def test_local_magic_pdf_open_tablemaster_cuda(self):
-        """magic pdf cli open table master html table cuda mode."""
-        time.sleep(2)
-        #pre_cmd = "cp ~/magic_pdf_html.json ~/magic-pdf.json"
-        #os.system(pre_cmd)
-        value = {
-        "model": "tablemaster",
-        "enable": True,
-        "max_time": 400
-        }   
-        common.update_config_file(magic_pdf_config, "table-config", value)
-        pdf_path = os.path.join(pdf_dev_path, "pdf", "test_rearch_report.pdf")
-        common.delete_file(pdf_res_path)
-        cli_cmd = "magic-pdf -p %s -o %s" % (pdf_path, pdf_res_path)
-        os.system(cli_cmd)
-        res = common.check_html_table_exists(os.path.join(pdf_res_path, "test_rearch_report", "auto", "test_rearch_report.md"))
-        assert res is True
-    
    @pytest.mark.P1
    def test_local_magic_pdf_open_rapidai_table(self):
        """magic pdf cli open rapid ai table."""
@@ -370,6 +332,7 @@ class TestCli:
        value = {
        "model": "rapid_table",
        "enable": True,
+        "sub_model": "slanet_plus",
        "max_time": 400
        }   
        common.update_config_file(magic_pdf_config, "table-config", value)
@@ -397,6 +360,7 @@ class TestCli:
        os.system(cli_cmd)
        common.cli_count_folders_and_check_contents(os.path.join(pdf_res_path, "test_rearch_report", "auto"))

+    @pytest.mark.skip(reason="layoutlmv3废弃")
    @pytest.mark.P1
    def test_local_magic_pdf_layoutlmv3_yolo(self):
        """magic pdf cli open layoutlmv3."""
@@ -419,8 +383,9 @@ class TestCli:
        #pre_cmd = "cp ~/magic_pdf_html_table_cpu.json ~/magic-pdf.json"
        #os.system(pre_cmd)
        value = {
-        "model": "tablemaster",
-        "enable": False,
+        "model": "rapid_table",
+        "enable": True,
+        "sub_model": "slanet_plus",
        "max_time": 400
        }   
        common.update_config_file(magic_pdf_config, "table-config", value)
@@ -439,8 +404,9 @@ class TestCli:
        #pre_cmd = "cp ~/magic_pdf_close_table.json ~/magic-pdf.json"
        #os.system(pre_cmd)
        value = {
-        "model": "tablemaster",
+        "model": "rapid_table",
        "enable": False,
+        "sub_model": "slanet_plus",
        "max_time": 400
        }   
        common.update_config_file(magic_pdf_config, "table-config", value)
--- a/tests/unittest/test_table/test_rapidtable.py
+++ b/tests/unittest/test_table/test_rapidtable.py
@@ -1,4 +1,5 @@
 import unittest
+import os
 from PIL import Image
 from lxml import etree

@@ -8,7 +9,7 @@ from magic_pdf.model.sub_modules.table.rapidtable.rapid_table import RapidTableM

 class TestppTableModel(unittest.TestCase):
    def test_image2html(self):
-        img = Image.open("assets/table.jpg")
+        img = Image.open(os.path.join(os.path.dirname(__file__), "assets/table.jpg"))
        atom_model_manager = AtomModelSingleton()
        ocr_engine = atom_model_manager.get_atom_model(
            atom_model_name='ocr',
@@ -40,7 +41,7 @@ class TestppTableModel(unittest.TestCase):
        # Check the first row of data
        first_row = tree.xpath('//table/tr[2]/td')
        assert len(first_row) == 5, "First row should have 5 cells"
-        assert first_row[0].text and first_row[0].text.strip() == "SegLink[26]", "First cell should be 'SegLink[26]'"
+        assert first_row[0].text and 'SegLink' in first_row[0].text.strip(), "First cell should be 'SegLink [26]'"
        assert first_row[1].text and first_row[1].text.strip() == "70.0", "Second cell should be '70.0'"
        assert first_row[2].text and first_row[2].text.strip() == "86.0", "Third cell should be '86.0'"
        assert first_row[3].text and first_row[3].text.strip() == "77.0", "Fourth cell should be '77.0'"
Author	SHA1	Message	Date
Xiaomeng Zhao	dde90293f1	Merge pull request #2355 from opendatalab/master update version	2025-04-23 18:50:01 +08:00
myhloli	a24b9ed8fd	Merge remote-tracking branch 'origin/master'	2025-04-23 18:48:46 +08:00
myhloli	e0dc6c8473	docs(README): update changelog for version 1.3.8 release	2025-04-23 18:48:32 +08:00
myhloli	801d3ade19	Update version.py with new version	2025-04-23 10:41:07 +00:00
Xiaomeng Zhao	6b7a861e8f	Merge pull request #2354 from opendatalab/release-1.3.8 Release 1.3.8	2025-04-23 18:38:42 +08:00
Xiaomeng Zhao	9fbaee9e89	Merge pull request #2353 from myhloli/dev test(table): update test_rapidtable.py to handle SegLink text variations	2025-04-23 18:27:20 +08:00
myhloli	61fa95d4e0	test(table): update test_rapidtable.py to handle SegLink text variations - Modify assertion for first cell text to check for 'SegLink' instead of exact match - This change accommodates variations in SegLink text format	2025-04-23 18:26:19 +08:00
Xiaomeng Zhao	5c232f0587	Merge pull request #2352 from myhloli/dev feat(ocr): add new Chinese OCR model and update language support	2025-04-23 18:15:25 +08:00
myhloli	45f5082613	refactor(ocr): update device parameter handling in paddleocr2pytorch - Replace get_device() function call with direct 'device' variable usage - Simplify device configuration in OCR model initialization	2025-04-23 18:13:58 +08:00
myhloli	4f88fcaa51	feat(ocr): add new Chinese OCR model and update language support - Add new Chinese OCR model (ch_PP-OCRv4_rec_server_doc_infer) for server-side use - Update language support in app.py to include new Chinese model - Modify models_config.yml to add new model configuration	2025-04-23 18:06:12 +08:00
Xiaomeng Zhao	3cf1ea1f5b	Merge pull request #2316 from opendatalab/master master->dev	2025-04-22 19:28:21 +08:00
myhloli	d874563e38	Update version.py with new version	2025-04-22 11:27:25 +00:00
Xiaomeng Zhao	55fcb7387f	Merge pull request #2315 from opendatalab/release-1.3.7 Release 1.3.7	2025-04-22 19:26:03 +08:00
Xiaomeng Zhao	f2169686e1	Merge pull request #2314 from myhloli/dev refactor(table): replace ocr_engine with lang in table model prediction	2025-04-22 19:25:00 +08:00
myhloli	9c4e779b91	fix(lang|performance): resolve lang parameter issue and speed up OCR/table parsing - Fix lang parameter ineffectiveness during table parsing model initialization - Resolve significant slowdown in OCR and table parsing speed in CPU mode - Update changelog in README.md and README_zh-CN.md	2025-04-22 19:15:29 +08:00
myhloli	8d9070db10	fix(lang|performance): resolve lang parameter issue and speed up OCR/table parsing - Fix lang parameter ineffectiveness during table parsing model initialization - Resolve significant slowdown in OCR and table parsing speed in CPU mode - Update changelog in README.md and README_zh-CN.md	2025-04-22 19:15:01 +08:00
myhloli	69cdea908d	fix(ocr): switch to ch_lite model for Chinese OCR on CPU - Automatically change to ch_lite model when using CPU for Chinese OCR - This modification improves performance on CPU devices	2025-04-22 19:12:35 +08:00
myhloli	1d1c7ba9ab	refactor(table): replace ocr_engine with lang in table model prediction - Remove OCR engine instantiation inside the loop - Pass language directly to the table model instead of OCR engine - Simplify code structure and improve readability	2025-04-22 18:55:10 +08:00
myhloli	4d5fd0ee55	Update version.py with new version	2025-04-21 06:45:36 +00:00
Xiaomeng Zhao	601b44bfe0	Merge pull request #2298 from opendatalab/release-1.3.6 Release 1.3.6	2025-04-21 14:37:23 +08:00
Xiaomeng Zhao	012327badb	Merge pull request #2297 from myhloli/dev feat: add support for JPEG images and update documentation	2025-04-21 14:26:35 +08:00
myhloli	fcb5660f6a	feat: add support for JPEG images and update documentation - Add '.jpeg' to the list of supported image extensions in app.py and read_api.py - Update projects READMEs to indicate that web_demo is deprecated	2025-04-21 14:22:23 +08:00
myhloli	d105d87cf5	Merge remote-tracking branch 'origin/dev' into dev	2025-04-18 10:56:36 +08:00
myhloli	619b3b6d32	docs(README): update bug report template to reference Readme instead of Docs - Update the bug report template to direct users to search the MinerU Readme instead of Docs - This change ensures users check the most relevant and up-to-date information source before reporting issues	2025-04-18 10:56:26 +08:00
Xiaomeng Zhao	6fbbe3e6f0	Merge pull request #2274 from opendatalab/dev docs: update issue templates and disable blank issues	2025-04-17 18:46:05 +08:00
Xiaomeng Zhao	a47b17cd88	Merge pull request #2273 from myhloli/dev docs: update issue templates and disable blank issues	2025-04-17 18:45:26 +08:00
myhloli	737d7d6eb9	docs: update issue templates and disable blank issues - Update bug report template with more detailed instructions and sections - Add operating system version field to bug report - Include support for MPS in device options - Disable blank issues and provide alternative contact links - Remove feature request template	2025-04-17 18:44:20 +08:00
Xiaomeng Zhao	3492744ce1	Merge pull request #2269 from dt-yy/dev update test case	2025-04-17 15:23:19 +08:00
dt-yy	a1fe370270	update test case	2025-04-17 15:21:41 +08:00
dt-yy	fea756fd3e	update test case	2025-04-17 14:34:54 +08:00
dt-yy	e98988920e	update test case	2025-04-17 14:24:58 +08:00
github-actions[bot]	19fd2cfa37	@vloum has signed the CLA in opendatalab/MinerU#2267	2025-04-17 03:55:12 +00:00
Xiaomeng Zhao	74f9978e02	Merge pull request #2266 from opendatalab/master master->dev	2025-04-17 11:42:23 +08:00
myhloli	0c9572c871	Update version.py with new version	2025-04-17 03:34:11 +00:00
Xiaomeng Zhao	8fb6794b95	Merge pull request #2265 from opendatalab/release-1.3.5 Release 1.3.5	2025-04-17 11:31:24 +08:00
Xiaomeng Zhao	af53a46311	Merge pull request #2264 from myhloli/dev refactor(office_to_pdf): simplify font checking and add logging	2025-04-17 11:29:20 +08:00
myhloli	2e5e55cfe2	refactor(office_to_pdf): simplify font checking and add logging - Remove specific Chinese font list and detailed font checking - Add logging warning if no Chinese fonts are detected - Make font checking more robust and less platform-specific	2025-04-17 10:52:08 +08:00
myhloli	658e6bc768	refactor(utils): comment out Chinese font check on Windows - Temporarily disable Chinese font check for Windows systems - This change allows bypassing the font check when the required fonts are not present	2025-04-17 00:54:28 +08:00
myhloli	4641264e12	build(docker): update magic-pdf installation and add dependencies - Update magic-pdf installation to include specific version with full dependencies - Add numpy, decorator, attrs, absl-py, cloudpickle, ml-dtypes, tornado, and einops as separate packages - Specify numpy version to be less than 2	2025-04-17 00:16:20 +08:00
Xiaomeng Zhao	4bd3381c92	Merge pull request #2256 from myhloli/dev fix(test_table): update image path to use relative path	2025-04-16 18:24:37 +08:00
myhloli	f5a56bf157	fix(test_table): update image path to use relative path - Replace hardcoded image path with dynamic path generation - Use os.path.join to create platform-independent file paths - Improve code maintainability and portability across different environments	2025-04-16 18:23:13 +08:00
Xiaomeng Zhao	78d11172e3	Merge pull request #2255 from opendatalab/master master->dev	2025-04-16 18:12:06 +08:00
myhloli	a2b07bfde4	Update version.py with new version	2025-04-16 10:02:13 +00:00
Xiaomeng Zhao	1b35f04453	Merge pull request #2252 from opendatalab/release-1.3.4 Release 1.3.4	2025-04-16 18:00:29 +08:00
Xiaomeng Zhao	8f3c178003	build(docker): add torch and torchvision installation build(docker): add torch and torchvision installation	2025-04-15 09:57:25 +08:00
Xiaomeng Zhao	27883619f5	Merge pull request #2231 from myhloli/dev build(docker): add torch and torchvision installation	2025-04-15 09:56:47 +08:00