Commit Graph

  • 1864c54c8c fix: reset list state in DocxConverter to prevent order disruption when processing tables myhloli 2026-02-27 19:49:21 +08:00
  • 6a8d27246b feat: enhance heading detection logic in DocxConverter to accurately identify multi-level lists for title block conversion myhloli 2026-02-27 19:29:35 +08:00
  • 4bcf377669 fix: adjust title block level in DocxConverter for correct hierarchy myhloli 2026-02-27 18:39:34 +08:00
  • d2fd2af9fd feat: detect heading-style list numIds for title block conversion in DocxConverter myhloli 2026-02-27 18:36:28 +08:00
  • a215f2ee69 feat: enhance text block parsing to support font styles and improve hyperlink formatting in DocxConverter myhloli 2026-02-27 15:06:55 +08:00
  • 637dba5d8d @marswen has signed the CLA in opendatalab/MinerU#4555 github-actions[bot] 2026-02-27 06:26:42 +00:00
  • 031f6ec3bf fix: adjust iframe height and enhance HTML label visibility in document preview myhloli 2026-02-26 21:01:22 +08:00
  • 0fa13c87e8 feat: add HTML minification method to DocxConverter for cleaner output myhloli 2026-02-26 20:46:21 +08:00
  • 89683d024e feat: inject OMML equations into tables during DOCX to HTML conversion myhloli 2026-02-26 20:41:59 +08:00
  • 20527e3fa4 feat: enhance image handling in DocxConverter and HTML processing to support base64 images and improve table rendering myhloli 2026-02-26 19:36:30 +08:00
  • 446a965d16 fix: correct string formatting in image_to_b64str function for base64 encoding myhloli 2026-02-26 16:51:17 +08:00
  • 45a9da0d85 fix: enhance image handling in DocxConverter to support anchored images and adjust text placement myhloli 2026-02-26 16:42:02 +08:00
  • e10a87277a fix: update file preview URL structure to use gradio_api for improved access myhloli 2026-02-26 15:28:57 +08:00
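  • b64a41ff97 fix: add file preview feature, dynamically showing the document preview or hiding the option based on file type myhloli 2026-02-26 14:59:31 +08:00
  • afd892fd8c fix: change the target anchor format of index spans to a list to support multi-target links myhloli 2026-02-26 14:36:50 +08:00
  • c698f1e92f fix: add Markdown conversion support for index blocks and improve heading anchor generation myhloli 2026-02-26 12:28:08 +08:00
  • 9300aef884 fix: correct inconsistent colspan in HTML tables and enhance math-mode symbol handling myhloli 2026-02-26 02:44:06 +08:00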
  • f18d6ac7f9 fix: enhance math-mode symbol handling in latex_dict and omml for improved LaTeX output myhloli 2026-02-26 01:51:34 +08:00
  • 594fcaaf0f fix: enhance TOC processing with flat list detection and level correction myhloli 2026-02-25 22:08:45 +08:00
  • 6f493fbb9a fix: add index block handling and parsing for improved document structure myhloli 2026-02-25 21:34:47 +08:00
  • 4537e09991 fix: enhance office file handling and improve image base64 conversion in markdown myhloli 2026-02-25 19:54:41 +08:00
  • 2ad134c8aa fix: refactor list merging functions for improved markdown conversion and structure myhloli 2026-02-25 19:14:20 +08:00
  • a3b9ab01ce fix: enhance image handling in docx_converter and model_output_to_middle_json for better format support myhloli 2026-02-25 16:48:30 +08:00
  • 86116144e3 fix: refactor markdown conversion functions for improved title handling and content merging myhloli 2026-02-25 15:42:26 +08:00
  • efbb8384e1 fix: enhance section numbering extraction logic in model_output_to_middle_json.py myhloli 2026-02-25 12:45:27 +08:00
  • 5d56c93559 fix: ensure section end detection in docx_converter.py is accurate myhloli 2026-02-24 18:37:40 +08:00
  • 92570f1896 fix: remove unused hybrid analyze imports from common.py myhloli 2026-02-24 17:32:53 +08:00
  • ff9d8b23be fix: add bbox inclusion option for subject-object association logic myhloli 2026-02-24 16:52:49 +08:00
  • 61557b00a7 Merge pull request #24 from myhloli/add_docx Xiaomeng Zhao 2026-02-24 16:37:49 +08:00
  • 3a678548d9 Merge branch 'dev' into add_docx Xiaomeng Zhao 2026-02-24 16:37:29 +08:00
  • 39464aaa6a Merge remote-tracking branch 'origin/add_docx' into add_docx myhloli 2026-02-24 16:28:20 +08:00
  • 0c66f933ac feat: remove unused list block methods to streamline DOCX processing myhloli 2026-02-24 16:28:07 +08:00
  • 8e0cf05222 Update pyproject.toml Xiaomeng Zhao 2026-02-24 16:24:34 +08:00
  • a7634b9fcd Update mineru/utils/enum_class.py Xiaomeng Zhao 2026-02-24 16:23:16 +08:00
  • eeab175f5c Merge pull request #4544 from myhloli/dev Xiaomeng Zhao 2026-02-24 16:02:36 +08:00
  • d5fed5ec0e feat: update file paths in DOCX processing scripts for improved accessibility myhloli 2026-02-24 15:59:51 +08:00
  • c19c9e53ff feat: update DOCX processing to support grouped images and enhance text handling in paragraphs myhloli 2026-02-24 15:49:47 +08:00
  • 15a1c373aa fix: normalize title block content formatting in hybrid and vlm magic models myhloli 2026-02-24 12:33:42 +08:00
  • 9ec96f17f2 feat: enhance DOCX processing to classify captions based on preceding table or image blocks myhloli 2026-02-24 11:12:13 +08:00
  • 2e64d632db feat: enhance DOCX processing by adding bbox inclusion option and removing unused block types myhloli 2026-02-09 18:39:31 +08:00
  • db26572058 feat: update DOCX processing to unify content handling and streamline block type management myhloli 2026-02-09 18:21:15 +08:00
  • a12610fb3e Merge pull request #4526 from myhloli/dev Xiaomeng Zhao 2026-02-09 17:44:40 +08:00
  • 97f257a8ab Merge pull request #4525 from myhloli/dev Xiaomeng Zhao 2026-02-09 17:43:10 +08:00
  • 53aad4c900 fix: improve formatting of VastAI reference in index.md myhloli 2026-02-09 17:41:50 +08:00
  • 345c46a457 fix: update documentation to include Biren platform details myhloli 2026-02-09 17:38:15 +08:00
  • c2fa06f606 feat: enhance DOCX processing to support inline equations and hyperlinks in text blocks myhloli 2026-02-09 17:29:11 +08:00
  • e460f33c95 Merge pull request #4523 from boshi91/dev Xiaomeng Zhao 2026-02-09 16:14:06 +08:00
  • e9091876b6 feat: add Biren platform documentation for vLLM support boshi91 2026-02-09 16:04:19 +08:00
  • c68dc3682a Merge pull request #4518 from myhloli/dev Xiaomeng Zhao 2026-02-09 10:51:03 +08:00
  • 40796b9a7e Merge remote-tracking branch 'origin/dev' into dev myhloli 2026-02-09 10:50:23 +08:00
  • 31122e655b fix: update index.md to improve AMD reference formatting myhloli 2026-02-09 10:50:07 +08:00
  • 3eef5157f8 Merge pull request #4513 from opendatalab/master Xiaomeng Zhao 2026-02-06 19:19:07 +08:00
  • 5cc95f3760 Update version.py with new version myhloli 2026-02-06 03:35:08 +00:00
  • e31c0ec34d Merge pull request #4508 from opendatalab/release-2.7.6 mineru-2.7.6-released Xiaomeng Zhao 2026-02-06 11:32:49 +08:00
  • 3e51cb4e81 Merge pull request #4507 from myhloli/dev release-2.7.6 Xiaomeng Zhao 2026-02-06 11:28:40 +08:00
  • bc63b17ae4 fix: update README and index to reflect support for Kunlunxin and Tecorigin platforms myhloli 2026-02-06 10:59:38 +08:00
  • 7f986fc1e3 Merge pull request #4505 from myhloli/dev Xiaomeng Zhao 2026-02-06 01:10:28 +08:00
  • 9d2f5f3012 @wzgrx has signed the CLA in opendatalab/MinerU#4504 github-actions[bot] 2026-02-05 15:26:55 +00:00
  • 5fb8d50b70 fix: update Tecorigin.md to reflect correct CPU and GPU support information myhloli 2026-02-05 20:46:59 +08:00
  • 3ce9500894 fix: update index.md to include Kunlunxin and reorder Tecorigin reference myhloli 2026-02-05 20:40:49 +08:00
  • 142dc30a03 Merge pull request #4503 from myhloli/dev Xiaomeng Zhao 2026-02-05 20:19:54 +08:00
  • 5e3db4a472 fix: update MinerU support references for Kunlunxin acceleration cards in documentation myhloli 2026-02-05 19:43:48 +08:00
  • 90b77a2809 feat: add chunked prefill and prefix caching options to utils.py myhloli 2026-02-05 18:10:25 +08:00
  • 948161c527 fix: remove outdated tips regarding MinerU support for Cambricon acceleration cards in Kunlunxin.md myhloli 2026-02-05 16:58:39 +08:00
  • 5397c74a34 Merge pull request #4500 from myhloli/dev Xiaomeng Zhao 2026-02-05 15:46:32 +08:00
  • 97450688d6 fix: update status indicators in documentation and improve config handling in utils.py myhloli 2026-02-05 15:09:24 +08:00
  • 6e7c6b082d feat: add interline region filtering option to batch_predict method myhloli 2026-02-05 14:51:05 +08:00
  • 6f281be4ff fix: remove outdated notes and unnecessary lines in Tecorigin.md myhloli 2026-02-05 14:40:44 +08:00
  • 880cdd02b2 Merge branch 'opendatalab:dev' into dev Xiaomeng Zhao 2026-02-05 14:30:19 +08:00
  • 73b31d1118 feat: add Kunlunxin platform documentation and Dockerfile for vLLM support myhloli 2026-02-05 14:25:43 +08:00
  • 74ec4894e0 Merge pull request #4498 from Arrmsgt/master Xiaomeng Zhao 2026-02-05 14:24:26 +08:00
  • 238c1ef3a1 @Arrmsgt has signed the CLA in opendatalab/MinerU#4498 github-actions[bot] 2026-02-05 05:40:16 +00:00
  • c1022fc3e2 update Tecorigin.md Arrmsgt 2026-02-05 12:40:53 +08:00
  • 6270b05d3a update Tecorigin.md Arrmsgt 2026-02-05 12:33:54 +08:00
  • bbd214dbc3 Merge pull request #4475 from opendatalab/master Xiaomeng Zhao 2026-02-02 20:08:32 +08:00
  • 5fa66202a7 Update version.py with new version myhloli 2026-02-02 11:56:17 +00:00
  • 4dc45f6621 Merge pull request #4474 from opendatalab/dev mineru-2.7.5-released Xiaomeng Zhao 2026-02-02 19:54:52 +08:00
  • 65b3204d5a Merge pull request #4473 from myhloli/dev Xiaomeng Zhao 2026-02-02 19:53:45 +08:00
  • 636bd89b38 fix: remove unnecessary blank line in os_env_config.py myhloli 2026-02-02 19:49:30 +08:00
  • 586a4fb06b feat: enhance PDF rendering options with thread count and timeout details myhloli 2026-02-02 19:30:03 +08:00
  • 951ebd8c04 feat: add support for configurable thread count in PDF rendering myhloli 2026-02-02 19:21:09 +08:00
  • 8fe856f969 feat: update DOCX processing to replace list_items with content for improved structure myhloli 2026-02-02 17:22:49 +08:00
  • 30758634e3 Merge pull request #4471 from myhloli/dev Xiaomeng Zhao 2026-02-02 15:21:42 +08:00
  • aa960b105a fix: update accelerator card usage instructions across multiple documentation files myhloli 2026-02-02 15:17:59 +08:00
  • 3124678a20 Merge pull request #4468 from myhloli/add_docx add_docx Xiaomeng Zhao 2026-02-02 11:23:43 +08:00
  • ac2099a332 Merge branch 'opendatalab:add_docx' into add_docx Xiaomeng Zhao 2026-02-02 11:23:10 +08:00
  • eba787c22b Merge pull request #4459 from opendatalab/dev Xiaomeng Zhao 2026-01-30 23:36:54 +08:00
  • 0a288743ba Merge pull request #4458 from myhloli/dev Xiaomeng Zhao 2026-01-30 23:35:01 +08:00
  • 538280f589 Fix duplicate instruction in Cambricon.md Xiaomeng Zhao 2026-01-30 23:32:43 +08:00
  • 0af0080c85 Merge pull request #4457 from myhloli/dev Xiaomeng Zhao 2026-01-30 23:21:15 +08:00
  • 25058ea982 Update IluvatarCorex.md Xiaomeng Zhao 2026-01-30 23:17:58 +08:00
  • 41f0e3e26d Update Cambricon.md Xiaomeng Zhao 2026-01-30 23:17:38 +08:00
  • 0624f7eb5b Merge pull request #4456 from opendatalab/master Xiaomeng Zhao 2026-01-30 21:56:20 +08:00
  • 4fef9e863c Merge pull request #4455 from myhloli/dev Xiaomeng Zhao 2026-01-30 21:55:43 +08:00
  • 97d1a9b1ed fix: update Cambricon documentation to correct accelerator card reference myhloli 2026-01-30 21:54:56 +08:00
  • d17a5ff7f2 Merge pull request #4454 from myhloli/dev Xiaomeng Zhao 2026-01-30 21:48:06 +08:00
  • 7448029a9d Merge pull request #4448 from Sidney233/add_pptx Xiaomeng Zhao 2026-01-30 21:47:13 +08:00
  • 47c207a906 Update version.py with new version myhloli 2026-01-30 13:45:23 +00:00
  • a91c35137a fix: correct formatting in Cambricon documentation for clarity myhloli 2026-01-30 21:44:27 +08:00
  • c2c998ae11 Merge pull request #4453 from opendatalab/release-2.7.4 mineru-2.7.4-released Xiaomeng Zhao 2026-01-30 21:43:18 +08:00