diff --git a/README.md b/README.md
index 6dc6b9a1..83ac47df 100644
--- a/README.md
+++ b/README.md
@@ -44,6 +44,10 @@
 
 # Changelog
 
+- 2025/11/04 2.6.4 Release
+  - Added timeout configuration for PDF image rendering, default is 300 seconds, can be configured via environment variable `MINERU_PDF_RENDER_TIMEOUT` to prevent long blocking of the rendering process caused by some abnormal PDF files.
+  - Added CPU thread count configuration options for ONNX models, default is the system CPU core count, can be configured via environment variables `MINERU_INTRA_OP_NUM_THREADS` and `MINERU_INTER_OP_NUM_THREADS` to reduce CPU resource contention conflicts in high concurrency scenarios.
+
 - 2025/10/31 2.6.3 Release
   - Added support for a new backend `vlm-mlx-engine`, enabling MLX-accelerated inference for the MinerU2.5 model on Apple Silicon devices. Compared to the `vlm-transformers` backend, `vlm-mlx-engine` delivers a 100%–200% speed improvement.
   - Bug fixes: #3849, #3859
diff --git a/README_zh-CN.md b/README_zh-CN.md
index b36f7332..e9dea28a 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -44,6 +44,10 @@
 
 # 更新记录
 
+- 2025/11/04 2.6.4 发布
+  - 为pdf渲染图片增加超时配置,默认为300秒,可通过环境变量`MINERU_PDF_RENDER_TIMEOUT`进行配置,防止部分异常pdf文件导致渲染过程长时间阻塞。
+  - 为onnx模型增加cpu线程数配置选项,默认为系统cpu核心数,可通过环境变量`MINERU_INTRA_OP_NUM_THREADS`和`MINERU_INTER_OP_NUM_THREADS`进行配置,以减少高并发场景下的对cpu资源的抢占冲突。
+
 - 2025/10/31 2.6.3 发布
   - 增加新后端`vlm-mlx-engine`支持,在Apple Silicon设备上支持使用`MLX`加速`MinerU2.5`模型推理,相比`vlm-transformers`后端,`vlm-mlx-engine`后端速度提升100%~200%。
   - bug修复: #3849 #3859
diff --git a/docs/en/usage/cli_tools.md b/docs/en/usage/cli_tools.md
index 5b81027e..ccffbc43 100644
--- a/docs/en/usage/cli_tools.md
+++ b/docs/en/usage/cli_tools.md
@@ -100,7 +100,7 @@ Here are the environment variables and their descriptions:
   * Used to enable table merging functionality
   * Default is `true`, can be set to `false` via environment variable to disable table merging functionality.
 
-- `MINERU_PDF_LOAD_IMAGES_TIMEOUT`:
+- `MINERU_PDF_RENDER_TIMEOUT`:
   * Used to set the timeout period (in seconds) for rendering PDF to images
   * Default is `300` seconds, can be set to other values via environment variable to adjust the image rendering timeout.
 
diff --git a/docs/zh/usage/cli_tools.md b/docs/zh/usage/cli_tools.md
index 15d4d2eb..218da1e2 100644
--- a/docs/zh/usage/cli_tools.md
+++ b/docs/zh/usage/cli_tools.md
@@ -95,7 +95,7 @@ MinerU命令行工具的某些参数存在相同功能的环境变量配置,
   * 用于启用表格合并功能
   * 默认为`true`,可通过环境变量设置为`false`来禁用表格合并功能。
 
-- `MINERU_PDF_LOAD_IMAGES_TIMEOUT`:
+- `MINERU_PDF_RENDER_TIMEOUT`:
   * 用于设置将PDF渲染为图片的超时时间(秒)
   * 默认为`300`秒,可通过环境变量设置为其他值以调整渲染图片的超时时间。
 
diff --git a/mineru/utils/os_env_config.py b/mineru/utils/os_env_config.py
index 684976ca..43d01334 100644
--- a/mineru/utils/os_env_config.py
+++ b/mineru/utils/os_env_config.py
@@ -7,7 +7,7 @@ def get_op_num_threads(env_name: str) -> int:
 
 
 def get_load_images_timeout() -> int:
-    env_value = os.getenv('MINERU_PDF_LOAD_IMAGES_TIMEOUT', None)
+    env_value = os.getenv('MINERU_PDF_RENDER_TIMEOUT', None)
     return get_value_from_string(env_value, 300)
 
 
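
Reviewer note: the body of `get_value_from_string` is not part of this diff, so its exact fallback rules are unknown. As a rough sketch of the behaviour the changelog describes, the lookups might resemble the Python below. It assumes that unset, non-integer, or non-positive values fall back to the default, which the patch does not confirm. `read_int_env`, `render_timeout_seconds`, and `onnx_thread_counts` are hypothetical names used only for illustration; only the three environment variable names and the 300-second default come from the patch.

```python
import os


def read_int_env(name: str, default: int) -> int:
    """Return environment variable `name` as an int, or `default`.

    Assumed policy (not taken from the patch): fall back to `default`
    when the variable is unset, not an integer, or not positive.
    """
    raw = os.getenv(name)
    if raw is None:
        return default
    try:
        value = int(raw.strip())
    except ValueError:
        return default
    return value if value > 0 else default


def render_timeout_seconds() -> int:
    # 300 seconds is the default stated in the changelog entry.
    return read_int_env("MINERU_PDF_RENDER_TIMEOUT", 300)


def onnx_thread_counts() -> tuple[int, int]:
    # The changelog gives the system CPU core count as the default.
    # os.cpu_count() can return None, so fall back to 1 in that case.
    cores = os.cpu_count() or 1
    intra = read_int_env("MINERU_INTRA_OP_NUM_THREADS", cores)
    inter = read_int_env("MINERU_INTER_OP_NUM_THREADS", cores)
    return intra, inter
```

With a policy like this, a deployment that runs several workers per host could export `MINERU_INTRA_OP_NUM_THREADS=2` to cap each worker, and a mistyped value would leave the default in place rather than raise at startup.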