fix: update documentation for mineru-api and improve concurrency settings

This commit is contained in:
myhloli
2025-12-02 01:29:10 +08:00
parent 0bf3ed7970
commit e36ef652ee
5 changed files with 25 additions and 5 deletions

View File

@@ -44,6 +44,13 @@
</div>
# Changelog
- 2025/12/02 2.6.6 Release
- `mineru-api` tool optimizations
- Added descriptive text to `mineru-api` interface parameters to improve API documentation readability.
- You can use the environment variable `MINERU_API_ENABLE_FASTAPI_DOCS` to control whether the auto-generated interface documentation page is enabled (enabled by default).
- Added concurrency configuration options for the `vlm-vllm-async-engine`, `vlm-lmdeploy-engine`, and `vlm-http-client` backends. Users can use the environment variable `MINERU_API_MAX_CONCURRENT_REQUESTS` to set the maximum number of concurrent API requests (unlimited by default).
- 2025/11/26 2.6.5 Release
- Added support for a new backend, `vlm-lmdeploy-engine`. Its usage is similar to `vlm-vllm-(async)engine`, but it uses `lmdeploy` as the inference engine and, unlike `vllm`, additionally supports native inference acceleration on Windows.
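The changelog entries above describe an environment-variable concurrency limit where `0` (the default) means unlimited. A minimal sketch of how `MINERU_API_MAX_CONCURRENT_REQUESTS` could be turned into a request semaphore — the helper name `build_request_semaphore` is hypothetical, not MinerU's API:

```python
import asyncio
import os

def build_request_semaphore():
    """Return a semaphore capping concurrent requests, or None for unlimited.

    Assumes the convention from the changelog: a value <= 0 (the default "0")
    means no limit on concurrent API requests.
    """
    try:
        limit = int(os.getenv("MINERU_API_MAX_CONCURRENT_REQUESTS", "0"))
    except ValueError:
        limit = 0  # Malformed values fall back to "unlimited".
    return asyncio.Semaphore(limit) if limit > 0 else None

os.environ["MINERU_API_MAX_CONCURRENT_REQUESTS"] = "4"
sem = build_request_semaphore()
```

Request handlers would then acquire `sem` (when it is not `None`) before doing work, so at most four requests run at once.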
@@ -797,6 +804,8 @@ Currently, some models in this project are trained based on YOLO. However, since
- [pdfminer.six](https://github.com/pdfminer/pdfminer.six)
- [pypdf](https://github.com/py-pdf/pypdf)
- [magika](https://github.com/google/magika)
- [vLLM](https://github.com/vllm-project/vllm)
- [LMDeploy](https://github.com/InternLM/lmdeploy)
# Citation

View File

@@ -45,6 +45,15 @@
# Changelog
- 2025/12/02 2.6.6 Release
- `Ascend` adaptation optimizations
- Optimized the command-line tool initialization flow so that the `vlm-vllm-engine` backend from the Ascend adaptation scheme is usable in the command-line tool.
- Updated the adaptation documentation for Atlas 300I Duo (310p) devices.
- `mineru-api` tool optimizations
- Added descriptive text to `mineru-api` interface parameters to improve API documentation readability.
- You can use the environment variable `MINERU_API_ENABLE_FASTAPI_DOCS` to control whether the auto-generated interface documentation page is enabled (enabled by default).
- Added concurrency configuration options for the `vlm-vllm-async-engine`, `vlm-lmdeploy-engine`, and `vlm-http-client` backends. Users can use the environment variable `MINERU_API_MAX_CONCURRENT_REQUESTS` to set the maximum number of concurrent API requests (unlimited by default).
- 2025/11/26 2.6.5 Release
- Added support for a new backend, `vlm-lmdeploy-engine`. Its usage is similar to `vlm-vllm-(async)engine`, but it uses `lmdeploy` as the inference engine and, unlike `vllm`, additionally supports native inference acceleration on Windows.
- Added adaptation support for the domestic compute platforms `Ascend/npu`, `T-Head/ppu`, and `MetaX/maca`. Users can run the `pipeline` and `vlm` models on these platforms and accelerate vlm model inference with the `vllm`/`lmdeploy` engines; for details, see [Other Accelerator Adaptation](https://opendatalab.github.io/MinerU/zh/usage/).
@@ -791,6 +800,8 @@ mineru -p <input_path> -o <output_path>
- [pdfminer.six](https://github.com/pdfminer/pdfminer.six)
- [pypdf](https://github.com/py-pdf/pypdf)
- [magika](https://github.com/google/magika)
- [vLLM](https://github.com/vllm-project/vllm)
- [LMDeploy](https://github.com/InternLM/lmdeploy)
# Citation

View File

@@ -1,6 +1,6 @@
# Base image configured with vLLM or LMDeploy; choose one according to your actual needs. Requires ARM(AArch64) CPU + Ascend NPU.
# Base image containing the vLLM inference environment, requiring ARM(AArch64) CPU + Ascend NPU.
FROM quay.io/ascend/vllm-ascend:v0.11.0rc2
FROM quay.m.daocloud.io/ascend/vllm-ascend:v0.11.0rc2
# Base image containing the LMDeploy inference environment, requiring ARM(AArch64) CPU + Ascend NPU.
# FROM crpi-4crprmm5baj1v8iv.cn-hangzhou.personal.cr.aliyuncs.com/lmdeploy_dlinfer/ascend:mineru-a2

View File

@@ -186,9 +186,9 @@ docker run -u root --name mineru_docker --privileged=true \
🔴: Not supported, does not run, or has significant accuracy gaps
>[!NOTE]
>When starting the mineru-api service from the vllm image, if you first parse with the pipeline backend and then switch to the vlm-vllm-async-engine backend, a vllm-ascend bug causes the service to exit abnormally
>When starting the mineru-api service from the vllm image, if you first parse with the pipeline backend and then switch to the vlm-vllm-async-engine backend, vllm engine initialization fails
>To use both the pipeline and vlm-vllm-async-engine backends in a single mineru-api service, parse once with the vlm-vllm-async-engine backend first; after that you can switch backends freely.
>If you encounter errors or exceptions when switching the inference backend type in the service, simply restart the service.
>[!TIP]
>Specifying which NPU accelerator cards are available works much like it does for NVIDIA GPUs; see [ASCEND_RT_VISIBLE_DEVICES](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/850alpha001/maintenref/envvar/envref_07_0028.html)
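Analogous to `CUDA_VISIBLE_DEVICES` on NVIDIA GPUs, restricting the service to specific Ascend cards could look like the following sketch; the device indices `0,1` are illustrative:

```shell
# Make only NPU cards 0 and 1 visible to the process that follows.
export ASCEND_RT_VISIBLE_DEVICES=0,1
```

Set the variable in the shell (or `docker run -e ...`) before launching mineru-api so the inference engine only enumerates the listed cards.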

View File

@@ -39,7 +39,7 @@ async def limit_concurrency():
def create_app():
# By default, the OpenAPI documentation endpoints (openapi_url, docs_url, redoc_url) are enabled.
# To disable the FastAPI docs and schema endpoints, set the environment variable ENABLE_FASTAPI_DOCS=0.
# To disable the FastAPI docs and schema endpoints, set the environment variable MINERU_API_ENABLE_FASTAPI_DOCS=0.
enable_docs = str(os.getenv("MINERU_API_ENABLE_FASTAPI_DOCS", "1")).lower() in ("1", "true", "yes")
app = FastAPI(
openapi_url="/openapi.json" if enable_docs else None,
@@ -47,7 +47,7 @@ def create_app():
redoc_url="/redoc" if enable_docs else None,
)
# Initialize the concurrency controller: read from the environment variable, avoiding undefined kwargs
# Initialize the concurrency controller: read from the environment variable MINERU_API_MAX_CONCURRENT_REQUESTS
global _request_semaphore
try:
max_concurrent_requests = int(os.getenv("MINERU_API_MAX_CONCURRENT_REQUESTS", "0"))
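The hunk header above references an `async def limit_concurrency()` dependency whose body is not shown in this diff. A minimal standalone sketch of the gating logic it presumably implements — `init_semaphore`, `handle_request`, and `main` are illustrative names, not MinerU code:

```python
import asyncio
from contextlib import asynccontextmanager

_request_semaphore = None  # None means unlimited, matching the diff's default of "0"

def init_semaphore(limit: int) -> None:
    # Mirrors create_app(): a non-positive limit leaves requests unlimited.
    global _request_semaphore
    _request_semaphore = asyncio.Semaphore(limit) if limit > 0 else None

@asynccontextmanager
async def limit_concurrency():
    # Pass through when unlimited; otherwise gate entry on the semaphore.
    if _request_semaphore is None:
        yield
    else:
        async with _request_semaphore:
            yield

async def handle_request(results):
    # Simulated request handler; at most `limit` of these run concurrently.
    async with limit_concurrency():
        await asyncio.sleep(0)
        results.append("done")

async def main():
    init_semaphore(2)
    results = []
    await asyncio.gather(*(handle_request(results) for _ in range(5)))
    return results

out = asyncio.run(main())
```

With a limit of 2, the five simulated requests still all complete; the semaphore only serializes them two at a time rather than rejecting the overflow.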