Compare commits

9 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Xiaomeng Zhao | d7011f42e2 | Merge pull request #4719 from opendatalab/master (master->Dev) | 2026-04-01 21:29:00 +08:00 |
| myhloli | ede8d95bf1 | Update version.py with new version | 2026-04-01 13:20:54 +00:00 |
| Xiaomeng Zhao | 54b68d4bf1 | Merge pull request #4718 from opendatalab/dev (3.0.7) | 2026-04-01 21:14:24 +08:00 |
| Xiaomeng Zhao | 1b478c24cf | Merge pull request #4717 from myhloli/dev (fix: strip newline characters from paragraph text in office_middle_json_mkcontent) | 2026-04-01 21:13:47 +08:00 |
| myhloli | 39b62cc76a | fix: strip newline characters from paragraph text in office_middle_json_mkcontent | 2026-04-01 21:12:14 +08:00 |
| Xiaomeng Zhao | 13465ff43f | Merge pull request #4716 from opendatalab/master (master->dev) | 2026-04-01 20:53:24 +08:00 |
| myhloli | d18b7df766 | Update version.py with new version | 2026-04-01 12:52:13 +00:00 |
| Xiaomeng Zhao | a97753c86f | Merge pull request #4715 from myhloli/dev (fix: correct formatting of usage instructions in quick_usage.md) | 2026-04-01 20:51:04 +08:00 |
| myhloli | a3b65470cf | fix: correct formatting of usage instructions in quick_usage.md | 2026-04-01 20:50:07 +08:00 |
4 changed files with 14 additions and 14 deletions


@@ -43,12 +43,12 @@ If you need to adjust parsing options through custom parameters, you can also ch
>- API outputs are controlled by the server and written to `./output` by default
>- Uploads currently support `PDF`, image, and `DOCX` files
>
->`POST /tasks` returns immediately with a `task_id`. `POST /file_parse` uses the same task manager internally, waits for the task to finish, and then returns the final result synchronously.
->When a task is waiting in the queue, both the submission response and task-status response may include `queued_ahead` to indicate how many tasks are ahead of it.
->Tasks are tracked only in-process for a single `mineru-api` instance. Task status is not preserved across service restarts, `--reload`, or multi-process deployments.
->Completed or failed tasks are retained for 24 hours by default, then their task state and output directory are cleaned automatically. After cleanup, task status and result endpoints return `404`.
->Use `MINERU_API_TASK_RETENTION_SECONDS` and `MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS` to adjust retention and cleanup polling intervals.
->Use `--enable-vlm-preload true` to warm up the local VLM model during service startup instead of waiting for the first VLM or hybrid request.
+>- `POST /tasks` returns immediately with a `task_id`. `POST /file_parse` uses the same task manager internally, waits for the task to finish, and then returns the final result synchronously.
+>- When a task is waiting in the queue, both the submission response and task-status response may include `queued_ahead` to indicate how many tasks are ahead of it.
+>- Tasks are tracked only in-process for a single `mineru-api` instance. Task status is not preserved across service restarts, `--reload`, or multi-process deployments.
+>- Completed or failed tasks are retained for 24 hours by default, then their task state and output directory are cleaned automatically. After cleanup, task status and result endpoints return `404`.
+>- Use `MINERU_API_TASK_RETENTION_SECONDS` and `MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS` to adjust retention and cleanup polling intervals.
+>- Use `--enable-vlm-preload true` to warm up the local VLM model during service startup instead of waiting for the first VLM or hybrid request.
>
>Asynchronous task submission example:
>```bash
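The task lifecycle described in this hunk (submit, optionally wait in the queue, finish, then disappear after cleanup) can be sketched as a small client-side polling loop. This is a minimal sketch, not the project's client: the `status` field, the state names `completed` and `failed`, and the `fetch_status` callable are all hypothetical, since the hunk does not show the status endpoint's path or response schema. Only `queued_ahead` and the `404` after cleanup come from the text above.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical state names; the real mineru-api values may differ.
TERMINAL_STATES = {"completed", "failed"}


def poll_task(
    fetch_status: Callable[[str], Tuple[int, dict]],
    task_id: str,
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until the task reaches a terminal state.

    fetch_status(task_id) must return (http_status_code, json_body).
    Returns the final body, or None when the server answers 404, which
    the notes above say happens once a task has been cleaned up.
    Raises TimeoutError if no terminal state is seen within max_polls.
    """
    for _ in range(max_polls):
        code, body = fetch_status(task_id)
        if code == 404:
            return None
        if body.get("status") in TERMINAL_STATES:
            return body
        # queued_ahead is optional ("may include"), so read it defensively.
        ahead = body.get("queued_ahead")
        if ahead is not None:
            print(f"{task_id}: {ahead} task(s) ahead in the queue")
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Passing `fetch_status` in as a callable keeps the sketch independent of any particular HTTP library and of the actual status URL; a real caller would wrap a GET request against whatever status endpoint the service exposes.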


@@ -43,12 +43,12 @@ mineru -p <input_path> -o <output_path>
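>- The API output directory is fixed by the server and defaults to `./output`
>- Uploads currently support `PDF`, image, and `DOCX` files
>
->`POST /tasks` returns a `task_id` immediately; `POST /file_parse` submits to the same task manager internally, waits for the task to finish, and then returns the final result synchronously.
->When a task is queued, both the submission response and the status-query response may include a `queued_ahead` field indicating the number of tasks ahead of it in the queue.
->Task state is held in-process by a single process; after a service restart, a `--reload` hot reload, or in a multi-process deployment, historical task status is not guaranteed to remain queryable.
->By default, tasks are retained for 24 hours after they complete or fail, after which their task state and output directory are cleaned up automatically; after cleanup, requests for task status or results return `404`.
->Use the environment variables `MINERU_API_TASK_RETENTION_SECONDS` and `MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS` to adjust the retention period and the cleanup polling interval.
->Use `--enable-vlm-preload true` to warm up the local VLM model at service startup, so that it is not initialized on the first VLM or hybrid request.
+>- `POST /tasks` returns a `task_id` immediately; `POST /file_parse` submits to the same task manager internally, waits for the task to finish, and then returns the final result synchronously.
+>- When a task is queued, both the submission response and the status-query response may include a `queued_ahead` field indicating the number of tasks ahead of it in the queue.
+>- Task state is held in-process by a single process; after a service restart, a `--reload` hot reload, or in a multi-process deployment, historical task status is not guaranteed to remain queryable.
+>- By default, tasks are retained for 24 hours after they complete or fail, after which their task state and output directory are cleaned up automatically; after cleanup, requests for task status or results return `404`.
+>- Use the environment variables `MINERU_API_TASK_RETENTION_SECONDS` and `MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS` to adjust the retention period and the cleanup polling interval.
+>- Use `--enable-vlm-preload true` to warm up the local VLM model at service startup, so that it is not initialized on the first VLM or hybrid request.
>
>Asynchronous task submission example: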
>```bash
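The retention rule in this hunk (a 24-hour default, overridable through `MINERU_API_TASK_RETENTION_SECONDS`) can be illustrated with a short sketch. The environment variable name and the default come from the text; the helper names `retention_seconds` and `is_expired`, and the choice to fall back to the default on missing, non-numeric, or non-positive values, are assumptions made for illustration and may not match how `mineru-api` parses the variable.

```python
import os
from typing import Mapping, Optional

DEFAULT_RETENTION_SECONDS = 24 * 60 * 60  # 24 hours, the documented default


def retention_seconds(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the retention period, falling back to the 24-hour default."""
    env = os.environ if env is None else env
    raw = env.get("MINERU_API_TASK_RETENTION_SECONDS")
    if raw is None:
        return DEFAULT_RETENTION_SECONDS
    try:
        value = int(raw)
    except ValueError:
        # Assumption: an unparsable value falls back to the default.
        return DEFAULT_RETENTION_SECONDS
    # Assumption: non-positive values are treated as invalid here.
    return value if value > 0 else DEFAULT_RETENTION_SECONDS


def is_expired(finished_at: float, now: float, retention: int) -> bool:
    """True once a finished task has outlived its retention window."""
    return now - finished_at >= retention
```

Under this sketch, a task that finished at time `t` becomes eligible for cleanup at `t + retention`; the actual removal would then happen on the next cleanup pass, whose frequency `MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS` controls.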


@@ -701,7 +701,7 @@ def mk_blocks_to_markdown(para_blocks, make_mode, img_buket_path='', page_idx=No
continue
else:
# page_markdown.append(para_text.strip())
-page_markdown.append(para_text)
+page_markdown.append(para_text.strip('\r\n'))
return page_markdown
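The difference between the commented-out `para_text.strip()` and the new `para_text.strip('\r\n')` is easy to miss, so here is a small standalone illustration. The helper name `strip_newlines` and the sample strings are made up for the example; the `str.strip` behavior shown is standard Python.

```python
def strip_newlines(para_text: str) -> str:
    """Trim only leading/trailing CR and LF characters, as the patch does."""
    # The argument is a set of characters to remove, not a literal suffix.
    return para_text.strip('\r\n')


sample = "  indented text with trailing spaces  \r\n"

# Bare strip() removes all surrounding whitespace, including the spaces.
assert sample.strip() == "indented text with trailing spaces"

# strip('\r\n') removes only the surrounding line breaks and keeps the spaces.
assert strip_newlines(sample) == "  indented text with trailing spaces  "

# Line breaks inside the text are untouched; only the ends are trimmed.
assert strip_newlines("\nfirst\nsecond\r\n") == "first\nsecond"
```

This shows why the patch is narrower than the commented-out alternative: surrounding spaces and tabs in the paragraph text survive, and only stray line breaks at the edges are dropped.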


@@ -1 +1 @@
-__version__ = "3.0.5"
+__version__ = "3.0.7"