docs(README): update model download instructions for PDF-Extract-Kit 1.0

- Update README.md and README_zh-CN.md to include new model download instructions
- Provide detailed steps on how to download models after PDF-Extract-Kit 1.0 repository change
- Emphasize the need to re-download models due to repository change
2026-03-27 11:08:32 +07:00 · 2024-10-28 18:25:57 +08:00
parent 377b09cf8c
commit 247576c18e
2 changed files with 3 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -52,6 +52,8 @@
  - Integrated [PDF-Extract-Kit 1.0](https://github.com/opendatalab/PDF-Extract-Kit):
    - Added the self-developed `doclayout_yolo` model, which speeds up processing by more than 10 times compared to the original solution while maintaining similar parsing effects, and can be freely switched with `layoutlmv3` via the configuration file.
    - Upgraded formula parsing to `unimernet 0.2.1`, improving formula parsing accuracy while significantly reducing memory usage.
+    - Due to the repository change for `PDF-Extract-Kit 1.0`, you need to re-download the model. Please refer to [How to Download Models](docs/how_to_download_models_en.md) for detailed steps.
+
 - 2024/09/27 Version 0.8.1 released, Fixed some bugs, and providing a [localized deployment version](projects/web_demo/README.md) of the [online demo](https://opendatalab.com/OpenSourceTools/Extractor/PDF/) and the [front-end interface](projects/web/README.md).
 - 2024/09/09: Version 0.8.0 released, supporting fast deployment with Dockerfile, and launching demos on Huggingface and Modelscope.
 - 2024/08/30: Version 0.7.1 released, add paddle tablemaster table recognition option