docs(README): update model download instructions for PDF-Extract-Kit 1.0

- Update README.md and README_zh-CN.md to include new model download instructions
- Provide detailed steps on how to download models after PDF-Extract-Kit 1.0 repository change
- Emphasize the need to re-download models due to repository change
myhloli
2024-10-28 18:25:57 +08:00
parent 377b09cf8c
commit 247576c18e
2 changed files with 3 additions and 0 deletions

@@ -52,6 +52,8 @@
- Integrated [PDF-Extract-Kit 1.0](https://github.com/opendatalab/PDF-Extract-Kit):
- Added the self-developed `doclayout_yolo` model, which speeds up processing by more than 10 times compared to the original solution while maintaining similar parsing quality, and can be freely swapped with `layoutlmv3` via the configuration file.
- Upgraded formula parsing to `unimernet 0.2.1`, improving formula parsing accuracy while significantly reducing memory usage.
- Due to the repository change for `PDF-Extract-Kit 1.0`, you need to re-download the model. Please refer to [How to Download Models](docs/how_to_download_models_en.md) for detailed steps.
- 2024/09/27: Version 0.8.1 released, fixing some bugs and providing a [localized deployment version](projects/web_demo/README.md) of the [online demo](https://opendatalab.com/OpenSourceTools/Extractor/PDF/) and the [front-end interface](projects/web/README.md).
- 2024/09/09: Version 0.8.0 released, supporting fast deployment with Dockerfile, and launching demos on Huggingface and Modelscope.
- 2024/08/30: Version 0.7.1 released, adding a paddle tablemaster table recognition option.