Commit Graph

48 Commits

Author SHA1 Message Date
myhloli
9a96362db7 build(deps): update torch and torchvision version requirements
- Specify torch==2.3.1 and torchvision==0.18.1 for Windows CUDA installation
- Add torch and torchvision version constraints in setup.py:
  - torch>=2.2.2,<=2.3.1
  - torchvision>=0.17.2,<=0.18.1
- Update installation instructions in both English and Chinese README files
2024-12-11 14:31:49 +08:00
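The version window described in this commit can be written as ordinary requirement strings. The sketch below is an illustration only, not the project's actual setup.py: the names `INSTALL_REQUIRES`, `as_tuple`, and `in_range` are hypothetical, and the comparison handles only plain dotted numeric versions (no local suffixes such as `+cu118`).

```python
# Hypothetical excerpt: requirement strings matching the ranges in the commit message.
INSTALL_REQUIRES = [
    "torch>=2.2.2,<=2.3.1",
    "torchvision>=0.17.2,<=0.18.1",
]


def as_tuple(version):
    """Turn '2.3.1' into (2, 3, 1) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def in_range(version, lower, upper):
    """Return True when lower <= version <= upper (both bounds inclusive)."""
    return as_tuple(lower) <= as_tuple(version) <= as_tuple(upper)
```

For example, `in_range("2.3.1", "2.2.2", "2.3.1")` is true because the upper bound is inclusive, while `in_range("2.4.0", "2.2.2", "2.3.1")` is false.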
myhloli
a296ea41f9 refactor(magic_pdf): optimize environment setup and dependencies
- Add environment variables to disable albumentations and yolo updates
- Import torchtext and disable deprecation warnings
- Update unimernet to 0.2.2
- Specify ultralytics version as >=8.3.48
- Remove upper version limit for torch
2024-12-09 18:08:27 +08:00
myhloli
2ae1039408 build(deps): update dependency versions
- Update ultralytics to >=8.3.47
2024-12-09 14:26:48 +08:00
myhloli
1f1335c290 build(deps): specify minimum version for ultralytics
- Update `ultralytics` dependency to version >= 8.3.43
- This change ensures compatibility with yolov8 for formula detection
2024-12-06 17:28:24 +08:00
myhloli
d0f633e2d5 build(setup): add old_linux specific dependencies
- Add albumentations package with version <=1.4.20 for old_linux
- This version is compatible with Linux systems from 2019 and earlier
- Version 1.4.21 and above introduced simsimd which is not supported on older Linux systems
2024-11-18 22:34:09 +08:00
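An optional dependency group like the one described in this commit is usually declared through `extras_require`. The following is a minimal sketch under that assumption: only the `old_linux` name and the albumentations pin come from the commit message, while `EXTRAS_REQUIRE` and `extra_requirements` are hypothetical names.

```python
# Hypothetical excerpt of an extras_require mapping.
EXTRAS_REQUIRE = {
    "old_linux": [
        # 1.4.21 and later pull in simsimd, which older Linux systems do not support.
        "albumentations<=1.4.20",
    ],
}


def extra_requirements(extra):
    """Return a copy of the requirement list for an extra, or [] if it is unknown."""
    return list(EXTRAS_REQUIRE.get(extra, []))
```

With a mapping like this, the pin only applies when the extra is requested at install time, so users on current systems are not held back to the older release.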
myhloli
08f46125a0 refactor(model): rename and restructure model modules 2024-11-15 18:50:05 +08:00
myhloli
fe2c2c0d8e feat(table): add RapidOCR support for RapidTable model
- Integrate RapidOCR with RapidTable model for table recognition
- Improve memory management for devices with <= 8GB VRAM
- Update table recognition process to use RapidOCR for RapidTable
- Add rapidocr-paddle dependency in setup.py
2024-11-09 00:59:59 +08:00
myhloli
240fe99e3c feat(table): integrate RapidTable model for table recognition
- Add RapidTable model support for table recognition
- Update table model configuration and initialization
- Modify table recognition process to use RapidTable when specified
- Add RapidTable dependency to setup.py
2024-11-08 18:26:00 +08:00
myhloli
11f23843b1 feat(table): upgrade StructEqTable model and integrate into PDF Extract Kit
- Update StructTableModel to use the latest struct-eqtable library
- Add support for HTML table extraction in PDF Extract Kit
- Improve error handling and model initialization
- Update dependencies in setup.py for struct-eqtable
2024-11-04 17:08:19 +08:00
myhloli
73fe8914cb build(setup): add doclayout_yolo dependency
- Add doclayout_yolo==0.0.2 to the list of dependencies in setup.py
2024-10-23 17:32:07 +08:00
Xiaomeng Zhao
20212a3763 Update setup.py
update UniMERNet to 0.2.1
2024-09-10 22:03:05 +08:00
myhloli
3e9bc7a457 refactor(pdf_extract_kit): update model config and weight paths for UniMERNet-0.2.0
Update the paths to model weights and configuration files for the UniMERNet architecture
in both the demo.yaml and model_configs.yaml files. Adjust the mfr_model_init function to
reflect the new weight and configuration paths. The changes include specifying more detailed
paths to the unimernet_base directory and changing the weight file extension to .pth.
2024-09-10 16:11:58 +08:00
myhloli
252139099b fix(setup): allow latest matplotlib versions on non-Windows platforms
The restriction on the matplotlib version has been updated to only apply on Windows
platforms, where precompiled packages are not available starting from version 3.9.1.
This change enables users on Linux and macOS to install newer versions of matplotlib,
addressing compatibility issues with recent bug fixes.
2024-08-04 20:29:39 +08:00
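A platform-specific cap of this kind is commonly expressed with PEP 508 environment markers. The sketch below assumes that approach; the exact bound `<3.9.1` is inferred from the commit message, and the names `MATPLOTLIB_REQUIREMENTS` and `matplotlib_requirement` are hypothetical. The helper handles only the two simple marker forms used here.

```python
import platform

# Assumption: one capped entry for Windows and one unrestricted entry elsewhere.
MATPLOTLIB_REQUIREMENTS = [
    "matplotlib<3.9.1; platform_system == 'Windows'",
    "matplotlib; platform_system != 'Windows'",
]


def matplotlib_requirement(system=None):
    """Return the requirement whose marker matches the given platform.system() value."""
    system = system or platform.system()
    for entry in MATPLOTLIB_REQUIREMENTS:
        requirement, marker = (part.strip() for part in entry.split(";"))
        _, operator, value = marker.split()
        matches = system == value.strip("'")
        # "==" markers apply when the names match, "!=" markers when they do not.
        if (operator == "==") == matches:
            return requirement
    return None
```

Calling `matplotlib_requirement("Windows")` selects the capped entry, and any other platform name selects the unrestricted one.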
myhloli
9ececf3a1e fix(dependencies): remove unnecessary pypandoc and struct-eqtable packages; fix matplotlib>=3.9.1 not being installable on Windows systems without a compilation environment. 2024-08-04 20:23:21 +08:00
icecraft
40e0827e60 Feat/impl cli (#264)
* feat: refactor cli command

* feat: add docs to describe the output files of cli

* feat: resolve review comments

* feat: update docs about middle.json

---------

Co-authored-by: shenguanlin <shenguanlin@pjlab.org.cn>
2024-08-01 19:21:15 +08:00
myhloli
2c09109ef0 fix(setup): pin unimernet version to 0.1.6 for compatibility 2024-07-30 10:50:03 +08:00
myhloli
46d7549926 fix(setup): update PyMuPDF and paddlepaddle dependencies 2024-07-28 15:49:35 +08:00
myhloli
5c963168fb feat(setup.py): restructure extras_require options for clarity
Refactor the `extras_require` section in `setup.py` to simplify and clarify
the available options. Consolidate CPU and GPU requirements into single
"lite" and "full" options to streamline installation for users.
2024-07-23 23:45:26 +08:00
myhloli
61fab96eae fix(setup): specify paddleocr version to fix compatibility issue 2024-07-12 19:57:02 +08:00
myhloli
d458b705aa feat(setup.py): include package data for magic_pdf.resources
Update the setup.py file to explicitly include the package data for the
magic_pdf.resources directory. This ensures that all files within this
directory are packaged and available for use with the magic_pdf package.
2024-07-12 13:06:12 +08:00
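Shipping non-Python files with a package is typically done through the `package_data` argument of `setuptools.setup()`. The fragment below is a sketch under that assumption, not the commit's actual change: the `SETUP_KWARGS` name and the glob pattern are hypothetical, and recursive `**` patterns depend on the installed setuptools version.

```python
# Hypothetical keyword arguments that would be passed to setuptools.setup().
SETUP_KWARGS = {
    # Also honor files listed in MANIFEST.in, if the project has one.
    "include_package_data": True,
    # Assumption: everything under magic_pdf/resources should be packaged.
    "package_data": {"magic_pdf": ["resources/**"]},
}
```

The mapping key is the package that owns the files, and each pattern is resolved relative to that package's directory.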
myhloli
bc0f69321a feat(model): add model mode selection for PDF analysis
Introduce a new feature that allows users to choose between a "lite" and a "full"
model mode for PDF document analysis. The "lite" mode uses a faster, less
accurate model, while the "full" mode employs a higher-precision model at the
cost of speed. This selection can be made through the CLI or API, providing
flexibility for different use cases.
2024-07-11 17:10:14 +08:00
myhloli
1cedf4572e update: Update the homepage link 2024-07-08 11:18:30 +08:00
赵小蒙
3aa8ccdceb update requirements and setup 2024-06-25 19:05:42 +08:00
赵小蒙
129288aae6 update setup config 2024-06-20 17:43:45 +08:00
赵小蒙
756792a3f6 update:
add entry points that can be executed from the shell
2024-06-20 16:42:48 +08:00
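Shell commands are usually exposed through `console_scripts` entry points, each written as `name = module:attribute`. The sketch below shows that format; the command name and target are placeholders rather than values from the commit, and `ENTRY_POINTS` and `split_entry_point` are hypothetical names.

```python
# Hypothetical console_scripts declaration; the command and target are placeholders.
ENTRY_POINTS = {
    "console_scripts": [
        "example-cli = example_pkg.cli:main",
    ],
}


def split_entry_point(spec):
    """Split 'name = module:attr' into (command, module, attribute)."""
    command, target = (part.strip() for part in spec.split("=", 1))
    module, attribute = target.split(":", 1)
    return command, module, attribute
```

At install time the packaging tool generates an executable named after the command that imports the module and calls the named attribute.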
赵小蒙
9dc5033cf7 update requirements 2024-06-18 14:51:06 +08:00
赵小蒙
9b5b116369 fix: change garbled_rate 0.1 -> 0.02 2024-06-05 15:21:14 +08:00
赵小蒙
07f6c49707 change update version logic 2024-06-04 11:33:57 +08:00
赵小蒙
1de37e4c65 add version_name to middle json 2024-06-04 11:15:52 +08:00
赵小蒙
bd1834284e add version_name to middle json 2024-06-03 18:51:38 +08:00
赵小蒙
75478eda89 update setup 2024-05-30 10:26:10 +08:00
赵小蒙
3f3edc39f5 update setup 2024-05-30 10:25:02 +08:00
赵小蒙
a706743372 setup: automatically get the version number from the tag 2024-03-05 15:05:51 +08:00
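Deriving the package version from a Git tag is often done by parsing the output of `git describe --tags`. The sketch below assumes tags of the form `vX.Y.Z` or `X.Y.Z`; the function names and the fallback value are hypothetical, and the commit's own logic may differ.

```python
import re
import subprocess


def version_from_describe(described):
    """Extract 'X.Y.Z' from `git describe --tags` output such as 'v0.1.2-3-gabc1234'."""
    match = re.match(r"v?(\d+\.\d+\.\d+)", described.strip())
    if match is None:
        raise ValueError(f"no version found in {described!r}")
    return match.group(1)


def get_version(default="0.0.0"):
    """Ask git for the nearest tag; fall back to `default` when that is not possible."""
    try:
        described = subprocess.check_output(
            ["git", "describe", "--tags"], stderr=subprocess.DEVNULL, text=True
        )
        return version_from_describe(described)
    except (OSError, subprocess.CalledProcessError, ValueError):
        # git is missing, there is no tag, or the tag does not look like a version.
        return default
```

Keeping the parsing separate from the subprocess call makes the tag format easy to check without a repository.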
赵小蒙
7242a4a76e Update module version number 2024-03-05 12:17:02 +08:00
赵小蒙
6cbf7fabcf Update module version number 2024-03-05 12:03:12 +08:00
赵小蒙
779d2e8aaf Fix the versions of some dependency libraries for compatibility with the Spark environment 2024-03-04 17:07:01 +08:00
赵小蒙
044b7de34b Version 0.1.0 released 2024-03-04 12:30:07 +08:00
赵小蒙
03bd97c54f Version 0.1.2 released 2024-03-04 12:25:55 +08:00
赵小蒙
38c7dc100a Version 0.1.1 released 2024-03-04 12:23:52 +08:00
赵小蒙
518005abeb Version 0.1 release 2024-03-04 12:12:51 +08:00
赵小蒙
7228545841 Update version number 2024-03-01 18:23:55 +08:00
赵小蒙
4033ab154d Update workflow configuration 2024-03-01 18:03:11 +08:00
赵小蒙
d2380d5a14 Update release configuration 2024-03-01 17:51:06 +08:00
赵小蒙
ec51cd8e6b setup.py reads dependencies from requirements.txt 2024-03-01 17:07:50 +08:00
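Reading dependencies from requirements.txt generally means filtering the file down to plain requirement strings before passing them to `install_requires`. The sketch below is one way to do that, not the commit's actual code; `parse_requirements` is a hypothetical helper, and it deliberately skips pip options such as `-r` or `-e`, which `install_requires` does not accept.

```python
from pathlib import Path


def parse_requirements(path):
    """Return requirement strings from a requirements file, skipping blanks and comments."""
    requirements = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing inline comments
        if line and not line.startswith(("#", "-")):  # skip comment lines and pip options
            requirements.append(line)
    return requirements
```

The result can then be handed to `setup(install_requires=parse_requirements("requirements.txt"))`, so the dependency list is maintained in one place.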
赵小蒙
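1bbab88165 Change the packaged project name 2024-03-01 16:16:53 +08:00
赵小蒙
d5dbed7325 Restructure directories 2024-03-01 16:07:51 +08:00
赵小蒙
33e2922ae6 Update dependency package configuration and packaging configuration 2024-03-01 15:17:42 +08:00
赵小蒙
9e7f7550de Configure packaging parameters 2024-02-29 19:17:36 +08:00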