24 Commits

Author SHA1 Message Date
myhloli
951ebd8c04 feat: add support for configurable thread count in PDF rendering 2026-02-02 19:21:09 +08:00
myhloli
bfb304ef1f fix: improve EXIF handling and save PDF logic in pdf_image_tools.py 2026-01-05 00:27:01 +08:00
Xiaomeng Zhao
ba06cd14ef Update pdf_image_tools.py 2026-01-04 18:29:51 +08:00
史提芬达
0ca244ad62 fix: add EXIF orientation handling for image inputs 2026-01-04 13:41:55 +08:00
Xiaomeng Zhao
e010b0974a Update mineru/utils/pdf_image_tools.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-04 20:21:37 +08:00
Xiaomeng Zhao
fe1549960d Update mineru/utils/pdf_image_tools.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-04 20:20:37 +08:00
Xiaomeng Zhao
dae2cc8514 Update mineru/utils/pdf_image_tools.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-04 19:59:59 +08:00
myhloli
5de8f1a19f feat: add environment variables for PDF rendering timeout and ONNX thread management 2025-11-04 19:47:59 +08:00
myhloli
51df4d8508 refactor: enhance PDF conversion function parameters and improve thread handling logic 2025-11-04 09:54:45 +08:00
Xiaomeng Zhao
a9c9501af6 Update mineru/utils/pdf_image_tools.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-03 22:09:29 +08:00
Xiaomeng Zhao
74de2725cb Update mineru/utils/pdf_image_tools.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-03 22:08:20 +08:00
Xiaomeng Zhao
6250c453d9 Update mineru/utils/pdf_image_tools.py (Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>) 2025-11-03 22:04:49 +08:00
myhloli
2079395774 refactor: adjust thread count based on CPU cores and comment out image loading time logging 2025-11-03 21:02:58 +08:00
myhloli
b4c57116c1 refactor: move PDF byte conversion logic to pdf_page_id and simplify image conversion process 2025-11-03 20:57:18 +08:00
myhloli
ace7f76869 refactor: move PDF byte conversion functions to pdf_page_tools and simplify logic 2025-11-03 20:26:34 +08:00
myhloli
5349fd7ccd refactor: enhance PDF image loading by removing multiprocessing for Windows environment and improving logging 2025-11-03 19:41:22 +08:00
myhloli
245ae28c27 refactor: optimize page range calculation and enhance logging for image conversion process 2025-11-03 19:11:05 +08:00
myhloli
66d5f3dfd2 feat: refactor PDF image conversion to use get_end_page_id utility function and add multi-threading support 2025-11-03 15:08:31 +08:00
myhloli
b614bef035 feat: add multiprocessing support for PDF to image conversion with timeout handling 2025-10-31 17:50:59 +08:00
myhloli
2fcffcb0af fix: refactor image handling to use numpy arrays instead of PIL images 2025-08-25 18:53:05 +08:00
myhloli
ff11e602fc feat: enhance image processing by introducing ImageType enum and updating related functions 2025-08-05 02:06:45 +08:00
myhloli
38ace5dc61 refactor: streamline document analysis and enhance image handling in processing pipeline 2025-06-03 15:11:05 +08:00
myhloli
0f21495a06 refactor: enhance block processing and sorting utilities for improved span management 2025-05-30 18:46:39 +08:00
myhloli
cbba27b4f5 refactor: reorganize project structure and update import paths 2025-05-28 20:17:26 +08:00