Commit Graph

  • 553ec651bf Deployed bd46b08 with MkDocs version: 1.5.3 gh-pages myhloli 2026-03-26 18:02:27 +00:00
  • bd46b0889f Merge pull request #4665 from myhloli/dev dev Xiaomeng Zhao 2026-03-27 02:01:58 +08:00
  • b6834bdb0c feat: simplify level adjustment logic in llm_aided.py myhloli 2026-03-27 02:00:10 +08:00
  • 1de4a48129 feat: improve progress bar handling and exclude idle time in analysis modules myhloli 2026-03-27 01:45:06 +08:00
  • 05648a7d40 feat: remove unused functions and clean up client and fast_api modules myhloli 2026-03-27 00:32:27 +08:00
  • ed530b6857 Merge pull request #4663 from myhloli/dev Xiaomeng Zhao 2026-03-27 00:06:58 +08:00
  • ad4bde7df8 feat: extend task result timeout to 3600 seconds in api_client myhloli 2026-03-27 00:05:47 +08:00
  • 12aebcfe17 feat: implement autoscroll functionality for status box in gradio_app myhloli 2026-03-26 23:19:02 +08:00
  • fde1bf5c24 feat: unify threading lock implementation for model singletons myhloli 2026-03-26 22:33:39 +08:00
  • fffccefec7 feat: enhance async task management and cleanup in gradio_app myhloli 2026-03-26 19:44:41 +08:00
  • 0e1ea31d39 feat: refactor demo script for async API integration and enhance input file handling myhloli 2026-03-26 19:29:05 +08:00
  • ce00207993 feat: add queued_ahead attribute to task status and update handling in API client myhloli 2026-03-26 18:42:41 +08:00
  • cd674238bb feat: add queued_ahead attribute to task status and update handling in API client myhloli 2026-03-26 17:52:11 +08:00
  • 3ca042daa7 feat: add task status snapshot and enhance queue handling in API client myhloli 2026-03-26 17:35:15 +08:00
  • 31e309ff53 feat: improve local queue handling and status updates in concurrency management myhloli 2026-03-26 17:08:20 +08:00
  • a37a6b617a feat: enhance request concurrency management with status updates and UI integration myhloli 2026-03-26 16:37:40 +08:00
  • 3dc3e06324 feat: implement local API server and integrate API client for task submission and status management myhloli 2026-03-26 15:06:11 +08:00
  • 61248e2ec9 Merge pull request #4662 from Niujunbo2002/master master Xiaomeng Zhao 2026-03-26 14:20:39 +08:00
  • 181277fed6 feat: enhance concurrent request handling with configurable limits myhloli 2026-03-26 11:42:18 +08:00
  • d2d1a35b32 feat: implement live task status rendering with thread-safe output handling myhloli 2026-03-26 11:19:40 +08:00
  • c717a1c83a docs: add MinerU-Diffusion reference to README Niujunbo2002 2026-03-26 11:15:48 +08:00
  • d4f1710e42 feat: update file upload handling to use unique filenames based on document stems myhloli 2026-03-26 10:54:59 +08:00
  • 185ab841ae feat: optimize VRAM management by cleaning up GPU memory during batch processing myhloli 2026-03-26 10:46:26 +08:00
  • 834dd496a3 feat: add uniquify_task_stems function for handling duplicate document stems and implement visualization job processing myhloli 2026-03-26 02:08:15 +08:00
  • f71d5c1d84 feat: implement visualization job handling with context management and logging myhloli 2026-03-26 01:49:31 +08:00
  • b064ebdf69 Merge pull request #27 from myhloli/new_client Xiaomeng Zhao 2026-03-26 01:16:21 +08:00
  • b786ba3f76 Merge pull request #26 from myhloli/new_client Xiaomeng Zhao 2026-03-26 01:12:20 +08:00
  • 87960671f8 feat: increase maximum concurrent requests to 3 and enhance task logging with progress tracking myhloli 2026-03-26 01:08:52 +08:00
  • fe257fe6a3 feat: update API client to support maximum concurrent requests and improve logging myhloli 2026-03-26 00:34:33 +08:00
  • 053ae8eb24 feat: implement robust cleanup for temporary API directory with retry mechanism myhloli 2026-03-25 21:12:15 +08:00
  • 40a52da3cf feat: add API protocol version and default processing window size to client myhloli 2026-03-25 20:44:46 +08:00
  • 3f0d3dc985 Merge pull request #4660 from myhloli/dev Xiaomeng Zhao 2026-03-25 19:02:26 +08:00
  • ad35c69a18 feat: simplify PDF byte conversion by integrating page range handling directly in rewrite function myhloli 2026-03-25 19:00:10 +08:00
  • 8d50bd9b63 feat: refactor PDF byte conversion to utilize pdfium for improved performance and error handling myhloli 2026-03-25 18:39:34 +08:00
  • cf65bb55fb Merge pull request #4658 from myhloli/dev Xiaomeng Zhao 2026-03-25 16:41:43 +08:00
  • 6c52a44143 feat: remove async context manager for PDF handling and simplify locking mechanism myhloli 2026-03-25 16:40:43 +08:00
  • c57e36ac54 feat: remove low memory handling and refactor to use processing window size myhloli 2026-03-25 15:19:31 +08:00
  • efeb832272 feat: add functions for retrieving image paths and MIME types myhloli 2026-03-25 14:26:46 +08:00
  • 934fbd03ec feat: add pdfium_guard for thread-safe PDF document handling myhloli 2026-03-25 14:16:56 +08:00
  • cf8964c873 feat: refactor PDF handling to utilize pdfium_guard for resource management myhloli 2026-03-25 14:11:33 +08:00
  • 160438be5a Merge pull request #4652 from myhloli/dev Xiaomeng Zhao 2026-03-25 01:51:43 +08:00
  • ce2063a8d0 @UaRuairc has signed the CLA in opendatalab/MinerU#4654 cla github-actions[bot] 2026-03-24 14:52:12 +00:00
  • 0eff2b0d70 feat: adjust default GPU memory utilization based on vllm version and GPU memory myhloli 2026-03-24 18:45:21 +08:00
  • 8861657d18 feat: update OCR detection base batch size and adjust memory requirements in documentation myhloli 2026-03-24 17:25:41 +08:00
  • ed16d84b86 Merge pull request #4651 from myhloli/dev Xiaomeng Zhao 2026-03-24 16:54:30 +08:00
  • 0cff3438f6 feat: refactor PDF conversion to use pypdf and remove threading lock myhloli 2026-03-24 16:43:06 +08:00
  • 39f7311f5f feat: mark lines in index blocks as list start lines myhloli 2026-03-24 16:05:20 +08:00
  • ed3731ba96 feat: update minimum dynamic batch size for MFR processing to 16 myhloli 2026-03-24 15:42:54 +08:00
  • 635775c810 feat: add support for disabling VLM acceleration via environment variable myhloli 2026-03-24 15:41:50 +08:00
  • af191d6add Merge pull request #4645 from myhloli/dev Xiaomeng Zhao 2026-03-24 11:31:01 +08:00
  • 5b640d6580 feat: change default value for low memory mode to true myhloli 2026-03-24 11:19:17 +08:00
  • b49b0ce3b8 feat: adjust OCR detection base batch size and optimize batch ratio logic for GPU memory myhloli 2026-03-24 11:16:22 +08:00
  • 4aada116e8 Merge remote-tracking branch 'origin/dev' into dev myhloli 2026-03-24 10:54:07 +08:00
  • 6ced3dd6f1 feat: improve text block merging by ensuring both blocks have lines before merging myhloli 2026-03-24 10:53:44 +08:00
  • 9b68645352 feat: optimize dynamic batch size calculation for MFR processing myhloli 2026-03-24 10:33:36 +08:00
  • 4aab895c8a feat: enhance document processing by improving file suffix handling and adding progress indicators myhloli 2026-03-24 04:58:21 +08:00
  • 45677d2a52 Merge pull request #4644 from myhloli/dev Xiaomeng Zhao 2026-03-24 00:30:13 +08:00
  • 7dbfb81b08 feat: fix language detection for code blocks by correcting sub_type reference myhloli 2026-03-24 00:23:09 +08:00
  • 4beb2ad207 feat: enhance code block rendering by adding language support for syntax highlighting myhloli 2026-03-24 00:08:21 +08:00
  • d3e79967df feat: enhance content list generation by adding support for SEAL and CHART block types myhloli 2026-03-23 23:36:53 +08:00
  • 3c6bab713c feat: enhance markdown rendering by refactoring text and hyperlink handling myhloli 2026-03-23 18:50:26 +08:00
  • c30d88f618 feat: enhance PDF classification by implementing hybrid and legacy strategies myhloli 2026-03-23 17:13:26 +08:00
  • 9c6dfd64ab Merge pull request #4641 from myhloli/dev Xiaomeng Zhao 2026-03-23 16:23:56 +08:00
  • e4995cfd84 feat: implement synchronous parsing endpoint and enhance task management myhloli 2026-03-23 16:11:47 +08:00
  • 42c278a79f feat: enhance batch group finalization by adding dynamic splitting and merging logic myhloli 2026-03-23 15:23:04 +08:00
  • 21aa9f7b7c feat: enhance bounding box processing by adding support for CHART and REF_TEXT block types myhloli 2026-03-23 12:03:10 +08:00
  • baf7442a81 Merge pull request #4638 from myhloli/dev Xiaomeng Zhao 2026-03-23 10:53:50 +08:00
  • 6eb91d3632 feat: optimize batch processing by implementing dynamic batch grouping and enhancing formula item handling myhloli 2026-03-23 01:46:00 +08:00
  • cbbabcb347 feat: refactor prediction methods to streamline batch processing and enhance error handling myhloli 2026-03-23 00:44:48 +08:00
  • daf970af0e docs: update citation entries in README files Niujunbo2002 2026-03-22 23:59:15 +08:00
  • 7423c135d1 feat: enhance formula number processing by appending tags to interline equations myhloli 2026-03-22 23:47:54 +08:00
  • 01d8e18a13 feat: add support for SEAL block type in bounding box processing myhloli 2026-03-22 23:20:32 +08:00
  • fb7246540c feat: improve paragraph splitting logic by adding conditions for block positioning and line count myhloli 2026-03-22 23:09:28 +08:00
  • 7a365d92c9 feat: enhance PDF generation by preserving original image raster content and optimizing image handling myhloli 2026-03-22 03:06:26 +08:00
  • e7b2a48485 @vivekvar-dl has signed the CLA in opendatalab/MinerU#4636 github-actions[bot] 2026-03-21 11:57:46 +00:00
  • 7160195787 Merge pull request #4635 from myhloli/dev Xiaomeng Zhao 2026-03-21 19:39:02 +08:00
  • e07820a12c feat: extend model initialization to support MFR alongside Layout myhloli 2026-03-21 19:33:36 +08:00
  • 7685afc4de feat: enhance table processing with inline object extraction and base64 image handling myhloli 2026-03-21 18:44:56 +08:00
  • 4d57a0fe58 feat: refactor batch analysis by removing unused methods and optimizing table processing logic myhloli 2026-03-21 16:01:54 +08:00
  • db2f76d556 feat: integrate vertical crop rotation in OCR image processing for improved alignment myhloli 2026-03-21 04:23:11 +08:00
  • 6562d440db feat: sort blocks by index in rebuilt page blocks for consistent ordering myhloli 2026-03-21 04:18:21 +08:00
  • 28eeebb95d feat: optimize PDF rendering process with dynamic thread allocation and page range calculation myhloli 2026-03-21 03:53:10 +08:00
  • 09fc22fcc2 feat: add support for SEAL block type in bbox drawing and update image handling in markdown content myhloli 2026-03-21 03:33:34 +08:00
  • 01ab656487 feat: replace shape attribute with len for box count and update return type for ragged arrays myhloli 2026-03-21 03:14:12 +08:00
  • 7409e645f3 feat: update license from AGPL-3.0 to Apache License 2.0 and reflect changes in documentation myhloli 2026-03-21 03:04:31 +08:00
  • 09cd7b16cf Merge pull request #25 from myhloli/add_ppdoclayout Xiaomeng Zhao 2026-03-21 02:48:39 +08:00
  • 8851223abb feat: rename RapidTable classes to PaddleTable for consistency and clarity myhloli 2026-03-21 02:44:49 +08:00
  • fecd55fd29 feat: remove unused block retrieval methods and optimize return type in prediction processing myhloli 2026-03-21 02:32:19 +08:00
  • 805479cfa1 feat: streamline title level application in PDF processing myhloli 2026-03-21 02:10:38 +08:00
  • 9a4b226795 feat: update model paths in download pipeline for improved document layout processing myhloli 2026-03-21 02:06:58 +08:00
  • c4efdc53be feat: update README to clarify project submission status and improve layout myhloli 2026-03-20 19:19:34 +08:00
  • 690cd163fa feat: enhance code block handling by adding support for code footnotes and refining layout processing myhloli 2026-03-20 19:15:59 +08:00
  • 61ecf1bc1b feat: refactor algorithm block handling to use code block type and enhance markdown rendering myhloli 2026-03-20 18:26:34 +08:00
  • 882a0ee72f feat: refactor PDF image processing and enhance markdown rendering for visual blocks myhloli 2026-03-20 16:52:51 +08:00
  • ff1c669d1e feat: add algorithm block type support in document processing myhloli 2026-03-20 15:27:10 +08:00
  • 18d93c5cd3 feat: enhance logging granularity and improve document context management myhloli 2026-03-20 10:48:12 +08:00
  • 07701638ed feat: remove outdated dependencies from pipeline in pyproject.toml myhloli 2026-03-20 02:19:35 +08:00
  • fa3bc49f3a feat: add configurable maximum concurrent requests for API handling myhloli 2026-03-20 02:12:39 +08:00
  • dc20d89173 feat: disable cuDNN V8 API for improved compatibility across modules myhloli 2026-03-20 01:48:16 +08:00
  • 4cd501ccc7 feat: implement thread safety for PDF processing and model initialization myhloli 2026-03-20 01:37:49 +08:00