mirror of
https://github.com/opendatalab/MinerU.git
synced 2026-03-27 11:08:32 +07:00
docs: update FAQ section in index.md with community support links and clarify missing text issue in PDF rendering
@@ -1,5 +1,9 @@

# Frequently Asked Questions

If your question is not listed, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to communicate with the AI assistant, which can resolve most common problems.

If you still cannot resolve the issue, you can join the community through [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.

## 1. Error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` on Ubuntu 22.04 under WSL2

The `libgl` library is missing on Ubuntu 22.04 under WSL2. You can install it with the following command to resolve the issue:
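The install command itself falls outside this diff hunk; on Ubuntu/Debian it is typically the following (the package name `libgl1` is the usual fix for this error and is an assumption here, not taken from the diff):

```shell
# Assumed fix: install the missing OpenGL runtime library (Ubuntu 22.04 package name)
sudo apt update
sudo apt install -y libgl1
```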
@@ -21,3 +25,18 @@ pip install -U "mineru[pipeline_old_linux]"

Reference: https://github.com/opendatalab/MinerU/issues/1004

## 3. Missing text in parsing results when installing and using MinerU on Linux systems
MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, missing CJK fonts can cause some text to be lost when PDF pages are rendered to images.

To solve this problem, you can install the Noto font packages with the following commands, which are effective on Ubuntu/Debian systems:

```bash
sudo apt update
sudo apt install fonts-noto-core
sudo apt install fonts-noto-cjk
fc-cache -fv
```

You can also directly use our [Docker deployment](../quick_start/docker_deployment.md) method to build the image, which includes the above font packages by default.

Reference: https://github.com/opendatalab/MinerU/issues/2915
@@ -1,10 +0,0 @@

# Known Issues

- Reading order is determined by the model based on the spatial distribution of readable content, and may be out of order in some areas under extremely complex layouts.
- Limited support for vertical text.
- Tables of contents and lists are recognized through rules, so some uncommon list formats may not be recognized.
- Code blocks are not yet supported by the layout model.
- Comic books, art albums, primary school textbooks, and exercise books cannot be parsed well.
- Table recognition may produce row/column errors on complex tables.
- OCR may produce inaccurate characters in PDFs of lesser-known languages (e.g., diacritical marks in Latin script, easily confused characters in Arabic script).
- Some formulas may not render correctly in Markdown.

@@ -1,72 +0,0 @@

# Local Deployment

## Install MinerU

### Install via pip or uv

```bash
pip install --upgrade pip
pip install uv
uv pip install -U "mineru[core]"
```

### Install from source

```bash
git clone https://github.com/opendatalab/MinerU.git
cd MinerU
uv pip install -e .[core]
```

> [!NOTE]
> Linux and macOS systems automatically support CUDA/MPS acceleration after installation. Windows users who want CUDA acceleration
> should visit the [PyTorch official website](https://pytorch.org/get-started/locally/) to install PyTorch with the appropriate CUDA version.

### Install the full version (supports sglang acceleration; requires a Turing or newer GPU with at least 8 GB of VRAM)

If you need to use **sglang to accelerate VLM model inference**, you can choose any of the following methods to install the full version:

- Install using uv or pip:

  ```bash
  uv pip install -U "mineru[all]"
  ```

- Install from source:

  ```bash
  uv pip install -e .[all]
  ```

> [!TIP]
> If any exceptions occur during the installation of `sglang`, please refer to the [official sglang documentation](https://docs.sglang.ai/start/install.html) for troubleshooting, or use the Docker-based installation directly.

- Build the image using the Dockerfile:

  ```bash
  wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile
  docker build -t mineru-sglang:latest -f Dockerfile .
  ```

  Start the Docker container:

  ```bash
  docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    --ipc=host \
    mineru-sglang:latest \
    mineru-sglang-server --host 0.0.0.0 --port 30000
  ```

  Or start it with Docker Compose:

  ```bash
  wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
  docker compose -f compose.yaml up -d
  ```

> [!TIP]
> The Dockerfile uses `lmsysorg/sglang:v0.4.8.post1-cu126` as the default base image, which supports the Turing/Ampere/Ada Lovelace/Hopper platforms.
> If you are using the newer Blackwell platform, please change the base image to `lmsysorg/sglang:v0.4.8.post1-cu128-b200`.

### Install the client (for connecting to sglang-server from edge devices that require only a CPU and network connectivity)

```bash
uv pip install -U mineru
mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<host_ip>:<port>
```

---

@@ -1,5 +0,0 @@

# Online Demo

[](https://mineru.net/OpenSourceTools/Extractor?source=github)
[](https://huggingface.co/spaces/opendatalab/MinerU)
[](https://www.modelscope.cn/studios/OpenDataLab/MinerU)

@@ -1,9 +0,0 @@

# TODO

- [x] Reading order based on the model
- [x] Recognition of `index` and `list` in the main text
- [x] Table recognition
- [x] Heading classification
- [ ] Code block recognition in the main text
- [ ] [Chemical formula recognition](../chemical_knowledge_introduction/introduction.pdf)
- [ ] Geometric shape recognition

@@ -1,58 -0,0 @@

# API Calls or Visual Invocation

1. Directly invoke using the Python API: [Python Invocation Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
2. Invoke using FastAPI:

   ```bash
   mineru-api --host 127.0.0.1 --port 8000
   ```

   Visit http://127.0.0.1:8000/docs in your browser to view the API documentation.

3. Use the Gradio WebUI or Gradio API:

   ```bash
   # Using the pipeline/vlm-transformers/vlm-sglang-client backend
   mineru-gradio --server-name 127.0.0.1 --server-port 7860

   # Or using the vlm-sglang-engine/pipeline backend
   mineru-gradio --server-name 127.0.0.1 --server-port 7860 --enable-sglang-engine true
   ```

   Access http://127.0.0.1:7860 in your browser to use the Gradio WebUI, or visit http://127.0.0.1:7860/?view=api to use the Gradio API.
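Once `mineru-api` from step 2 is running, a quick reachability check is to fetch the service's auto-generated schema; FastAPI applications serve it at `/openapi.json` by default (host and port as configured above):

```shell
# Quick check that the mineru-api service is up:
# FastAPI serves the generated OpenAPI schema at /openapi.json by default.
curl -s http://127.0.0.1:8000/openapi.json | head -c 200 && echo
```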
> [!TIP]
> - Below are some suggestions and notes for using the sglang acceleration mode:
>     - The sglang acceleration mode currently supports Turing-architecture GPUs with a minimum of 8 GB of VRAM, but you may encounter VRAM shortages on GPUs with less than 24 GB. You can optimize VRAM usage with the following parameters:
>         - If you are running on a single GPU and encounter a VRAM shortage, reduce the KV cache size with `--mem-fraction-static 0.5`. If VRAM issues persist, try lowering it further to `0.4` or below.
>         - If you have more than one GPU, you can expand the available VRAM with tensor parallelism (TP): `--tp-size 2`
>     - If sglang is already accelerating VLM inference successfully but you want to further improve inference speed, consider the following parameters:
>         - If you are using multiple GPUs, increase throughput with sglang's multi-GPU data-parallel mode: `--dp-size 2`
>         - You can also enable `torch.compile` to speed up inference by about 15%: `--enable-torch-compile`
>     - For more information on sglang parameters, please refer to the [official sglang documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
>     - All sglang-supported parameters can be passed to MinerU via command-line arguments, including with the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
> [!TIP]
> - In any case, you can specify the visible GPU devices by adding the `CUDA_VISIBLE_DEVICES` environment variable at the start of a command line. For example:
>   ```bash
>   CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
>   ```
> - This method works for all command-line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both the `pipeline` and `vlm` backends.
> - Below are some common `CUDA_VISIBLE_DEVICES` settings:
>   ```bash
>   CUDA_VISIBLE_DEVICES=1        # Only device 1 will be seen
>   CUDA_VISIBLE_DEVICES=0,1      # Devices 0 and 1 will be visible
>   CUDA_VISIBLE_DEVICES="0,1"    # Same as above, quotation marks are optional
>   CUDA_VISIBLE_DEVICES=0,2,3    # Devices 0, 2, 3 will be visible; device 1 is masked
>   CUDA_VISIBLE_DEVICES=""       # No GPU will be visible
>   ```
> - Below are some possible use cases:
>   - If you have multiple GPUs and need to launch `sglang-server` in multi-GPU mode on GPU 0 and GPU 1, you can use the following command:
>     ```bash
>     CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
>     ```
>   - If you have multiple GPUs and need to launch two `fastapi` services on GPU 0 and GPU 1 respectively, listening on different ports, you can use the following commands:
>     ```bash
>     # In terminal 1
>     CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
>     # In terminal 2
>     CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
>     ```

---

@@ -1,10 +0,0 @@

# Extending MinerU Functionality Through Configuration Files

- MinerU is designed to work out of the box, but it also supports extending functionality through configuration files. You can create a `mineru.json` file in your home directory and add custom configurations.
- The `mineru.json` file is generated automatically when you use the built-in model download command `mineru-models-download`. Alternatively, you can create it by copying the [configuration template file](../../mineru.template.json) to your home directory and renaming it to `mineru.json`.
- Below are some of the available configuration options:
    - `latex-delimiter-config`: configures the LaTeX formula delimiters; defaults to the `$` symbol and can be changed to other symbols or strings as needed.
    - `llm-aided-config`: configures the parameters for LLM-assisted heading level detection; compatible with all LLM models that support the `OpenAI protocol`. It defaults to Alibaba Cloud Qwen's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to activate this feature.
    - `models-dir`: specifies local model storage directories. Specify separate model directories for the `pipeline` and `vlm` backends. After specifying these directories, you can use local models by setting the environment variable `export MINERU_MODEL_SOURCE=local`.
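A minimal setup sketch following the steps above (the template filename and the environment variable come from this section; running the copy from a MinerU checkout is an assumption here):

```shell
# Sketch: create ~/mineru.json from the repository template, then point MinerU at local models
cp mineru.template.json ~/mineru.json    # or let `mineru-models-download` generate it
export MINERU_MODEL_SOURCE=local         # read models from the `models-dir` paths in mineru.json
```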

---

@@ -59,8 +59,6 @@ nav:
    - Output File Format: usage/output_file.md
    - FAQ:
        - FAQ: FAQ/index.md
    - Known Issues: known_issues.md
    - TODO: todo.md

plugins:
  - search
@@ -85,8 +83,6 @@ plugins:
          Advanced CLI Parameters: 命令行参数进阶技巧
          FAQ: FAQ
          Output File Format: 输出文件格式
          Known Issues: 已知问题
          TODO: TODO
  - mkdocs-video

markdown_extensions: