Ubuntu 22.04 LTS
1. Check if NVIDIA Drivers Are Installed
nvidia-smi
If the output resembles the following, the NVIDIA driver is already installed and you can skip step 2.
Note
The CUDA Version shown should be >= 12.1. If the displayed version number is lower than 12.1, upgrade the driver.
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.34                 Driver Version: 537.34       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060 Ti   WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   51C    P8              12W / 200W |   1489MiB /  8192MiB |      5%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
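The version check can also be scripted. The following is a minimal sketch, assuming the `CUDA Version: X.Y` field appears in the header as shown in the sample output and that GNU `sort -V` is available (it is on Ubuntu). The function names `cuda_version_from_smi` and `version_ge` are hypothetical helpers introduced for illustration.

```shell
#!/usr/bin/env bash
# Extract the "CUDA Version" value from nvidia-smi header text read on stdin.
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 is >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only query the driver when nvidia-smi is actually present.
if command -v nvidia-smi >/dev/null 2>&1; then
    detected="$(nvidia-smi | cuda_version_from_smi)"
    if version_ge "${detected:-0}" "12.1"; then
        echo "CUDA Version ${detected} is sufficient"
    else
        echo "CUDA Version ${detected:-unknown} is below 12.1; upgrade the driver"
    fi
fi
```

With the sample output above, the detected value would be 12.2, which passes the check.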
2. Install the Driver
If no driver is installed, install one with the following commands:
sudo apt-get update
sudo apt-get install nvidia-driver-545
This installs the proprietary driver. Restart your computer once the installation finishes:
reboot
3. Install Anaconda
If Anaconda is already installed, skip this step.
wget https://repo.anaconda.com/archive/Anaconda3-2024.06-1-Linux-x86_64.sh
bash Anaconda3-2024.06-1-Linux-x86_64.sh
At the installer's final prompt, enter yes. Then close the terminal and reopen it.
4. Create an Environment Using Conda
Specify Python version 3.10.
conda create -n MinerU python=3.10
conda activate MinerU
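To confirm that the environment is active and uses the intended interpreter, a check along these lines may help. It is a sketch that relies on `CONDA_DEFAULT_ENV`, which `conda activate` sets to the active environment name. `is_python_310` is a hypothetical helper introduced for illustration.

```shell
#!/usr/bin/env bash
# Succeed when a `python --version` string (e.g. "Python 3.10.14") is a 3.10 release.
is_python_310() {
    case "$1" in
        "Python 3.10"|"Python 3.10."*) return 0 ;;
        *) return 1 ;;
    esac
}

# CONDA_DEFAULT_ENV holds the name of the currently active conda environment.
if [ "${CONDA_DEFAULT_ENV:-}" = "MinerU" ] && command -v python >/dev/null 2>&1; then
    if is_python_310 "$(python --version 2>&1)"; then
        echo "MinerU environment is active with Python 3.10"
    else
        echo "Unexpected interpreter: $(python --version 2>&1)"
    fi
else
    echo "MinerU environment is not active; run: conda activate MinerU"
fi
```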
5. Install Applications
pip install -U "magic-pdf[full]" --extra-index-url https://wheels.myhloli.com
Important
After installation, check the version of magic-pdf using the following command:
magic-pdf --version
If the version number is less than 0.7.0, please report the issue.
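The comparison against 0.7.0 can be automated. This sketch assumes only that `magic-pdf --version` prints a version number of the form X.Y.Z somewhere in its output; the exact output format is not confirmed here. `extract_version` and `version_ge` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Pull the first X.Y.Z version number out of arbitrary text on stdin.
extract_version() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed when version $1 is >= version $2 (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check when the magic-pdf command is on PATH.
if command -v magic-pdf >/dev/null 2>&1; then
    installed="$(magic-pdf --version 2>&1 | extract_version)"
    if version_ge "${installed:-0}" "0.7.0"; then
        echo "magic-pdf ${installed} meets the 0.7.0 minimum"
    else
        echo "magic-pdf ${installed:-unknown} is older than 0.7.0"
    fi
fi
```

Using `sort -V` rather than a plain string comparison keeps a release such as 0.10.0 ordered after 0.7.0.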
6. Download Models
Refer to the detailed instructions on how to download the model files.
7. Understand the Location of the Configuration File
After you complete step 6 (Download Models), the download script automatically generates a magic-pdf.json file in your user directory and sets the default model path in it.
Tip
The user directory for Linux is "/home/username".
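To confirm the file exists and see its current "device-mode" setting, something like the following can be used. It is a rough sketch: `json_string_value` is a hypothetical sed-based helper that only handles simple top-level string values, not nested objects or escaped quotes.

```shell
#!/usr/bin/env bash
# Print the value of a simple string key from a JSON file.
# $1 = file, $2 = key. Illustrative only; not a real JSON parser.
json_string_value() {
    sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

config="${HOME}/magic-pdf.json"
if [ -f "$config" ]; then
    echo "device-mode: $(json_string_value "$config" device-mode)"
else
    echo "No magic-pdf.json in ${HOME}; complete the model download step first"
fi
```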
8. First Run
Download a sample file from the repository and test it.
wget https://github.com/opendatalab/MinerU/raw/master/demo/small_ocr.pdf
magic-pdf -p small_ocr.pdf -o ./output
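A quick way to see whether the run produced anything is to count the files written under the output directory. This is a small sketch; `count_output_files` is a hypothetical helper, and the layout of the output directory is not assumed.

```shell
#!/usr/bin/env bash
# Count regular files under a directory; print 0 when the directory is missing.
count_output_files() {
    if [ -d "$1" ]; then
        find "$1" -type f | wc -l | tr -d ' '
    else
        echo 0
    fi
}

echo "files under ./output: $(count_output_files ./output)"
```

A count of 0 after the command finishes suggests the run failed and its console output is worth reviewing.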
9. Test CUDA Acceleration
If your graphics card has at least 8GB of VRAM, follow these steps to test CUDA acceleration:
- Modify the value of "device-mode" in the magic-pdf.json configuration file located in your home directory.
{ "device-mode": "cuda" }
- Test CUDA acceleration with the following command:
magic-pdf -p small_ocr.pdf -o ./output
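The configuration change can be made from the command line instead of a text editor. The sketch below assumes the file already contains a "device-mode" key with a quoted string value. `set_device_mode` is a hypothetical helper, and it keeps a `.bak` copy of the original file.

```shell
#!/usr/bin/env bash
# Rewrite the "device-mode" value in a JSON config file, keeping a .bak backup.
# $1 = config file, $2 = new mode (for example "cuda").
set_device_mode() {
    sed -i.bak 's/\("device-mode"[[:space:]]*:[[:space:]]*\)"[^"]*"/\1"'"$2"'"/' "$1"
}

# Example (not run automatically, because it edits your configuration):
#   set_device_mode "$HOME/magic-pdf.json" cuda
```

If the key is absent, the file is left unchanged, so it is worth checking the result before rerunning magic-pdf.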
10. Enable CUDA Acceleration for OCR
- Download paddlepaddle-gpu. Installation will automatically enable OCR acceleration.
python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/
- Test OCR acceleration with the following command:
magic-pdf -p small_ocr.pdf -o ./output
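One rough way to judge whether acceleration took effect is to time the same command before and after the change. The helper below is a sketch with one-second resolution. `elapsed_seconds` is a hypothetical name, and the timing will also vary with model loading and disk caching.

```shell
#!/usr/bin/env bash
# Run a command, discard its output, and print how many whole seconds it took.
# The function body runs in a subshell so its variables do not leak.
elapsed_seconds() (
    start="$(date +%s)"
    "$@" >/dev/null 2>&1
    end="$(date +%s)"
    echo "$((end - start))"
)

# Example (not run automatically):
#   elapsed_seconds magic-pdf -p small_ocr.pdf -o ./output
```

Running it a few times in each configuration and comparing the typical values gives a fairer picture than a single measurement.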