mirror of
https://github.com/opendatalab/MinerU.git
synced 2026-03-27 11:08:32 +07:00
update readme
README.md
@@ -1,50 +1,64 @@
<div id="top"></div>
<div align="center">

[English](README.md) | [简体中文](README_zh-CN.md)

</div>

<div align="center">

</div>

# Magic-PDF

Convert PDFs to Markdown documents conveniently and accurately.
## Introduction

Magic-PDF is a tool that converts PDF documents to Markdown. It can process files stored locally or in object storage that supports the S3 protocol.

### Getting Started Guide
Key features include:

###### Prerequisites for development
- Support for multiple front-end model inputs
- Removal of headers, footers, footnotes, and page numbers
- Human-readable layout formatting
- Extraction and display of images and tables within Markdown
- Conversion of equations into LaTeX format
- Automatic detection and conversion of garbled PDFs
- Compatibility with CPU and GPU environments
- Available for Windows, Linux, and macOS platforms

Python 3.9+
## Getting Started

###### **Installation steps**
### Requirements

1. Clone the repo
- Python 3.9 or newer
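
The Python 3.9 requirement can be verified before installing. The following is a minimal sketch using only the standard library; the helper name `meets_minimum` is illustrative and is not part of Magic-PDF.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the requirements


def meets_minimum(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    current = tuple(version) if version is not None else tuple(sys.version_info[:2])
    # Compare only (major, minor); patch level does not matter here.
    return current[:2] >= tuple(minimum)


if __name__ == "__main__":
    if not meets_minimum():
        found = ".".join(map(str, sys.version_info[:3]))
        raise SystemExit(f"Python 3.9 or newer is required, found {found}")
    print("Python version OK")
```

Running the file as a script exits with an error message on older interpreters, which makes it usable as a pre-install check.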

```sh
git clone https://github.com/magicpdf/Magic-PDF.git
### Usage Instructions

1. **Install Magic-PDF**

```bash
pip install "magic-pdf[cpu]"  # Install the CPU version
# or
pip install "magic-pdf[gpu]"  # Install the GPU version
```
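
After installing, it can be useful to confirm that the package is visible to the current interpreter. This is a rough sketch that assumes the distribution name is `magic-pdf`, as in the `pip install` commands; the helper `installed_version` is hypothetical and relies only on the standard library's `importlib.metadata`.

```python
from importlib import metadata


def installed_version(dist_name="magic-pdf"):
    """Return the installed version string of `dist_name`, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("magic-pdf is not installed in this environment")
    else:
        print(f"magic-pdf {version} is installed")
```

The check does not distinguish between the CPU and GPU extras; it only reports whether the base distribution is present.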

2. Install the requirements
2. **Usage via Command Line**

```sh
cd Magic-PDF
pip install -r requirements.txt
```bash
magic-pdf --help
```
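
The same help command can be invoked from a script, for example in a setup check. The sketch below assumes only that the `magic-pdf` executable is on `PATH` after installation; it does not rely on any Magic-PDF Python API, and the helper `cli_help` is illustrative.

```python
import shutil
import subprocess


def cli_help(command="magic-pdf"):
    """Return the `--help` text of `command`, or None if it is not on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--help"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some CLIs print help to stderr, so fall back to it when stdout is empty.
    return result.stdout or result.stderr


if __name__ == "__main__":
    text = cli_help()
    print(text if text is not None else "magic-pdf was not found on PATH")
```

Returning `None` rather than raising keeps the missing-executable case easy to handle in a larger setup script.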

3. Run the command line
## License Information

```sh
# Linux/macOS
export PYTHONPATH=.
# Windows (PowerShell)
$env:PYTHONPATH += ";.\Magic-PDF\magic_pdf"
```
```
python magic_pdf/cli/magicpdf.py --help
```
See [LICENSE.md](https://github.com/magicpdf/Magic-PDF/blob/master/LICENSE.md) for details.

### Copyright Notice

[LICENSE.md](https://github.com/magicpdf/Magic-PDF/blob/master/LICENSE.md)

### Acknowledgments
## Acknowledgments

- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
- [PyMuPDF](https://github.com/pymupdf/PyMuPDF)