# Compare commits

`release-2.…` → `release-2.…` (732 commits; branch names truncated in the capture)
**.github/ISSUE_TEMPLATE/bug_report.yml** (16 changes; vendored)

````diff
@@ -122,7 +122,21 @@ body:
       #multiple: false
       options:
         -
-        - "2.0.x"
+        - "`<2.2.0`"
+        - "`2.2.x`"
+        - "`>=2.5`"
     validations:
       required: true

+  - type: dropdown
+    id: backend_name
+    attributes:
+      label: Backend name | 解析后端
+      #multiple: false
+      options:
+        -
+        - "vlm"
+        - "pipeline"
+    validations:
+      required: true
````
**README.md** (365 changes)

````diff
@@ -1,7 +1,7 @@
 <div align="center" xmlns="http://www.w3.org/1999/html">
   <!-- logo -->
   <p align="center">
-    <img src="docs/images/MinerU-logo.png" width="300px" style="vertical-align:middle;">
+    <img src="https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docs/images/MinerU-logo.png" width="300px" style="vertical-align:middle;">
   </p>

   <!-- icon -->
@@ -17,8 +17,9 @@
 [](https://mineru.net/OpenSourceTools/Extractor?source=github)
 [](https://huggingface.co/spaces/opendatalab/MinerU)
 [](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
-[](https://colab.research.google.com/gist/myhloli/3b3a00a4a0a61577b6c30f989092d20d/mineru_demo.ipynb)
-[](https://arxiv.org/abs/2409.18839)
+[](https://colab.research.google.com/gist/myhloli/a3cb16570ab3cfeadf9d8f0ac91b4fca/mineru_demo.ipynb)
+[](https://arxiv.org/abs/2409.18839)
+[](https://arxiv.org/abs/2509.22186)
 [](https://deepwiki.com/opendatalab/MinerU)
@@ -37,50 +38,209 @@
 <!-- join us -->

 <p align="center">
-    👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="http://mineru.space/s/V85Yl" target="_blank">WeChat</a>
+    👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="https://mineru.net/community-portal/?aliasId=3c430f94" target="_blank">WeChat</a>
 </p>

 </div>

 # Changelog
-- 2025/07/28 version 2.1.8 Released
-  - `sglang` 0.4.9.post5 version adaptation
-- 2025/07/27 version 2.1.7 Released
-  - `transformers` 4.54.0 version adaptation
-- 2025/07/26 2.1.6 Released
-  - Fixed table parsing issues in handwritten documents when using `vlm` backend
-  - Fixed visualization box position drift issue when document is rotated #3175
-- 2025/07/24 2.1.5 Released
-  - `sglang` 0.4.9 version adaptation, synchronously upgrading the dockerfile base image to sglang 0.4.9.post3
-- 2025/07/23 2.1.4 Released
-  - Bug Fixes
-    - Fixed the issue of excessive memory consumption during the `MFR` step in the `pipeline` backend under certain scenarios #2771
-    - Fixed the inaccurate matching between `image`/`table` and `caption`/`footnote` under certain conditions #3129
-- 2025/07/16 2.1.1 Released
-  - Bug fixes
-    - Fixed text block content loss issue that could occur in certain `pipeline` scenarios #3005
-    - Fixed issue where `sglang-client` required unnecessary packages like `torch` #2968
-    - Updated `dockerfile` to fix incomplete text content parsing due to missing fonts in Linux #2915
-  - Usability improvements
-    - Updated `compose.yaml` to facilitate direct startup of `sglang-server`, `mineru-api`, and `mineru-gradio` services
-    - Launched brand new [online documentation site](https://opendatalab.github.io/MinerU/), simplified readme, providing better documentation experience
-- 2025/07/05 Version 2.1.0 Released
-  - This is the first major update of MinerU 2, which includes a large number of new features and improvements, covering significant performance optimizations, user experience enhancements, and bug fixes. The detailed update contents are as follows:
-  - **Performance Optimizations:**
-    - Significantly improved preprocessing speed for documents with specific resolutions (around 2000 pixels on the long side).
-    - Greatly enhanced post-processing speed when the `pipeline` backend handles batch processing of documents with fewer pages (<10 pages).
-    - Layout analysis speed of the `pipeline` backend has been increased by approximately 20%.
-  - **Experience Enhancements:**
-    - Built-in ready-to-use `fastapi service` and `gradio webui`. For detailed usage instructions, please refer to [Documentation](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver).
-    - Adapted to `sglang` version `0.4.8`, significantly reducing the GPU memory requirements for the `vlm-sglang` backend. It can now run on graphics cards with as little as `8GB GPU memory` (Turing architecture or newer).
-    - Added transparent parameter passing for all commands related to `sglang`, allowing the `sglang-engine` backend to receive all `sglang` parameters consistently with the `sglang-server`.
-    - Supports feature extensions based on configuration files, including `custom formula delimiters`, `enabling heading classification`, and `customizing local model directories`. For detailed usage instructions, please refer to [Documentation](https://opendatalab.github.io/MinerU/usage/quick_usage/#extending-mineru-functionality-with-configuration-files).
-  - **New Features:**
-    - Updated the `pipeline` backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. [Details](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
-    - Introduced limited support for vertical text layout in the `pipeline` backend.
+- 2025/11/26 2.6.5 Released
+  - Added support for a new backend, `vlm-lmdeploy-engine`. Its usage is similar to `vlm-vllm-(async)engine`, but it uses `lmdeploy` as the inference engine and, unlike `vllm`, also supports native inference acceleration on Windows.
+
+- 2025/11/04 2.6.4 Released
+  - Added a timeout for PDF image rendering (default 300 seconds), configurable via the environment variable `MINERU_PDF_RENDER_TIMEOUT`, to prevent abnormal PDF files from blocking the rendering process for long periods.
+  - Added CPU thread-count options for ONNX models (default: the system's CPU core count), configurable via the environment variables `MINERU_INTRA_OP_NUM_THREADS` and `MINERU_INTER_OP_NUM_THREADS`, to reduce CPU resource contention in high-concurrency scenarios.
````
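The 2.6.4 knobs above are plain environment variables, so they can be set either in the shell (`export MINERU_PDF_RENDER_TIMEOUT=300`) or from Python before MinerU starts. A minimal sketch; the variable names come from the changelog entry above, while the concrete values are illustrative only:

```python
import os

# Values are illustrative; the variable names are the ones documented in 2.6.4.
os.environ["MINERU_PDF_RENDER_TIMEOUT"] = "120"    # abort PDF page rendering after 120 s (default 300)
os.environ["MINERU_INTRA_OP_NUM_THREADS"] = "4"    # ONNX threads within a single operator
os.environ["MINERU_INTER_OP_NUM_THREADS"] = "2"    # ONNX threads across independent operators

# Environment variables are read at startup, so set them before importing or launching MinerU.
```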
````diff
+- 2025/10/31 2.6.3 Released
+  - Added support for a new backend, `vlm-mlx-engine`, enabling MLX-accelerated inference for the MinerU2.5 model on Apple Silicon devices. Compared to the `vlm-transformers` backend, `vlm-mlx-engine` delivers a 100%–200% speed improvement.
+  - Bug fixes: #3849, #3859
+
+- 2025/10/24 2.6.2 Released
+  - `pipeline` backend optimizations
+    - Added experimental support for Chinese formulas, enabled by setting the environment variable `export MINERU_FORMULA_CH_SUPPORT=1`. This feature may slightly reduce MFR speed and can fail on some long formulas, so it is recommended only when Chinese formulas need to be parsed. Set the variable to `0` to disable it.
+    - `OCR` speed significantly improved by 200%–300%, thanks to the optimization provided by [@cjsdurj](https://github.com/cjsdurj)
+    - `OCR` models optimized for better accuracy and coverage of Latin-script recognition; the Cyrillic, Arabic, Devanagari, Telugu (te), and Tamil (ta) models were updated to the `ppocr-v5` version, with accuracy improved by over 40% compared to the previous models
+  - `vlm` backend optimizations
+    - Improved `table_caption` and `table_footnote` matching logic, raising the accuracy of table caption/footnote matching and the reading-order quality on pages with multiple consecutive tables
+    - Reduced CPU usage under high concurrency when using the `vllm` backend, lowering server pressure
+    - Adapted to `vllm` version 0.11.0
+  - General optimizations
+    - Improved cross-page table merging, adding support for merging cross-page continuation tables and improving results in multi-column merge scenarios
+    - Added the environment variable `MINERU_TABLE_MERGE_ENABLE` for the table-merging feature. Merging is enabled by default and can be disabled by setting the variable to `0`
````
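Both 2.6.2 switches follow the same 0/1 convention. A short sketch of toggling them and of how such a flag is conventionally read; the variable names are from the entry above, and `flag_enabled` is an illustrative helper, not a MinerU API:

```python
import os

# "1" enables, "0" disables; names as documented in the 2.6.2 entry.
os.environ["MINERU_FORMULA_CH_SUPPORT"] = "1"   # opt in to experimental Chinese-formula MFR
os.environ["MINERU_TABLE_MERGE_ENABLE"] = "0"   # opt out of cross-page table merging

def flag_enabled(name: str, default: str = "0") -> bool:
    """Illustrative reader for a 0/1 environment switch like the two above."""
    return os.environ.get(name, default) == "1"

assert flag_enabled("MINERU_FORMULA_CH_SUPPORT")
assert not flag_enabled("MINERU_TABLE_MERGE_ENABLE", default="1")
```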
````diff
+- 2025/09/26 2.5.4 Released
+  - 🎉🎉 The MinerU2.5 [Technical Report](https://arxiv.org/abs/2509.22186) is now available! We welcome you to read it for a comprehensive overview of its model architecture, training strategy, data engineering, and evaluation results.
+  - Fixed an issue where some `PDF` files were mistakenly identified as `AI` files, causing parsing failures
+
+- 2025/09/20 2.5.3 Released
+  - Adjusted dependency version ranges so that Turing and earlier GPU architectures can use vLLM-accelerated inference for the MinerU2.5 model.
+  - `pipeline` backend compatibility fixes for torch 2.8.0.
+  - Reduced the default concurrency of the vLLM async backend to lower server pressure and avoid connection closures caused by high load.
+  - More compatibility-related details can be found in the [announcement](https://github.com/opendatalab/MinerU/discussions/3548)
+
+- 2025/09/19 2.5.2 Released
+
+  We are officially releasing MinerU2.5, currently the most powerful multimodal large model for document parsing.
+  With only 1.2B parameters, MinerU2.5's accuracy on the OmniDocBench benchmark comprehensively surpasses top-tier multimodal models like Gemini 2.5 Pro, GPT-4o, and Qwen2.5-VL-72B. It also significantly outperforms leading specialized models such as dots.ocr, MonkeyOCR, and PP-StructureV3.
+  The model has been released on [HuggingFace](https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B) and [ModelScope](https://modelscope.cn/models/opendatalab/MinerU2.5-2509-1.2B). Welcome to download and use it!
+  - Core Highlights:
+    - SOTA performance with extreme efficiency: as a 1.2B model, it achieves state-of-the-art results that exceed models in the 10B and 100B+ classes, redefining the performance-per-parameter standard in document AI.
+    - Advanced architecture for across-the-board leadership: by combining a two-stage inference pipeline (decoupling layout analysis from content recognition) with a native high-resolution architecture, it achieves SOTA performance across five key areas: layout analysis, text recognition, formula recognition, table recognition, and reading order.
+    - Key capability enhancements:
+      - Layout detection: delivers more complete results by accurately covering non-body content like headers, footers, and page numbers, with more precise element localization and more natural format reconstruction for lists and references.
+      - Table parsing: drastically improves parsing of challenging cases, including rotated tables, borderless/semi-structured tables, and long or complex tables.
+      - Formula recognition: significantly boosts accuracy for complex, long-form, and hybrid Chinese-English formulas, greatly enhancing the parsing of mathematical documents.
+
+  Additionally, with the release of vlm 2.5, we have made some adjustments to the repository:
+  - The vlm backend has been upgraded to version 2.5, supporting the MinerU2.5 model and no longer compatible with the MinerU2.0-2505-0.9B model. The last version supporting the 2.0 model is mineru-2.2.2.
+  - VLM inference-related code has been moved to [mineru_vl_utils](https://github.com/opendatalab/mineru-vl-utils), reducing coupling with the main mineru repository and facilitating independent iteration in the future.
+  - The vlm accelerated-inference framework has been switched from `sglang` to `vllm`, achieving full compatibility with the vllm ecosystem, so users can run the MinerU2.5 model with accelerated inference on any platform that supports the vllm framework.
+  - Because the vlm model's major upgrade supports more layout types, the structure of the intermediate file `middle.json` and the result file `content_list.json` has changed. Please refer to the [documentation](https://opendatalab.github.io/MinerU/reference/output_files/) for details.
+
+  Other repository optimizations:
+  - Removed the file-extension whitelist for input files. When the input is a PDF document or an image, the file extension no longer matters, improving usability.
+
 <details>
 <summary>History Log</summary>

+<details>
+<summary>2025/09/10 2.2.2 Released</summary>
+<ul>
+  <li>Fixed the issue where the new table recognition model would affect the overall parsing task when some tables failed to parse</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/09/08 2.2.1 Released</summary>
+<ul>
+  <li>Fixed the issue where some newly added models were not downloaded when using the model download command.</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/09/05 2.2.0 Released</summary>
+<ul>
+  <li>
+    Major Updates
+    <ul>
+      <li>In this version, we focused on improving table parsing accuracy by introducing a new <a href="https://github.com/RapidAI/TableStructureRec">wired table recognition model</a> and a brand-new hybrid table-structure parsing algorithm, significantly enhancing the table recognition capabilities of the <code>pipeline</code> backend.</li>
+      <li>We also added support for cross-page table merging, available in both the <code>pipeline</code> and <code>vlm</code> backends, further improving the completeness and accuracy of table parsing.</li>
+    </ul>
+  </li>
+  <li>
+    Other Updates
+    <ul>
+      <li>The <code>pipeline</code> backend now supports 270-degree rotated table parsing, bringing support for tables in 0/90/270-degree orientations</li>
+      <li><code>pipeline</code> added OCR support for Thai and Greek and updated the English OCR model to the latest version: English recognition accuracy improved by 11%, the Thai model reaches 82.68% accuracy, and the Greek model 89.28% (by PPOCRv5)</li>
+      <li>Added a <code>bbox</code> field (mapped to the 0-1000 range) in the output <code>content_list.json</code>, making it convenient for users to directly obtain the position of each content block</li>
+      <li>Removed the <code>pipeline_old_linux</code> installation option, dropping support for legacy Linux systems such as <code>CentOS 7</code>, to provide better support for <code>uv</code>'s <code>sync</code>/<code>run</code> commands</li>
+    </ul>
+  </li>
+</ul>
+</details>
````
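The `bbox` field introduced in 2.2.0 is normalized to a 0-1000 grid rather than page pixels. A small sketch of converting it back to page coordinates; the field name comes from the changelog, while the scaling rule (bbox/1000 × page size) is the natural reading of "mapped to 0-1000 range" and should be verified against the output-files documentation:

```python
def bbox_to_page_coords(bbox, page_width, page_height):
    """Map a content_list.json bbox from the 0-1000 grid to page coordinates.

    bbox is assumed to be (x0, y0, x1, y1) on a 0-1000 scale; check the exact
    layout against the MinerU output_files documentation.
    """
    x0, y0, x1, y1 = bbox
    return (
        x0 / 1000 * page_width,
        y0 / 1000 * page_height,
        x1 / 1000 * page_width,
        y1 / 1000 * page_height,
    )

# Example: a block covering the top-left quarter of an A4 page at 595x842 pt.
print(bbox_to_page_coords((0, 0, 500, 500), 595, 842))  # (0.0, 0.0, 297.5, 421.0)
```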
````diff
+<details>
+<summary>2025/08/01 2.1.10 Released</summary>
+<ul>
+  <li>Fixed an issue in the <code>pipeline</code> backend where block overlap caused parsing results to deviate from expectations #3232</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/30 2.1.9 Released</summary>
+<ul>
+  <li><code>transformers</code> 4.54.1 version adaptation</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/28 2.1.8 Released</summary>
+<ul>
+  <li><code>sglang</code> 0.4.9.post5 version adaptation</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/27 2.1.7 Released</summary>
+<ul>
+  <li><code>transformers</code> 4.54.0 version adaptation</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/26 2.1.6 Released</summary>
+<ul>
+  <li>Fixed table parsing issues in handwritten documents when using the <code>vlm</code> backend</li>
+  <li>Fixed visualization box position drift when the document is rotated #3175</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/24 2.1.5 Released</summary>
+<ul>
+  <li><code>sglang</code> 0.4.9 version adaptation, synchronously upgrading the dockerfile base image to sglang 0.4.9.post3</li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/23 2.1.4 Released</summary>
+<ul>
+  <li><strong>Bug Fixes</strong>
+    <ul>
+      <li>Fixed excessive memory consumption during the <code>MFR</code> step in the <code>pipeline</code> backend under certain scenarios #2771</li>
+      <li>Fixed inaccurate matching between <code>image</code>/<code>table</code> and <code>caption</code>/<code>footnote</code> under certain conditions #3129</li>
+    </ul>
+  </li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/16 2.1.1 Released</summary>
+<ul>
+  <li><strong>Bug fixes</strong>
+    <ul>
+      <li>Fixed text block content loss that could occur in certain <code>pipeline</code> scenarios #3005</li>
+      <li>Fixed <code>sglang-client</code> requiring unnecessary packages like <code>torch</code> #2968</li>
+      <li>Updated the <code>dockerfile</code> to fix incomplete text parsing caused by missing fonts on Linux #2915</li>
+    </ul>
+  </li>
+  <li><strong>Usability improvements</strong>
+    <ul>
+      <li>Updated <code>compose.yaml</code> to facilitate direct startup of the <code>sglang-server</code>, <code>mineru-api</code>, and <code>mineru-gradio</code> services</li>
+      <li>Launched a brand-new <a href="https://opendatalab.github.io/MinerU/">online documentation site</a> and simplified the readme for a better documentation experience</li>
+    </ul>
+  </li>
+</ul>
+</details>
+
+<details>
+<summary>2025/07/05 2.1.0 Released</summary>
+<ul>
+  <li>This is the first major update of MinerU 2, including a large number of new features and improvements: significant performance optimizations, user experience enhancements, and bug fixes, as detailed below.</li>
+  <li><strong>Performance Optimizations:</strong>
+    <ul>
+      <li>Significantly improved preprocessing speed for documents with specific resolutions (around 2000 pixels on the long side).</li>
+      <li>Greatly enhanced post-processing speed when the <code>pipeline</code> backend batch-processes documents with fewer pages (<10 pages).</li>
+      <li>Layout analysis speed of the <code>pipeline</code> backend increased by approximately 20%.</li>
+    </ul>
+  </li>
+  <li><strong>Experience Enhancements:</strong>
+    <ul>
+      <li>Built-in ready-to-use <code>fastapi service</code> and <code>gradio webui</code>; for detailed usage, see the <a href="https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver">Documentation</a>.</li>
+      <li>Adapted to <code>sglang</code> version <code>0.4.8</code>, significantly reducing the GPU memory requirements of the <code>vlm-sglang</code> backend; it can now run on graphics cards with as little as <code>8GB GPU memory</code> (Turing architecture or newer).</li>
+      <li>Added transparent parameter passing for all <code>sglang</code>-related commands, so the <code>sglang-engine</code> backend accepts all <code>sglang</code> parameters, consistent with <code>sglang-server</code>.</li>
+      <li>Supports feature extensions via configuration files, including <code>custom formula delimiters</code>, <code>enabling heading classification</code>, and <code>customizing local model directories</code>; for detailed usage, see the <a href="https://opendatalab.github.io/MinerU/usage/quick_usage/#extending-mineru-functionality-with-configuration-files">Documentation</a>.</li>
+    </ul>
+  </li>
+  <li><strong>New Features:</strong>
+    <ul>
+      <li>Updated the <code>pipeline</code> backend with the PP-OCRv5 multilingual text recognition model, supporting 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. <a href="https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html">Details</a></li>
+      <li>Introduced limited support for vertical text layout in the <code>pipeline</code> backend.</li>
+    </ul>
+  </li>
+</ul>
+</details>
+
 <details>
 <summary>2025/06/20 2.0.6 Released</summary>
 <ul>
@@ -434,7 +594,7 @@ https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
 - Automatically recognize and convert formulas in the document to LaTeX format.
 - Automatically recognize and convert tables in the document to HTML format.
 - Automatically detect scanned PDFs and garbled PDFs and enable OCR functionality.
-- OCR supports detection and recognition of 84 languages.
+- OCR supports detection and recognition of 109 languages.
 - Supports multiple output formats, such as multimodal and NLP Markdown, JSON sorted by reading order, and rich intermediate formats.
 - Supports various visualization results, including layout visualization and span visualization, for efficient confirmation of output quality.
 - Supports running in a pure CPU environment, and also supports GPU(CUDA)/NPU(CANN)/MPS acceleration
@@ -471,41 +631,75 @@ A WebUI developed based on Gradio, with a simple interface and only core parsing
 > In non-mainline environments, due to the diversity of hardware and software configurations, as well as third-party dependency compatibility issues, we cannot guarantee 100% project availability. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first. Most issues already have corresponding solutions in the FAQ. We also encourage community feedback to help us gradually expand support.

 <table>
-  <tr>
-    <td>Parsing Backend</td>
-    <td>pipeline</td>
-    <td>vlm-transformers</td>
-    <td>vlm-sglang</td>
-  </tr>
-  <tr>
-    <td>Operating System</td>
-    <td>Linux / Windows / macOS</td>
-    <td>Linux / Windows</td>
-    <td>Linux / Windows (via WSL2)</td>
-  </tr>
-  <tr>
-    <td>CPU Inference Support</td>
-    <td>✅</td>
-    <td colspan="2">❌</td>
-  </tr>
-  <tr>
-    <td>GPU Requirements</td>
-    <td>Turing architecture and later, 6GB+ VRAM or Apple Silicon</td>
-    <td colspan="2">Turing architecture and later, 8GB+ VRAM</td>
-  </tr>
-  <tr>
-    <td>Memory Requirements</td>
-    <td colspan="3">Minimum 16GB+, recommended 32GB+</td>
-  </tr>
-  <tr>
-    <td>Disk Space Requirements</td>
-    <td colspan="3">20GB+, SSD recommended</td>
-  </tr>
-  <tr>
-    <td>Python Version</td>
-    <td colspan="3">3.10-3.13</td>
-  </tr>
+  <thead>
+    <tr>
+      <th rowspan="2">Parsing Backend</th>
+      <th rowspan="2">pipeline <br> (Accuracy<sup>1</sup> 82+)</th>
+      <th colspan="5">vlm (Accuracy<sup>1</sup> 90+)</th>
+    </tr>
+    <tr>
+      <th>transformers</th>
+      <th>mlx-engine</th>
+      <th>vllm-engine / <br>vllm-async-engine</th>
+      <th>lmdeploy-engine</th>
+      <th>http-client</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <th>Backend Features</th>
+      <td>Fast, no hallucinations</td>
+      <td>Good compatibility, <br>but slower</td>
+      <td>Faster than transformers</td>
+      <td>Fast, compatible with the vLLM ecosystem</td>
+      <td>Fast, compatible with the LMDeploy ecosystem</td>
+      <td>Suitable for OpenAI-compatible servers<sup>6</sup></td>
+    </tr>
+    <tr>
+      <th>Operating System</th>
+      <td colspan="2" style="text-align:center;">Linux<sup>2</sup> / Windows / macOS</td>
+      <td style="text-align:center;">macOS<sup>3</sup></td>
+      <td style="text-align:center;">Linux<sup>2</sup> / Windows<sup>4</sup></td>
+      <td style="text-align:center;">Linux<sup>2</sup> / Windows<sup>5</sup></td>
+      <td>Any</td>
+    </tr>
+    <tr>
+      <th>CPU inference support</th>
+      <td colspan="2" style="text-align:center;">✅</td>
+      <td colspan="3" style="text-align:center;">❌</td>
+      <td>Not required</td>
+    </tr>
+    <tr>
+      <th>GPU Requirements</th>
+      <td colspan="2" style="text-align:center;">Volta or later architectures, 6 GB VRAM or more, or Apple Silicon</td>
+      <td>Apple Silicon</td>
+      <td colspan="2" style="text-align:center;">Volta or later architectures, 8 GB VRAM or more</td>
+      <td>Not required</td>
+    </tr>
+    <tr>
+      <th>Memory Requirements</th>
+      <td colspan="5" style="text-align:center;">Minimum 16 GB, 32 GB recommended</td>
+      <td>8 GB</td>
+    </tr>
+    <tr>
+      <th>Disk Space Requirements</th>
+      <td colspan="5" style="text-align:center;">20 GB or more, SSD recommended</td>
+      <td>2 GB</td>
+    </tr>
+    <tr>
+      <th>Python Version</th>
+      <td colspan="6" style="text-align:center;">3.10-3.13<sup>7</sup></td>
+    </tr>
+  </tbody>
 </table>

+<sup>1</sup> Accuracy metric is the End-to-End Evaluation Overall score of OmniDocBench (v1.5), tested on the latest `MinerU` version.
+<sup>2</sup> Linux supports only distributions released in 2019 or later.
+<sup>3</sup> MLX requires macOS 13.5 or later; version 14.0 or higher is recommended.
+<sup>4</sup> Windows vLLM support via WSL2 (Windows Subsystem for Linux).
+<sup>5</sup> Windows LMDeploy can only use the `turbomind` backend, which is slightly slower than the `pytorch` backend; if performance is critical, run it via WSL2.
+<sup>6</sup> Servers compatible with the OpenAI API, such as local or remote model services deployed via inference frameworks like `vLLM`, `SGLang`, or `LMDeploy`.
+<sup>7</sup> Windows + LMDeploy only supports Python versions 3.10–3.12, as the critical dependency `ray` does not yet support Python 3.13 on Windows.
````
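Footnote 6's `http-client` backend targets any OpenAI-compatible endpoint. To illustrate what that compatibility means, here is a hedged sketch using the `openai` Python client against a locally served model; the URL, port, and model name are placeholders, and this only demonstrates the endpoint shape (`vLLM`, `SGLang`, and `LMDeploy` all expose this `/v1`-style API when serving), not the document-parsing requests MinerU itself sends:

```python
from openai import OpenAI  # pip install openai

# Placeholder endpoint: wherever your vLLM/SGLang/LMDeploy server is listening.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="opendatalab/MinerU2.5-2509-1.2B",  # model name as registered on the server
    messages=[{"role": "user", "content": "ping"}],
)
print(response.choices[0].message.content)
```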
````diff
 ### Install MinerU

@@ -524,8 +718,8 @@ uv pip install -e .[core]
 ```

 > [!TIP]
-> `mineru[core]` includes all core features except `sglang` acceleration, compatible with Windows / Linux / macOS systems, suitable for most users.
-> If you need to use `sglang` acceleration for VLM model inference or install a lightweight client on edge devices, please refer to the documentation [Extension Modules Installation Guide](https://opendatalab.github.io/MinerU/quick_start/extension_modules/).
+> `mineru[core]` includes all core features except `vLLM`/`LMDeploy` acceleration, compatible with Windows / Linux / macOS systems, suitable for most users.
+> If you need to use `vLLM`/`LMDeploy` acceleration for VLM model inference or install a lightweight client on edge devices, please refer to the documentation [Extension Modules Installation Guide](https://opendatalab.github.io/MinerU/quick_start/extension_modules/).

 ---

@@ -553,8 +747,8 @@ You can use MinerU for PDF parsing through various methods such as command line,
 - [x] Handwritten Text Recognition
 - [x] Vertical Text Recognition
 - [x] Latin Accent Mark Recognition
-- [ ] Code block recognition in the main text
-- [ ] [Chemical formula recognition](docs/chemical_knowledge_introduction/introduction.pdf)
+- [x] Code block recognition in the main text
+- [x] [Chemical formula recognition](docs/chemical_knowledge_introduction/introduction.pdf) (mineru.net)
 - [ ] Geometric shape recognition

 # Known Issues

@@ -572,7 +766,7 @@ You can use MinerU for PDF parsing through various methods such as command line,
 - If you encounter any issues during usage, you can first check the [FAQ](https://opendatalab.github.io/MinerU/faq/) for solutions.
 - If your issue remains unresolved, you may also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to interact with an AI assistant, which can address most common problems.
-- If you still cannot resolve the issue, you are welcome to join our community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to discuss with other users and developers.
+- If you still cannot resolve the issue, you are welcome to join our community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](https://mineru.net/community-portal/?aliasId=3c430f94) to discuss with other users and developers.

 # All Thanks To Our Contributors

@@ -592,6 +786,7 @@ Currently, some models in this project are trained based on YOLO. However, since
 - [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
 - [UniMERNet](https://github.com/opendatalab/UniMERNet)
 - [RapidTable](https://github.com/RapidAI/RapidTable)
+- [TableStructureRec](https://github.com/RapidAI/TableStructureRec)
 - [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
 - [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch)
 - [layoutreader](https://github.com/ppaanngggg/layoutreader)
@@ -601,10 +796,21 @@ Currently, some models in this project are trained based on YOLO. However, since
 - [pdftext](https://github.com/datalab-to/pdftext)
 - [pdfminer.six](https://github.com/pdfminer/pdfminer.six)
 - [pypdf](https://github.com/py-pdf/pypdf)
+- [magika](https://github.com/google/magika)

 # Citation

 ```bibtex
+@misc{niu2025mineru25decoupledvisionlanguagemodel,
+      title={MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing},
+      author={Junbo Niu and Zheng Liu and Zhuangcheng Gu and Bin Wang and Linke Ouyang and Zhiyuan Zhao and Tao Chu and Tianyao He and Fan Wu and Qintong Zhang and Zhenjiang Jin and Guang Liang and Rui Zhang and Wenzheng Zhang and Yuan Qu and Zhifei Ren and Yuefeng Sun and Yuanhong Zheng and Dongsheng Ma and Zirui Tang and Boyu Niu and Ziyang Miao and Hejun Dong and Siyi Qian and Junyuan Zhang and Jingzhou Chen and Fangdong Wang and Xiaomeng Zhao and Liqun Wei and Wei Li and Shasha Wang and Ruiliang Xu and Yuanyuan Cao and Lu Chen and Qianqian Wu and Huaiyu Gu and Lindong Lu and Keming Wang and Dechen Lin and Guanlin Shen and Xuanhe Zhou and Linfeng Zhang and Yuhang Zang and Xiaoyi Dong and Jiaqi Wang and Bo Zhang and Lei Bai and Pei Chu and Weijia Li and Jiang Wu and Lijun Wu and Zhenxiang Li and Guangyu Wang and Zhongying Tu and Chao Xu and Kai Chen and Yu Qiao and Bowen Zhou and Dahua Lin and Wentao Zhang and Conghui He},
+      year={2025},
+      eprint={2509.22186},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2509.22186},
+}
+
 @misc{wang2024mineruopensourcesolutionprecise,
       title={MinerU: An Open-Source Solution for Precise Document Content Extraction},
       author={Bin Wang and Chao Xu and Xiaomeng Zhao and Linke Ouyang and Fan Wu and Zhiyuan Zhao and Rui Xu and Kaiwen Liu and Yuan Qu and Fukai Shang and Bo Zhang and Liqun Wei and Zhihao Sui and Wei Li and Botian Shi and Yu Qiao and Dahua Lin and Conghui He},
@@ -643,3 +849,4 @@ Currently, some models in this project are trained based on YOLO. However, since
 - [OmniDocBench (A Comprehensive Benchmark for Document Parsing and Evaluation)](https://github.com/opendatalab/OmniDocBench)
 - [Magic-HTML (Mixed web page extraction tool)](https://github.com/opendatalab/magic-html)
 - [Magic-Doc (Fast speed ppt/pptx/doc/docx/pdf extraction tool)](https://github.com/InternLM/magic-doc)
+- [Dingo: A Comprehensive AI Data Quality Evaluation Tool](https://github.com/MigoXLab/dingo)
````
375
README_zh-CN.md
@@ -1,7 +1,7 @@
|
||||
<div align="center" xmlns="http://www.w3.org/1999/html">
|
||||
<!-- logo -->
|
||||
<p align="center">
|
||||
<img src="docs/images/MinerU-logo.png" width="300px" style="vertical-align:middle;">
|
||||
<img src="https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docs/images/MinerU-logo.png" width="300px" style="vertical-align:middle;">
|
||||
</p>
|
||||
|
||||
<!-- icon -->
|
||||
@@ -17,8 +17,9 @@
|
||||
[](https://mineru.net/OpenSourceTools/Extractor?source=github)
|
||||
[](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
|
||||
[](https://huggingface.co/spaces/opendatalab/MinerU)
|
||||
[](https://colab.research.google.com/gist/myhloli/3b3a00a4a0a61577b6c30f989092d20d/mineru_demo.ipynb)
|
||||
[](https://arxiv.org/abs/2409.18839)
|
||||
[](https://colab.research.google.com/gist/myhloli/a3cb16570ab3cfeadf9d8f0ac91b4fca/mineru_demo.ipynb)
|
||||
[](https://arxiv.org/abs/2409.18839)
|
||||
[](https://arxiv.org/abs/2509.22186)
|
||||
[](https://deepwiki.com/opendatalab/MinerU)
|
||||
|
||||
|
||||
@@ -37,50 +38,211 @@
|
||||
<!-- join us -->
|
||||
|
||||
<p align="center">
|
||||
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="http://mineru.space/s/V85Yl" target="_blank">WeChat</a>
|
||||
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="https://mineru.net/community-portal/?aliasId=3c430f94" target="_blank">WeChat</a>
|
||||
</p>
|
||||
|
||||
</div>
|
||||
|
||||
# 更新记录
|
||||
- 2025/07/28 2.1.8 发布
|
||||
- `sglang` 0.4.9.post5 版本适配
|
||||
- 2025/07/27 2.1.7 发布
|
||||
- `transformers` 4.54.0 版本适配
|
||||
- 2025/07/26 2.1.6 发布
|
||||
- 修复`vlm`后端解析部分手写文档时的表格异常问题
|
||||
- 修复文档旋转时可视化框位置漂移问题 #3175
|
||||
- 2025/07/24 2.1.5 发布
|
||||
- `sglang` 0.4.9 版本适配,同步升级dockerfile基础镜像为sglang 0.4.9.post3
|
||||
- 2025/07/23 2.1.4 发布
|
||||
- bug修复
|
||||
- 修复`pipeline`后端中`MFR`步骤在某些情况下显存消耗过大的问题 #2771
|
||||
- 修复某些情况下`image`/`table`与`caption`/`footnote`匹配不准确的问题 #3129
|
||||
- 2025/07/16 2.1.1 发布
|
||||
- bug修复
|
||||
- 修复`pipeline`在某些情况可能发生的文本块内容丢失问题 #3005
|
||||
- 修复`sglang-client`需要安装`torch`等不必要的包的问题 #2968
|
||||
- 更新`dockerfile`以修复linux字体缺失导致的解析文本内容不完整问题 #2915
|
||||
- 易用性更新
|
||||
- 更新`compose.yaml`,便于用户直接启动`sglang-server`、`mineru-api`、`mineru-gradio`服务
|
||||
- 启用全新的[在线文档站点](https://opendatalab.github.io/MinerU/zh/),简化readme,提供更好的文档体验
|
||||
- 2025/07/05 2.1.0 发布
|
||||
- 这是 MinerU 2 的第一个大版本更新,包含了大量新功能和改进,包含众多性能优化、体验优化和bug修复,具体更新内容如下:
|
||||
- 性能优化:
|
||||
- 大幅提升某些特定分辨率(长边2000像素左右)文档的预处理速度
|
||||
- 大幅提升`pipeline`后端批量处理大量页数较少(<10)文档时的后处理速度
|
||||
- `pipeline`后端的layout分析速度提升约20%
|
||||
- 体验优化:
|
||||
- 内置开箱即用的`fastapi服务`和`gradio webui`,详细使用方法请参考[文档](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#apiwebuisglang-clientserver)
|
||||
- `sglang`适配`0.4.8`版本,大幅降低`vlm-sglang`后端的显存要求,最低可在`8G显存`(Turing及以后架构)的显卡上运行
|
||||
- 对所有命令增加`sglang`的参数透传,使得`sglang-engine`后端可以与`sglang-server`一致,接收`sglang`的所有参数
|
||||
- 支持基于配置文件的功能扩展,包含`自定义公式标识符`、`开启标题分级功能`、`自定义本地模型目录`,详细使用方法请参考[文档](https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#mineru_1)
|
||||
- 新特性:
|
||||
- `pipeline`后端更新 PP-OCRv5 多语种文本识别模型,支持法语、西班牙语、葡萄牙语、俄语、韩语等 37 种语言的文字识别,平均精度涨幅超30%。[详情](https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html)
|
||||
- `pipeline`后端增加对竖排文本的有限支持
|
||||
|
||||
- 2025/11/26 2.6.5 发布
|
||||
- 增加新后端`vlm-lmdeploy-engine`支持,使用方式与`vlm-vllm-(async)engine`类似,但使用`lmdeploy`作为推理引擎,与`vllm`相比额外支持Windows平台原生推理加速。
|
||||
- 新增国产算力平台`昇腾/npu`、`平头哥/ppu`、`沐曦/maca`的适配支持,用户可在对应平台上使用`pipeline`与`vlm`模型,并使用`vllm`/`lmdeploy`引擎加速vlm模型推理,具体使用方式请参考[其他加速卡适配](https://opendatalab.github.io/MinerU/zh/usage/)。
|
||||
- 国产平台适配不易,我们已尽量确保适配的完整性和稳定性,但仍可能存在一些稳定性/兼容问题与精度对齐问题,请大家根据适配文档页面内红绿灯情况自行选择合适的环境与场景进行使用。
|
||||
- 如在使用国产化平台适配方案的过程中遇到任何文档未提及的问题,为便于其他用户查找解决方案,请在discussions的[指定帖子](https://github.com/opendatalab/MinerU/discussions/4064)中进行反馈。
|
||||
|
||||
- 2025/11/04 2.6.4 发布
|
||||
- 为pdf渲染图片增加超时配置,默认为300秒,可通过环境变量`MINERU_PDF_RENDER_TIMEOUT`进行配置,防止部分异常pdf文件导致渲染过程长时间阻塞。
|
||||
- 为onnx模型增加cpu线程数配置选项,默认为系统cpu核心数,可通过环境变量`MINERU_INTRA_OP_NUM_THREADS`和`MINERU_INTER_OP_NUM_THREADS`进行配置,以减少高并发场景下的对cpu资源的抢占冲突。
|
||||
|
||||
- 2025/10/31 2.6.3 发布
|
||||
- 增加新后端`vlm-mlx-engine`支持,在Apple Silicon设备上支持使用`MLX`加速`MinerU2.5`模型推理,相比`vlm-transformers`后端,`vlm-mlx-engine`后端速度提升100%~200%。
|
||||
- bug修复: #3849 #3859
|
||||
|
||||
- 2025/10/24 2.6.2 发布
|
||||
- `pipline`后端优化
|
||||
- 增加对中文公式的实验性支持,可通过配置环境变量`export MINERU_FORMULA_CH_SUPPORT=1`开启。该功能可能会导致MFR速率略微下降、部分长公式识别失败等问题,建议仅在需要解析中文公式的场景下开启。如需关闭该功能,可将环境变量设置为`0`。
|
||||
- `OCR`速度大幅提升200%~300%,感谢 [@cjsdurj](https://github.com/cjsdurj) 提供的优化方案
|
||||
- `OCR`模型优化拉丁文识别的准度和广度,并更新西里尔文(cyrillic)、阿拉伯文(arabic)、天城文(devanagari)、泰卢固语(te)、泰米尔语(ta)语系至`ppocr-v5`版本,精度相比上代模型提升40%以上
|
||||
- `vlm`后端优化
|
||||
- `table_caption`、`table_footnote`匹配逻辑优化,提升页内多张连续表场景下的表格标题和脚注的匹配准确率和阅读顺序合理性
|
||||
- 优化使用`vllm`后端时高并发时的cpu资源占用,降低服务端压力
|
||||
- 适配`vllm`0.11.0版本
|
||||
- 通用优化
|
||||
- 跨页表格合并效果优化,新增跨页续表合并支持,提升在多列合并场景下的表格合并效果
|
||||
- 为表格合并功能增加环境变量配置选项`MINERU_TABLE_MERGE_ENABLE`,表格合并功能默认开启,可通过设置该变量为`0`来关闭表格合并功能
|
||||
|
||||
- 2025/09/26 2.5.4 发布
|
||||
- 🎉🎉 MinerU2.5[技术报告](https://arxiv.org/abs/2509.22186)现已发布,欢迎阅读全面了解其模型架构、训练策略、数据工程和评测结果。
|
||||
- 修复部分`pdf`文件被识别成`ai`文件导致无法解析的问题
|
||||
|
||||
- 2025/09/20 2.5.3 发布
|
||||
- 依赖版本范围调整,使得Turing及更早架构显卡可以使用vLLM加速推理MinerU2.5模型。
|
||||
- `pipeline`后端对torch 2.8.0的一些兼容性修复。
|
||||
- 降低vLLM异步后端默认的并发数,降低服务端压力以避免高压导致的链接关闭问题。
|
||||
- 更多兼容性相关内容详见[公告](https://github.com/opendatalab/MinerU/discussions/3547)
|
||||
|
||||
- 2025/09/19 2.5.2 发布
|
||||
我们正式发布 MinerU2.5,当前最强文档解析多模态大模型。仅凭 1.2B 参数,MinerU2.5 在 OmniDocBench 文档解析评测中,精度已全面超越 Gemini2.5-Pro、GPT-4o、Qwen2.5-VL-72B等顶级多模态大模型,并显著领先于主流文档解析专用模型(如 dots.ocr, MonkeyOCR, PP-StructureV3 等)。
|
||||
模型已发布至[HuggingFace](https://huggingface.co/opendatalab/MinerU2.5-2509-1.2B)和[ModelScope](https://modelscope.cn/models/opendatalab/MinerU2.5-2509-1.2B)平台,欢迎大家下载使用!
|
||||
- 核心亮点
|
||||
- 极致能效,性能SOTA: 以 1.2B 的轻量化规模,实现了超越百亿乃至千亿级模型的SOTA性能,重新定义了文档解析的能效比。
|
||||
- 先进架构,全面领先: 通过 “两阶段推理” (解耦布局分析与内容识别) 与 原生高分辨率架构 的结合,在布局分析、文本识别、公式识别、表格识别及阅读顺序五大方面均达到 SOTA 水平。
|
||||
- 关键能力提升
|
||||
- 布局检测: 结果更完整,精准覆盖页眉、页脚、页码等非正文内容;同时提供更精准的元素定位与更自然的格式还原(如列表、参考文献)。
|
||||
- 表格解析: 大幅优化了对旋转表格、无线/少线表、以及长难表格的解析能力。
|
||||
- 公式识别: 显著提升中英混合公式及复杂长公式的识别准确率,大幅改善数学类文档解析能力。
|
||||
|
||||
此外,伴随vlm 2.5的发布,我们对仓库做出一些调整:
|
||||
- vlm后端升级至2.5版本,支持MinerU2.5模型,不再兼容MinerU2.0-2505-0.9B模型,最后一个支持2.0模型的版本为mineru-2.2.2。
|
||||
- vlm推理相关代码已移至[mineru_vl_utils](https://github.com/opendatalab/mineru-vl-utils),降低与mineru主仓库的耦合度,便于后续独立迭代。
|
||||
- vlm加速推理框架从`sglang`切换至`vllm`,并实现对vllm生态的完全兼容,使得用户可以在任何支持vllm框架的平台上使用MinerU2.5模型并加速推理。
|
||||
- 由于vlm模型的重大升级,支持更多layout type,因此我们对解析的中间文件`middle.json`和结果文件`content_list.json`的结构做出一些调整,请参考[文档](https://opendatalab.github.io/MinerU/zh/reference/output_files/)了解详情。
|
||||
|
||||
其他仓库优化:
|
||||
- 移除对输入文件的后缀名白名单校验,当输入文件为PDF文档或图片时,对文件的后缀名不再有要求,提升易用性。
|
||||
|
||||
<details>
|
||||
<summary>历史日志</summary>
|
||||
|
||||
<details>
|
||||
<summary>2025/09/10 2.2.2 发布</summary>
|
||||
<ul>
|
||||
<li>修复新的表格识别模型在部分表格解析失败时影响整体解析任务的问题</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/09/08 2.2.1 发布</summary>
|
||||
<ul>
|
||||
<li>修复使用模型下载命令时,部分新增模型未下载的问题</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/09/05 2.2.0 发布</summary>
|
||||
<ul>
|
||||
<li>
|
||||
主要更新
|
||||
<ul>
|
||||
<li>在这个版本我们重点提升了表格的解析精度,通过引入新的<a href="https://github.com/RapidAI/TableStructureRec">有线表识别模型</a>和全新的混合表格结构解析算法,显著提升了<code>pipeline</code>后端的表格识别能力。</li>
|
||||
<li>另外我们增加了对跨页表格合并的支持,这一功能同时支持<code>pipeline</code>和<code>vlm</code>后端,进一步提升了表格解析的完整性和准确性。</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>
|
||||
其他更新
|
||||
<ul>
|
||||
<li><code>pipeline</code>后端增加270度旋转的表格解析能力,现已支持0/90/270度三个方向的表格解析</li>
|
||||
<li><code>pipeline</code>增加对泰文、希腊文的ocr能力支持,并更新了英文ocr模型至最新,英文识别精度提升11%,泰文识别模型精度 82.68%,希腊文识别模型精度 89.28%(by PPOCRv5)</li>
|
||||
<li>在输出的<code>content_list.json</code>中增加了<code>bbox</code>字段(映射至0-1000范围内),方便用户直接获取每个内容块的位置信息</li>
|
||||
<li>移除<code>pipeline_old_linux</code>安装可选项,不再支持老版本的Linux系统如<code>Centos 7</code>等,以便对<code>uv</code>的<code>sync</code>/<code>run</code>等命令进行更好的支持</li>
|
||||
</ul>
|
||||
</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/08/01 2.1.10 发布</summary>
|
||||
<ul>
|
||||
<li>修复<code>pipeline</code>后端因block覆盖导致的解析结果与预期不符 #3232</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/07/30 2.1.9 发布</summary>
|
||||
<ul>
|
||||
<li><code>transformers</code> 4.54.1 版本适配</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/07/28 2.1.8 发布</summary>
|
||||
<ul>
|
||||
<li><code>sglang</code> 0.4.9.post5 版本适配</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/07/27 2.1.7 发布</summary>
|
||||
<ul>
|
||||
<li><code>transformers</code> 4.54.0 版本适配</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/07/26 2.1.6 发布</summary>
|
||||
<ul>
|
||||
<li>修复<code>vlm</code>后端解析部分手写文档时的表格异常问题</li>
|
||||
<li>修复文档旋转时可视化框位置漂移问题 #3175</li>
|
||||
</ul>
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>2025/07/24 2.1.5 发布</summary>
|
||||
<ul>
|
||||
<li><code>sglang</code> 0.4.9 版本适配,同步升级dockerfile基础镜像为sglang 0.4.9.post3</li>
|
||||
</ul>
|
||||
</details>

<details>
<summary>2025/07/23 2.1.4 released</summary>
<ul>
<li><strong>Bug fixes</strong>
<ul>
<li>Fixed excessive VRAM consumption in the <code>MFR</code> step of the <code>pipeline</code> backend in some cases #2771</li>
<li>Fixed inaccurate matching between <code>image</code>/<code>table</code> and <code>caption</code>/<code>footnote</code> in some cases #3129</li>
</ul>
</li>
</ul>
</details>

<details>
<summary>2025/07/16 2.1.1 released</summary>
<ul>
<li><strong>Bug fixes</strong>
<ul>
<li>Fixed text-block content loss that could occur with <code>pipeline</code> in some cases #3005</li>
<li>Fixed <code>sglang-client</code> requiring unnecessary packages such as <code>torch</code> #2968</li>
<li>Updated the <code>dockerfile</code> to fix incomplete parsed text caused by missing fonts on Linux #2915</li>
</ul>
</li>
<li><strong>Usability updates</strong>
<ul>
<li>Updated <code>compose.yaml</code> so users can directly start the <code>sglang-server</code>, <code>mineru-api</code>, and <code>mineru-gradio</code> services</li>
<li>Launched a brand-new <a href="https://opendatalab.github.io/MinerU/zh/">online documentation site</a>, simplifying the readme for a better documentation experience</li>
</ul>
</li>
</ul>
</details>

<details>
<summary>2025/07/05 2.1.0 released</summary>
<p>This is the first major update of MinerU 2, with many new features and improvements, including numerous performance optimizations, usability improvements, and bug fixes:</p>
<ul>
<li><strong>Performance optimizations:</strong>
<ul>
<li>Greatly improved preprocessing speed for documents at certain resolutions (around 2000 px on the long side)</li>
<li>Greatly improved post-processing speed when the <code>pipeline</code> backend batch-processes large numbers of short documents (&lt;10 pages)</li>
<li>Layout analysis in the <code>pipeline</code> backend is about 20% faster</li>
</ul>
</li>
<li><strong>Usability improvements:</strong>
<ul>
<li>Built-in, out-of-the-box <code>fastapi service</code> and <code>gradio webui</code>; see the <a href="https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#apiwebuisglang-clientserver">documentation</a> for detailed usage</li>
<li>Adapted <code>sglang</code> to version <code>0.4.8</code>, greatly reducing the VRAM requirements of the <code>vlm-sglang</code> backend, which can now run on GPUs with <code>8 GB VRAM</code> (Turing architecture or later)</li>
<li>All commands now pass <code>sglang</code> parameters through, so the <code>sglang-engine</code> backend accepts all <code>sglang</code> parameters, consistent with <code>sglang-server</code></li>
<li>Config-file based feature extensions, including <code>custom formula delimiters</code>, <code>enabling heading-level detection</code>, and <code>custom local model directories</code>; see the <a href="https://opendatalab.github.io/MinerU/zh/usage/quick_usage/#mineru_1">documentation</a> for detailed usage</li>
</ul>
</li>
<li><strong>New features:</strong>
<ul>
<li>The <code>pipeline</code> backend updates the PP-OCRv5 multilingual text recognition models, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy gain of over 30%. <a href="https://paddlepaddle.github.io/PaddleOCR/latest/version3.x/algorithm/PP-OCRv5/PP-OCRv5_multi_languages.html">Details</a></li>
<li>The <code>pipeline</code> backend adds limited support for vertical text</li>
</ul>
</li>
</ul>
</details>

<details>
<summary>2025/06/20 2.0.6 released</summary>
<ul>
@@ -423,7 +585,7 @@ https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c

- Automatically recognize and convert formulas in documents to LaTeX format
- Automatically recognize and convert tables in documents to HTML format
- Automatically detect scanned and garbled PDFs and enable OCR
- OCR supports detection and recognition of 84 languages
- OCR supports detection and recognition of 109 languages
- Support multiple output formats, such as multimodal and NLP Markdown, reading-order-sorted JSON, and information-rich intermediate formats
- Support multiple visualization results, including layout and span visualization, for efficient verification of outputs and quality inspection
- Support running in a pure CPU environment, with GPU (CUDA) / NPU (CANN) / MPS acceleration
@@ -458,42 +620,80 @@ https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
>
> In non-mainstream environments, due to the diversity of hardware and software configurations as well as compatibility issues with third-party dependencies, we cannot guarantee that the project is 100% usable. Users who wish to use this project in non-recommended environments should therefore read the documentation and the FAQ carefully first; most issues already have corresponding solutions in the FAQ. We also encourage community feedback on issues so that we can gradually widen the range of supported setups.

<table>
<tr>
<td>Parsing backend</td>
<td>pipeline</td>
<td>vlm-transformers</td>
<td>vlm-sglang</td>
</tr>
<tr>
<td>Operating system</td>
<td>Linux / Windows / macOS</td>
<td>Linux / Windows</td>
<td>Linux / Windows (via WSL2)</td>
</tr>
<tr>
<td>CPU inference support</td>
<td>✅</td>
<td colspan="2">❌</td>
</tr>
<tr>
<td>GPU requirements</td>
<td>Turing architecture or later, 6 GB+ VRAM or Apple Silicon</td>
<td colspan="2">Turing architecture or later, 8 GB+ VRAM</td>
</tr>
<tr>
<td>Memory requirements</td>
<td colspan="3">Minimum 16 GB, 32 GB recommended</td>
</tr>
<tr>
<td>Disk space requirements</td>
<td colspan="3">20 GB or more, SSD recommended</td>
</tr>
<tr>
<td>Python version</td>
<td colspan="3">3.10-3.13</td>
</tr>
</table>
<thead>
<tr>
<th rowspan="2">Parsing backend</th>
<th rowspan="2">pipeline <br> (accuracy<sup>1</sup> 82+)</th>
<th colspan="5">vlm (accuracy<sup>1</sup> 90+)</th>
</tr>
<tr>
<th>transformers</th>
<th>mlx-engine</th>
<th>vllm-engine / <br>vllm-async-engine</th>
<th>lmdeploy-engine</th>
<th>http-client</th>
</tr>
</thead>
<tbody>
<tr>
<th>Backend features</th>
<td>Fast, no hallucinations</td>
<td>Good compatibility, but slower</td>
<td>Faster than transformers</td>
<td>Fast, compatible with the vLLM ecosystem</td>
<td>Fast, compatible with the LMDeploy ecosystem</td>
<td>Suitable for OpenAI-compatible servers<sup>6</sup></td>
</tr>
<tr>
<th>Operating system</th>
<td colspan="2" style="text-align:center;">Linux<sup>2</sup> / Windows / macOS</td>
<td style="text-align:center;">macOS<sup>3</sup></td>
<td style="text-align:center;">Linux<sup>2</sup> / Windows<sup>4</sup> </td>
<td style="text-align:center;">Linux<sup>2</sup> / Windows<sup>5</sup> </td>
<td>Any</td>
</tr>
<tr>
<th>CPU inference support</th>
<td colspan="2" style="text-align:center;">✅</td>
<td colspan="3" style="text-align:center;">❌</td>
<td>Not required</td>
</tr>
<tr>
<th>GPU requirements</th><td colspan="2" style="text-align:center;">Volta architecture or later, 6 GB+ VRAM, or Apple Silicon</td>
<td>Apple Silicon</td>
<td colspan="2" style="text-align:center;">Volta architecture or later, 8 GB+ VRAM</td>
<td>Not required</td>
</tr>
<tr>
<th>Memory requirements</th>
<td colspan="5" style="text-align:center;">Minimum 16 GB, 32 GB recommended</td>
<td>8 GB</td>
</tr>
<tr>
<th>Disk space requirements</th>
<td colspan="5" style="text-align:center;">20 GB or more, SSD recommended</td>
<td>2 GB</td>
</tr>
<tr>
<th>Python version</th>
<td colspan="6" style="text-align:center;">3.10-3.13<sup>7</sup></td>
</tr>
</tbody>
</table>

<sup>1</sup> The accuracy metric is the End-to-End Evaluation Overall score on OmniDocBench (v1.5), measured with the latest `MinerU` version.
<sup>2</sup> Linux supports only distributions released in 2019 or later.
<sup>3</sup> MLX requires macOS 13.5 or later; version 14.0 or later is recommended.
<sup>4</sup> Windows vLLM support is provided via WSL2 (Windows Subsystem for Linux).
<sup>5</sup> On Windows, LMDeploy can only use the `turbomind` backend, which is slightly slower than the `pytorch` backend; if speed matters, run it via WSL2.
<sup>6</sup> Servers compatible with the OpenAI API, such as local model servers deployed via inference frameworks like `vLLM`/`SGLang`/`LMDeploy`, or remote model services.
<sup>7</sup> Windows + LMDeploy supports only Python 3.10-3.12, because the key dependency `ray` does not support Python 3.13 on Windows.

> [!TIP]
> Beyond the mainstream environments and platforms above, we also collect support reports for other platforms from community users; see [Other accelerator adaptations](https://opendatalab.github.io/MinerU/zh/usage/) for details.
> If you would like to share your environment-adaptation experience with the community, submit it via [show-and-tell](https://github.com/opendatalab/MinerU/discussions/categories/show-and-tell) or open a PR against the [Other accelerator adaptations](https://github.com/opendatalab/MinerU/tree/master/docs/zh/usage/acceleration_cards) docs.

### Install MinerU

@@ -512,8 +712,8 @@ uv pip install -e .[core] -i https://mirrors.aliyun.com/pypi/simple
```

> [!TIP]
> `mineru[core]` includes all core features except `sglang` acceleration; it is compatible with Windows / Linux / macOS and suits most users.
> If you need `sglang` to accelerate VLM model inference, or need to install a lightweight client on edge devices, see the [extension modules installation guide](https://opendatalab.github.io/MinerU/zh/quick_start/extension_modules/).
> `mineru[core]` includes all core features except `vLLM`/`LMDeploy` acceleration; it is compatible with Windows / Linux / macOS and suits most users.
> If you need `vLLM`/`LMDeploy` to accelerate VLM model inference, or need to install a lightweight client on edge devices, see the [extension modules installation guide](https://opendatalab.github.io/MinerU/zh/quick_start/extension_modules/).

---

@@ -541,8 +741,8 @@ mineru -p <input_path> -o <output_path>

- [x] Handwritten text recognition
- [x] Vertical text recognition
- [x] Latin accent mark recognition
- [ ] Code block recognition in body text
- [ ] [Chemical formula recognition](docs/chemical_knowledge_introduction/introduction.pdf)
- [x] Code block recognition in body text
- [x] [Chemical formula recognition](docs/chemical_knowledge_introduction/introduction.pdf)(https://mineru.net)
- [ ] Chart content recognition

# Known Issues

@@ -560,7 +760,7 @@ mineru -p <input_path> -o <output_path>

- If you run into problems while using MinerU, first check the [FAQ](https://opendatalab.github.io/MinerU/zh/faq/) for an answer.
- If the FAQ does not solve your problem, you can also talk to the AI assistant on [DeepWiki](https://deepwiki.com/opendatalab/MinerU), which resolves most common issues.
- If you still cannot solve the problem, join the community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to talk with other users and developers.
- If you still cannot solve the problem, join the community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](https://mineru.net/community-portal/?aliasId=3c430f94) to talk with other users and developers.

# All Thanks To Our Contributors

@@ -580,6 +780,7 @@ mineru -p <input_path> -o <output_path>

- [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO)
- [UniMERNet](https://github.com/opendatalab/UniMERNet)
- [RapidTable](https://github.com/RapidAI/RapidTable)
- [TableStructureRec](https://github.com/RapidAI/TableStructureRec)
- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
- [PaddleOCR2Pytorch](https://github.com/frotms/PaddleOCR2Pytorch)
- [layoutreader](https://github.com/ppaanngggg/layoutreader)
@@ -589,10 +790,21 @@ mineru -p <input_path> -o <output_path>
- [pdftext](https://github.com/datalab-to/pdftext)
- [pdfminer.six](https://github.com/pdfminer/pdfminer.six)
- [pypdf](https://github.com/py-pdf/pypdf)
- [magika](https://github.com/google/magika)

# Citation

```bibtex
@misc{niu2025mineru25decoupledvisionlanguagemodel,
      title={MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing},
      author={Junbo Niu and Zheng Liu and Zhuangcheng Gu and Bin Wang and Linke Ouyang and Zhiyuan Zhao and Tao Chu and Tianyao He and Fan Wu and Qintong Zhang and Zhenjiang Jin and Guang Liang and Rui Zhang and Wenzheng Zhang and Yuan Qu and Zhifei Ren and Yuefeng Sun and Yuanhong Zheng and Dongsheng Ma and Zirui Tang and Boyu Niu and Ziyang Miao and Hejun Dong and Siyi Qian and Junyuan Zhang and Jingzhou Chen and Fangdong Wang and Xiaomeng Zhao and Liqun Wei and Wei Li and Shasha Wang and Ruiliang Xu and Yuanyuan Cao and Lu Chen and Qianqian Wu and Huaiyu Gu and Lindong Lu and Keming Wang and Dechen Lin and Guanlin Shen and Xuanhe Zhou and Linfeng Zhang and Yuhang Zang and Xiaoyi Dong and Jiaqi Wang and Bo Zhang and Lei Bai and Pei Chu and Weijia Li and Jiang Wu and Lijun Wu and Zhenxiang Li and Guangyu Wang and Zhongying Tu and Chao Xu and Kai Chen and Yu Qiao and Bowen Zhou and Dahua Lin and Wentao Zhang and Conghui He},
      year={2025},
      eprint={2509.22186},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.22186},
}

@misc{wang2024mineruopensourcesolutionprecise,
      title={MinerU: An Open-Source Solution for Precise Document Content Extraction},
      author={Bin Wang and Chao Xu and Xiaomeng Zhao and Linke Ouyang and Fan Wu and Zhiyuan Zhao and Rui Xu and Kaiwen Liu and Yuan Qu and Fukai Shang and Bo Zhang and Liqun Wei and Zhihao Sui and Wei Li and Botian Shi and Yu Qiao and Dahua Lin and Conghui He},
@@ -630,4 +842,5 @@ mineru -p <input_path> -o <output_path>

- [PDF-Extract-Kit (A Comprehensive Toolkit for High-Quality PDF Content Extraction)](https://github.com/opendatalab/PDF-Extract-Kit)
- [OmniDocBench (A Comprehensive Benchmark for Document Parsing and Evaluation)](https://github.com/opendatalab/OmniDocBench)
- [Magic-HTML (Mixed web page extraction tool)](https://github.com/opendatalab/magic-html)
- [Magic-Doc (Fast speed ppt/pptx/doc/docx/pdf extraction tool)](https://github.com/InternLM/magic-doc)
- [Dingo: A Comprehensive AI Data Quality Evaluation Tool](https://github.com/MigoXLab/dingo)

demo/demo.py (168 changed lines)

@@ -15,7 +15,7 @@ from mineru.backend.pipeline.pipeline_analyze import doc_analyze as pipeline_doc
from mineru.backend.pipeline.pipeline_middle_json_mkcontent import union_make as pipeline_union_make
from mineru.backend.pipeline.model_json_to_middle_json import result_to_middle_json as pipeline_result_to_middle_json
from mineru.backend.vlm.vlm_middle_json_mkcontent import union_make as vlm_union_make
from mineru.utils.models_download_utils import auto_download_and_get_model_root_path
from mineru.utils.guess_suffix_or_lang import guess_suffix_by_path


def do_parse(
@@ -27,7 +27,7 @@ def do_parse(
    parse_method="auto", # The method for parsing PDF, default is 'auto'
    formula_enable=True, # Enable formula parsing
    table_enable=True, # Enable table parsing
    server_url=None, # Server URL for vlm-sglang-client backend
    server_url=None, # Server URL for vlm-http-client backend
    f_draw_layout_bbox=True, # Whether to draw layout bounding boxes
    f_draw_span_bbox=True, # Whether to draw span bounding boxes
    f_dump_md=True, # Whether to dump markdown files
@@ -62,47 +62,12 @@ def do_parse(
            pdf_info = middle_json["pdf_info"]

            pdf_bytes = pdf_bytes_list[idx]
            if f_draw_layout_bbox:
                draw_layout_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_layout.pdf")

            if f_draw_span_bbox:
                draw_span_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_span.pdf")

            if f_dump_orig_pdf:
                md_writer.write(
                    f"{pdf_file_name}_origin.pdf",
                    pdf_bytes,
                )

            if f_dump_md:
                image_dir = str(os.path.basename(local_image_dir))
                md_content_str = pipeline_union_make(pdf_info, f_make_md_mode, image_dir)
                md_writer.write_string(
                    f"{pdf_file_name}.md",
                    md_content_str,
                )

            if f_dump_content_list:
                image_dir = str(os.path.basename(local_image_dir))
                content_list = pipeline_union_make(pdf_info, MakeMode.CONTENT_LIST, image_dir)
                md_writer.write_string(
                    f"{pdf_file_name}_content_list.json",
                    json.dumps(content_list, ensure_ascii=False, indent=4),
                )

            if f_dump_middle_json:
                md_writer.write_string(
                    f"{pdf_file_name}_middle.json",
                    json.dumps(middle_json, ensure_ascii=False, indent=4),
                )

            if f_dump_model_output:
                md_writer.write_string(
                    f"{pdf_file_name}_model.json",
                    json.dumps(model_json, ensure_ascii=False, indent=4),
                )

            logger.info(f"local output dir is {local_md_dir}")
            _process_output(
                pdf_info, pdf_bytes, pdf_file_name, local_md_dir, local_image_dir,
                md_writer, f_draw_layout_bbox, f_draw_span_bbox, f_dump_orig_pdf,
                f_dump_md, f_dump_content_list, f_dump_middle_json, f_dump_model_output,
                f_make_md_mode, middle_json, model_json, is_pipeline=True
            )
    else:
        if backend.startswith("vlm-"):
            backend = backend[4:]
@@ -118,48 +83,77 @@ def do_parse(

            pdf_info = middle_json["pdf_info"]

            if f_draw_layout_bbox:
                draw_layout_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_layout.pdf")
            _process_output(
                pdf_info, pdf_bytes, pdf_file_name, local_md_dir, local_image_dir,
                md_writer, f_draw_layout_bbox, f_draw_span_bbox, f_dump_orig_pdf,
                f_dump_md, f_dump_content_list, f_dump_middle_json, f_dump_model_output,
                f_make_md_mode, middle_json, infer_result, is_pipeline=False
            )

            if f_draw_span_bbox:
                draw_span_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_span.pdf")

            if f_dump_orig_pdf:
                md_writer.write(
                    f"{pdf_file_name}_origin.pdf",
                    pdf_bytes,
                )


def _process_output(
    pdf_info,
    pdf_bytes,
    pdf_file_name,
    local_md_dir,
    local_image_dir,
    md_writer,
    f_draw_layout_bbox,
    f_draw_span_bbox,
    f_dump_orig_pdf,
    f_dump_md,
    f_dump_content_list,
    f_dump_middle_json,
    f_dump_model_output,
    f_make_md_mode,
    middle_json,
    model_output=None,
    is_pipeline=True
):
    """Process the output files."""
    if f_draw_layout_bbox:
        draw_layout_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_layout.pdf")

    if f_dump_md:
        image_dir = str(os.path.basename(local_image_dir))
        md_content_str = vlm_union_make(pdf_info, f_make_md_mode, image_dir)
        md_writer.write_string(
            f"{pdf_file_name}.md",
            md_content_str,
        )
    if f_draw_span_bbox:
        draw_span_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_span.pdf")

    if f_dump_content_list:
        image_dir = str(os.path.basename(local_image_dir))
        content_list = vlm_union_make(pdf_info, MakeMode.CONTENT_LIST, image_dir)
        md_writer.write_string(
            f"{pdf_file_name}_content_list.json",
            json.dumps(content_list, ensure_ascii=False, indent=4),
        )
    if f_dump_orig_pdf:
        md_writer.write(
            f"{pdf_file_name}_origin.pdf",
            pdf_bytes,
        )

    if f_dump_middle_json:
        md_writer.write_string(
            f"{pdf_file_name}_middle.json",
            json.dumps(middle_json, ensure_ascii=False, indent=4),
        )
    image_dir = str(os.path.basename(local_image_dir))

    if f_dump_model_output:
        model_output = ("\n" + "-" * 50 + "\n").join(infer_result)
        md_writer.write_string(
            f"{pdf_file_name}_model_output.txt",
            model_output,
        )
    if f_dump_md:
        make_func = pipeline_union_make if is_pipeline else vlm_union_make
        md_content_str = make_func(pdf_info, f_make_md_mode, image_dir)
        md_writer.write_string(
            f"{pdf_file_name}.md",
            md_content_str,
        )

    logger.info(f"local output dir is {local_md_dir}")
    if f_dump_content_list:
        make_func = pipeline_union_make if is_pipeline else vlm_union_make
        content_list = make_func(pdf_info, MakeMode.CONTENT_LIST, image_dir)
        md_writer.write_string(
            f"{pdf_file_name}_content_list.json",
            json.dumps(content_list, ensure_ascii=False, indent=4),
        )

    if f_dump_middle_json:
        md_writer.write_string(
            f"{pdf_file_name}_middle.json",
            json.dumps(middle_json, ensure_ascii=False, indent=4),
        )

    if f_dump_model_output:
        md_writer.write_string(
            f"{pdf_file_name}_model.json",
            json.dumps(model_output, ensure_ascii=False, indent=4),
        )

    logger.info(f"local output dir is {local_md_dir}")


def parse_doc(
@@ -182,8 +176,8 @@ def parse_doc(
        backend: the backend for parsing pdf:
            pipeline: More general.
            vlm-transformers: More general.
            vlm-sglang-engine: Faster(engine).
            vlm-sglang-client: Faster(client).
            vlm-vllm-engine: Faster(engine).
            vlm-http-client: Faster(client).
            without method specified, pipeline will be used by default.
        method: the method for parsing pdf:
            auto: Automatically determine the method based on the file type.
@@ -191,7 +185,7 @@ def parse_doc(
            ocr: Use OCR method for image-based PDFs.
            Without method specified, 'auto' will be used by default.
            Adapted only for the case where the backend is set to "pipeline".
        server_url: When the backend is `sglang-client`, you need to specify the server_url, for example:`http://127.0.0.1:30000`
        server_url: When the backend is `http-client`, you need to specify the server_url, for example:`http://127.0.0.1:30000`
        start_page_id: Start page ID for parsing, default is 0
        end_page_id: End page ID for parsing, default is None (parse all pages until the end of the document)
    """
@@ -225,12 +219,12 @@ if __name__ == '__main__':
    __dir__ = os.path.dirname(os.path.abspath(__file__))
    pdf_files_dir = os.path.join(__dir__, "pdfs")
    output_dir = os.path.join(__dir__, "output")
    pdf_suffixes = [".pdf"]
    image_suffixes = [".png", ".jpeg", ".jpg"]
    pdf_suffixes = ["pdf"]
    image_suffixes = ["png", "jpeg", "jp2", "webp", "gif", "bmp", "jpg"]

    doc_path_list = []
    for doc_path in Path(pdf_files_dir).glob('*'):
        if doc_path.suffix in pdf_suffixes + image_suffixes:
        if guess_suffix_by_path(doc_path) in pdf_suffixes + image_suffixes:
            doc_path_list.append(doc_path)

    """If you cannot download models due to network problems, set the environment variable MINERU_MODEL_SOURCE to modelscope to download models from a proxy-free repository"""
@@ -241,5 +235,7 @@ if __name__ == '__main__':

    """To enable VLM mode, change the backend to 'vlm-xxx'"""
    # parse_doc(doc_path_list, output_dir, backend="vlm-transformers")  # more general.
    # parse_doc(doc_path_list, output_dir, backend="vlm-sglang-engine")  # faster(engine).
    # parse_doc(doc_path_list, output_dir, backend="vlm-sglang-client", server_url="http://127.0.0.1:30000")  # faster(client).
    # parse_doc(doc_path_list, output_dir, backend="vlm-mlx-engine")  # faster than transformers in macOS 13.5+.
    # parse_doc(doc_path_list, output_dir, backend="vlm-vllm-engine")  # faster(vllm-engine).
    # parse_doc(doc_path_list, output_dir, backend="vlm-lmdeploy-engine")  # faster(lmdeploy-engine).
    # parse_doc(doc_path_list, output_dir, backend="vlm-http-client", server_url="http://127.0.0.1:30000")  # faster(client).
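
Putting the pieces together, a minimal uncommented call based on the demo above; the server address is a placeholder for an OpenAI-compatible server (for example, one started with the `openai-server` compose profile below):

```python
# Hedged usage sketch based on the demo above; assumes an OpenAI-compatible
# server is reachable at the placeholder URL.
import os
os.environ["MINERU_MODEL_SOURCE"] = "modelscope"  # optional proxy-free model source
parse_doc(doc_path_list, output_dir, backend="vlm-http-client",
          server_url="http://127.0.0.1:30000")
```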

@@ -1,7 +1,9 @@
# Use the official sglang image
FROM lmsysorg/sglang:v0.4.9.post5-cu126
# For blackwell GPU, use the following line instead:
# FROM lmsysorg/sglang:v0.4.9.post5-cu128-b200
# Use DaoCloud mirrored vllm image for China region for gpu with Ampere architecture and above (Compute Capability>=8.0)
# Compute Capability version query (https://developer.nvidia.com/cuda-gpus)
FROM docker.m.daocloud.io/vllm/vllm-openai:v0.10.1.1

# Use DaoCloud mirrored vllm image for China region for gpu with Turing architecture and below (Compute Capability<8.0)
# FROM docker.m.daocloud.io/vllm/vllm-openai:v0.10.2

# Install libgl for opencv support & Noto fonts for Chinese characters
RUN apt-get update && \

docker/china/maca.Dockerfile (new file, 34 lines)

@@ -0,0 +1,34 @@
# Base image providing the vLLM or LMDeploy inference environment; pick one of the two below as needed. Requires an amd64 (x86-64) CPU + Metax GPU.
# Base image containing the vLLM inference environment, requiring amd64(x86-64) CPU + metax GPU.
FROM cr.metax-tech.com/public-ai-release/maca/vllm:maca.ai3.1.0.7-torch2.6-py310-ubuntu22.04-amd64
# Base image containing the LMDeploy inference environment, requiring amd64(x86-64) CPU + metax GPU.
# FROM crpi-vofi3w62lkohhxsp.cn-shanghai.personal.cr.aliyuncs.com/opendatalab-mineru/maca:maca.ai3.1.0.7-torch2.6-py310-ubuntu22.04-lmdeploy0.10.2-amd64

# Install libgl for opencv support & Noto fonts for Chinese characters
RUN apt-get update && \
    apt-get install -y \
    fonts-noto-core \
    fonts-noto-cjk \
    fontconfig \
    libgl1 && \
    fc-cache -fv && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# mod torchvision to be compatible with torch 2.6
RUN sed -i '3s/^Version: 0.15.1+metax3\.1\.0\.4$/Version: 0.21.0+metax3.1.0.4/' /opt/conda/lib/python3.10/site-packages/torchvision-0.15.1+metax3.1.0.4.dist-info/METADATA && \
    mv /opt/conda/lib/python3.10/site-packages/torchvision-0.15.1+metax3.1.0.4.dist-info /opt/conda/lib/python3.10/site-packages/torchvision-0.21.0+metax3.1.0.4.dist-info

# Install mineru latest
RUN /opt/conda/bin/python3 -m pip install -U pip -i https://mirrors.aliyun.com/pypi/simple && \
    /opt/conda/bin/python3 -m pip install 'mineru[core]>=2.6.5' \
    numpy==1.26.4 \
    opencv-python==4.11.0.86 \
    -i https://mirrors.aliyun.com/pypi/simple && \
    /opt/conda/bin/python3 -m pip cache purge

# Download models and update the configuration file
RUN /bin/bash -c "/opt/conda/bin/mineru-models-download -s modelscope -m all"

# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]

docker/china/npu.Dockerfile (new file, 29 lines)

@@ -0,0 +1,29 @@
# Base image providing the vLLM or LMDeploy inference environment; pick one of the two below as needed. Requires an ARM (AArch64) CPU + Ascend NPU.
# Base image containing the vLLM inference environment, requiring ARM(AArch64) CPU + Ascend NPU.
FROM quay.io/ascend/vllm-ascend:v0.11.0rc1
# Base image containing the LMDeploy inference environment, requiring ARM(AArch64) CPU + Ascend NPU.
# FROM crpi-4crprmm5baj1v8iv.cn-hangzhou.personal.cr.aliyuncs.com/lmdeploy_dlinfer/ascend:mineru-a2


# Install libgl for opencv support & Noto fonts for Chinese characters
RUN apt-get update && \
    apt-get install -y \
    fonts-noto-core \
    fonts-noto-cjk \
    fontconfig \
    libgl1 \
    libglib2.0-0 && \
    fc-cache -fv && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install mineru latest
RUN python3 -m pip install -U pip -i https://mirrors.aliyun.com/pypi/simple && \
    python3 -m pip install -U 'mineru[core]>=2.6.5' -i https://mirrors.aliyun.com/pypi/simple && \
    python3 -m pip cache purge

# Download models and update the configuration file
RUN TORCH_DEVICE_BACKEND_AUTOLOAD=0 /bin/bash -c "mineru-models-download -s modelscope -m all"

# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]

docker/china/ppu.Dockerfile (new file, 30 lines)

@@ -0,0 +1,30 @@
# Base image providing the vLLM or LMDeploy inference environment; pick one of the two below as needed. Requires an amd64 (x86-64) CPU + T-Head PPU.
# Base image containing the vLLM inference environment, requiring amd64(x86-64) CPU + t-head PPU.
FROM crpi-vofi3w62lkohhxsp.cn-shanghai.personal.cr.aliyuncs.com/opendatalab-mineru/ppu:ppu-pytorch2.6.0-ubuntu24.04-cuda12.6-vllm0.8.5-py312
# Base image containing the LMDeploy inference environment, requiring amd64(x86-64) CPU + t-head PPU.
# FROM crpi-4crprmm5baj1v8iv.cn-hangzhou.personal.cr.aliyuncs.com/lmdeploy_dlinfer/ppu:mineru-ppu

# Install libgl for opencv support & Noto fonts for Chinese characters
RUN apt-get update && \
    apt-get install -y \
    fonts-noto-core \
    fonts-noto-cjk \
    fontconfig \
    libgl1 && \
    fc-cache -fv && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Install mineru latest
RUN python3 -m pip install -U pip -i https://mirrors.aliyun.com/pypi/simple && \
    python3 -m pip install 'mineru[core]>=2.6.5' \
    numpy==1.26.4 \
    opencv-python==4.11.0.86 \
    -i https://mirrors.aliyun.com/pypi/simple && \
    python3 -m pip cache purge

# Download models and update the configuration file
RUN /bin/bash -c "mineru-models-download -s modelscope -m all"

# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]

@@ -1,21 +1,38 @@
services:
  mineru-sglang-server:
    image: mineru-sglang:latest
    container_name: mineru-sglang-server
  mineru-openai-server:
    image: mineru:latest
    container_name: mineru-openai-server
    restart: always
    profiles: ["sglang-server"]
    profiles: ["openai-server"]
    ports:
      - 30000:30000
    environment:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-sglang-server
    entrypoint: mineru-openai-server
    command:
      # ==================== Engine Selection ====================
      # WARNING: Only ONE engine can be enabled at a time!
      # Choose 'vllm' OR 'lmdeploy' (uncomment one line below)
      --engine vllm
      # --engine lmdeploy

      # ==================== vLLM Engine Parameters ====================
      # Uncomment if using --engine vllm
      --host 0.0.0.0
      --port 30000
      # --enable-torch-compile # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp-size 2 # If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode
      # --tp-size 2 # If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode.
      # --mem-fraction-static 0.5 # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.
      # Multi-GPU configuration (increase throughput)
      # --data-parallel-size 2
      # Single GPU memory optimization (reduce if VRAM insufficient)
      # --gpu-memory-utilization 0.5 # Try 0.4 or lower if issues persist

      # ==================== LMDeploy Engine Parameters ====================
      # Uncomment if using --engine lmdeploy
      # --server-name 0.0.0.0
      # --server-port 30000
      # Multi-GPU configuration (increase throughput)
      # --dp 2
      # Single GPU memory optimization (reduce if VRAM insufficient)
      # --cache-max-entry-count 0.5 # Try 0.4 or lower if issues persist
    ulimits:
      memlock: -1
      stack: 67108864
@@ -27,11 +44,11 @@ services:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              device_ids: ["0"] # Modify for multiple GPUs: ["0", "1"]
              capabilities: [gpu]

  mineru-api:
    image: mineru-sglang:latest
    image: mineru:latest
    container_name: mineru-api
    restart: always
    profiles: ["api"]
@@ -41,13 +58,21 @@ services:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-api
    command:
      # ==================== Server Configuration ====================
      --host 0.0.0.0
      --port 8000
      # parameters for sglang-engine
      # --enable-torch-compile # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp-size 2 # If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode
      # --tp-size 2 # If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode.
      # --mem-fraction-static 0.5 # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.

      # ==================== vLLM Engine Parameters ====================
      # Multi-GPU configuration
      # --data-parallel-size 2
      # Single GPU memory optimization
      # --gpu-memory-utilization 0.5 # Try 0.4 or lower if VRAM insufficient

      # ==================== LMDeploy Engine Parameters ====================
      # Multi-GPU configuration
      # --dp 2
      # Single GPU memory optimization
      # --cache-max-entry-count 0.5 # Try 0.4 or lower if VRAM insufficient
    ulimits:
      memlock: -1
      stack: 67108864
@@ -57,11 +82,11 @@ services:
        reservations:
          devices:
            - driver: nvidia
              device_ids: [ "0" ]
              capabilities: [ gpu ]
              device_ids: ["0"] # Modify for multiple GPUs: ["0", "1"]
              capabilities: [gpu]

  mineru-gradio:
    image: mineru-sglang:latest
    image: mineru:latest
    container_name: mineru-gradio
    restart: always
    profiles: ["gradio"]
@@ -71,16 +96,30 @@ services:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-gradio
    command:
      # ==================== Gradio Server Configuration ====================
      --server-name 0.0.0.0
      --server-port 7860
      --enable-sglang-engine true # Enable the sglang engine for Gradio
      # --enable-api false # If you want to disable the API, set this to false
      # --max-convert-pages 20 # If you want to limit the number of pages for conversion, set this to a specific number
      # parameters for sglang-engine
      # --enable-torch-compile # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp-size 2 # If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode
      # --tp-size 2 # If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode.
      # --mem-fraction-static 0.5 # If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by this parameter, if VRAM issues persist, try lowering it further to `0.4` or below.

      # ==================== Gradio Feature Settings ====================
      # --enable-api false # Disable API endpoint
      # --max-convert-pages 20 # Limit conversion page count

      # ==================== Engine Selection ====================
      # WARNING: Only ONE engine can be enabled at a time!

      # Option 1: vLLM Engine (recommended for most users)
      --enable-vllm-engine true
      # Multi-GPU configuration
      # --data-parallel-size 2
      # Single GPU memory optimization
      # --gpu-memory-utilization 0.5 # Try 0.4 or lower if VRAM insufficient

      # Option 2: LMDeploy Engine
      # --enable-lmdeploy-engine true
      # Multi-GPU configuration
      # --dp 2
      # Single GPU memory optimization
      # --cache-max-entry-count 0.5 # Try 0.4 or lower if VRAM insufficient
    ulimits:
      memlock: -1
      stack: 67108864
@@ -90,5 +129,5 @@ services:
        reservations:
          devices:
            - driver: nvidia
              device_ids: [ "0" ]
              capabilities: [ gpu ]
              device_ids: ["0"] # Modify for multiple GPUs: ["0", "1"]
              capabilities: [gpu]
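
Once a service from this compose file is up (for example the `openai-server` profile on port 30000), a quick reachability check from Python may help before pointing a client at it. The `/v1/models` route is the standard OpenAI-compatible listing endpoint served by vLLM; treat it as an assumption if you switched engines:

```python
# Hedged reachability check for the server started above; adjust host/port
# to your own port mapping. /v1/models is the standard OpenAI-compatible route.
import requests

resp = requests.get("http://127.0.0.1:30000/v1/models", timeout=5)
print(resp.status_code, resp.json())
```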

@@ -1,7 +1,9 @@
# Use the official sglang image
FROM lmsysorg/sglang:v0.4.9.post5-cu126
# For blackwell GPU, use the following line instead:
# FROM lmsysorg/sglang:v0.4.9.post5-cu128-b200
# Use the official vllm image for gpu with Ampere architecture and above (Compute Capability>=8.0)
# Compute Capability version query (https://developer.nvidia.com/cuda-gpus)
FROM vllm/vllm-openai:v0.10.1.1

# Use the official vllm image for gpu with Turing architecture and below (Compute Capability<8.0)
# FROM vllm/vllm-openai:v0.10.2

# Install libgl for opencv support & Noto fonts for Chinese characters
RUN apt-get update && \

BIN docs/assets/images/ (79 new image files)
New binary image files added under docs/assets/images/, 14 KiB to 286 KiB each: BISHENG_01; Cherry_Studio_1 to 8; coze_0 and Coze_1 to 21; DataFLow_01 and DataFlow_02; Dify_1 to 26; DingTalk_01; FastGPT_01 and 02; ModelWhale_01, ModelWhale_02, and ModelWhale_1; RagFlow_01 and 02; Sider_1; n8n_0 to 10.
@@ -2,7 +2,7 @@

If your question is not listed, try using [DeepWiki](https://deepwiki.com/opendatalab/MinerU)'s AI assistant for common issues.

For unresolved problems, join our [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) community for support.
For unresolved problems, join our [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](https://mineru.net/community-portal/?aliasId=3c430f94) community for support.

??? question "Encountered the error `ImportError: libGL.so.1: cannot open shared object file: No such file or directory` in Ubuntu 22.04 on WSL2"

@@ -15,18 +15,6 @@ For unresolved problems, join our [Discord](https://discord.gg/Tdedn9GTXq) or [W
    Reference: [#388](https://github.com/opendatalab/MinerU/issues/388)


??? question "Error when installing MinerU on CentOS 7 or Ubuntu 18: `ERROR: Failed building wheel for simsimd`"

    The new version of albumentations (1.4.21) introduces a dependency on simsimd. Since the pre-built package of simsimd for Linux requires a glibc version greater than or equal to 2.28, this causes installation issues on some Linux distributions released before 2019. You can resolve this issue by using the following command:
    ```
    conda create -n mineru python=3.11 -y
    conda activate mineru
    pip install -U "mineru[pipeline_old_linux]"
    ```

    Reference: [#1004](https://github.com/opendatalab/MinerU/issues/1004)


??? question "Missing text information in parsing results when installing and using on Linux systems."

    MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images.

@@ -18,8 +18,9 @@
[](https://mineru.net/OpenSourceTools/Extractor?source=github)
[](https://huggingface.co/spaces/opendatalab/MinerU)
[](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
[](https://colab.research.google.com/gist/myhloli/3b3a00a4a0a61577b6c30f989092d20d/mineru_demo.ipynb)
[](https://arxiv.org/abs/2409.18839)
[](https://colab.research.google.com/gist/myhloli/a3cb16570ab3cfeadf9d8f0ac91b4fca/mineru_demo.ipynb)
[](https://arxiv.org/abs/2409.18839)
[](https://arxiv.org/abs/2509.22186)
[](https://deepwiki.com/opendatalab/MinerU)

<div align="center">
@@ -34,7 +35,7 @@
<!-- join us -->

<p align="center">
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="http://mineru.space/s/V85Yl" target="_blank">WeChat</a>
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="https://mineru.net/community-portal/?aliasId=3c430f94" target="_blank">WeChat</a>
</p>
</div>

@@ -56,7 +57,7 @@ Compared to well-known commercial products domestically and internationally, Min
- Automatically identify and convert formulas in documents to LaTeX format
- Automatically identify and convert tables in documents to HTML format
- Automatically detect scanned PDFs and garbled PDFs, and enable OCR functionality
- OCR supports detection and recognition of 84 languages
- OCR supports detection and recognition of 109 languages
- Support multiple output formats, such as multimodal and NLP Markdown, reading-order-sorted JSON, and information-rich intermediate formats
- Support multiple visualization results, including layout visualization, span visualization, etc., for efficient confirmation of output effects and quality inspection
- Support pure CPU environment operation, and support GPU(CUDA)/NPU(CANN)/MPS acceleration

@@ -6,25 +6,23 @@ MinerU provides a convenient Docker deployment method, which helps quickly set u

```bash
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile
docker build -t mineru-sglang:latest -f Dockerfile .
docker build -t mineru:latest -f Dockerfile .
```

> [!TIP]
> The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/global/Dockerfile) uses `lmsysorg/sglang:v0.4.9.post5-cu126` as the base image by default, supporting Turing/Ampere/Ada Lovelace/Hopper platforms.
> If you are using the newer `Blackwell` platform, please modify the base image to `lmsysorg/sglang:v0.4.9.post5-cu128-b200` before executing the build operation.
> The [Dockerfile](https://github.com/opendatalab/MinerU/blob/master/docker/global/Dockerfile) uses `vllm/vllm-openai:v0.10.1.1` as the base image by default. This version of the vLLM v1 engine has limited support for older GPU models.
> If you cannot use vLLM-accelerated inference on Turing and earlier architecture GPUs, you can resolve this issue by changing the base image to `vllm/vllm-openai:v0.10.2`.

## Docker Description

MinerU's Docker uses `lmsysorg/sglang` as the base image, so it includes the `sglang` inference acceleration framework and necessary dependencies by default. Therefore, on compatible devices, you can directly use `sglang` to accelerate VLM model inference.
MinerU's Docker uses `vllm/vllm-openai` as the base image, so it includes the `vllm` inference acceleration framework and necessary dependencies by default. Therefore, on compatible devices, you can directly use `vllm` to accelerate VLM model inference.

> [!NOTE]
> Requirements for using `sglang` to accelerate VLM model inference:
> Requirements for using `vllm` to accelerate VLM model inference:
>
> - Device must have Turing architecture or later graphics cards with 8GB+ available VRAM.
> - The host machine's graphics driver should support CUDA 12.6 or higher; `Blackwell` platform should support CUDA 12.8 or higher. You can check the driver version using the `nvidia-smi` command.
> - Device must have Volta architecture or later graphics cards with 8GB+ available VRAM.
> - The host machine's graphics driver should support CUDA 12.8 or higher; you can check the driver version using the `nvidia-smi` command.
> - Docker container must have access to the host machine's graphics devices.
>
> If your device doesn't meet the above requirements, you can still use other features of MinerU, but cannot use `sglang` to accelerate VLM model inference, meaning you cannot use the `vlm-sglang-engine` backend or start the `vlm-sglang-server` service.

## Start Docker Container

@@ -33,12 +31,12 @@ docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 -p 7860:7860 -p 8000:8000 \
    --ipc=host \
    -it mineru-sglang:latest \
    -it mineru:latest \
    /bin/bash
```

After executing this command, you will enter the Docker container's interactive terminal with some ports mapped for potential services. You can directly run MinerU-related commands within the container to use MinerU's features.
You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [Start the service via command](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-sglang-clientserver).
You can also directly start MinerU services by replacing `/bin/bash` with service startup commands. For detailed instructions, please refer to the [Start the service via command](https://opendatalab.github.io/MinerU/usage/quick_usage/#advanced-usage-via-api-webui-http-clientserver).

## Start Services Directly with Docker Compose

@@ -53,19 +51,19 @@ wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
>
>- The `compose.yaml` file contains configurations for multiple services of MinerU, you can choose to start specific services as needed.
>- Different services might have additional parameter configurations, which you can view and edit in the `compose.yaml` file.
>- Due to the pre-allocation of GPU memory by the `sglang` inference acceleration framework, you may not be able to run multiple `sglang` services simultaneously on the same machine. Therefore, ensure that other services that might use GPU memory have been stopped before starting the `vlm-sglang-server` service or using the `vlm-sglang-engine` backend.
>- Due to the pre-allocation of GPU memory by the `vllm` inference acceleration framework, you may not be able to run multiple `vllm` services simultaneously on the same machine. Therefore, ensure that other services that might use GPU memory have been stopped before starting the `vlm-openai-server` service or using the `vlm-vllm-engine` backend.

---

### Start sglang-server service
connect to `sglang-server` via `vlm-sglang-client` backend
### Start OpenAI-compatible server service
Connect to `openai-server` via the `vlm-http-client` backend:
```bash
docker compose -f compose.yaml --profile sglang-server up -d
docker compose -f compose.yaml --profile openai-server up -d
```
>[!TIP]
>In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
>In another terminal, connect to openai server via http client (only requires CPU and network, no vllm environment needed)
> ```bash
> mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://<server_ip>:30000
> mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://<server_ip>:30000
> ```

---
@@ -86,4 +84,3 @@ connect to `sglang-server` via `vlm-sglang-client` backend
>[!TIP]
>
>- Access `http://<server_ip>:7860` in your browser to use the Gradio WebUI.
>- Access `http://<server_ip>:7860/?view=api` to use the Gradio API.
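
For programmatic access, here is a minimal sketch using the `gradio_client` package. The endpoint names exposed by the WebUI are not documented here, so the sketch only discovers them rather than assuming any:

```python
# Hedged sketch: discover the Gradio API endpoints exposed by the MinerU WebUI.
# Replace <server_ip> with your host; http://<server_ip>:7860/?view=api shows
# the same information in the browser.
from gradio_client import Client

client = Client("http://<server_ip>:7860")
client.view_api()  # prints the callable endpoints and their parameters
```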

@@ -4,34 +4,43 @@ MinerU supports installing extension modules on demand based on different needs

## Common Scenarios

### Core Functionality Installation
The `core` module is the core dependency of MinerU, containing all functional modules except `sglang`. Installing this module ensures the basic functionality of MinerU works properly.
The `core` module is the core dependency of MinerU, containing all functional modules except `vllm`/`lmdeploy`. Installing this module ensures the basic functionality of MinerU works properly.
```bash
uv pip install mineru[core]
uv pip install "mineru[core]"
```

---

### Using `sglang` to Accelerate VLM Model Inference
The `sglang` module provides acceleration support for VLM model inference, suitable for graphics cards with Turing architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.
In the configuration, `all` includes both `core` and `sglang` modules, so `mineru[all]` and `mineru[core,sglang]` are equivalent.
### Using `vllm` to Accelerate VLM Model Inference
> [!NOTE]
> `vllm` and `lmdeploy` have nearly identical VLM inference acceleration effects and usage methods. You can choose one of them to install and use based on your actual needs, but it is not recommended to install both modules simultaneously to avoid potential dependency conflicts.

The `vllm` module provides acceleration support for VLM model inference, suitable for graphics cards with Volta architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.

```bash
uv pip install mineru[all]
uv pip install "mineru[core,vllm]"
```
> [!TIP]
> If exceptions occur during installation of the complete package including sglang, please refer to the [sglang official documentation](https://docs.sglang.ai/start/install.html) to try to resolve the issue, or directly use the [Docker](./docker_deployment.md) deployment method.
> If exceptions occur during installation of the extra package including vllm, please refer to the [vllm official documentation](https://docs.vllm.ai/en/latest/getting_started/installation/index.html) to try to resolve the issue, or directly use the [Docker](./docker_deployment.md) deployment method.

---

### Installing Lightweight Client to Connect to sglang-server
If you need to install a lightweight client on edge devices to connect to `sglang-server`, you can install the basic mineru package, which is very lightweight and suitable for devices with only CPU and network connectivity.
### Using `lmdeploy` to Accelerate VLM Model Inference
> [!NOTE]
> `vllm` and `lmdeploy` have nearly identical VLM inference acceleration effects and usage methods. You can choose one of them to install and use based on your actual needs, but it is not recommended to install both modules simultaneously to avoid potential dependency conflicts.

The `lmdeploy` module provides acceleration support for VLM model inference, suitable for graphics cards with Volta architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.

```bash
uv pip install "mineru[core,lmdeploy]"
```
> [!TIP]
> If exceptions occur during installation of the extra package including lmdeploy, please refer to the [lmdeploy official documentation](https://lmdeploy.readthedocs.io/en/latest/get_started/installation.html) to try to resolve the issue.

---

### Installing Lightweight Client to Connect to OpenAI-compatible servers
If you need to install a lightweight client on edge devices to connect to an OpenAI-compatible server for using VLM mode, you can install the basic mineru package, which is very lightweight and suitable for devices with only CPU and network connectivity.
```bash
uv pip install mineru
```

---

### Using Pipeline Backend on Outdated Linux Systems
If your system is too outdated to meet the dependency requirements of `mineru[core]`, this option can minimally meet MinerU's runtime requirements, suitable for old systems that cannot be upgraded and only need to use the pipeline backend.
```bash
uv pip install mineru[pipeline_old_linux]
```

@@ -27,41 +27,75 @@ A WebUI developed based on Gradio, with a simple interface and only core parsing
|
||||
> In non-mainstream environments, due to the diversity of hardware and software configurations, as well as compatibility issues with third-party dependencies, we cannot guarantee 100% usability of the project. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first, as most issues have corresponding solutions in the FAQ. Additionally, we encourage community feedback on issues so that we can gradually expand our support range.
|
||||
|
||||
<table border="1">
|
||||
<tr>
|
||||
<td>Parsing Backend</td>
|
||||
<td>pipeline</td>
|
||||
<td>vlm-transformers</td>
|
||||
<td>vlm-sglang</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Operating System</td>
|
||||
<td>Linux / Windows / macOS</td>
|
||||
<td>Linux / Windows</td>
|
||||
<td>Linux / Windows (via WSL2)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>CPU Inference Support</td>
|
||||
<td>✅</td>
|
||||
<td colspan="2">❌</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>GPU Requirements</td>
|
||||
<td>Turing architecture and later, 6GB+ VRAM or Apple Silicon</td>
|
||||
<td colspan="2">Turing architecture and later, 8GB+ VRAM</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Memory Requirements</td>
|
||||
<td colspan="3">Minimum 16GB+, recommended 32GB+</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Disk Space Requirements</td>
|
||||
<td colspan="3">20GB+, SSD recommended</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Python Version</td>
|
||||
<td colspan="3">3.10-3.13</td>
|
||||
</tr>
|
||||
<thead>
|
||||
<tr>
|
||||
<th rowspan="2">Parsing Backend</th>
|
||||
<th rowspan="2">pipeline <br> (Accuracy<sup>1</sup> 82+)</th>
|
||||
<th colspan="5">vlm (Accuracy<sup>1</sup> 90+)</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>transformers</th>
|
||||
<th>mlx-engine</th>
|
||||
<th>vllm-engine / <br>vllm-async-engine</th>
|
||||
<th>lmdeploy-engine</th>
|
||||
<th>http-client</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<th>Backend Features</th>
|
||||
<td>Fast, no hallucinations</td>
|
||||
<td>Good compatibility, <br>but slower</td>
|
||||
<td>Faster than transformers</td>
|
||||
<td>Fast, compatible with the vLLM ecosystem</td>
|
||||
<td>Fast, compatible with the LMDeploy ecosystem</td>
|
||||
<td>Suitable for OpenAI-compatible servers<sup>6</sup></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Operating System</th>
|
||||
<td colspan="2" style="text-align:center;">Linux<sup>2</sup> / Windows / macOS</td>
|
||||
<td style="text-align:center;">macOS<sup>3</sup></td>
|
||||
<td style="text-align:center;">Linux<sup>2</sup> / Windows<sup>4</sup> </td>
|
||||
<td style="text-align:center;">Linux<sup>2</sup> / Windows<sup>5</sup> </td>
|
||||
<td>Any</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>CPU inference support</th>
|
||||
<td colspan="2" style="text-align:center;">✅</td>
|
||||
<td colspan="3" style="text-align:center;">❌</td>
|
||||
<td>Not required</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>GPU Requirements</th>
<td colspan="2" style="text-align:center;">Volta or later architectures, 6 GB VRAM or more, or Apple Silicon</td>
|
||||
<td>Apple Silicon</td>
|
||||
<td colspan="2" style="text-align:center;">Volta or later architectures, 8 GB VRAM or more</td>
|
||||
<td>Not required</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Memory Requirements</th>
|
||||
<td colspan="5" style="text-align:center;">Minimum 16 GB, 32 GB recommended</td>
|
||||
<td>8 GB</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Disk Space Requirements</th>
|
||||
<td colspan="5" style="text-align:center;">20 GB or more, SSD recommended</td>
|
||||
<td>2 GB</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<th>Python Version</th>
|
||||
<td colspan="6" style="text-align:center;">3.10-3.13<sup>7</sup></td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
|
||||
<sup>1</sup> Accuracy metric is the End-to-End Evaluation Overall score of OmniDocBench (v1.5), tested on the latest `MinerU` version.
|
||||
<sup>2</sup> Only Linux distributions released in 2019 or later are supported.
|
||||
<sup>3</sup> MLX requires macOS 13.5 or later; version 14.0 or higher is recommended.
|
||||
<sup>4</sup> On Windows, vLLM is supported via WSL2 (Windows Subsystem for Linux).
|
||||
<sup>5</sup> On Windows, LMDeploy can only use the `turbomind` backend, which is slightly slower than the `pytorch` backend; if performance is critical, it is recommended to run it via WSL2.
|
||||
<sup>6</sup> Servers compatible with the OpenAI API, such as local or remote model services deployed via inference frameworks like `vLLM`, `SGLang`, or `LMDeploy`.
|
||||
<sup>7</sup> Windows + LMDeploy only supports Python versions 3.10–3.12, as the critical dependency `ray` does not yet support Python 3.13 on Windows.
|
||||
|
||||
|
||||
### Install MinerU
|
||||
|
||||
@@ -80,8 +114,8 @@ uv pip install -e .[core]
|
||||
```
|
||||
|
||||
> [!TIP]
|
||||
> `mineru[core]` includes all core features except `sglang` acceleration, compatible with Windows / Linux / macOS systems, suitable for most users.
|
||||
> If you need to use `sglang` acceleration for VLM model inference or install a lightweight client on edge devices, please refer to the documentation [Extension Modules Installation Guide](./extension_modules.md).
|
||||
> `mineru[core]` includes all core features except `vLLM`/`LMDeploy` acceleration, compatible with Windows / Linux / macOS systems, suitable for most users.
|
||||
> If you need to use `vLLM`/`LMDeploy` acceleration for VLM model inference or install a lightweight client on edge devices, please refer to the documentation [Extension Modules Installation Guide](./extension_modules.md).
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -51,14 +51,16 @@ The following sections provide detailed descriptions of each file's purpose and
|
||||
|
||||
## Structured Data Files
|
||||
|
||||
### Model Inference Results (model.json)
|
||||
> [!IMPORTANT]
|
||||
> The VLM backend output changed significantly in version 2.5 and is not backward-compatible with the pipeline backend format. If you plan to do secondary development based on the structured outputs, please read this document carefully.
|
||||
|
||||
> [!NOTE]
|
||||
> Only applicable to pipeline backend
|
||||
### Pipeline Backend Output Results
|
||||
|
||||
#### Model Inference Results (model.json)
|
||||
|
||||
**File naming format**: `{original_filename}_model.json`
|
||||
|
||||
#### Data Structure Definition
|
||||
##### Data Structure Definition
|
||||
|
||||
```python
|
||||
from pydantic import BaseModel, Field
|
||||
@@ -103,7 +105,7 @@ class PageInferenceResults(BaseModel):
|
||||
inference_result: list[PageInferenceResults] = []
|
||||
```
|
||||
|
||||
#### Coordinate System Description
|
||||
##### Coordinate System Description
|
||||
|
||||
`poly` coordinate format: `[x0, y0, x1, y1, x2, y2, x3, y3]`
|
||||
|
||||
@@ -112,7 +114,7 @@ inference_result: list[PageInferenceResults] = []
|
||||
|
||||

|
||||
|
||||
#### Sample Data
|
||||
##### Sample Data
|
||||
|
||||
```json
|
||||
[
|
||||
@@ -165,52 +167,11 @@ inference_result: list[PageInferenceResults] = []
|
||||
]
|
||||
```
|
||||
|
||||
### VLM Output Results (model_output.txt)
|
||||
|
||||
> [!NOTE]
|
||||
> Only applicable to VLM backend
|
||||
|
||||
**File naming format**: `{original_filename}_model_output.txt`
|
||||
|
||||
#### File Format Description
|
||||
|
||||
- Uses `----` to separate output results for each page
|
||||
- Each page contains multiple text blocks starting with `<|box_start|>` and ending with `<|md_end|>`
|
||||
|
||||
#### Field Meanings
|
||||
|
||||
| Tag | Format | Description |
|
||||
|-----|--------|-------------|
|
||||
| Bounding box | `<\|box_start\|>x0 y0 x1 y1<\|box_end\|>` | Quadrilateral coordinates (top-left, bottom-right points), coordinate values after scaling page to 1000×1000 |
|
||||
| Type tag | `<\|ref_start\|>type<\|ref_end\|>` | Content block type identifier |
|
||||
| Content | `<\|md_start\|>markdown content<\|md_end\|>` | Markdown content of the block |
|
||||
|
||||
#### Supported Content Types
|
||||
|
||||
```json
|
||||
{
|
||||
"text": "Text",
|
||||
"title": "Title",
|
||||
"image": "Image",
|
||||
"image_caption": "Image caption",
|
||||
"image_footnote": "Image footnote",
|
||||
"table": "Table",
|
||||
"table_caption": "Table caption",
|
||||
"table_footnote": "Table footnote",
|
||||
"equation": "Interline formula"
|
||||
}
|
||||
```
|
||||
|
||||
#### Special Tags
|
||||
|
||||
- `<|txt_contd|>`: Appears at the end of text, indicating that this text block can be connected with subsequent text blocks
|
||||
- Table content uses `otsl` format and needs to be converted to HTML for rendering in Markdown
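As a rough parsing sketch based only on the tag layout described above (not an official MinerU helper; whitespace handling between tags is an assumption), the per-page blocks can be pulled out with a regular expression:

```python
import re

# one block: bounding box, then type tag, then markdown content, in the order documented above
BLOCK_RE = re.compile(
    r"<\|box_start\|>(.*?)<\|box_end\|>\s*"
    r"<\|ref_start\|>(.*?)<\|ref_end\|>\s*"
    r"<\|md_start\|>(.*?)<\|md_end\|>",
    re.S,
)

def parse_model_output(text: str) -> list[list[dict]]:
    """Split a *_model_output.txt dump into pages and parse each block into a dict."""
    pages = []
    for page in text.split("----"):
        blocks = []
        for box, ref, md in BLOCK_RE.findall(page):
            blocks.append({
                "bbox": [float(v) for v in box.split()],  # coordinates on the 1000x1000 scaled page
                "type": ref.strip(),
                "markdown": md,
            })
        if blocks:
            pages.append(blocks)
    return pages
```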
|
||||
|
||||
### Intermediate Processing Results (middle.json)
|
||||
#### Intermediate Processing Results (middle.json)
|
||||
|
||||
**File naming format**: `{original_filename}_middle.json`
|
||||
|
||||
#### Top-level Structure
|
||||
##### Top-level Structure
|
||||
|
||||
| Field Name | Type | Description |
|
||||
|------------|------|-------------|
|
||||
@@ -218,22 +179,20 @@ inference_result: list[PageInferenceResults] = []
|
||||
| `_backend` | `string` | Parsing mode: `pipeline` or `vlm` |
|
||||
| `_version_name` | `string` | MinerU version number |
|
||||
|
||||
#### Page Information Structure (pdf_info)
|
||||
##### Page Information Structure (pdf_info)
|
||||
|
||||
| Field Name | Description |
|
||||
|------------|-------------|
|
||||
| `preproc_blocks` | Unsegmented intermediate results after PDF preprocessing |
|
||||
| `layout_bboxes` | Layout segmentation results, including layout direction and bounding boxes, sorted by reading order |
|
||||
| `page_idx` | Page number, starting from 0 |
|
||||
| `page_size` | Page width and height `[width, height]` |
|
||||
| `_layout_tree` | Layout tree structure |
|
||||
| `images` | Image block information list |
|
||||
| `tables` | Table block information list |
|
||||
| `interline_equations` | Interline formula block information list |
|
||||
| `discarded_blocks` | Block information to be discarded |
|
||||
| `para_blocks` | Content block results after segmentation |
|
||||
|
||||
#### Block Structure Hierarchy
|
||||
##### Block Structure Hierarchy
|
||||
|
||||
```
|
||||
Level 1 blocks (table | image)
|
||||
@@ -242,7 +201,7 @@ Level 1 blocks (table | image)
|
||||
└── Spans
|
||||
```
|
||||
|
||||
#### Level 1 Block Fields
|
||||
##### Level 1 Block Fields
|
||||
|
||||
| Field Name | Description |
|
||||
|------------|-------------|
|
||||
@@ -250,7 +209,7 @@ Level 1 blocks (table | image)
|
||||
| `bbox` | Rectangular box coordinates of the block `[x0, y0, x1, y1]` |
|
||||
| `blocks` | List of contained level 2 blocks |
|
||||
|
||||
#### Level 2 Block Fields
|
||||
##### Level 2 Block Fields
|
||||
|
||||
| Field Name | Description |
|
||||
|------------|-------------|
|
||||
@@ -258,7 +217,7 @@ Level 1 blocks (table | image)
|
||||
| `bbox` | Rectangular box coordinates of the block |
|
||||
| `lines` | List of contained line information |
|
||||
|
||||
#### Level 2 Block Types
|
||||
##### Level 2 Block Types
|
||||
|
||||
| Type | Description |
|
||||
|------|-------------|
|
||||
@@ -274,7 +233,7 @@ Level 1 blocks (table | image)
|
||||
| `list` | List block |
|
||||
| `interline_equation` | Interline formula block |
|
||||
|
||||
#### Line and Span Structure
|
||||
##### Line and Span Structure
|
||||
|
||||
**Line fields**:
|
||||
- `bbox`: Rectangular box coordinates of the line
|
||||
@@ -285,7 +244,7 @@ Level 1 blocks (table | image)
|
||||
- `type`: Span type (`image`, `table`, `text`, `inline_equation`, `interline_equation`)
|
||||
- `content` | `img_path`: Text content or image path
|
||||
|
||||
#### Sample Data
|
||||
##### Sample Data
|
||||
|
||||
```json
|
||||
{
|
||||
@@ -388,15 +347,15 @@ Level 1 blocks (table | image)
|
||||
}
|
||||
```
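As a minimal secondary-development sketch based only on the hierarchy described above (it assumes `pdf_info` is a list of per-page dicts, as the per-page fields suggest; the helper itself is not part of MinerU):

```python
import json

def page_text(middle_json_path: str, page_idx: int = 0) -> str:
    """Concatenate the text-like spans of one page by walking para_blocks -> (blocks) -> lines -> spans."""
    with open(middle_json_path, encoding="utf-8") as f:
        middle = json.load(f)

    parts = []
    for block in middle["pdf_info"][page_idx]["para_blocks"]:
        # level 1 blocks (table | image) nest their level 2 blocks under "blocks"
        for sub in block.get("blocks", [block]):
            for line in sub.get("lines", []):
                for span in line.get("spans", []):
                    if span.get("type") in ("text", "inline_equation", "interline_equation"):
                        parts.append(span.get("content", ""))
    return " ".join(parts)
```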
|
||||
|
||||
### Content List (content_list.json)
|
||||
#### Content List (content_list.json)
|
||||
|
||||
**File naming format**: `{original_filename}_content_list.json`
|
||||
|
||||
#### Functionality
|
||||
##### Functionality
|
||||
|
||||
This is a simplified version of `middle.json` that stores all readable content blocks in reading order as a flat structure, removing complex layout information for easier subsequent processing.
|
||||
|
||||
#### Content Types
|
||||
##### Content Types
|
||||
|
||||
| Type | Description |
|
||||
|------|-------------|
|
||||
@@ -405,7 +364,7 @@ This is a simplified version of `middle.json` that stores all readable content b
|
||||
| `text` | Text/Title |
|
||||
| `equation` | Interline formula |
|
||||
|
||||
#### Text Level Identification
|
||||
##### Text Level Identification
|
||||
|
||||
Text levels are distinguished through the `text_level` field:
|
||||
|
||||
@@ -414,49 +373,40 @@ Text levels are distinguished through the `text_level` field:
|
||||
- `text_level: 2`: Level 2 heading
|
||||
- And so on...
|
||||
|
||||
#### Common Fields
|
||||
##### Common Fields
|
||||
|
||||
All content blocks include a `page_idx` field indicating the page number (starting from 0).
|
||||
- All content blocks include a `page_idx` field indicating the page number (starting from 0).
|
||||
- All content blocks include a `bbox` field representing the bounding box coordinates of the content block `[x0, y0, x1, y1]`, mapped to a range of 0-1000.
|
||||
|
||||
#### Sample Data
|
||||
##### Sample Data
|
||||
|
||||
```json
|
||||
[
|
||||
{
|
||||
"type": "text",
|
||||
"text": "The response of flow duration curves to afforestation ",
|
||||
"text_level": 1,
|
||||
"text_level": 1,
|
||||
"bbox": [
|
||||
62,
|
||||
480,
|
||||
946,
|
||||
904
|
||||
],
|
||||
"page_idx": 0
|
||||
},
|
||||
{
|
||||
"type": "text",
|
||||
"text": "Received 1 October 2003; revised 22 December 2004; accepted 3 January 2005 ",
|
||||
"page_idx": 0
|
||||
},
|
||||
{
|
||||
"type": "text",
|
||||
"text": "Abstract ",
|
||||
"text_level": 2,
|
||||
"page_idx": 0
|
||||
},
|
||||
{
|
||||
"type": "text",
|
||||
"text": "The hydrologic effect of replacing pasture or other short crops with trees is reasonably well understood on a mean annual basis. The impact on flow regime, as described by the annual flow duration curve (FDC) is less certain. A method to assess the impact of plantation establishment on FDCs was developed. The starting point for the analyses was the assumption that rainfall and vegetation age are the principal drivers of evapotranspiration. A key objective was to remove the variability in the rainfall signal, leaving changes in streamflow solely attributable to the evapotranspiration of the plantation. A method was developed to (1) fit a model to the observed annual time series of FDC percentiles; i.e. 10th percentile for each year of record with annual rainfall and plantation age as parameters, (2) replace the annual rainfall variation with the long term mean to obtain climate adjusted FDCs, and (3) quantify changes in FDC percentiles as plantations age. Data from 10 catchments from Australia, South Africa and New Zealand were used. The model was able to represent flow variation for the majority of percentiles at eight of the 10 catchments, particularly for the 10–50th percentiles. The adjusted FDCs revealed variable patterns in flow reductions with two types of responses (groups) being identified. Group 1 catchments show a substantial increase in the number of zero flow days, with low flows being more affected than high flows. Group 2 catchments show a more uniform reduction in flows across all percentiles. The differences may be partly explained by storage characteristics. The modelled flow reductions were in accord with published results of paired catchment experiments. An additional analysis was performed to characterise the impact of afforestation on the number of zero flow days $( N _ { \\mathrm { z e r o } } )$ for the catchments in group 1. This model performed particularly well, and when adjusted for climate, indicated a significant increase in $N _ { \\mathrm { z e r o } }$ . The zero flow day method could be used to determine change in the occurrence of any given flow in response to afforestation. The methods used in this study proved satisfactory in removing the rainfall variability, and have added useful insight into the hydrologic impacts of plantation establishment. This approach provides a methodology for understanding catchment response to afforestation, where paired catchment data is not available. ",
|
||||
"page_idx": 0
|
||||
},
|
||||
{
|
||||
"type": "text",
|
||||
"text": "1. Introduction ",
|
||||
"text_level": 2,
|
||||
"page_idx": 1
|
||||
},
|
||||
{
|
||||
"type": "image",
|
||||
"img_path": "images/a8ecda1c69b27e4f79fce1589175a9d721cbdc1cf78b4cc06a015f3746f6b9d8.jpg",
|
||||
"img_caption": [
|
||||
"image_caption": [
|
||||
"Fig. 1. Annual flow duration curves of daily flows from Pine Creek, Australia, 1989–2000. "
|
||||
],
|
||||
"img_footnote": [],
|
||||
"image_footnote": [],
|
||||
"bbox": [
|
||||
62,
|
||||
480,
|
||||
946,
|
||||
904
|
||||
],
|
||||
"page_idx": 1
|
||||
},
|
||||
{
|
||||
@@ -464,6 +414,12 @@ All content blocks include a `page_idx` field indicating the page number (starti
|
||||
"img_path": "images/181ea56ef185060d04bf4e274685f3e072e922e7b839f093d482c29bf89b71e8.jpg",
|
||||
"text": "$$\nQ _ { \\% } = f ( P ) + g ( T )\n$$",
|
||||
"text_format": "latex",
|
||||
"bbox": [
|
||||
62,
|
||||
480,
|
||||
946,
|
||||
904
|
||||
],
|
||||
"page_idx": 2
|
||||
},
|
||||
{
|
||||
@@ -476,16 +432,281 @@ All content blocks include a `page_idx` field indicating the page number (starti
|
||||
"indicates that the rainfall term was significant at the $5 \\%$ level, $T$ indicates that the time term was significant at the $5 \\%$ level, \\* represents significance at the $10 \\%$ level, and na denotes too few data points for meaningful analysis. "
|
||||
],
|
||||
"table_body": "<html><body><table><tr><td rowspan=\"2\">Site</td><td colspan=\"10\">Percentile</td></tr><tr><td>10</td><td>20</td><td>30</td><td>40</td><td>50</td><td>60</td><td>70</td><td>80</td><td>90</td><td>100</td></tr><tr><td>Traralgon Ck</td><td>P</td><td>P,*</td><td>P</td><td>P</td><td>P,</td><td>P,</td><td>P,</td><td>P,</td><td>P</td><td>P</td></tr><tr><td>Redhill</td><td>P,T</td><td>P,T</td><td>,*</td><td>**</td><td>P.T</td><td>P,*</td><td>P*</td><td>P*</td><td>*</td><td>,*</td></tr><tr><td>Pine Ck</td><td></td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td><td>T</td><td>na</td><td>na</td></tr><tr><td>Stewarts Ck 5</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P.T</td><td>P.T</td><td>P,T</td><td>na</td><td>na</td><td>na</td></tr><tr><td>Glendhu 2</td><td>P</td><td>P,T</td><td>P,*</td><td>P,T</td><td>P.T</td><td>P,ns</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td></tr><tr><td>Cathedral Peak 2</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Cathedral Peak 3</td><td>P.T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Lambrechtsbos A</td><td>P,T</td><td>P</td><td>P</td><td>P,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>T</td></tr><tr><td>Lambrechtsbos B</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td></tr><tr><td>Biesievlei</td><td>P,T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>*,T</td><td>T</td><td>T</td><td>P,T</td><td>P,T</td></tr></table></body></html>",
|
||||
"bbox": [
|
||||
62,
|
||||
480,
|
||||
946,
|
||||
904
|
||||
],
|
||||
"page_idx": 5
|
||||
}
|
||||
]
|
||||
```
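Because the list is flat and already in reading order, turning it back into simple Markdown is mostly a matter of honoring `text_level`; a usage sketch restricted to the fields shown in the sample above:

```python
import json

def content_list_to_markdown(path: str) -> str:
    """Render text and equation entries of a *_content_list.json file as plain Markdown."""
    with open(path, encoding="utf-8") as f:
        items = json.load(f)

    lines = []
    for item in items:
        if item["type"] == "text":
            level = item.get("text_level", 0)
            prefix = "#" * level + " " if level else ""
            lines.append(prefix + item["text"].strip())
        elif item["type"] == "equation":
            lines.append(item["text"])  # already wrapped in $$ ... $$ delimiters
    return "\n\n".join(lines)
```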
|
||||
|
||||
### VLM Backend Output Results
|
||||
|
||||
#### Model Inference Results (model.json)
|
||||
|
||||
**File naming format**: `{original_filename}_model.json`
|
||||
|
||||
##### File format description
|
||||
- Two-level nested list: outer list = pages; inner list = content blocks of that page
|
||||
- Each block is a dict with at least: `type`, `bbox`, `angle`, `content` (some types add extra fields like `score`, `block_tags`, `content_tags`, `format`)
|
||||
- Designed for direct, raw model inspection
|
||||
|
||||
##### Supported content types (type field values)
|
||||
```json
|
||||
{
|
||||
"text": "Plain text",
|
||||
"title": "Title",
|
||||
"equation": "Display (interline) formula",
|
||||
"image": "Image",
|
||||
"image_caption": "Image caption",
|
||||
"image_footnote": "Image footnote",
|
||||
"table": "Table",
|
||||
"table_caption": "Table caption",
|
||||
"table_footnote": "Table footnote",
|
||||
"phonetic": "Phonetic annotation",
|
||||
"code": "Code block",
|
||||
"code_caption": "Code caption",
|
||||
"ref_text": "Reference / citation entry",
|
||||
"algorithm": "Algorithm block (treated as code subtype)",
|
||||
"list": "List container",
|
||||
"header": "Page header",
|
||||
"footer": "Page footer",
|
||||
"page_number": "Page number",
|
||||
"aside_text": "Side / margin note",
|
||||
"page_footnote": "Page footnote"
|
||||
}
|
||||
```
|
||||
|
||||
##### Coordinate system
|
||||
- `bbox` = `[x0, y0, x1, y1]` (top-left, bottom-right)
|
||||
- Origin at top-left of the page
|
||||
- All coordinates are normalized to the page size, i.e. fractions in the range `[0, 1]`
|
||||
|
||||
##### Sample data
|
||||
```json
|
||||
[
|
||||
[
|
||||
{
|
||||
"type": "header",
|
||||
"bbox": [0.077, 0.095, 0.18, 0.181],
|
||||
"angle": 0,
|
||||
"score": null,
|
||||
"block_tags": null,
|
||||
"content": "ELSEVIER",
|
||||
"format": null,
|
||||
"content_tags": null
|
||||
},
|
||||
{
|
||||
"type": "title",
|
||||
"bbox": [0.157, 0.228, 0.833, 0.253],
|
||||
"angle": 0,
|
||||
"score": null,
|
||||
"block_tags": null,
|
||||
"content": "The response of flow duration curves to afforestation",
|
||||
"format": null,
|
||||
"content_tags": null
|
||||
}
|
||||
]
|
||||
]
|
||||
```
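Since these `bbox` values are fractions of the page size, mapping a block back onto a rendered page only needs that page's width and height; a minimal sketch, assuming you already know the render dimensions:

```python
def denormalize_bbox(bbox: list[float], page_width: float, page_height: float) -> list[float]:
    """Scale a [0, 1]-normalized [x0, y0, x1, y1] box to absolute page coordinates."""
    x0, y0, x1, y1 = bbox
    return [x0 * page_width, y0 * page_height, x1 * page_width, y1 * page_height]

# e.g. the title block above on a 1000 x 1294 px page:
# denormalize_bbox([0.157, 0.228, 0.833, 0.253], 1000, 1294) ≈ [157, 295, 833, 327]
```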
|
||||
|
||||
#### Intermediate Processing Results (middle.json)
|
||||
|
||||
**File naming format**: `{original_filename}_middle.json`
|
||||
|
||||
Structure is broadly similar to the pipeline backend, but with these differences:
|
||||
|
||||
- `list` becomes a second-level block; a new `sub_type` field distinguishes list categories:
|
||||
* `text`: ordinary list
|
||||
* `ref_text`: reference / bibliography style list
|
||||
- New `code` block type with `sub_type` (a code block always has at least a `code_body` and may optionally have a `code_caption`):
|
||||
* `code`
|
||||
* `algorithm`
|
||||
- `discarded_blocks` may contain additional types:
|
||||
* `header`
|
||||
* `footer`
|
||||
* `page_number`
|
||||
* `aside_text`
|
||||
* `page_footnote`
|
||||
- All blocks include an `angle` field indicating rotation (one of `0, 90, 180, 270`).
|
||||
|
||||
##### Examples
|
||||
- Example: list block
|
||||
```json
|
||||
{
|
||||
"bbox": [174,155,818,333],
|
||||
"type": "list",
|
||||
"angle": 0,
|
||||
"index": 11,
|
||||
"blocks": [
|
||||
{
|
||||
"bbox": [174,157,311,175],
|
||||
"type": "text",
|
||||
"angle": 0,
|
||||
"lines": [
|
||||
{
|
||||
"bbox": [174,157,311,175],
|
||||
"spans": [
|
||||
{
|
||||
"bbox": [174,157,311,175],
|
||||
"type": "text",
|
||||
"content": "H.1 Introduction"
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"index": 3
|
||||
},
|
||||
{
|
||||
"bbox": [175,182,464,229],
|
||||
"type": "text",
|
||||
"angle": 0,
|
||||
"lines": [
|
||||
{
|
||||
"bbox": [175,182,464,229],
|
||||
"spans": [
|
||||
{
|
||||
"bbox": [175,182,464,229],
|
||||
"type": "text",
|
||||
"content": "H.2 Example: Divide by Zero without Exception Handling"
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"index": 4
|
||||
}
|
||||
],
|
||||
"sub_type": "text"
|
||||
}
|
||||
```
|
||||
|
||||
- Example: code block with optional caption:
|
||||
```json
|
||||
{
|
||||
"type": "code",
|
||||
"bbox": [114,780,885,1231],
|
||||
"blocks": [
|
||||
{
|
||||
"bbox": [114,780,885,1231],
|
||||
"lines": [
|
||||
{
|
||||
"bbox": [114,780,885,1231],
|
||||
"spans": [
|
||||
{
|
||||
"bbox": [114,780,885,1231],
|
||||
"type": "text",
|
||||
"content": "1 // Fig. H.1: DivideByZeroNoExceptionHandling.java \n2 // Integer division without exception handling. \n3 import java.util.Scanner; \n4 \n5 public class DivideByZeroNoExceptionHandling \n6 { \n7 // demonstrates throwing an exception when a divide-by-zero occurs \n8 public static int quotient( int numerator, int denominator ) \n9 { \n10 return numerator / denominator; // possible division by zero \n11 } // end method quotient \n12 \n13 public static void main(String[] args) \n14 { \n15 Scanner scanner = new Scanner(System.in); // scanner for input \n16 \n17 System.out.print(\"Please enter an integer numerator: \"); \n18 int numerator = scanner.nextInt(); \n19 System.out.print(\"Please enter an integer denominator: \"); \n20 int denominator = scanner.nextInt(); \n21"
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"index": 17,
|
||||
"angle": 0,
|
||||
"type": "code_body"
|
||||
},
|
||||
{
|
||||
"bbox": [867,160,1280,189],
|
||||
"lines": [
|
||||
{
|
||||
"bbox": [867,160,1280,189],
|
||||
"spans": [
|
||||
{
|
||||
"bbox": [867,160,1280,189],
|
||||
"type": "text",
|
||||
"content": "Algorithm 1 Modules for MCTSteg"
|
||||
}
|
||||
]
|
||||
}
|
||||
],
|
||||
"index": 19,
|
||||
"angle": 0,
|
||||
"type": "code_caption"
|
||||
}
|
||||
],
|
||||
"index": 17,
|
||||
"sub_type": "code"
|
||||
}
|
||||
```
|
||||
|
||||
#### Content List (content_list.json)
|
||||
|
||||
**File naming format**: `{original_filename}_content_list.json`
|
||||
|
||||
Based on the pipeline format, with these VLM-specific extensions:
|
||||
|
||||
- New `code` type with `sub_type` (`code` | `algorithm`):
|
||||
* Fields: `code_body` (string), optional `code_caption` (list of strings)
|
||||
- New `list` type with `sub_type` (`text` | `ref_text`):
|
||||
* Field: `list_items` (array of strings)
|
||||
- All `discarded_blocks` entries are also output (e.g., headers, footers, page numbers, margin notes, page footnotes).
|
||||
- Existing types (`image`, `table`, `text`, `equation`) remain unchanged.
|
||||
- `bbox` still uses the 0–1000 normalized coordinate mapping.
|
||||
|
||||
|
||||
##### Examples
|
||||
Example: code (algorithm) entry
|
||||
```json
|
||||
{
|
||||
"type": "code",
|
||||
"sub_type": "algorithm",
|
||||
"code_caption": ["Algorithm 1 Modules for MCTSteg"],
|
||||
"code_body": "1: function GETCOORDINATE(d) \n2: $x \\gets d / l$ , $y \\gets d$ mod $l$ \n3: return $(x, y)$ \n4: end function \n5: function BESTCHILD(v) \n6: $C \\gets$ child set of $v$ \n7: $v' \\gets \\arg \\max_{c \\in C} \\mathrm{UCTScore}(c)$ \n8: $v'.n \\gets v'.n + 1$ \n9: return $v'$ \n10: end function \n11: function BACK PROPAGATE(v) \n12: Calculate $R$ using Equation 11 \n13: while $v$ is not a root node do \n14: $v.r \\gets v.r + R$ , $v \\gets v.p$ \n15: end while \n16: end function \n17: function RANDOMSEARCH(v) \n18: while $v$ is not a leaf node do \n19: Randomly select an untried action $a \\in A(v)$ \n20: Create a new node $v'$ \n21: $(x, y) \\gets \\mathrm{GETCOORDINATE}(v'.d)$ \n22: $v'.p \\gets v$ , $v'.d \\gets v.d + 1$ , $v'.\\Gamma \\gets v.\\Gamma$ \n23: $v'.\\gamma_{x,y} \\gets a$ \n24: if $a = -1$ then \n25: $v.lc \\gets v'$ \n26: else if $a = 0$ then \n27: $v.mc \\gets v'$ \n28: else \n29: $v.rc \\gets v'$ \n30: end if \n31: $v \\gets v'$ \n32: end while \n33: return $v$ \n34: end function \n35: function SEARCH(v) \n36: while $v$ is fully expanded do \n37: $v \\gets$ BESTCHILD(v) \n38: end while \n39: if $v$ is not a leaf node then \n40: $v \\gets$ RANDOMSEARCH(v) \n41: end if \n42: return $v$ \n43: end function",
|
||||
"bbox": [510,87,881,740],
|
||||
"page_idx": 0
|
||||
}
|
||||
```
|
||||
|
||||
Example: list (text) entry
|
||||
```json
|
||||
{
|
||||
"type": "list",
|
||||
"sub_type": "text",
|
||||
"list_items": [
|
||||
"H.1 Introduction",
|
||||
"H.2 Example: Divide by Zero without Exception Handling",
|
||||
"H.3 Example: Divide by Zero with Exception Handling",
|
||||
"H.4 Summary"
|
||||
],
|
||||
"bbox": [174,155,818,333],
|
||||
"page_idx": 0
|
||||
}
|
||||
```
|
||||
|
||||
Example: discarded blocks output
|
||||
```json
|
||||
[
|
||||
{
|
||||
"type": "header",
|
||||
"text": "Journal of Hydrology 310 (2005) 253-265",
|
||||
"bbox": [363,164,623,177],
|
||||
"page_idx": 0
|
||||
},
|
||||
{
|
||||
"type": "page_footnote",
|
||||
"text": "* Corresponding author. Address: Forest Science Centre, Department of Sustainability and Environment, P.O. Box 137, Heidelberg, Vic. 3084, Australia. Tel.: +61 3 9450 8719; fax: +61 3 9450 8644.",
|
||||
"bbox": [71,815,915,841],
|
||||
"page_idx": 0
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
## Summary
|
||||
|
||||
The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:
|
||||
|
||||
- **Model outputs**: Use raw outputs (model.json, model_output.txt)
|
||||
- **Debugging and verification**: Use visualization files (layout.pdf, spans.pdf)
|
||||
- **Content extraction**: Use simplified files (*.md, content_list.json)
|
||||
- **Secondary development**: Use structured files (middle.json)
|
||||
- **Model outputs** (Use raw outputs):
|
||||
* model.json
|
||||
|
||||
- **Debugging and verification** (Use visualization files):
|
||||
* layout.pdf
|
||||
* spans.pdf
|
||||
|
||||
- **Content extraction** (Use simplified files):
|
||||
* *.md
|
||||
* content_list.json
|
||||
|
||||
- **Secondary development** (Use structured files):
|
||||
* middle.json
|
||||
|
||||
@@ -1,25 +1,18 @@
|
||||
# Advanced Command Line Parameters
|
||||
|
||||
## SGLang Acceleration Parameter Optimization
|
||||
## Pass-Through of Inference Engine Parameters
|
||||
|
||||
### Memory Optimization Parameters
|
||||
### vllm Acceleration Parameter Optimization
|
||||
> [!TIP]
|
||||
> SGLang acceleration mode currently supports running on Turing architecture graphics cards with a minimum of 8GB VRAM, but graphics cards with <24GB VRAM may encounter insufficient memory issues. You can optimize memory usage with the following parameters:
|
||||
> If you can already use vllm normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
|
||||
>
|
||||
> - If you encounter insufficient VRAM when using a single graphics card, you may need to reduce the KV cache size with `--mem-fraction-static 0.5`. If VRAM issues persist, try reducing it further to `0.4` or lower.
|
||||
> - If you have two or more graphics cards, you can try using tensor parallelism (TP) mode to simply expand available VRAM: `--tp-size 2`
|
||||
|
||||
### Performance Optimization Parameters
|
||||
> [!TIP]
|
||||
> If you can already use SGLang normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:
|
||||
>
|
||||
> - If you have multiple graphics cards, you can use SGLang's multi-card parallel mode to increase throughput: `--dp-size 2`
|
||||
> - You can also enable `torch.compile` to accelerate inference speed by approximately 15%: `--enable-torch-compile`
|
||||
> - If you have multiple graphics cards, you can use vllm's multi-card parallel mode to increase throughput: `--data-parallel-size 2`
|
||||
|
||||
### Parameter Passing Instructions
|
||||
> [!TIP]
|
||||
> - All officially supported SGLang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
|
||||
> - If you want to learn more about `sglang` parameter usage, please refer to the [SGLang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
|
||||
> - All officially supported vllm/lmdeploy parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-openai-server`, `mineru-gradio`, `mineru-api`
|
||||
> - If you want to learn more about `vllm` parameter usage, please refer to the [vllm official documentation](https://docs.vllm.ai/en/latest/cli/serve.html)
|
||||
> - If you want to learn more about `lmdeploy` parameter usage, please refer to the [lmdeploy official documentation](https://lmdeploy.readthedocs.io/en/latest/llm/api_server.html)
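>
> For example, engine flags simply ride along on the normal MinerU commands; a quick sketch using the `--data-parallel-size` flag mentioned above:
> ```bash
> mineru -p <input_path> -o <output_path> -b vlm-vllm-engine --data-parallel-size 2
> ```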
|
||||
|
||||
## GPU Device Selection and Configuration
|
||||
|
||||
@@ -29,7 +22,7 @@
|
||||
> ```bash
|
||||
> CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
|
||||
> ```
|
||||
> - This specification method is effective for all command line calls, including `mineru`, `mineru-sglang-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
|
||||
> - This specification method is effective for all command line calls, including `mineru`, `mineru-openai-server`, `mineru-gradio`, and `mineru-api`, and applies to both `pipeline` and `vlm` backends.
|
||||
|
||||
### Common Device Configuration Examples
|
||||
> [!TIP]
|
||||
@@ -46,14 +39,9 @@
|
||||
> [!TIP]
|
||||
> Here are some possible usage scenarios:
|
||||
>
|
||||
> - If you have multiple graphics cards and need to specify cards 0 and 1, using multi-card parallelism to start `sglang-server`, you can use the following command:
|
||||
> - If you have multiple graphics cards and need to specify cards 0 and 1, using multi-card parallelism to start `openai-server`, you can use the following command:
|
||||
> ```bash
|
||||
> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
|
||||
> ```
|
||||
>
|
||||
> - If you have multiple GPUs and need to specify GPU 0–3, and start the `sglang-server` using multi-GPU data parallelism and tensor parallelism, you can use the following command:
|
||||
> ```bash
|
||||
> CUDA_VISIBLE_DEVICES=0,1,2,3 mineru-sglang-server --port 30000 --dp-size 2 --tp-size 2
|
||||
> CUDA_VISIBLE_DEVICES=0,1 mineru-openai-server --engine vllm --port 30000 --data-parallel-size 2
|
||||
> ```
|
||||
>
|
||||
> - If you have multiple graphics cards and need to start two `fastapi` services on cards 0 and 1, listening on different ports respectively, you can use the following commands:
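> ```bash
> # a sketch only; it assumes mineru-api accepts --host/--port, which this section does not show
> CUDA_VISIBLE_DEVICES=0 mineru-api --host 0.0.0.0 --port 8000
> CUDA_VISIBLE_DEVICES=1 mineru-api --host 0.0.0.0 --port 8001
> ```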
|
||||
|
||||
@@ -11,16 +11,16 @@ Options:
|
||||
-p, --path PATH Input file path or directory (required)
|
||||
-o, --output PATH Output directory (required)
|
||||
-m, --method [auto|txt|ocr] Parsing method: auto (default), txt, ocr (pipeline backend only)
|
||||
-b, --backend [pipeline|vlm-transformers|vlm-sglang-engine|vlm-sglang-client]
|
||||
-b, --backend [pipeline|vlm-transformers|vlm-vllm-engine|vlm-lmdeploy-engine|vlm-http-client]
|
||||
Parsing backend (default: pipeline)
|
||||
-l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|latin|arabic|east_slavic|cyrillic|devanagari]
|
||||
-l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|th|el|latin|arabic|east_slavic|cyrillic|devanagari]
|
||||
Specify document language (improves OCR accuracy, pipeline backend only)
|
||||
-u, --url TEXT Service address when using sglang-client
|
||||
-u, --url TEXT Service address when using http-client
|
||||
-s, --start INTEGER Starting page number for parsing (0-based)
|
||||
-e, --end INTEGER Ending page number for parsing (0-based)
|
||||
-f, --formula BOOLEAN Enable formula parsing (default: enabled)
|
||||
-t, --table BOOLEAN Enable table parsing (default: enabled)
|
||||
-d, --device TEXT Inference device (e.g., cpu/cuda/cuda:0/npu/mps, pipeline backend only)
|
||||
-d, --device TEXT Inference device (e.g., cpu/cuda/cuda:0/npu/mps, pipeline and vlm-transformers backend only)
|
||||
--vram INTEGER Maximum GPU VRAM usage per process (GB) (pipeline backend only)
|
||||
--source [huggingface|modelscope|local]
|
||||
Model source, default: huggingface
|
||||
@@ -45,7 +45,7 @@ Options:
|
||||
files to be input need to be placed in the
|
||||
`example` folder within the directory where
|
||||
the command is currently executed.
|
||||
--enable-sglang-engine BOOLEAN Enable SgLang engine backend for faster
|
||||
--enable-vllm-engine BOOLEAN Enable vllm engine backend for faster
|
||||
processing.
|
||||
--enable-api BOOLEAN Enable gradio API for serving the
|
||||
application.
|
||||
@@ -65,9 +65,49 @@ Options:
|
||||
Some parameters of MinerU command line tools have equivalent environment variable configurations. Generally, environment variable configurations have higher priority than command line parameters and take effect across all command line tools.
|
||||
Here are the environment variables and their descriptions:
|
||||
|
||||
- `MINERU_DEVICE_MODE`: Used to specify inference device, supports device types like `cpu/cuda/cuda:0/npu/mps`, only effective for `pipeline` backend.
|
||||
- `MINERU_VIRTUAL_VRAM_SIZE`: Used to specify maximum GPU VRAM usage per process (GB), only effective for `pipeline` backend.
|
||||
- `MINERU_MODEL_SOURCE`: Used to specify model source, supports `huggingface/modelscope/local`, defaults to `huggingface`, can be switched to `modelscope` or local models through environment variables.
|
||||
- `MINERU_TOOLS_CONFIG_JSON`: Used to specify configuration file path, defaults to `mineru.json` in user directory, can specify other configuration file paths through environment variables.
|
||||
- `MINERU_FORMULA_ENABLE`: Used to enable formula parsing, defaults to `true`, can be set to `false` through environment variables to disable formula parsing.
|
||||
- `MINERU_TABLE_ENABLE`: Used to enable table parsing, defaults to `true`, can be set to `false` through environment variables to disable table parsing.
|
||||
- `MINERU_DEVICE_MODE`:
|
||||
* Used to specify inference device
|
||||
* supports device types like `cpu/cuda/cuda:0/npu/mps`
|
||||
* only effective for `pipeline` and `vlm-transformers` backends.
|
||||
|
||||
- `MINERU_VIRTUAL_VRAM_SIZE`:
|
||||
* Used to specify maximum GPU VRAM usage per process (GB)
|
||||
* only effective for `pipeline` backend.
|
||||
|
||||
- `MINERU_MODEL_SOURCE`:
|
||||
* Used to specify model source
|
||||
* supports `huggingface/modelscope/local`
|
||||
* defaults to `huggingface`, can be switched to `modelscope` or local models through environment variables.
|
||||
|
||||
- `MINERU_TOOLS_CONFIG_JSON`:
|
||||
* Used to specify configuration file path
|
||||
* defaults to `mineru.json` in user directory, can specify other configuration file paths through environment variables.
|
||||
|
||||
- `MINERU_FORMULA_ENABLE`:
|
||||
* Used to enable formula parsing
|
||||
* defaults to `true`, can be set to `false` through environment variables to disable formula parsing.
|
||||
|
||||
- `MINERU_FORMULA_CH_SUPPORT`:
|
||||
* Used to enable Chinese formula parsing optimization (experimental feature)
|
||||
* Default is `false`, can be set to `true` via environment variable to enable Chinese formula parsing optimization.
|
||||
* Only effective for `pipeline` backend.
|
||||
|
||||
- `MINERU_TABLE_ENABLE`:
|
||||
* Used to enable table parsing
|
||||
* Default is `true`, can be set to `false` via environment variable to disable table parsing.
|
||||
|
||||
- `MINERU_TABLE_MERGE_ENABLE`:
|
||||
* Used to enable table merging functionality
|
||||
* Default is `true`, can be set to `false` via environment variable to disable table merging functionality.
|
||||
|
||||
- `MINERU_PDF_RENDER_TIMEOUT`:
|
||||
* Used to set the timeout period (in seconds) for rendering PDF to images
|
||||
* Default is `300` seconds, can be set to other values via environment variable to adjust the image rendering timeout.
|
||||
|
||||
- `MINERU_INTRA_OP_NUM_THREADS`:
|
||||
* Used to set the intra_op thread count for ONNX models, affects the computation speed of individual operators
|
||||
* Default is `-1` (auto-select), can be set to other values via environment variable to adjust the thread count.
|
||||
|
||||
- `MINERU_INTER_OP_NUM_THREADS`:
|
||||
* Used to set the inter_op thread count for ONNX models, affects the parallel execution of multiple operators
|
||||
* Default is `-1` (auto-select), can be set to other values via environment variable to adjust the thread count.
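For example, several of the variables above can be combined for a single run (the values here are only a sketch; adjust them to your environment):

```bash
# pull models from ModelScope, disable table parsing, and cap each pipeline process at 8 GB of VRAM
export MINERU_MODEL_SOURCE=modelscope
export MINERU_TABLE_ENABLE=false
export MINERU_VIRTUAL_VRAM_SIZE=8
mineru -p <input_path> -o <output_path>
```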
|
||||
|
||||
@@ -29,11 +29,11 @@ mineru -p <input_path> -o <output_path>
|
||||
mineru -p <input_path> -o <output_path> -b vlm-transformers
|
||||
```
|
||||
> [!TIP]
|
||||
> The vlm backend additionally supports `sglang` acceleration. Compared to the `transformers` backend, `sglang` can achieve 20-30x speedup. You can check the installation method for the complete package supporting `sglang` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
|
||||
> The vlm backend additionally supports `vllm`/`lmdeploy` acceleration. Compared to the `transformers` backend, inference speed can be significantly improved. You can check the installation method for the complete package supporting `vllm`/`lmdeploy` acceleration in the [Extension Modules Installation Guide](../quick_start/extension_modules.md).
|
||||
|
||||
If you need to adjust parsing options through custom parameters, you can also check the more detailed [Command Line Tools Usage Instructions](./cli_tools.md) in the documentation.
|
||||
|
||||
## Advanced Usage via API, WebUI, sglang-client/server
|
||||
## Advanced Usage via API, WebUI, http-client/server
|
||||
|
||||
- Direct Python API calls: [Python Usage Example](https://github.com/opendatalab/MinerU/blob/master/demo/demo.py)
|
||||
- FastAPI calls:
|
||||
@@ -44,29 +44,35 @@ If you need to adjust parsing options through custom parameters, you can also ch
|
||||
>Access `http://127.0.0.1:8000/docs` in your browser to view the API documentation.
|
||||
- Start Gradio WebUI visual frontend:
|
||||
```bash
|
||||
# Using pipeline/vlm-transformers/vlm-sglang-client backends
|
||||
# Using pipeline/vlm-transformers/vlm-http-client backends
|
||||
mineru-gradio --server-name 0.0.0.0 --server-port 7860
|
||||
# Or using vlm-sglang-engine/pipeline backends (requires sglang environment)
|
||||
mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-sglang-engine true
|
||||
# Or using vlm-vllm-engine/pipeline backends (requires vllm environment)
|
||||
mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-vllm-engine true
|
||||
# Or using vlm-lmdeploy-engine/pipeline backends (requires lmdeploy environment)
|
||||
mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-lmdeploy-engine true
|
||||
```
|
||||
>[!TIP]
|
||||
>
|
||||
>- Access `http://127.0.0.1:7860` in your browser to use the Gradio WebUI.
|
||||
>- Access `http://127.0.0.1:7860/?view=api` to use the Gradio API.
|
||||
- Using `sglang-client/server` method:
|
||||
|
||||
- Using `http-client/server` method:
|
||||
```bash
|
||||
# Start sglang server (requires sglang environment)
|
||||
mineru-sglang-server --port 30000
|
||||
# Start openai compatible server (requires vllm or lmdeploy environment)
|
||||
mineru-openai-server
|
||||
# Or start vllm server (requires vllm environment)
|
||||
mineru-openai-server --engine vllm --port 30000
|
||||
# Or start lmdeploy server (requires lmdeploy environment)
|
||||
mineru-openai-server --engine lmdeploy --server-port 30000
|
||||
```
|
||||
>[!TIP]
|
||||
>In another terminal, connect to sglang server via sglang client (only requires CPU and network, no sglang environment needed)
|
||||
>In another terminal, connect to vllm server via http client (only requires CPU and network, no vllm environment needed)
|
||||
> ```bash
|
||||
> mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
|
||||
> mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:30000
|
||||
> ```
|
||||
|
||||
> [!NOTE]
|
||||
> All officially supported sglang parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`.
|
||||
> We have compiled some commonly used parameters and usage methods for `sglang`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
|
||||
> All officially supported `vllm/lmdeploy` parameters can be passed to MinerU through command line arguments, including the following commands: `mineru`, `mineru-openai-server`, `mineru-gradio`, `mineru-api`.
|
||||
> We have compiled some commonly used parameters and usage methods for `vllm/lmdeploy`, which can be found in the documentation [Advanced Command Line Parameters](./advanced_cli_parameters.md).
|
||||
|
||||
## Extending MinerU Functionality with Configuration Files
|
||||
|
||||
@@ -77,7 +83,36 @@ MinerU is now ready to use out of the box, but also supports extending functiona
|
||||
|
||||
Here are some available configuration options:
|
||||
|
||||
- `latex-delimiter-config`: Used to configure LaTeX formula delimiters, defaults to `$` symbol, can be modified to other symbols or strings as needed.
|
||||
- `llm-aided-config`: Used to configure parameters for LLM-assisted title hierarchy, compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen2.5-32b-instruct` model. You need to configure your own API key and set `enable` to `true` to enable this feature.
|
||||
- `models-dir`: Used to specify local model storage directory, please specify model directories for `pipeline` and `vlm` backends separately. After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
|
||||
|
||||
- `latex-delimiter-config`:
|
||||
* Used to configure LaTeX formula delimiters
|
||||
* Defaults to `$` symbol, can be modified to other symbols or strings as needed.
|
||||
|
||||
- `llm-aided-config`:
|
||||
* Used to configure parameters for LLM-assisted title hierarchy
|
||||
* Compatible with all LLM models supporting `openai protocol`, defaults to using Alibaba Cloud Bailian's `qwen3-next-80b-a3b-instruct` model.
|
||||
* You need to configure your own API key and set `enable` to `true` to enable this feature.
|
||||
* If your API provider does not support the `enable_thinking` parameter, please manually remove it.
|
||||
* For example, in your configuration file, the `llm-aided-config` section may look like:
|
||||
```json
|
||||
"llm-aided-config": {
|
||||
"api_key": "your_api_key",
|
||||
"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
|
||||
"model": "qwen3-next-80b-a3b-instruct",
|
||||
"enable_thinking": false,
|
||||
"enable": false
|
||||
}
|
||||
```
|
||||
* To remove the `enable_thinking` parameter, simply delete the line containing `"enable_thinking": false`, resulting in:
|
||||
```json
|
||||
"llm-aided-config": {
|
||||
"api_key": "your_api_key",
|
||||
"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
|
||||
"model": "qwen3-next-80b-a3b-instruct",
|
||||
"enable": false
|
||||
}
|
||||
```
|
||||
|
||||
- `models-dir`:
|
||||
* Used to specify local model storage directory
|
||||
* Please specify model directories for `pipeline` and `vlm` backends separately.
|
||||
* After specifying the directory, you can use local models by configuring the environment variable `export MINERU_MODEL_SOURCE=local`.
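* As an illustrative sketch only (the exact keys under `models-dir` are an assumption here, not taken from this section), the entry in `mineru.json` might look like:
```json
"models-dir": {
    "pipeline": "/path/to/local/pipeline_models",
    "vlm": "/path/to/local/vlm_models"
}
```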
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
|
||||
If your question is not listed here, you can also use [DeepWiki](https://deepwiki.com/opendatalab/MinerU) to chat with the AI assistant, which can resolve most common issues.
|
||||
|
||||
If you still cannot resolve the issue, you can join the community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](http://mineru.space/s/V85Yl) to communicate with other users and developers.
|
||||
If you still cannot resolve the issue, you can join the community via [Discord](https://discord.gg/Tdedn9GTXq) or [WeChat](https://mineru.net/community-portal/?aliasId=3c430f94) to communicate with other users and developers.
|
||||
|
||||
??? question "在WSL2的Ubuntu22.04中遇到报错`ImportError: libGL.so.1: cannot open shared object file: No such file or directory`"
|
||||
|
||||
@@ -14,18 +14,6 @@
|
||||
|
||||
Reference: [#388](https://github.com/opendatalab/MinerU/issues/388)
|
||||
|
||||
|
||||
??? question "在 CentOS 7 或 Ubuntu 18 系统安装MinerU时报错`ERROR: Failed building wheel for simsimd`"
|
||||
|
||||
The newer albumentations release (1.4.21) introduces a dependency on simsimd. Because the prebuilt simsimd packages for Linux require glibc >= 2.28, some Linux distributions released before 2019 cannot install it normally. It can be installed with the following commands:
|
||||
```
|
||||
conda create -n mineru python=3.11 -y
|
||||
conda activate mineru
|
||||
pip install -U "mineru[pipeline_old_linux]"
|
||||
```
|
||||
|
||||
Reference: [#1004](https://github.com/opendatalab/MinerU/issues/1004)
|
||||
|
||||
??? question "在 Linux 系统安装并使用时,解析结果缺失部份文字信息。"
|
||||
|
||||
In versions >= 2.0, MinerU uses `pypdfium2` instead of `pymupdf` as the PDF page rendering engine to resolve the AGPLv3 license issue. On some Linux distributions, missing CJK fonts may cause part of the text to be lost when rendering PDF pages to images.
|
||||
|
||||
@@ -18,8 +18,9 @@
|
||||
[](https://mineru.net/OpenSourceTools/Extractor?source=github)
|
||||
[](https://www.modelscope.cn/studios/OpenDataLab/MinerU)
|
||||
[](https://huggingface.co/spaces/opendatalab/MinerU)
|
||||
[](https://colab.research.google.com/gist/myhloli/3b3a00a4a0a61577b6c30f989092d20d/mineru_demo.ipynb)
|
||||
[](https://arxiv.org/abs/2409.18839)
|
||||
[](https://colab.research.google.com/gist/myhloli/a3cb16570ab3cfeadf9d8f0ac91b4fca/mineru_demo.ipynb)
|
||||
[](https://arxiv.org/abs/2409.18839)
|
||||
[](https://arxiv.org/abs/2509.22186)
|
||||
[](https://deepwiki.com/opendatalab/MinerU)
|
||||
|
||||
<div align="center">
|
||||
@@ -33,7 +34,7 @@
|
||||
<!-- join us -->
|
||||
|
||||
<p align="center">
|
||||
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="http://mineru.space/s/V85Yl" target="_blank">WeChat</a>
|
||||
👋 join us on <a href="https://discord.gg/Tdedn9GTXq" target="_blank">Discord</a> and <a href="https://mineru.net/community-portal/?aliasId=3c430f94" target="_blank">WeChat</a>
|
||||
</p>
|
||||
</div>
|
||||
|
||||
@@ -55,7 +56,7 @@ MinerU诞生于[书生-浦语](https://github.com/InternLM/InternLM)的预训练
|
||||
- Automatically recognizes formulas in documents and converts them to LaTeX format
|
||||
- Automatically recognizes tables in documents and converts them to HTML format
|
||||
- Automatically detects scanned and garbled PDFs and enables OCR
|
||||
- OCR supports detection and recognition of 84 languages
|
||||
- OCR supports detection and recognition of 109 languages
|
||||
- Supports multiple output formats, such as multimodal and NLP Markdown, JSON sorted by reading order, and an information-rich intermediate format
|
||||
- Supports multiple visualization outputs, including layout and span visualizations, for efficient output verification and quality inspection
|
||||
- Supports running in a CPU-only environment, with GPU (CUDA) / NPU (CANN) / MPS acceleration
|
||||
|
||||