commit 553ec651bfbd5763454e82cec7228927b2e1995d
Author: myhloli
Date: Thu Mar 26 18:02:27 2026 +0000

    Deployed bd46b08 with MkDocs version: 1.5.3

diff --git a/.nojekyll b/.nojekyll
new file mode 100644
index 00000000..e69de29b
diff --git a/404.html b/404.html
new file mode 100644
index 00000000..78869dae
--- /dev/null
+++ b/404.html
@@ -0,0 +1,1605 @@
+ MinerU
404 - Not found

\ No newline at end of file
diff --git a/assets/images/BISHENG_01.png b/assets/images/BISHENG_01.png
new file mode 100644
index 00000000..1291233c
Binary files /dev/null and b/assets/images/BISHENG_01.png differ
diff --git a/assets/images/Cherry_Studio_1.png b/assets/images/Cherry_Studio_1.png
new file mode 100644
index 00000000..dffb4a00
Binary files /dev/null and b/assets/images/Cherry_Studio_1.png differ
diff --git a/assets/images/Cherry_Studio_2.png b/assets/images/Cherry_Studio_2.png
new file mode 100644
index 00000000..c1b1cfef
Binary files /dev/null and b/assets/images/Cherry_Studio_2.png differ
diff --git a/assets/images/Cherry_Studio_3.png b/assets/images/Cherry_Studio_3.png
new file mode 100644
index 00000000..d400e734
Binary files /dev/null and b/assets/images/Cherry_Studio_3.png differ
diff --git a/assets/images/Cherry_Studio_4.png b/assets/images/Cherry_Studio_4.png
new file mode 100644
index 00000000..26a5dddb
Binary files /dev/null and b/assets/images/Cherry_Studio_4.png differ
diff --git a/assets/images/Cherry_Studio_5.png b/assets/images/Cherry_Studio_5.png
new file mode 100644
index 00000000..da88d85a
Binary files /dev/null and b/assets/images/Cherry_Studio_5.png differ
diff --git a/assets/images/Cherry_Studio_6.png b/assets/images/Cherry_Studio_6.png
new file mode 100644
index 00000000..35762af1
Binary files /dev/null and b/assets/images/Cherry_Studio_6.png differ
diff --git a/assets/images/Cherry_Studio_7.png b/assets/images/Cherry_Studio_7.png
new file mode 100644
index 00000000..916ecfe9
Binary files /dev/null and b/assets/images/Cherry_Studio_7.png differ
diff --git a/assets/images/Cherry_Studio_8.png b/assets/images/Cherry_Studio_8.png
new file mode 100644
index 00000000..ab2a9b96
Binary files /dev/null and b/assets/images/Cherry_Studio_8.png differ
diff --git a/assets/images/Coze_1.png b/assets/images/Coze_1.png
new file mode 100644
index 00000000..7c834d1d
Binary files /dev/null and b/assets/images/Coze_1.png differ
diff --git a/assets/images/Coze_10.png b/assets/images/Coze_10.png
new file mode 100644
index 00000000..6feda8eb
Binary files /dev/null and b/assets/images/Coze_10.png differ
diff --git a/assets/images/Coze_11.png b/assets/images/Coze_11.png
new file mode 100644
index 00000000..42077628
Binary files /dev/null and b/assets/images/Coze_11.png differ
diff --git a/assets/images/Coze_12.png b/assets/images/Coze_12.png
new file mode 100644
index 00000000..ed8ef1e0
Binary files /dev/null and b/assets/images/Coze_12.png differ
diff --git a/assets/images/Coze_13.png b/assets/images/Coze_13.png
new file mode 100644
index 00000000..fa2e79cc
Binary files /dev/null and b/assets/images/Coze_13.png differ
diff --git a/assets/images/Coze_14.png b/assets/images/Coze_14.png
new file mode 100644
index 00000000..cc491651
Binary files /dev/null and b/assets/images/Coze_14.png differ
diff --git a/assets/images/Coze_15.png b/assets/images/Coze_15.png
new file mode 100644
index 00000000..fd95fff2
Binary files /dev/null and b/assets/images/Coze_15.png differ
diff --git a/assets/images/Coze_16.png b/assets/images/Coze_16.png
new file mode 100644
index 00000000..29c6598d
Binary files /dev/null and b/assets/images/Coze_16.png differ
diff --git a/assets/images/Coze_17.png b/assets/images/Coze_17.png
new file mode 100644
index 00000000..71279272
Binary files /dev/null and b/assets/images/Coze_17.png differ
diff --git a/assets/images/Coze_18.png b/assets/images/Coze_18.png
new file mode 100644
index 00000000..ef2c54db
Binary files /dev/null and b/assets/images/Coze_18.png differ
diff --git a/assets/images/Coze_19.png b/assets/images/Coze_19.png
new file mode 100644
index 00000000..b7afb73d
Binary files /dev/null and b/assets/images/Coze_19.png differ
diff --git a/assets/images/Coze_2.png b/assets/images/Coze_2.png
new file mode 100644
index 00000000..fb8dac37
Binary files /dev/null and b/assets/images/Coze_2.png differ
diff --git a/assets/images/Coze_20.png b/assets/images/Coze_20.png
new file mode 100644
index 00000000..a7505c3b
Binary files /dev/null and b/assets/images/Coze_20.png differ
diff --git a/assets/images/Coze_21.png b/assets/images/Coze_21.png
new file mode 100644
index 00000000..a8365e60
Binary files /dev/null and b/assets/images/Coze_21.png differ
diff --git a/assets/images/Coze_3.png b/assets/images/Coze_3.png
new file mode 100644
index 00000000..f77f8969
Binary files /dev/null and b/assets/images/Coze_3.png differ
diff --git a/assets/images/Coze_4.png b/assets/images/Coze_4.png
new file mode 100644
index 00000000..c67f5252
Binary files /dev/null and b/assets/images/Coze_4.png differ
diff --git a/assets/images/Coze_5.png b/assets/images/Coze_5.png
new file mode 100644
index 00000000..e70ba9ca
Binary files /dev/null and b/assets/images/Coze_5.png differ
diff --git a/assets/images/Coze_6.png b/assets/images/Coze_6.png
new file mode 100644
index 00000000..5a0cb102
Binary files /dev/null and b/assets/images/Coze_6.png differ
diff --git a/assets/images/Coze_7.png b/assets/images/Coze_7.png
new file mode 100644
index 00000000..9f6f8292
Binary files /dev/null and b/assets/images/Coze_7.png differ
diff --git a/assets/images/Coze_8.png b/assets/images/Coze_8.png
new file mode 100644
index 00000000..f08e5378
Binary files /dev/null and b/assets/images/Coze_8.png differ
diff --git a/assets/images/Coze_9.png b/assets/images/Coze_9.png
new file mode 100644
index 00000000..fee7923b
Binary files /dev/null and b/assets/images/Coze_9.png differ
diff --git a/assets/images/DataFLow_01.png b/assets/images/DataFLow_01.png
new file mode 100644
index 00000000..17e3882b
Binary files /dev/null and b/assets/images/DataFLow_01.png differ
diff --git a/assets/images/DataFlow_02.png b/assets/images/DataFlow_02.png
new file mode 100644
index 00000000..a6182ddf
Binary files /dev/null and b/assets/images/DataFlow_02.png differ
diff --git a/assets/images/Dify_1.png b/assets/images/Dify_1.png
new file mode 100644
index 00000000..dbb9e6d5
Binary files /dev/null and b/assets/images/Dify_1.png differ
diff --git a/assets/images/Dify_10.png b/assets/images/Dify_10.png
new file mode 100644
index 00000000..d6626aba
Binary files /dev/null and b/assets/images/Dify_10.png differ
diff --git a/assets/images/Dify_11.png b/assets/images/Dify_11.png
new file mode 100644
index 00000000..bec7e91e
Binary files /dev/null and b/assets/images/Dify_11.png differ
diff --git a/assets/images/Dify_12.png b/assets/images/Dify_12.png
new file mode 100644
index 00000000..822fd7fb
Binary files /dev/null and b/assets/images/Dify_12.png differ
diff --git a/assets/images/Dify_13.png b/assets/images/Dify_13.png
new file mode 100644
index 00000000..d5025f13
Binary files /dev/null and b/assets/images/Dify_13.png differ
diff --git a/assets/images/Dify_14.png b/assets/images/Dify_14.png
new file mode 100644
index 00000000..f785542a
Binary files /dev/null and b/assets/images/Dify_14.png differ
diff --git a/assets/images/Dify_15.png b/assets/images/Dify_15.png
new file mode 100644
index 00000000..ef40173c
Binary files /dev/null and b/assets/images/Dify_15.png differ
diff --git a/assets/images/Dify_16.png b/assets/images/Dify_16.png
new file mode 100644
index 00000000..1f203b52
Binary files /dev/null and b/assets/images/Dify_16.png differ
diff --git a/assets/images/Dify_17.png b/assets/images/Dify_17.png
new file mode 100644
index 00000000..f944a390
Binary files /dev/null and b/assets/images/Dify_17.png differ
diff --git a/assets/images/Dify_18.png b/assets/images/Dify_18.png
new file mode 100644
index 00000000..a2b069d8
Binary files /dev/null and b/assets/images/Dify_18.png differ
diff --git a/assets/images/Dify_19.png b/assets/images/Dify_19.png
new file mode 100644
index 00000000..dc278e01
Binary files /dev/null and b/assets/images/Dify_19.png differ
diff --git a/assets/images/Dify_2.png b/assets/images/Dify_2.png
new file mode 100644
index 00000000..182c85f3
Binary files /dev/null and b/assets/images/Dify_2.png differ
diff --git a/assets/images/Dify_20.png b/assets/images/Dify_20.png
new file mode 100644
index 00000000..91b910e2
Binary files /dev/null and b/assets/images/Dify_20.png differ
diff --git a/assets/images/Dify_21.png b/assets/images/Dify_21.png
new file mode 100644
index 00000000..784de772
Binary files /dev/null and b/assets/images/Dify_21.png differ
diff --git a/assets/images/Dify_22.png b/assets/images/Dify_22.png
new file mode 100644
index 00000000..304995f1
Binary files /dev/null and b/assets/images/Dify_22.png differ
diff --git a/assets/images/Dify_23.png b/assets/images/Dify_23.png
new file mode 100644
index 00000000..9a1ac093
Binary files /dev/null and b/assets/images/Dify_23.png differ
diff --git a/assets/images/Dify_24.png b/assets/images/Dify_24.png
new file mode 100644
index 00000000..4902617c
Binary files /dev/null and b/assets/images/Dify_24.png differ
diff --git a/assets/images/Dify_25.png b/assets/images/Dify_25.png
new file mode 100644
index 00000000..21315a3f
Binary files /dev/null and b/assets/images/Dify_25.png differ
diff --git a/assets/images/Dify_26.png b/assets/images/Dify_26.png
new file mode 100644
index 00000000..c59244b2
Binary files /dev/null and b/assets/images/Dify_26.png differ
diff --git a/assets/images/Dify_3.png b/assets/images/Dify_3.png
new file mode 100644
index 00000000..0a7cb968
Binary files /dev/null and b/assets/images/Dify_3.png differ
diff --git a/assets/images/Dify_4.png b/assets/images/Dify_4.png
new file mode 100644
index 00000000..759c070a
Binary files /dev/null and b/assets/images/Dify_4.png differ
diff --git a/assets/images/Dify_5.png b/assets/images/Dify_5.png
new file mode 100644
index 00000000..bc73986d
Binary files /dev/null and b/assets/images/Dify_5.png differ
diff --git a/assets/images/Dify_6.png b/assets/images/Dify_6.png
new file mode 100644
index 00000000..27225bf8
Binary files /dev/null and b/assets/images/Dify_6.png differ
diff --git a/assets/images/Dify_7.png b/assets/images/Dify_7.png
new file mode 100644
index 00000000..82bb291c
Binary files /dev/null and b/assets/images/Dify_7.png differ
diff --git a/assets/images/Dify_8.png b/assets/images/Dify_8.png
new file mode 100644
index 00000000..9f9422ae
Binary files /dev/null and b/assets/images/Dify_8.png differ
diff --git a/assets/images/Dify_9.png b/assets/images/Dify_9.png
new file mode 100644
index 00000000..b94f315d
Binary files /dev/null and b/assets/images/Dify_9.png differ
diff --git a/assets/images/DingTalk_01.png b/assets/images/DingTalk_01.png
new file mode 100644
index 00000000..413012ad
Binary files /dev/null and b/assets/images/DingTalk_01.png differ
diff --git a/assets/images/FastGPT_01.png b/assets/images/FastGPT_01.png
new file mode 100644
index 00000000..25fbdfa0
Binary files /dev/null and b/assets/images/FastGPT_01.png differ
diff --git a/assets/images/FastGPT_02.png b/assets/images/FastGPT_02.png
new file mode 100644
index 00000000..345f1eec
Binary files /dev/null and b/assets/images/FastGPT_02.png differ
diff --git a/assets/images/ModelWhale_01.png b/assets/images/ModelWhale_01.png
new file mode 100644
index 00000000..b2f768aa
Binary files /dev/null and b/assets/images/ModelWhale_01.png differ
diff --git a/assets/images/ModelWhale_02.png b/assets/images/ModelWhale_02.png
new file mode 100644
index 00000000..bc964f11
Binary files /dev/null and b/assets/images/ModelWhale_02.png differ
diff --git a/assets/images/ModelWhale_1.png b/assets/images/ModelWhale_1.png
new file mode 100644
index 00000000..c80e7f42
Binary files /dev/null and b/assets/images/ModelWhale_1.png differ
diff --git a/assets/images/RagFlow_01.png b/assets/images/RagFlow_01.png
new file mode 100644
index 00000000..476e0de2
Binary files /dev/null and b/assets/images/RagFlow_01.png differ
diff --git a/assets/images/RagFlow_02.png b/assets/images/RagFlow_02.png
new file mode 100644
index 00000000..f2fec68e
Binary files /dev/null and b/assets/images/RagFlow_02.png differ
diff --git a/assets/images/Sider_1.png b/assets/images/Sider_1.png
new file mode 100644
index 00000000..f682e38b
Binary files /dev/null and b/assets/images/Sider_1.png differ
diff --git a/assets/images/coze_0.png b/assets/images/coze_0.png
new file mode 100644
index 00000000..92ff2131
Binary files /dev/null and b/assets/images/coze_0.png differ
diff --git a/assets/images/favicon.png b/assets/images/favicon.png
new file mode 100644
index 00000000..1cf13b9f
Binary files /dev/null and b/assets/images/favicon.png differ
diff --git a/assets/images/n8n_0.png b/assets/images/n8n_0.png
new file mode 100644
index 00000000..31b42c03
Binary files /dev/null and b/assets/images/n8n_0.png differ
diff --git a/assets/images/n8n_1.png b/assets/images/n8n_1.png
new file mode 100644
index 00000000..3f9fecbc
Binary files /dev/null and b/assets/images/n8n_1.png differ
diff --git a/assets/images/n8n_10.png b/assets/images/n8n_10.png
new file mode 100644
index 00000000..6fdc12cb
Binary files /dev/null and b/assets/images/n8n_10.png differ
diff --git a/assets/images/n8n_2.png b/assets/images/n8n_2.png
new file mode 100644
index 00000000..f93599a8
Binary files /dev/null and b/assets/images/n8n_2.png differ
diff --git a/assets/images/n8n_3.png b/assets/images/n8n_3.png
new file mode 100644
index 00000000..c1ab8807
Binary files /dev/null and b/assets/images/n8n_3.png differ
diff --git a/assets/images/n8n_4.png b/assets/images/n8n_4.png
new file mode 100644
index 00000000..76657fab
Binary files /dev/null and b/assets/images/n8n_4.png differ
diff --git a/assets/images/n8n_5.png b/assets/images/n8n_5.png
new file mode 100644
index 00000000..f6aa18af
Binary files /dev/null and b/assets/images/n8n_5.png differ
diff --git a/assets/images/n8n_6.png b/assets/images/n8n_6.png
new file mode 100644
index 00000000..88c9ea3a
Binary files /dev/null and b/assets/images/n8n_6.png differ
diff --git a/assets/images/n8n_7.png b/assets/images/n8n_7.png
new file mode 100644
index 00000000..7a1e0f70
Binary files /dev/null and b/assets/images/n8n_7.png differ
diff --git a/assets/images/n8n_8.png
b/assets/images/n8n_8.png new file mode 100644 index 00000000..9daff94e Binary files /dev/null and b/assets/images/n8n_8.png differ diff --git a/assets/images/n8n_9.png b/assets/images/n8n_9.png new file mode 100644 index 00000000..77c6272f Binary files /dev/null and b/assets/images/n8n_9.png differ diff --git a/assets/javascripts/bundle.1e8ae164.min.js b/assets/javascripts/bundle.1e8ae164.min.js new file mode 100644 index 00000000..21297988 --- /dev/null +++ b/assets/javascripts/bundle.1e8ae164.min.js @@ -0,0 +1,29 @@ +"use strict";(()=>{var _i=Object.create;var br=Object.defineProperty;var Ai=Object.getOwnPropertyDescriptor;var Ci=Object.getOwnPropertyNames,Ft=Object.getOwnPropertySymbols,ki=Object.getPrototypeOf,vr=Object.prototype.hasOwnProperty,eo=Object.prototype.propertyIsEnumerable;var Zr=(e,t,r)=>t in e?br(e,t,{enumerable:!0,configurable:!0,writable:!0,value:r}):e[t]=r,F=(e,t)=>{for(var r in t||(t={}))vr.call(t,r)&&Zr(e,r,t[r]);if(Ft)for(var r of Ft(t))eo.call(t,r)&&Zr(e,r,t[r]);return e};var to=(e,t)=>{var r={};for(var o in e)vr.call(e,o)&&t.indexOf(o)<0&&(r[o]=e[o]);if(e!=null&&Ft)for(var o of Ft(e))t.indexOf(o)<0&&eo.call(e,o)&&(r[o]=e[o]);return r};var gr=(e,t)=>()=>(t||e((t={exports:{}}).exports,t),t.exports);var Hi=(e,t,r,o)=>{if(t&&typeof t=="object"||typeof t=="function")for(let n of Ci(t))!vr.call(e,n)&&n!==r&&br(e,n,{get:()=>t[n],enumerable:!(o=Ai(t,n))||o.enumerable});return e};var jt=(e,t,r)=>(r=e!=null?_i(ki(e)):{},Hi(t||!e||!e.__esModule?br(r,"default",{value:e,enumerable:!0}):r,e));var ro=(e,t,r)=>new Promise((o,n)=>{var i=c=>{try{s(r.next(c))}catch(p){n(p)}},a=c=>{try{s(r.throw(c))}catch(p){n(p)}},s=c=>c.done?o(c.value):Promise.resolve(c.value).then(i,a);s((r=r.apply(e,t)).next())});var no=gr((xr,oo)=>{(function(e,t){typeof xr=="object"&&typeof oo!="undefined"?t():typeof define=="function"&&define.amd?define(t):t()})(xr,function(){"use strict";function e(r){var 
o=!0,n=!1,i=null,a={text:!0,search:!0,url:!0,tel:!0,email:!0,password:!0,number:!0,date:!0,month:!0,week:!0,time:!0,datetime:!0,"datetime-local":!0};function s(C){return!!(C&&C!==document&&C.nodeName!=="HTML"&&C.nodeName!=="BODY"&&"classList"in C&&"contains"in C.classList)}function c(C){var ct=C.type,Ne=C.tagName;return!!(Ne==="INPUT"&&a[ct]&&!C.readOnly||Ne==="TEXTAREA"&&!C.readOnly||C.isContentEditable)}function p(C){C.classList.contains("focus-visible")||(C.classList.add("focus-visible"),C.setAttribute("data-focus-visible-added",""))}function l(C){C.hasAttribute("data-focus-visible-added")&&(C.classList.remove("focus-visible"),C.removeAttribute("data-focus-visible-added"))}function f(C){C.metaKey||C.altKey||C.ctrlKey||(s(r.activeElement)&&p(r.activeElement),o=!0)}function u(C){o=!1}function h(C){s(C.target)&&(o||c(C.target))&&p(C.target)}function w(C){s(C.target)&&(C.target.classList.contains("focus-visible")||C.target.hasAttribute("data-focus-visible-added"))&&(n=!0,window.clearTimeout(i),i=window.setTimeout(function(){n=!1},100),l(C.target))}function A(C){document.visibilityState==="hidden"&&(n&&(o=!0),Z())}function Z(){document.addEventListener("mousemove",J),document.addEventListener("mousedown",J),document.addEventListener("mouseup",J),document.addEventListener("pointermove",J),document.addEventListener("pointerdown",J),document.addEventListener("pointerup",J),document.addEventListener("touchmove",J),document.addEventListener("touchstart",J),document.addEventListener("touchend",J)}function te(){document.removeEventListener("mousemove",J),document.removeEventListener("mousedown",J),document.removeEventListener("mouseup",J),document.removeEventListener("pointermove",J),document.removeEventListener("pointerdown",J),document.removeEventListener("pointerup",J),document.removeEventListener("touchmove",J),document.removeEventListener("touchstart",J),document.removeEventListener("touchend",J)}function 
J(C){C.target.nodeName&&C.target.nodeName.toLowerCase()==="html"||(o=!1,te())}document.addEventListener("keydown",f,!0),document.addEventListener("mousedown",u,!0),document.addEventListener("pointerdown",u,!0),document.addEventListener("touchstart",u,!0),document.addEventListener("visibilitychange",A,!0),Z(),r.addEventListener("focus",h,!0),r.addEventListener("blur",w,!0),r.nodeType===Node.DOCUMENT_FRAGMENT_NODE&&r.host?r.host.setAttribute("data-js-focus-visible",""):r.nodeType===Node.DOCUMENT_NODE&&(document.documentElement.classList.add("js-focus-visible"),document.documentElement.setAttribute("data-js-focus-visible",""))}if(typeof window!="undefined"&&typeof document!="undefined"){window.applyFocusVisiblePolyfill=e;var t;try{t=new CustomEvent("focus-visible-polyfill-ready")}catch(r){t=document.createEvent("CustomEvent"),t.initCustomEvent("focus-visible-polyfill-ready",!1,!1,{})}window.dispatchEvent(t)}typeof document!="undefined"&&e(document)})});var zr=gr((kt,Vr)=>{/*! + * clipboard.js v2.0.11 + * https://clipboardjs.com/ + * + * Licensed MIT © Zeno Rocha + */(function(t,r){typeof kt=="object"&&typeof Vr=="object"?Vr.exports=r():typeof define=="function"&&define.amd?define([],r):typeof kt=="object"?kt.ClipboardJS=r():t.ClipboardJS=r()})(kt,function(){return function(){var e={686:function(o,n,i){"use strict";i.d(n,{default:function(){return Li}});var a=i(279),s=i.n(a),c=i(370),p=i.n(c),l=i(817),f=i.n(l);function u(D){try{return document.execCommand(D)}catch(M){return!1}}var h=function(M){var O=f()(M);return u("cut"),O},w=h;function A(D){var M=document.documentElement.getAttribute("dir")==="rtl",O=document.createElement("textarea");O.style.fontSize="12pt",O.style.border="0",O.style.padding="0",O.style.margin="0",O.style.position="absolute",O.style[M?"right":"left"]="-9999px";var I=window.pageYOffset||document.documentElement.scrollTop;return O.style.top="".concat(I,"px"),O.setAttribute("readonly",""),O.value=D,O}var Z=function(M,O){var 
I=A(M);O.container.appendChild(I);var W=f()(I);return u("copy"),I.remove(),W},te=function(M){var O=arguments.length>1&&arguments[1]!==void 0?arguments[1]:{container:document.body},I="";return typeof M=="string"?I=Z(M,O):M instanceof HTMLInputElement&&!["text","search","url","tel","password"].includes(M==null?void 0:M.type)?I=Z(M.value,O):(I=f()(M),u("copy")),I},J=te;function C(D){"@babel/helpers - typeof";return typeof Symbol=="function"&&typeof Symbol.iterator=="symbol"?C=function(O){return typeof O}:C=function(O){return O&&typeof Symbol=="function"&&O.constructor===Symbol&&O!==Symbol.prototype?"symbol":typeof O},C(D)}var ct=function(){var M=arguments.length>0&&arguments[0]!==void 0?arguments[0]:{},O=M.action,I=O===void 0?"copy":O,W=M.container,K=M.target,Ce=M.text;if(I!=="copy"&&I!=="cut")throw new Error('Invalid "action" value, use either "copy" or "cut"');if(K!==void 0)if(K&&C(K)==="object"&&K.nodeType===1){if(I==="copy"&&K.hasAttribute("disabled"))throw new Error('Invalid "target" attribute. Please use "readonly" instead of "disabled" attribute');if(I==="cut"&&(K.hasAttribute("readonly")||K.hasAttribute("disabled")))throw new Error(`Invalid "target" attribute. 
You can't cut text from elements with "readonly" or "disabled" attributes`)}else throw new Error('Invalid "target" value, use a valid Element');if(Ce)return J(Ce,{container:W});if(K)return I==="cut"?w(K):J(K,{container:W})},Ne=ct;function Pe(D){"@babel/helpers - typeof";return typeof Symbol=="function"&&typeof Symbol.iterator=="symbol"?Pe=function(O){return typeof O}:Pe=function(O){return O&&typeof Symbol=="function"&&O.constructor===Symbol&&O!==Symbol.prototype?"symbol":typeof O},Pe(D)}function xi(D,M){if(!(D instanceof M))throw new TypeError("Cannot call a class as a function")}function Xr(D,M){for(var O=0;O0&&arguments[0]!==void 0?arguments[0]:{};this.action=typeof W.action=="function"?W.action:this.defaultAction,this.target=typeof W.target=="function"?W.target:this.defaultTarget,this.text=typeof W.text=="function"?W.text:this.defaultText,this.container=Pe(W.container)==="object"?W.container:document.body}},{key:"listenClick",value:function(W){var K=this;this.listener=p()(W,"click",function(Ce){return K.onClick(Ce)})}},{key:"onClick",value:function(W){var K=W.delegateTarget||W.currentTarget,Ce=this.action(K)||"copy",It=Ne({action:Ce,container:this.container,target:this.target(K),text:this.text(K)});this.emit(It?"success":"error",{action:Ce,text:It,trigger:K,clearSelection:function(){K&&K.focus(),window.getSelection().removeAllRanges()}})}},{key:"defaultAction",value:function(W){return hr("action",W)}},{key:"defaultTarget",value:function(W){var K=hr("target",W);if(K)return document.querySelector(K)}},{key:"defaultText",value:function(W){return hr("text",W)}},{key:"destroy",value:function(){this.listener.destroy()}}],[{key:"copy",value:function(W){var K=arguments.length>1&&arguments[1]!==void 0?arguments[1]:{container:document.body};return J(W,K)}},{key:"cut",value:function(W){return w(W)}},{key:"isSupported",value:function(){var W=arguments.length>0&&arguments[0]!==void 0?arguments[0]:["copy","cut"],K=typeof 
W=="string"?[W]:W,Ce=!!document.queryCommandSupported;return K.forEach(function(It){Ce=Ce&&!!document.queryCommandSupported(It)}),Ce}}]),O}(s()),Li=Mi},828:function(o){var n=9;if(typeof Element!="undefined"&&!Element.prototype.matches){var i=Element.prototype;i.matches=i.matchesSelector||i.mozMatchesSelector||i.msMatchesSelector||i.oMatchesSelector||i.webkitMatchesSelector}function a(s,c){for(;s&&s.nodeType!==n;){if(typeof s.matches=="function"&&s.matches(c))return s;s=s.parentNode}}o.exports=a},438:function(o,n,i){var a=i(828);function s(l,f,u,h,w){var A=p.apply(this,arguments);return l.addEventListener(u,A,w),{destroy:function(){l.removeEventListener(u,A,w)}}}function c(l,f,u,h,w){return typeof l.addEventListener=="function"?s.apply(null,arguments):typeof u=="function"?s.bind(null,document).apply(null,arguments):(typeof l=="string"&&(l=document.querySelectorAll(l)),Array.prototype.map.call(l,function(A){return s(A,f,u,h,w)}))}function p(l,f,u,h){return function(w){w.delegateTarget=a(w.target,f),w.delegateTarget&&h.call(l,w)}}o.exports=c},879:function(o,n){n.node=function(i){return i!==void 0&&i instanceof HTMLElement&&i.nodeType===1},n.nodeList=function(i){var a=Object.prototype.toString.call(i);return i!==void 0&&(a==="[object NodeList]"||a==="[object HTMLCollection]")&&"length"in i&&(i.length===0||n.node(i[0]))},n.string=function(i){return typeof i=="string"||i instanceof String},n.fn=function(i){var a=Object.prototype.toString.call(i);return a==="[object Function]"}},370:function(o,n,i){var a=i(879),s=i(438);function c(u,h,w){if(!u&&!h&&!w)throw new Error("Missing required arguments");if(!a.string(h))throw new TypeError("Second argument must be a String");if(!a.fn(w))throw new TypeError("Third argument must be a Function");if(a.node(u))return p(u,h,w);if(a.nodeList(u))return l(u,h,w);if(a.string(u))return f(u,h,w);throw new TypeError("First argument must be a String, HTMLElement, HTMLCollection, or NodeList")}function p(u,h,w){return 
u.addEventListener(h,w),{destroy:function(){u.removeEventListener(h,w)}}}function l(u,h,w){return Array.prototype.forEach.call(u,function(A){A.addEventListener(h,w)}),{destroy:function(){Array.prototype.forEach.call(u,function(A){A.removeEventListener(h,w)})}}}function f(u,h,w){return s(document.body,u,h,w)}o.exports=c},817:function(o){function n(i){var a;if(i.nodeName==="SELECT")i.focus(),a=i.value;else if(i.nodeName==="INPUT"||i.nodeName==="TEXTAREA"){var s=i.hasAttribute("readonly");s||i.setAttribute("readonly",""),i.select(),i.setSelectionRange(0,i.value.length),s||i.removeAttribute("readonly"),a=i.value}else{i.hasAttribute("contenteditable")&&i.focus();var c=window.getSelection(),p=document.createRange();p.selectNodeContents(i),c.removeAllRanges(),c.addRange(p),a=c.toString()}return a}o.exports=n},279:function(o){function n(){}n.prototype={on:function(i,a,s){var c=this.e||(this.e={});return(c[i]||(c[i]=[])).push({fn:a,ctx:s}),this},once:function(i,a,s){var c=this;function p(){c.off(i,p),a.apply(s,arguments)}return p._=a,this.on(i,p,s)},emit:function(i){var a=[].slice.call(arguments,1),s=((this.e||(this.e={}))[i]||[]).slice(),c=0,p=s.length;for(c;c{"use strict";/*! 
+ * escape-html + * Copyright(c) 2012-2013 TJ Holowaychuk + * Copyright(c) 2015 Andreas Lubbe + * Copyright(c) 2015 Tiancheng "Timothy" Gu + * MIT Licensed + */var Va=/["'&<>]/;qn.exports=za;function za(e){var t=""+e,r=Va.exec(t);if(!r)return t;var o,n="",i=0,a=0;for(i=r.index;i0&&i[i.length-1])&&(p[0]===6||p[0]===2)){r=0;continue}if(p[0]===3&&(!i||p[1]>i[0]&&p[1]=e.length&&(e=void 0),{value:e&&e[o++],done:!e}}};throw new TypeError(t?"Object is not iterable.":"Symbol.iterator is not defined.")}function V(e,t){var r=typeof Symbol=="function"&&e[Symbol.iterator];if(!r)return e;var o=r.call(e),n,i=[],a;try{for(;(t===void 0||t-- >0)&&!(n=o.next()).done;)i.push(n.value)}catch(s){a={error:s}}finally{try{n&&!n.done&&(r=o.return)&&r.call(o)}finally{if(a)throw a.error}}return i}function z(e,t,r){if(r||arguments.length===2)for(var o=0,n=t.length,i;o1||s(u,h)})})}function s(u,h){try{c(o[u](h))}catch(w){f(i[0][3],w)}}function c(u){u.value instanceof ot?Promise.resolve(u.value.v).then(p,l):f(i[0][2],u)}function p(u){s("next",u)}function l(u){s("throw",u)}function f(u,h){u(h),i.shift(),i.length&&s(i[0][0],i[0][1])}}function so(e){if(!Symbol.asyncIterator)throw new TypeError("Symbol.asyncIterator is not defined.");var t=e[Symbol.asyncIterator],r;return t?t.call(e):(e=typeof ue=="function"?ue(e):e[Symbol.iterator](),r={},o("next"),o("throw"),o("return"),r[Symbol.asyncIterator]=function(){return this},r);function o(i){r[i]=e[i]&&function(a){return new Promise(function(s,c){a=e[i](a),n(s,c,a.done,a.value)})}}function n(i,a,s,c){Promise.resolve(c).then(function(p){i({value:p,done:s})},a)}}function k(e){return typeof e=="function"}function pt(e){var t=function(o){Error.call(o),o.stack=new Error().stack},r=e(t);return r.prototype=Object.create(Error.prototype),r.prototype.constructor=r,r}var Wt=pt(function(e){return function(r){e(this),this.message=r?r.length+` errors occurred during unsubscription: +`+r.map(function(o,n){return n+1+") "+o.toString()}).join(` + 
`):"",this.name="UnsubscriptionError",this.errors=r}});function Ve(e,t){if(e){var r=e.indexOf(t);0<=r&&e.splice(r,1)}}var Ie=function(){function e(t){this.initialTeardown=t,this.closed=!1,this._parentage=null,this._finalizers=null}return e.prototype.unsubscribe=function(){var t,r,o,n,i;if(!this.closed){this.closed=!0;var a=this._parentage;if(a)if(this._parentage=null,Array.isArray(a))try{for(var s=ue(a),c=s.next();!c.done;c=s.next()){var p=c.value;p.remove(this)}}catch(A){t={error:A}}finally{try{c&&!c.done&&(r=s.return)&&r.call(s)}finally{if(t)throw t.error}}else a.remove(this);var l=this.initialTeardown;if(k(l))try{l()}catch(A){i=A instanceof Wt?A.errors:[A]}var f=this._finalizers;if(f){this._finalizers=null;try{for(var u=ue(f),h=u.next();!h.done;h=u.next()){var w=h.value;try{co(w)}catch(A){i=i!=null?i:[],A instanceof Wt?i=z(z([],V(i)),V(A.errors)):i.push(A)}}}catch(A){o={error:A}}finally{try{h&&!h.done&&(n=u.return)&&n.call(u)}finally{if(o)throw o.error}}}if(i)throw new Wt(i)}},e.prototype.add=function(t){var r;if(t&&t!==this)if(this.closed)co(t);else{if(t instanceof e){if(t.closed||t._hasParent(this))return;t._addParent(this)}(this._finalizers=(r=this._finalizers)!==null&&r!==void 0?r:[]).push(t)}},e.prototype._hasParent=function(t){var r=this._parentage;return r===t||Array.isArray(r)&&r.includes(t)},e.prototype._addParent=function(t){var r=this._parentage;this._parentage=Array.isArray(r)?(r.push(t),r):r?[r,t]:t},e.prototype._removeParent=function(t){var r=this._parentage;r===t?this._parentage=null:Array.isArray(r)&&Ve(r,t)},e.prototype.remove=function(t){var r=this._finalizers;r&&Ve(r,t),t instanceof e&&t._removeParent(this)},e.EMPTY=function(){var t=new e;return t.closed=!0,t}(),e}();var Er=Ie.EMPTY;function Dt(e){return e instanceof Ie||e&&"closed"in e&&k(e.remove)&&k(e.add)&&k(e.unsubscribe)}function co(e){k(e)?e():e.unsubscribe()}var ke={onUnhandledError:null,onStoppedNotification:null,Promise:void 
0,useDeprecatedSynchronousErrorHandling:!1,useDeprecatedNextContext:!1};var lt={setTimeout:function(e,t){for(var r=[],o=2;o0},enumerable:!1,configurable:!0}),t.prototype._trySubscribe=function(r){return this._throwIfClosed(),e.prototype._trySubscribe.call(this,r)},t.prototype._subscribe=function(r){return this._throwIfClosed(),this._checkFinalizedStatuses(r),this._innerSubscribe(r)},t.prototype._innerSubscribe=function(r){var o=this,n=this,i=n.hasError,a=n.isStopped,s=n.observers;return i||a?Er:(this.currentObservers=null,s.push(r),new Ie(function(){o.currentObservers=null,Ve(s,r)}))},t.prototype._checkFinalizedStatuses=function(r){var o=this,n=o.hasError,i=o.thrownError,a=o.isStopped;n?r.error(i):a&&r.complete()},t.prototype.asObservable=function(){var r=new j;return r.source=this,r},t.create=function(r,o){return new vo(r,o)},t}(j);var vo=function(e){se(t,e);function t(r,o){var n=e.call(this)||this;return n.destination=r,n.source=o,n}return t.prototype.next=function(r){var o,n;(n=(o=this.destination)===null||o===void 0?void 0:o.next)===null||n===void 0||n.call(o,r)},t.prototype.error=function(r){var o,n;(n=(o=this.destination)===null||o===void 0?void 0:o.error)===null||n===void 0||n.call(o,r)},t.prototype.complete=function(){var r,o;(o=(r=this.destination)===null||r===void 0?void 0:r.complete)===null||o===void 0||o.call(r)},t.prototype._subscribe=function(r){var o,n;return(n=(o=this.source)===null||o===void 0?void 0:o.subscribe(r))!==null&&n!==void 0?n:Er},t}(g);var St={now:function(){return(St.delegate||Date).now()},delegate:void 0};var Ot=function(e){se(t,e);function t(r,o,n){r===void 0&&(r=1/0),o===void 0&&(o=1/0),n===void 0&&(n=St);var i=e.call(this)||this;return i._bufferSize=r,i._windowTime=o,i._timestampProvider=n,i._buffer=[],i._infiniteTimeWindow=!0,i._infiniteTimeWindow=o===1/0,i._bufferSize=Math.max(1,r),i._windowTime=Math.max(1,o),i}return t.prototype.next=function(r){var 
o=this,n=o.isStopped,i=o._buffer,a=o._infiniteTimeWindow,s=o._timestampProvider,c=o._windowTime;n||(i.push(r),!a&&i.push(s.now()+c)),this._trimBuffer(),e.prototype.next.call(this,r)},t.prototype._subscribe=function(r){this._throwIfClosed(),this._trimBuffer();for(var o=this._innerSubscribe(r),n=this,i=n._infiniteTimeWindow,a=n._buffer,s=a.slice(),c=0;c0?e.prototype.requestAsyncId.call(this,r,o,n):(r.actions.push(this),r._scheduled||(r._scheduled=ut.requestAnimationFrame(function(){return r.flush(void 0)})))},t.prototype.recycleAsyncId=function(r,o,n){var i;if(n===void 0&&(n=0),n!=null?n>0:this.delay>0)return e.prototype.recycleAsyncId.call(this,r,o,n);var a=r.actions;o!=null&&((i=a[a.length-1])===null||i===void 0?void 0:i.id)!==o&&(ut.cancelAnimationFrame(o),r._scheduled=void 0)},t}(zt);var yo=function(e){se(t,e);function t(){return e!==null&&e.apply(this,arguments)||this}return t.prototype.flush=function(r){this._active=!0;var o=this._scheduled;this._scheduled=void 0;var n=this.actions,i;r=r||n.shift();do if(i=r.execute(r.state,r.delay))break;while((r=n[0])&&r.id===o&&n.shift());if(this._active=!1,i){for(;(r=n[0])&&r.id===o&&n.shift();)r.unsubscribe();throw i}},t}(qt);var de=new yo(xo);var L=new j(function(e){return e.complete()});function Kt(e){return e&&k(e.schedule)}function _r(e){return e[e.length-1]}function Je(e){return k(_r(e))?e.pop():void 0}function Ae(e){return Kt(_r(e))?e.pop():void 0}function Qt(e,t){return typeof _r(e)=="number"?e.pop():t}var dt=function(e){return e&&typeof e.length=="number"&&typeof e!="function"};function Yt(e){return k(e==null?void 0:e.then)}function Bt(e){return k(e[ft])}function Gt(e){return Symbol.asyncIterator&&k(e==null?void 0:e[Symbol.asyncIterator])}function Jt(e){return new TypeError("You provided "+(e!==null&&typeof e=="object"?"an invalid object":"'"+e+"'")+" where a stream was expected. 
You can provide an Observable, Promise, ReadableStream, Array, AsyncIterable, or Iterable.")}function Di(){return typeof Symbol!="function"||!Symbol.iterator?"@@iterator":Symbol.iterator}var Xt=Di();function Zt(e){return k(e==null?void 0:e[Xt])}function er(e){return ao(this,arguments,function(){var r,o,n,i;return Ut(this,function(a){switch(a.label){case 0:r=e.getReader(),a.label=1;case 1:a.trys.push([1,,9,10]),a.label=2;case 2:return[4,ot(r.read())];case 3:return o=a.sent(),n=o.value,i=o.done,i?[4,ot(void 0)]:[3,5];case 4:return[2,a.sent()];case 5:return[4,ot(n)];case 6:return[4,a.sent()];case 7:return a.sent(),[3,2];case 8:return[3,10];case 9:return r.releaseLock(),[7];case 10:return[2]}})})}function tr(e){return k(e==null?void 0:e.getReader)}function N(e){if(e instanceof j)return e;if(e!=null){if(Bt(e))return Ni(e);if(dt(e))return Vi(e);if(Yt(e))return zi(e);if(Gt(e))return Eo(e);if(Zt(e))return qi(e);if(tr(e))return Ki(e)}throw Jt(e)}function Ni(e){return new j(function(t){var r=e[ft]();if(k(r.subscribe))return r.subscribe(t);throw new TypeError("Provided object does not correctly implement Symbol.observable")})}function Vi(e){return new j(function(t){for(var r=0;r=2;return function(o){return o.pipe(e?b(function(n,i){return e(n,i,o)}):ce,ye(1),r?Qe(t):jo(function(){return new or}))}}function $r(e){return e<=0?function(){return L}:x(function(t,r){var o=[];t.subscribe(S(r,function(n){o.push(n),e=2,!0))}function le(e){e===void 0&&(e={});var t=e.connector,r=t===void 0?function(){return new g}:t,o=e.resetOnError,n=o===void 0?!0:o,i=e.resetOnComplete,a=i===void 0?!0:i,s=e.resetOnRefCountZero,c=s===void 0?!0:s;return function(p){var l,f,u,h=0,w=!1,A=!1,Z=function(){f==null||f.unsubscribe(),f=void 0},te=function(){Z(),l=u=void 0,w=A=!1},J=function(){var C=l;te(),C==null||C.unsubscribe()};return x(function(C,ct){h++,!A&&!w&&Z();var Ne=u=u!=null?u:r();ct.add(function(){h--,h===0&&!A&&!w&&(f=Pr(J,c))}),Ne.subscribe(ct),!l&&h>0&&(l=new it({next:function(Pe){return 
Ne.next(Pe)},error:function(Pe){A=!0,Z(),f=Pr(te,n,Pe),Ne.error(Pe)},complete:function(){w=!0,Z(),f=Pr(te,a),Ne.complete()}}),N(C).subscribe(l))})(p)}}function Pr(e,t){for(var r=[],o=2;oe.next(document)),e}function R(e,t=document){return Array.from(t.querySelectorAll(e))}function P(e,t=document){let r=me(e,t);if(typeof r=="undefined")throw new ReferenceError(`Missing element: expected "${e}" to be present`);return r}function me(e,t=document){return t.querySelector(e)||void 0}function Re(){var e,t,r,o;return(o=(r=(t=(e=document.activeElement)==null?void 0:e.shadowRoot)==null?void 0:t.activeElement)!=null?r:document.activeElement)!=null?o:void 0}var la=T(d(document.body,"focusin"),d(document.body,"focusout")).pipe(be(1),q(void 0),m(()=>Re()||document.body),B(1));function vt(e){return la.pipe(m(t=>e.contains(t)),Y())}function Vo(e,t){return T(d(e,"mouseenter").pipe(m(()=>!0)),d(e,"mouseleave").pipe(m(()=>!1))).pipe(t?be(t):ce,q(!1))}function Ue(e){return{x:e.offsetLeft,y:e.offsetTop}}function zo(e){return T(d(window,"load"),d(window,"resize")).pipe(Me(0,de),m(()=>Ue(e)),q(Ue(e)))}function ir(e){return{x:e.scrollLeft,y:e.scrollTop}}function et(e){return T(d(e,"scroll"),d(window,"resize")).pipe(Me(0,de),m(()=>ir(e)),q(ir(e)))}function qo(e,t){if(typeof t=="string"||typeof t=="number")e.innerHTML+=t.toString();else if(t instanceof Node)e.appendChild(t);else if(Array.isArray(t))for(let r of t)qo(e,r)}function E(e,t,...r){let o=document.createElement(e);if(t)for(let n of Object.keys(t))typeof t[n]!="undefined"&&(typeof t[n]!="boolean"?o.setAttribute(n,t[n]):o.setAttribute(n,""));for(let n of r)qo(o,n);return o}function ar(e){if(e>999){let t=+((e-950)%1e3>99);return`${((e+1e-6)/1e3).toFixed(t)}k`}else return e.toString()}function gt(e){let t=E("script",{src:e});return H(()=>(document.head.appendChild(t),T(d(t,"load"),d(t,"error").pipe(v(()=>Ar(()=>new ReferenceError(`Invalid script: ${e}`))))).pipe(m(()=>{}),_(()=>document.head.removeChild(t)),ye(1))))}var Ko=new 
g,ma=H(()=>typeof ResizeObserver=="undefined"?gt("https://unpkg.com/resize-observer-polyfill"):$(void 0)).pipe(m(()=>new ResizeObserver(e=>{for(let t of e)Ko.next(t)})),v(e=>T(qe,$(e)).pipe(_(()=>e.disconnect()))),B(1));function pe(e){return{width:e.offsetWidth,height:e.offsetHeight}}function Ee(e){return ma.pipe(y(t=>t.observe(e)),v(t=>Ko.pipe(b(({target:r})=>r===e),_(()=>t.unobserve(e)),m(()=>pe(e)))),q(pe(e)))}function xt(e){return{width:e.scrollWidth,height:e.scrollHeight}}function sr(e){let t=e.parentElement;for(;t&&(e.scrollWidth<=t.scrollWidth&&e.scrollHeight<=t.scrollHeight);)t=(e=t).parentElement;return t?e:void 0}var Qo=new g,fa=H(()=>$(new IntersectionObserver(e=>{for(let t of e)Qo.next(t)},{threshold:0}))).pipe(v(e=>T(qe,$(e)).pipe(_(()=>e.disconnect()))),B(1));function yt(e){return fa.pipe(y(t=>t.observe(e)),v(t=>Qo.pipe(b(({target:r})=>r===e),_(()=>t.unobserve(e)),m(({isIntersecting:r})=>r))))}function Yo(e,t=16){return et(e).pipe(m(({y:r})=>{let o=pe(e),n=xt(e);return r>=n.height-o.height-t}),Y())}var cr={drawer:P("[data-md-toggle=drawer]"),search:P("[data-md-toggle=search]")};function Bo(e){return cr[e].checked}function Be(e,t){cr[e].checked!==t&&cr[e].click()}function We(e){let t=cr[e];return d(t,"change").pipe(m(()=>t.checked),q(t.checked))}function ua(e,t){switch(e.constructor){case HTMLInputElement:return e.type==="radio"?/^Arrow/.test(t):!0;case HTMLSelectElement:case HTMLTextAreaElement:return!0;default:return e.isContentEditable}}function da(){return T(d(window,"compositionstart").pipe(m(()=>!0)),d(window,"compositionend").pipe(m(()=>!1))).pipe(q(!1))}function Go(){let e=d(window,"keydown").pipe(b(t=>!(t.metaKey||t.ctrlKey)),m(t=>({mode:Bo("search")?"search":"global",type:t.key,claim(){t.preventDefault(),t.stopPropagation()}})),b(({mode:t,type:r})=>{if(t==="global"){let o=Re();if(typeof o!="undefined")return!ua(o,r)}return!0}),le());return da().pipe(v(t=>t?L:e))}function ve(){return new URL(location.href)}function 
st(e,t=!1){if(G("navigation.instant")&&!t){let r=E("a",{href:e.href});document.body.appendChild(r),r.click(),r.remove()}else location.href=e.href}function Jo(){return new g}function Xo(){return location.hash.slice(1)}function Zo(e){let t=E("a",{href:e});t.addEventListener("click",r=>r.stopPropagation()),t.click()}function ha(e){return T(d(window,"hashchange"),e).pipe(m(Xo),q(Xo()),b(t=>t.length>0),B(1))}function en(e){return ha(e).pipe(m(t=>me(`[id="${t}"]`)),b(t=>typeof t!="undefined"))}function At(e){let t=matchMedia(e);return nr(r=>t.addListener(()=>r(t.matches))).pipe(q(t.matches))}function tn(){let e=matchMedia("print");return T(d(window,"beforeprint").pipe(m(()=>!0)),d(window,"afterprint").pipe(m(()=>!1))).pipe(q(e.matches))}function Ur(e,t){return e.pipe(v(r=>r?t():L))}function Wr(e,t){return new j(r=>{let o=new XMLHttpRequest;return o.open("GET",`${e}`),o.responseType="blob",o.addEventListener("load",()=>{o.status>=200&&o.status<300?(r.next(o.response),r.complete()):r.error(new Error(o.statusText))}),o.addEventListener("error",()=>{r.error(new Error("Network error"))}),o.addEventListener("abort",()=>{r.complete()}),typeof(t==null?void 0:t.progress$)!="undefined"&&(o.addEventListener("progress",n=>{var i;if(n.lengthComputable)t.progress$.next(n.loaded/n.total*100);else{let a=(i=o.getResponseHeader("Content-Length"))!=null?i:0;t.progress$.next(n.loaded/+a*100)}}),t.progress$.next(5)),o.send(),()=>o.abort()})}function De(e,t){return Wr(e,t).pipe(v(r=>r.text()),m(r=>JSON.parse(r)),B(1))}function rn(e,t){let r=new DOMParser;return Wr(e,t).pipe(v(o=>o.text()),m(o=>r.parseFromString(o,"text/html")),B(1))}function on(e,t){let r=new DOMParser;return Wr(e,t).pipe(v(o=>o.text()),m(o=>r.parseFromString(o,"text/xml")),B(1))}function nn(){return{x:Math.max(0,scrollX),y:Math.max(0,scrollY)}}function an(){return T(d(window,"scroll",{passive:!0}),d(window,"resize",{passive:!0})).pipe(m(nn),q(nn()))}function sn(){return{width:innerWidth,height:innerHeight}}function 
cn(){return d(window,"resize",{passive:!0}).pipe(m(sn),q(sn()))}function pn(){return Q([an(),cn()]).pipe(m(([e,t])=>({offset:e,size:t})),B(1))}function pr(e,{viewport$:t,header$:r}){let o=t.pipe(X("size")),n=Q([o,r]).pipe(m(()=>Ue(e)));return Q([r,t,n]).pipe(m(([{height:i},{offset:a,size:s},{x:c,y:p}])=>({offset:{x:a.x-c,y:a.y-p+i},size:s})))}function ba(e){return d(e,"message",t=>t.data)}function va(e){let t=new g;return t.subscribe(r=>e.postMessage(r)),t}function ln(e,t=new Worker(e)){let r=ba(t),o=va(t),n=new g;n.subscribe(o);let i=o.pipe(ee(),oe(!0));return n.pipe(ee(),$e(r.pipe(U(i))),le())}var ga=P("#__config"),Et=JSON.parse(ga.textContent);Et.base=`${new URL(Et.base,ve())}`;function we(){return Et}function G(e){return Et.features.includes(e)}function ge(e,t){return typeof t!="undefined"?Et.translations[e].replace("#",t.toString()):Et.translations[e]}function Te(e,t=document){return P(`[data-md-component=${e}]`,t)}function ie(e,t=document){return R(`[data-md-component=${e}]`,t)}function xa(e){let t=P(".md-typeset > :first-child",e);return d(t,"click",{once:!0}).pipe(m(()=>P(".md-typeset",e)),m(r=>({hash:__md_hash(r.innerHTML)})))}function mn(e){if(!G("announce.dismiss")||!e.childElementCount)return L;if(!e.hidden){let t=P(".md-typeset",e);__md_hash(t.innerHTML)===__md_get("__announce")&&(e.hidden=!0)}return H(()=>{let t=new g;return t.subscribe(({hash:r})=>{e.hidden=!0,__md_set("__announce",r)}),xa(e).pipe(y(r=>t.next(r)),_(()=>t.complete()),m(r=>F({ref:e},r)))})}function ya(e,{target$:t}){return t.pipe(m(r=>({hidden:r!==e})))}function fn(e,t){let r=new g;return r.subscribe(({hidden:o})=>{e.hidden=o}),ya(e,t).pipe(y(o=>r.next(o)),_(()=>r.complete()),m(o=>F({ref:e},o)))}function Ct(e,t){return t==="inline"?E("div",{class:"md-tooltip md-tooltip--inline",id:e,role:"tooltip"},E("div",{class:"md-tooltip__inner md-typeset"})):E("div",{class:"md-tooltip",id:e,role:"tooltip"},E("div",{class:"md-tooltip__inner md-typeset"}))}function 
un(e,t){if(t=t?`${t}_annotation_${e}`:void 0,t){let r=t?`#${t}`:void 0;return E("aside",{class:"md-annotation",tabIndex:0},Ct(t),E("a",{href:r,class:"md-annotation__index",tabIndex:-1},E("span",{"data-md-annotation-id":e})))}else return E("aside",{class:"md-annotation",tabIndex:0},Ct(t),E("span",{class:"md-annotation__index",tabIndex:-1},E("span",{"data-md-annotation-id":e})))}function dn(e){return E("button",{class:"md-clipboard md-icon",title:ge("clipboard.copy"),"data-clipboard-target":`#${e} > code`})}function Dr(e,t){let r=t&2,o=t&1,n=Object.keys(e.terms).filter(c=>!e.terms[c]).reduce((c,p)=>[...c,E("del",null,p)," "],[]).slice(0,-1),i=we(),a=new URL(e.location,i.base);G("search.highlight")&&a.searchParams.set("h",Object.entries(e.terms).filter(([,c])=>c).reduce((c,[p])=>`${c} ${p}`.trim(),""));let{tags:s}=we();return E("a",{href:`${a}`,class:"md-search-result__link",tabIndex:-1},E("article",{class:"md-search-result__article md-typeset","data-md-score":e.score.toFixed(2)},r>0&&E("div",{class:"md-search-result__icon md-icon"}),r>0&&E("h1",null,e.title),r<=0&&E("h2",null,e.title),o>0&&e.text.length>0&&e.text,e.tags&&e.tags.map(c=>{let p=s?c in s?`md-tag-icon md-tag--${s[c]}`:"md-tag-icon":"";return E("span",{class:`md-tag ${p}`},c)}),o>0&&n.length>0&&E("p",{class:"md-search-result__terms"},ge("search.result.term.missing"),": ",...n)))}function hn(e){let t=e[0].score,r=[...e],o=we(),n=r.findIndex(l=>!`${new URL(l.location,o.base)}`.includes("#")),[i]=r.splice(n,1),a=r.findIndex(l=>l.scoreDr(l,1)),...c.length?[E("details",{class:"md-search-result__more"},E("summary",{tabIndex:-1},E("div",null,c.length>0&&c.length===1?ge("search.result.more.one"):ge("search.result.more.other",c.length))),...c.map(l=>Dr(l,1)))]:[]];return E("li",{class:"md-search-result__item"},p)}function bn(e){return E("ul",{class:"md-source__facts"},Object.entries(e).map(([t,r])=>E("li",{class:`md-source__fact md-source__fact--${t}`},typeof r=="number"?ar(r):r)))}function Nr(e){let 
t=`tabbed-control tabbed-control--${e}`;return E("div",{class:t,hidden:!0},E("button",{class:"tabbed-button",tabIndex:-1,"aria-hidden":"true"}))}function vn(e){return E("div",{class:"md-typeset__scrollwrap"},E("div",{class:"md-typeset__table"},e))}function Ea(e){let t=we(),r=new URL(`../${e.version}/`,t.base);return E("li",{class:"md-version__item"},E("a",{href:`${r}`,class:"md-version__link"},e.title))}function gn(e,t){return e=e.filter(r=>{var o;return!((o=r.properties)!=null&&o.hidden)}),E("div",{class:"md-version"},E("button",{class:"md-version__current","aria-label":ge("select.version")},t.title),E("ul",{class:"md-version__list"},e.map(Ea)))}var wa=0;function Ta(e,t){document.body.append(e);let{width:r}=pe(e);e.style.setProperty("--md-tooltip-width",`${r}px`),e.remove();let o=sr(t),n=typeof o!="undefined"?et(o):$({x:0,y:0}),i=T(vt(t),Vo(t)).pipe(Y());return Q([i,n]).pipe(m(([a,s])=>{let{x:c,y:p}=Ue(t),l=pe(t),f=t.closest("table");return f&&t.parentElement&&(c+=f.offsetLeft+t.parentElement.offsetLeft,p+=f.offsetTop+t.parentElement.offsetTop),{active:a,offset:{x:c-s.x+l.width/2-r/2,y:p-s.y+l.height+8}}}))}function Ge(e){let t=e.title;if(!t.length)return L;let r=`__tooltip_${wa++}`,o=Ct(r,"inline"),n=P(".md-typeset",o);return n.innerHTML=t,H(()=>{let i=new g;return 
i.subscribe({next({offset:a}){o.style.setProperty("--md-tooltip-x",`${a.x}px`),o.style.setProperty("--md-tooltip-y",`${a.y}px`)},complete(){o.style.removeProperty("--md-tooltip-x"),o.style.removeProperty("--md-tooltip-y")}}),T(i.pipe(b(({active:a})=>a)),i.pipe(be(250),b(({active:a})=>!a))).subscribe({next({active:a}){a?(e.insertAdjacentElement("afterend",o),e.setAttribute("aria-describedby",r),e.removeAttribute("title")):(o.remove(),e.removeAttribute("aria-describedby"),e.setAttribute("title",t))},complete(){o.remove(),e.removeAttribute("aria-describedby"),e.setAttribute("title",t)}}),i.pipe(Me(16,de)).subscribe(({active:a})=>{o.classList.toggle("md-tooltip--active",a)}),i.pipe(_t(125,de),b(()=>!!e.offsetParent),m(()=>e.offsetParent.getBoundingClientRect()),m(({x:a})=>a)).subscribe({next(a){a?o.style.setProperty("--md-tooltip-0",`${-a}px`):o.style.removeProperty("--md-tooltip-0")},complete(){o.style.removeProperty("--md-tooltip-0")}}),Ta(o,e).pipe(y(a=>i.next(a)),_(()=>i.complete()),m(a=>F({ref:e},a)))}).pipe(ze(ae))}function Sa(e,t){let r=H(()=>Q([zo(e),et(t)])).pipe(m(([{x:o,y:n},i])=>{let{width:a,height:s}=pe(e);return{x:o-i.x+a/2,y:n-i.y+s/2}}));return vt(e).pipe(v(o=>r.pipe(m(n=>({active:o,offset:n})),ye(+!o||1/0))))}function xn(e,t,{target$:r}){let[o,n]=Array.from(e.children);return H(()=>{let i=new g,a=i.pipe(ee(),oe(!0));return 
i.subscribe({next({offset:s}){e.style.setProperty("--md-tooltip-x",`${s.x}px`),e.style.setProperty("--md-tooltip-y",`${s.y}px`)},complete(){e.style.removeProperty("--md-tooltip-x"),e.style.removeProperty("--md-tooltip-y")}}),yt(e).pipe(U(a)).subscribe(s=>{e.toggleAttribute("data-md-visible",s)}),T(i.pipe(b(({active:s})=>s)),i.pipe(be(250),b(({active:s})=>!s))).subscribe({next({active:s}){s?e.prepend(o):o.remove()},complete(){e.prepend(o)}}),i.pipe(Me(16,de)).subscribe(({active:s})=>{o.classList.toggle("md-tooltip--active",s)}),i.pipe(_t(125,de),b(()=>!!e.offsetParent),m(()=>e.offsetParent.getBoundingClientRect()),m(({x:s})=>s)).subscribe({next(s){s?e.style.setProperty("--md-tooltip-0",`${-s}px`):e.style.removeProperty("--md-tooltip-0")},complete(){e.style.removeProperty("--md-tooltip-0")}}),d(n,"click").pipe(U(a),b(s=>!(s.metaKey||s.ctrlKey))).subscribe(s=>{s.stopPropagation(),s.preventDefault()}),d(n,"mousedown").pipe(U(a),ne(i)).subscribe(([s,{active:c}])=>{var p;if(s.button!==0||s.metaKey||s.ctrlKey)s.preventDefault();else if(c){s.preventDefault();let l=e.parentElement.closest(".md-annotation");l instanceof HTMLElement?l.focus():(p=Re())==null||p.blur()}}),r.pipe(U(a),b(s=>s===o),Ye(125)).subscribe(()=>e.focus()),Sa(e,t).pipe(y(s=>i.next(s)),_(()=>i.complete()),m(s=>F({ref:e},s)))})}function Oa(e){return e.tagName==="CODE"?R(".c, .c1, .cm",e):[e]}function Ma(e){let t=[];for(let r of Oa(e)){let o=[],n=document.createNodeIterator(r,NodeFilter.SHOW_TEXT);for(let i=n.nextNode();i;i=n.nextNode())o.push(i);for(let i of o){let a;for(;a=/(\(\d+\))(!)?/.exec(i.textContent);){let[,s,c]=a;if(typeof c=="undefined"){let p=i.splitText(a.index);i=p.splitText(s.length),t.push(p)}else{i.textContent=s,t.push(i);break}}}}return t}function yn(e,t){t.append(...Array.from(e.childNodes))}function lr(e,t,{target$:r,print$:o}){let n=t.closest("[id]"),i=n==null?void 0:n.id,a=new Map;for(let s of Ma(t)){let[,c]=s.textContent.match(/\((\d+)\)/);me(`:scope > 
li:nth-child(${c})`,e)&&(a.set(c,un(c,i)),s.replaceWith(a.get(c)))}return a.size===0?L:H(()=>{let s=new g,c=s.pipe(ee(),oe(!0)),p=[];for(let[l,f]of a)p.push([P(".md-typeset",f),P(`:scope > li:nth-child(${l})`,e)]);return o.pipe(U(c)).subscribe(l=>{e.hidden=!l,e.classList.toggle("md-annotation-list",l);for(let[f,u]of p)l?yn(f,u):yn(u,f)}),T(...[...a].map(([,l])=>xn(l,t,{target$:r}))).pipe(_(()=>s.complete()),le())})}function En(e){if(e.nextElementSibling){let t=e.nextElementSibling;if(t.tagName==="OL")return t;if(t.tagName==="P"&&!t.children.length)return En(t)}}function wn(e,t){return H(()=>{let r=En(e);return typeof r!="undefined"?lr(r,e,t):L})}var Tn=jt(zr());var La=0;function Sn(e){if(e.nextElementSibling){let t=e.nextElementSibling;if(t.tagName==="OL")return t;if(t.tagName==="P"&&!t.children.length)return Sn(t)}}function _a(e){return Ee(e).pipe(m(({width:t})=>({scrollable:xt(e).width>t})),X("scrollable"))}function On(e,t){let{matches:r}=matchMedia("(hover)"),o=H(()=>{let n=new g,i=n.pipe($r(1));n.subscribe(({scrollable:c})=>{c&&r?e.setAttribute("tabindex","0"):e.removeAttribute("tabindex")});let a=[];if(Tn.default.isSupported()&&(e.closest(".copy")||G("content.code.copy")&&!e.closest(".no-copy"))){let c=e.closest("pre");c.id=`__code_${La++}`;let p=dn(c.id);c.insertBefore(p,e),G("content.tooltips")&&a.push(Ge(p))}let s=e.closest(".highlight");if(s instanceof HTMLElement){let c=Sn(s);if(typeof c!="undefined"&&(s.classList.contains("annotate")||G("content.code.annotate"))){let p=lr(c,e,t);a.push(Ee(s).pipe(U(i),m(({width:l,height:f})=>l&&f),Y(),v(l=>l?p:L)))}}return _a(e).pipe(y(c=>n.next(c)),_(()=>n.complete()),m(c=>F({ref:e},c)),$e(...a))});return G("content.lazy")?yt(e).pipe(b(n=>n),ye(1),v(()=>o)):o}function Aa(e,{target$:t,print$:r}){let o=!0;return T(t.pipe(m(n=>n.closest("details:not([open])")),b(n=>e===n),m(()=>({action:"open",reveal:!0}))),r.pipe(b(n=>n||!o),y(()=>o=e.open),m(n=>({action:n?"open":"close"}))))}function Mn(e,t){return H(()=>{let r=new 
g;return r.subscribe(({action:o,reveal:n})=>{e.toggleAttribute("open",o==="open"),n&&e.scrollIntoView()}),Aa(e,t).pipe(y(o=>r.next(o)),_(()=>r.complete()),m(o=>F({ref:e},o)))})}var Ln=".node circle,.node ellipse,.node path,.node polygon,.node rect{fill:var(--md-mermaid-node-bg-color);stroke:var(--md-mermaid-node-fg-color)}marker{fill:var(--md-mermaid-edge-color)!important}.edgeLabel .label rect{fill:#0000}.label{color:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}.label foreignObject{line-height:normal;overflow:visible}.label div .edgeLabel{color:var(--md-mermaid-label-fg-color)}.edgeLabel,.edgeLabel rect,.label div .edgeLabel{background-color:var(--md-mermaid-label-bg-color)}.edgeLabel,.edgeLabel rect{fill:var(--md-mermaid-label-bg-color);color:var(--md-mermaid-edge-color)}.edgePath .path,.flowchart-link{stroke:var(--md-mermaid-edge-color);stroke-width:.05rem}.edgePath .arrowheadPath{fill:var(--md-mermaid-edge-color);stroke:none}.cluster rect{fill:var(--md-default-fg-color--lightest);stroke:var(--md-default-fg-color--lighter)}.cluster span{color:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}g #flowchart-circleEnd,g #flowchart-circleStart,g #flowchart-crossEnd,g #flowchart-crossStart,g #flowchart-pointEnd,g #flowchart-pointStart{stroke:none}g.classGroup line,g.classGroup rect{fill:var(--md-mermaid-node-bg-color);stroke:var(--md-mermaid-node-fg-color)}g.classGroup text{fill:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}.classLabel .box{fill:var(--md-mermaid-label-bg-color);background-color:var(--md-mermaid-label-bg-color);opacity:1}.classLabel .label{fill:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}.node .divider{stroke:var(--md-mermaid-node-fg-color)}.relation{stroke:var(--md-mermaid-edge-color)}.cardinality{fill:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}.cardinality text{fill:inherit!important}defs 
#classDiagram-compositionEnd,defs #classDiagram-compositionStart,defs #classDiagram-dependencyEnd,defs #classDiagram-dependencyStart,defs #classDiagram-extensionEnd,defs #classDiagram-extensionStart{fill:var(--md-mermaid-edge-color)!important;stroke:var(--md-mermaid-edge-color)!important}defs #classDiagram-aggregationEnd,defs #classDiagram-aggregationStart{fill:var(--md-mermaid-label-bg-color)!important;stroke:var(--md-mermaid-edge-color)!important}g.stateGroup rect{fill:var(--md-mermaid-node-bg-color);stroke:var(--md-mermaid-node-fg-color)}g.stateGroup .state-title{fill:var(--md-mermaid-label-fg-color)!important;font-family:var(--md-mermaid-font-family)}g.stateGroup .composit{fill:var(--md-mermaid-label-bg-color)}.nodeLabel,.nodeLabel p{color:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}.node circle.state-end,.node circle.state-start,.start-state{fill:var(--md-mermaid-edge-color);stroke:none}.end-state-inner,.end-state-outer{fill:var(--md-mermaid-edge-color)}.end-state-inner,.node circle.state-end{stroke:var(--md-mermaid-label-bg-color)}.transition{stroke:var(--md-mermaid-edge-color)}[id^=state-fork] rect,[id^=state-join] rect{fill:var(--md-mermaid-edge-color)!important;stroke:none!important}.statediagram-cluster.statediagram-cluster .inner{fill:var(--md-default-bg-color)}.statediagram-cluster rect{fill:var(--md-mermaid-node-bg-color);stroke:var(--md-mermaid-node-fg-color)}.statediagram-state rect.divider{fill:var(--md-default-fg-color--lightest);stroke:var(--md-default-fg-color--lighter)}defs 
#statediagram-barbEnd{stroke:var(--md-mermaid-edge-color)}.attributeBoxEven,.attributeBoxOdd{fill:var(--md-mermaid-node-bg-color);stroke:var(--md-mermaid-node-fg-color)}.entityBox{fill:var(--md-mermaid-label-bg-color);stroke:var(--md-mermaid-node-fg-color)}.entityLabel{fill:var(--md-mermaid-label-fg-color);font-family:var(--md-mermaid-font-family)}.relationshipLabelBox{fill:var(--md-mermaid-label-bg-color);fill-opacity:1;background-color:var(--md-mermaid-label-bg-color);opacity:1}.relationshipLabel{fill:var(--md-mermaid-label-fg-color)}.relationshipLine{stroke:var(--md-mermaid-edge-color)}defs #ONE_OR_MORE_END *,defs #ONE_OR_MORE_START *,defs #ONLY_ONE_END *,defs #ONLY_ONE_START *,defs #ZERO_OR_MORE_END *,defs #ZERO_OR_MORE_START *,defs #ZERO_OR_ONE_END *,defs #ZERO_OR_ONE_START *{stroke:var(--md-mermaid-edge-color)!important}defs #ZERO_OR_MORE_END circle,defs #ZERO_OR_MORE_START circle{fill:var(--md-mermaid-label-bg-color)}.actor{fill:var(--md-mermaid-sequence-actor-bg-color);stroke:var(--md-mermaid-sequence-actor-border-color)}text.actor>tspan{fill:var(--md-mermaid-sequence-actor-fg-color);font-family:var(--md-mermaid-font-family)}line{stroke:var(--md-mermaid-sequence-actor-line-color)}.actor-man circle,.actor-man line{fill:var(--md-mermaid-sequence-actorman-bg-color);stroke:var(--md-mermaid-sequence-actorman-line-color)}.messageLine0,.messageLine1{stroke:var(--md-mermaid-sequence-message-line-color)}.note{fill:var(--md-mermaid-sequence-note-bg-color);stroke:var(--md-mermaid-sequence-note-border-color)}.loopText,.loopText>tspan,.messageText,.noteText>tspan{stroke:none;font-family:var(--md-mermaid-font-family)!important}.messageText{fill:var(--md-mermaid-sequence-message-fg-color)}.loopText,.loopText>tspan{fill:var(--md-mermaid-sequence-loop-fg-color)}.noteText>tspan{fill:var(--md-mermaid-sequence-note-fg-color)}#arrowhead 
path{fill:var(--md-mermaid-sequence-message-line-color);stroke:none}.loopLine{fill:var(--md-mermaid-sequence-loop-bg-color);stroke:var(--md-mermaid-sequence-loop-border-color)}.labelBox{fill:var(--md-mermaid-sequence-label-bg-color);stroke:none}.labelText,.labelText>span{fill:var(--md-mermaid-sequence-label-fg-color);font-family:var(--md-mermaid-font-family)}.sequenceNumber{fill:var(--md-mermaid-sequence-number-fg-color)}rect.rect{fill:var(--md-mermaid-sequence-box-bg-color);stroke:none}rect.rect+text.text{fill:var(--md-mermaid-sequence-box-fg-color)}defs #sequencenumber{fill:var(--md-mermaid-sequence-number-bg-color)!important}";var qr,ka=0;function Ha(){return typeof mermaid=="undefined"||mermaid instanceof Element?gt("https://unpkg.com/mermaid@10.7.0/dist/mermaid.min.js"):$(void 0)}function _n(e){return e.classList.remove("mermaid"),qr||(qr=Ha().pipe(y(()=>mermaid.initialize({startOnLoad:!1,themeCSS:Ln,sequence:{actorFontSize:"16px",messageFontSize:"16px",noteFontSize:"16px"}})),m(()=>{}),B(1))),qr.subscribe(()=>ro(this,null,function*(){e.classList.add("mermaid");let t=`__mermaid_${ka++}`,r=E("div",{class:"mermaid"}),o=e.textContent,{svg:n,fn:i}=yield mermaid.render(t,o),a=r.attachShadow({mode:"closed"});a.innerHTML=n,e.replaceWith(r),i==null||i(a)})),qr.pipe(m(()=>({ref:e})))}var An=E("table");function Cn(e){return e.replaceWith(An),An.replaceWith(vn(e)),$({ref:e})}function $a(e){let t=e.find(r=>r.checked)||e[0];return T(...e.map(r=>d(r,"change").pipe(m(()=>P(`label[for="${r.id}"]`))))).pipe(q(P(`label[for="${t.id}"]`)),m(r=>({active:r})))}function kn(e,{viewport$:t,target$:r}){let o=P(".tabbed-labels",e),n=R(":scope > input",e),i=Nr("prev");e.append(i);let a=Nr("next");return e.append(a),H(()=>{let s=new g,c=s.pipe(ee(),oe(!0));Q([s,Ee(e)]).pipe(U(c),Me(1,de)).subscribe({next([{active:p},l]){let f=Ue(p),{width:u}=pe(p);e.style.setProperty("--md-indicator-x",`${f.x}px`),e.style.setProperty("--md-indicator-width",`${u}px`);let 
h=ir(o);(f.x<h.x||f.x+u>h.x+l.width)&&o.scrollTo({left:Math.max(0,f.x-16),behavior:"smooth"})},complete(){e.style.removeProperty("--md-indicator-x"),e.style.removeProperty("--md-indicator-width")}}),Q([et(o),Ee(o)]).pipe(U(c)).subscribe(([p,l])=>{let f=xt(o);i.hidden=p.x<16,a.hidden=p.x>f.width-l.width-16}),T(d(i,"click").pipe(m(()=>-1)),d(a,"click").pipe(m(()=>1))).pipe(U(c)).subscribe(p=>{let{width:l}=pe(o);o.scrollBy({left:l*p,behavior:"smooth"})}),r.pipe(U(c),b(p=>n.includes(p))).subscribe(p=>p.click()),o.classList.add("tabbed-labels--linked");for(let p of n){let l=P(`label[for="${p.id}"]`);l.replaceChildren(E("a",{href:`#${l.htmlFor}`,tabIndex:-1},...Array.from(l.childNodes))),d(l.firstElementChild,"click").pipe(U(c),b(f=>!(f.metaKey||f.ctrlKey)),y(f=>{f.preventDefault(),f.stopPropagation()})).subscribe(()=>{history.replaceState({},"",`#${l.htmlFor}`),l.click()})}return G("content.tabs.link")&&s.pipe(Le(1),ne(t)).subscribe(([{active:p},{offset:l}])=>{let f=p.innerText.trim();if(p.hasAttribute("data-md-switching"))p.removeAttribute("data-md-switching");else{let u=e.offsetTop-l.y;for(let w of R("[data-tabs]"))for(let A of R(":scope > input",w)){let Z=P(`label[for="${A.id}"]`);if(Z!==p&&Z.innerText.trim()===f){Z.setAttribute("data-md-switching",""),A.click();break}}window.scrollTo({top:e.offsetTop-u});let h=__md_get("__tabs")||[];__md_set("__tabs",[...new Set([f,...h])])}}),s.pipe(U(c)).subscribe(()=>{for(let p of R("audio, video",e))p.pause()}),$a(n).pipe(y(p=>s.next(p)),_(()=>s.complete()),m(p=>F({ref:e},p)))}).pipe(ze(ae))}function Hn(e,{viewport$:t,target$:r,print$:o}){return T(...R(".annotate:not(.highlight)",e).map(n=>wn(n,{target$:r,print$:o})),...R("pre:not(.mermaid) >
code",e).map(n=>On(n,{target$:r,print$:o})),...R("pre.mermaid",e).map(n=>_n(n)),...R("table:not([class])",e).map(n=>Cn(n)),...R("details",e).map(n=>Mn(n,{target$:r,print$:o})),...R("[data-tabs]",e).map(n=>kn(n,{viewport$:t,target$:r})),...R("[title]",e).filter(()=>G("content.tooltips")).map(n=>Ge(n)))}function Ra(e,{alert$:t}){return t.pipe(v(r=>T($(!0),$(!1).pipe(Ye(2e3))).pipe(m(o=>({message:r,active:o})))))}function $n(e,t){let r=P(".md-typeset",e);return H(()=>{let o=new g;return o.subscribe(({message:n,active:i})=>{e.classList.toggle("md-dialog--active",i),r.textContent=n}),Ra(e,t).pipe(y(n=>o.next(n)),_(()=>o.complete()),m(n=>F({ref:e},n)))})}function Pa({viewport$:e}){if(!G("header.autohide"))return $(!1);let t=e.pipe(m(({offset:{y:n}})=>n),Ke(2,1),m(([n,i])=>[nMath.abs(i-n.y)>100),m(([,[n]])=>n),Y()),o=We("search");return Q([e,o]).pipe(m(([{offset:n},i])=>n.y>400&&!i),Y(),v(n=>n?r:$(!1)),q(!1))}function Rn(e,t){return H(()=>Q([Ee(e),Pa(t)])).pipe(m(([{height:r},o])=>({height:r,hidden:o})),Y((r,o)=>r.height===o.height&&r.hidden===o.hidden),B(1))}function Pn(e,{header$:t,main$:r}){return H(()=>{let o=new g,n=o.pipe(ee(),oe(!0));o.pipe(X("active"),je(t)).subscribe(([{active:a},{hidden:s}])=>{e.classList.toggle("md-header--shadow",a&&!s),e.hidden=s});let i=fe(R("[title]",e)).pipe(b(()=>G("content.tooltips")),re(a=>Ge(a)));return r.subscribe(o),t.pipe(U(n),m(a=>F({ref:e},a)),$e(i.pipe(U(n))))})}function Ia(e,{viewport$:t,header$:r}){return pr(e,{viewport$:t,header$:r}).pipe(m(({offset:{y:o}})=>{let{height:n}=pe(e);return{active:o>=n}}),X("active"))}function In(e,t){return H(()=>{let r=new g;r.subscribe({next({active:n}){e.classList.toggle("md-header__title--active",n)},complete(){e.classList.remove("md-header__title--active")}});let o=me(".md-content h1");return typeof o=="undefined"?L:Ia(o,t).pipe(y(n=>r.next(n)),_(()=>r.complete()),m(n=>F({ref:e},n)))})}function Fn(e,{viewport$:t,header$:r}){let 
o=r.pipe(m(({height:i})=>i),Y()),n=o.pipe(v(()=>Ee(e).pipe(m(({height:i})=>({top:e.offsetTop,bottom:e.offsetTop+i})),X("bottom"))));return Q([o,n,t]).pipe(m(([i,{top:a,bottom:s},{offset:{y:c},size:{height:p}}])=>(p=Math.max(0,p-Math.max(0,a-c,i)-Math.max(0,p+c-s)),{offset:a-i,height:p,active:a-i<=c})),Y((i,a)=>i.offset===a.offset&&i.height===a.height&&i.active===a.active))}function Fa(e){let t=__md_get("__palette")||{index:e.findIndex(o=>matchMedia(o.getAttribute("data-md-color-media")).matches)},r=Math.max(0,Math.min(t.index,e.length-1));return $(...e).pipe(re(o=>d(o,"change").pipe(m(()=>o))),q(e[r]),m(o=>({index:e.indexOf(o),color:{media:o.getAttribute("data-md-color-media"),scheme:o.getAttribute("data-md-color-scheme"),primary:o.getAttribute("data-md-color-primary"),accent:o.getAttribute("data-md-color-accent")}})),B(1))}function jn(e){let t=R("input",e),r=E("meta",{name:"theme-color"});document.head.appendChild(r);let o=E("meta",{name:"color-scheme"});document.head.appendChild(o);let n=At("(prefers-color-scheme: light)");return H(()=>{let i=new g;return i.subscribe(a=>{if(document.body.setAttribute("data-md-color-switching",""),a.color.media==="(prefers-color-scheme)"){let s=matchMedia("(prefers-color-scheme: light)"),c=document.querySelector(s.matches?"[data-md-color-media='(prefers-color-scheme: light)']":"[data-md-color-media='(prefers-color-scheme: dark)']");a.color.scheme=c.getAttribute("data-md-color-scheme"),a.color.primary=c.getAttribute("data-md-color-primary"),a.color.accent=c.getAttribute("data-md-color-accent")}for(let[s,c]of Object.entries(a.color))document.body.setAttribute(`data-md-color-${s}`,c);for(let s=0;sa.key==="Enter"),ne(i,(a,s)=>s)).subscribe(({index:a})=>{a=(a+1)%t.length,t[a].click(),t[a].focus()}),i.pipe(m(()=>{let a=Te("header"),s=window.getComputedStyle(a);return 
o.content=s.colorScheme,s.backgroundColor.match(/\d+/g).map(c=>(+c).toString(16).padStart(2,"0")).join("")})).subscribe(a=>r.content=`#${a}`),i.pipe(Oe(ae)).subscribe(()=>{document.body.removeAttribute("data-md-color-switching")}),Fa(t).pipe(U(n.pipe(Le(1))),at(),y(a=>i.next(a)),_(()=>i.complete()),m(a=>F({ref:e},a)))})}function Un(e,{progress$:t}){return H(()=>{let r=new g;return r.subscribe(({value:o})=>{e.style.setProperty("--md-progress-value",`${o}`)}),t.pipe(y(o=>r.next({value:o})),_(()=>r.complete()),m(o=>({ref:e,value:o})))})}var Kr=jt(zr());function ja(e){e.setAttribute("data-md-copying","");let t=e.closest("[data-copy]"),r=t?t.getAttribute("data-copy"):e.innerText;return e.removeAttribute("data-md-copying"),r.trimEnd()}function Wn({alert$:e}){Kr.default.isSupported()&&new j(t=>{new Kr.default("[data-clipboard-target], [data-clipboard-text]",{text:r=>r.getAttribute("data-clipboard-text")||ja(P(r.getAttribute("data-clipboard-target")))}).on("success",r=>t.next(r))}).pipe(y(t=>{t.trigger.focus()}),m(()=>ge("clipboard.copied"))).subscribe(e)}function Dn(e,t){return e.protocol=t.protocol,e.hostname=t.hostname,e}function Ua(e,t){let r=new Map;for(let o of R("url",e)){let n=P("loc",o),i=[Dn(new URL(n.textContent),t)];r.set(`${i[0]}`,i);for(let a of R("[rel=alternate]",o)){let s=a.getAttribute("href");s!=null&&i.push(Dn(new URL(s),t))}}return r}function mr(e){return on(new URL("sitemap.xml",e)).pipe(m(t=>Ua(t,new URL(e))),he(()=>$(new Map)))}function Wa(e,t){if(!(e.target instanceof Element))return L;let r=e.target.closest("a");if(r===null)return L;if(r.target||e.metaKey||e.ctrlKey)return L;let o=new URL(r.href);return o.search=o.hash="",t.has(`${o}`)?(e.preventDefault(),$(new URL(r.href))):L}function Nn(e){let t=new Map;for(let r of R(":scope > *",e.head))t.set(r.outerHTML,r);return t}function Vn(e){for(let t of R("[href], [src]",e))for(let r of["href","src"]){let o=t.getAttribute(r);if(o&&!/^(?:[a-z]+:)?\/\//i.test(o)){t[r]=t[r];break}}return $(e)}function 
Da(e){for(let o of["[data-md-component=announce]","[data-md-component=container]","[data-md-component=header-topic]","[data-md-component=outdated]","[data-md-component=logo]","[data-md-component=skip]",...G("navigation.tabs.sticky")?["[data-md-component=tabs]"]:[]]){let n=me(o),i=me(o,e);typeof n!="undefined"&&typeof i!="undefined"&&n.replaceWith(i)}let t=Nn(document);for(let[o,n]of Nn(e))t.has(o)?t.delete(o):document.head.appendChild(n);for(let o of t.values()){let n=o.getAttribute("name");n!=="theme-color"&&n!=="color-scheme"&&o.remove()}let r=Te("container");return Fe(R("script",r)).pipe(v(o=>{let n=e.createElement("script");if(o.src){for(let i of o.getAttributeNames())n.setAttribute(i,o.getAttribute(i));return o.replaceWith(n),new j(i=>{n.onload=()=>i.complete()})}else return n.textContent=o.textContent,o.replaceWith(n),L}),ee(),oe(document))}function zn({location$:e,viewport$:t,progress$:r}){let o=we();if(location.protocol==="file:")return L;let n=mr(o.base);$(document).subscribe(Vn);let i=d(document.body,"click").pipe(je(n),v(([c,p])=>Wa(c,p)),le()),a=d(window,"popstate").pipe(m(ve),le());i.pipe(ne(t)).subscribe(([c,{offset:p}])=>{history.replaceState(p,""),history.pushState(null,"",c)}),T(i,a).subscribe(e);let s=e.pipe(X("pathname"),v(c=>rn(c,{progress$:r}).pipe(he(()=>(st(c,!0),L)))),v(Vn),v(Da),le());return T(s.pipe(ne(e,(c,p)=>p)),e.pipe(X("pathname"),v(()=>e),X("hash")),e.pipe(Y((c,p)=>c.pathname===p.pathname&&c.hash===p.hash),v(()=>i),y(()=>history.back()))).subscribe(c=>{var p,l;history.state!==null||!c.hash?window.scrollTo(0,(l=(p=history.state)==null?void 0:p.y)!=null?l:0):(history.scrollRestoration="auto",Zo(c.hash),history.scrollRestoration="manual")}),e.subscribe(()=>{history.scrollRestoration="manual"}),d(window,"beforeunload").subscribe(()=>{history.scrollRestoration="auto"}),t.pipe(X("offset"),be(100)).subscribe(({offset:c})=>{history.replaceState(c,"")}),s}var Qn=jt(Kn());function Yn(e){let 
t=e.separator.split("|").map(n=>n.replace(/(\(\?[!=<][^)]+\))/g,"").length===0?"\uFFFD":n).join("|"),r=new RegExp(t,"img"),o=(n,i,a)=>`${i}${a}`;return n=>{n=n.replace(/[\s*+\-:~^]+/g," ").trim();let i=new RegExp(`(^|${e.separator}|)(${n.replace(/[|\\{}()[\]^$+*?.-]/g,"\\$&").replace(r,"|")})`,"img");return a=>(0,Qn.default)(a).replace(i,o).replace(/<\/mark>(\s+)]*>/img,"$1")}}function Ht(e){return e.type===1}function fr(e){return e.type===3}function Bn(e,t){let r=ln(e);return T($(location.protocol!=="file:"),We("search")).pipe(He(o=>o),v(()=>t)).subscribe(({config:o,docs:n})=>r.next({type:0,data:{config:o,docs:n,options:{suggest:G("search.suggest")}}})),r}function Gn({document$:e}){let t=we(),r=De(new URL("../versions.json",t.base)).pipe(he(()=>L)),o=r.pipe(m(n=>{let[,i]=t.base.match(/([^/]+)\/?$/);return n.find(({version:a,aliases:s})=>a===i||s.includes(i))||n[0]}));r.pipe(m(n=>new Map(n.map(i=>[`${new URL(`../${i.version}/`,t.base)}`,i]))),v(n=>d(document.body,"click").pipe(b(i=>!i.metaKey&&!i.ctrlKey),ne(o),v(([i,a])=>{if(i.target instanceof Element){let s=i.target.closest("a");if(s&&!s.target&&n.has(s.href)){let c=s.href;return!i.target.closest(".md-version")&&n.get(c)===a?L:(i.preventDefault(),$(c))}}return L}),v(i=>{let{version:a}=n.get(i);return mr(new URL(i)).pipe(m(s=>{let p=ve().href.replace(t.base,"");return s.has(p.split("#")[0])?new URL(`../${a}/${p}`,t.base):new URL(i)}))})))).subscribe(n=>st(n,!0)),Q([r,o]).subscribe(([n,i])=>{P(".md-header__topic").appendChild(gn(n,i))}),e.pipe(v(()=>o)).subscribe(n=>{var a;let i=__md_get("__outdated",sessionStorage);if(i===null){i=!0;let s=((a=t.version)==null?void 0:a.default)||"latest";Array.isArray(s)||(s=[s]);e:for(let c of s)for(let p of n.aliases.concat(n.version))if(new RegExp(c,"i").test(p)){i=!1;break e}__md_set("__outdated",i,sessionStorage)}if(i)for(let s of ie("outdated"))s.hidden=!1})}function 
Ka(e,{worker$:t}){let{searchParams:r}=ve();r.has("q")&&(Be("search",!0),e.value=r.get("q"),e.focus(),We("search").pipe(He(i=>!i)).subscribe(()=>{let i=ve();i.searchParams.delete("q"),history.replaceState({},"",`${i}`)}));let o=vt(e),n=T(t.pipe(He(Ht)),d(e,"keyup"),o).pipe(m(()=>e.value),Y());return Q([n,o]).pipe(m(([i,a])=>({value:i,focus:a})),B(1))}function Jn(e,{worker$:t}){let r=new g,o=r.pipe(ee(),oe(!0));Q([t.pipe(He(Ht)),r],(i,a)=>a).pipe(X("value")).subscribe(({value:i})=>t.next({type:2,data:i})),r.pipe(X("focus")).subscribe(({focus:i})=>{i&&Be("search",i)}),d(e.form,"reset").pipe(U(o)).subscribe(()=>e.focus());let n=P("header [for=__search]");return d(n,"click").subscribe(()=>e.focus()),Ka(e,{worker$:t}).pipe(y(i=>r.next(i)),_(()=>r.complete()),m(i=>F({ref:e},i)),B(1))}function Xn(e,{worker$:t,query$:r}){let o=new g,n=Yo(e.parentElement).pipe(b(Boolean)),i=e.parentElement,a=P(":scope > :first-child",e),s=P(":scope > :last-child",e);We("search").subscribe(l=>s.setAttribute("role",l?"list":"presentation")),o.pipe(ne(r),Ir(t.pipe(He(Ht)))).subscribe(([{items:l},{value:f}])=>{switch(l.length){case 0:a.textContent=f.length?ge("search.result.none"):ge("search.result.placeholder");break;case 1:a.textContent=ge("search.result.one");break;default:let u=ar(l.length);a.textContent=ge("search.result.other",u)}});let c=o.pipe(y(()=>s.innerHTML=""),v(({items:l})=>T($(...l.slice(0,10)),$(...l.slice(10)).pipe(Ke(4),jr(n),v(([f])=>f)))),m(hn),le());return c.subscribe(l=>s.appendChild(l)),c.pipe(re(l=>{let f=me("details",l);return typeof f=="undefined"?L:d(f,"toggle").pipe(U(o),m(()=>f))})).subscribe(l=>{l.open===!1&&l.offsetTop<=i.scrollTop&&i.scrollTo({top:l.offsetTop})}),t.pipe(b(fr),m(({data:l})=>l)).pipe(y(l=>o.next(l)),_(()=>o.complete()),m(l=>F({ref:e},l)))}function Qa(e,{query$:t}){return t.pipe(m(({value:r})=>{let o=ve();return o.hash="",r=r.replace(/\s+/g,"+").replace(/&/g,"%26").replace(/=/g,"%3D"),o.search=`q=${r}`,{url:o}}))}function Zn(e,t){let r=new 
g,o=r.pipe(ee(),oe(!0));return r.subscribe(({url:n})=>{e.setAttribute("data-clipboard-text",e.href),e.href=`${n}`}),d(e,"click").pipe(U(o)).subscribe(n=>n.preventDefault()),Qa(e,t).pipe(y(n=>r.next(n)),_(()=>r.complete()),m(n=>F({ref:e},n)))}function ei(e,{worker$:t,keyboard$:r}){let o=new g,n=Te("search-query"),i=T(d(n,"keydown"),d(n,"focus")).pipe(Oe(ae),m(()=>n.value),Y());return o.pipe(je(i),m(([{suggest:s},c])=>{let p=c.split(/([\s-]+)/);if(s!=null&&s.length&&p[p.length-1]){let l=s[s.length-1];l.startsWith(p[p.length-1])&&(p[p.length-1]=l)}else p.length=0;return p})).subscribe(s=>e.innerHTML=s.join("").replace(/\s/g," ")),r.pipe(b(({mode:s})=>s==="search")).subscribe(s=>{switch(s.type){case"ArrowRight":e.innerText.length&&n.selectionStart===n.value.length&&(n.value=e.innerText);break}}),t.pipe(b(fr),m(({data:s})=>s)).pipe(y(s=>o.next(s)),_(()=>o.complete()),m(()=>({ref:e})))}function ti(e,{index$:t,keyboard$:r}){let o=we();try{let n=Bn(o.search,t),i=Te("search-query",e),a=Te("search-result",e);d(e,"click").pipe(b(({target:c})=>c instanceof Element&&!!c.closest("a"))).subscribe(()=>Be("search",!1)),r.pipe(b(({mode:c})=>c==="search")).subscribe(c=>{let p=Re();switch(c.type){case"Enter":if(p===i){let l=new Map;for(let f of R(":first-child [href]",a)){let u=f.firstElementChild;l.set(f,parseFloat(u.getAttribute("data-md-score")))}if(l.size){let[[f]]=[...l].sort(([,u],[,h])=>h-u);f.click()}c.claim()}break;case"Escape":case"Tab":Be("search",!1),i.blur();break;case"ArrowUp":case"ArrowDown":if(typeof p=="undefined")i.focus();else{let l=[i,...R(":not(details) > [href], summary, details[open] [href]",a)],f=Math.max(0,(Math.max(0,l.indexOf(p))+l.length+(c.type==="ArrowUp"?-1:1))%l.length);l[f].focus()}c.claim();break;default:i!==Re()&&i.focus()}}),r.pipe(b(({mode:c})=>c==="global")).subscribe(c=>{switch(c.type){case"f":case"s":case"/":i.focus(),i.select(),c.claim();break}});let s=Jn(i,{worker$:n});return 
T(s,Xn(a,{worker$:n,query$:s})).pipe($e(...ie("search-share",e).map(c=>Zn(c,{query$:s})),...ie("search-suggest",e).map(c=>ei(c,{worker$:n,keyboard$:r}))))}catch(n){return e.hidden=!0,qe}}function ri(e,{index$:t,location$:r}){return Q([t,r.pipe(q(ve()),b(o=>!!o.searchParams.get("h")))]).pipe(m(([o,n])=>Yn(o.config)(n.searchParams.get("h"))),m(o=>{var a;let n=new Map,i=document.createNodeIterator(e,NodeFilter.SHOW_TEXT);for(let s=i.nextNode();s;s=i.nextNode())if((a=s.parentElement)!=null&&a.offsetHeight){let c=s.textContent,p=o(c);p.length>c.length&&n.set(s,p)}for(let[s,c]of n){let{childNodes:p}=E("span",null,c);s.replaceWith(...Array.from(p))}return{ref:e,nodes:n}}))}function Ya(e,{viewport$:t,main$:r}){let o=e.closest(".md-grid"),n=o.offsetTop-o.parentElement.offsetTop;return Q([r,t]).pipe(m(([{offset:i,height:a},{offset:{y:s}}])=>(a=a+Math.min(n,Math.max(0,s-i))-n,{height:a,locked:s>=i+n})),Y((i,a)=>i.height===a.height&&i.locked===a.locked))}function Qr(e,o){var n=o,{header$:t}=n,r=to(n,["header$"]);let i=P(".md-sidebar__scrollwrap",e),{y:a}=Ue(i);return H(()=>{let s=new g,c=s.pipe(ee(),oe(!0)),p=s.pipe(Me(0,de));return p.pipe(ne(t)).subscribe({next([{height:l},{height:f}]){i.style.height=`${l-2*a}px`,e.style.top=`${f}px`},complete(){i.style.height="",e.style.top=""}}),p.pipe(He()).subscribe(()=>{for(let l of R(".md-nav__link--active[href]",e)){if(!l.clientHeight)continue;let f=l.closest(".md-sidebar__scrollwrap");if(typeof f!="undefined"){let u=l.offsetTop-f.offsetTop,{height:h}=pe(f);f.scrollTo({top:u-h/2})}}}),fe(R("label[tabindex]",e)).pipe(re(l=>d(l,"click").pipe(Oe(ae),m(()=>l),U(c)))).subscribe(l=>{let f=P(`[id="${l.htmlFor}"]`);P(`[aria-labelledby="${l.id}"]`).setAttribute("aria-expanded",`${f.checked}`)}),Ya(e,r).pipe(y(l=>s.next(l)),_(()=>s.complete()),m(l=>F({ref:e},l)))})}function oi(e,t){if(typeof t!="undefined"){let r=`https://api.github.com/repos/${e}/${t}`;return 
Lt(De(`${r}/releases/latest`).pipe(he(()=>L),m(o=>({version:o.tag_name})),Qe({})),De(r).pipe(he(()=>L),m(o=>({stars:o.stargazers_count,forks:o.forks_count})),Qe({}))).pipe(m(([o,n])=>F(F({},o),n)))}else{let r=`https://api.github.com/users/${e}`;return De(r).pipe(m(o=>({repositories:o.public_repos})),Qe({}))}}function ni(e,t){let r=`https://${e}/api/v4/projects/${encodeURIComponent(t)}`;return De(r).pipe(he(()=>L),m(({star_count:o,forks_count:n})=>({stars:o,forks:n})),Qe({}))}function ii(e){let t=e.match(/^.+github\.com\/([^/]+)\/?([^/]+)?/i);if(t){let[,r,o]=t;return oi(r,o)}if(t=e.match(/^.+?([^/]*gitlab[^/]+)\/(.+?)\/?$/i),t){let[,r,o]=t;return ni(r,o)}return L}var Ba;function Ga(e){return Ba||(Ba=H(()=>{let t=__md_get("__source",sessionStorage);if(t)return $(t);if(ie("consent").length){let o=__md_get("__consent");if(!(o&&o.github))return L}return ii(e.href).pipe(y(o=>__md_set("__source",o,sessionStorage)))}).pipe(he(()=>L),b(t=>Object.keys(t).length>0),m(t=>({facts:t})),B(1)))}function ai(e){let t=P(":scope > :last-child",e);return H(()=>{let r=new g;return r.subscribe(({facts:o})=>{t.appendChild(bn(o)),t.classList.add("md-source__repository--active")}),Ga(e).pipe(y(o=>r.next(o)),_(()=>r.complete()),m(o=>F({ref:e},o)))})}function Ja(e,{viewport$:t,header$:r}){return Ee(document.body).pipe(v(()=>pr(e,{header$:r,viewport$:t})),m(({offset:{y:o}})=>({hidden:o>=10})),X("hidden"))}function si(e,t){return H(()=>{let r=new g;return r.subscribe({next({hidden:o}){e.hidden=o},complete(){e.hidden=!1}}),(G("navigation.tabs.sticky")?$({hidden:!1}):Ja(e,t)).pipe(y(o=>r.next(o)),_(()=>r.complete()),m(o=>F({ref:e},o)))})}function Xa(e,{viewport$:t,header$:r}){let o=new Map,n=R(".md-nav__link",e);for(let s of n){let c=decodeURIComponent(s.hash.substring(1)),p=me(`[id="${c}"]`);typeof p!="undefined"&&o.set(s,p)}let i=r.pipe(X("height"),m(({height:s})=>{let c=Te("main"),p=P(":scope > :first-child",c);return s+.8*(p.offsetTop-c.offsetTop)}),le());return 
Ee(document.body).pipe(X("height"),v(s=>H(()=>{let c=[];return $([...o].reduce((p,[l,f])=>{for(;c.length&&o.get(c[c.length-1]).tagName>=f.tagName;)c.pop();let u=f.offsetTop;for(;!u&&f.parentElement;)f=f.parentElement,u=f.offsetTop;let h=f.offsetParent;for(;h;h=h.offsetParent)u+=h.offsetTop;return p.set([...c=[...c,l]].reverse(),u)},new Map))}).pipe(m(c=>new Map([...c].sort(([,p],[,l])=>p-l))),je(i),v(([c,p])=>t.pipe(Rr(([l,f],{offset:{y:u},size:h})=>{let w=u+h.height>=Math.floor(s.height);for(;f.length;){let[,A]=f[0];if(A-p=u&&!w)f=[l.pop(),...f];else break}return[l,f]},[[],[...c]]),Y((l,f)=>l[0]===f[0]&&l[1]===f[1])))))).pipe(m(([s,c])=>({prev:s.map(([p])=>p),next:c.map(([p])=>p)})),q({prev:[],next:[]}),Ke(2,1),m(([s,c])=>s.prev.length{let i=new g,a=i.pipe(ee(),oe(!0));if(i.subscribe(({prev:s,next:c})=>{for(let[p]of c)p.classList.remove("md-nav__link--passed"),p.classList.remove("md-nav__link--active");for(let[p,[l]]of s.entries())l.classList.add("md-nav__link--passed"),l.classList.toggle("md-nav__link--active",p===s.length-1)}),G("toc.follow")){let s=T(t.pipe(be(1),m(()=>{})),t.pipe(be(250),m(()=>"smooth")));i.pipe(b(({prev:c})=>c.length>0),je(o.pipe(Oe(ae))),ne(s)).subscribe(([[{prev:c}],p])=>{let[l]=c[c.length-1];if(l.offsetHeight){let f=sr(l);if(typeof f!="undefined"){let u=l.offsetTop-f.offsetTop,{height:h}=pe(f);f.scrollTo({top:u-h/2,behavior:p})}}})}return G("navigation.tracking")&&t.pipe(U(a),X("offset"),be(250),Le(1),U(n.pipe(Le(1))),at({delay:250}),ne(i)).subscribe(([,{prev:s}])=>{let c=ve(),p=s[s.length-1];if(p&&p.length){let[l]=p,{hash:f}=new URL(l.href);c.hash!==f&&(c.hash=f,history.replaceState({},"",`${c}`))}else c.hash="",history.replaceState({},"",`${c}`)}),Xa(e,{viewport$:t,header$:r}).pipe(y(s=>i.next(s)),_(()=>i.complete()),m(s=>F({ref:e},s)))})}function Za(e,{viewport$:t,main$:r,target$:o}){let n=t.pipe(m(({offset:{y:a}})=>a),Ke(2,1),m(([a,s])=>a>s&&s>0),Y()),i=r.pipe(m(({active:a})=>a));return 
Q([i,n]).pipe(m(([a,s])=>!(a&&s)),Y(),U(o.pipe(Le(1))),oe(!0),at({delay:250}),m(a=>({hidden:a})))}function pi(e,{viewport$:t,header$:r,main$:o,target$:n}){let i=new g,a=i.pipe(ee(),oe(!0));return i.subscribe({next({hidden:s}){e.hidden=s,s?(e.setAttribute("tabindex","-1"),e.blur()):e.removeAttribute("tabindex")},complete(){e.style.top="",e.hidden=!0,e.removeAttribute("tabindex")}}),r.pipe(U(a),X("height")).subscribe(({height:s})=>{e.style.top=`${s+16}px`}),d(e,"click").subscribe(s=>{s.preventDefault(),window.scrollTo({top:0})}),Za(e,{viewport$:t,main$:o,target$:n}).pipe(y(s=>i.next(s)),_(()=>i.complete()),m(s=>F({ref:e},s)))}function li({document$:e}){e.pipe(v(()=>R(".md-ellipsis")),re(t=>yt(t).pipe(U(e.pipe(Le(1))),b(r=>r),m(()=>t),ye(1))),b(t=>t.offsetWidth{let r=t.innerText,o=t.closest("a")||t;return o.title=r,Ge(o).pipe(U(e.pipe(Le(1))),_(()=>o.removeAttribute("title")))})).subscribe(),e.pipe(v(()=>R(".md-status")),re(t=>Ge(t))).subscribe()}function mi({document$:e,tablet$:t}){e.pipe(v(()=>R(".md-toggle--indeterminate")),y(r=>{r.indeterminate=!0,r.checked=!1}),re(r=>d(r,"change").pipe(Fr(()=>r.classList.contains("md-toggle--indeterminate")),m(()=>r))),ne(t)).subscribe(([r,o])=>{r.classList.remove("md-toggle--indeterminate"),o&&(r.checked=!1)})}function es(){return/(iPad|iPhone|iPod)/.test(navigator.userAgent)}function fi({document$:e}){e.pipe(v(()=>R("[data-md-scrollfix]")),y(t=>t.removeAttribute("data-md-scrollfix")),b(es),re(t=>d(t,"touchstart").pipe(m(()=>t)))).subscribe(t=>{let r=t.scrollTop;r===0?t.scrollTop=1:r+t.offsetHeight===t.scrollHeight&&(t.scrollTop=r-1)})}function ui({viewport$:e,tablet$:t}){Q([We("search"),t]).pipe(m(([r,o])=>r&&!o),v(r=>$(r).pipe(Ye(r?400:100))),ne(e)).subscribe(([r,{offset:{y:o}}])=>{if(r)document.body.setAttribute("data-md-scrolllock",""),document.body.style.top=`-${o}px`;else{let 
n=-1*parseInt(document.body.style.top,10);document.body.removeAttribute("data-md-scrolllock"),document.body.style.top="",n&&window.scrollTo(0,n)}})}Object.entries||(Object.entries=function(e){let t=[];for(let r of Object.keys(e))t.push([r,e[r]]);return t});Object.values||(Object.values=function(e){let t=[];for(let r of Object.keys(e))t.push(e[r]);return t});typeof Element!="undefined"&&(Element.prototype.scrollTo||(Element.prototype.scrollTo=function(e,t){typeof e=="object"?(this.scrollLeft=e.left,this.scrollTop=e.top):(this.scrollLeft=e,this.scrollTop=t)}),Element.prototype.replaceWith||(Element.prototype.replaceWith=function(...e){let t=this.parentNode;if(t){e.length===0&&t.removeChild(this);for(let r=e.length-1;r>=0;r--){let o=e[r];typeof o=="string"?o=document.createTextNode(o):o.parentNode&&o.parentNode.removeChild(o),r?t.insertBefore(this.previousSibling,o):t.replaceChild(o,this)}}}));function ts(){return location.protocol==="file:"?gt(`${new URL("search/search_index.js",Yr.base)}`).pipe(m(()=>__index),B(1)):De(new URL("search/search_index.json",Yr.base))}document.documentElement.classList.remove("no-js");document.documentElement.classList.add("js");var rt=No(),Rt=Jo(),wt=en(Rt),Br=Go(),_e=pn(),ur=At("(min-width: 960px)"),hi=At("(min-width: 1220px)"),bi=tn(),Yr=we(),vi=document.forms.namedItem("search")?ts():qe,Gr=new g;Wn({alert$:Gr});var Jr=new g;G("navigation.instant")&&zn({location$:Rt,viewport$:_e,progress$:Jr}).subscribe(rt);var di;((di=Yr.version)==null?void 0:di.provider)==="mike"&&Gn({document$:rt});T(Rt,wt).pipe(Ye(125)).subscribe(()=>{Be("drawer",!1),Be("search",!1)});Br.pipe(b(({mode:e})=>e==="global")).subscribe(e=>{switch(e.type){case"p":case",":let t=me("link[rel=prev]");typeof t!="undefined"&&st(t);break;case"n":case".":let r=me("link[rel=next]");typeof r!="undefined"&&st(r);break;case"Enter":let o=Re();o instanceof 
HTMLLabelElement&&o.click()}});li({document$:rt});mi({document$:rt,tablet$:ur});fi({document$:rt});ui({viewport$:_e,tablet$:ur});var tt=Rn(Te("header"),{viewport$:_e}),$t=rt.pipe(m(()=>Te("main")),v(e=>Fn(e,{viewport$:_e,header$:tt})),B(1)),rs=T(...ie("consent").map(e=>fn(e,{target$:wt})),...ie("dialog").map(e=>$n(e,{alert$:Gr})),...ie("header").map(e=>Pn(e,{viewport$:_e,header$:tt,main$:$t})),...ie("palette").map(e=>jn(e)),...ie("progress").map(e=>Un(e,{progress$:Jr})),...ie("search").map(e=>ti(e,{index$:vi,keyboard$:Br})),...ie("source").map(e=>ai(e))),os=H(()=>T(...ie("announce").map(e=>mn(e)),...ie("content").map(e=>Hn(e,{viewport$:_e,target$:wt,print$:bi})),...ie("content").map(e=>G("search.highlight")?ri(e,{index$:vi,location$:Rt}):L),...ie("header-title").map(e=>In(e,{viewport$:_e,header$:tt})),...ie("sidebar").map(e=>e.getAttribute("data-md-type")==="navigation"?Ur(hi,()=>Qr(e,{viewport$:_e,header$:tt,main$:$t})):Ur(ur,()=>Qr(e,{viewport$:_e,header$:tt,main$:$t}))),...ie("tabs").map(e=>si(e,{viewport$:_e,header$:tt})),...ie("toc").map(e=>ci(e,{viewport$:_e,header$:tt,main$:$t,target$:wt})),...ie("top").map(e=>pi(e,{viewport$:_e,header$:tt,main$:$t,target$:wt})))),gi=rt.pipe(v(()=>os),$e(rs),B(1));gi.subscribe();window.document$=rt;window.location$=Rt;window.target$=wt;window.keyboard$=Br;window.viewport$=_e;window.tablet$=ur;window.screen$=hi;window.print$=bi;window.alert$=Gr;window.progress$=Jr;window.component$=gi;})(); +//# sourceMappingURL=bundle.1e8ae164.min.js.map + diff --git a/assets/javascripts/bundle.1e8ae164.min.js.map b/assets/javascripts/bundle.1e8ae164.min.js.map new file mode 100644 index 00000000..6c33b8e8 --- /dev/null +++ b/assets/javascripts/bundle.1e8ae164.min.js.map @@ -0,0 +1,7 @@ +{ + "version": 3, + "sources": ["node_modules/focus-visible/dist/focus-visible.js", "node_modules/clipboard/dist/clipboard.js", "node_modules/escape-html/index.js", "src/templates/assets/javascripts/bundle.ts", 
"node_modules/rxjs/node_modules/tslib/tslib.es6.js", "node_modules/rxjs/src/internal/util/isFunction.ts", "node_modules/rxjs/src/internal/util/createErrorClass.ts", "node_modules/rxjs/src/internal/util/UnsubscriptionError.ts", "node_modules/rxjs/src/internal/util/arrRemove.ts", "node_modules/rxjs/src/internal/Subscription.ts", "node_modules/rxjs/src/internal/config.ts", "node_modules/rxjs/src/internal/scheduler/timeoutProvider.ts", "node_modules/rxjs/src/internal/util/reportUnhandledError.ts", "node_modules/rxjs/src/internal/util/noop.ts", "node_modules/rxjs/src/internal/NotificationFactories.ts", "node_modules/rxjs/src/internal/util/errorContext.ts", "node_modules/rxjs/src/internal/Subscriber.ts", "node_modules/rxjs/src/internal/symbol/observable.ts", "node_modules/rxjs/src/internal/util/identity.ts", "node_modules/rxjs/src/internal/util/pipe.ts", "node_modules/rxjs/src/internal/Observable.ts", "node_modules/rxjs/src/internal/util/lift.ts", "node_modules/rxjs/src/internal/operators/OperatorSubscriber.ts", "node_modules/rxjs/src/internal/scheduler/animationFrameProvider.ts", "node_modules/rxjs/src/internal/util/ObjectUnsubscribedError.ts", "node_modules/rxjs/src/internal/Subject.ts", "node_modules/rxjs/src/internal/scheduler/dateTimestampProvider.ts", "node_modules/rxjs/src/internal/ReplaySubject.ts", "node_modules/rxjs/src/internal/scheduler/Action.ts", "node_modules/rxjs/src/internal/scheduler/intervalProvider.ts", "node_modules/rxjs/src/internal/scheduler/AsyncAction.ts", "node_modules/rxjs/src/internal/Scheduler.ts", "node_modules/rxjs/src/internal/scheduler/AsyncScheduler.ts", "node_modules/rxjs/src/internal/scheduler/async.ts", "node_modules/rxjs/src/internal/scheduler/AnimationFrameAction.ts", "node_modules/rxjs/src/internal/scheduler/AnimationFrameScheduler.ts", "node_modules/rxjs/src/internal/scheduler/animationFrame.ts", "node_modules/rxjs/src/internal/observable/empty.ts", "node_modules/rxjs/src/internal/util/isScheduler.ts", 
"node_modules/rxjs/src/internal/util/args.ts", "node_modules/rxjs/src/internal/util/isArrayLike.ts", "node_modules/rxjs/src/internal/util/isPromise.ts", "node_modules/rxjs/src/internal/util/isInteropObservable.ts", "node_modules/rxjs/src/internal/util/isAsyncIterable.ts", "node_modules/rxjs/src/internal/util/throwUnobservableError.ts", "node_modules/rxjs/src/internal/symbol/iterator.ts", "node_modules/rxjs/src/internal/util/isIterable.ts", "node_modules/rxjs/src/internal/util/isReadableStreamLike.ts", "node_modules/rxjs/src/internal/observable/innerFrom.ts", "node_modules/rxjs/src/internal/util/executeSchedule.ts", "node_modules/rxjs/src/internal/operators/observeOn.ts", "node_modules/rxjs/src/internal/operators/subscribeOn.ts", "node_modules/rxjs/src/internal/scheduled/scheduleObservable.ts", "node_modules/rxjs/src/internal/scheduled/schedulePromise.ts", "node_modules/rxjs/src/internal/scheduled/scheduleArray.ts", "node_modules/rxjs/src/internal/scheduled/scheduleIterable.ts", "node_modules/rxjs/src/internal/scheduled/scheduleAsyncIterable.ts", "node_modules/rxjs/src/internal/scheduled/scheduleReadableStreamLike.ts", "node_modules/rxjs/src/internal/scheduled/scheduled.ts", "node_modules/rxjs/src/internal/observable/from.ts", "node_modules/rxjs/src/internal/observable/of.ts", "node_modules/rxjs/src/internal/observable/throwError.ts", "node_modules/rxjs/src/internal/util/EmptyError.ts", "node_modules/rxjs/src/internal/util/isDate.ts", "node_modules/rxjs/src/internal/operators/map.ts", "node_modules/rxjs/src/internal/util/mapOneOrManyArgs.ts", "node_modules/rxjs/src/internal/util/argsArgArrayOrObject.ts", "node_modules/rxjs/src/internal/util/createObject.ts", "node_modules/rxjs/src/internal/observable/combineLatest.ts", "node_modules/rxjs/src/internal/operators/mergeInternals.ts", "node_modules/rxjs/src/internal/operators/mergeMap.ts", "node_modules/rxjs/src/internal/operators/mergeAll.ts", "node_modules/rxjs/src/internal/operators/concatAll.ts", 
"node_modules/rxjs/src/internal/observable/concat.ts", "node_modules/rxjs/src/internal/observable/defer.ts", "node_modules/rxjs/src/internal/observable/fromEvent.ts", "node_modules/rxjs/src/internal/observable/fromEventPattern.ts", "node_modules/rxjs/src/internal/observable/timer.ts", "node_modules/rxjs/src/internal/observable/merge.ts", "node_modules/rxjs/src/internal/observable/never.ts", "node_modules/rxjs/src/internal/util/argsOrArgArray.ts", "node_modules/rxjs/src/internal/operators/filter.ts", "node_modules/rxjs/src/internal/observable/zip.ts", "node_modules/rxjs/src/internal/operators/audit.ts", "node_modules/rxjs/src/internal/operators/auditTime.ts", "node_modules/rxjs/src/internal/operators/bufferCount.ts", "node_modules/rxjs/src/internal/operators/catchError.ts", "node_modules/rxjs/src/internal/operators/scanInternals.ts", "node_modules/rxjs/src/internal/operators/combineLatest.ts", "node_modules/rxjs/src/internal/operators/combineLatestWith.ts", "node_modules/rxjs/src/internal/operators/debounceTime.ts", "node_modules/rxjs/src/internal/operators/defaultIfEmpty.ts", "node_modules/rxjs/src/internal/operators/take.ts", "node_modules/rxjs/src/internal/operators/ignoreElements.ts", "node_modules/rxjs/src/internal/operators/mapTo.ts", "node_modules/rxjs/src/internal/operators/delayWhen.ts", "node_modules/rxjs/src/internal/operators/delay.ts", "node_modules/rxjs/src/internal/operators/distinctUntilChanged.ts", "node_modules/rxjs/src/internal/operators/distinctUntilKeyChanged.ts", "node_modules/rxjs/src/internal/operators/throwIfEmpty.ts", "node_modules/rxjs/src/internal/operators/endWith.ts", "node_modules/rxjs/src/internal/operators/finalize.ts", "node_modules/rxjs/src/internal/operators/first.ts", "node_modules/rxjs/src/internal/operators/takeLast.ts", "node_modules/rxjs/src/internal/operators/merge.ts", "node_modules/rxjs/src/internal/operators/mergeWith.ts", "node_modules/rxjs/src/internal/operators/repeat.ts", 
"node_modules/rxjs/src/internal/operators/scan.ts", "node_modules/rxjs/src/internal/operators/share.ts", "node_modules/rxjs/src/internal/operators/shareReplay.ts", "node_modules/rxjs/src/internal/operators/skip.ts", "node_modules/rxjs/src/internal/operators/skipUntil.ts", "node_modules/rxjs/src/internal/operators/startWith.ts", "node_modules/rxjs/src/internal/operators/switchMap.ts", "node_modules/rxjs/src/internal/operators/takeUntil.ts", "node_modules/rxjs/src/internal/operators/takeWhile.ts", "node_modules/rxjs/src/internal/operators/tap.ts", "node_modules/rxjs/src/internal/operators/throttle.ts", "node_modules/rxjs/src/internal/operators/throttleTime.ts", "node_modules/rxjs/src/internal/operators/withLatestFrom.ts", "node_modules/rxjs/src/internal/operators/zip.ts", "node_modules/rxjs/src/internal/operators/zipWith.ts", "src/templates/assets/javascripts/browser/document/index.ts", "src/templates/assets/javascripts/browser/element/_/index.ts", "src/templates/assets/javascripts/browser/element/focus/index.ts", "src/templates/assets/javascripts/browser/element/hover/index.ts", "src/templates/assets/javascripts/browser/element/offset/_/index.ts", "src/templates/assets/javascripts/browser/element/offset/content/index.ts", "src/templates/assets/javascripts/utilities/h/index.ts", "src/templates/assets/javascripts/utilities/round/index.ts", "src/templates/assets/javascripts/browser/script/index.ts", "src/templates/assets/javascripts/browser/element/size/_/index.ts", "src/templates/assets/javascripts/browser/element/size/content/index.ts", "src/templates/assets/javascripts/browser/element/visibility/index.ts", "src/templates/assets/javascripts/browser/toggle/index.ts", "src/templates/assets/javascripts/browser/keyboard/index.ts", "src/templates/assets/javascripts/browser/location/_/index.ts", "src/templates/assets/javascripts/browser/location/hash/index.ts", "src/templates/assets/javascripts/browser/media/index.ts", 
"src/templates/assets/javascripts/browser/request/index.ts", "src/templates/assets/javascripts/browser/viewport/offset/index.ts", "src/templates/assets/javascripts/browser/viewport/size/index.ts", "src/templates/assets/javascripts/browser/viewport/_/index.ts", "src/templates/assets/javascripts/browser/viewport/at/index.ts", "src/templates/assets/javascripts/browser/worker/index.ts", "src/templates/assets/javascripts/_/index.ts", "src/templates/assets/javascripts/components/_/index.ts", "src/templates/assets/javascripts/components/announce/index.ts", "src/templates/assets/javascripts/components/consent/index.ts", "src/templates/assets/javascripts/templates/tooltip/index.tsx", "src/templates/assets/javascripts/templates/annotation/index.tsx", "src/templates/assets/javascripts/templates/clipboard/index.tsx", "src/templates/assets/javascripts/templates/search/index.tsx", "src/templates/assets/javascripts/templates/source/index.tsx", "src/templates/assets/javascripts/templates/tabbed/index.tsx", "src/templates/assets/javascripts/templates/table/index.tsx", "src/templates/assets/javascripts/templates/version/index.tsx", "src/templates/assets/javascripts/components/tooltip/index.ts", "src/templates/assets/javascripts/components/content/annotation/_/index.ts", "src/templates/assets/javascripts/components/content/annotation/list/index.ts", "src/templates/assets/javascripts/components/content/annotation/block/index.ts", "src/templates/assets/javascripts/components/content/code/_/index.ts", "src/templates/assets/javascripts/components/content/details/index.ts", "src/templates/assets/javascripts/components/content/mermaid/index.css", "src/templates/assets/javascripts/components/content/mermaid/index.ts", "src/templates/assets/javascripts/components/content/table/index.ts", "src/templates/assets/javascripts/components/content/tabs/index.ts", "src/templates/assets/javascripts/components/content/_/index.ts", "src/templates/assets/javascripts/components/dialog/index.ts", 
"src/templates/assets/javascripts/components/header/_/index.ts", "src/templates/assets/javascripts/components/header/title/index.ts", "src/templates/assets/javascripts/components/main/index.ts", "src/templates/assets/javascripts/components/palette/index.ts", "src/templates/assets/javascripts/components/progress/index.ts", "src/templates/assets/javascripts/integrations/clipboard/index.ts", "src/templates/assets/javascripts/integrations/sitemap/index.ts", "src/templates/assets/javascripts/integrations/instant/index.ts", "src/templates/assets/javascripts/integrations/search/highlighter/index.ts", "src/templates/assets/javascripts/integrations/search/worker/message/index.ts", "src/templates/assets/javascripts/integrations/search/worker/_/index.ts", "src/templates/assets/javascripts/integrations/version/index.ts", "src/templates/assets/javascripts/components/search/query/index.ts", "src/templates/assets/javascripts/components/search/result/index.ts", "src/templates/assets/javascripts/components/search/share/index.ts", "src/templates/assets/javascripts/components/search/suggest/index.ts", "src/templates/assets/javascripts/components/search/_/index.ts", "src/templates/assets/javascripts/components/search/highlight/index.ts", "src/templates/assets/javascripts/components/sidebar/index.ts", "src/templates/assets/javascripts/components/source/facts/github/index.ts", "src/templates/assets/javascripts/components/source/facts/gitlab/index.ts", "src/templates/assets/javascripts/components/source/facts/_/index.ts", "src/templates/assets/javascripts/components/source/_/index.ts", "src/templates/assets/javascripts/components/tabs/index.ts", "src/templates/assets/javascripts/components/toc/index.ts", "src/templates/assets/javascripts/components/top/index.ts", "src/templates/assets/javascripts/patches/ellipsis/index.ts", "src/templates/assets/javascripts/patches/indeterminate/index.ts", "src/templates/assets/javascripts/patches/scrollfix/index.ts", 
"src/templates/assets/javascripts/patches/scrolllock/index.ts", "src/templates/assets/javascripts/polyfills/index.ts"], + "sourcesContent": ["(function (global, factory) {\n typeof exports === 'object' && typeof module !== 'undefined' ? factory() :\n typeof define === 'function' && define.amd ? define(factory) :\n (factory());\n}(this, (function () { 'use strict';\n\n /**\n * Applies the :focus-visible polyfill at the given scope.\n * A scope in this case is either the top-level Document or a Shadow Root.\n *\n * @param {(Document|ShadowRoot)} scope\n * @see https://github.com/WICG/focus-visible\n */\n function applyFocusVisiblePolyfill(scope) {\n var hadKeyboardEvent = true;\n var hadFocusVisibleRecently = false;\n var hadFocusVisibleRecentlyTimeout = null;\n\n var inputTypesAllowlist = {\n text: true,\n search: true,\n url: true,\n tel: true,\n email: true,\n password: true,\n number: true,\n date: true,\n month: true,\n week: true,\n time: true,\n datetime: true,\n 'datetime-local': true\n };\n\n /**\n * Helper function for legacy browsers and iframes which sometimes focus\n * elements like document, body, and non-interactive SVG.\n * @param {Element} el\n */\n function isValidFocusTarget(el) {\n if (\n el &&\n el !== document &&\n el.nodeName !== 'HTML' &&\n el.nodeName !== 'BODY' &&\n 'classList' in el &&\n 'contains' in el.classList\n ) {\n return true;\n }\n return false;\n }\n\n /**\n * Computes whether the given element should automatically trigger the\n * `focus-visible` class being added, i.e. 
whether it should always match\n * `:focus-visible` when focused.\n * @param {Element} el\n * @return {boolean}\n */\n function focusTriggersKeyboardModality(el) {\n var type = el.type;\n var tagName = el.tagName;\n\n if (tagName === 'INPUT' && inputTypesAllowlist[type] && !el.readOnly) {\n return true;\n }\n\n if (tagName === 'TEXTAREA' && !el.readOnly) {\n return true;\n }\n\n if (el.isContentEditable) {\n return true;\n }\n\n return false;\n }\n\n /**\n * Add the `focus-visible` class to the given element if it was not added by\n * the author.\n * @param {Element} el\n */\n function addFocusVisibleClass(el) {\n if (el.classList.contains('focus-visible')) {\n return;\n }\n el.classList.add('focus-visible');\n el.setAttribute('data-focus-visible-added', '');\n }\n\n /**\n * Remove the `focus-visible` class from the given element if it was not\n * originally added by the author.\n * @param {Element} el\n */\n function removeFocusVisibleClass(el) {\n if (!el.hasAttribute('data-focus-visible-added')) {\n return;\n }\n el.classList.remove('focus-visible');\n el.removeAttribute('data-focus-visible-added');\n }\n\n /**\n * If the most recent user interaction was via the keyboard;\n * and the key press did not include a meta, alt/option, or control key;\n * then the modality is keyboard. 
Otherwise, the modality is not keyboard.\n * Apply `focus-visible` to any current active element and keep track\n * of our keyboard modality state with `hadKeyboardEvent`.\n * @param {KeyboardEvent} e\n */\n function onKeyDown(e) {\n if (e.metaKey || e.altKey || e.ctrlKey) {\n return;\n }\n\n if (isValidFocusTarget(scope.activeElement)) {\n addFocusVisibleClass(scope.activeElement);\n }\n\n hadKeyboardEvent = true;\n }\n\n /**\n * If at any point a user clicks with a pointing device, ensure that we change\n * the modality away from keyboard.\n * This avoids the situation where a user presses a key on an already focused\n * element, and then clicks on a different element, focusing it with a\n * pointing device, while we still think we're in keyboard modality.\n * @param {Event} e\n */\n function onPointerDown(e) {\n hadKeyboardEvent = false;\n }\n\n /**\n * On `focus`, add the `focus-visible` class to the target if:\n * - the target received focus as a result of keyboard navigation, or\n * - the event target is an element that will likely require interaction\n * via the keyboard (e.g. 
a text box)\n * @param {Event} e\n */\n function onFocus(e) {\n // Prevent IE from focusing the document or HTML element.\n if (!isValidFocusTarget(e.target)) {\n return;\n }\n\n if (hadKeyboardEvent || focusTriggersKeyboardModality(e.target)) {\n addFocusVisibleClass(e.target);\n }\n }\n\n /**\n * On `blur`, remove the `focus-visible` class from the target.\n * @param {Event} e\n */\n function onBlur(e) {\n if (!isValidFocusTarget(e.target)) {\n return;\n }\n\n if (\n e.target.classList.contains('focus-visible') ||\n e.target.hasAttribute('data-focus-visible-added')\n ) {\n // To detect a tab/window switch, we look for a blur event followed\n // rapidly by a visibility change.\n // If we don't see a visibility change within 100ms, it's probably a\n // regular focus change.\n hadFocusVisibleRecently = true;\n window.clearTimeout(hadFocusVisibleRecentlyTimeout);\n hadFocusVisibleRecentlyTimeout = window.setTimeout(function() {\n hadFocusVisibleRecently = false;\n }, 100);\n removeFocusVisibleClass(e.target);\n }\n }\n\n /**\n * If the user changes tabs, keep track of whether or not the previously\n * focused element had .focus-visible.\n * @param {Event} e\n */\n function onVisibilityChange(e) {\n if (document.visibilityState === 'hidden') {\n // If the tab becomes active again, the browser will handle calling focus\n // on the element (Safari actually calls it twice).\n // If this tab change caused a blur on an element with focus-visible,\n // re-apply the class when the user switches back to the tab.\n if (hadFocusVisibleRecently) {\n hadKeyboardEvent = true;\n }\n addInitialPointerMoveListeners();\n }\n }\n\n /**\n * Add a group of listeners to detect usage of any pointing devices.\n * These listeners will be added when the polyfill first loads, and anytime\n * the window is blurred, so that they are active when the window regains\n * focus.\n */\n function addInitialPointerMoveListeners() {\n document.addEventListener('mousemove', onInitialPointerMove);\n 
document.addEventListener('mousedown', onInitialPointerMove);\n document.addEventListener('mouseup', onInitialPointerMove);\n document.addEventListener('pointermove', onInitialPointerMove);\n document.addEventListener('pointerdown', onInitialPointerMove);\n document.addEventListener('pointerup', onInitialPointerMove);\n document.addEventListener('touchmove', onInitialPointerMove);\n document.addEventListener('touchstart', onInitialPointerMove);\n document.addEventListener('touchend', onInitialPointerMove);\n }\n\n function removeInitialPointerMoveListeners() {\n document.removeEventListener('mousemove', onInitialPointerMove);\n document.removeEventListener('mousedown', onInitialPointerMove);\n document.removeEventListener('mouseup', onInitialPointerMove);\n document.removeEventListener('pointermove', onInitialPointerMove);\n document.removeEventListener('pointerdown', onInitialPointerMove);\n document.removeEventListener('pointerup', onInitialPointerMove);\n document.removeEventListener('touchmove', onInitialPointerMove);\n document.removeEventListener('touchstart', onInitialPointerMove);\n document.removeEventListener('touchend', onInitialPointerMove);\n }\n\n /**\n * When the polfyill first loads, assume the user is in keyboard modality.\n * If any event is received from a pointing device (e.g. mouse, pointer,\n * touch), turn off keyboard modality.\n * This accounts for situations where focus enters the page from the URL bar.\n * @param {Event} e\n */\n function onInitialPointerMove(e) {\n // Work around a Safari quirk that fires a mousemove on whenever the\n // window blurs, even if you're tabbing out of the page. \u00AF\\_(\u30C4)_/\u00AF\n if (e.target.nodeName && e.target.nodeName.toLowerCase() === 'html') {\n return;\n }\n\n hadKeyboardEvent = false;\n removeInitialPointerMoveListeners();\n }\n\n // For some kinds of state, we are interested in changes at the global scope\n // only. 
For example, global pointer input, global key presses and global\n // visibility change should affect the state at every scope:\n document.addEventListener('keydown', onKeyDown, true);\n document.addEventListener('mousedown', onPointerDown, true);\n document.addEventListener('pointerdown', onPointerDown, true);\n document.addEventListener('touchstart', onPointerDown, true);\n document.addEventListener('visibilitychange', onVisibilityChange, true);\n\n addInitialPointerMoveListeners();\n\n // For focus and blur, we specifically care about state changes in the local\n // scope. This is because focus / blur events that originate from within a\n // shadow root are not re-dispatched from the host element if it was already\n // the active element in its own scope:\n scope.addEventListener('focus', onFocus, true);\n scope.addEventListener('blur', onBlur, true);\n\n // We detect that a node is a ShadowRoot by ensuring that it is a\n // DocumentFragment and also has a host property. This check covers native\n // implementation and polyfill implementation transparently. If we only cared\n // about the native implementation, we could just check if the scope was\n // an instance of a ShadowRoot.\n if (scope.nodeType === Node.DOCUMENT_FRAGMENT_NODE && scope.host) {\n // Since a ShadowRoot is a special kind of DocumentFragment, it does not\n // have a root element to add a class to. 
So, we add this attribute to the\n // host element instead:\n scope.host.setAttribute('data-js-focus-visible', '');\n } else if (scope.nodeType === Node.DOCUMENT_NODE) {\n document.documentElement.classList.add('js-focus-visible');\n document.documentElement.setAttribute('data-js-focus-visible', '');\n }\n }\n\n // It is important to wrap all references to global window and document in\n // these checks to support server-side rendering use cases\n // @see https://github.com/WICG/focus-visible/issues/199\n if (typeof window !== 'undefined' && typeof document !== 'undefined') {\n // Make the polyfill helper globally available. This can be used as a signal\n // to interested libraries that wish to coordinate with the polyfill for e.g.,\n // applying the polyfill to a shadow root:\n window.applyFocusVisiblePolyfill = applyFocusVisiblePolyfill;\n\n // Notify interested libraries of the polyfill's presence, in case the\n // polyfill was loaded lazily:\n var event;\n\n try {\n event = new CustomEvent('focus-visible-polyfill-ready');\n } catch (error) {\n // IE11 does not support using CustomEvent as a constructor directly:\n event = document.createEvent('CustomEvent');\n event.initCustomEvent('focus-visible-polyfill-ready', false, false, {});\n }\n\n window.dispatchEvent(event);\n }\n\n if (typeof document !== 'undefined') {\n // Apply the polyfill to the global document, so that no JavaScript\n // coordination is required to use the polyfill in the top-level document:\n applyFocusVisiblePolyfill(document);\n }\n\n})));\n", "/*!\n * clipboard.js v2.0.11\n * https://clipboardjs.com/\n *\n * Licensed MIT \u00A9 Zeno Rocha\n */\n(function webpackUniversalModuleDefinition(root, factory) {\n\tif(typeof exports === 'object' && typeof module === 'object')\n\t\tmodule.exports = factory();\n\telse if(typeof define === 'function' && define.amd)\n\t\tdefine([], factory);\n\telse if(typeof exports === 'object')\n\t\texports[\"ClipboardJS\"] = 
factory();\n\telse\n\t\troot[\"ClipboardJS\"] = factory();\n})(this, function() {\nreturn /******/ (function() { // webpackBootstrap\n/******/ \tvar __webpack_modules__ = ({\n\n/***/ 686:\n/***/ (function(__unused_webpack_module, __webpack_exports__, __webpack_require__) {\n\n\"use strict\";\n\n// EXPORTS\n__webpack_require__.d(__webpack_exports__, {\n \"default\": function() { return /* binding */ clipboard; }\n});\n\n// EXTERNAL MODULE: ./node_modules/tiny-emitter/index.js\nvar tiny_emitter = __webpack_require__(279);\nvar tiny_emitter_default = /*#__PURE__*/__webpack_require__.n(tiny_emitter);\n// EXTERNAL MODULE: ./node_modules/good-listener/src/listen.js\nvar listen = __webpack_require__(370);\nvar listen_default = /*#__PURE__*/__webpack_require__.n(listen);\n// EXTERNAL MODULE: ./node_modules/select/src/select.js\nvar src_select = __webpack_require__(817);\nvar select_default = /*#__PURE__*/__webpack_require__.n(src_select);\n;// CONCATENATED MODULE: ./src/common/command.js\n/**\n * Executes a given operation type.\n * @param {String} type\n * @return {Boolean}\n */\nfunction command(type) {\n try {\n return document.execCommand(type);\n } catch (err) {\n return false;\n }\n}\n;// CONCATENATED MODULE: ./src/actions/cut.js\n\n\n/**\n * Cut action wrapper.\n * @param {String|HTMLElement} target\n * @return {String}\n */\n\nvar ClipboardActionCut = function ClipboardActionCut(target) {\n var selectedText = select_default()(target);\n command('cut');\n return selectedText;\n};\n\n/* harmony default export */ var actions_cut = (ClipboardActionCut);\n;// CONCATENATED MODULE: ./src/common/create-fake-element.js\n/**\n * Creates a fake textarea element with a value.\n * @param {String} value\n * @return {HTMLElement}\n */\nfunction createFakeElement(value) {\n var isRTL = document.documentElement.getAttribute('dir') === 'rtl';\n var fakeElement = document.createElement('textarea'); // Prevent zooming on iOS\n\n fakeElement.style.fontSize = '12pt'; // Reset box 
model\n\n fakeElement.style.border = '0';\n fakeElement.style.padding = '0';\n fakeElement.style.margin = '0'; // Move element out of screen horizontally\n\n fakeElement.style.position = 'absolute';\n fakeElement.style[isRTL ? 'right' : 'left'] = '-9999px'; // Move element to the same position vertically\n\n var yPosition = window.pageYOffset || document.documentElement.scrollTop;\n fakeElement.style.top = \"\".concat(yPosition, \"px\");\n fakeElement.setAttribute('readonly', '');\n fakeElement.value = value;\n return fakeElement;\n}\n;// CONCATENATED MODULE: ./src/actions/copy.js\n\n\n\n/**\n * Create fake copy action wrapper using a fake element.\n * @param {String} target\n * @param {Object} options\n * @return {String}\n */\n\nvar fakeCopyAction = function fakeCopyAction(value, options) {\n var fakeElement = createFakeElement(value);\n options.container.appendChild(fakeElement);\n var selectedText = select_default()(fakeElement);\n command('copy');\n fakeElement.remove();\n return selectedText;\n};\n/**\n * Copy action wrapper.\n * @param {String|HTMLElement} target\n * @param {Object} options\n * @return {String}\n */\n\n\nvar ClipboardActionCopy = function ClipboardActionCopy(target) {\n var options = arguments.length > 1 && arguments[1] !== undefined ? arguments[1] : {\n container: document.body\n };\n var selectedText = '';\n\n if (typeof target === 'string') {\n selectedText = fakeCopyAction(target, options);\n } else if (target instanceof HTMLInputElement && !['text', 'search', 'url', 'tel', 'password'].includes(target === null || target === void 0 ? void 0 : target.type)) {\n // If input type doesn't support `setSelectionRange`. Simulate it. 
https://developer.mozilla.org/en-US/docs/Web/API/HTMLInputElement/setSelectionRange\n selectedText = fakeCopyAction(target.value, options);\n } else {\n selectedText = select_default()(target);\n command('copy');\n }\n\n return selectedText;\n};\n\n/* harmony default export */ var actions_copy = (ClipboardActionCopy);\n;// CONCATENATED MODULE: ./src/actions/default.js\nfunction _typeof(obj) { \"@babel/helpers - typeof\"; if (typeof Symbol === \"function\" && typeof Symbol.iterator === \"symbol\") { _typeof = function _typeof(obj) { return typeof obj; }; } else { _typeof = function _typeof(obj) { return obj && typeof Symbol === \"function\" && obj.constructor === Symbol && obj !== Symbol.prototype ? \"symbol\" : typeof obj; }; } return _typeof(obj); }\n\n\n\n/**\n * Inner function which performs selection from either `text` or `target`\n * properties and then executes copy or cut operations.\n * @param {Object} options\n */\n\nvar ClipboardActionDefault = function ClipboardActionDefault() {\n var options = arguments.length > 0 && arguments[0] !== undefined ? arguments[0] : {};\n // Defines base properties passed from constructor.\n var _options$action = options.action,\n action = _options$action === void 0 ? 'copy' : _options$action,\n container = options.container,\n target = options.target,\n text = options.text; // Sets the `action` to be performed which can be either 'copy' or 'cut'.\n\n if (action !== 'copy' && action !== 'cut') {\n throw new Error('Invalid \"action\" value, use either \"copy\" or \"cut\"');\n } // Sets the `target` property using an element that will be have its content copied.\n\n\n if (target !== undefined) {\n if (target && _typeof(target) === 'object' && target.nodeType === 1) {\n if (action === 'copy' && target.hasAttribute('disabled')) {\n throw new Error('Invalid \"target\" attribute. 
Please use \"readonly\" instead of \"disabled\" attribute');\n }\n\n if (action === 'cut' && (target.hasAttribute('readonly') || target.hasAttribute('disabled'))) {\n throw new Error('Invalid \"target\" attribute. You can\\'t cut text from elements with \"readonly\" or \"disabled\" attributes');\n }\n } else {\n throw new Error('Invalid \"target\" value, use a valid Element');\n }\n } // Define selection strategy based on `text` property.\n\n\n if (text) {\n return actions_copy(text, {\n container: container\n });\n } // Defines which selection strategy based on `target` property.\n\n\n if (target) {\n return action === 'cut' ? actions_cut(target) : actions_copy(target, {\n container: container\n });\n }\n};\n\n/* harmony default export */ var actions_default = (ClipboardActionDefault);\n;// CONCATENATED MODULE: ./src/clipboard.js\nfunction clipboard_typeof(obj) { \"@babel/helpers - typeof\"; if (typeof Symbol === \"function\" && typeof Symbol.iterator === \"symbol\") { clipboard_typeof = function _typeof(obj) { return typeof obj; }; } else { clipboard_typeof = function _typeof(obj) { return obj && typeof Symbol === \"function\" && obj.constructor === Symbol && obj !== Symbol.prototype ? 
\"symbol\" : typeof obj; }; } return clipboard_typeof(obj); }\n\nfunction _classCallCheck(instance, Constructor) { if (!(instance instanceof Constructor)) { throw new TypeError(\"Cannot call a class as a function\"); } }\n\nfunction _defineProperties(target, props) { for (var i = 0; i < props.length; i++) { var descriptor = props[i]; descriptor.enumerable = descriptor.enumerable || false; descriptor.configurable = true; if (\"value\" in descriptor) descriptor.writable = true; Object.defineProperty(target, descriptor.key, descriptor); } }\n\nfunction _createClass(Constructor, protoProps, staticProps) { if (protoProps) _defineProperties(Constructor.prototype, protoProps); if (staticProps) _defineProperties(Constructor, staticProps); return Constructor; }\n\nfunction _inherits(subClass, superClass) { if (typeof superClass !== \"function\" && superClass !== null) { throw new TypeError(\"Super expression must either be null or a function\"); } subClass.prototype = Object.create(superClass && superClass.prototype, { constructor: { value: subClass, writable: true, configurable: true } }); if (superClass) _setPrototypeOf(subClass, superClass); }\n\nfunction _setPrototypeOf(o, p) { _setPrototypeOf = Object.setPrototypeOf || function _setPrototypeOf(o, p) { o.__proto__ = p; return o; }; return _setPrototypeOf(o, p); }\n\nfunction _createSuper(Derived) { var hasNativeReflectConstruct = _isNativeReflectConstruct(); return function _createSuperInternal() { var Super = _getPrototypeOf(Derived), result; if (hasNativeReflectConstruct) { var NewTarget = _getPrototypeOf(this).constructor; result = Reflect.construct(Super, arguments, NewTarget); } else { result = Super.apply(this, arguments); } return _possibleConstructorReturn(this, result); }; }\n\nfunction _possibleConstructorReturn(self, call) { if (call && (clipboard_typeof(call) === \"object\" || typeof call === \"function\")) { return call; } return _assertThisInitialized(self); }\n\nfunction _assertThisInitialized(self) { if 
(self === void 0) { throw new ReferenceError(\"this hasn't been initialised - super() hasn't been called\"); } return self; }\n\nfunction _isNativeReflectConstruct() { if (typeof Reflect === \"undefined\" || !Reflect.construct) return false; if (Reflect.construct.sham) return false; if (typeof Proxy === \"function\") return true; try { Date.prototype.toString.call(Reflect.construct(Date, [], function () {})); return true; } catch (e) { return false; } }\n\nfunction _getPrototypeOf(o) { _getPrototypeOf = Object.setPrototypeOf ? Object.getPrototypeOf : function _getPrototypeOf(o) { return o.__proto__ || Object.getPrototypeOf(o); }; return _getPrototypeOf(o); }\n\n\n\n\n\n\n/**\n * Helper function to retrieve attribute value.\n * @param {String} suffix\n * @param {Element} element\n */\n\nfunction getAttributeValue(suffix, element) {\n var attribute = \"data-clipboard-\".concat(suffix);\n\n if (!element.hasAttribute(attribute)) {\n return;\n }\n\n return element.getAttribute(attribute);\n}\n/**\n * Base class which takes one or more elements, adds event listeners to them,\n * and instantiates a new `ClipboardAction` on each click.\n */\n\n\nvar Clipboard = /*#__PURE__*/function (_Emitter) {\n _inherits(Clipboard, _Emitter);\n\n var _super = _createSuper(Clipboard);\n\n /**\n * @param {String|HTMLElement|HTMLCollection|NodeList} trigger\n * @param {Object} options\n */\n function Clipboard(trigger, options) {\n var _this;\n\n _classCallCheck(this, Clipboard);\n\n _this = _super.call(this);\n\n _this.resolveOptions(options);\n\n _this.listenClick(trigger);\n\n return _this;\n }\n /**\n * Defines if attributes would be resolved using internal setter functions\n * or custom functions that were passed in the constructor.\n * @param {Object} options\n */\n\n\n _createClass(Clipboard, [{\n key: \"resolveOptions\",\n value: function resolveOptions() {\n var options = arguments.length > 0 && arguments[0] !== undefined ? 
arguments[0] : {};\n this.action = typeof options.action === 'function' ? options.action : this.defaultAction;\n this.target = typeof options.target === 'function' ? options.target : this.defaultTarget;\n this.text = typeof options.text === 'function' ? options.text : this.defaultText;\n this.container = clipboard_typeof(options.container) === 'object' ? options.container : document.body;\n }\n /**\n * Adds a click event listener to the passed trigger.\n * @param {String|HTMLElement|HTMLCollection|NodeList} trigger\n */\n\n }, {\n key: \"listenClick\",\n value: function listenClick(trigger) {\n var _this2 = this;\n\n this.listener = listen_default()(trigger, 'click', function (e) {\n return _this2.onClick(e);\n });\n }\n /**\n * Defines a new `ClipboardAction` on each click event.\n * @param {Event} e\n */\n\n }, {\n key: \"onClick\",\n value: function onClick(e) {\n var trigger = e.delegateTarget || e.currentTarget;\n var action = this.action(trigger) || 'copy';\n var text = actions_default({\n action: action,\n container: this.container,\n target: this.target(trigger),\n text: this.text(trigger)\n }); // Fires an event based on the copy operation result.\n\n this.emit(text ? 
'success' : 'error', {\n action: action,\n text: text,\n trigger: trigger,\n clearSelection: function clearSelection() {\n if (trigger) {\n trigger.focus();\n }\n\n window.getSelection().removeAllRanges();\n }\n });\n }\n /**\n * Default `action` lookup function.\n * @param {Element} trigger\n */\n\n }, {\n key: \"defaultAction\",\n value: function defaultAction(trigger) {\n return getAttributeValue('action', trigger);\n }\n /**\n * Default `target` lookup function.\n * @param {Element} trigger\n */\n\n }, {\n key: \"defaultTarget\",\n value: function defaultTarget(trigger) {\n var selector = getAttributeValue('target', trigger);\n\n if (selector) {\n return document.querySelector(selector);\n }\n }\n /**\n * Allow fire programmatically a copy action\n * @param {String|HTMLElement} target\n * @param {Object} options\n * @returns Text copied.\n */\n\n }, {\n key: \"defaultText\",\n\n /**\n * Default `text` lookup function.\n * @param {Element} trigger\n */\n value: function defaultText(trigger) {\n return getAttributeValue('text', trigger);\n }\n /**\n * Destroy lifecycle.\n */\n\n }, {\n key: \"destroy\",\n value: function destroy() {\n this.listener.destroy();\n }\n }], [{\n key: \"copy\",\n value: function copy(target) {\n var options = arguments.length > 1 && arguments[1] !== undefined ? arguments[1] : {\n container: document.body\n };\n return actions_copy(target, options);\n }\n /**\n * Allow fire programmatically a cut action\n * @param {String|HTMLElement} target\n * @returns Text cutted.\n */\n\n }, {\n key: \"cut\",\n value: function cut(target) {\n return actions_cut(target);\n }\n /**\n * Returns the support of the given action, or all actions if no action is\n * given.\n * @param {String} [action]\n */\n\n }, {\n key: \"isSupported\",\n value: function isSupported() {\n var action = arguments.length > 0 && arguments[0] !== undefined ? arguments[0] : ['copy', 'cut'];\n var actions = typeof action === 'string' ? 
[action] : action;\n var support = !!document.queryCommandSupported;\n actions.forEach(function (action) {\n support = support && !!document.queryCommandSupported(action);\n });\n return support;\n }\n }]);\n\n return Clipboard;\n}((tiny_emitter_default()));\n\n/* harmony default export */ var clipboard = (Clipboard);\n\n/***/ }),\n\n/***/ 828:\n/***/ (function(module) {\n\nvar DOCUMENT_NODE_TYPE = 9;\n\n/**\n * A polyfill for Element.matches()\n */\nif (typeof Element !== 'undefined' && !Element.prototype.matches) {\n var proto = Element.prototype;\n\n proto.matches = proto.matchesSelector ||\n proto.mozMatchesSelector ||\n proto.msMatchesSelector ||\n proto.oMatchesSelector ||\n proto.webkitMatchesSelector;\n}\n\n/**\n * Finds the closest parent that matches a selector.\n *\n * @param {Element} element\n * @param {String} selector\n * @return {Function}\n */\nfunction closest (element, selector) {\n while (element && element.nodeType !== DOCUMENT_NODE_TYPE) {\n if (typeof element.matches === 'function' &&\n element.matches(selector)) {\n return element;\n }\n element = element.parentNode;\n }\n}\n\nmodule.exports = closest;\n\n\n/***/ }),\n\n/***/ 438:\n/***/ (function(module, __unused_webpack_exports, __webpack_require__) {\n\nvar closest = __webpack_require__(828);\n\n/**\n * Delegates event to a selector.\n *\n * @param {Element} element\n * @param {String} selector\n * @param {String} type\n * @param {Function} callback\n * @param {Boolean} useCapture\n * @return {Object}\n */\nfunction _delegate(element, selector, type, callback, useCapture) {\n var listenerFn = listener.apply(this, arguments);\n\n element.addEventListener(type, listenerFn, useCapture);\n\n return {\n destroy: function() {\n element.removeEventListener(type, listenerFn, useCapture);\n }\n }\n}\n\n/**\n * Delegates event to a selector.\n *\n * @param {Element|String|Array} [elements]\n * @param {String} selector\n * @param {String} type\n * @param {Function} callback\n * @param {Boolean} 
useCapture\n * @return {Object}\n */\nfunction delegate(elements, selector, type, callback, useCapture) {\n // Handle the regular Element usage\n if (typeof elements.addEventListener === 'function') {\n return _delegate.apply(null, arguments);\n }\n\n // Handle Element-less usage, it defaults to global delegation\n if (typeof type === 'function') {\n // Use `document` as the first parameter, then apply arguments\n // This is a short way to .unshift `arguments` without running into deoptimizations\n return _delegate.bind(null, document).apply(null, arguments);\n }\n\n // Handle Selector-based usage\n if (typeof elements === 'string') {\n elements = document.querySelectorAll(elements);\n }\n\n // Handle Array-like based usage\n return Array.prototype.map.call(elements, function (element) {\n return _delegate(element, selector, type, callback, useCapture);\n });\n}\n\n/**\n * Finds closest match and invokes callback.\n *\n * @param {Element} element\n * @param {String} selector\n * @param {String} type\n * @param {Function} callback\n * @return {Function}\n */\nfunction listener(element, selector, type, callback) {\n return function(e) {\n e.delegateTarget = closest(e.target, selector);\n\n if (e.delegateTarget) {\n callback.call(element, e);\n }\n }\n}\n\nmodule.exports = delegate;\n\n\n/***/ }),\n\n/***/ 879:\n/***/ (function(__unused_webpack_module, exports) {\n\n/**\n * Check if argument is a HTML element.\n *\n * @param {Object} value\n * @return {Boolean}\n */\nexports.node = function(value) {\n return value !== undefined\n && value instanceof HTMLElement\n && value.nodeType === 1;\n};\n\n/**\n * Check if argument is a list of HTML elements.\n *\n * @param {Object} value\n * @return {Boolean}\n */\nexports.nodeList = function(value) {\n var type = Object.prototype.toString.call(value);\n\n return value !== undefined\n && (type === '[object NodeList]' || type === '[object HTMLCollection]')\n && ('length' in value)\n && (value.length === 0 || 
exports.node(value[0]));\n};\n\n/**\n * Check if argument is a string.\n *\n * @param {Object} value\n * @return {Boolean}\n */\nexports.string = function(value) {\n return typeof value === 'string'\n || value instanceof String;\n};\n\n/**\n * Check if argument is a function.\n *\n * @param {Object} value\n * @return {Boolean}\n */\nexports.fn = function(value) {\n var type = Object.prototype.toString.call(value);\n\n return type === '[object Function]';\n};\n\n\n/***/ }),\n\n/***/ 370:\n/***/ (function(module, __unused_webpack_exports, __webpack_require__) {\n\nvar is = __webpack_require__(879);\nvar delegate = __webpack_require__(438);\n\n/**\n * Validates all params and calls the right\n * listener function based on its target type.\n *\n * @param {String|HTMLElement|HTMLCollection|NodeList} target\n * @param {String} type\n * @param {Function} callback\n * @return {Object}\n */\nfunction listen(target, type, callback) {\n if (!target && !type && !callback) {\n throw new Error('Missing required arguments');\n }\n\n if (!is.string(type)) {\n throw new TypeError('Second argument must be a String');\n }\n\n if (!is.fn(callback)) {\n throw new TypeError('Third argument must be a Function');\n }\n\n if (is.node(target)) {\n return listenNode(target, type, callback);\n }\n else if (is.nodeList(target)) {\n return listenNodeList(target, type, callback);\n }\n else if (is.string(target)) {\n return listenSelector(target, type, callback);\n }\n else {\n throw new TypeError('First argument must be a String, HTMLElement, HTMLCollection, or NodeList');\n }\n}\n\n/**\n * Adds an event listener to a HTML element\n * and returns a remove listener function.\n *\n * @param {HTMLElement} node\n * @param {String} type\n * @param {Function} callback\n * @return {Object}\n */\nfunction listenNode(node, type, callback) {\n node.addEventListener(type, callback);\n\n return {\n destroy: function() {\n node.removeEventListener(type, callback);\n }\n }\n}\n\n/**\n * Add an event listener 
to a list of HTML elements\n * and returns a remove listener function.\n *\n * @param {NodeList|HTMLCollection} nodeList\n * @param {String} type\n * @param {Function} callback\n * @return {Object}\n */\nfunction listenNodeList(nodeList, type, callback) {\n Array.prototype.forEach.call(nodeList, function(node) {\n node.addEventListener(type, callback);\n });\n\n return {\n destroy: function() {\n Array.prototype.forEach.call(nodeList, function(node) {\n node.removeEventListener(type, callback);\n });\n }\n }\n}\n\n/**\n * Add an event listener to a selector\n * and returns a remove listener function.\n *\n * @param {String} selector\n * @param {String} type\n * @param {Function} callback\n * @return {Object}\n */\nfunction listenSelector(selector, type, callback) {\n return delegate(document.body, selector, type, callback);\n}\n\nmodule.exports = listen;\n\n\n/***/ }),\n\n/***/ 817:\n/***/ (function(module) {\n\nfunction select(element) {\n var selectedText;\n\n if (element.nodeName === 'SELECT') {\n element.focus();\n\n selectedText = element.value;\n }\n else if (element.nodeName === 'INPUT' || element.nodeName === 'TEXTAREA') {\n var isReadOnly = element.hasAttribute('readonly');\n\n if (!isReadOnly) {\n element.setAttribute('readonly', '');\n }\n\n element.select();\n element.setSelectionRange(0, element.value.length);\n\n if (!isReadOnly) {\n element.removeAttribute('readonly');\n }\n\n selectedText = element.value;\n }\n else {\n if (element.hasAttribute('contenteditable')) {\n element.focus();\n }\n\n var selection = window.getSelection();\n var range = document.createRange();\n\n range.selectNodeContents(element);\n selection.removeAllRanges();\n selection.addRange(range);\n\n selectedText = selection.toString();\n }\n\n return selectedText;\n}\n\nmodule.exports = select;\n\n\n/***/ }),\n\n/***/ 279:\n/***/ (function(module) {\n\nfunction E () {\n // Keep this empty so it's easier to inherit from\n // (via https://github.com/lipsmack from 
https://github.com/scottcorgan/tiny-emitter/issues/3)\n}\n\nE.prototype = {\n on: function (name, callback, ctx) {\n var e = this.e || (this.e = {});\n\n (e[name] || (e[name] = [])).push({\n fn: callback,\n ctx: ctx\n });\n\n return this;\n },\n\n once: function (name, callback, ctx) {\n var self = this;\n function listener () {\n self.off(name, listener);\n callback.apply(ctx, arguments);\n };\n\n listener._ = callback\n return this.on(name, listener, ctx);\n },\n\n emit: function (name) {\n var data = [].slice.call(arguments, 1);\n var evtArr = ((this.e || (this.e = {}))[name] || []).slice();\n var i = 0;\n var len = evtArr.length;\n\n for (i; i < len; i++) {\n evtArr[i].fn.apply(evtArr[i].ctx, data);\n }\n\n return this;\n },\n\n off: function (name, callback) {\n var e = this.e || (this.e = {});\n var evts = e[name];\n var liveEvents = [];\n\n if (evts && callback) {\n for (var i = 0, len = evts.length; i < len; i++) {\n if (evts[i].fn !== callback && evts[i].fn._ !== callback)\n liveEvents.push(evts[i]);\n }\n }\n\n // Remove event from queue to prevent memory leak\n // Suggested by https://github.com/lazd\n // Ref: https://github.com/scottcorgan/tiny-emitter/commit/c6ebfaa9bc973b33d110a84a307742b7cf94c953#commitcomment-5024910\n\n (liveEvents.length)\n ? 
e[name] = liveEvents\n : delete e[name];\n\n return this;\n }\n};\n\nmodule.exports = E;\nmodule.exports.TinyEmitter = E;\n\n\n/***/ })\n\n/******/ \t});\n/************************************************************************/\n/******/ \t// The module cache\n/******/ \tvar __webpack_module_cache__ = {};\n/******/ \t\n/******/ \t// The require function\n/******/ \tfunction __webpack_require__(moduleId) {\n/******/ \t\t// Check if module is in cache\n/******/ \t\tif(__webpack_module_cache__[moduleId]) {\n/******/ \t\t\treturn __webpack_module_cache__[moduleId].exports;\n/******/ \t\t}\n/******/ \t\t// Create a new module (and put it into the cache)\n/******/ \t\tvar module = __webpack_module_cache__[moduleId] = {\n/******/ \t\t\t// no module.id needed\n/******/ \t\t\t// no module.loaded needed\n/******/ \t\t\texports: {}\n/******/ \t\t};\n/******/ \t\n/******/ \t\t// Execute the module function\n/******/ \t\t__webpack_modules__[moduleId](module, module.exports, __webpack_require__);\n/******/ \t\n/******/ \t\t// Return the exports of the module\n/******/ \t\treturn module.exports;\n/******/ \t}\n/******/ \t\n/************************************************************************/\n/******/ \t/* webpack/runtime/compat get default export */\n/******/ \t!function() {\n/******/ \t\t// getDefaultExport function for compatibility with non-harmony modules\n/******/ \t\t__webpack_require__.n = function(module) {\n/******/ \t\t\tvar getter = module && module.__esModule ?\n/******/ \t\t\t\tfunction() { return module['default']; } :\n/******/ \t\t\t\tfunction() { return module; };\n/******/ \t\t\t__webpack_require__.d(getter, { a: getter });\n/******/ \t\t\treturn getter;\n/******/ \t\t};\n/******/ \t}();\n/******/ \t\n/******/ \t/* webpack/runtime/define property getters */\n/******/ \t!function() {\n/******/ \t\t// define getter functions for harmony exports\n/******/ \t\t__webpack_require__.d = function(exports, definition) {\n/******/ \t\t\tfor(var key in definition) 
{\n/******/ \t\t\t\tif(__webpack_require__.o(definition, key) && !__webpack_require__.o(exports, key)) {\n/******/ \t\t\t\t\tObject.defineProperty(exports, key, { enumerable: true, get: definition[key] });\n/******/ \t\t\t\t}\n/******/ \t\t\t}\n/******/ \t\t};\n/******/ \t}();\n/******/ \t\n/******/ \t/* webpack/runtime/hasOwnProperty shorthand */\n/******/ \t!function() {\n/******/ \t\t__webpack_require__.o = function(obj, prop) { return Object.prototype.hasOwnProperty.call(obj, prop); }\n/******/ \t}();\n/******/ \t\n/************************************************************************/\n/******/ \t// module exports must be returned from runtime so entry inlining is disabled\n/******/ \t// startup\n/******/ \t// Load entry module and return exports\n/******/ \treturn __webpack_require__(686);\n/******/ })()\n.default;\n});", "/*!\n * escape-html\n * Copyright(c) 2012-2013 TJ Holowaychuk\n * Copyright(c) 2015 Andreas Lubbe\n * Copyright(c) 2015 Tiancheng \"Timothy\" Gu\n * MIT Licensed\n */\n\n'use strict';\n\n/**\n * Module variables.\n * @private\n */\n\nvar matchHtmlRegExp = /[\"'&<>]/;\n\n/**\n * Module exports.\n * @public\n */\n\nmodule.exports = escapeHtml;\n\n/**\n * Escape special characters in the given string of html.\n *\n * @param {string} string The string to escape for inserting into HTML\n * @return {string}\n * @public\n */\n\nfunction escapeHtml(string) {\n var str = '' + string;\n var match = matchHtmlRegExp.exec(str);\n\n if (!match) {\n return str;\n }\n\n var escape;\n var html = '';\n var index = 0;\n var lastIndex = 0;\n\n for (index = match.index; index < str.length; index++) {\n switch (str.charCodeAt(index)) {\n case 34: // \"\n escape = '&quot;';\n break;\n case 38: // &\n escape = '&amp;';\n break;\n case 39: // '\n escape = '&#39;';\n break;\n case 60: // <\n escape = '&lt;';\n break;\n case 62: // >\n escape = '&gt;';\n break;\n default:\n continue;\n }\n\n if (lastIndex !== index) {\n html += str.substring(lastIndex, index);\n }\n\n lastIndex = 
index + 1;\n html += escape;\n }\n\n return lastIndex !== index\n ? html + str.substring(lastIndex, index)\n : html;\n}\n", "/*\n * Copyright (c) 2016-2024 Martin Donath \n *\n * Permission is hereby granted, free of charge, to any person obtaining a copy\n * of this software and associated documentation files (the \"Software\"), to\n * deal in the Software without restriction, including without limitation the\n * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or\n * sell copies of the Software, and to permit persons to whom the Software is\n * furnished to do so, subject to the following conditions:\n *\n * The above copyright notice and this permission notice shall be included in\n * all copies or substantial portions of the Software.\n *\n * THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE\n * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING\n * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS\n * IN THE SOFTWARE.\n */\n\nimport \"focus-visible\"\n\nimport {\n EMPTY,\n NEVER,\n Observable,\n Subject,\n defer,\n delay,\n filter,\n map,\n merge,\n mergeWith,\n shareReplay,\n switchMap\n} from \"rxjs\"\n\nimport { configuration, feature } from \"./_\"\nimport {\n at,\n getActiveElement,\n getOptionalElement,\n requestJSON,\n setLocation,\n setToggle,\n watchDocument,\n watchKeyboard,\n watchLocation,\n watchLocationTarget,\n watchMedia,\n watchPrint,\n watchScript,\n watchViewport\n} from \"./browser\"\nimport {\n getComponentElement,\n getComponentElements,\n mountAnnounce,\n mountBackToTop,\n mountConsent,\n mountContent,\n mountDialog,\n mountHeader,\n mountHeaderTitle,\n mountPalette,\n mountProgress,\n mountSearch,\n 
mountSearchHiglight,\n mountSidebar,\n mountSource,\n mountTableOfContents,\n mountTabs,\n watchHeader,\n watchMain\n} from \"./components\"\nimport {\n SearchIndex,\n setupClipboardJS,\n setupInstantNavigation,\n setupVersionSelector\n} from \"./integrations\"\nimport {\n patchEllipsis,\n patchIndeterminate,\n patchScrollfix,\n patchScrolllock\n} from \"./patches\"\nimport \"./polyfills\"\n\n/* ----------------------------------------------------------------------------\n * Functions - @todo refactor\n * ------------------------------------------------------------------------- */\n\n/**\n * Fetch search index\n *\n * @returns Search index observable\n */\nfunction fetchSearchIndex(): Observable {\n if (location.protocol === \"file:\") {\n return watchScript(\n `${new URL(\"search/search_index.js\", config.base)}`\n )\n .pipe(\n // @ts-ignore - @todo fix typings\n map(() => __index),\n shareReplay(1)\n )\n } else {\n return requestJSON(\n new URL(\"search/search_index.json\", config.base)\n )\n }\n}\n\n/* ----------------------------------------------------------------------------\n * Application\n * ------------------------------------------------------------------------- */\n\n/* Yay, JavaScript is available */\ndocument.documentElement.classList.remove(\"no-js\")\ndocument.documentElement.classList.add(\"js\")\n\n/* Set up navigation observables and subjects */\nconst document$ = watchDocument()\nconst location$ = watchLocation()\nconst target$ = watchLocationTarget(location$)\nconst keyboard$ = watchKeyboard()\n\n/* Set up media observables */\nconst viewport$ = watchViewport()\nconst tablet$ = watchMedia(\"(min-width: 960px)\")\nconst screen$ = watchMedia(\"(min-width: 1220px)\")\nconst print$ = watchPrint()\n\n/* Retrieve search index, if search is enabled */\nconst config = configuration()\nconst index$ = document.forms.namedItem(\"search\")\n ? 
fetchSearchIndex()\n : NEVER\n\n/* Set up Clipboard.js integration */\nconst alert$ = new Subject()\nsetupClipboardJS({ alert$ })\n\n/* Set up progress indicator */\nconst progress$ = new Subject()\n\n/* Set up instant navigation, if enabled */\nif (feature(\"navigation.instant\"))\n setupInstantNavigation({ location$, viewport$, progress$ })\n .subscribe(document$)\n\n/* Set up version selector */\nif (config.version?.provider === \"mike\")\n setupVersionSelector({ document$ })\n\n/* Always close drawer and search on navigation */\nmerge(location$, target$)\n .pipe(\n delay(125)\n )\n .subscribe(() => {\n setToggle(\"drawer\", false)\n setToggle(\"search\", false)\n })\n\n/* Set up global keyboard handlers */\nkeyboard$\n .pipe(\n filter(({ mode }) => mode === \"global\")\n )\n .subscribe(key => {\n switch (key.type) {\n\n /* Go to previous page */\n case \"p\":\n case \",\":\n const prev = getOptionalElement(\"link[rel=prev]\")\n if (typeof prev !== \"undefined\")\n setLocation(prev)\n break\n\n /* Go to next page */\n case \"n\":\n case \".\":\n const next = getOptionalElement(\"link[rel=next]\")\n if (typeof next !== \"undefined\")\n setLocation(next)\n break\n\n /* Expand navigation, see https://bit.ly/3ZjG5io */\n case \"Enter\":\n const active = getActiveElement()\n if (active instanceof HTMLLabelElement)\n active.click()\n }\n })\n\n/* Set up patches */\npatchEllipsis({ document$ })\npatchIndeterminate({ document$, tablet$ })\npatchScrollfix({ document$ })\npatchScrolllock({ viewport$, tablet$ })\n\n/* Set up header and main area observable */\nconst header$ = watchHeader(getComponentElement(\"header\"), { viewport$ })\nconst main$ = document$\n .pipe(\n map(() => getComponentElement(\"main\")),\n switchMap(el => watchMain(el, { viewport$, header$ })),\n shareReplay(1)\n )\n\n/* Set up control component observables */\nconst control$ = merge(\n\n /* Consent */\n ...getComponentElements(\"consent\")\n .map(el => mountConsent(el, { target$ })),\n\n /* Dialog 
*/\n ...getComponentElements(\"dialog\")\n .map(el => mountDialog(el, { alert$ })),\n\n /* Header */\n ...getComponentElements(\"header\")\n .map(el => mountHeader(el, { viewport$, header$, main$ })),\n\n /* Color palette */\n ...getComponentElements(\"palette\")\n .map(el => mountPalette(el)),\n\n /* Progress bar */\n ...getComponentElements(\"progress\")\n .map(el => mountProgress(el, { progress$ })),\n\n /* Search */\n ...getComponentElements(\"search\")\n .map(el => mountSearch(el, { index$, keyboard$ })),\n\n /* Repository information */\n ...getComponentElements(\"source\")\n .map(el => mountSource(el))\n)\n\n/* Set up content component observables */\nconst content$ = defer(() => merge(\n\n /* Announcement bar */\n ...getComponentElements(\"announce\")\n .map(el => mountAnnounce(el)),\n\n /* Content */\n ...getComponentElements(\"content\")\n .map(el => mountContent(el, { viewport$, target$, print$ })),\n\n /* Search highlighting */\n ...getComponentElements(\"content\")\n .map(el => feature(\"search.highlight\")\n ? mountSearchHiglight(el, { index$, location$ })\n : EMPTY\n ),\n\n /* Header title */\n ...getComponentElements(\"header-title\")\n .map(el => mountHeaderTitle(el, { viewport$, header$ })),\n\n /* Sidebar */\n ...getComponentElements(\"sidebar\")\n .map(el => el.getAttribute(\"data-md-type\") === \"navigation\"\n ? 
at(screen$, () => mountSidebar(el, { viewport$, header$, main$ }))\n : at(tablet$, () => mountSidebar(el, { viewport$, header$, main$ }))\n ),\n\n /* Navigation tabs */\n ...getComponentElements(\"tabs\")\n .map(el => mountTabs(el, { viewport$, header$ })),\n\n /* Table of contents */\n ...getComponentElements(\"toc\")\n .map(el => mountTableOfContents(el, {\n viewport$, header$, main$, target$\n })),\n\n /* Back-to-top button */\n ...getComponentElements(\"top\")\n .map(el => mountBackToTop(el, { viewport$, header$, main$, target$ }))\n))\n\n/* Set up component observables */\nconst component$ = document$\n .pipe(\n switchMap(() => content$),\n mergeWith(control$),\n shareReplay(1)\n )\n\n/* Subscribe to all components */\ncomponent$.subscribe()\n\n/* ----------------------------------------------------------------------------\n * Exports\n * ------------------------------------------------------------------------- */\n\nwindow.document$ = document$ /* Document observable */\nwindow.location$ = location$ /* Location subject */\nwindow.target$ = target$ /* Location target observable */\nwindow.keyboard$ = keyboard$ /* Keyboard observable */\nwindow.viewport$ = viewport$ /* Viewport observable */\nwindow.tablet$ = tablet$ /* Media tablet observable */\nwindow.screen$ = screen$ /* Media screen observable */\nwindow.print$ = print$ /* Media print observable */\nwindow.alert$ = alert$ /* Alert subject */\nwindow.progress$ = progress$ /* Progress indicator subject */\nwindow.component$ = component$ /* Component observable */\n", "/*! *****************************************************************************\r\nCopyright (c) Microsoft Corporation.\r\n\r\nPermission to use, copy, modify, and/or distribute this software for any\r\npurpose with or without fee is hereby granted.\r\n\r\nTHE SOFTWARE IS PROVIDED \"AS IS\" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH\r\nREGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY\r\nAND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,\r\nINDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM\r\nLOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR\r\nOTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR\r\nPERFORMANCE OF THIS SOFTWARE.\r\n***************************************************************************** */\r\n/* global Reflect, Promise */\r\n\r\nvar extendStatics = function(d, b) {\r\n extendStatics = Object.setPrototypeOf ||\r\n ({ __proto__: [] } instanceof Array && function (d, b) { d.__proto__ = b; }) ||\r\n function (d, b) { for (var p in b) if (Object.prototype.hasOwnProperty.call(b, p)) d[p] = b[p]; };\r\n return extendStatics(d, b);\r\n};\r\n\r\nexport function __extends(d, b) {\r\n if (typeof b !== \"function\" && b !== null)\r\n throw new TypeError(\"Class extends value \" + String(b) + \" is not a constructor or null\");\r\n extendStatics(d, b);\r\n function __() { this.constructor = d; }\r\n d.prototype = b === null ? 
Object.create(b) : (__.prototype = b.prototype, new __());\r\n}\r\n\r\nexport var __assign = function() {\r\n __assign = Object.assign || function __assign(t) {\r\n for (var s, i = 1, n = arguments.length; i < n; i++) {\r\n s = arguments[i];\r\n for (var p in s) if (Object.prototype.hasOwnProperty.call(s, p)) t[p] = s[p];\r\n }\r\n return t;\r\n }\r\n return __assign.apply(this, arguments);\r\n}\r\n\r\nexport function __rest(s, e) {\r\n var t = {};\r\n for (var p in s) if (Object.prototype.hasOwnProperty.call(s, p) && e.indexOf(p) < 0)\r\n t[p] = s[p];\r\n if (s != null && typeof Object.getOwnPropertySymbols === \"function\")\r\n for (var i = 0, p = Object.getOwnPropertySymbols(s); i < p.length; i++) {\r\n if (e.indexOf(p[i]) < 0 && Object.prototype.propertyIsEnumerable.call(s, p[i]))\r\n t[p[i]] = s[p[i]];\r\n }\r\n return t;\r\n}\r\n\r\nexport function __decorate(decorators, target, key, desc) {\r\n var c = arguments.length, r = c < 3 ? target : desc === null ? desc = Object.getOwnPropertyDescriptor(target, key) : desc, d;\r\n if (typeof Reflect === \"object\" && typeof Reflect.decorate === \"function\") r = Reflect.decorate(decorators, target, key, desc);\r\n else for (var i = decorators.length - 1; i >= 0; i--) if (d = decorators[i]) r = (c < 3 ? d(r) : c > 3 ? d(target, key, r) : d(target, key)) || r;\r\n return c > 3 && r && Object.defineProperty(target, key, r), r;\r\n}\r\n\r\nexport function __param(paramIndex, decorator) {\r\n return function (target, key) { decorator(target, key, paramIndex); }\r\n}\r\n\r\nexport function __metadata(metadataKey, metadataValue) {\r\n if (typeof Reflect === \"object\" && typeof Reflect.metadata === \"function\") return Reflect.metadata(metadataKey, metadataValue);\r\n}\r\n\r\nexport function __awaiter(thisArg, _arguments, P, generator) {\r\n function adopt(value) { return value instanceof P ? 
value : new P(function (resolve) { resolve(value); }); }\r\n return new (P || (P = Promise))(function (resolve, reject) {\r\n function fulfilled(value) { try { step(generator.next(value)); } catch (e) { reject(e); } }\r\n function rejected(value) { try { step(generator[\"throw\"](value)); } catch (e) { reject(e); } }\r\n function step(result) { result.done ? resolve(result.value) : adopt(result.value).then(fulfilled, rejected); }\r\n step((generator = generator.apply(thisArg, _arguments || [])).next());\r\n });\r\n}\r\n\r\nexport function __generator(thisArg, body) {\r\n var _ = { label: 0, sent: function() { if (t[0] & 1) throw t[1]; return t[1]; }, trys: [], ops: [] }, f, y, t, g;\r\n return g = { next: verb(0), \"throw\": verb(1), \"return\": verb(2) }, typeof Symbol === \"function\" && (g[Symbol.iterator] = function() { return this; }), g;\r\n function verb(n) { return function (v) { return step([n, v]); }; }\r\n function step(op) {\r\n if (f) throw new TypeError(\"Generator is already executing.\");\r\n while (_) try {\r\n if (f = 1, y && (t = op[0] & 2 ? y[\"return\"] : op[0] ? 
y[\"throw\"] || ((t = y[\"return\"]) && t.call(y), 0) : y.next) && !(t = t.call(y, op[1])).done) return t;\r\n if (y = 0, t) op = [op[0] & 2, t.value];\r\n switch (op[0]) {\r\n case 0: case 1: t = op; break;\r\n case 4: _.label++; return { value: op[1], done: false };\r\n case 5: _.label++; y = op[1]; op = [0]; continue;\r\n case 7: op = _.ops.pop(); _.trys.pop(); continue;\r\n default:\r\n if (!(t = _.trys, t = t.length > 0 && t[t.length - 1]) && (op[0] === 6 || op[0] === 2)) { _ = 0; continue; }\r\n if (op[0] === 3 && (!t || (op[1] > t[0] && op[1] < t[3]))) { _.label = op[1]; break; }\r\n if (op[0] === 6 && _.label < t[1]) { _.label = t[1]; t = op; break; }\r\n if (t && _.label < t[2]) { _.label = t[2]; _.ops.push(op); break; }\r\n if (t[2]) _.ops.pop();\r\n _.trys.pop(); continue;\r\n }\r\n op = body.call(thisArg, _);\r\n } catch (e) { op = [6, e]; y = 0; } finally { f = t = 0; }\r\n if (op[0] & 5) throw op[1]; return { value: op[0] ? op[1] : void 0, done: true };\r\n }\r\n}\r\n\r\nexport var __createBinding = Object.create ? (function(o, m, k, k2) {\r\n if (k2 === undefined) k2 = k;\r\n Object.defineProperty(o, k2, { enumerable: true, get: function() { return m[k]; } });\r\n}) : (function(o, m, k, k2) {\r\n if (k2 === undefined) k2 = k;\r\n o[k2] = m[k];\r\n});\r\n\r\nexport function __exportStar(m, o) {\r\n for (var p in m) if (p !== \"default\" && !Object.prototype.hasOwnProperty.call(o, p)) __createBinding(o, m, p);\r\n}\r\n\r\nexport function __values(o) {\r\n var s = typeof Symbol === \"function\" && Symbol.iterator, m = s && o[s], i = 0;\r\n if (m) return m.call(o);\r\n if (o && typeof o.length === \"number\") return {\r\n next: function () {\r\n if (o && i >= o.length) o = void 0;\r\n return { value: o && o[i++], done: !o };\r\n }\r\n };\r\n throw new TypeError(s ? 
\"Object is not iterable.\" : \"Symbol.iterator is not defined.\");\r\n}\r\n\r\nexport function __read(o, n) {\r\n var m = typeof Symbol === \"function\" && o[Symbol.iterator];\r\n if (!m) return o;\r\n var i = m.call(o), r, ar = [], e;\r\n try {\r\n while ((n === void 0 || n-- > 0) && !(r = i.next()).done) ar.push(r.value);\r\n }\r\n catch (error) { e = { error: error }; }\r\n finally {\r\n try {\r\n if (r && !r.done && (m = i[\"return\"])) m.call(i);\r\n }\r\n finally { if (e) throw e.error; }\r\n }\r\n return ar;\r\n}\r\n\r\n/** @deprecated */\r\nexport function __spread() {\r\n for (var ar = [], i = 0; i < arguments.length; i++)\r\n ar = ar.concat(__read(arguments[i]));\r\n return ar;\r\n}\r\n\r\n/** @deprecated */\r\nexport function __spreadArrays() {\r\n for (var s = 0, i = 0, il = arguments.length; i < il; i++) s += arguments[i].length;\r\n for (var r = Array(s), k = 0, i = 0; i < il; i++)\r\n for (var a = arguments[i], j = 0, jl = a.length; j < jl; j++, k++)\r\n r[k] = a[j];\r\n return r;\r\n}\r\n\r\nexport function __spreadArray(to, from, pack) {\r\n if (pack || arguments.length === 2) for (var i = 0, l = from.length, ar; i < l; i++) {\r\n if (ar || !(i in from)) {\r\n if (!ar) ar = Array.prototype.slice.call(from, 0, i);\r\n ar[i] = from[i];\r\n }\r\n }\r\n return to.concat(ar || Array.prototype.slice.call(from));\r\n}\r\n\r\nexport function __await(v) {\r\n return this instanceof __await ? 
(this.v = v, this) : new __await(v);\r\n}\r\n\r\nexport function __asyncGenerator(thisArg, _arguments, generator) {\r\n if (!Symbol.asyncIterator) throw new TypeError(\"Symbol.asyncIterator is not defined.\");\r\n var g = generator.apply(thisArg, _arguments || []), i, q = [];\r\n return i = {}, verb(\"next\"), verb(\"throw\"), verb(\"return\"), i[Symbol.asyncIterator] = function () { return this; }, i;\r\n function verb(n) { if (g[n]) i[n] = function (v) { return new Promise(function (a, b) { q.push([n, v, a, b]) > 1 || resume(n, v); }); }; }\r\n function resume(n, v) { try { step(g[n](v)); } catch (e) { settle(q[0][3], e); } }\r\n function step(r) { r.value instanceof __await ? Promise.resolve(r.value.v).then(fulfill, reject) : settle(q[0][2], r); }\r\n function fulfill(value) { resume(\"next\", value); }\r\n function reject(value) { resume(\"throw\", value); }\r\n function settle(f, v) { if (f(v), q.shift(), q.length) resume(q[0][0], q[0][1]); }\r\n}\r\n\r\nexport function __asyncDelegator(o) {\r\n var i, p;\r\n return i = {}, verb(\"next\"), verb(\"throw\", function (e) { throw e; }), verb(\"return\"), i[Symbol.iterator] = function () { return this; }, i;\r\n function verb(n, f) { i[n] = o[n] ? function (v) { return (p = !p) ? { value: __await(o[n](v)), done: n === \"return\" } : f ? f(v) : v; } : f; }\r\n}\r\n\r\nexport function __asyncValues(o) {\r\n if (!Symbol.asyncIterator) throw new TypeError(\"Symbol.asyncIterator is not defined.\");\r\n var m = o[Symbol.asyncIterator], i;\r\n return m ? m.call(o) : (o = typeof __values === \"function\" ? 
__values(o) : o[Symbol.iterator](), i = {}, verb(\"next\"), verb(\"throw\"), verb(\"return\"), i[Symbol.asyncIterator] = function () { return this; }, i);\r\n function verb(n) { i[n] = o[n] && function (v) { return new Promise(function (resolve, reject) { v = o[n](v), settle(resolve, reject, v.done, v.value); }); }; }\r\n function settle(resolve, reject, d, v) { Promise.resolve(v).then(function(v) { resolve({ value: v, done: d }); }, reject); }\r\n}\r\n\r\nexport function __makeTemplateObject(cooked, raw) {\r\n if (Object.defineProperty) { Object.defineProperty(cooked, \"raw\", { value: raw }); } else { cooked.raw = raw; }\r\n return cooked;\r\n};\r\n\r\nvar __setModuleDefault = Object.create ? (function(o, v) {\r\n Object.defineProperty(o, \"default\", { enumerable: true, value: v });\r\n}) : function(o, v) {\r\n o[\"default\"] = v;\r\n};\r\n\r\nexport function __importStar(mod) {\r\n if (mod && mod.__esModule) return mod;\r\n var result = {};\r\n if (mod != null) for (var k in mod) if (k !== \"default\" && Object.prototype.hasOwnProperty.call(mod, k)) __createBinding(result, mod, k);\r\n __setModuleDefault(result, mod);\r\n return result;\r\n}\r\n\r\nexport function __importDefault(mod) {\r\n return (mod && mod.__esModule) ? mod : { default: mod };\r\n}\r\n\r\nexport function __classPrivateFieldGet(receiver, state, kind, f) {\r\n if (kind === \"a\" && !f) throw new TypeError(\"Private accessor was defined without a getter\");\r\n if (typeof state === \"function\" ? receiver !== state || !f : !state.has(receiver)) throw new TypeError(\"Cannot read private member from an object whose class did not declare it\");\r\n return kind === \"m\" ? f : kind === \"a\" ? f.call(receiver) : f ? 
f.value : state.get(receiver);\r\n}\r\n\r\nexport function __classPrivateFieldSet(receiver, state, value, kind, f) {\r\n if (kind === \"m\") throw new TypeError(\"Private method is not writable\");\r\n if (kind === \"a\" && !f) throw new TypeError(\"Private accessor was defined without a setter\");\r\n if (typeof state === \"function\" ? receiver !== state || !f : !state.has(receiver)) throw new TypeError(\"Cannot write private member to an object whose class did not declare it\");\r\n return (kind === \"a\" ? f.call(receiver, value) : f ? f.value = value : state.set(receiver, value)), value;\r\n}\r\n", "/**\n * Returns true if the object is a function.\n * @param value The value to check\n */\nexport function isFunction(value: any): value is (...args: any[]) => any {\n return typeof value === 'function';\n}\n", "/**\n * Used to create Error subclasses until the community moves away from ES5.\n *\n * This is because compiling from TypeScript down to ES5 has issues with subclassing Errors\n * as well as other built-in types: https://github.com/Microsoft/TypeScript/issues/12123\n *\n * @param createImpl A factory function to create the actual constructor implementation. The returned\n * function should be a named function that calls `_super` internally.\n */\nexport function createErrorClass(createImpl: (_super: any) => any): T {\n const _super = (instance: any) => {\n Error.call(instance);\n instance.stack = new Error().stack;\n };\n\n const ctorFunc = createImpl(_super);\n ctorFunc.prototype = Object.create(Error.prototype);\n ctorFunc.prototype.constructor = ctorFunc;\n return ctorFunc;\n}\n", "import { createErrorClass } from './createErrorClass';\n\nexport interface UnsubscriptionError extends Error {\n readonly errors: any[];\n}\n\nexport interface UnsubscriptionErrorCtor {\n /**\n * @deprecated Internal implementation detail. 
Do not construct error instances.\n * Cannot be tagged as internal: https://github.com/ReactiveX/rxjs/issues/6269\n */\n new (errors: any[]): UnsubscriptionError;\n}\n\n/**\n * An error thrown when one or more errors have occurred during the\n * `unsubscribe` of a {@link Subscription}.\n */\nexport const UnsubscriptionError: UnsubscriptionErrorCtor = createErrorClass(\n (_super) =>\n function UnsubscriptionErrorImpl(this: any, errors: (Error | string)[]) {\n _super(this);\n this.message = errors\n ? `${errors.length} errors occurred during unsubscription:\n${errors.map((err, i) => `${i + 1}) ${err.toString()}`).join('\\n ')}`\n : '';\n this.name = 'UnsubscriptionError';\n this.errors = errors;\n }\n);\n", "/**\n * Removes an item from an array, mutating it.\n * @param arr The array to remove the item from\n * @param item The item to remove\n */\nexport function arrRemove(arr: T[] | undefined | null, item: T) {\n if (arr) {\n const index = arr.indexOf(item);\n 0 <= index && arr.splice(index, 1);\n }\n}\n", "import { isFunction } from './util/isFunction';\nimport { UnsubscriptionError } from './util/UnsubscriptionError';\nimport { SubscriptionLike, TeardownLogic, Unsubscribable } from './types';\nimport { arrRemove } from './util/arrRemove';\n\n/**\n * Represents a disposable resource, such as the execution of an Observable. 
A\n * Subscription has one important method, `unsubscribe`, that takes no argument\n * and just disposes the resource held by the subscription.\n *\n * Additionally, subscriptions may be grouped together through the `add()`\n * method, which will attach a child Subscription to the current Subscription.\n * When a Subscription is unsubscribed, all its children (and its grandchildren)\n * will be unsubscribed as well.\n *\n * @class Subscription\n */\nexport class Subscription implements SubscriptionLike {\n /** @nocollapse */\n public static EMPTY = (() => {\n const empty = new Subscription();\n empty.closed = true;\n return empty;\n })();\n\n /**\n * A flag to indicate whether this Subscription has already been unsubscribed.\n */\n public closed = false;\n\n private _parentage: Subscription[] | Subscription | null = null;\n\n /**\n * The list of registered finalizers to execute upon unsubscription. Adding and removing from this\n * list occurs in the {@link #add} and {@link #remove} methods.\n */\n private _finalizers: Exclude<TeardownLogic, void>[] | null = null;\n\n /**\n * @param initialTeardown A function executed first as part of the finalization\n * process that is kicked off when {@link #unsubscribe} is called.\n */\n constructor(private initialTeardown?: () => void) {}\n\n /**\n * Disposes the resources held by the subscription. 
May, for instance, cancel\n * an ongoing Observable execution or cancel any other type of work that\n * started when the Subscription was created.\n * @return {void}\n */\n unsubscribe(): void {\n let errors: any[] | undefined;\n\n if (!this.closed) {\n this.closed = true;\n\n // Remove this from its parents.\n const { _parentage } = this;\n if (_parentage) {\n this._parentage = null;\n if (Array.isArray(_parentage)) {\n for (const parent of _parentage) {\n parent.remove(this);\n }\n } else {\n _parentage.remove(this);\n }\n }\n\n const { initialTeardown: initialFinalizer } = this;\n if (isFunction(initialFinalizer)) {\n try {\n initialFinalizer();\n } catch (e) {\n errors = e instanceof UnsubscriptionError ? e.errors : [e];\n }\n }\n\n const { _finalizers } = this;\n if (_finalizers) {\n this._finalizers = null;\n for (const finalizer of _finalizers) {\n try {\n execFinalizer(finalizer);\n } catch (err) {\n errors = errors ?? [];\n if (err instanceof UnsubscriptionError) {\n errors = [...errors, ...err.errors];\n } else {\n errors.push(err);\n }\n }\n }\n }\n\n if (errors) {\n throw new UnsubscriptionError(errors);\n }\n }\n }\n\n /**\n * Adds a finalizer to this subscription, so that finalization will be unsubscribed/called\n * when this subscription is unsubscribed. If this subscription is already {@link #closed},\n * because it has already been unsubscribed, then whatever finalizer is passed to it\n * will automatically be executed (unless the finalizer itself is also a closed subscription).\n *\n * Closed Subscriptions cannot be added as finalizers to any subscription. Adding a closed\n * subscription to any subscription will result in no operation. (A noop).\n *\n * Adding a subscription to itself, or adding `null` or `undefined` will not perform any\n * operation at all. (A noop).\n *\n * `Subscription` instances that are added to this instance will automatically remove themselves\n * if they are unsubscribed. 
Functions and {@link Unsubscribable} objects that you wish to remove\n * will need to be removed manually with {@link #remove}\n *\n * @param teardown The finalization logic to add to this subscription.\n */\n add(teardown: TeardownLogic): void {\n // Only add the finalizer if it's not undefined\n // and don't add a subscription to itself.\n if (teardown && teardown !== this) {\n if (this.closed) {\n // If this subscription is already closed,\n // execute whatever finalizer is handed to it automatically.\n execFinalizer(teardown);\n } else {\n if (teardown instanceof Subscription) {\n // We don't add closed subscriptions, and we don't add the same subscription\n // twice. Subscription unsubscribe is idempotent.\n if (teardown.closed || teardown._hasParent(this)) {\n return;\n }\n teardown._addParent(this);\n }\n (this._finalizers = this._finalizers ?? []).push(teardown);\n }\n }\n }\n\n /**\n * Checks to see if this subscription already has a particular parent.\n * This will signal that this subscription has already been added to the parent in question.\n * @param parent the parent to check for\n */\n private _hasParent(parent: Subscription) {\n const { _parentage } = this;\n return _parentage === parent || (Array.isArray(_parentage) && _parentage.includes(parent));\n }\n\n /**\n * Adds a parent to this subscription so it can be removed from the parent if it\n * unsubscribes on its own.\n *\n * NOTE: THIS ASSUMES THAT {@link _hasParent} HAS ALREADY BEEN CHECKED.\n * @param parent The parent subscription to add\n */\n private _addParent(parent: Subscription) {\n const { _parentage } = this;\n this._parentage = Array.isArray(_parentage) ? (_parentage.push(parent), _parentage) : _parentage ? 
[_parentage, parent] : parent;\n }\n\n /**\n * Called on a child when it is removed via {@link #remove}.\n * @param parent The parent to remove\n */\n private _removeParent(parent: Subscription) {\n const { _parentage } = this;\n if (_parentage === parent) {\n this._parentage = null;\n } else if (Array.isArray(_parentage)) {\n arrRemove(_parentage, parent);\n }\n }\n\n /**\n * Removes a finalizer from this subscription that was previously added with the {@link #add} method.\n *\n * Note that `Subscription` instances, when unsubscribed, will automatically remove themselves\n * from every other `Subscription` they have been added to. This means that using the `remove` method\n * is not a common thing and should be used thoughtfully.\n *\n * If you add the same finalizer instance of a function or an unsubscribable object to a `Subscription` instance\n * more than once, you will need to call `remove` the same number of times to remove all instances.\n *\n * All finalizer instances are removed to free up memory upon unsubscription.\n *\n * @param teardown The finalizer to remove from this subscription\n */\n remove(teardown: Exclude): void {\n const { _finalizers } = this;\n _finalizers && arrRemove(_finalizers, teardown);\n\n if (teardown instanceof Subscription) {\n teardown._removeParent(this);\n }\n }\n}\n\nexport const EMPTY_SUBSCRIPTION = Subscription.EMPTY;\n\nexport function isSubscription(value: any): value is Subscription {\n return (\n value instanceof Subscription ||\n (value && 'closed' in value && isFunction(value.remove) && isFunction(value.add) && isFunction(value.unsubscribe))\n );\n}\n\nfunction execFinalizer(finalizer: Unsubscribable | (() => void)) {\n if (isFunction(finalizer)) {\n finalizer();\n } else {\n finalizer.unsubscribe();\n }\n}\n", "import { Subscriber } from './Subscriber';\nimport { ObservableNotification } from './types';\n\n/**\n * The {@link GlobalConfig} object for RxJS. 
It is used to configure things\n * like how to react to unhandled errors.\n */\nexport const config: GlobalConfig = {\n onUnhandledError: null,\n onStoppedNotification: null,\n Promise: undefined,\n useDeprecatedSynchronousErrorHandling: false,\n useDeprecatedNextContext: false,\n};\n\n/**\n * The global configuration object for RxJS, used to configure things\n * like how to react to unhandled errors. Accessible via {@link config}\n * object.\n */\nexport interface GlobalConfig {\n /**\n * A registration point for unhandled errors from RxJS. These are errors that\n * were not handled by consuming code in the usual subscription path. For\n * example, if you have this configured, and you subscribe to an observable without\n * providing an error handler, errors from that subscription will end up here. This\n * will _always_ be called asynchronously on another job in the runtime. This is because\n * we do not want errors thrown in this user-configured handler to interfere with the\n * behavior of the library.\n */\n onUnhandledError: ((err: any) => void) | null;\n\n /**\n * A registration point for notifications that cannot be sent to subscribers because they\n * have completed, errored or have been explicitly unsubscribed. By default, next, complete\n * and error notifications sent to stopped subscribers are noops. However, sometimes callers\n * might want a different behavior. For example, with sources that attempt to report errors\n * to stopped subscribers, a caller can configure RxJS to throw an unhandled error instead.\n * This will _always_ be called asynchronously on another job in the runtime. 
This is because\n * we do not want errors thrown in this user-configured handler to interfere with the\n * behavior of the library.\n */\n onStoppedNotification: ((notification: ObservableNotification, subscriber: Subscriber) => void) | null;\n\n /**\n * The promise constructor used by default for {@link Observable#toPromise toPromise} and {@link Observable#forEach forEach}\n * methods.\n *\n * @deprecated As of version 8, RxJS will no longer support this sort of injection of a\n * Promise constructor. If you need a Promise implementation other than native promises,\n * please polyfill/patch Promise as you see appropriate. Will be removed in v8.\n */\n Promise?: PromiseConstructorLike;\n\n /**\n * If true, turns on synchronous error rethrowing, which is a deprecated behavior\n * in v6 and higher. This behavior enables bad patterns like wrapping a subscribe\n * call in a try/catch block. It also enables producer interference, a nasty bug\n * where a multicast can be broken for all observers by a downstream consumer with\n * an unhandled error. DO NOT USE THIS FLAG UNLESS IT'S NEEDED TO BUY TIME\n * FOR MIGRATION REASONS.\n *\n * @deprecated As of version 8, RxJS will no longer support synchronous throwing\n * of unhandled errors. All errors will be thrown on a separate call stack to prevent bad\n * behaviors described above. 
Will be removed in v8.\n */\n useDeprecatedSynchronousErrorHandling: boolean;\n\n /**\n * If true, enables an as-of-yet undocumented feature from v5: The ability to access\n * `unsubscribe()` via `this` context in `next` functions created in observers passed\n * to `subscribe`.\n *\n * This is being removed because the performance was severely problematic, and it could also cause\n * issues when types other than POJOs are passed to subscribe as subscribers, as they will likely have\n * their `this` context overwritten.\n *\n * @deprecated As of version 8, RxJS will no longer support altering the\n * context of next functions provided as part of an observer to Subscribe. Instead,\n * you will have access to a subscription or a signal or token that will allow you to do things like\n * unsubscribe and test closed status. Will be removed in v8.\n */\n useDeprecatedNextContext: boolean;\n}\n", "import type { TimerHandle } from './timerHandle';\ntype SetTimeoutFunction = (handler: () => void, timeout?: number, ...args: any[]) => TimerHandle;\ntype ClearTimeoutFunction = (handle: TimerHandle) => void;\n\ninterface TimeoutProvider {\n setTimeout: SetTimeoutFunction;\n clearTimeout: ClearTimeoutFunction;\n delegate:\n | {\n setTimeout: SetTimeoutFunction;\n clearTimeout: ClearTimeoutFunction;\n }\n | undefined;\n}\n\nexport const timeoutProvider: TimeoutProvider = {\n // When accessing the delegate, use the variable rather than `this` so that\n // the functions can be called without being bound to the provider.\n setTimeout(handler: () => void, timeout?: number, ...args) {\n const { delegate } = timeoutProvider;\n if (delegate?.setTimeout) {\n return delegate.setTimeout(handler, timeout, ...args);\n }\n return setTimeout(handler, timeout, ...args);\n },\n clearTimeout(handle) {\n const { delegate } = timeoutProvider;\n return (delegate?.clearTimeout || clearTimeout)(handle as any);\n },\n delegate: undefined,\n};\n", "import { config } from '../config';\nimport { 
timeoutProvider } from '../scheduler/timeoutProvider';\n\n/**\n * Handles an error on another job either with the user-configured {@link onUnhandledError},\n * or by throwing it on that new job so it can be picked up by `window.onerror`, `process.on('error')`, etc.\n *\n * This should be called whenever there is an error that is out-of-band with the subscription\n * or when an error hits a terminal boundary of the subscription and no error handler was provided.\n *\n * @param err the error to report\n */\nexport function reportUnhandledError(err: any) {\n timeoutProvider.setTimeout(() => {\n const { onUnhandledError } = config;\n if (onUnhandledError) {\n // Execute the user-configured error handler.\n onUnhandledError(err);\n } else {\n // Throw so it is picked up by the runtime's uncaught error mechanism.\n throw err;\n }\n });\n}\n", "/* tslint:disable:no-empty */\nexport function noop() { }\n", "import { CompleteNotification, NextNotification, ErrorNotification } from './types';\n\n/**\n * A completion object optimized for memory use and created to be the\n * same \"shape\" as other notifications in v8.\n * @internal\n */\nexport const COMPLETE_NOTIFICATION = (() => createNotification('C', undefined, undefined) as CompleteNotification)();\n\n/**\n * Internal use only. Creates an optimized error notification that is the same \"shape\"\n * as other notifications.\n * @internal\n */\nexport function errorNotification(error: any): ErrorNotification {\n return createNotification('E', undefined, error) as any;\n}\n\n/**\n * Internal use only. 
Creates an optimized next notification that is the same \"shape\"\n * as other notifications.\n * @internal\n */\nexport function nextNotification(value: T) {\n return createNotification('N', value, undefined) as NextNotification;\n}\n\n/**\n * Ensures that all notifications created internally have the same \"shape\" in v8.\n *\n * TODO: This is only exported to support a crazy legacy test in `groupBy`.\n * @internal\n */\nexport function createNotification(kind: 'N' | 'E' | 'C', value: any, error: any) {\n return {\n kind,\n value,\n error,\n };\n}\n", "import { config } from '../config';\n\nlet context: { errorThrown: boolean; error: any } | null = null;\n\n/**\n * Handles dealing with errors for super-gross mode. Creates a context, in which\n * any synchronously thrown errors will be passed to {@link captureError}. Which\n * will record the error such that it will be rethrown after the call back is complete.\n * TODO: Remove in v8\n * @param cb An immediately executed function.\n */\nexport function errorContext(cb: () => void) {\n if (config.useDeprecatedSynchronousErrorHandling) {\n const isRoot = !context;\n if (isRoot) {\n context = { errorThrown: false, error: null };\n }\n cb();\n if (isRoot) {\n const { errorThrown, error } = context!;\n context = null;\n if (errorThrown) {\n throw error;\n }\n }\n } else {\n // This is the general non-deprecated path for everyone that\n // isn't crazy enough to use super-gross mode (useDeprecatedSynchronousErrorHandling)\n cb();\n }\n}\n\n/**\n * Captures errors only in super-gross mode.\n * @param err the error to capture\n */\nexport function captureError(err: any) {\n if (config.useDeprecatedSynchronousErrorHandling && context) {\n context.errorThrown = true;\n context.error = err;\n }\n}\n", "import { isFunction } from './util/isFunction';\nimport { Observer, ObservableNotification } from './types';\nimport { isSubscription, Subscription } from './Subscription';\nimport { config } from './config';\nimport { 
reportUnhandledError } from './util/reportUnhandledError';\nimport { noop } from './util/noop';\nimport { nextNotification, errorNotification, COMPLETE_NOTIFICATION } from './NotificationFactories';\nimport { timeoutProvider } from './scheduler/timeoutProvider';\nimport { captureError } from './util/errorContext';\n\n/**\n * Implements the {@link Observer} interface and extends the\n * {@link Subscription} class. While the {@link Observer} is the public API for\n * consuming the values of an {@link Observable}, all Observers get converted to\n * a Subscriber, in order to provide Subscription-like capabilities such as\n * `unsubscribe`. Subscriber is a common type in RxJS, and crucial for\n * implementing operators, but it is rarely used as a public API.\n *\n * @class Subscriber\n */\nexport class Subscriber extends Subscription implements Observer {\n /**\n * A static factory for a Subscriber, given a (potentially partial) definition\n * of an Observer.\n * @param next The `next` callback of an Observer.\n * @param error The `error` callback of an\n * Observer.\n * @param complete The `complete` callback of an\n * Observer.\n * @return A Subscriber wrapping the (partially defined)\n * Observer represented by the given arguments.\n * @nocollapse\n * @deprecated Do not use. Will be removed in v8. There is no replacement for this\n * method, and there is no reason to be creating instances of `Subscriber` directly.\n * If you have a specific use case, please file an issue.\n */\n static create(next?: (x?: T) => void, error?: (e?: any) => void, complete?: () => void): Subscriber {\n return new SafeSubscriber(next, error, complete);\n }\n\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. */\n protected isStopped: boolean = false;\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. 
*/\n protected destination: Subscriber | Observer; // this `any` is the escape hatch to erase extra type param (e.g. R)\n\n /**\n * @deprecated Internal implementation detail, do not use directly. Will be made internal in v8.\n * There is no reason to directly create an instance of Subscriber. This type is exported for typings reasons.\n */\n constructor(destination?: Subscriber | Observer) {\n super();\n if (destination) {\n this.destination = destination;\n // Automatically chain subscriptions together here.\n // if destination is a Subscription, then it is a Subscriber.\n if (isSubscription(destination)) {\n destination.add(this);\n }\n } else {\n this.destination = EMPTY_OBSERVER;\n }\n }\n\n /**\n * The {@link Observer} callback to receive notifications of type `next` from\n * the Observable, with a value. The Observable may call this method 0 or more\n * times.\n * @param {T} [value] The `next` value.\n * @return {void}\n */\n next(value?: T): void {\n if (this.isStopped) {\n handleStoppedNotification(nextNotification(value), this);\n } else {\n this._next(value!);\n }\n }\n\n /**\n * The {@link Observer} callback to receive notifications of type `error` from\n * the Observable, with an attached `Error`. Notifies the Observer that\n * the Observable has experienced an error condition.\n * @param {any} [err] The `error` exception.\n * @return {void}\n */\n error(err?: any): void {\n if (this.isStopped) {\n handleStoppedNotification(errorNotification(err), this);\n } else {\n this.isStopped = true;\n this._error(err);\n }\n }\n\n /**\n * The {@link Observer} callback to receive a valueless notification of type\n * `complete` from the Observable. 
Notifies the Observer that the Observable\n * has finished sending push-based notifications.\n * @return {void}\n */\n complete(): void {\n if (this.isStopped) {\n handleStoppedNotification(COMPLETE_NOTIFICATION, this);\n } else {\n this.isStopped = true;\n this._complete();\n }\n }\n\n unsubscribe(): void {\n if (!this.closed) {\n this.isStopped = true;\n super.unsubscribe();\n this.destination = null!;\n }\n }\n\n protected _next(value: T): void {\n this.destination.next(value);\n }\n\n protected _error(err: any): void {\n try {\n this.destination.error(err);\n } finally {\n this.unsubscribe();\n }\n }\n\n protected _complete(): void {\n try {\n this.destination.complete();\n } finally {\n this.unsubscribe();\n }\n }\n}\n\n/**\n * This bind is captured here because we want to be able to have\n * compatibility with monoid libraries that tend to use a method named\n * `bind`. In particular, a library called Monio requires this.\n */\nconst _bind = Function.prototype.bind;\n\nfunction bind any>(fn: Fn, thisArg: any): Fn {\n return _bind.call(fn, thisArg);\n}\n\n/**\n * Internal optimization only, DO NOT EXPOSE.\n * @internal\n */\nclass ConsumerObserver implements Observer {\n constructor(private partialObserver: Partial>) {}\n\n next(value: T): void {\n const { partialObserver } = this;\n if (partialObserver.next) {\n try {\n partialObserver.next(value);\n } catch (error) {\n handleUnhandledError(error);\n }\n }\n }\n\n error(err: any): void {\n const { partialObserver } = this;\n if (partialObserver.error) {\n try {\n partialObserver.error(err);\n } catch (error) {\n handleUnhandledError(error);\n }\n } else {\n handleUnhandledError(err);\n }\n }\n\n complete(): void {\n const { partialObserver } = this;\n if (partialObserver.complete) {\n try {\n partialObserver.complete();\n } catch (error) {\n handleUnhandledError(error);\n }\n }\n }\n}\n\nexport class SafeSubscriber extends Subscriber {\n constructor(\n observerOrNext?: Partial> | ((value: T) => void) | 
null,\n error?: ((e?: any) => void) | null,\n complete?: (() => void) | null\n ) {\n super();\n\n let partialObserver: Partial>;\n if (isFunction(observerOrNext) || !observerOrNext) {\n // The first argument is a function, not an observer. The next\n // two arguments *could* be observers, or they could be empty.\n partialObserver = {\n next: (observerOrNext ?? undefined) as (((value: T) => void) | undefined),\n error: error ?? undefined,\n complete: complete ?? undefined,\n };\n } else {\n // The first argument is a partial observer.\n let context: any;\n if (this && config.useDeprecatedNextContext) {\n // This is a deprecated path that made `this.unsubscribe()` available in\n // next handler functions passed to subscribe. This only exists behind a flag\n // now, as it is *very* slow.\n context = Object.create(observerOrNext);\n context.unsubscribe = () => this.unsubscribe();\n partialObserver = {\n next: observerOrNext.next && bind(observerOrNext.next, context),\n error: observerOrNext.error && bind(observerOrNext.error, context),\n complete: observerOrNext.complete && bind(observerOrNext.complete, context),\n };\n } else {\n // The \"normal\" path. 
Just use the partial observer directly.\n partialObserver = observerOrNext;\n }\n }\n\n // Wrap the partial observer to ensure it's a full observer, and\n // make sure proper error handling is accounted for.\n this.destination = new ConsumerObserver(partialObserver);\n }\n}\n\nfunction handleUnhandledError(error: any) {\n if (config.useDeprecatedSynchronousErrorHandling) {\n captureError(error);\n } else {\n // Ideal path, we report this as an unhandled error,\n // which is thrown on a new call stack.\n reportUnhandledError(error);\n }\n}\n\n/**\n * An error handler used when no error handler was supplied\n * to the SafeSubscriber -- meaning no error handler was supplied\n * to the `subscribe` call on our observable.\n * @param err The error to handle\n */\nfunction defaultErrorHandler(err: any) {\n throw err;\n}\n\n/**\n * A handler for notifications that cannot be sent to a stopped subscriber.\n * @param notification The notification being sent\n * @param subscriber The stopped subscriber\n */\nfunction handleStoppedNotification(notification: ObservableNotification, subscriber: Subscriber) {\n const { onStoppedNotification } = config;\n onStoppedNotification && timeoutProvider.setTimeout(() => onStoppedNotification(notification, subscriber));\n}\n\n/**\n * The observer used as a stub for subscriptions where the user did not\n * pass any arguments to `subscribe`. Comes with the default error handling\n * behavior.\n */\nexport const EMPTY_OBSERVER: Readonly<Observer<any>> & { closed: true } = {\n closed: true,\n next: noop,\n error: defaultErrorHandler,\n complete: noop,\n};\n", "/**\n * Symbol.observable or a string \"@@observable\". 
Used for interop\n *\n * @deprecated We will no longer be exporting this symbol in upcoming versions of RxJS.\n * Instead polyfill and use Symbol.observable directly *or* use https://www.npmjs.com/package/symbol-observable\n */\nexport const observable: string | symbol = (() => (typeof Symbol === 'function' && Symbol.observable) || '@@observable')();\n", "/**\n * This function takes one parameter and just returns it. Simply put,\n * this is like `(x: T): T => x`.\n *\n * ## Examples\n *\n * This is useful in some cases when using things like `mergeMap`\n *\n * ```ts\n * import { interval, take, map, range, mergeMap, identity } from 'rxjs';\n *\n * const source$ = interval(1000).pipe(take(5));\n *\n * const result$ = source$.pipe(\n * map(i => range(i)),\n * mergeMap(identity) // same as mergeMap(x => x)\n * );\n *\n * result$.subscribe({\n * next: console.log\n * });\n * ```\n *\n * Or when you want to selectively apply an operator\n *\n * ```ts\n * import { interval, take, identity } from 'rxjs';\n *\n * const shouldLimit = () => Math.random() < 0.5;\n *\n * const source$ = interval(1000);\n *\n * const result$ = source$.pipe(shouldLimit() ? 
take(5) : identity);\n *\n * result$.subscribe({\n * next: console.log\n * });\n * ```\n *\n * @param x Any value that is returned by this function\n * @returns The value passed as the first parameter to this function\n */\nexport function identity(x: T): T {\n return x;\n}\n", "import { identity } from './identity';\nimport { UnaryFunction } from '../types';\n\nexport function pipe(): typeof identity;\nexport function pipe(fn1: UnaryFunction): UnaryFunction;\nexport function pipe(fn1: UnaryFunction, fn2: UnaryFunction): UnaryFunction;\nexport function pipe(fn1: UnaryFunction, fn2: UnaryFunction, fn3: UnaryFunction): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction\n): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction,\n fn5: UnaryFunction\n): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction,\n fn5: UnaryFunction,\n fn6: UnaryFunction\n): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction,\n fn5: UnaryFunction,\n fn6: UnaryFunction,\n fn7: UnaryFunction\n): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction,\n fn5: UnaryFunction,\n fn6: UnaryFunction,\n fn7: UnaryFunction,\n fn8: UnaryFunction\n): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction,\n fn5: UnaryFunction,\n fn6: UnaryFunction,\n fn7: UnaryFunction,\n fn8: UnaryFunction,\n fn9: UnaryFunction\n): UnaryFunction;\nexport function pipe(\n fn1: UnaryFunction,\n fn2: UnaryFunction,\n fn3: UnaryFunction,\n fn4: UnaryFunction,\n fn5: UnaryFunction,\n fn6: UnaryFunction,\n fn7: UnaryFunction,\n fn8: UnaryFunction,\n fn9: UnaryFunction,\n ...fns: UnaryFunction[]\n): 
UnaryFunction;\n\n/**\n * pipe() can be called on one or more functions, each of which can take one argument (\"UnaryFunction\")\n * and uses it to return a value.\n * It returns a function that takes one argument, passes it to the first UnaryFunction, and then\n * passes the result to the next one, passes that result to the next one, and so on. \n */\nexport function pipe(...fns: Array>): UnaryFunction {\n return pipeFromArray(fns);\n}\n\n/** @internal */\nexport function pipeFromArray(fns: Array>): UnaryFunction {\n if (fns.length === 0) {\n return identity as UnaryFunction;\n }\n\n if (fns.length === 1) {\n return fns[0];\n }\n\n return function piped(input: T): R {\n return fns.reduce((prev: any, fn: UnaryFunction) => fn(prev), input as any);\n };\n}\n", "import { Operator } from './Operator';\nimport { SafeSubscriber, Subscriber } from './Subscriber';\nimport { isSubscription, Subscription } from './Subscription';\nimport { TeardownLogic, OperatorFunction, Subscribable, Observer } from './types';\nimport { observable as Symbol_observable } from './symbol/observable';\nimport { pipeFromArray } from './util/pipe';\nimport { config } from './config';\nimport { isFunction } from './util/isFunction';\nimport { errorContext } from './util/errorContext';\n\n/**\n * A representation of any set of values over any amount of time. This is the most basic building block\n * of RxJS.\n *\n * @class Observable\n */\nexport class Observable implements Subscribable {\n /**\n * @deprecated Internal implementation detail, do not use directly. Will be made internal in v8.\n */\n source: Observable | undefined;\n\n /**\n * @deprecated Internal implementation detail, do not use directly. Will be made internal in v8.\n */\n operator: Operator | undefined;\n\n /**\n * @constructor\n * @param {Function} subscribe the function that is called when the Observable is\n * initially subscribed to. 
This function is given a Subscriber, to which new values\n * can be `next`ed, or an `error` method can be called to raise an error, or\n * `complete` can be called to notify of a successful completion.\n */\n constructor(subscribe?: (this: Observable, subscriber: Subscriber) => TeardownLogic) {\n if (subscribe) {\n this._subscribe = subscribe;\n }\n }\n\n // HACK: Since TypeScript inherits static properties too, we have to\n // fight against TypeScript here so Subject can have a different static create signature\n /**\n * Creates a new Observable by calling the Observable constructor\n * @owner Observable\n * @method create\n * @param {Function} subscribe? the subscriber function to be passed to the Observable constructor\n * @return {Observable} a new observable\n * @nocollapse\n * @deprecated Use `new Observable()` instead. Will be removed in v8.\n */\n static create: (...args: any[]) => any = (subscribe?: (subscriber: Subscriber) => TeardownLogic) => {\n return new Observable(subscribe);\n };\n\n /**\n * Creates a new Observable, with this Observable instance as the source, and the passed\n * operator defined as the new observable's operator.\n * @method lift\n * @param operator the operator defining the operation to take on the observable\n * @return a new observable with the Operator applied\n * @deprecated Internal implementation detail, do not use directly. Will be made internal in v8.\n * If you have implemented an operator using `lift`, it is recommended that you create an\n * operator by simply returning `new Observable()` directly. 
See \"Creating new operators from\n * scratch\" section here: https://rxjs.dev/guide/operators\n */\n lift(operator?: Operator): Observable {\n const observable = new Observable();\n observable.source = this;\n observable.operator = operator;\n return observable;\n }\n\n subscribe(observerOrNext?: Partial<Observer<T>> | ((value: T) => void)): Subscription;\n /** @deprecated Instead of passing separate callback arguments, use an observer argument. Signatures taking separate callback arguments will be removed in v8. Details: https://rxjs.dev/deprecations/subscribe-arguments */\n subscribe(next?: ((value: T) => void) | null, error?: ((error: any) => void) | null, complete?: (() => void) | null): Subscription;\n /**\n * Invokes an execution of an Observable and registers Observer handlers for notifications it will emit.\n *\n * Use it when you have all these Observables, but still nothing is happening.\n *\n * `subscribe` is not a regular operator, but a method that calls Observable's internal `subscribe` function. It\n * might be for example a function that you passed to Observable's constructor, but most of the time it is\n * a library implementation, which defines what will be emitted by an Observable, and when it will be emitted. This means\n * that calling `subscribe` is actually the moment when Observable starts its work, not when it is created, as is often\n * thought.\n *\n * Apart from starting the execution of an Observable, this method allows you to listen for values\n * that an Observable emits, as well as for when it completes or errors. You can achieve this in either\n * of the following two ways.\n *\n * The first way is creating an object that implements {@link Observer} interface. It should have methods\n * defined by that interface, but note that it should be just a regular JavaScript object, which you can create\n * yourself in any way you want (ES6 class, classic function constructor, object literal etc.). 
In particular, do\n * not attempt to use any RxJS implementation details to create Observers - you don't need them. Remember also\n * that your object does not have to implement all methods. If you find yourself creating a method that doesn't\n * do anything, you can simply omit it. Note however, if the `error` method is not provided and an error happens,\n * it will be thrown asynchronously. Errors thrown asynchronously cannot be caught using `try`/`catch`. Instead,\n * use the {@link onUnhandledError} configuration option or use a runtime handler (like `window.onerror` or\n * `process.on('error')`) to be notified of unhandled errors. Because of this, it's recommended that you provide\n * an `error` method to avoid missing thrown errors.\n *\n * The second way is to give up on the Observer object altogether and simply provide callback functions in place of its methods.\n * This means you can provide three functions as arguments to `subscribe`, where the first function is equivalent\n * of a `next` method, the second of an `error` method and the third of a `complete` method. Just as in case of an Observer,\n * if you do not need to listen for something, you can omit a function by passing `undefined` or `null`,\n * since `subscribe` recognizes these functions by where they were placed in the function call. When it comes\n * to the `error` function, as with an Observer, if not provided, errors emitted by an Observable will be thrown asynchronously.\n *\n * You can, however, subscribe with no parameters at all. This may be the case where you're not interested in terminal events\n * and you also handled emissions internally by using operators (e.g. using `tap`).\n *\n * Whichever style of calling `subscribe` you use, in both cases it returns a Subscription object.\n * This object allows you to call `unsubscribe` on it, which in turn will stop the work that an Observable does and will clean\n * up all resources that an Observable used. 
Note that cancelling a subscription will not call `complete` callback\n * provided to `subscribe` function, which is reserved for a regular completion signal that comes from an Observable.\n *\n * Remember that callbacks provided to `subscribe` are not guaranteed to be called asynchronously.\n * It is an Observable itself that decides when these functions will be called. For example {@link of}\n * by default emits all its values synchronously. Always check documentation for how given Observable\n * will behave when subscribed and if its default behavior can be modified with a `scheduler`.\n *\n * #### Examples\n *\n * Subscribe with an {@link guide/observer Observer}\n *\n * ```ts\n * import { of } from 'rxjs';\n *\n * const sumObserver = {\n * sum: 0,\n * next(value) {\n * console.log('Adding: ' + value);\n * this.sum = this.sum + value;\n * },\n * error() {\n * // We actually could just remove this method,\n * // since we do not really care about errors right now.\n * },\n * complete() {\n * console.log('Sum equals: ' + this.sum);\n * }\n * };\n *\n * of(1, 2, 3) // Synchronously emits 1, 2, 3 and then completes.\n * .subscribe(sumObserver);\n *\n * // Logs:\n * // 'Adding: 1'\n * // 'Adding: 2'\n * // 'Adding: 3'\n * // 'Sum equals: 6'\n * ```\n *\n * Subscribe with functions ({@link deprecations/subscribe-arguments deprecated})\n *\n * ```ts\n * import { of } from 'rxjs'\n *\n * let sum = 0;\n *\n * of(1, 2, 3).subscribe(\n * value => {\n * console.log('Adding: ' + value);\n * sum = sum + value;\n * },\n * undefined,\n * () => console.log('Sum equals: ' + sum)\n * );\n *\n * // Logs:\n * // 'Adding: 1'\n * // 'Adding: 2'\n * // 'Adding: 3'\n * // 'Sum equals: 6'\n * ```\n *\n * Cancel a subscription\n *\n * ```ts\n * import { interval } from 'rxjs';\n *\n * const subscription = interval(1000).subscribe({\n * next(num) {\n * console.log(num)\n * },\n * complete() {\n * // Will not be called, even when cancelling subscription.\n * console.log('completed!');\n * 
}\n * });\n *\n * setTimeout(() => {\n * subscription.unsubscribe();\n * console.log('unsubscribed!');\n * }, 2500);\n *\n * // Logs:\n * // 0 after 1s\n * // 1 after 2s\n * // 'unsubscribed!' after 2.5s\n * ```\n *\n * @param {Observer|Function} observerOrNext (optional) Either an observer with methods to be called,\n * or the first of three possible handlers, which is the handler for each value emitted from the subscribed\n * Observable.\n * @param {Function} error (optional) A handler for a terminal event resulting from an error. If no error handler is provided,\n * the error will be thrown asynchronously as unhandled.\n * @param {Function} complete (optional) A handler for a terminal event resulting from successful completion.\n * @return {Subscription} a subscription reference to the registered handlers\n * @method subscribe\n */\n subscribe(\n observerOrNext?: Partial> | ((value: T) => void) | null,\n error?: ((error: any) => void) | null,\n complete?: (() => void) | null\n ): Subscription {\n const subscriber = isSubscriber(observerOrNext) ? observerOrNext : new SafeSubscriber(observerOrNext, error, complete);\n\n errorContext(() => {\n const { operator, source } = this;\n subscriber.add(\n operator\n ? // We're dealing with a subscription in the\n // operator chain to one of our lifted operators.\n operator.call(subscriber, source)\n : source\n ? // If `source` has a value, but `operator` does not, something that\n // had intimate knowledge of our API, like our `Subject`, must have\n // set it. 
We're going to just call `_subscribe` directly.\n this._subscribe(subscriber)\n : // In all other cases, we're likely wrapping a user-provided initializer\n // function, so we need to catch errors and handle them appropriately.\n this._trySubscribe(subscriber)\n );\n });\n\n return subscriber;\n }\n\n /** @internal */\n protected _trySubscribe(sink: Subscriber): TeardownLogic {\n try {\n return this._subscribe(sink);\n } catch (err) {\n // We don't need to return anything in this case,\n // because it's just going to try to `add()` to a subscription\n // above.\n sink.error(err);\n }\n }\n\n /**\n * Used as a NON-CANCELLABLE means of subscribing to an observable, for use with\n * APIs that expect promises, like `async/await`. You cannot unsubscribe from this.\n *\n * **WARNING**: Only use this with observables you *know* will complete. If the source\n * observable does not complete, you will end up with a promise that is hung up, and\n * potentially all of the state of an async function hanging out in memory. 
To avoid\n * this situation, look into adding something like {@link timeout}, {@link take},\n * {@link takeWhile}, or {@link takeUntil} amongst others.\n *\n * #### Example\n *\n * ```ts\n * import { interval, take } from 'rxjs';\n *\n * const source$ = interval(1000).pipe(take(4));\n *\n * async function getTotal() {\n * let total = 0;\n *\n * await source$.forEach(value => {\n * total += value;\n * console.log('observable -> ' + value);\n * });\n *\n * return total;\n * }\n *\n * getTotal().then(\n * total => console.log('Total: ' + total)\n * );\n *\n * // Expected:\n * // 'observable -> 0'\n * // 'observable -> 1'\n * // 'observable -> 2'\n * // 'observable -> 3'\n * // 'Total: 6'\n * ```\n *\n * @param next a handler for each value emitted by the observable\n * @return a promise that either resolves on observable completion or\n * rejects with the handled error\n */\n forEach(next: (value: T) => void): Promise;\n\n /**\n * @param next a handler for each value emitted by the observable\n * @param promiseCtor a constructor function used to instantiate the Promise\n * @return a promise that either resolves on observable completion or\n * rejects with the handled error\n * @deprecated Passing a Promise constructor will no longer be available\n * in upcoming versions of RxJS. This is because it adds weight to the library, for very\n * little benefit. If you need this functionality, it is recommended that you either\n * polyfill Promise, or you create an adapter to convert the returned native promise\n * to whatever promise implementation you wanted. 
Will be removed in v8.\n */\n forEach(next: (value: T) => void, promiseCtor: PromiseConstructorLike): Promise;\n\n forEach(next: (value: T) => void, promiseCtor?: PromiseConstructorLike): Promise {\n promiseCtor = getPromiseCtor(promiseCtor);\n\n return new promiseCtor((resolve, reject) => {\n const subscriber = new SafeSubscriber({\n next: (value) => {\n try {\n next(value);\n } catch (err) {\n reject(err);\n subscriber.unsubscribe();\n }\n },\n error: reject,\n complete: resolve,\n });\n this.subscribe(subscriber);\n }) as Promise;\n }\n\n /** @internal */\n protected _subscribe(subscriber: Subscriber): TeardownLogic {\n return this.source?.subscribe(subscriber);\n }\n\n /**\n * An interop point defined by the es7-observable spec https://github.com/zenparsing/es-observable\n * @method Symbol.observable\n * @return {Observable} this instance of the observable\n */\n [Symbol_observable]() {\n return this;\n }\n\n /* tslint:disable:max-line-length */\n pipe(): Observable;\n pipe(op1: OperatorFunction): Observable;\n pipe(op1: OperatorFunction, op2: OperatorFunction): Observable;\n pipe(op1: OperatorFunction, op2: OperatorFunction, op3: OperatorFunction): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction\n ): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction,\n op5: OperatorFunction\n ): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction,\n op5: OperatorFunction,\n op6: OperatorFunction\n ): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction,\n op5: OperatorFunction,\n op6: OperatorFunction,\n op7: OperatorFunction\n ): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction,\n op5: OperatorFunction,\n op6: OperatorFunction,\n op7: 
OperatorFunction,\n op8: OperatorFunction\n ): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction,\n op5: OperatorFunction,\n op6: OperatorFunction,\n op7: OperatorFunction,\n op8: OperatorFunction,\n op9: OperatorFunction\n ): Observable;\n pipe(\n op1: OperatorFunction,\n op2: OperatorFunction,\n op3: OperatorFunction,\n op4: OperatorFunction,\n op5: OperatorFunction,\n op6: OperatorFunction,\n op7: OperatorFunction,\n op8: OperatorFunction,\n op9: OperatorFunction,\n ...operations: OperatorFunction[]\n ): Observable;\n /* tslint:enable:max-line-length */\n\n /**\n * Used to stitch together functional operators into a chain.\n * @method pipe\n * @return {Observable} the Observable result of all of the operators having\n * been called in the order they were passed in.\n *\n * ## Example\n *\n * ```ts\n * import { interval, filter, map, scan } from 'rxjs';\n *\n * interval(1000)\n * .pipe(\n * filter(x => x % 2 === 0),\n * map(x => x + x),\n * scan((acc, x) => acc + x)\n * )\n * .subscribe(x => console.log(x));\n * ```\n */\n pipe(...operations: OperatorFunction[]): Observable {\n return pipeFromArray(operations)(this);\n }\n\n /* tslint:disable:max-line-length */\n /** @deprecated Replaced with {@link firstValueFrom} and {@link lastValueFrom}. Will be removed in v8. Details: https://rxjs.dev/deprecations/to-promise */\n toPromise(): Promise;\n /** @deprecated Replaced with {@link firstValueFrom} and {@link lastValueFrom}. Will be removed in v8. Details: https://rxjs.dev/deprecations/to-promise */\n toPromise(PromiseCtor: typeof Promise): Promise;\n /** @deprecated Replaced with {@link firstValueFrom} and {@link lastValueFrom}. Will be removed in v8. 
Details: https://rxjs.dev/deprecations/to-promise */\n toPromise(PromiseCtor: PromiseConstructorLike): Promise;\n /* tslint:enable:max-line-length */\n\n /**\n * Subscribe to this Observable and get a Promise resolving on\n * `complete` with the last emission (if any).\n *\n * **WARNING**: Only use this with observables you *know* will complete. If the source\n * observable does not complete, you will end up with a promise that is hung up, and\n * potentially all of the state of an async function hanging out in memory. To avoid\n * this situation, look into adding something like {@link timeout}, {@link take},\n * {@link takeWhile}, or {@link takeUntil} amongst others.\n *\n * @method toPromise\n * @param [promiseCtor] a constructor function used to instantiate\n * the Promise\n * @return A Promise that resolves with the last value emitted, or\n * rejects on an error. If there were no emissions, the Promise\n * resolves with undefined.\n * @deprecated Replaced with {@link firstValueFrom} and {@link lastValueFrom}. Will be removed in v8. Details: https://rxjs.dev/deprecations/to-promise\n */\n toPromise(promiseCtor?: PromiseConstructorLike): Promise {\n promiseCtor = getPromiseCtor(promiseCtor);\n\n return new promiseCtor((resolve, reject) => {\n let value: T | undefined;\n this.subscribe(\n (x: T) => (value = x),\n (err: any) => reject(err),\n () => resolve(value)\n );\n }) as Promise;\n }\n}\n\n/**\n * Decides between a passed promise constructor from consuming code,\n * a default configured promise constructor, and the native promise\n * constructor and returns it. If nothing can be found, it will throw\n * an error.\n * @param promiseCtor The optional promise constructor to be passed by consuming code\n */\nfunction getPromiseCtor(promiseCtor: PromiseConstructorLike | undefined) {\n return promiseCtor ?? config.Promise ?? 
Promise;\n}\n\nfunction isObserver(value: any): value is Observer {\n return value && isFunction(value.next) && isFunction(value.error) && isFunction(value.complete);\n}\n\nfunction isSubscriber(value: any): value is Subscriber {\n return (value && value instanceof Subscriber) || (isObserver(value) && isSubscription(value));\n}\n", "import { Observable } from '../Observable';\nimport { Subscriber } from '../Subscriber';\nimport { OperatorFunction } from '../types';\nimport { isFunction } from './isFunction';\n\n/**\n * Used to determine if an object is an Observable with a lift function.\n */\nexport function hasLift(source: any): source is { lift: InstanceType['lift'] } {\n return isFunction(source?.lift);\n}\n\n/**\n * Creates an `OperatorFunction`. Used to define operators throughout the library in a concise way.\n * @param init The logic to connect the liftedSource to the subscriber at the moment of subscription.\n */\nexport function operate(\n init: (liftedSource: Observable, subscriber: Subscriber) => (() => void) | void\n): OperatorFunction {\n return (source: Observable) => {\n if (hasLift(source)) {\n return source.lift(function (this: Subscriber, liftedSource: Observable) {\n try {\n return init(liftedSource, this);\n } catch (err) {\n this.error(err);\n }\n });\n }\n throw new TypeError('Unable to lift unknown Observable type');\n };\n}\n", "import { Subscriber } from '../Subscriber';\n\n/**\n * Creates an instance of an `OperatorSubscriber`.\n * @param destination The downstream subscriber.\n * @param onNext Handles next values, only called if this subscriber is not stopped or closed. Any\n * error that occurs in this function is caught and sent to the `error` method of this subscriber.\n * @param onError Handles errors from the subscription, any errors that occur in this handler are caught\n * and send to the `destination` error handler.\n * @param onComplete Handles completion notification from the subscription. 
Any errors that occur in\n * this handler are sent to the `destination` error handler.\n * @param onFinalize Additional teardown logic here. This will only be called on teardown if the\n * subscriber itself is not already closed. This is called after all other teardown logic is executed.\n */\nexport function createOperatorSubscriber(\n destination: Subscriber,\n onNext?: (value: T) => void,\n onComplete?: () => void,\n onError?: (err: any) => void,\n onFinalize?: () => void\n): Subscriber {\n return new OperatorSubscriber(destination, onNext, onComplete, onError, onFinalize);\n}\n\n/**\n * A generic helper for allowing operators to be created with a Subscriber and\n * use closures to capture necessary state from the operator function itself.\n */\nexport class OperatorSubscriber extends Subscriber {\n /**\n * Creates an instance of an `OperatorSubscriber`.\n * @param destination The downstream subscriber.\n * @param onNext Handles next values, only called if this subscriber is not stopped or closed. Any\n * error that occurs in this function is caught and sent to the `error` method of this subscriber.\n * @param onError Handles errors from the subscription. Any errors that occur in this handler are caught\n * and sent to the `destination` error handler.\n * @param onComplete Handles completion notification from the subscription. Any errors that occur in\n * this handler are sent to the `destination` error handler.\n * @param onFinalize Additional finalization logic here. This will only be called on finalization if the\n * subscriber itself is not already closed. 
This is called after all other finalization logic is executed.\n * @param shouldUnsubscribe An optional check to see if an unsubscribe call should truly unsubscribe.\n * NOTE: This currently **ONLY** exists to support the strange behavior of {@link groupBy}, where unsubscription\n * to the resulting observable does not actually disconnect from the source if there are active subscriptions\n * to any grouped observable. (DO NOT EXPOSE OR USE EXTERNALLY!!!)\n */\n constructor(\n destination: Subscriber,\n onNext?: (value: T) => void,\n onComplete?: () => void,\n onError?: (err: any) => void,\n private onFinalize?: () => void,\n private shouldUnsubscribe?: () => boolean\n ) {\n // It's important - for performance reasons - that all of this class's\n // members are initialized and that they are always initialized in the same\n // order. This will ensure that all OperatorSubscriber instances have the\n // same hidden class in V8. This, in turn, will help keep the number of\n // hidden classes involved in property accesses within the base class as\n // low as possible. If the number of hidden classes involved exceeds four,\n // the property accesses will become megamorphic and performance penalties\n // will be incurred - i.e. inline caches won't be used.\n //\n // The reasons for ensuring all instances have the same hidden class are\n // further discussed in this blog post from Benedikt Meurer:\n // https://benediktmeurer.de/2018/03/23/impact-of-polymorphism-on-component-based-frameworks-like-react/\n super(destination);\n this._next = onNext\n ? function (this: OperatorSubscriber, value: T) {\n try {\n onNext(value);\n } catch (err) {\n destination.error(err);\n }\n }\n : super._next;\n this._error = onError\n ? 
function (this: OperatorSubscriber, err: any) {\n try {\n onError(err);\n } catch (err) {\n // Send any errors that occur down stream.\n destination.error(err);\n } finally {\n // Ensure finalization.\n this.unsubscribe();\n }\n }\n : super._error;\n this._complete = onComplete\n ? function (this: OperatorSubscriber) {\n try {\n onComplete();\n } catch (err) {\n // Send any errors that occur down stream.\n destination.error(err);\n } finally {\n // Ensure finalization.\n this.unsubscribe();\n }\n }\n : super._complete;\n }\n\n unsubscribe() {\n if (!this.shouldUnsubscribe || this.shouldUnsubscribe()) {\n const { closed } = this;\n super.unsubscribe();\n // Execute additional teardown if we have any and we didn't already do so.\n !closed && this.onFinalize?.();\n }\n }\n}\n", "import { Subscription } from '../Subscription';\n\ninterface AnimationFrameProvider {\n schedule(callback: FrameRequestCallback): Subscription;\n requestAnimationFrame: typeof requestAnimationFrame;\n cancelAnimationFrame: typeof cancelAnimationFrame;\n delegate:\n | {\n requestAnimationFrame: typeof requestAnimationFrame;\n cancelAnimationFrame: typeof cancelAnimationFrame;\n }\n | undefined;\n}\n\nexport const animationFrameProvider: AnimationFrameProvider = {\n // When accessing the delegate, use the variable rather than `this` so that\n // the functions can be called without being bound to the provider.\n schedule(callback) {\n let request = requestAnimationFrame;\n let cancel: typeof cancelAnimationFrame | undefined = cancelAnimationFrame;\n const { delegate } = animationFrameProvider;\n if (delegate) {\n request = delegate.requestAnimationFrame;\n cancel = delegate.cancelAnimationFrame;\n }\n const handle = request((timestamp) => {\n // Clear the cancel function. 
The request has been fulfilled, so\n // attempting to cancel the request upon unsubscription would be\n // pointless.\n cancel = undefined;\n callback(timestamp);\n });\n return new Subscription(() => cancel?.(handle));\n },\n requestAnimationFrame(...args) {\n const { delegate } = animationFrameProvider;\n return (delegate?.requestAnimationFrame || requestAnimationFrame)(...args);\n },\n cancelAnimationFrame(...args) {\n const { delegate } = animationFrameProvider;\n return (delegate?.cancelAnimationFrame || cancelAnimationFrame)(...args);\n },\n delegate: undefined,\n};\n", "import { createErrorClass } from './createErrorClass';\n\nexport interface ObjectUnsubscribedError extends Error {}\n\nexport interface ObjectUnsubscribedErrorCtor {\n /**\n * @deprecated Internal implementation detail. Do not construct error instances.\n * Cannot be tagged as internal: https://github.com/ReactiveX/rxjs/issues/6269\n */\n new (): ObjectUnsubscribedError;\n}\n\n/**\n * An error thrown when an action is invalid because the object has been\n * unsubscribed.\n *\n * @see {@link Subject}\n * @see {@link BehaviorSubject}\n *\n * @class ObjectUnsubscribedError\n */\nexport const ObjectUnsubscribedError: ObjectUnsubscribedErrorCtor = createErrorClass(\n (_super) =>\n function ObjectUnsubscribedErrorImpl(this: any) {\n _super(this);\n this.name = 'ObjectUnsubscribedError';\n this.message = 'object unsubscribed';\n }\n);\n", "import { Operator } from './Operator';\nimport { Observable } from './Observable';\nimport { Subscriber } from './Subscriber';\nimport { Subscription, EMPTY_SUBSCRIPTION } from './Subscription';\nimport { Observer, SubscriptionLike, TeardownLogic } from './types';\nimport { ObjectUnsubscribedError } from './util/ObjectUnsubscribedError';\nimport { arrRemove } from './util/arrRemove';\nimport { errorContext } from './util/errorContext';\n\n/**\n * A Subject is a special type of Observable that allows values to be\n * multicasted to many Observers. 
Subjects are like EventEmitters.\n *\n * Every Subject is an Observable and an Observer. You can subscribe to a\n * Subject, and you can call next to feed values as well as error and complete.\n */\nexport class Subject extends Observable implements SubscriptionLike {\n closed = false;\n\n private currentObservers: Observer[] | null = null;\n\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. */\n observers: Observer[] = [];\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. */\n isStopped = false;\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. */\n hasError = false;\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. */\n thrownError: any = null;\n\n /**\n * Creates a \"subject\" by basically gluing an observer to an observable.\n *\n * @nocollapse\n * @deprecated Recommended you do not use. Will be removed at some point in the future. Plans for replacement still under discussion.\n */\n static create: (...args: any[]) => any = (destination: Observer, source: Observable): AnonymousSubject => {\n return new AnonymousSubject(destination, source);\n };\n\n constructor() {\n // NOTE: This must be here to obscure Observable's constructor.\n super();\n }\n\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. 
*/\n lift(operator: Operator): Observable {\n const subject = new AnonymousSubject(this, this);\n subject.operator = operator as any;\n return subject as any;\n }\n\n /** @internal */\n protected _throwIfClosed() {\n if (this.closed) {\n throw new ObjectUnsubscribedError();\n }\n }\n\n next(value: T) {\n errorContext(() => {\n this._throwIfClosed();\n if (!this.isStopped) {\n if (!this.currentObservers) {\n this.currentObservers = Array.from(this.observers);\n }\n for (const observer of this.currentObservers) {\n observer.next(value);\n }\n }\n });\n }\n\n error(err: any) {\n errorContext(() => {\n this._throwIfClosed();\n if (!this.isStopped) {\n this.hasError = this.isStopped = true;\n this.thrownError = err;\n const { observers } = this;\n while (observers.length) {\n observers.shift()!.error(err);\n }\n }\n });\n }\n\n complete() {\n errorContext(() => {\n this._throwIfClosed();\n if (!this.isStopped) {\n this.isStopped = true;\n const { observers } = this;\n while (observers.length) {\n observers.shift()!.complete();\n }\n }\n });\n }\n\n unsubscribe() {\n this.isStopped = this.closed = true;\n this.observers = this.currentObservers = null!;\n }\n\n get observed() {\n return this.observers?.length > 0;\n }\n\n /** @internal */\n protected _trySubscribe(subscriber: Subscriber): TeardownLogic {\n this._throwIfClosed();\n return super._trySubscribe(subscriber);\n }\n\n /** @internal */\n protected _subscribe(subscriber: Subscriber): Subscription {\n this._throwIfClosed();\n this._checkFinalizedStatuses(subscriber);\n return this._innerSubscribe(subscriber);\n }\n\n /** @internal */\n protected _innerSubscribe(subscriber: Subscriber) {\n const { hasError, isStopped, observers } = this;\n if (hasError || isStopped) {\n return EMPTY_SUBSCRIPTION;\n }\n this.currentObservers = null;\n observers.push(subscriber);\n return new Subscription(() => {\n this.currentObservers = null;\n arrRemove(observers, subscriber);\n });\n }\n\n /** @internal */\n protected 
_checkFinalizedStatuses(subscriber: Subscriber) {\n const { hasError, thrownError, isStopped } = this;\n if (hasError) {\n subscriber.error(thrownError);\n } else if (isStopped) {\n subscriber.complete();\n }\n }\n\n /**\n * Creates a new Observable with this Subject as the source. You can do this\n * to create custom Observer-side logic of the Subject and conceal it from\n * code that uses the Observable.\n * @return {Observable} Observable that the Subject casts to\n */\n asObservable(): Observable {\n const observable: any = new Observable();\n observable.source = this;\n return observable;\n }\n}\n\n/**\n * @class AnonymousSubject\n */\nexport class AnonymousSubject extends Subject {\n constructor(\n /** @deprecated Internal implementation detail, do not use directly. Will be made internal in v8. */\n public destination?: Observer,\n source?: Observable\n ) {\n super();\n this.source = source;\n }\n\n next(value: T) {\n this.destination?.next?.(value);\n }\n\n error(err: any) {\n this.destination?.error?.(err);\n }\n\n complete() {\n this.destination?.complete?.();\n }\n\n /** @internal */\n protected _subscribe(subscriber: Subscriber): Subscription {\n return this.source?.subscribe(subscriber) ?? 
EMPTY_SUBSCRIPTION;\n }\n}\n", "import { TimestampProvider } from '../types';\n\ninterface DateTimestampProvider extends TimestampProvider {\n delegate: TimestampProvider | undefined;\n}\n\nexport const dateTimestampProvider: DateTimestampProvider = {\n now() {\n // Use the variable rather than `this` so that the function can be called\n // without being bound to the provider.\n return (dateTimestampProvider.delegate || Date).now();\n },\n delegate: undefined,\n};\n", "import { Subject } from './Subject';\nimport { TimestampProvider } from './types';\nimport { Subscriber } from './Subscriber';\nimport { Subscription } from './Subscription';\nimport { dateTimestampProvider } from './scheduler/dateTimestampProvider';\n\n/**\n * A variant of {@link Subject} that \"replays\" old values to new subscribers by emitting them when they first subscribe.\n *\n * `ReplaySubject` has an internal buffer that will store a specified number of values that it has observed. Like `Subject`,\n * `ReplaySubject` \"observes\" values by having them passed to its `next` method. When it observes a value, it will store that\n * value for a time determined by the configuration of the `ReplaySubject`, as passed to its constructor.\n *\n * When a new subscriber subscribes to the `ReplaySubject` instance, it will synchronously emit all values in its buffer in\n * a First-In-First-Out (FIFO) manner. The `ReplaySubject` will also complete, if it has observed completion; and it will\n * error if it has observed an error.\n *\n * There are two main configuration items to be concerned with:\n *\n * 1. `bufferSize` - This will determine how many items are stored in the buffer, defaults to infinite.\n * 2. `windowTime` - The amount of time to hold a value in the buffer before removing it from the buffer.\n *\n * Both configurations may exist simultaneously. 
So if you would like to buffer a maximum of 3 values, as long as the values\n * are less than 2 seconds old, you could do so with a `new ReplaySubject(3, 2000)`.\n *\n * ### Differences with BehaviorSubject\n *\n * `BehaviorSubject` is similar to `new ReplaySubject(1)`, with a couple of exceptions:\n *\n * 1. `BehaviorSubject` comes \"primed\" with a single value upon construction.\n * 2. `ReplaySubject` will replay values, even after observing an error, where `BehaviorSubject` will not.\n *\n * @see {@link Subject}\n * @see {@link BehaviorSubject}\n * @see {@link shareReplay}\n */\nexport class ReplaySubject extends Subject {\n private _buffer: (T | number)[] = [];\n private _infiniteTimeWindow = true;\n\n /**\n * @param bufferSize The size of the buffer to replay on subscription\n * @param windowTime The amount of time the buffered items will stay buffered\n * @param timestampProvider An object with a `now()` method that provides the current timestamp. This is used to\n * calculate the amount of time something has been buffered.\n */\n constructor(\n private _bufferSize = Infinity,\n private _windowTime = Infinity,\n private _timestampProvider: TimestampProvider = dateTimestampProvider\n ) {\n super();\n this._infiniteTimeWindow = _windowTime === Infinity;\n this._bufferSize = Math.max(1, _bufferSize);\n this._windowTime = Math.max(1, _windowTime);\n }\n\n next(value: T): void {\n const { isStopped, _buffer, _infiniteTimeWindow, _timestampProvider, _windowTime } = this;\n if (!isStopped) {\n _buffer.push(value);\n !_infiniteTimeWindow && _buffer.push(_timestampProvider.now() + _windowTime);\n }\n this._trimBuffer();\n super.next(value);\n }\n\n /** @internal */\n protected _subscribe(subscriber: Subscriber): Subscription {\n this._throwIfClosed();\n this._trimBuffer();\n\n const subscription = this._innerSubscribe(subscriber);\n\n const { _infiniteTimeWindow, _buffer } = this;\n // We use a copy here, so reentrant code does not mutate our array while we're\n // 
emitting it to a new subscriber.\n const copy = _buffer.slice();\n for (let i = 0; i < copy.length && !subscriber.closed; i += _infiniteTimeWindow ? 1 : 2) {\n subscriber.next(copy[i] as T);\n }\n\n this._checkFinalizedStatuses(subscriber);\n\n return subscription;\n }\n\n private _trimBuffer() {\n const { _bufferSize, _timestampProvider, _buffer, _infiniteTimeWindow } = this;\n // If we don't have an infinite buffer size, and we're over the length,\n // use splice to truncate the old buffer values off. Note that we have to\n // double the size for instances where we're not using an infinite time window\n // because we're storing the values and the timestamps in the same array.\n const adjustedBufferSize = (_infiniteTimeWindow ? 1 : 2) * _bufferSize;\n _bufferSize < Infinity && adjustedBufferSize < _buffer.length && _buffer.splice(0, _buffer.length - adjustedBufferSize);\n\n // Now, if we're not in an infinite time window, remove all values where the time is\n // older than what is allowed.\n if (!_infiniteTimeWindow) {\n const now = _timestampProvider.now();\n let last = 0;\n // Search the array for the first timestamp that isn't expired and\n // truncate the buffer up to that point.\n for (let i = 1; i < _buffer.length && (_buffer[i] as number) <= now; i += 2) {\n last = i;\n }\n last && _buffer.splice(0, last + 1);\n }\n }\n}\n", "import { Scheduler } from '../Scheduler';\nimport { Subscription } from '../Subscription';\nimport { SchedulerAction } from '../types';\n\n/**\n * A unit of work to be executed in a `scheduler`. 
An action is typically\n * created from within a {@link SchedulerLike} and an RxJS user does not need to concern\n * themselves with creating and manipulating an Action.\n *\n * ```ts\n * class Action extends Subscription {\n * new (scheduler: Scheduler, work: (state?: T) => void);\n * schedule(state?: T, delay: number = 0): Subscription;\n * }\n * ```\n *\n * @class Action\n */\nexport class Action extends Subscription {\n constructor(scheduler: Scheduler, work: (this: SchedulerAction, state?: T) => void) {\n super();\n }\n /**\n * Schedules this action on its parent {@link SchedulerLike} for execution. May be passed\n * some context object, `state`. May happen at some point in the future,\n * according to the `delay` parameter, if specified.\n * @param {T} [state] Some contextual data that the `work` function uses when\n * called by the Scheduler.\n * @param {number} [delay] Time to wait before executing the work, where the\n * time unit is implicit and defined by the Scheduler.\n * @return {Subscription}\n */\n public schedule(state?: T, delay: number = 0): Subscription {\n return this;\n }\n}\n", "import type { TimerHandle } from './timerHandle';\ntype SetIntervalFunction = (handler: () => void, timeout?: number, ...args: any[]) => TimerHandle;\ntype ClearIntervalFunction = (handle: TimerHandle) => void;\n\ninterface IntervalProvider {\n setInterval: SetIntervalFunction;\n clearInterval: ClearIntervalFunction;\n delegate:\n | {\n setInterval: SetIntervalFunction;\n clearInterval: ClearIntervalFunction;\n }\n | undefined;\n}\n\nexport const intervalProvider: IntervalProvider = {\n // When accessing the delegate, use the variable rather than `this` so that\n // the functions can be called without being bound to the provider.\n setInterval(handler: () => void, timeout?: number, ...args) {\n const { delegate } = intervalProvider;\n if (delegate?.setInterval) {\n return delegate.setInterval(handler, timeout, ...args);\n }\n return setInterval(handler, timeout, ...args);\n 
},\n clearInterval(handle) {\n const { delegate } = intervalProvider;\n return (delegate?.clearInterval || clearInterval)(handle as any);\n },\n delegate: undefined,\n};\n", "import { Action } from './Action';\nimport { SchedulerAction } from '../types';\nimport { Subscription } from '../Subscription';\nimport { AsyncScheduler } from './AsyncScheduler';\nimport { intervalProvider } from './intervalProvider';\nimport { arrRemove } from '../util/arrRemove';\nimport { TimerHandle } from './timerHandle';\n\nexport class AsyncAction extends Action {\n public id: TimerHandle | undefined;\n public state?: T;\n // @ts-ignore: Property has no initializer and is not definitely assigned\n public delay: number;\n protected pending: boolean = false;\n\n constructor(protected scheduler: AsyncScheduler, protected work: (this: SchedulerAction, state?: T) => void) {\n super(scheduler, work);\n }\n\n public schedule(state?: T, delay: number = 0): Subscription {\n if (this.closed) {\n return this;\n }\n\n // Always replace the current state with the new state.\n this.state = state;\n\n const id = this.id;\n const scheduler = this.scheduler;\n\n //\n // Important implementation note:\n //\n // Actions only execute once by default, unless rescheduled from within the\n // scheduled callback. This allows us to implement single and repeat\n // actions via the same code path, without adding API surface area, as well\n // as mimic traditional recursion but across asynchronous boundaries.\n //\n // However, JS runtimes and timers distinguish between intervals achieved by\n // serial `setTimeout` calls vs. a single `setInterval` call. An interval of\n // serial `setTimeout` calls can be individually delayed, which delays\n // scheduling the next `setTimeout`, and so on. 
`setInterval` attempts to\n // guarantee the interval callback will be invoked more precisely to the\n // interval period, regardless of load.\n //\n // Therefore, we use `setInterval` to schedule single and repeat actions.\n // If the action reschedules itself with the same delay, the interval is not\n // canceled. If the action doesn't reschedule, or reschedules with a\n // different delay, the interval will be canceled after scheduled callback\n // execution.\n //\n if (id != null) {\n this.id = this.recycleAsyncId(scheduler, id, delay);\n }\n\n // Set the pending flag indicating that this action has been scheduled, or\n // has recursively rescheduled itself.\n this.pending = true;\n\n this.delay = delay;\n // If this action has already an async Id, don't request a new one.\n this.id = this.id ?? this.requestAsyncId(scheduler, this.id, delay);\n\n return this;\n }\n\n protected requestAsyncId(scheduler: AsyncScheduler, _id?: TimerHandle, delay: number = 0): TimerHandle {\n return intervalProvider.setInterval(scheduler.flush.bind(scheduler, this), delay);\n }\n\n protected recycleAsyncId(_scheduler: AsyncScheduler, id?: TimerHandle, delay: number | null = 0): TimerHandle | undefined {\n // If this action is rescheduled with the same delay time, don't clear the interval id.\n if (delay != null && this.delay === delay && this.pending === false) {\n return id;\n }\n // Otherwise, if the action's delay time is different from the current delay,\n // or the action has been rescheduled before it's executed, clear the interval id\n if (id != null) {\n intervalProvider.clearInterval(id);\n }\n\n return undefined;\n }\n\n /**\n * Immediately executes this action and the `work` it contains.\n * @return {any}\n */\n public execute(state: T, delay: number): any {\n if (this.closed) {\n return new Error('executing a cancelled action');\n }\n\n this.pending = false;\n const error = this._execute(state, delay);\n if (error) {\n return error;\n } else if (this.pending === false 
&& this.id != null) {\n // Dequeue if the action didn't reschedule itself. Don't call\n // unsubscribe(), because the action could reschedule later.\n // For example:\n // ```\n // scheduler.schedule(function doWork(counter) {\n // /* ... I'm a busy worker bee ... */\n // var originalAction = this;\n // /* wait 100ms before rescheduling the action */\n // setTimeout(function () {\n // originalAction.schedule(counter + 1);\n // }, 100);\n // }, 1000);\n // ```\n this.id = this.recycleAsyncId(this.scheduler, this.id, null);\n }\n }\n\n protected _execute(state: T, _delay: number): any {\n let errored: boolean = false;\n let errorValue: any;\n try {\n this.work(state);\n } catch (e) {\n errored = true;\n // HACK: Since code elsewhere is relying on the \"truthiness\" of the\n // return here, we can't have it return \"\" or 0 or false.\n // TODO: Clean this up when we refactor schedulers mid-version-8 or so.\n errorValue = e ? e : new Error('Scheduled action threw falsy error');\n }\n if (errored) {\n this.unsubscribe();\n return errorValue;\n }\n }\n\n unsubscribe() {\n if (!this.closed) {\n const { id, scheduler } = this;\n const { actions } = scheduler;\n\n this.work = this.state = this.scheduler = null!;\n this.pending = false;\n\n arrRemove(actions, this);\n if (id != null) {\n this.id = this.recycleAsyncId(scheduler, id, null);\n }\n\n this.delay = null!;\n super.unsubscribe();\n }\n }\n}\n", "import { Action } from './scheduler/Action';\nimport { Subscription } from './Subscription';\nimport { SchedulerLike, SchedulerAction } from './types';\nimport { dateTimestampProvider } from './scheduler/dateTimestampProvider';\n\n/**\n * An execution context and a data structure to order tasks and schedule their\n * execution. 
Provides a notion of (potentially virtual) time, through the\n * `now()` getter method.\n *\n * Each unit of work in a Scheduler is called an `Action`.\n *\n * ```ts\n * class Scheduler {\n * now(): number;\n * schedule(work, delay?, state?): Subscription;\n * }\n * ```\n *\n * @class Scheduler\n * @deprecated Scheduler is an internal implementation detail of RxJS, and\n * should not be used directly. Rather, create your own class and implement\n * {@link SchedulerLike}. Will be made internal in v8.\n */\nexport class Scheduler implements SchedulerLike {\n public static now: () => number = dateTimestampProvider.now;\n\n constructor(private schedulerActionCtor: typeof Action, now: () => number = Scheduler.now) {\n this.now = now;\n }\n\n /**\n * A getter method that returns a number representing the current time\n * (at the time this function was called) according to the scheduler's own\n * internal clock.\n * @return {number} A number that represents the current time. May or may not\n * have a relation to wall-clock time. May or may not refer to a time unit\n * (e.g. milliseconds).\n */\n public now: () => number;\n\n /**\n * Schedules a function, `work`, for execution. May happen at some point in\n * the future, according to the `delay` parameter, if specified. 
May be passed\n * some context object, `state`, which will be passed to the `work` function.\n *\n * The given arguments will be processed an stored as an Action object in a\n * queue of actions.\n *\n * @param {function(state: ?T): ?Subscription} work A function representing a\n * task, or some unit of work to be executed by the Scheduler.\n * @param {number} [delay] Time to wait before executing the work, where the\n * time unit is implicit and defined by the Scheduler itself.\n * @param {T} [state] Some contextual data that the `work` function uses when\n * called by the Scheduler.\n * @return {Subscription} A subscription in order to be able to unsubscribe\n * the scheduled work.\n */\n public schedule(work: (this: SchedulerAction, state?: T) => void, delay: number = 0, state?: T): Subscription {\n return new this.schedulerActionCtor(this, work).schedule(state, delay);\n }\n}\n", "import { Scheduler } from '../Scheduler';\nimport { Action } from './Action';\nimport { AsyncAction } from './AsyncAction';\nimport { TimerHandle } from './timerHandle';\n\nexport class AsyncScheduler extends Scheduler {\n public actions: Array> = [];\n /**\n * A flag to indicate whether the Scheduler is currently executing a batch of\n * queued actions.\n * @type {boolean}\n * @internal\n */\n public _active: boolean = false;\n /**\n * An internal ID used to track the latest asynchronous task such as those\n * coming from `setTimeout`, `setInterval`, `requestAnimationFrame`, and\n * others.\n * @type {any}\n * @internal\n */\n public _scheduled: TimerHandle | undefined;\n\n constructor(SchedulerAction: typeof Action, now: () => number = Scheduler.now) {\n super(SchedulerAction, now);\n }\n\n public flush(action: AsyncAction): void {\n const { actions } = this;\n\n if (this._active) {\n actions.push(action);\n return;\n }\n\n let error: any;\n this._active = true;\n\n do {\n if ((error = action.execute(action.state, action.delay))) {\n break;\n }\n } while ((action = 
actions.shift()!)); // exhaust the scheduler queue\n\n this._active = false;\n\n if (error) {\n while ((action = actions.shift()!)) {\n action.unsubscribe();\n }\n throw error;\n }\n }\n}\n", "import { AsyncAction } from './AsyncAction';\nimport { AsyncScheduler } from './AsyncScheduler';\n\n/**\n *\n * Async Scheduler\n *\n * Schedule task as if you used setTimeout(task, duration)\n *\n * `async` scheduler schedules tasks asynchronously, by putting them on the JavaScript\n * event loop queue. It is best used to delay tasks in time or to schedule tasks repeating\n * in intervals.\n *\n * If you just want to \"defer\" task, that is to perform it right after currently\n * executing synchronous code ends (commonly achieved by `setTimeout(deferredTask, 0)`),\n * better choice will be the {@link asapScheduler} scheduler.\n *\n * ## Examples\n * Use async scheduler to delay task\n * ```ts\n * import { asyncScheduler } from 'rxjs';\n *\n * const task = () => console.log('it works!');\n *\n * asyncScheduler.schedule(task, 2000);\n *\n * // After 2 seconds logs:\n * // \"it works!\"\n * ```\n *\n * Use async scheduler to repeat task in intervals\n * ```ts\n * import { asyncScheduler } from 'rxjs';\n *\n * function task(state) {\n * console.log(state);\n * this.schedule(state + 1, 1000); // `this` references currently executing Action,\n * // which we reschedule with new state and delay\n * }\n *\n * asyncScheduler.schedule(task, 3000, 0);\n *\n * // Logs:\n * // 0 after 3s\n * // 1 after 4s\n * // 2 after 5s\n * // 3 after 6s\n * ```\n */\n\nexport const asyncScheduler = new AsyncScheduler(AsyncAction);\n\n/**\n * @deprecated Renamed to {@link asyncScheduler}. 
Will be removed in v8.\n */\nexport const async = asyncScheduler;\n", "import { AsyncAction } from './AsyncAction';\nimport { AnimationFrameScheduler } from './AnimationFrameScheduler';\nimport { SchedulerAction } from '../types';\nimport { animationFrameProvider } from './animationFrameProvider';\nimport { TimerHandle } from './timerHandle';\n\nexport class AnimationFrameAction extends AsyncAction {\n constructor(protected scheduler: AnimationFrameScheduler, protected work: (this: SchedulerAction, state?: T) => void) {\n super(scheduler, work);\n }\n\n protected requestAsyncId(scheduler: AnimationFrameScheduler, id?: TimerHandle, delay: number = 0): TimerHandle {\n // If delay is greater than 0, request as an async action.\n if (delay !== null && delay > 0) {\n return super.requestAsyncId(scheduler, id, delay);\n }\n // Push the action to the end of the scheduler queue.\n scheduler.actions.push(this);\n // If an animation frame has already been requested, don't request another\n // one. If an animation frame hasn't been requested yet, request one. Return\n // the current animation frame request id.\n return scheduler._scheduled || (scheduler._scheduled = animationFrameProvider.requestAnimationFrame(() => scheduler.flush(undefined)));\n }\n\n protected recycleAsyncId(scheduler: AnimationFrameScheduler, id?: TimerHandle, delay: number = 0): TimerHandle | undefined {\n // If delay exists and is greater than 0, or if the delay is null (the\n // action wasn't rescheduled) but was originally scheduled as an async\n // action, then recycle as an async action.\n if (delay != null ? 
delay > 0 : this.delay > 0) {\n return super.recycleAsyncId(scheduler, id, delay);\n }\n // If the scheduler queue has no remaining actions with the same async id,\n // cancel the requested animation frame and set the scheduled flag to\n // undefined so the next AnimationFrameAction will request its own.\n const { actions } = scheduler;\n if (id != null && actions[actions.length - 1]?.id !== id) {\n animationFrameProvider.cancelAnimationFrame(id as number);\n scheduler._scheduled = undefined;\n }\n // Return undefined so the action knows to request a new async id if it's rescheduled.\n return undefined;\n }\n}\n", "import { AsyncAction } from './AsyncAction';\nimport { AsyncScheduler } from './AsyncScheduler';\n\nexport class AnimationFrameScheduler extends AsyncScheduler {\n public flush(action?: AsyncAction): void {\n this._active = true;\n // The async id that effects a call to flush is stored in _scheduled.\n // Before executing an action, it's necessary to check the action's async\n // id to determine whether it's supposed to be executed in the current\n // flush.\n // Previous implementations of this method used a count to determine this,\n // but that was unsound, as actions that are unsubscribed - i.e. 
cancelled -\n // are removed from the actions array and that can shift actions that are\n // scheduled to be executed in a subsequent flush into positions at which\n // they are executed within the current flush.\n const flushId = this._scheduled;\n this._scheduled = undefined;\n\n const { actions } = this;\n let error: any;\n action = action || actions.shift()!;\n\n do {\n if ((error = action.execute(action.state, action.delay))) {\n break;\n }\n } while ((action = actions[0]) && action.id === flushId && actions.shift());\n\n this._active = false;\n\n if (error) {\n while ((action = actions[0]) && action.id === flushId && actions.shift()) {\n action.unsubscribe();\n }\n throw error;\n }\n }\n}\n", "import { AnimationFrameAction } from './AnimationFrameAction';\nimport { AnimationFrameScheduler } from './AnimationFrameScheduler';\n\n/**\n *\n * Animation Frame Scheduler\n *\n * Perform task when `window.requestAnimationFrame` would fire\n *\n * When `animationFrame` scheduler is used with delay, it will fall back to {@link asyncScheduler} scheduler\n * behaviour.\n *\n * Without delay, `animationFrame` scheduler can be used to create smooth browser animations.\n * It makes sure scheduled task will happen just before next browser content repaint,\n * thus performing animations as efficiently as possible.\n *\n * ## Example\n * Schedule div height animation\n * ```ts\n * // html:
\n * import { animationFrameScheduler } from 'rxjs';\n *\n * const div = document.querySelector('div');\n *\n * animationFrameScheduler.schedule(function(height) {\n * div.style.height = height + \"px\";\n *\n * this.schedule(height + 1); // `this` references currently executing Action,\n * // which we reschedule with new state\n * }, 0, 0);\n *\n * // You will see a div element growing in height\n * ```\n */\n\nexport const animationFrameScheduler = new AnimationFrameScheduler(AnimationFrameAction);\n\n/**\n * @deprecated Renamed to {@link animationFrameScheduler}. Will be removed in v8.\n */\nexport const animationFrame = animationFrameScheduler;\n", "import { Observable } from '../Observable';\nimport { SchedulerLike } from '../types';\n\n/**\n * A simple Observable that emits no items to the Observer and immediately\n * emits a complete notification.\n *\n * Just emits 'complete', and nothing else.\n *\n * ![](empty.png)\n *\n * A simple Observable that only emits the complete notification. It can be used\n * for composing with other Observables, such as in a {@link mergeMap}.\n *\n * ## Examples\n *\n * Log complete notification\n *\n * ```ts\n * import { EMPTY } from 'rxjs';\n *\n * EMPTY.subscribe({\n * next: () => console.log('Next'),\n * complete: () => console.log('Complete!')\n * });\n *\n * // Outputs\n * // Complete!\n * ```\n *\n * Emit the number 7, then complete\n *\n * ```ts\n * import { EMPTY, startWith } from 'rxjs';\n *\n * const result = EMPTY.pipe(startWith(7));\n * result.subscribe(x => console.log(x));\n *\n * // Outputs\n * // 7\n * ```\n *\n * Map and flatten only odd numbers to the sequence `'a'`, `'b'`, `'c'`\n *\n * ```ts\n * import { interval, mergeMap, of, EMPTY } from 'rxjs';\n *\n * const interval$ = interval(1000);\n * const result = interval$.pipe(\n * mergeMap(x => x % 2 === 1 ? 
of('a', 'b', 'c') : EMPTY),\n * );\n * result.subscribe(x => console.log(x));\n *\n * // Results in the following to the console:\n * // x is equal to the count on the interval, e.g. (0, 1, 2, 3, ...)\n * // x will occur every 1000ms\n * // if x % 2 is equal to 1, print a, b, c (each on its own)\n * // if x % 2 is not equal to 1, nothing will be output\n * ```\n *\n * @see {@link Observable}\n * @see {@link NEVER}\n * @see {@link of}\n * @see {@link throwError}\n */\nexport const EMPTY = new Observable((subscriber) => subscriber.complete());\n\n/**\n * @param scheduler A {@link SchedulerLike} to use for scheduling\n * the emission of the complete notification.\n * @deprecated Replaced with the {@link EMPTY} constant or {@link scheduled} (e.g. `scheduled([], scheduler)`). Will be removed in v8.\n */\nexport function empty(scheduler?: SchedulerLike) {\n return scheduler ? emptyScheduled(scheduler) : EMPTY;\n}\n\nfunction emptyScheduled(scheduler: SchedulerLike) {\n return new Observable((subscriber) => scheduler.schedule(() => subscriber.complete()));\n}\n", "import { SchedulerLike } from '../types';\nimport { isFunction } from './isFunction';\n\nexport function isScheduler(value: any): value is SchedulerLike {\n return value && isFunction(value.schedule);\n}\n", "import { SchedulerLike } from '../types';\nimport { isFunction } from './isFunction';\nimport { isScheduler } from './isScheduler';\n\nfunction last(arr: T[]): T | undefined {\n return arr[arr.length - 1];\n}\n\nexport function popResultSelector(args: any[]): ((...args: unknown[]) => unknown) | undefined {\n return isFunction(last(args)) ? args.pop() : undefined;\n}\n\nexport function popScheduler(args: any[]): SchedulerLike | undefined {\n return isScheduler(last(args)) ? args.pop() : undefined;\n}\n\nexport function popNumber(args: any[], defaultValue: number): number {\n return typeof last(args) === 'number' ? args.pop()! 
: defaultValue;\n}\n", "export const isArrayLike = ((x: any): x is ArrayLike => x && typeof x.length === 'number' && typeof x !== 'function');", "import { isFunction } from \"./isFunction\";\n\n/**\n * Tests to see if the object is \"thennable\".\n * @param value the object to test\n */\nexport function isPromise(value: any): value is PromiseLike {\n return isFunction(value?.then);\n}\n", "import { InteropObservable } from '../types';\nimport { observable as Symbol_observable } from '../symbol/observable';\nimport { isFunction } from './isFunction';\n\n/** Identifies an input as being Observable (but not necessary an Rx Observable) */\nexport function isInteropObservable(input: any): input is InteropObservable {\n return isFunction(input[Symbol_observable]);\n}\n", "import { isFunction } from './isFunction';\n\nexport function isAsyncIterable(obj: any): obj is AsyncIterable {\n return Symbol.asyncIterator && isFunction(obj?.[Symbol.asyncIterator]);\n}\n", "/**\n * Creates the TypeError to throw if an invalid object is passed to `from` or `scheduled`.\n * @param input The object that was passed.\n */\nexport function createInvalidObservableTypeError(input: any) {\n // TODO: We should create error codes that can be looked up, so this can be less verbose.\n return new TypeError(\n `You provided ${\n input !== null && typeof input === 'object' ? 'an invalid object' : `'${input}'`\n } where a stream was expected. 
You can provide an Observable, Promise, ReadableStream, Array, AsyncIterable, or Iterable.`\n );\n}\n", "export function getSymbolIterator(): symbol {\n if (typeof Symbol !== 'function' || !Symbol.iterator) {\n return '@@iterator' as any;\n }\n\n return Symbol.iterator;\n}\n\nexport const iterator = getSymbolIterator();\n", "import { iterator as Symbol_iterator } from '../symbol/iterator';\nimport { isFunction } from './isFunction';\n\n/** Identifies an input as being an Iterable */\nexport function isIterable(input: any): input is Iterable {\n return isFunction(input?.[Symbol_iterator]);\n}\n", "import { ReadableStreamLike } from '../types';\nimport { isFunction } from './isFunction';\n\nexport async function* readableStreamLikeToAsyncGenerator(readableStream: ReadableStreamLike): AsyncGenerator {\n const reader = readableStream.getReader();\n try {\n while (true) {\n const { value, done } = await reader.read();\n if (done) {\n return;\n }\n yield value!;\n }\n } finally {\n reader.releaseLock();\n }\n}\n\nexport function isReadableStreamLike(obj: any): obj is ReadableStreamLike {\n // We don't want to use instanceof checks because they would return\n // false for instances from another Realm, like an + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + \ No newline at end of file diff --git a/faq/index.html b/faq/index.html new file mode 100644 index 00000000..28f0b3d8 --- /dev/null +++ b/faq/index.html @@ -0,0 +1,1700 @@ + + + + + + + + + + + + + + + + + + + + + + + + + FAQ - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Frequently Asked Questions

+

If your question is not listed, try using DeepWiki's AI assistant for common issues.

+

For unresolved problems, join our Discord or WeChat community for support.

+
+Encountered the error ImportError: libGL.so.1: cannot open shared object file: No such file or directory in Ubuntu 22.04 on WSL2 +

The libGL library is missing from Ubuntu 22.04 on WSL2. Install it with the following command to resolve the issue:

+
sudo apt-get install libgl1-mesa-glx
+
+

Reference: #388
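To confirm whether the library is really missing before (or after) installing the package, the dynamic linker cache can be inspected. The following is a minimal sketch, assuming a glibc-based system where `ldconfig -p` is available; the helper name `has_libgl` is hypothetical and not part of MinerU.

```shell
# Hypothetical helper: succeeds if the linker-cache listing on stdin
# contains an entry for libGL.so.1.
has_libgl() {
  grep -q 'libGL\.so\.1'
}

# Usage on the affected machine:
#   ldconfig -p | has_libgl && echo "libGL.so.1 found" || echo "libGL.so.1 missing"
```

If the check still reports the library as missing after installation, the package may have been installed into a different environment (for example, outside the WSL2 distribution in use).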

+
+
+Text is missing from the parsing results when MinerU is installed and used on Linux systems. +

Since version 2.0, MinerU uses pypdfium2 instead of pymupdf as the PDF page rendering engine to resolve AGPLv3 license issues. On some Linux distributions, missing CJK fonts can cause some text to be lost when PDFs are rendered to images. +To solve this problem, install the Noto font packages with the following commands, which work on Ubuntu/Debian systems: +

sudo apt update
+sudo apt install fonts-noto-core
+sudo apt install fonts-noto-cjk
+fc-cache -fv
+
+You can also directly use our Docker deployment method to build the image, which includes the above font packages by default. +

Reference: #2915
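A quick way to check whether the fonts are visible to fontconfig after running the commands above is to look for them in the `fc-list` output. This is a minimal sketch, assuming the installed packages register font families whose names contain "Noto" and "CJK" (for example "Noto Sans CJK SC"); the helper name `has_noto_cjk` is hypothetical.

```shell
# Hypothetical helper: succeeds if the font listing on stdin
# (one font per line, as printed by fc-list) mentions a Noto CJK family.
has_noto_cjk() {
  grep -q 'Noto .*CJK'
}

# Usage after installing the font packages:
#   fc-list | has_noto_cjk && echo "Noto CJK fonts found" || echo "Noto CJK fonts missing"
```

If the fonts were just installed and are not listed, re-running `fc-cache -fv` refreshes the font cache.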

+
+ + + + + + + + + + \ No newline at end of file diff --git a/images/MinerU-logo.png b/images/MinerU-logo.png new file mode 100644 index 00000000..09ab46b2 Binary files /dev/null and b/images/MinerU-logo.png differ diff --git a/images/datalab_logo.png b/images/datalab_logo.png new file mode 100644 index 00000000..5019ae7c Binary files /dev/null and b/images/datalab_logo.png differ diff --git a/images/flowchart_en.png b/images/flowchart_en.png new file mode 100644 index 00000000..b490011e Binary files /dev/null and b/images/flowchart_en.png differ diff --git a/images/flowchart_zh_cn.png b/images/flowchart_zh_cn.png new file mode 100644 index 00000000..32e0a142 Binary files /dev/null and b/images/flowchart_zh_cn.png differ diff --git a/images/layout_example.png b/images/layout_example.png new file mode 100644 index 00000000..4a57dffe Binary files /dev/null and b/images/layout_example.png differ diff --git a/images/logo.png b/images/logo.png new file mode 100644 index 00000000..01818084 Binary files /dev/null and b/images/logo.png differ diff --git a/images/logo.svg b/images/logo.svg new file mode 100644 index 00000000..65539780 --- /dev/null +++ b/images/logo.svg @@ -0,0 +1,22 @@ + + + + + + + + + + + + + + + + + + + + + + diff --git a/images/poly.png b/images/poly.png new file mode 100644 index 00000000..14af7726 Binary files /dev/null and b/images/poly.png differ diff --git a/images/project_panorama_en.png b/images/project_panorama_en.png new file mode 100644 index 00000000..19616da6 Binary files /dev/null and b/images/project_panorama_en.png differ diff --git a/images/project_panorama_zh_cn.png b/images/project_panorama_zh_cn.png new file mode 100644 index 00000000..3cd6843e Binary files /dev/null and b/images/project_panorama_zh_cn.png differ diff --git a/images/spans_example.png b/images/spans_example.png new file mode 100644 index 00000000..14de87ed Binary files /dev/null and b/images/spans_example.png differ diff --git a/index.html b/index.html new file mode 
100644 index 00000000..8b648f7d --- /dev/null +++ b/index.html @@ -0,0 +1,1742 @@ + + + + + + + + + + + + + + + + + + + + + + + MinerU - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

MinerU

+ +
+ +

+ +

+
+ + + +


+ + +

Project Introduction

+

MinerU is a tool that converts PDFs into machine-readable formats (e.g., markdown, JSON), allowing for easy extraction into any format. +MinerU was born during the pre-training process of InternLM. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models. +Compared to well-known commercial products domestically and internationally, MinerU is still young. If you encounter any issues or if the results are not as expected, please submit an issue on GitHub Issues and attach the relevant PDF.

+

+

Key Features

+
    +
  • Remove headers, footers, footnotes, page numbers and other elements to ensure semantic coherence
  • +
  • Output text in human reading order, suitable for single-column, multi-column and complex layouts
  • +
  • Retain the original document structure, including titles, paragraphs, lists, etc.
  • +
  • Extract images, image descriptions, tables, table titles and footnotes
  • +
  • Automatically identify and convert formulas in documents to LaTeX format
  • +
  • Automatically identify and convert tables in documents to HTML format
  • +
  • Automatically detect scanned PDFs and garbled PDFs, and enable OCR functionality
  • +
  • OCR supports detection and recognition of 109 languages
  • +
  • Support multiple output formats, such as multimodal and NLP Markdown, reading-order-sorted JSON, and information-rich intermediate formats
  • +
  • Support multiple visualization outputs, including layout visualization and span visualization, for quickly checking output quality
  • +
  • Run in pure CPU environments, with optional GPU (CUDA), NPU (CANN), or MPS acceleration
  • +
  • Compatible with Windows, Linux and Mac platforms
  • +
+

User Guide

+
+ + + + + + + + + + \ No newline at end of file diff --git a/quick_start/docker_deployment/index.html b/quick_start/docker_deployment/index.html new file mode 100644 index 00000000..7e4a98f3 --- /dev/null +++ b/quick_start/docker_deployment/index.html @@ -0,0 +1,1963 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Docker Deployment - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Deploying MinerU with Docker

+

MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues.

+

Build Docker Image using Dockerfile

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile
+docker build -t mineru:latest -f Dockerfile .
+
+

Docker Description

+

MinerU's Docker image uses vllm/vllm-openai as the base image, so it includes the vllm inference acceleration framework and its required dependencies by default. On compatible devices, you can therefore use vllm directly to accelerate VLM model inference.

+
+

Note

+

Requirements for using vllm to accelerate VLM model inference:

+
    +
  • The device must have a GPU with Volta or later architecture and at least 8GB of available VRAM.
  • +
  • The host machine's graphics driver should support CUDA 12.9.1 or higher; you can check the driver version using the nvidia-smi command.
  • +
  • Docker container must have access to the host machine's graphics devices.
  • +
+
+
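The driver requirement above can be checked from a script. The following is a minimal sketch, assuming GNU `sort -V` is available and that the `nvidia-smi` banner contains a `CUDA Version: X.Y` field; because that banner typically shows only major.minor, the comparison below uses 12.9 rather than 12.9.1. The helper names `version_ge` and `cuda_version_from_smi` are hypothetical.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: extracts the value of "CUDA Version: X.Y"
# from nvidia-smi banner text supplied on stdin.
cuda_version_from_smi() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on the host machine:
#   version_ge "$(nvidia-smi | cuda_version_from_smi)" 12.9 \
#     && echo "driver is new enough" || echo "driver is too old"
```

This only covers the driver requirement; the GPU architecture and available VRAM still need to be checked separately.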

Start Docker Container

+
docker run --gpus all \
+  --shm-size 32g \
+  -p 30000:30000 -p 7860:7860 -p 8000:8000 \
+  --ipc=host \
+  -it mineru:latest \
+  /bin/bash
+
+

After executing this command, you will enter the Docker container's interactive terminal, with several ports mapped for services you may start. You can run MinerU commands directly inside the container. +You can also start MinerU services directly by replacing /bin/bash with a service startup command. For detailed instructions, refer to Start the service via command.

+

Start Services Directly with Docker Compose

+

We provide a compose.yaml file that you can use to quickly start MinerU services.

+
# Download compose.yaml file
+wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
+
+
+

Note

+
    +
  • The compose.yaml file contains configurations for multiple MinerU services; you can start specific services as needed.
  • +
  • Different services might have additional parameter configurations, which you can view and edit in the compose.yaml file.
  • +
  • Due to the pre-allocation of GPU memory by the vllm inference acceleration framework, you may not be able to run multiple vllm services simultaneously on the same machine. Therefore, ensure that other services that might use GPU memory have been stopped before starting the vlm-openai-server service or using the vlm-vllm-engine backend.
  • +
+
+
+

Start OpenAI-compatible server service

+

Connect to the openai-server via the vlm-http-client backend: +

docker compose -f compose.yaml --profile openai-server up -d
+
+
+

Tip

+

In another terminal, connect to the OpenAI-compatible server via the HTTP client (requires only a CPU and network access; no vllm environment is needed): +

mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://<server_ip>:30000
+
+
+
+

Start Web API service

+
docker compose -f compose.yaml --profile api up -d
+
+
+

Tip

+

Access http://<server_ip>:8000/docs in your browser to view the API documentation.

+
+
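Because `docker compose up -d` returns before the service has finished starting, it can help to wait until the documentation page responds. The following is a minimal sketch, assuming `curl` is installed and the service is reachable on 127.0.0.1; the helper name `wait_for_http` and the timeout values are hypothetical.

```shell
# Hypothetical helper: polls URL ($1) about once per second until it answers
# with a successful HTTP status, giving up after roughly TIMEOUT polls ($2, default 60).
wait_for_http() {
  url=$1
  timeout=${2:-60}
  elapsed=0
  # -f makes curl fail on HTTP error statuses; --max-time bounds each attempt.
  until curl -fsS --max-time 5 -o /dev/null "$url" 2>/dev/null; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# Usage after starting the api profile:
#   wait_for_http "http://127.0.0.1:8000/docs" 120 && echo "Web API is up"
```

The same helper can be pointed at the Gradio port (7860) when that profile is started instead.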
+

Start Gradio WebUI service

+
docker compose -f compose.yaml --profile gradio up -d
+
+
+

Tip

+
    +
  • Access http://<server_ip>:7860 in your browser to use the Gradio WebUI.
  • +
+
+ + + + + + + + + + \ No newline at end of file diff --git a/quick_start/extension_modules/index.html b/quick_start/extension_modules/index.html new file mode 100644 index 00000000..8b948c35 --- /dev/null +++ b/quick_start/extension_modules/index.html @@ -0,0 +1,1921 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Extension Modules - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

MinerU Extension Modules Installation Guide

+

MinerU supports installing extension modules on demand based on different needs to enhance functionality or support specific model backends.

+

Common Scenarios

+

Core Functionality Installation

+

The core module is MinerU's core dependency and contains all functional modules except vllm/lmdeploy. Installing it ensures that MinerU's basic functionality works properly. +

uv pip install "mineru[core]"
+
+
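To confirm which version ended up in the environment, the `Version:` field printed by `uv pip show mineru` can be read. This is a minimal sketch, assuming the usual `Name:` / `Version:` field layout of that output; the helper name `installed_version` is hypothetical.

```shell
# Hypothetical helper: prints the value of the first "Version:" field
# found in package metadata supplied on stdin.
installed_version() {
  sed -n 's/^Version: *//p' | head -n 1
}

# Usage in the environment where MinerU was installed:
#   uv pip show mineru | installed_version
```

An empty result suggests the package is not installed in the currently active environment.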
+

Using vllm to Accelerate VLM Model Inference

+
+

Note

+

vllm and lmdeploy have nearly identical VLM inference acceleration effects and usage methods. You can choose one of them to install and use based on your actual needs, but it is not recommended to install both modules simultaneously to avoid potential dependency conflicts.

+
+
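Since installing both modules at once is discouraged, it may be worth checking an existing environment before adding one of them. The following is a minimal sketch, assuming `uv pip freeze` prints one `name==version` line per package (packages installed from a path or URL would not match); the helper name `both_backends_installed` is hypothetical.

```shell
# Hypothetical helper: succeeds only if the freeze listing on stdin
# contains both a vllm and an lmdeploy entry.
both_backends_installed() {
  pkgs=$(cat)
  printf '%s\n' "$pkgs" | grep -qi '^vllm==' &&
    printf '%s\n' "$pkgs" | grep -qi '^lmdeploy=='
}

# Usage in the target environment:
#   uv pip freeze | both_backends_installed && echo "warning: vllm and lmdeploy are both installed"
```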

The vllm module provides acceleration support for VLM model inference, suitable for graphics cards with Volta architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.

+
uv pip install "mineru[core,vllm]"
+
+
+

Tip

+

If errors occur while installing the extras that include vllm, refer to the vllm official documentation to resolve the issue, or use the Docker deployment method directly.

+
+
+

Using lmdeploy to Accelerate VLM Model Inference

+
+

Note

+

vllm and lmdeploy have nearly identical VLM inference acceleration effects and usage methods. You can choose one of them to install and use based on your actual needs, but it is not recommended to install both modules simultaneously to avoid potential dependency conflicts.

+
+

The lmdeploy module provides acceleration support for VLM model inference, suitable for graphics cards with Volta architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.

+
uv pip install "mineru[core,lmdeploy]"
+
+
+

Tip

+

If errors occur while installing the extras that include lmdeploy, refer to the lmdeploy official documentation to resolve the issue.

+
+
+

Installing Lightweight Client to Connect to OpenAI-compatible servers (for vlm-http-client mode)

+

If you need to install a lightweight client on edge devices to connect to an OpenAI-compatible server for using VLM mode, you can install the basic mineru package, which is very lightweight and suitable for devices with only CPU and network connectivity. +

uv pip install mineru
+mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:30000
+
+
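When several files need to be sent to the same server, the command above can be wrapped in a small loop. This is a minimal sketch built only from the flags shown above; the function name `parse_with_server` and the `MINERU_BIN` override (useful for a dry run with `echo`) are hypothetical and not part of MinerU.

```shell
# Hypothetical helper: runs the lightweight client once per input file.
#   $1 = server URL, $2 = output directory, remaining arguments = input files
parse_with_server() {
  server_url=$1
  output_dir=$2
  shift 2
  for pdf in "$@"; do
    # MINERU_BIN is an assumed override; it defaults to the mineru command.
    "${MINERU_BIN:-mineru}" -p "$pdf" -o "$output_dir" -b vlm-http-client -u "$server_url" || return 1
  done
}

# Usage:
#   parse_with_server http://127.0.0.1:30000 ./output first.pdf second.pdf
# Dry run that only prints the arguments that would be passed:
#   MINERU_BIN=echo parse_with_server http://127.0.0.1:30000 ./output first.pdf
```

The loop stops at the first file that fails, so a failure is not silently skipped.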
+

Installing Lightweight Client to Connect to OpenAI-compatible servers (for hybrid-http-client mode)

+

If you need to install a lightweight client on edge devices to connect to an OpenAI-compatible server for using hybrid mode, you can install the mineru pipeline extension package, which is relatively lightweight and can be used on devices with only CPU and network connectivity, while running faster on devices that support GPU acceleration. +

uv pip install "mineru[pipeline]"
+mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
+
+ + + + + + + + + + \ No newline at end of file diff --git a/quick_start/index.html b/quick_start/index.html new file mode 100644 index 00000000..97adaec1 --- /dev/null +++ b/quick_start/index.html @@ -0,0 +1,1800 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Quick Start - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Quick Start

+

If you encounter any installation issues, please check the FAQ first.

+

Online Experience

+

Official online web application

+

The official online version offers the same functionality as the client, with a polished interface and rich features. Login is required.

+
    +
  • OpenDataLab
  • +
+

Gradio-based online demo

+

A Gradio-based WebUI with a simple interface that offers only the core parsing functionality. No login is required.

+
    +
  • ModelScope
  • +
  • HuggingFace
  • +
+

Local Deployment

+
+

Warning

+

Prerequisites - Hardware and Software Environment Support

+

To ensure the stability and reliability of the project, we have optimized and tested only specific hardware and software environments during development. This ensures that users can achieve optimal performance and encounter the fewest compatibility issues when deploying and running the project on recommended system configurations.

+

By concentrating our resources and efforts on mainstream environments, our team can resolve potential bugs more efficiently and deliver new features in a timely manner.

+

In non-mainstream environments, due to the diversity of hardware and software configurations, as well as compatibility issues with third-party dependencies, we cannot guarantee 100% usability of the project. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first, as most issues have corresponding solutions in the FAQ. Additionally, we encourage community feedback on issues so that we can gradually expand our support range.

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Parsing Backend: pipeline | *-auto-engine (hybrid, vlm) | *-http-client (hybrid, vlm)
Backend Features: Good Compatibility | High Hardware Requirements | For OpenAI Compatible Servers (note 2)
Accuracy (note 1): 82+ | 90+
Operating System: Linux (note 3) / Windows (note 4) / macOS (note 5)
Pure CPU Support
GPU Acceleration: Volta and later architecture GPUs or Apple Silicon | Not Required
Min VRAM: 6GB | 10GB | 8GB | 3GB
RAM: Min 16GB+, Recommended 32GB+ | 8GB
Disk Space: 20GB+, SSD Recommended | 2GB
Python Version: 3.10-3.13
+ +

1 Accuracy metrics are the End-to-End Evaluation Overall scores from OmniDocBench (v1.5), based on the latest version of MinerU.
+2 Servers compatible with OpenAI API, such as local model servers or remote model services deployed via inference frameworks like vLLM/SGLang/LMDeploy.
+3 On Linux, only distributions released in 2019 or later are supported.
+4 Since the key dependency ray does not support Python 3.13 on Windows, only versions 3.10~3.12 are supported.
+5 macOS requires version 14.0 or later.
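The Python version constraints in the table and in footnote 4 can be checked before installation. The following is a minimal sketch, assuming the range 3.10-3.13 and the Windows exception stated above; `is_supported_python` is a hypothetical helper and not part of MinerU, and the Linux distribution and macOS version requirements are not checked here.

```python
import sys


def is_supported_python(version, platform):
    """Check a (major, minor, ...) version against the documented range.

    `platform` is a string in the style of sys.platform ("linux", "win32", "darwin").
    """
    major, minor = version[:2]
    if major != 3 or not (10 <= minor <= 13):
        return False
    # Footnote 4: ray does not support Python 3.13 on Windows.
    if platform.startswith("win") and minor == 13:
        return False
    return True


if __name__ == "__main__":
    print(is_supported_python(sys.version_info, sys.platform))
```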

+

Install MinerU

+

Install MinerU using pip or uv

+
pip install --upgrade pip
+pip install uv
+uv pip install -U "mineru[all]"
+
+

Install MinerU from source code

+
git clone https://github.com/opendatalab/MinerU.git
+cd MinerU
+uv pip install -e ".[all]"
+
+
+

Tip

+

mineru[all] includes all core features, is compatible with Windows, Linux, and macOS, and suits most users. +To choose a specific inference framework for the VLM model, or to install only a lightweight client on an edge device, see the Extension Modules Installation Guide.

+
+
+

Deploy MinerU using Docker

+

MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues. +You can get the Docker Deployment Instructions in the documentation.

+
+

Using MinerU

+

If your device meets the GPU acceleration requirements in the table above, you can use a simple command line for document parsing: +

mineru -p <input_path> -o <output_path>
+
+If your device does not meet the GPU acceleration requirements, you can specify the backend as pipeline to run in a pure CPU environment: +
mineru -p <input_path> -o <output_path> -b pipeline
+
+

You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the Usage Guide.

\ No newline at end of file
diff --git a/reference/changelog/index.html b/reference/changelog/index.html
new file mode 100644
index 00000000..4441c2e1
--- /dev/null
+++ b/reference/changelog/index.html
@@ -0,0 +1,3229 @@
+Changelog - MinerU

Changelog

+

This document records the update history of the MinerU project for version 2.6.7 and earlier. For the latest version updates, please check the project README.

+
+

2.6 Series Versions

+

2.6.7 (2025/12/12)

+
    +
  • Bug fix: #4168
  • +
+

2.6.6 (2025/12/02)

+

mineru-api tool optimizations

+
    +
  • Added descriptive text to mineru-api interface parameters to improve API documentation readability.
  • +
  • You can use the environment variable MINERU_API_ENABLE_FASTAPI_DOCS to control whether the auto-generated interface documentation page is enabled (enabled by default).
  • +
  • Added concurrency configuration options for the vlm-vllm-async-engine, vlm-lmdeploy-engine, and vlm-http-client backends. Users can use the environment variable MINERU_API_MAX_CONCURRENT_REQUESTS to set the maximum number of concurrent API requests (unlimited by default).
  • +
+
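A maximum-concurrency setting such as MINERU_API_MAX_CONCURRENT_REQUESTS can be pictured as a semaphore wrapped around each request. The sketch below is a generic asyncio illustration of that idea under stated assumptions, not MinerU's implementation; `run_limited` and its parameters are hypothetical names.

```python
import asyncio


async def run_limited(jobs, limit=None):
    """Run zero-argument coroutine functions with at most `limit` in flight.

    limit=None means no cap, mirroring the documented default of "unlimited".
    """
    semaphore = asyncio.Semaphore(limit) if limit is not None else None

    async def guarded(job):
        if semaphore is None:
            return await job()
        async with semaphore:  # waits while `limit` jobs are already running
            return await job()

    # gather returns results in the same order as the input jobs
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With a limit in place, extra requests wait for a free slot instead of all competing for the inference backend at once.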

2.6.5 (2025/11/26)

+
    +
  • Added support for a new backend, vlm-lmdeploy-engine. Its usage is similar to vlm-vllm-(async)engine, but it uses lmdeploy as the inference engine and, unlike vllm, also supports native inference acceleration on Windows.
  • +
+

2.6.4 (2025/11/04)

+
    +
  • Added a timeout for PDF image rendering to prevent abnormal PDF files from blocking the rendering process for a long time. The default is 300 seconds and can be changed via the environment variable MINERU_PDF_RENDER_TIMEOUT.
  • +
  • Added CPU thread count options for ONNX models to reduce CPU resource contention in high-concurrency scenarios. The default is the system CPU core count and can be changed via the environment variables MINERU_INTRA_OP_NUM_THREADS and MINERU_INTER_OP_NUM_THREADS.
  • +
+
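The defaults described for 2.6.4 (300 seconds, and the CPU core count for both thread settings) can be summarized as a small lookup. This is a minimal sketch of how such settings might be resolved, assuming the variables hold plain integers; `int_from_env` and `render_settings` are hypothetical helpers, and MinerU's own parsing may differ.

```python
import os


def int_from_env(name, default, environ=None):
    """Return the integer value of an environment variable, or `default` if unset or empty."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "").strip()
    return int(raw) if raw else default


def render_settings(environ=None):
    """Resolve the three 2.6.4 settings with their documented defaults."""
    cores = os.cpu_count() or 1  # cpu_count() can return None on some systems
    return {
        "pdf_render_timeout_s": int_from_env("MINERU_PDF_RENDER_TIMEOUT", 300, environ),
        "intra_op_threads": int_from_env("MINERU_INTRA_OP_NUM_THREADS", cores, environ),
        "inter_op_threads": int_from_env("MINERU_INTER_OP_NUM_THREADS", cores, environ),
    }
```

On a shared server, setting both thread variables well below the core count is one way to keep several parsing processes from oversubscribing the CPU.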

2.6.3 (2025/10/31)

+
    +
  • Added support for a new backend vlm-mlx-engine, enabling MLX-accelerated inference for the MinerU2.5 model on Apple Silicon devices. Compared to the vlm-transformers backend, vlm-mlx-engine delivers a 100%–200% speed improvement.
  • +
  • Bug fixes: #3849, #3859
  • +
+

2.6.2 (2025/10/24)

+

pipeline backend optimizations

+
    +
  • Added experimental support for Chinese formulas, enabled by setting the environment variable MINERU_FORMULA_CH_SUPPORT=1 (for example, export MINERU_FORMULA_CH_SUPPORT=1). This feature may slightly reduce MFR speed and cause recognition of some long formulas to fail, so enable it only when you need to parse Chinese formulas. To disable it, set the variable to 0.
  • +
  • OCR speed significantly improved by 200%~300%, thanks to the optimization solution provided by @cjsdurj
  • +
  • Optimized the OCR models to improve the accuracy and coverage of Latin script recognition, and updated the Cyrillic, Arabic, Devanagari, Telugu (te), and Tamil (ta) language systems to ppocr-v5, improving accuracy by over 40% compared to the previous models
  • +
+

vlm backend optimizations

+
    +
  • Optimized the table_caption and table_footnote matching logic, improving caption and footnote matching accuracy and reading order on pages with multiple consecutive tables
  • +
  • Optimized CPU resource usage during high concurrency when using vllm backend, reducing server pressure
  • +
  • Adapted to vllm version 0.11.0
  • +
+

General optimizations

+
    +
  • Improved cross-page table merging: added support for merging cross-page continuation tables, which improves merging results in multi-column merge scenarios
  • +
  • Added environment variable configuration option MINERU_TABLE_MERGE_ENABLE for table merging feature. Table merging is enabled by default and can be disabled by setting this variable to 0
  • +
+
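The two 2.6.2 environment variables, MINERU_FORMULA_CH_SUPPORT and MINERU_TABLE_MERGE_ENABLE, can be set per run rather than globally. The following is a minimal sketch, assuming the run is launched from Python; `env_with_toggles` is a hypothetical helper. It only writes the values documented above (1 and 0 for formula support, 0 to disable table merging) and otherwise leaves the table merging default in place.

```python
import os


def env_with_toggles(formula_ch_support=False, table_merge=True, base=None):
    """Return a copy of the environment with the two feature toggles applied."""
    env = dict(os.environ if base is None else base)
    # Documented values: 1 enables Chinese formula support, 0 disables it.
    env["MINERU_FORMULA_CH_SUPPORT"] = "1" if formula_ch_support else "0"
    if table_merge:
        # Table merging is on by default, so drop any override instead of guessing a value.
        env.pop("MINERU_TABLE_MERGE_ENABLE", None)
    else:
        env["MINERU_TABLE_MERGE_ENABLE"] = "0"
    return env
```

The result can be passed as the `env` argument of `subprocess.run` when invoking the mineru command, so the toggles apply to that run only.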
+

2.5 Series Versions

+

2.5.4 (2025/09/26)

+
    +
  • 🎉🎉 The MinerU2.5 Technical Report is now available! We welcome you to read it for a comprehensive overview of its model architecture, training strategy, data engineering and evaluation results.
  • +
  • Fixed an issue where some PDF files were mistakenly identified as AI files, causing parsing failures
  • +
+

2.5.3 (2025/09/20)

+
    +
  • Adjusted the dependency version range so that Turing and earlier architecture GPUs can use vLLM acceleration for MinerU2.5 model inference.
  • +
  • pipeline backend compatibility fixes for torch 2.8.0.
  • +
  • Reduced default concurrency for vLLM async backend to lower server pressure and avoid connection closure issues caused by high load.
  • +
  • More compatibility-related details can be found in the announcement
  • +
+

2.5.2 (2025/09/19)

+

We are officially releasing MinerU2.5, currently the most powerful multimodal large model for document parsing.

+

With only 1.2B parameters, MinerU2.5's accuracy on the OmniDocBench benchmark comprehensively surpasses top-tier multimodal models like Gemini 2.5 Pro, GPT-4o, and Qwen2.5-VL-72B. It also significantly outperforms leading specialized models such as dots.ocr, MonkeyOCR, and PP-StructureV3.

+

The model has been released on the HuggingFace and ModelScope platforms. You are welcome to download and use it.

+

Core Highlights

+
    +
  • SOTA Performance with Extreme Efficiency: As a 1.2B model, it achieves State-of-the-Art (SOTA) results that exceed models in the 10B and 100B+ classes, redefining the performance-per-parameter standard in document AI.
  • +
  • Advanced Architecture for Across-the-Board Leadership: By combining a two-stage inference pipeline (decoupling layout analysis from content recognition) with a native high-resolution architecture, it achieves SOTA performance across five key areas: layout analysis, text recognition, formula recognition, table recognition, and reading order.
  • +
+

Key Capability Enhancements

+
    +
  • Layout Detection: Delivers more complete results by accurately covering non-body content like headers, footers, and page numbers. It also provides more precise element localization and natural format reconstruction for lists and references.
  • +
  • Table Parsing: Drastically improves parsing for challenging cases, including rotated tables, borderless/semi-structured tables, and long/complex tables.
  • +
  • Formula Recognition: Significantly boosts accuracy for complex, long-form, and hybrid Chinese-English formulas, greatly enhancing the parsing capability for mathematical documents.
  • +
+

Repository Adjustments

+

Additionally, with the release of vlm 2.5, we have made some adjustments to the repository:

+
    +
  • The vlm backend has been upgraded to version 2.5. It supports the MinerU2.5 model and is no longer compatible with the MinerU2.0-2505-0.9B model. The last version supporting the 2.0 model is mineru-2.2.2.
  • +
  • VLM inference-related code has been moved to mineru_vl_utils, reducing coupling with the main mineru repository and facilitating independent iteration in the future.
  • +
  • The vlm accelerated inference framework has been switched from sglang to vllm, achieving full compatibility with the vllm ecosystem, allowing users to use the MinerU2.5 model and accelerated inference on any platform that supports the vllm framework.
  • +
  • Due to major upgrades in the vlm model supporting more layout types, we have made some adjustments to the structure of the parsing intermediate file middle.json and result file content_list.json. Please refer to the documentation for details.
  • +
+

Other Repository Optimizations

+
    +
  • Removed file extension whitelist validation for input files. When input files are PDF documents or images, there are no longer requirements for file extensions, improving usability.
  • +
+
+

2.2 - 2.4 Series Versions

+

2.2.2 (2025/09/10)

+
    +
  • Fixed an issue where failures in parsing some tables with the new table recognition model affected the overall parsing task
  • +
+

2.2.1 (2025/09/08)

+
    +
  • Fixed the issue where some newly added models were not downloaded when using the model download command.
  • +
+

2.2.0 (2025/09/05)

+

Major Updates

+
    +
  • In this version, we focused on improving table parsing accuracy by introducing a new wired table recognition model and a brand-new hybrid table structure parsing algorithm, significantly enhancing the table recognition capabilities of the pipeline backend.
  • +
  • We also added support for cross-page table merging, which is supported by both pipeline and vlm backends, further improving the completeness and accuracy of table parsing.
  • +
+

Other Updates

+
    +
  • The pipeline backend now supports 270-degree rotated table parsing, bringing support for table parsing in 0/90/270-degree orientations
  • +
  • The pipeline backend added OCR support for Thai and Greek and updated the English OCR model to the latest version. English recognition accuracy improved by 11%, and the Thai and Greek recognition models reach 82.68% and 89.28% accuracy, respectively (by PPOCRv5)
  • +
  • Added bbox field (mapped to 0-1000 range) in the output content_list.json, making it convenient for users to directly obtain position information for each content block
  • +
  • Removed the pipeline_old_linux installation option, no longer supporting legacy Linux systems such as CentOS 7, to provide better support for uv's sync/run commands
  • +
+
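Because the bbox field is mapped to a 0-1000 range, consumers need the page size to recover absolute positions. The following is a minimal sketch, assuming the field holds four values in the order [x0, y0, x1, y1] and that each axis is scaled independently by the page width and height; both points are assumptions, not confirmed by the changelog, and `bbox_to_pixels` is a hypothetical helper.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Convert a 0-1000 normalized box to coordinates in page units.

    Assumes bbox is [x0, y0, x1, y1]; page_width and page_height are in the
    unit you want back (for example, pixels of a rendered page image).
    """
    x0, y0, x1, y1 = bbox
    return (
        x0 * page_width / 1000,
        y0 * page_height / 1000,
        x1 * page_width / 1000,
        y1 * page_height / 1000,
    )
```

Normalized coordinates like these stay valid regardless of the resolution at which a page is rendered, which is why the page size is supplied only at conversion time.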
+

2.1 Series Versions

+

2.1.10 (2025/08/01)

+
    +
  • Fixed an issue in the pipeline backend where block overlap caused the parsing results to deviate from expectations #3232
  • +
+

2.1.9 (2025/07/30)

+
    +
  • Adapted to transformers version 4.54.1
  • +
+

2.1.8 (2025/07/28)

+
    +
  • Adapted to sglang version 0.4.9.post5
  • +
+

2.1.7 (2025/07/27)

+
    +
  • Adapted to transformers version 4.54.0
  • +
+

2.1.6 (2025/07/26)

+
    +
  • Fixed table parsing issues in handwritten documents when using vlm backend
  • +
  • Fixed visualization box position drift issue when document is rotated #3175
  • +
+

2.1.5 (2025/07/24)

+
    +
  • Adapted to sglang version 0.4.9 and upgraded the dockerfile base image to sglang 0.4.9.post3 accordingly
  • +
+

2.1.4 (2025/07/23)

+

Bug Fixes

+
    +
  • Fixed the issue of excessive memory consumption during the MFR step in the pipeline backend under certain scenarios #2771
  • +
  • Fixed the inaccurate matching between image/table and caption/footnote under certain conditions #3129
  • +
+

2.1.1 (2025/07/16)

+

Bug fixes

+
    +
  • Fixed text block content loss issue that could occur in certain pipeline scenarios #3005
  • +
  • Fixed issue where sglang-client required unnecessary packages like torch #2968
  • +
  • Updated dockerfile to fix incomplete text content parsing due to missing fonts in Linux #2915
  • +
+

Usability improvements

+
    +
  • Updated compose.yaml to facilitate direct startup of sglang-server, mineru-api, and mineru-gradio services
  • +
  • Launched a brand-new online documentation site and simplified the README to provide a better documentation experience
  • +
+

2.1.0 (2025/07/05)

+

This is the first major update of MinerU 2, which includes a large number of new features and improvements, covering significant performance optimizations, user experience enhancements, and bug fixes. The detailed update contents are as follows:

+

Performance Optimizations

+
    +
  • Significantly improved preprocessing speed for documents with specific resolutions (around 2000 pixels on the long side).
  • +
  • Greatly enhanced post-processing speed when the pipeline backend handles batch processing of documents with fewer pages (<10 pages).
  • +
  • Layout analysis speed of the pipeline backend has been increased by approximately 20%.
  • +
+

Experience Enhancements

+
    +
  • Built-in ready-to-use fastapi service and gradio webui. For detailed usage instructions, please refer to Documentation.
  • +
  • Adapted to sglang version 0.4.8, significantly reducing the GPU memory requirements for the vlm-sglang backend. It can now run on graphics cards with as little as 8GB GPU memory (Turing architecture or newer).
  • +
  • Added transparent parameter passing for all commands related to sglang, allowing the sglang-engine backend to receive all sglang parameters consistently with the sglang-server.
  • +
  • Supports feature extensions based on configuration files, including custom formula delimiters, enabling heading classification, and customizing local model directories. For detailed usage instructions, please refer to Documentation.
  • +
+

New Features

+
    +
  • Updated the pipeline backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. Details
  • +
  • Introduced limited support for vertical text layout in the pipeline backend.
  • +
+
+

2.0 Series Versions

+

2.0.6 (2025/06/20)

+
    +
  • Fixed occasional parsing interruptions caused by invalid block content in vlm mode
  • +
  • Fixed parsing interruptions caused by incomplete table structures in vlm mode
  • +
+

2.0.5 (2025/06/17)

+
    +
  • Fixed the issue where models were still required to be downloaded in the sglang-client mode
  • +
  • Fixed the issue where the sglang-client mode unnecessarily depended on packages like torch during runtime.
  • +
  • Fixed the issue where only the first instance would take effect when attempting to launch multiple sglang-client instances via multiple URLs within the same process
  • +
+

2.0.3 (2025/06/15)

+
    +
  • Fixed a configuration file key-value update error that occurred when the model download type was set to all
  • +
  • Fixed the issue where the formula and table feature toggle switches were not working in command line mode, causing the features to remain enabled.
  • +
  • Fixed compatibility issues with sglang version 0.4.7 in the sglang-engine mode.
  • +
  • Updated Dockerfile and installation documentation for deploying the full version of MinerU in sglang environment
  • +
+

2.0.0 (2025/06/13)

+

New Architecture

+

MinerU 2.0 has been deeply restructured in code organization and interaction methods, significantly improving system usability, maintainability, and extensibility.

+
    +
  • Removal of Third-party Dependency Limitations: Completely eliminated the dependency on pymupdf, moving the project toward a more open and compliant open-source direction.
  • +
  • Ready-to-use, Easy Configuration: No need to manually edit JSON configuration files; most parameters can now be set directly via command line or API.
  • +
  • Automatic Model Management: Added automatic model download and update mechanisms, allowing users to complete model deployment without manual intervention.
  • +
  • Offline Deployment Friendly: Provides built-in model download commands, supporting deployment requirements in completely offline environments.
  • +
  • Streamlined Code Structure: Removed thousands of lines of redundant code, simplified class inheritance logic, significantly improving code readability and development efficiency.
  • +
  • Unified Intermediate Format Output: Adopted the standardized middle_json format, which is compatible with most secondary development scenarios built on this format, so that existing ecosystem integrations can migrate seamlessly.
  • +
+

New Model

+

MinerU 2.0 integrates our latest small-parameter, high-performance multimodal document parsing model, achieving end-to-end high-speed, high-precision document understanding.

+
    +
  • Small Model, Big Capabilities: With fewer than 1B parameters, the model surpasses traditional 72B-level vision-language models (VLMs) in parsing accuracy.
  • +
  • Multiple Functions in One: A single model covers multilingual recognition, handwriting recognition, layout analysis, table parsing, formula recognition, reading order sorting, and other core tasks.
  • +
  • Ultimate Inference Speed: Achieves peak throughput exceeding 10,000 tokens/s through sglang acceleration on a single NVIDIA 4090 card, easily handling large-scale document processing requirements.
  • +
  • Online Experience: You can experience our brand-new VLM model on MinerU.net, Hugging Face, and ModelScope.
  • +
+

Incompatible Changes Notice

+

To improve overall architectural rationality and long-term maintainability, this version contains some incompatible changes:

+
    +
  • Python package name changed from magic-pdf to mineru, and the command-line tool changed from magic-pdf to mineru. Please update your scripts and command calls accordingly.
  • +
  • For modular system design and ecosystem consistency considerations, MinerU 2.0 no longer includes the LibreOffice document conversion module. If you need to process Office documents, we recommend converting them to PDF format through an independently deployed LibreOffice service before proceeding with subsequent parsing operations.
  • +
+
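One common way to convert Office documents to PDF before parsing is LibreOffice's headless mode. The following is a minimal sketch, assuming a local `soffice` binary is on the PATH rather than the independently deployed service mentioned above; `office_to_pdf_command` is a hypothetical helper, and the predicted output name follows LibreOffice's usual behavior of keeping the file stem.

```python
from pathlib import Path


def office_to_pdf_command(source, out_dir, soffice="soffice"):
    """Return (argv, expected_pdf_path) for a headless LibreOffice conversion."""
    source, out_dir = Path(source), Path(out_dir)
    argv = [
        soffice,
        "--headless",            # run without a GUI
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(source),
    ]
    # LibreOffice normally writes <stem>.pdf into the output directory.
    return argv, out_dir / (source.stem + ".pdf")
```

After running the command with `subprocess.run(argv, check=True)`, the resulting PDF path can be passed to mineru with `-p`.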
+

1.x Series Historical Versions

+

1.3.12 (2025/05/24)

+

Added support for PPOCRv5 models, updated ch_server model to PP-OCRv5_rec_server, and ch_lite model to PP-OCRv5_rec_mobile (model update required)

+
    +
  • In testing, we found that PPOCRv5(server) has some improvement for handwritten documents, but has slightly lower accuracy than v4_server_doc for other document types, so the default ch model remains unchanged as PP-OCRv4_server_rec_doc.
  • +
  • Since PPOCRv5 has enhanced recognition capabilities for handwriting and special characters, you can manually choose the PPOCRv5 model for Japanese-Traditional Chinese mixed scenarios and handwritten documents
  • +
  • You can select the appropriate model through the lang parameter lang='ch_server' (Python API) or --lang ch_server (command line):
  • +
  • ch: PP-OCRv4_server_rec_doc (default) (Chinese/English/Japanese/Traditional Chinese mixed/15K dictionary)
  • +
  • ch_server: PP-OCRv5_rec_server (Chinese/English/Japanese/Traditional Chinese mixed + handwriting/18K dictionary)
  • +
  • ch_lite: PP-OCRv5_rec_mobile (Chinese/English/Japanese/Traditional Chinese mixed + handwriting/18K dictionary)
  • +
  • ch_server_v4: PP-OCRv4_rec_server (Chinese/English mixed/6K dictionary)
  • +
  • ch_lite_v4: PP-OCRv4_rec_mobile (Chinese/English mixed/6K dictionary)
  • +
+
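The lang values listed above map one-to-one onto OCR models, which can be captured in a lookup table. The following is a minimal sketch built only from that list; `OCR_MODELS` and `pick_lang` are hypothetical names, and the selection rule (PPOCRv5 server model for handwriting, default otherwise) simply restates the guidance in this entry.

```python
# lang value -> recognition model, as listed in the 1.3.12 notes above
OCR_MODELS = {
    "ch": "PP-OCRv4_server_rec_doc",        # default, 15K dictionary
    "ch_server": "PP-OCRv5_rec_server",     # adds handwriting, 18K dictionary
    "ch_lite": "PP-OCRv5_rec_mobile",       # adds handwriting, 18K dictionary
    "ch_server_v4": "PP-OCRv4_rec_server",  # Chinese/English, 6K dictionary
    "ch_lite_v4": "PP-OCRv4_rec_mobile",    # Chinese/English, 6K dictionary
}


def pick_lang(handwritten=False):
    """Suggest a lang value: PPOCRv5 server model for handwriting, default otherwise."""
    return "ch_server" if handwritten else "ch"
```

The chosen value is what would be passed as `lang='ch_server'` in the Python API or `--lang ch_server` on the command line.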

Added support for handwritten documents through optimized layout recognition of handwritten text areas

+
    +
  • This feature is supported by default, no additional configuration required
  • +
  • You can refer to the instructions above to manually select the PPOCRv5 model for better handwritten document parsing results
  • +
+

The huggingface and modelscope demos have been updated to versions that support handwriting recognition and PPOCRv5 models, which you can experience online

+

1.3.10 (2025/04/29)

+
    +
  • Added support for custom formula delimiters, which can be configured by modifying the latex-delimiter-config section in the magic-pdf.json file in your user directory.
  • +
+

1.3.9 (2025/04/27)

+
    +
  • Optimized formula parsing functionality, improved formula rendering success rate
  • +
+

1.3.8 (2025/04/23)

+

The default ocr model (ch) has been updated to PP-OCRv4_server_rec_doc (model update required)

+
    +
  • PP-OCRv4_server_rec_doc is trained on a mixture of more Chinese document data and PP-OCR training data based on PP-OCRv4_server_rec, adding recognition capabilities for some traditional Chinese characters, Japanese, and special characters. It can recognize over 15,000 characters and improves both document-specific and general text recognition abilities.
  • +
  • Performance comparison of PP-OCRv4_server_rec_doc/PP-OCRv4_server_rec/PP-OCRv4_mobile_rec
  • +
  • After verification, the PP-OCRv4_server_rec_doc model shows significant accuracy improvements in Chinese/English/Japanese/Traditional Chinese in both single language and mixed language scenarios, with comparable speed to PP-OCRv4_server_rec, making it suitable for most use cases.
  • +
  • In some pure English scenarios, PP-OCRv4_server_rec_doc may have word adhesion issues, while PP-OCRv4_server_rec performs better in these cases. Therefore, we've kept the PP-OCRv4_server_rec model, which users can access by adding the parameter lang='ch_server' (Python API) or --lang ch_server (command line).
  • +
+

1.3.7 (2025/04/22)

+
    +
  • Fixed the issue where the lang parameter was ineffective during table parsing model initialization
  • +
  • Fixed the significant speed reduction of OCR and table parsing in cpu mode
  • +
+

1.3.4 (2025/04/16)

+
    +
  • Slightly improved OCR-det speed by removing some unnecessary blocks
  • +
  • Fixed page-internal sorting errors caused by footnotes in certain cases
  • +
+

1.3.2 (2025/04/12)

+
    +
  • Fixed dependency version incompatibility issues when installing on Windows with Python 3.13
  • +
  • Optimized memory usage during batch inference
  • +
  • Improved parsing of tables rotated 90 degrees
  • +
  • Enhanced parsing of oversized tables in financial report samples
  • +
  • Fixed the occasional word adhesion issue in English text areas when OCR language is not specified (model update required)
  • +
+

1.3.1 (2025/04/08)

+

Fixed several compatibility issues

+
    +
  • Added support for Python 3.13
  • +
  • Made final adaptations for outdated Linux systems (such as CentOS 7), with no guarantee of continued support in future versions; see the installation instructions
  • +
+

1.3.0 (2025/04/03)

+

Installation and compatibility optimizations

+
    +
  • Resolved compatibility issues caused by detectron2 by removing layoutlmv3 usage in layout
  • +
  • Extended torch version compatibility to 2.2~2.6 (excluding 2.5)
  • +
  • Added CUDA compatibility for versions 11.8/12.4/12.6/12.8 (CUDA version determined by torch), solving compatibility issues for users with 50-series and H-series GPUs
  • +
  • Extended Python compatibility to versions 3.10~3.12, fixing the issue of automatic downgrade to version 0.6.1 when installing in non-3.10 environments
  • +
  • Optimized offline deployment process, eliminating the need to download any model files after successful deployment
  • +
+

Performance optimizations

+
    +
  • Enhanced parsing speed for batches of small files by supporting batch processing of multiple PDF files (script example), with formula parsing speed improved by up to 1400% and overall parsing speed improved by up to 500% compared to version 1.0.1
  • +
  • Reduced memory usage and improved parsing speed by optimizing MFR model loading and usage (requires re-running the model download process to get incremental updates to model files)
  • +
  • Optimized GPU memory usage, requiring only 6GB minimum to run this project
  • +
  • Improved running speed on MPS devices
  • +
+

Parsing effect optimizations

+
    +
  • Updated MFR model to unimernet(2503), fixing line break loss issues in multi-line formulas
  • +
+

Usability optimizations

+
    +
  • Completely replaced the paddle framework and paddleocr in the project by using paddleocr2torch, resolving conflicts between paddle and torch, as well as thread safety issues caused by the paddle framework
  • +
  • Added real-time progress bar display during parsing, allowing precise tracking of parsing progress and making the waiting process more bearable
  • +
+

1.2.1 (2025/03/03)

+

Fixed some issues

+
    +
  • Fixed an issue where full-width to half-width conversion of letters and numbers also affected punctuation marks
  • +
  • Fixed caption matching inaccuracies in certain scenarios
  • +
  • Fixed formula span loss issues in certain scenarios
  • +
+
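The first fix above concerns converting full-width letters and digits without touching punctuation. The following is a minimal sketch of that kind of conversion, written as a generic illustration rather than MinerU's actual code; `to_halfwidth_alnum` is a hypothetical name. It relies on the Unicode layout in which full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above their ASCII counterparts.

```python
def to_halfwidth_alnum(text):
    """Convert full-width digits and Latin letters to ASCII, leaving punctuation as-is."""
    out = []
    for ch in text:
        code = ord(ch)
        is_fullwidth_alnum = (
            0xFF10 <= code <= 0xFF19      # full-width digits 0-9
            or 0xFF21 <= code <= 0xFF3A   # full-width A-Z
            or 0xFF41 <= code <= 0xFF5A   # full-width a-z
        )
        out.append(chr(code - 0xFEE0) if is_fullwidth_alnum else ch)
    return "".join(out)
```

Restricting the shift to the three alphanumeric ranges is what keeps full-width commas, periods, and exclamation marks unchanged.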

1.2.0 (2025/02/24)

+

This version includes several fixes and improvements to enhance parsing efficiency and accuracy:

+

Performance Optimization

+
    +
  • Increased classification speed for PDF documents in auto mode.
  • +
+

Parsing Optimization

+
    +
  • Improved parsing logic for documents containing watermarks, significantly enhancing the parsing results for such documents.
  • +
  • Enhanced the matching logic for multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts.
  • +
+

Bug Fixes

+
    +
  • Fixed an issue where image/table spans were incorrectly filled into text blocks under certain conditions.
  • +
  • Resolved an issue where title blocks were empty in some cases.
  • +
+

1.1.0 (2025/01/22)

+

In this version we have focused on improving parsing accuracy and efficiency:

+

Model capability upgrade (requires re-executing the model download process to obtain incremental updates of model files)

+
    +
  • The layout recognition model has been upgraded to the latest doclayout_yolo(2501) model, improving layout recognition accuracy.
  • +
  • The formula parsing model has been upgraded to the latest unimernet(2501) model, improving formula recognition accuracy.
  • +
+

Performance optimization

+
    +
  • On devices that meet certain configuration requirements (16GB+ VRAM), by optimizing resource usage and restructuring the processing pipeline, overall parsing speed has been increased by more than 50%.
  • +
+

Parsing effect optimization

+
    +
  • Added a new heading classification feature (testing version, enabled by default) to the online demo (mineru.net/huggingface/modelscope), which supports hierarchical classification of headings, thereby enhancing document structuring.
  • +
+

1.0.1 (2025/01/10)

+

This is our first official release, where we have introduced a completely new API interface and enhanced compatibility through extensive refactoring, as well as a brand new automatic language identification feature:

+

New API Interface

+
    +
  • For the data-side API, we have introduced the Dataset class, designed to provide a robust and flexible data processing framework. This framework currently supports a variety of document formats, including images (.jpg and .png), PDFs, Word documents (.doc and .docx), and PowerPoint presentations (.ppt and .pptx). It ensures effective support for data processing tasks ranging from simple to complex.
  • +
  • For the user-side API, we have meticulously designed the MinerU processing workflow as a series of composable Stages. Each Stage represents a specific processing step, allowing users to define new Stages according to their needs and creatively combine these stages to customize their data processing workflows.
  • +
+

Enhanced Compatibility

+
    +
  • By optimizing the dependency environment and configuration items, we ensure stable and efficient operation on ARM architecture Linux systems.
  • +
  • We have deeply integrated with Huawei Ascend NPU acceleration, providing autonomous and controllable high-performance computing capabilities. This supports the localization and development of AI application platforms in China. Ascend NPU Acceleration
  • +
+

Automatic Language Identification

+
    +
  • Introduced a new language recognition model: setting the lang configuration to auto during document parsing now selects the appropriate OCR language model automatically, improving the accuracy of scanned document parsing.
  • +
+
+

0.x Series Historical Versions

+

0.10.0 (2024/11/22)

+

Introducing hybrid OCR text extraction capabilities:

+
    +
  • Significantly improved parsing performance in complex text distribution scenarios such as dense formulas, irregular span regions, and text represented by images.
  • +
  • Combines the accurate content extraction and higher speed of text mode with the more precise span/line region recognition of OCR mode.
  • +
+

0.9.3 (2024/11/15)

+

Integrated RapidTable for table recognition, improving single-table parsing speed by more than 10 times, with higher accuracy and lower GPU memory usage.

+

0.9.2 (2024/11/06)

+

Integrated the StructTable-InternVL2-1B model for table recognition functionality.

+

0.9.0 (2024/10/31)

+

This is a major new version with extensive code refactoring, addressing numerous issues, improving performance, reducing hardware requirements, and enhancing usability:

+
    +
  • Refactored the sorting module code to use layoutreader for reading order sorting, ensuring high accuracy in various layouts.
  • +
  • Refactored the paragraph concatenation module to achieve good results in cross-column, cross-page, cross-figure, and cross-table scenarios.
  • +
  • Refactored the list and table of contents recognition functions, significantly improving the accuracy of list blocks and table of contents blocks, as well as the parsing of corresponding text paragraphs.
  • +
  • Refactored the matching logic for figures, tables, and descriptive text, greatly enhancing the accuracy of matching captions and footnotes to figures and tables, and reducing the loss rate of descriptive text to near zero.
  • +
  • Added multi-language support for OCR, supporting detection and recognition of 84 languages. For the list of supported languages, see OCR Language Support List.
  • +
  • Added memory recycling logic and other memory optimization measures, significantly reducing memory usage. The memory requirement for enabling all acceleration features except table acceleration (layout/formula/OCR) has been reduced from 16GB to 8GB, and the memory requirement for enabling all acceleration features has been reduced from 24GB to 10GB.
  • +
  • Optimized configuration file feature switches, adding an independent formula detection switch to significantly improve speed and parsing results when formula detection is not needed.
  • +
  • Integrated PDF-Extract-Kit 1.0:
  • +
  • Added the self-developed doclayout_yolo model, which speeds up processing by more than 10 times compared to the original solution while maintaining similar parsing effects, and can be freely switched with layoutlmv3 via the configuration file.
  • +
  • Upgraded formula parsing to unimernet 0.2.1, improving formula parsing accuracy while significantly reducing memory usage.
  • +
  • Due to the repository change for PDF-Extract-Kit 1.0, you need to re-download the model. Please refer to How to Download Models for detailed steps.
  • +
+

0.8.1 (2024/09/27)

+

Fixed some bugs and provided a locally deployable version of the online demo and the front-end interface.

+

0.8.0 (2024/09/09)

+

Added support for fast deployment with a Dockerfile, and launched demos on Huggingface and Modelscope.

+

0.7.1 (2024/08/30)

+

Added a paddle tablemaster table recognition option.

+

0.7.0b1 (2024/08/09)

+

Simplified installation process, added table recognition functionality

+

0.6.2b1 (2024/08/01)

+

Optimized dependency conflict issues and installation documentation

+

Initial Open-Source Release (2024/07/05)

+

MinerU project's first open-source release

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/reference/index.html b/reference/index.html new file mode 100644 index 00000000..691ff0d4 --- /dev/null +++ b/reference/index.html @@ -0,0 +1,1701 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Reference - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Reference Documentation

+

This section provides detailed reference materials for the MinerU project. Here you can find technical specifications, API documentation, output file formats, and version history.

+

Table of Contents

+ +

Documentation Overview

+

Output Files Documentation

+

Understanding the output files generated by MinerU is crucial for effective use of the tool. The output files documentation provides:

+
    +
  • Visual debugging files: Help you understand the document parsing process
  • +
  • Structured data files: Contain detailed parsing results for further processing
  • +
  • File format specifications: Detailed descriptions of each output file type
  • +
+

Changelog

+

The changelog documents the evolution of MinerU, including:

+
    +
  • Version updates: New features and improvements for each release
  • +
  • Bug fixes: Issues resolved in each version
  • +
  • Breaking changes: Important changes that may affect your usage
  • +
  • Deprecations: Features that are being phased out
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/reference/output_files/index.html b/reference/output_files/index.html new file mode 100644 index 00000000..44af4e96 --- /dev/null +++ b/reference/output_files/index.html @@ -0,0 +1,3342 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Output File Format - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

MinerU Output Files Documentation

+

Overview

+

After executing the mineru command, in addition to the main markdown file output, multiple auxiliary files are generated for debugging, quality inspection, and further processing. These files include:

+
    +
  • Visual debugging files: Help users intuitively understand the document parsing process and results
  • +
  • Structured data files: Contain detailed parsing data for secondary development
  • +
+

The following sections provide detailed descriptions of each file's purpose and format.
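The file naming formats documented on this page all follow the pattern {original_filename} plus a fixed suffix. As a minimal sketch, the helper below maps a source file to the auxiliary file names one would expect. The function name `expected_outputs`, the flat `output_dir` layout, and the reading of {original_filename} as the file name without its extension are assumptions made for illustration; the actual directory layout produced by the mineru command is not specified here.

```python
from pathlib import Path

# Suffixes taken from the "File naming format" entries in this document.
OUTPUT_SUFFIXES = {
    "layout": "_layout.pdf",
    "span": "_span.pdf",            # documented as pipeline backend only
    "model": "_model.json",
    "middle": "_middle.json",
    "content_list": "_content_list.json",
}


def expected_outputs(source_file: str, output_dir: str) -> dict[str, Path]:
    """Map each auxiliary file kind to the path it would have under output_dir.

    Hypothetical helper: assumes all files sit directly in output_dir.
    """
    stem = Path(source_file).stem   # assumed meaning of {original_filename}
    return {
        kind: Path(output_dir) / f"{stem}{suffix}"
        for kind, suffix in OUTPUT_SUFFIXES.items()
    }


paths = expected_outputs("papers/demo.pdf", "out")
for kind, path in paths.items():
    print(f"{kind}: {path}")
```

A check such as `path.exists()` on each entry can then show which files a given run actually produced.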

+

Visual Debugging Files

+

Layout Analysis File (layout.pdf)

+

File naming format: {original_filename}_layout.pdf

+

Functionality:

+
    +
  • Visualizes layout analysis results for each page
  • +
  • Numbers in the top-right corner of each detection box indicate reading order
  • +
  • Different background colors distinguish different types of content blocks
  • +
+

Use cases:

+
    +
  • Check if layout analysis is correct
  • +
  • Verify if reading order is reasonable
  • +
  • Debug layout-related issues
  • +
+

layout page example

+

Text Spans File (span.pdf)

+
+

Note

+

Only applicable to pipeline backend

+
+

File naming format: {original_filename}_span.pdf

+

Functionality:

+
    +
  • Uses different colored line boxes to annotate page content based on span type
  • +
  • Used for quality inspection and issue troubleshooting
  • +
+

Use cases:

+
    +
  • Quickly troubleshoot text loss issues
  • +
  • Check inline formula recognition
  • +
  • Verify text segmentation accuracy
  • +
+

span page example

+

Structured Data Files

+
+

Important

+

The VLM backend output changed significantly in version 2.5 and is not backward-compatible with the pipeline backend. If you plan to build secondary development on the structured outputs, please read this document carefully.

+
+

Pipeline Backend Output Results

+

Model Inference Results (model.json)

+

File naming format: {original_filename}_model.json

+
Data Structure Definition
+
from pydantic import BaseModel, Field
+from enum import IntEnum
+
+class CategoryType(IntEnum):
+    """Content category enumeration"""
+    title = 0               # Title
+    plain_text = 1          # Text
+    abandon = 2             # Including headers, footers, page numbers, and page annotations
+    figure = 3              # Image
+    figure_caption = 4      # Image caption
+    table = 5               # Table
+    table_caption = 6       # Table caption
+    table_footnote = 7      # Table footnote
+    isolate_formula = 8     # Interline formula
+    formula_caption = 9     # Interline formula number
+    embedding = 13          # Inline formula
+    isolated = 14           # Interline formula
+    text = 15               # OCR recognition result
+
+class PageInfo(BaseModel):
+    """Page information"""
+    page_no: int = Field(description="Page number, first page is 0", ge=0)
+    height: int = Field(description="Page height", gt=0)
+    width: int = Field(description="Page width", gt=0)
+
+class ObjectInferenceResult(BaseModel):
+    """Object recognition result"""
+    category_id: CategoryType = Field(description="Category", ge=0)
+    poly: list[float] = Field(description="Quadrilateral coordinates, format: [x0,y0,x1,y1,x2,y2,x3,y3]")
+    score: float = Field(description="Confidence score of inference result")
+    latex: str | None = Field(description="LaTeX parsing result", default=None)
+    html: str | None = Field(description="HTML parsing result", default=None)
+
+class PageInferenceResults(BaseModel):
+    """Page inference results"""
+    layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results")
+    page_info: PageInfo = Field(description="Page metadata")
+
+# Complete inference results
+inference_result: list[PageInferenceResults] = []
+
+
Coordinate System Description
+

poly coordinate format: [x0, y0, x1, y1, x2, y2, x3, y3]

+
    +
  • Represents coordinates of top-left, top-right, bottom-right, bottom-left points respectively
  • +
  • Coordinate origin is at the top-left corner of the page
  • +
+

poly coordinate diagram

+
Sample Data
+
[
+    {
+        "layout_dets": [
+            {
+                "category_id": 2,
+                "poly": [
+                    99.1906967163086,
+                    100.3119125366211,
+                    730.3707885742188,
+                    100.3119125366211,
+                    730.3707885742188,
+                    245.81326293945312,
+                    99.1906967163086,
+                    245.81326293945312
+                ],
+                "score": 0.9999997615814209
+            }
+        ],
+        "page_info": {
+            "page_no": 0,
+            "height": 2339,
+            "width": 1654
+        }
+    },
+    {
+        "layout_dets": [
+            {
+                "category_id": 5,
+                "poly": [
+                    99.13092803955078,
+                    2210.680419921875,
+                    497.3183898925781,
+                    2210.680419921875,
+                    497.3183898925781,
+                    2264.78076171875,
+                    99.13092803955078,
+                    2264.78076171875
+                ],
+                "score": 0.9999997019767761
+            }
+        ],
+        "page_info": {
+            "page_no": 1,
+            "height": 2339,
+            "width": 1654
+        }
+    }
+]
+
+
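As a rough sketch of how the pipeline model.json structure above might be consumed, the snippet below walks the page list, filters detections by confidence, and collapses each poly quadrilateral into an axis-aligned box. It uses only the standard library. The names `poly_to_bbox` and `detections`, the shortened sample values, and the 0.5 score threshold are illustrative assumptions, not part of MinerU.

```python
import json

# Subset of the CategoryType values listed in the data structure definition.
CATEGORY_NAMES = {0: "title", 1: "plain_text", 2: "abandon", 3: "figure", 5: "table"}


def poly_to_bbox(poly):
    """Collapse [x0,y0,...,x3,y3] into an axis-aligned [x_min, y_min, x_max, y_max]."""
    xs, ys = poly[0::2], poly[1::2]
    return [min(xs), min(ys), max(xs), max(ys)]


def detections(pages, min_score=0.5):
    """Yield (page_no, category_name, bbox) for detections scoring at least min_score.

    min_score is an arbitrary illustrative threshold.
    """
    for page in pages:
        page_no = page["page_info"]["page_no"]
        for det in page["layout_dets"]:
            if det["score"] >= min_score:
                category = det["category_id"]
                name = CATEGORY_NAMES.get(category, str(category))
                yield page_no, name, poly_to_bbox(det["poly"])


# Shortened stand-in for the contents of {original_filename}_model.json.
sample = json.loads("""
[{"layout_dets": [{"category_id": 5,
                   "poly": [99.1, 2210.7, 497.3, 2210.7, 497.3, 2264.8, 99.1, 2264.8],
                   "score": 0.99}],
  "page_info": {"page_no": 1, "height": 2339, "width": 1654}}]
""")
results = list(detections(sample))
print(results)
```

Taking the minimum and maximum of the corner coordinates, rather than reading corners 0 and 2 directly, keeps the conversion valid even if a quadrilateral is slightly skewed.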

Intermediate Processing Results (middle.json)

+

File naming format: {original_filename}_middle.json

+
Top-level Structure
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameTypeDescription
pdf_infolist[dict]Array of parsing results for each page
_backendstringParsing mode: pipeline or vlm
_version_namestringMinerU version number
+
Page Information Structure (pdf_info)
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
preproc_blocksUnsegmented intermediate results after PDF preprocessing
page_idxPage number, starting from 0
page_sizePage width and height [width, height]
imagesImage block information list
tablesTable block information list
interline_equationsInterline formula block information list
discarded_blocksBlock information to be discarded
para_blocksContent block results after segmentation
+
Block Structure Hierarchy
+
Level 1 blocks (table | image)
+└── Level 2 blocks
+    └── Lines
+        └── Spans
+
+
Level 1 Block Fields
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
typeBlock type: table or image
bboxRectangular box coordinates of the block [x0, y0, x1, y1]
blocksList of contained level 2 blocks
+
Level 2 Block Fields
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
typeBlock type (see table below)
bboxRectangular box coordinates of the block
linesList of contained line information
+
Level 2 Block Types
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
TypeDescription
image_bodyImage body
image_captionImage caption text
image_footnoteImage footnote
table_bodyTable body
table_captionTable caption text
table_footnoteTable footnote
textText block
titleTitle block
indexIndex block
listList block
interline_equationInterline formula block
+
Line and Span Structure
+

Line fields:

  • bbox: Rectangular box coordinates of the line
  • spans: List of contained spans

+

Span fields:

  • bbox: Rectangular box coordinates of the span
  • type: Span type (image, table, text, inline_equation, interline_equation)
  • content | img_path: Text content or image path

+
Sample Data
+
{
+    "pdf_info": [
+        {
+            "preproc_blocks": [
+                {
+                    "type": "text",
+                    "bbox": [
+                        52,
+                        61.956024169921875,
+                        294,
+                        82.99800872802734
+                    ],
+                    "lines": [
+                        {
+                            "bbox": [
+                                52,
+                                61.956024169921875,
+                                294,
+                                72.0000228881836
+                            ],
+                            "spans": [
+                                {
+                                    "bbox": [
+                                        54.0,
+                                        61.956024169921875,
+                                        296.2261657714844,
+                                        72.0000228881836
+                                    ],
+                                    "content": "dependent on the service headway and the reliability of the departure ",
+                                    "type": "text",
+                                    "score": 1.0
+                                }
+                            ]
+                        }
+                    ]
+                }
+            ],
+            "layout_bboxes": [
+                {
+                    "layout_bbox": [
+                        52,
+                        61,
+                        294,
+                        731
+                    ],
+                    "layout_label": "V",
+                    "sub_layout": []
+                }
+            ],
+            "page_idx": 0,
+            "page_size": [
+                612.0,
+                792.0
+            ],
+            "_layout_tree": [],
+            "images": [],
+            "tables": [],
+            "interline_equations": [],
+            "discarded_blocks": [],
+            "para_blocks": [
+                {
+                    "type": "text",
+                    "bbox": [
+                        52,
+                        61.956024169921875,
+                        294,
+                        82.99800872802734
+                    ],
+                    "lines": [
+                        {
+                            "bbox": [
+                                52,
+                                61.956024169921875,
+                                294,
+                                72.0000228881836
+                            ],
+                            "spans": [
+                                {
+                                    "bbox": [
+                                        54.0,
+                                        61.956024169921875,
+                                        296.2261657714844,
+                                        72.0000228881836
+                                    ],
+                                    "content": "dependent on the service headway and the reliability of the departure ",
+                                    "type": "text",
+                                    "score": 1.0
+                                }
+                            ]
+                        }
+                    ]
+                }
+            ]
+        }
+    ],
+    "_backend": "pipeline",
+    "_version_name": "0.6.1"
+}
+
+
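The block hierarchy described above (level 1 blocks, level 2 blocks, lines, spans) can be traversed with a small recursive walk. The sketch below collects the text of every span in a page's para_blocks. The helper names `iter_spans` and `page_text` are hypothetical, and the inline dictionary is a cut-down stand-in for a real middle.json file.

```python
def iter_spans(block):
    """Yield every span under a block, descending through nested blocks if present."""
    for child in block.get("blocks", []):   # level 1 (table | image) -> level 2
        yield from iter_spans(child)
    for line in block.get("lines", []):     # level 2 -> lines -> spans
        yield from line.get("spans", [])


def page_text(page):
    """Join the text content of all spans found in a page's para_blocks."""
    parts = []
    for block in page.get("para_blocks", []):
        for span in iter_spans(block):
            if "content" in span:           # spans with only img_path are skipped
                parts.append(span["content"].strip())
    return " ".join(parts)


# Cut-down stand-in for the contents of {original_filename}_middle.json.
middle = {
    "pdf_info": [
        {
            "page_idx": 0,
            "para_blocks": [
                {
                    "type": "text",
                    "lines": [
                        {"spans": [{"type": "text",
                                    "content": "dependent on the service headway "}]}
                    ],
                }
            ],
        }
    ],
    "_backend": "pipeline",
}
texts = {page["page_idx"]: page_text(page) for page in middle["pdf_info"]}
print(texts)
```

Because `iter_spans` checks for both `blocks` and `lines`, the same walk handles plain text blocks and table or image blocks that wrap level 2 blocks.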

Content List (content_list.json)

+

File naming format: {original_filename}_content_list.json

+
Functionality
+

This is a simplified version of middle.json that stores all readable content blocks in reading order as a flat structure, removing complex layout information for easier subsequent processing.

+
Content Types
+ + + + + + + + + + + + + + + + + + + + + + + + + +
TypeDescription
imageImage
tableTable
textText/Title
equationInterline formula
+
Text Level Identification
+

Text levels are distinguished through the text_level field:

+
    +
  • No text_level or text_level: 0: Body text
  • +
  • text_level: 1: Level 1 heading
  • +
  • text_level: 2: Level 2 heading
  • +
  • And so on...
  • +
+
Common Fields
+
    +
  • All content blocks include a page_idx field indicating the page number (starting from 0).
  • +
  • All content blocks include a bbox field representing the bounding box coordinates of the content block [x0, y0, x1, y1], mapped to a range of 0-1000.
  • +
+
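Since bbox values in content_list.json are mapped to a 0-1000 range, they need to be rescaled before they can be drawn on a page. The sketch below assumes that x values scale with the page width and y values with the page height, independently; this document does not spell out the exact mapping, so treat that as an assumption. The function name `denormalize_bbox` is hypothetical, and the page size used here is the page_size value from the middle.json sample.

```python
def denormalize_bbox(bbox, page_width, page_height, scale=1000):
    """Convert a [x0, y0, x1, y1] box from the 0-1000 range to page units.

    Assumes x is scaled by page width and y by page height.
    """
    x0, y0, x1, y1 = bbox
    return [
        x0 * page_width / scale,
        y0 * page_height / scale,
        x1 * page_width / scale,
        y1 * page_height / scale,
    ]


# bbox from the content_list.json sample, page size from the middle.json sample.
pixel_bbox = denormalize_bbox([62, 480, 946, 904], page_width=612.0, page_height=792.0)
print(pixel_bbox)
```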
Sample Data
+
[
+    {
+        "type": "text",
+        "text": "The response of flow duration curves to afforestation ",
+        "text_level": 1, 
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],
+        "page_idx": 0
+    },
+    {
+        "type": "image",
+        "img_path": "images/a8ecda1c69b27e4f79fce1589175a9d721cbdc1cf78b4cc06a015f3746f6b9d8.jpg",
+        "image_caption": [
+            "Fig. 1. Annual flow duration curves of daily flows from Pine Creek, Australia, 1989–2000. "
+        ],
+        "image_footnote": [],
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],
+        "page_idx": 1
+    },
+    {
+        "type": "equation",
+        "img_path": "images/181ea56ef185060d04bf4e274685f3e072e922e7b839f093d482c29bf89b71e8.jpg",
+        "text": "$$\nQ _ { \\% } = f ( P ) + g ( T )\n$$",
+        "text_format": "latex",
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],
+        "page_idx": 2
+    },
+    {
+        "type": "table",
+        "img_path": "images/e3cb413394a475e555807ffdad913435940ec637873d673ee1b039e3bc3496d0.jpg",
+        "table_caption": [
+            "Table 2 Significance of the rainfall and time terms "
+        ],
+        "table_footnote": [
+            "indicates that the rainfall term was significant at the $5 \\%$ level, $T$ indicates that the time term was significant at the $5 \\%$ level, \\* represents significance at the $10 \\%$ level, and na denotes too few data points for meaningful analysis. "
+        ],
+        "table_body": "<html><body><table><tr><td rowspan=\"2\">Site</td><td colspan=\"10\">Percentile</td></tr><tr><td>10</td><td>20</td><td>30</td><td>40</td><td>50</td><td>60</td><td>70</td><td>80</td><td>90</td><td>100</td></tr><tr><td>Traralgon Ck</td><td>P</td><td>P,*</td><td>P</td><td>P</td><td>P,</td><td>P,</td><td>P,</td><td>P,</td><td>P</td><td>P</td></tr><tr><td>Redhill</td><td>P,T</td><td>P,T</td><td>,*</td><td>**</td><td>P.T</td><td>P,*</td><td>P*</td><td>P*</td><td>*</td><td>,*</td></tr><tr><td>Pine Ck</td><td></td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td><td>T</td><td>na</td><td>na</td></tr><tr><td>Stewarts Ck 5</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P.T</td><td>P.T</td><td>P,T</td><td>na</td><td>na</td><td>na</td></tr><tr><td>Glendhu 2</td><td>P</td><td>P,T</td><td>P,*</td><td>P,T</td><td>P.T</td><td>P,ns</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td></tr><tr><td>Cathedral Peak 2</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Cathedral Peak 3</td><td>P.T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Lambrechtsbos A</td><td>P,T</td><td>P</td><td>P</td><td>P,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>T</td></tr><tr><td>Lambrechtsbos B</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td></tr><tr><td>Biesievlei</td><td>P,T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>*,T</td><td>T</td><td>T</td><td>P,T</td><td>P,T</td></tr></table></body></html>",
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],  
+        "page_idx": 5
+    }
+]
+
+
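Because content_list.json is flat and already in reading order, it can be turned into simple Markdown with one pass. The following is a minimal sketch, not MinerU's own Markdown renderer: the function name `to_markdown` is hypothetical, and the way captions and table bodies are laid out is an arbitrary choice for illustration.

```python
def to_markdown(entries):
    """Render content_list entries as Markdown, using text_level for heading depth."""
    lines = []
    for entry in entries:
        kind = entry["type"]
        if kind == "text":
            level = entry.get("text_level", 0)          # missing or 0 means body text
            prefix = "#" * level + " " if level > 0 else ""
            lines.append(prefix + entry["text"].strip())
        elif kind == "equation":
            lines.append(entry["text"])                  # already LaTeX wrapped in $$
        elif kind == "image":
            lines.append(f"![]({entry['img_path']})")
            lines.extend(c.strip() for c in entry.get("image_caption", []))
        elif kind == "table":
            lines.extend(c.strip() for c in entry.get("table_caption", []))
            lines.append(entry.get("table_body", ""))    # HTML table string
    return "\n\n".join(lines)


entries = [
    {"type": "text",
     "text": "The response of flow duration curves to afforestation ",
     "text_level": 1, "page_idx": 0},
    {"type": "image", "img_path": "images/fig1.jpg",
     "image_caption": ["Fig. 1. Annual flow duration curves "],
     "image_footnote": [], "page_idx": 1},
]
markdown = to_markdown(entries)
print(markdown)
```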

VLM Backend Output Results

+

Model Inference Results (model.json)

+

File naming format: {original_filename}_model.json

+
File format description
+
    +
  • Two-level nested list: outer list = pages; inner list = content blocks of that page
  • +
  • Each block is a dict with at least: type, bbox, angle, content (some types add extra fields like score, block_tags, content_tags, format)
  • +
  • Designed for direct, raw model inspection
  • +
+
Supported content types (type field values)
+
{
+  "text": "Plain text",
+  "title": "Title",
+  "equation": "Display (interline) formula",
+  "image": "Image",
+  "image_caption": "Image caption",
+  "image_footnote": "Image footnote",
+  "table": "Table",
+  "table_caption": "Table caption",
+  "table_footnote": "Table footnote",
+  "phonetic": "Phonetic annotation",
+  "code": "Code block",
+  "code_caption": "Code caption",
+  "ref_text": "Reference / citation entry",
+  "algorithm": "Algorithm block (treated as code subtype)",
+  "list": "List container",
+  "header": "Page header",
+  "footer": "Page footer",
+  "page_number": "Page number",
+  "aside_text": "Side / margin note",
+  "page_footnote": "Page footnote"
+}
+
+
Coordinate system
+
    +
  • bbox = [x0, y0, x1, y1] (top-left, bottom-right)
  • +
  • Origin at top-left of the page
  • +
  • All coordinates are normalized to the range [0, 1] (fractions of the page dimensions, not percentages or pixels)
  • +
+
Sample data
+
[
+  [
+    {
+      "type": "header",
+      "bbox": [0.077, 0.095, 0.18, 0.181],
+      "angle": 0,
+      "score": null,
+      "block_tags": null,
+      "content": "ELSEVIER",
+      "format": null,
+      "content_tags": null
+    },
+    {
+      "type": "title",
+      "bbox": [0.157, 0.228, 0.833, 0.253],
+      "angle": 0,
+      "score": null,
+      "block_tags": null,
+      "content": "The response of flow duration curves to afforestation",
+      "format": null,
+      "content_tags": null
+    }
+  ]
+]
+
+
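To illustrate the two-level list layout and the [0, 1] coordinate range of the VLM model.json, the sketch below iterates pages and blocks and converts each bbox to pixel coordinates. Page sizes are not part of the fields listed above, so the caller supplies them; the `page_sizes` argument and the function name `blocks_in_pixels` are assumptions for this example.

```python
def blocks_in_pixels(pages, page_sizes):
    """Yield (page_index, type, pixel_bbox, content) for every block.

    pages: outer list = pages, inner list = blocks of that page.
    page_sizes: one (width, height) tuple per page, supplied by the caller.
    """
    for page_index, blocks in enumerate(pages):
        width, height = page_sizes[page_index]
        for block in blocks:
            x0, y0, x1, y1 = block["bbox"]
            pixel_bbox = [
                round(x0 * width),
                round(y0 * height),
                round(x1 * width),
                round(y1 * height),
            ]
            yield page_index, block["type"], pixel_bbox, block["content"]


pages = [
    [
        {"type": "header", "bbox": [0.077, 0.095, 0.18, 0.181],
         "angle": 0, "content": "ELSEVIER"},
    ]
]
rows = list(blocks_in_pixels(pages, page_sizes=[(1000, 2000)]))
print(rows)
```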

Intermediate Processing Results (middle.json)

+

File naming format: {original_filename}_middle.json

+

Structure is broadly similar to the pipeline backend, but with these differences:

+
    +
  • list becomes a second-level block; a new sub_type field distinguishes list categories:
      +
    • text: ordinary list
    • +
    • ref_text: reference / bibliography style list
    • +
    +
  • +
  • New code block type with sub_type (a code block always has at least a code_body and may optionally have a code_caption):
      +
    • code
    • +
    • algorithm
    • +
    +
  • +
  • discarded_blocks may contain additional types:
      +
    • header
    • +
    • footer
    • +
    • page_number
    • +
    • aside_text
    • +
    • page_footnote
    • +
    +
  • +
  • All blocks include an angle field indicating rotation (one of 0, 90, 180, 270).
  • +
+
Examples
+
    +
  • +

    Example: list block +

    {
    +  "bbox": [174,155,818,333],
    +  "type": "list",
    +  "angle": 0,
    +  "index": 11,
    +  "blocks": [
    +    {
    +      "bbox": [174,157,311,175],
    +      "type": "text",
    +      "angle": 0,
    +      "lines": [
    +        {
    +          "bbox": [174,157,311,175],
    +          "spans": [
    +            {
    +              "bbox": [174,157,311,175],
    +              "type": "text",
    +              "content": "H.1 Introduction"
    +            }
    +          ]
    +        }
    +      ],
    +      "index": 3
    +    },
    +    {
    +      "bbox": [175,182,464,229],
    +      "type": "text",
    +      "angle": 0,
    +      "lines": [
    +        {
    +          "bbox": [175,182,464,229],
    +          "spans": [
    +            {
    +              "bbox": [175,182,464,229],
    +              "type": "text",
    +              "content": "H.2 Example: Divide by Zero without Exception Handling"
    +            }
    +          ]
    +        }
    +      ],
    +      "index": 4
    +    }
    +  ],
    +  "sub_type": "text"
    +}
    +
    +
  • +
  • +

    Example: code block with optional caption: +

    {
    +  "type": "code",
    +  "bbox": [114,780,885,1231],
    +  "blocks": [
    +    {
    +      "bbox": [114,780,885,1231],
    +      "lines": [
    +        {
    +          "bbox": [114,780,885,1231],
    +          "spans": [
    +            {
    +              "bbox": [114,780,885,1231],
    +              "type": "text",
    +              "content": "1 // Fig. H.1: DivideByZeroNoExceptionHandling.java  \n2 // Integer division without exception handling.  \n3 import java.util.Scanner;  \n4  \n5 public class DivideByZeroNoExceptionHandling  \n6 {  \n7 // demonstrates throwing an exception when a divide-by-zero occurs  \n8 public static int quotient( int numerator, int denominator )  \n9 {  \n10 return numerator / denominator; // possible division by zero  \n11 } // end method quotient  \n12  \n13 public static void main(String[] args)  \n14 {  \n15 Scanner scanner = new Scanner(System.in); // scanner for input  \n16  \n17 System.out.print(\"Please enter an integer numerator: \");  \n18 int numerator = scanner.nextInt();  \n19 System.out.print(\"Please enter an integer denominator: \");  \n20 int denominator = scanner.nextInt();  \n21"
    +            }
    +          ]
    +        }
    +      ],
    +      "index": 17,
    +      "angle": 0,
    +      "type": "code_body"
    +    },
    +    {
    +      "bbox": [867,160,1280,189],
    +      "lines": [
    +        {
    +          "bbox": [867,160,1280,189],
    +          "spans": [
    +            {
    +              "bbox": [867,160,1280,189],
    +              "type": "text",
    +              "content": "Algorithm 1 Modules for MCTSteg"
    +            }
    +          ]
    +        }
    +      ],
    +      "index": 19,
    +      "angle": 0,
    +      "type": "code_caption"
    +    }
    +  ],
    +  "index": 17,
    +  "sub_type": "code"
    +}
    +
    +
  • +
+
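The code block structure shown above (a code_body child plus an optional code_caption child) can be unpacked as follows. This is a sketch under the assumption that child blocks carry their text in lines and spans exactly as in the examples; the helper names `block_text` and `split_code_block` and the shortened sample content are hypothetical.

```python
def block_text(block):
    """Concatenate span contents line by line for a block that holds lines."""
    return "\n".join(
        "".join(span.get("content", "") for span in line.get("spans", []))
        for line in block.get("lines", [])
    )


def split_code_block(block):
    """Return (sub_type, body_text, caption_texts) for a block of type 'code'."""
    body, captions = "", []
    for child in block.get("blocks", []):
        if child["type"] == "code_body":
            body = block_text(child)
        elif child["type"] == "code_caption":    # optional
            captions.append(block_text(child))
    return block.get("sub_type"), body, captions


code_block = {
    "type": "code",
    "sub_type": "code",
    "blocks": [
        {"type": "code_body",
         "lines": [{"spans": [{"type": "text", "content": "int x = 1;"}]}]},
        {"type": "code_caption",
         "lines": [{"spans": [{"type": "text", "content": "Listing 1"}]}]},
    ],
}
result = split_code_block(code_block)
print(result)
```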

Content List (content_list.json)

+

File naming format: {original_filename}_content_list.json

+

Based on the pipeline format, with these VLM-specific extensions:

+
    +
  • New code type with sub_type (code | algorithm):
      +
    • Fields: code_body (string), optional code_caption (list of strings)
    • +
    +
  • +
  • New list type with sub_type (text | ref_text):
      +
    • Field: list_items (array of strings)
    • +
    +
  • +
  • All discarded_blocks entries are also output (e.g., headers, footers, page numbers, margin notes, page footnotes).
  • +
  • Existing types (image, table, text, equation) remain unchanged.
  • +
  • bbox still uses the 0–1000 normalized coordinate mapping.
  • +
+
Examples
+

Example: code (algorithm) entry +

{
+  "type": "code",
+  "sub_type": "algorithm",
+  "code_caption": ["Algorithm 1 Modules for MCTSteg"],
+  "code_body": "1: function GETCOORDINATE(d)  \n2:  $x \\gets d / l$ ,  $y \\gets d$  mod  $l$   \n3: return  $(x, y)$   \n4: end function  \n5: function BESTCHILD(v)  \n6:  $C \\gets$  child set of  $v$   \n7:  $v' \\gets \\arg \\max_{c \\in C} \\mathrm{UCTScore}(c)$   \n8:  $v'.n \\gets v'.n + 1$   \n9: return  $v'$   \n10: end function  \n11: function BACK PROPAGATE(v)  \n12: Calculate  $R$  using Equation 11  \n13: while  $v$  is not a root node do  \n14:  $v.r \\gets v.r + R$ ,  $v \\gets v.p$   \n15: end while  \n16: end function  \n17: function RANDOMSEARCH(v)  \n18: while  $v$  is not a leaf node do  \n19: Randomly select an untried action  $a \\in A(v)$   \n20: Create a new node  $v'$   \n21:  $(x, y) \\gets \\mathrm{GETCOORDINATE}(v'.d)$   \n22:  $v'.p \\gets v$ ,  $v'.d \\gets v.d + 1$ ,  $v'.\\Gamma \\gets v.\\Gamma$   \n23:  $v'.\\gamma_{x,y} \\gets a$   \n24: if  $a = -1$  then  \n25:  $v.lc \\gets v'$   \n26: else if  $a = 0$  then  \n27:  $v.mc \\gets v'$   \n28: else  \n29:  $v.rc \\gets v'$   \n30: end if  \n31:  $v \\gets v'$   \n32: end while  \n33: return  $v$   \n34: end function  \n35: function SEARCH(v)  \n36: while  $v$  is fully expanded do  \n37:  $v \\gets$  BESTCHILD(v)  \n38: end while  \n39: if  $v$  is not a leaf node then  \n40:  $v \\gets$  RANDOMSEARCH(v)  \n41: end if  \n42: return  $v$   \n43: end function",
+  "bbox": [510,87,881,740],
+  "page_idx": 0
+}
+
+

Example: list (text) entry +

{
+  "type": "list",
+  "sub_type": "text",
+  "list_items": [
+    "H.1 Introduction",
+    "H.2 Example: Divide by Zero without Exception Handling",
+    "H.3 Example: Divide by Zero with Exception Handling",
+    "H.4 Summary"
+  ],
+  "bbox": [174,155,818,333],
+  "page_idx": 0
+}
+
+

Example: discarded blocks output +

[
+  {
+    "type": "header",
+    "text": "Journal of Hydrology 310 (2005) 253-265",
+    "bbox": [363,164,623,177],
+    "page_idx": 0
+  },
+  {
+    "type": "page_footnote",
+    "text": "* Corresponding author. Address: Forest Science Centre, Department of Sustainability and Environment, P.O. Box 137, Heidelberg, Vic. 3084, Australia. Tel.: +61 3 9450 8719; fax: +61 3 9450 8644.",
+    "bbox": [71,815,915,841],
+    "page_idx": 0
+  }
+]
+
+
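Since the VLM content list also includes discarded blocks, a consumer will often want to separate page furniture from body content before rendering. The sketch below does that and renders list and code entries as plain text. It assumes the discarded entries use the same type names listed for discarded_blocks in the VLM middle.json section; `split_entries` and `render` are hypothetical helper names.

```python
# Type names listed for discarded_blocks in the VLM middle.json section.
DISCARDED_TYPES = {"header", "footer", "page_number", "aside_text", "page_footnote"}


def split_entries(entries):
    """Separate body content from discarded page furniture, preserving order."""
    body = [e for e in entries if e["type"] not in DISCARDED_TYPES]
    furniture = [e for e in entries if e["type"] in DISCARDED_TYPES]
    return body, furniture


def render(entry):
    """Render a single entry as plain text (illustrative formatting only)."""
    if entry["type"] == "list":
        return "\n".join(f"- {item}" for item in entry["list_items"])
    if entry["type"] == "code":
        caption = " ".join(entry.get("code_caption", []))
        return (caption + "\n" if caption else "") + entry["code_body"]
    return entry.get("text", "")


entries = [
    {"type": "header", "text": "Journal of Hydrology 310 (2005) 253-265",
     "bbox": [363, 164, 623, 177], "page_idx": 0},
    {"type": "list", "sub_type": "text",
     "list_items": ["H.1 Introduction", "H.4 Summary"],
     "bbox": [174, 155, 818, 333], "page_idx": 0},
]
body, furniture = split_entries(entries)
print(render(body[0]))
print([e["type"] for e in furniture])
```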

Summary

+

The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:

+
    +
  • +

    Model outputs (Use raw outputs):

    +
      +
    • model.json
    • +
    +
  • +
  • +

    Debugging and verification (Use visualization files):

    +
      +
    • layout.pdf
    • +
    • span.pdf
    • +
    +
  • +
  • +

    Content extraction (Use simplified files):

    +
      +
    • *.md
    • +
    • content_list.json
    • +
    +
  • +
  • +

    Secondary development (Use structured files):

    +
      +
    • middle.json
    • +
    +
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 00000000..6669adad --- /dev/null +++ b/requirements.txt @@ -0,0 +1,4 @@ +mkdocs +mkdocs-static-i18n +markdown-gfm-admonition +mkdocs-video \ No newline at end of file diff --git a/search/search_index.json b/search/search_index.json new file mode 100644 index 00000000..d598a198 --- /dev/null +++ b/search/search_index.json @@ -0,0 +1 @@ +{"config":{"lang":["en","zh"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"MinerU","text":"

\ud83d\ude80MinerU Official Website\u2192\u2705 Zero-Install Online Version \u2705 Full-Featured Client \u2705 Developer API Online Access, skip deployment hassles, get all product formats with one click, go fast!

\ud83d\udc4b join us on Discord and WeChat

"},{"location":"#project-introduction","title":"Project Introduction","text":"

MinerU is a tool that converts PDFs into machine-readable formats (e.g., markdown, JSON), allowing for easy extraction into any format. MinerU was born during the pre-training process of InternLM. We focus on solving symbol conversion issues in scientific literature and hope to contribute to technological development in the era of large models. Compared to well-known commercial products domestically and internationally, MinerU is still young. If you encounter any issues or if the results are not as expected, please submit an issue on GitHub Issues and attach the relevant PDF.

"},{"location":"#key-features","title":"Key Features","text":"
  • Remove headers, footers, footnotes, page numbers and other elements to ensure semantic coherence
  • Output text in human reading order, suitable for single-column, multi-column and complex layouts
  • Retain the original document structure, including titles, paragraphs, lists, etc.
  • Extract images, image descriptions, tables, table titles and footnotes
  • Automatically identify and convert formulas in documents to LaTeX format
  • Automatically identify and convert tables in documents to HTML format
  • Automatically detect scanned PDFs and garbled PDFs, and enable OCR functionality
  • OCR supports detection and recognition of 109 languages
  • Support multiple output formats, such as multimodal and NLP Markdown, reading-order-sorted JSON, and information-rich intermediate formats
  • Support multiple visualization results, including layout visualization, span visualization, etc., for efficient confirmation of output effects and quality inspection
  • Support pure CPU environment operation, and support GPU(CUDA)/NPU(CANN)/MPS acceleration
  • Compatible with Windows, Linux and Mac platforms
"},{"location":"#user-guide","title":"User Guide","text":"
  • Quick Start Guide
  • Detailed Usage Instructions
"},{"location":"demo/","title":"Demo","text":""},{"location":"faq/","title":"FAQ","text":""},{"location":"faq/#frequently-asked-questions","title":"Frequently Asked Questions","text":"

If your question is not listed, try using DeepWiki's AI assistant for common issues.

For unresolved problems, join our Discord or WeChat community for support.

Encountered the error ImportError: libGL.so.1: cannot open shared object file: No such file or directory in Ubuntu 22.04 on WSL2

The libgl library is missing in Ubuntu 22.04 on WSL2. You can install the libgl library with the following command to resolve the issue:

sudo apt-get install libgl1-mesa-glx\n

Reference: #388

Missing text information in parsing results when installing and using on Linux systems.

MinerU uses pypdfium2 instead of pymupdf as the PDF page rendering engine in versions >=2.0 to resolve AGPLv3 license issues. On some Linux distributions, due to missing CJK fonts, some text may be lost during the process of rendering PDFs to images. To solve this problem, you can install the noto font package with the following commands, which are effective on Ubuntu/Debian systems:

sudo apt update\nsudo apt install fonts-noto-core\nsudo apt install fonts-noto-cjk\nfc-cache -fv\n
You can also directly use our Docker deployment method to build the image, which includes the above font packages by default.

Reference: #2915

"},{"location":"quick_start/","title":"Quick Start","text":""},{"location":"quick_start/#quick-start","title":"Quick Start","text":"

If you encounter any installation issues, please check the FAQ first.

"},{"location":"quick_start/#online-experience","title":"Online Experience","text":""},{"location":"quick_start/#official-online-web-application","title":"Official online web application","text":"

The official online version has the same functionality as the client, with a beautiful interface and rich features. Login is required.

"},{"location":"quick_start/#gradio-based-online-demo","title":"Gradio-based online demo","text":"

A Gradio-based WebUI with a simple interface and only core parsing functionality. No login is required.

"},{"location":"quick_start/#local-deployment","title":"Local Deployment","text":"

Warning

Prerequisites - Hardware and Software Environment Support

To ensure the stability and reliability of the project, we have optimized and tested only specific hardware and software environments during development. This ensures that users can achieve optimal performance and encounter the fewest compatibility issues when deploying and running the project on recommended system configurations.

By concentrating our resources and efforts on mainstream environments, our team can resolve potential bugs more efficiently and develop new features in a timely manner.

In non-mainstream environments, due to the diversity of hardware and software configurations, as well as compatibility issues with third-party dependencies, we cannot guarantee 100% usability of the project. Therefore, for users who wish to use this project in non-recommended environments, we suggest carefully reading the documentation and FAQ first, as most issues have corresponding solutions in the FAQ. Additionally, we encourage community feedback on issues so that we can gradually expand our support range.

Parsing Backend | pipeline | *-auto-engine (hybrid, vlm) | *-http-client (hybrid, vlm)
Backend Features | Good Compatibility | High Hardware Requirements | For OpenAI Compatible Servers [2]
Accuracy [1] | 82+ | 90+
Operating System | Linux [3] / Windows [4] / macOS [5]
Pure CPU Support | \u2705 | \u274c | \u2705
GPU Acceleration | Volta and later architecture GPUs or Apple Silicon | Not Required
Min VRAM | 6GB | 10GB | 8GB | 3GB
RAM | Min 16GB+, Recommended 32GB+ | 8GB
Disk Space | 20GB+, SSD Recommended | 2GB
Python Version | 3.10-3.13

[1] Accuracy metrics are the End-to-End Evaluation Overall scores from OmniDocBench (v1.5), based on the latest version of MinerU.
[2] Servers compatible with the OpenAI API, such as local model servers or remote model services deployed via inference frameworks like vLLM/SGLang/LMDeploy.
[3] Linux only supports distributions from 2019 and later.
[4] Since the key dependency ray does not support Python 3.13 on Windows, only versions 3.10~3.12 are supported.
[5] macOS requires version 14.0 or later.
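
The Python version constraints above (3.10-3.13 in general, 3.10~3.12 on Windows) can be checked before installing. The following is a minimal sketch; the function name `python_supported` is hypothetical and not part of MinerU, and it only encodes the version rule stated in the table and its Windows footnote.

```python
import platform
import sys


def python_supported(version: tuple[int, int], system: str) -> bool:
    """Return True if (major, minor) falls in the documented range for the OS.

    `system` uses the values returned by platform.system(),
    e.g. "Linux", "Windows", or "Darwin".
    """
    # Windows is capped at 3.12 because ray lacks Python 3.13 support there.
    upper = (3, 12) if system == "Windows" else (3, 13)
    return (3, 10) <= version <= upper


if __name__ == "__main__":
    current = sys.version_info[:2]
    ok = python_supported(current, platform.system())
    print(f"Python {current[0]}.{current[1]} supported: {ok}")
```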

"},{"location":"quick_start/#install-mineru","title":"Install MinerU","text":""},{"location":"quick_start/#install-mineru-using-pip-or-uv","title":"Install MinerU using pip or uv","text":"
pip install --upgrade pip\npip install uv\nuv pip install -U \"mineru[all]\"\n
"},{"location":"quick_start/#install-mineru-from-source-code","title":"Install MinerU from source code","text":"
git clone https://github.com/opendatalab/MinerU.git\ncd MinerU\nuv pip install -e .[all]\n

Tip

mineru[all] includes all core features, compatible with Windows / Linux / macOS systems, suitable for most users. If you need to specify the inference framework for the VLM model, or only intend to install a lightweight client on an edge device, please refer to the documentation Extension Modules Installation Guide.

"},{"location":"quick_start/#deploy-mineru-using-docker","title":"Deploy MinerU using Docker","text":"

MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues. You can get the Docker Deployment Instructions in the documentation.

"},{"location":"quick_start/#using-mineru","title":"Using MinerU","text":"

If your device meets the GPU acceleration requirements in the table above, you can use a simple command line for document parsing:

mineru -p <input_path> -o <output_path>\n
If your device does not meet the GPU acceleration requirements, you can specify the backend as pipeline to run in a pure CPU environment:
mineru -p <input_path> -o <output_path> -b pipeline\n

You can use MinerU for PDF parsing through various methods such as command line, API, and WebUI. For detailed instructions, please refer to the Usage Guide.
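
If you want to drive the command line from a script, the calls above can be wrapped as follows. This is a minimal sketch under stated assumptions: the helper names `build_mineru_command` and `run_mineru` are hypothetical, only the `-p`, `-o`, `-b`, and `-u` options shown in this documentation are used, and the `mineru` executable is assumed to be on your PATH.

```python
import subprocess


def build_mineru_command(input_path, output_path, backend=None, server_url=None):
    """Build the argument list for the mineru CLI using documented options."""
    cmd = ["mineru", "-p", str(input_path), "-o", str(output_path)]
    if backend is not None:
        cmd += ["-b", backend]  # e.g. "pipeline" for pure CPU environments
    if server_url is not None:
        cmd += ["-u", server_url]  # used with the *-http-client backends
    return cmd


def run_mineru(input_path, output_path, backend=None, server_url=None):
    """Run mineru and raise CalledProcessError if it exits with an error."""
    cmd = build_mineru_command(input_path, output_path, backend, server_url)
    return subprocess.run(cmd, check=True)
```

For example, `run_mineru("paper.pdf", "out", backend="pipeline")` is equivalent to the second command above.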

"},{"location":"quick_start/docker_deployment/","title":"Docker Deployment","text":""},{"location":"quick_start/docker_deployment/#deploying-mineru-with-docker","title":"Deploying MinerU with Docker","text":"

MinerU provides a convenient Docker deployment method, which helps quickly set up the environment and solve some tricky environment compatibility issues.

"},{"location":"quick_start/docker_deployment/#build-docker-image-using-dockerfile","title":"Build Docker Image using Dockerfile","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/global/Dockerfile\ndocker build -t mineru:latest -f Dockerfile .\n
"},{"location":"quick_start/docker_deployment/#docker-description","title":"Docker Description","text":"

MinerU's Docker uses vllm/vllm-openai as the base image, so it includes the vllm inference acceleration framework and necessary dependencies by default. Therefore, on compatible devices, you can directly use vllm to accelerate VLM model inference.

Note

Requirements for using vllm to accelerate VLM model inference:

  • Device must have Volta architecture or later graphics cards with 8GB+ available VRAM.
  • The host machine's graphics driver should support CUDA 12.9.1 or higher; you can check the driver version using the nvidia-smi command.
  • Docker container must have access to the host machine's graphics devices.
"},{"location":"quick_start/docker_deployment/#start-docker-container","title":"Start Docker Container","text":"
docker run --gpus all \\\n  --shm-size 32g \\\n  -p 30000:30000 -p 7860:7860 -p 8000:8000 \\\n  --ipc=host \\\n  -it mineru:latest \\\n  /bin/bash\n

After executing this command, you will enter the Docker container's interactive terminal, with several ports mapped for potential services. You can run MinerU-related commands directly within the container to use MinerU's features. You can also start MinerU services directly by replacing /bin/bash with a service startup command. For detailed instructions, please refer to the section Start the service via command.

"},{"location":"quick_start/docker_deployment/#start-services-directly-with-docker-compose","title":"Start Services Directly with Docker Compose","text":"

We provide a compose.yaml file that you can use to quickly start MinerU services.

# Download compose.yaml file\nwget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml\n

Note

  • The compose.yaml file contains configurations for multiple MinerU services; you can start specific services as needed.
  • Different services might have additional parameter configurations, which you can view and edit in the compose.yaml file.
  • Due to the pre-allocation of GPU memory by the vllm inference acceleration framework, you may not be able to run multiple vllm services simultaneously on the same machine. Therefore, ensure that other services that might use GPU memory have been stopped before starting the vlm-openai-server service or using the vlm-vllm-engine backend.
"},{"location":"quick_start/docker_deployment/#start-openai-compatible-server-service","title":"Start OpenAI-compatible server service","text":"

connect to openai-server via vlm-http-client backend

docker compose -f compose.yaml --profile openai-server up -d\n

Tip

In another terminal, connect to openai server via http client (only requires CPU and network, no vllm environment needed)

mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://<server_ip>:30000\n
"},{"location":"quick_start/docker_deployment/#start-web-api-service","title":"Start Web API service","text":"
docker compose -f compose.yaml --profile api up -d\n

Tip

Access http://<server_ip>:8000/docs in your browser to view the API documentation.

"},{"location":"quick_start/docker_deployment/#start-gradio-webui-service","title":"Start Gradio WebUI service","text":"
docker compose -f compose.yaml --profile gradio up -d\n

Tip

  • Access http://<server_ip>:7860 in your browser to use the Gradio WebUI.
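
After starting the services, it can help to confirm that the mapped ports respond. The sketch below is an illustration only: the function names are hypothetical, and the ports (30000, 8000, 7860) are the ones used in the commands and tips in this section, so adjust them if you edit compose.yaml.

```python
import socket


def service_urls(server_ip: str) -> dict[str, str]:
    """Return the service endpoints described in this section for a host."""
    return {
        "openai_server": f"http://{server_ip}:30000",
        "api_docs": f"http://{server_ip}:8000/docs",
        "gradio_webui": f"http://{server_ip}:7860",
    }


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_open("127.0.0.1", 8000)` returning False suggests the api profile is not running or the port is not mapped.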
"},{"location":"quick_start/extension_modules/","title":"Extension Modules","text":""},{"location":"quick_start/extension_modules/#mineru-extension-modules-installation-guide","title":"MinerU Extension Modules Installation Guide","text":"

MinerU supports installing extension modules on demand based on different needs to enhance functionality or support specific model backends.

"},{"location":"quick_start/extension_modules/#common-scenarios","title":"Common Scenarios","text":""},{"location":"quick_start/extension_modules/#core-functionality-installation","title":"Core Functionality Installation","text":"

The core module is the core dependency of MinerU, containing all functional modules except vllm/lmdeploy. Installing this module ensures the basic functionality of MinerU works properly.

uv pip install \"mineru[core]\"\n
"},{"location":"quick_start/extension_modules/#using-vllm-to-accelerate-vlm-model-inference","title":"Using vllm to Accelerate VLM Model Inference","text":"

Note

vllm and lmdeploy have nearly identical VLM inference acceleration effects and usage methods. You can choose one of them to install and use based on your actual needs, but it is not recommended to install both modules simultaneously to avoid potential dependency conflicts.

The vllm module provides acceleration support for VLM model inference, suitable for graphics cards with Volta architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.

uv pip install \"mineru[core,vllm]\"\n

Tip

If exceptions occur during installation of the extra package including vllm, please refer to the vllm official documentation to try to resolve the issue, or directly use the Docker deployment method.

"},{"location":"quick_start/extension_modules/#using-lmdeploy-to-accelerate-vlm-model-inference","title":"Using lmdeploy to Accelerate VLM Model Inference","text":"

Note

vllm and lmdeploy have nearly identical VLM inference acceleration effects and usage methods. You can choose one of them to install and use based on your actual needs, but it is not recommended to install both modules simultaneously to avoid potential dependency conflicts.

The lmdeploy module provides acceleration support for VLM model inference, suitable for graphics cards with Volta architecture and later (8GB+ VRAM). Installing this module can significantly improve model inference speed.

uv pip install \"mineru[core,lmdeploy]\"\n

Tip

If exceptions occur during installation of the extra package including lmdeploy, please refer to the lmdeploy official documentation to try to resolve the issue.

"},{"location":"quick_start/extension_modules/#installing-lightweight-client-to-connect-to-openai-compatible-servers-for-vlm-http-client-mode","title":"Installing Lightweight Client to Connect to OpenAI-compatible servers (for vlm-http-client mode)","text":"

If you need to install a lightweight client on edge devices to connect to an OpenAI-compatible server for using VLM mode, you can install the basic mineru package, which is very lightweight and suitable for devices with only CPU and network connectivity.

uv pip install mineru\nmineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:30000\n
"},{"location":"quick_start/extension_modules/#installing-lightweight-client-to-connect-to-openai-compatible-servers-for-hybrid-http-client-mode","title":"Installing Lightweight Client to Connect to OpenAI-compatible servers (for hybrid-http-client mode)","text":"

If you need to install a lightweight client on edge devices to connect to an OpenAI-compatible server for using hybrid mode, you can install the mineru pipeline extension package, which is relatively lightweight and can be used on devices with only CPU and network connectivity, while running faster on devices that support GPU acceleration.

uv pip install \"mineru[pipeline]\"\nmineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000\n
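
The scenarios in this guide map to install targets as sketched below. This is a hedged summary rather than an official helper: the scenario keys and the `install_command` function are hypothetical, while the requirement strings are the ones shown in the commands above.

```python
# Scenario names are illustrative; requirement strings come from this guide.
SCENARIO_TO_REQUIREMENT = {
    "core": "mineru[core]",
    "vllm": "mineru[core,vllm]",
    "lmdeploy": "mineru[core,lmdeploy]",
    "vlm-http-client": "mineru",
    "hybrid-http-client": "mineru[pipeline]",
}


def install_command(scenario: str) -> list[str]:
    """Return the uv command (as an argument list) for a scenario."""
    try:
        requirement = SCENARIO_TO_REQUIREMENT[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None
    return ["uv", "pip", "install", requirement]
```

Keeping vllm and lmdeploy as separate scenarios mirrors the note above that installing both at once is not recommended.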
"},{"location":"reference/","title":"Reference","text":""},{"location":"reference/#reference-documentation","title":"Reference Documentation","text":"

This section provides detailed reference materials for the MinerU project. Here you can find technical specifications, API documentation, output file formats, and version history.

"},{"location":"reference/#table-of-contents","title":"Table of Contents","text":"
  • Output Files Documentation - Detailed explanation of all output files and their formats
  • Changelog - Version update history and release notes
"},{"location":"reference/#documentation-overview","title":"Documentation Overview","text":""},{"location":"reference/#output-files-documentation","title":"Output Files Documentation","text":"

Understanding the output files generated by MinerU is crucial for effective use of the tool. The output files documentation provides:

  • Visual debugging files: Help you understand the document parsing process
  • Structured data files: Contain detailed parsing results for further processing
  • File format specifications: Detailed descriptions of each output file type
"},{"location":"reference/#changelog","title":"Changelog","text":"

The changelog documents the evolution of MinerU, including:

  • Version updates: New features and improvements for each release
  • Bug fixes: Issues resolved in each version
  • Breaking changes: Important changes that may affect your usage
  • Deprecations: Features that are being phased out
"},{"location":"reference/changelog/","title":"Changelog","text":""},{"location":"reference/changelog/#changelog","title":"Changelog","text":"

This document records the update history of the MinerU project for version 2.6.7 and earlier. For the latest version updates, please check the project README.

"},{"location":"reference/changelog/#26-series-versions","title":"2.6 Series Versions","text":""},{"location":"reference/changelog/#267-20251212","title":"2.6.7 (2025/12/12)","text":"
  • Bug fix: #4168
"},{"location":"reference/changelog/#266-20251202","title":"2.6.6 (2025/12/02)","text":"

mineru-api tool optimizations

  • Added descriptive text to mineru-api interface parameters to improve API documentation readability.
  • You can use the environment variable MINERU_API_ENABLE_FASTAPI_DOCS to control whether the auto-generated interface documentation page is enabled (enabled by default).
  • Added concurrency configuration options for the vlm-vllm-async-engine, vlm-lmdeploy-engine, and vlm-http-client backends. Users can use the environment variable MINERU_API_MAX_CONCURRENT_REQUESTS to set the maximum number of concurrent API requests (unlimited by default).
"},{"location":"reference/changelog/#265-20251126","title":"2.6.5 (2025/11/26)","text":"
  • Added support for a new backend vlm-lmdeploy-engine. Its usage is similar to vlm-vllm-(async)engine, but it uses lmdeploy as the inference engine and additionally supports native inference acceleration on Windows platforms compared to vllm.
"},{"location":"reference/changelog/#264-20251104","title":"2.6.4 (2025/11/04)","text":"
  • Added a timeout for PDF image rendering (default: 300 seconds), configurable via the environment variable MINERU_PDF_RENDER_TIMEOUT, to prevent abnormal PDF files from blocking the rendering process for long periods.
  • Added CPU thread count options for ONNX models (default: the system CPU core count), configurable via the environment variables MINERU_INTRA_OP_NUM_THREADS and MINERU_INTER_OP_NUM_THREADS, to reduce CPU resource contention in high-concurrency scenarios.
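
The environment variables in this release can be prepared in a script before launching MinerU. The following is a minimal sketch: the function `tuning_env` is hypothetical, and it assumes each variable accepts a positive integer written as a decimal string, which the changelog implies but does not state explicitly.

```python
import os


def tuning_env(render_timeout_s=None, intra_op_threads=None,
               inter_op_threads=None, base=None):
    """Return a copy of the environment with the requested settings applied.

    Settings left as None are not written, so MinerU's defaults apply.
    """
    env = dict(os.environ if base is None else base)
    settings = {
        "MINERU_PDF_RENDER_TIMEOUT": render_timeout_s,
        "MINERU_INTRA_OP_NUM_THREADS": intra_op_threads,
        "MINERU_INTER_OP_NUM_THREADS": inter_op_threads,
    }
    for name, value in settings.items():
        if value is None:
            continue
        if int(value) <= 0:
            raise ValueError(f"{name} must be a positive integer")
        env[name] = str(int(value))  # assumption: plain decimal strings
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting a MinerU command.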
"},{"location":"reference/changelog/#263-20251031","title":"2.6.3 (2025/10/31)","text":"
  • Added support for a new backend vlm-mlx-engine, enabling MLX-accelerated inference for the MinerU2.5 model on Apple Silicon devices. Compared to the vlm-transformers backend, vlm-mlx-engine delivers a 100%\u2013200% speed improvement.
  • Bug fixes: #3849, #3859
"},{"location":"reference/changelog/#262-20251024","title":"2.6.2 (2025/10/24)","text":"

pipeline backend optimizations

  • Added experimental support for Chinese formulas, which can be enabled by setting the environment variable MINERU_FORMULA_CH_SUPPORT to 1 (export MINERU_FORMULA_CH_SUPPORT=1). This feature may slightly reduce MFR speed and cause recognition failures for some long formulas, so enable it only when you need to parse Chinese formulas. To disable this feature, set the environment variable to 0.
  • OCR speed improved by 200%~300%, thanks to the optimization solution provided by @cjsdurj.
  • OCR models improved the accuracy and coverage of Latin script recognition, and the Cyrillic, Arabic, Devanagari, Telugu (te), and Tamil (ta) language systems were updated to ppocr-v5, with accuracy improved by over 40% compared to the previous models.

vlm backend optimizations

  • table_caption and table_footnote matching logic optimized to improve the accuracy of table caption and footnote matching and reading order rationality in scenarios with multiple consecutive tables on a page
  • Optimized CPU resource usage during high concurrency when using vllm backend, reducing server pressure
  • Adapted to vllm version 0.11.0

General optimizations

  • Optimized cross-page table merging and added support for merging cross-page continuation tables, improving merge results in multi-column scenarios.
  • Added environment variable configuration option MINERU_TABLE_MERGE_ENABLE for table merging feature. Table merging is enabled by default and can be disabled by setting this variable to 0
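
The two feature switches introduced in this release can be set together from a script. This is a hedged sketch: `feature_env` is a hypothetical helper, and it only writes values the changelog documents (1 and 0 for MINERU_FORMULA_CH_SUPPORT, 0 to disable MINERU_TABLE_MERGE_ENABLE), leaving table merging unset when it should stay at its default.

```python
def feature_env(chinese_formulas: bool = False,
                table_merge: bool = True) -> dict[str, str]:
    """Return environment variable settings for the 2.6.2 feature switches."""
    env = {"MINERU_FORMULA_CH_SUPPORT": "1" if chinese_formulas else "0"}
    if not table_merge:
        # Table merging is on by default; only the documented "0" is written.
        env["MINERU_TABLE_MERGE_ENABLE"] = "0"
    return env
```

Merging the result into `os.environ` before invoking MinerU (for example with `os.environ.update(feature_env(chinese_formulas=True))`) applies the settings to child processes.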
"},{"location":"reference/changelog/#25-series-versions","title":"2.5 Series Versions","text":""},{"location":"reference/changelog/#254-20250926","title":"2.5.4 (2025/09/26)","text":"
  • \ud83c\udf89\ud83c\udf89 The MinerU2.5 Technical Report is now available! We welcome you to read it for a comprehensive overview of its model architecture, training strategy, data engineering and evaluation results.
  • Fixed an issue where some PDF files were mistakenly identified as AI files, causing parsing failures
"},{"location":"reference/changelog/#253-20250920","title":"2.5.3 (2025/09/20)","text":"
  • Dependency version range adjustment to enable Turing and earlier architecture GPUs to use vLLM acceleration for MinerU2.5 model inference.
  • pipeline backend compatibility fixes for torch 2.8.0.
  • Reduced default concurrency for vLLM async backend to lower server pressure and avoid connection closure issues caused by high load.
  • More compatibility-related details can be found in the announcement
"},{"location":"reference/changelog/#252-20250919","title":"2.5.2 (2025/09/19)","text":"

We are officially releasing MinerU2.5, currently the most powerful multimodal large model for document parsing.

With only 1.2B parameters, MinerU2.5's accuracy on the OmniDocBench benchmark comprehensively surpasses top-tier multimodal models like Gemini 2.5 Pro, GPT-4o, and Qwen2.5-VL-72B. It also significantly outperforms leading specialized models such as dots.ocr, MonkeyOCR, and PP-StructureV3.

The model has been released on HuggingFace and ModelScope platforms. Welcome to download and use!

Core Highlights

  • SOTA Performance with Extreme Efficiency: As a 1.2B model, it achieves State-of-the-Art (SOTA) results that exceed models in the 10B and 100B+ classes, redefining the performance-per-parameter standard in document AI.
  • Advanced Architecture for Across-the-Board Leadership: By combining a two-stage inference pipeline (decoupling layout analysis from content recognition) with a native high-resolution architecture, it achieves SOTA performance across five key areas: layout analysis, text recognition, formula recognition, table recognition, and reading order.

Key Capability Enhancements

  • Layout Detection: Delivers more complete results by accurately covering non-body content like headers, footers, and page numbers. It also provides more precise element localization and natural format reconstruction for lists and references.
  • Table Parsing: Drastically improves parsing for challenging cases, including rotated tables, borderless/semi-structured tables, and long/complex tables.
  • Formula Recognition: Significantly boosts accuracy for complex, long-form, and hybrid Chinese-English formulas, greatly enhancing the parsing capability for mathematical documents.

Repository Adjustments

Additionally, with the release of vlm 2.5, we have made some adjustments to the repository:

  • The vlm backend has been upgraded to version 2.5, supporting the MinerU2.5 model and no longer compatible with the MinerU2.0-2505-0.9B model. The last version supporting the 2.0 model is mineru-2.2.2.
  • VLM inference-related code has been moved to mineru_vl_utils, reducing coupling with the main mineru repository and facilitating independent iteration in the future.
  • The vlm accelerated inference framework has been switched from sglang to vllm, achieving full compatibility with the vllm ecosystem, allowing users to use the MinerU2.5 model and accelerated inference on any platform that supports the vllm framework.
  • Due to major upgrades in the vlm model supporting more layout types, we have made some adjustments to the structure of the parsing intermediate file middle.json and result file content_list.json. Please refer to the documentation for details.

Other Repository Optimizations

  • Removed file extension whitelist validation for input files. When input files are PDF documents or images, there are no longer requirements for file extensions, improving usability.
"},{"location":"reference/changelog/#22-24-series-versions","title":"2.2 - 2.4 Series Versions","text":""},{"location":"reference/changelog/#222-20250910","title":"2.2.2 (2025/09/10)","text":"
  • Fixed the issue where the new table recognition model would affect the overall parsing task when some table parsing failed
"},{"location":"reference/changelog/#221-20250908","title":"2.2.1 (2025/09/08)","text":"
  • Fixed the issue where some newly added models were not downloaded when using the model download command.
"},{"location":"reference/changelog/#220-20250905","title":"2.2.0 (2025/09/05)","text":"

Major Updates

  • In this version, we focused on improving table parsing accuracy by introducing a new wired table recognition model and a brand-new hybrid table structure parsing algorithm, significantly enhancing the table recognition capabilities of the pipeline backend.
  • We also added support for cross-page table merging, which is supported by both pipeline and vlm backends, further improving the completeness and accuracy of table parsing.

Other Updates

  • The pipeline backend now supports 270-degree rotated table parsing, bringing support for table parsing in 0/90/270-degree orientations
  • pipeline added OCR capability support for Thai and Greek, and updated the English OCR model to the latest version. English recognition accuracy improved by 11%, Thai recognition model accuracy is 82.68%, and Greek recognition model accuracy is 89.28% (by PPOCRv5)
  • Added bbox field (mapped to 0-1000 range) in the output content_list.json, making it convenient for users to directly obtain position information for each content block
  • Removed the pipeline_old_linux installation option, no longer supporting legacy Linux systems such as CentOS 7, to provide better support for uv's sync/run commands
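
Because the bbox field in content_list.json is mapped to a 0-1000 range, converting it back to page coordinates is a per-axis rescale. The sketch below is illustrative: `bbox_to_pixels` is a hypothetical helper, and it assumes the bbox is ordered as [x0, y0, x1, y1] and that the mapping is linear in each axis, neither of which is stated in this changelog.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    """Rescale a 0-1000 normalized bbox to the given page dimensions.

    Assumes bbox = [x0, y0, x1, y1] (an assumption, see the lead-in).
    """
    x0, y0, x1, y1 = bbox
    return (
        x0 * page_width / 1000,
        y0 * page_height / 1000,
        x1 * page_width / 1000,
        y1 * page_height / 1000,
    )
```

For a page rendered at 2000 by 3000 pixels, a normalized box of [100, 200, 500, 1000] would cover x from 200 to 1000 and y from 600 to 3000.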
"},{"location":"reference/changelog/#21-series-versions","title":"2.1 Series Versions","text":""},{"location":"reference/changelog/#2110-20250801","title":"2.1.10 (2025/08/01)","text":"
  • Fixed an issue in the pipeline backend where block overlap caused the parsing results to deviate from expectations #3232
"},{"location":"reference/changelog/#219-20250730","title":"2.1.9 (2025/07/30)","text":"
  • transformers 4.54.1 version adaptation
"},{"location":"reference/changelog/#218-20250728","title":"2.1.8 (2025/07/28)","text":"
  • sglang 0.4.9.post5 version adaptation
"},{"location":"reference/changelog/#217-20250727","title":"2.1.7 (2025/07/27)","text":"
  • transformers 4.54.0 version adaptation
"},{"location":"reference/changelog/#216-20250726","title":"2.1.6 (2025/07/26)","text":"
  • Fixed table parsing issues in handwritten documents when using vlm backend
  • Fixed visualization box position drift issue when document is rotated #3175
"},{"location":"reference/changelog/#215-20250724","title":"2.1.5 (2025/07/24)","text":"
  • sglang 0.4.9 version adaptation, synchronously upgrading the dockerfile base image to sglang 0.4.9.post3
"},{"location":"reference/changelog/#214-20250723","title":"2.1.4 (2025/07/23)","text":"

Bug Fixes

  • Fixed the issue of excessive memory consumption during the MFR step in the pipeline backend under certain scenarios #2771
  • Fixed the inaccurate matching between image/table and caption/footnote under certain conditions #3129
"},{"location":"reference/changelog/#211-20250716","title":"2.1.1 (2025/07/16)","text":"

Bug fixes

  • Fixed text block content loss issue that could occur in certain pipeline scenarios #3005
  • Fixed issue where sglang-client required unnecessary packages like torch #2968
  • Updated dockerfile to fix incomplete text content parsing due to missing fonts in Linux #2915

Usability improvements

  • Updated compose.yaml to facilitate direct startup of sglang-server, mineru-api, and mineru-gradio services
  • Launched brand new online documentation site, simplified readme, providing better documentation experience
"},{"location":"reference/changelog/#210-20250705","title":"2.1.0 (2025/07/05)","text":"

This is the first major update of MinerU 2, which includes a large number of new features and improvements, covering significant performance optimizations, user experience enhancements, and bug fixes. The detailed update contents are as follows:

Performance Optimizations

  • Significantly improved preprocessing speed for documents with specific resolutions (around 2000 pixels on the long side).
  • Greatly enhanced post-processing speed when the pipeline backend handles batch processing of documents with fewer pages (<10 pages).
  • Layout analysis speed of the pipeline backend has been increased by approximately 20%.

Experience Enhancements

  • Built-in ready-to-use fastapi service and gradio webui. For detailed usage instructions, please refer to Documentation.
  • Adapted to sglang version 0.4.8, significantly reducing the GPU memory requirements for the vlm-sglang backend. It can now run on graphics cards with as little as 8GB GPU memory (Turing architecture or newer).
  • Added transparent parameter passing for all commands related to sglang, allowing the sglang-engine backend to receive all sglang parameters consistently with the sglang-server.
  • Supports feature extensions based on configuration files, including custom formula delimiters, enabling heading classification, and customizing local model directories. For detailed usage instructions, please refer to Documentation.

New Features

  • Updated the pipeline backend with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages such as French, Spanish, Portuguese, Russian, and Korean, with an average accuracy improvement of over 30%. Details
  • Introduced limited support for vertical text layout in the pipeline backend.
"},{"location":"reference/changelog/#20-series-versions","title":"2.0 Series Versions","text":""},{"location":"reference/changelog/#206-20250620","title":"2.0.6 (2025/06/20)","text":"
  • Fixed occasional parsing interruptions caused by invalid block content in vlm mode
  • Fixed parsing interruptions caused by incomplete table structures in vlm mode
"},{"location":"reference/changelog/#205-20250617","title":"2.0.5 (2025/06/17)","text":"
  • Fixed the issue where models were still required to be downloaded in the sglang-client mode
  • Fixed the issue where the sglang-client mode unnecessarily depended on packages like torch during runtime.
  • Fixed the issue where only the first instance would take effect when attempting to launch multiple sglang-client instances via multiple URLs within the same process
"},{"location":"reference/changelog/#203-20250615","title":"2.0.3 (2025/06/15)","text":"
  • Fixed a configuration file key-value update error that occurred when downloading model type was set to all
  • Fixed the issue where the formula and table feature toggle switches were not working in command line mode, causing the features to remain enabled.
  • Fixed compatibility issues with sglang version 0.4.7 in the sglang-engine mode.
  • Updated Dockerfile and installation documentation for deploying the full version of MinerU in sglang environment
"},{"location":"reference/changelog/#200-20250613","title":"2.0.0 (2025/06/13)","text":"

New Architecture

MinerU 2.0 has been deeply restructured in code organization and interaction methods, significantly improving system usability, maintainability, and extensibility.

  • Removal of Third-party Dependency Limitations: Completely eliminated the dependency on pymupdf, moving the project toward a more open and compliant open-source direction.
  • Ready-to-use, Easy Configuration: No need to manually edit JSON configuration files; most parameters can now be set directly via command line or API.
  • Automatic Model Management: Added automatic model download and update mechanisms, allowing users to complete model deployment without manual intervention.
  • Offline Deployment Friendly: Provides built-in model download commands, supporting deployment requirements in completely offline environments.
  • Streamlined Code Structure: Removed thousands of lines of redundant code, simplified class inheritance logic, significantly improving code readability and development efficiency.
  • Unified Intermediate Format Output: Adopted the standardized middle_json format, which stays compatible with most secondary development built on this format, so existing integrations can migrate smoothly.

New Model

MinerU 2.0 integrates our latest small-parameter, high-performance multimodal document parsing model, achieving end-to-end high-speed, high-precision document understanding.

  • Small Model, Big Capabilities: With parameters under 1B, yet surpassing traditional 72B-level vision-language models (VLMs) in parsing accuracy.
  • Multiple Functions in One: A single model covers multilingual recognition, handwriting recognition, layout analysis, table parsing, formula recognition, reading order sorting, and other core tasks.
  • Ultimate Inference Speed: Achieves peak throughput exceeding 10,000 tokens/s through sglang acceleration on a single NVIDIA 4090 card, easily handling large-scale document processing requirements.
  • Online Experience: You can experience our brand-new VLM model on MinerU.net, Hugging Face, and ModelScope.

Incompatible Changes Notice

To improve overall architectural rationality and long-term maintainability, this version contains some incompatible changes:

  • Python package name changed from magic-pdf to mineru, and the command-line tool changed from magic-pdf to mineru. Please update your scripts and command calls accordingly.
  • For modular system design and ecosystem consistency considerations, MinerU 2.0 no longer includes the LibreOffice document conversion module. If you need to process Office documents, we recommend converting them to PDF format through an independently deployed LibreOffice service before proceeding with subsequent parsing operations.
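
The conversion step recommended above could be sketched as follows. This is only an illustration, assuming a LibreOffice soffice binary is available on PATH; the helper names build_convert_command and convert_to_pdf are hypothetical and not part of MinerU.

```python
import subprocess
from pathlib import Path


def build_convert_command(src, out_dir, soffice='soffice'):
    # Headless LibreOffice conversion to PDF. The binary name and its
    # presence on PATH are assumptions about the local setup.
    return [soffice, '--headless', '--convert-to', 'pdf',
            '--outdir', str(out_dir), str(src)]


def convert_to_pdf(src, out_dir):
    # Runs LibreOffice and returns the path where the PDF is expected.
    subprocess.run(build_convert_command(src, out_dir), check=True)
    return Path(out_dir) / (Path(src).stem + '.pdf')
```

The resulting PDF can then be passed to the mineru command like any other input file.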
"},{"location":"reference/changelog/#1x-series-historical-versions","title":"1.x Series Historical Versions","text":""},{"location":"reference/changelog/#1312-20250524","title":"1.3.12 (2025/05/24)","text":"

Added support for PPOCRv5 models, updated ch_server model to PP-OCRv5_rec_server, and ch_lite model to PP-OCRv5_rec_mobile (model update required)

  • In testing, we found that PPOCRv5(server) has some improvement for handwritten documents, but has slightly lower accuracy than v4_server_doc for other document types, so the default ch model remains unchanged as PP-OCRv4_server_rec_doc.
  • Since PPOCRv5 has enhanced recognition capabilities for handwriting and special characters, you can manually choose the PPOCRv5 model for Japanese-Traditional Chinese mixed scenarios and handwritten documents
  • You can select the appropriate model through the lang parameter lang='ch_server' (Python API) or --lang ch_server (command line):
  • ch: PP-OCRv4_server_rec_doc (default) (Chinese/English/Japanese/Traditional Chinese mixed/15K dictionary)
  • ch_server: PP-OCRv5_rec_server (Chinese/English/Japanese/Traditional Chinese mixed + handwriting/18K dictionary)
  • ch_lite: PP-OCRv5_rec_mobile (Chinese/English/Japanese/Traditional Chinese mixed + handwriting/18K dictionary)
  • ch_server_v4: PP-OCRv4_rec_server (Chinese/English mixed/6K dictionary)
  • ch_lite_v4: PP-OCRv4_rec_mobile (Chinese/English mixed/6K dictionary)
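
As a rough illustration of the choice described above, the mapping can be kept in a small lookup table. The table values are copied from the list; the pick_lang helper and its two flags are hypothetical and only encode the guidance that PP-OCRv5 is worth selecting for handwritten input.

```python
# lang value -> recognition model, as listed above.
OCR_MODELS = {
    'ch': 'PP-OCRv4_server_rec_doc',
    'ch_server': 'PP-OCRv5_rec_server',
    'ch_lite': 'PP-OCRv5_rec_mobile',
    'ch_server_v4': 'PP-OCRv4_rec_server',
    'ch_lite_v4': 'PP-OCRv4_rec_mobile',
}


def pick_lang(handwritten=False, lightweight=False):
    # Hypothetical helper: keep the default unless the input is
    # handwritten, then prefer a PP-OCRv5 model.
    if not handwritten:
        return 'ch'
    return 'ch_lite' if lightweight else 'ch_server'
```

The returned value is what would be passed as lang='...' (Python API) or --lang ... (command line).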

Added support for handwritten documents through optimized layout recognition of handwritten text areas

  • This feature is supported by default, no additional configuration required
  • You can refer to the instructions above to manually select the PPOCRv5 model for better handwritten document parsing results

The huggingface and modelscope demos have been updated to versions that support handwriting recognition and PPOCRv5 models, which you can experience online

"},{"location":"reference/changelog/#1310-20250429","title":"1.3.10 (2025/04/29)","text":"
  • Added support for custom formula delimiters, which can be configured by modifying the latex-delimiter-config section in the magic-pdf.json file in your user directory.
"},{"location":"reference/changelog/#139-20250427","title":"1.3.9 (2025/04/27)","text":"
  • Optimized formula parsing functionality, improved formula rendering success rate
"},{"location":"reference/changelog/#138-20250423","title":"1.3.8 (2025/04/23)","text":"

The default ocr model (ch) has been updated to PP-OCRv4_server_rec_doc (model update required)

  • PP-OCRv4_server_rec_doc is trained on a mixture of more Chinese document data and PP-OCR training data based on PP-OCRv4_server_rec, adding recognition capabilities for some traditional Chinese characters, Japanese, and special characters. It can recognize over 15,000 characters and improves both document-specific and general text recognition abilities.
  • Performance comparison of PP-OCRv4_server_rec_doc/PP-OCRv4_server_rec/PP-OCRv4_mobile_rec
  • After verification, the PP-OCRv4_server_rec_doc model shows significant accuracy improvements in Chinese/English/Japanese/Traditional Chinese in both single language and mixed language scenarios, with comparable speed to PP-OCRv4_server_rec, making it suitable for most use cases.
  • In some pure English scenarios, PP-OCRv4_server_rec_doc may have word adhesion issues, while PP-OCRv4_server_rec performs better in these cases. Therefore, we've kept the PP-OCRv4_server_rec model, which users can access by adding the parameter lang='ch_server' (Python API) or --lang ch_server (command line).
"},{"location":"reference/changelog/#137-20250422","title":"1.3.7 (2025/04/22)","text":"
  • Fixed the issue where the lang parameter was ineffective during table parsing model initialization
  • Fixed the significant speed reduction of OCR and table parsing in cpu mode
"},{"location":"reference/changelog/#134-20250416","title":"1.3.4 (2025/04/16)","text":"
  • Slightly improved OCR-det speed by removing some unnecessary blocks
  • Fixed page-internal sorting errors caused by footnotes in certain cases
"},{"location":"reference/changelog/#132-20250412","title":"1.3.2 (2025/04/12)","text":"
  • Fixed dependency version incompatibility issues when installing on Windows with Python 3.13
  • Optimized memory usage during batch inference
  • Improved parsing of tables rotated 90 degrees
  • Enhanced parsing of oversized tables in financial report samples
  • Fixed the occasional word adhesion issue in English text areas when OCR language is not specified (model update required)
"},{"location":"reference/changelog/#131-20250408","title":"1.3.1 (2025/04/08)","text":"

Fixed several compatibility issues

  • Added support for Python 3.13
  • Made final adaptations for outdated Linux systems (such as CentOS 7); continued support in future versions is not guaranteed. See the installation instructions.
"},{"location":"reference/changelog/#130-20250403","title":"1.3.0 (2025/04/03)","text":"

Installation and compatibility optimizations

  • Resolved compatibility issues caused by detectron2 by removing layoutlmv3 usage in layout
  • Extended torch version compatibility to 2.2~2.6 (excluding 2.5)
  • Added CUDA compatibility for versions 11.8/12.4/12.6/12.8 (CUDA version determined by torch), solving compatibility issues for users with 50-series and H-series GPUs
  • Extended Python compatibility to versions 3.10~3.12, fixing the issue of automatic downgrade to version 0.6.1 when installing in non-3.10 environments
  • Optimized offline deployment process, eliminating the need to download any model files after successful deployment

Performance optimizations

  • Enhanced parsing speed for batches of small files by supporting batch processing of multiple PDF files (script example), with formula parsing speed improved by up to 1400% and overall parsing speed improved by up to 500% compared to version 1.0.1
  • Reduced memory usage and improved parsing speed by optimizing MFR model loading and usage (requires re-running the model download process to get incremental updates to model files)
  • Optimized GPU memory usage, requiring only 6GB minimum to run this project
  • Improved running speed on MPS devices

Parsing effect optimizations

  • Updated MFR model to unimernet(2503), fixing line break loss issues in multi-line formulas

Usability optimizations

  • Completely replaced the paddle framework and paddleocr in the project by using paddleocr2torch, resolving conflicts between paddle and torch, as well as thread safety issues caused by the paddle framework
  • Added real-time progress bar display during parsing, allowing precise tracking of parsing progress and making the waiting process more bearable
"},{"location":"reference/changelog/#121-20250303","title":"1.2.1 (2025/03/03)","text":"

Fixed some issues

  • Fixed the impact on punctuation marks during full-width to half-width conversion of letters and numbers
  • Fixed caption matching inaccuracies in certain scenarios
  • Fixed formula span loss issues in certain scenarios
"},{"location":"reference/changelog/#120-20250224","title":"1.2.0 (2025/02/24)","text":"

This version includes several fixes and improvements to enhance parsing efficiency and accuracy:

Performance Optimization

  • Increased classification speed for PDF documents in auto mode.

Parsing Optimization

  • Improved parsing logic for documents containing watermarks, significantly enhancing the parsing results for such documents.
  • Enhanced the matching logic for multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts.

Bug Fixes

  • Fixed an issue where image/table spans were incorrectly filled into text blocks under certain conditions.
  • Resolved an issue where title blocks were empty in some cases.
"},{"location":"reference/changelog/#110-20250122","title":"1.1.0 (2025/01/22)","text":"

In this version we have focused on improving parsing accuracy and efficiency:

Model capability upgrade (requires re-executing the model download process to obtain incremental updates of model files)

  • The layout recognition model has been upgraded to the latest doclayout_yolo(2501) model, improving layout recognition accuracy.
  • The formula parsing model has been upgraded to the latest unimernet(2501) model, improving formula recognition accuracy.

Performance optimization

  • On devices that meet certain configuration requirements (16GB+ VRAM), by optimizing resource usage and restructuring the processing pipeline, overall parsing speed has been increased by more than 50%.

Parsing effect optimization

  • Added a new heading classification feature (testing version, enabled by default) to the online demo (mineru.net/huggingface/modelscope), which supports hierarchical classification of headings, thereby enhancing document structuring.
"},{"location":"reference/changelog/#101-20250110","title":"1.0.1 (2025/01/10)","text":"

This is our first official release, where we have introduced a completely new API interface and enhanced compatibility through extensive refactoring, as well as a brand new automatic language identification feature:

New API Interface

  • For the data-side API, we have introduced the Dataset class, designed to provide a robust and flexible data processing framework. This framework currently supports a variety of document formats, including images (.jpg and .png), PDFs, Word documents (.doc and .docx), and PowerPoint presentations (.ppt and .pptx). It ensures effective support for data processing tasks ranging from simple to complex.
  • For the user-side API, we have meticulously designed the MinerU processing workflow as a series of composable Stages. Each Stage represents a specific processing step, allowing users to define new Stages according to their needs and creatively combine these stages to customize their data processing workflows.

Enhanced Compatibility

  • By optimizing the dependency environment and configuration items, we ensure stable and efficient operation on ARM architecture Linux systems.
  • We have deeply integrated with Huawei Ascend NPU acceleration, providing autonomous and controllable high-performance computing capabilities. This supports the localization and development of AI application platforms in China. Ascend NPU Acceleration

Automatic Language Identification

  • By introducing a new language recognition model, setting the lang configuration to auto during document parsing will automatically select the appropriate OCR language model, improving the accuracy of scanned document parsing.
"},{"location":"reference/changelog/#0x-series-historical-versions","title":"0.x Series Historical Versions","text":""},{"location":"reference/changelog/#0100-20241122","title":"0.10.0 (2024/11/22)","text":"

Introducing hybrid OCR text extraction capabilities:

  • Significantly improved parsing performance in complex text distribution scenarios such as dense formulas, irregular span regions, and text represented by images.
  • Combines the dual advantages of accurate content extraction and faster speed in text mode, and more precise span/line region recognition in OCR mode.
"},{"location":"reference/changelog/#093-20241115","title":"0.9.3 (2024/11/15)","text":"

Integrated RapidTable for table recognition, improving single-table parsing speed by more than 10 times, with higher accuracy and lower GPU memory usage.

"},{"location":"reference/changelog/#092-20241106","title":"0.9.2 (2024/11/06)","text":"

Integrated the StructTable-InternVL2-1B model for table recognition functionality.

"},{"location":"reference/changelog/#090-20241031","title":"0.9.0 (2024/10/31)","text":"

This is a major new version with extensive code refactoring, addressing numerous issues, improving performance, reducing hardware requirements, and enhancing usability:

  • Refactored the sorting module code to use layoutreader for reading order sorting, ensuring high accuracy in various layouts.
  • Refactored the paragraph concatenation module to achieve good results in cross-column, cross-page, cross-figure, and cross-table scenarios.
  • Refactored the list and table of contents recognition functions, significantly improving the accuracy of list blocks and table of contents blocks, as well as the parsing of corresponding text paragraphs.
  • Refactored the matching logic for figures, tables, and descriptive text, greatly enhancing the accuracy of matching captions and footnotes to figures and tables, and reducing the loss rate of descriptive text to near zero.
  • Added multi-language support for OCR, supporting detection and recognition of 84 languages. For the list of supported languages, see OCR Language Support List.
  • Added memory recycling logic and other memory optimization measures, significantly reducing memory usage. The memory requirement for enabling all acceleration features except table acceleration (layout/formula/OCR) has been reduced from 16GB to 8GB, and the memory requirement for enabling all acceleration features has been reduced from 24GB to 10GB.
  • Optimized configuration file feature switches, adding an independent formula detection switch to significantly improve speed and parsing results when formula detection is not needed.
  • Integrated PDF-Extract-Kit 1.0:
  • Added the self-developed doclayout_yolo model, which speeds up processing by more than 10 times compared to the original solution while maintaining similar parsing effects, and can be freely switched with layoutlmv3 via the configuration file.
  • Upgraded formula parsing to unimernet 0.2.1, improving formula parsing accuracy while significantly reducing memory usage.
  • Due to the repository change for PDF-Extract-Kit 1.0, you need to re-download the model. Please refer to How to Download Models for detailed steps.
"},{"location":"reference/changelog/#081-20240927","title":"0.8.1 (2024/09/27)","text":"

Fixed some bugs and provided a locally deployable version of the online demo and the front-end interface.

"},{"location":"reference/changelog/#080-20240909","title":"0.8.0 (2024/09/09)","text":"

Added support for fast deployment with a Dockerfile and launched demos on Hugging Face and ModelScope.

"},{"location":"reference/changelog/#071-20240830","title":"0.7.1 (2024/08/30)","text":"

Added a paddle tablemaster table recognition option.

"},{"location":"reference/changelog/#070b1-20240809","title":"0.7.0b1 (2024/08/09)","text":"

Simplified installation process, added table recognition functionality

"},{"location":"reference/changelog/#062b1-20240801","title":"0.6.2b1 (2024/08/01)","text":"

Optimized dependency conflict issues and installation documentation

"},{"location":"reference/changelog/#initial-open-source-release-20240705","title":"Initial Open-Source Release (2024/07/05)","text":"

MinerU project's first open-source release

"},{"location":"reference/output_files/","title":"Output File Format","text":""},{"location":"reference/output_files/#mineru-output-files-documentation","title":"MinerU Output Files Documentation","text":""},{"location":"reference/output_files/#overview","title":"Overview","text":"

After executing the mineru command, in addition to the main markdown file output, multiple auxiliary files are generated for debugging, quality inspection, and further processing. These files include:

  • Visual debugging files: Help users intuitively understand the document parsing process and results
  • Structured data files: Contain detailed parsing data for secondary development

The following sections provide detailed descriptions of each file's purpose and format.

"},{"location":"reference/output_files/#visual-debugging-files","title":"Visual Debugging Files","text":""},{"location":"reference/output_files/#layout-analysis-file-layoutpdf","title":"Layout Analysis File (layout.pdf)","text":"

File naming format: {original_filename}_layout.pdf

Functionality:

  • Visualizes layout analysis results for each page
  • Numbers in the top-right corner of each detection box indicate reading order
  • Different background colors distinguish different types of content blocks

Use cases:

  • Check if layout analysis is correct
  • Verify if reading order is reasonable
  • Debug layout-related issues

"},{"location":"reference/output_files/#text-spans-file-spanpdf","title":"Text Spans File (span.pdf)","text":"

Note

Only applicable to pipeline backend

File naming format: {original_filename}_span.pdf

Functionality:

  • Uses different colored line boxes to annotate page content based on span type
  • Used for quality inspection and issue troubleshooting

Use cases:

  • Quickly troubleshoot text loss issues
  • Check inline formula recognition
  • Verify text segmentation accuracy

"},{"location":"reference/output_files/#structured-data-files","title":"Structured Data Files","text":"

Important

The VLM backend output changed significantly in version 2.5 and is not backward-compatible with the pipeline backend. If you plan to build on the structured outputs, please read this document carefully.

"},{"location":"reference/output_files/#pipeline-backend-output-results","title":"Pipeline Backend Output Results","text":""},{"location":"reference/output_files/#model-inference-results-modeljson","title":"Model Inference Results (model.json)","text":"

File naming format: {original_filename}_model.json

"},{"location":"reference/output_files/#data-structure-definition","title":"Data Structure Definition","text":"
from pydantic import BaseModel, Field\nfrom enum import IntEnum\n\nclass CategoryType(IntEnum):\n    \"\"\"Content category enumeration\"\"\"\n    title = 0               # Title\n    plain_text = 1          # Text\n    abandon = 2             # Including headers, footers, page numbers, and page annotations\n    figure = 3              # Image\n    figure_caption = 4      # Image caption\n    table = 5               # Table\n    table_caption = 6       # Table caption\n    table_footnote = 7      # Table footnote\n    isolate_formula = 8     # Interline formula\n    formula_caption = 9     # Interline formula number\n    embedding = 13          # Inline formula\n    isolated = 14           # Interline formula\n    text = 15               # OCR recognition result\n\nclass PageInfo(BaseModel):\n    \"\"\"Page information\"\"\"\n    page_no: int = Field(description=\"Page number, first page is 0\", ge=0)\n    height: int = Field(description=\"Page height\", gt=0)\n    width: int = Field(description=\"Page width\", ge=0)\n\nclass ObjectInferenceResult(BaseModel):\n    \"\"\"Object recognition result\"\"\"\n    category_id: CategoryType = Field(description=\"Category\", ge=0)\n    poly: list[float] = Field(description=\"Quadrilateral coordinates, format: [x0,y0,x1,y1,x2,y2,x3,y3]\")\n    score: float = Field(description=\"Confidence score of inference result\")\n    latex: str | None = Field(description=\"LaTeX parsing result\", default=None)\n    html: str | None = Field(description=\"HTML parsing result\", default=None)\n\nclass PageInferenceResults(BaseModel):\n    \"\"\"Page inference results\"\"\"\n    layout_dets: list[ObjectInferenceResult] = Field(description=\"Page recognition results\")\n    page_info: PageInfo = Field(description=\"Page metadata\")\n\n# Complete inference results\ninference_result: list[PageInferenceResults] = []\n
"},{"location":"reference/output_files/#coordinate-system-description","title":"Coordinate System Description","text":"

poly coordinate format: [x0, y0, x1, y1, x2, y2, x3, y3]

  • Represents coordinates of top-left, top-right, bottom-right, bottom-left points respectively
  • Coordinate origin is at the top-left corner of the page
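
Downstream code often wants an axis-aligned box rather than four corner points. A minimal sketch of that conversion, assuming only the point order described above (the function name poly_to_bbox is illustrative, not a MinerU API):

```python
def poly_to_bbox(poly):
    # poly is [x0, y0, x1, y1, x2, y2, x3, y3]: top-left, top-right,
    # bottom-right, bottom-left, with the origin at the page's top-left.
    xs = poly[0::2]
    ys = poly[1::2]
    # Taking min/max also covers slightly skewed quadrilaterals.
    return [min(xs), min(ys), max(xs), max(ys)]
```

The result uses the same [x0, y0, x1, y1] layout as the bbox fields in the other output files.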

"},{"location":"reference/output_files/#sample-data","title":"Sample Data","text":"
[\n    {\n        \"layout_dets\": [\n            {\n                \"category_id\": 2,\n                \"poly\": [\n                    99.1906967163086,\n                    100.3119125366211,\n                    730.3707885742188,\n                    100.3119125366211,\n                    730.3707885742188,\n                    245.81326293945312,\n                    99.1906967163086,\n                    245.81326293945312\n                ],\n                \"score\": 0.9999997615814209\n            }\n        ],\n        \"page_info\": {\n            \"page_no\": 0,\n            \"height\": 2339,\n            \"width\": 1654\n        }\n    },\n    {\n        \"layout_dets\": [\n            {\n                \"category_id\": 5,\n                \"poly\": [\n                    99.13092803955078,\n                    2210.680419921875,\n                    497.3183898925781,\n                    2210.680419921875,\n                    497.3183898925781,\n                    2264.78076171875,\n                    99.13092803955078,\n                    2264.78076171875\n                ],\n                \"score\": 0.9999997019767761\n            }\n        ],\n        \"page_info\": {\n            \"page_no\": 1,\n            \"height\": 2339,\n            \"width\": 1654\n        }\n    }\n]\n
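
Data shaped like the sample above can be summarized with a few lines of Python. This is a sketch under assumptions: pages is the already parsed list from {original_filename}_model.json, and the min_score threshold and the function name are illustrative choices, not values defined by MinerU.

```python
from collections import Counter


def count_categories(pages, min_score=0.5):
    # pages: parsed content of the model.json file, one entry per page.
    counts = Counter()
    for page in pages:
        for det in page['layout_dets']:
            # Skip low-confidence detections (threshold is an assumption).
            if det['score'] >= min_score:
                counts[det['category_id']] += 1
    return dict(counts)
```

The keys of the result are the integer category_id values defined by CategoryType.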
"},{"location":"reference/output_files/#intermediate-processing-results-middlejson","title":"Intermediate Processing Results (middle.json)","text":"

File naming format: {original_filename}_middle.json

"},{"location":"reference/output_files/#top-level-structure","title":"Top-level Structure","text":"
  • pdf_info (list[dict]): Array of parsing results for each page
  • _backend (string): Parsing mode, either pipeline or vlm
  • _version_name (string): MinerU version number
"},{"location":"reference/output_files/#page-information-structure-pdf_info","title":"Page Information Structure (pdf_info)","text":"
  • preproc_blocks: Unsegmented intermediate results after PDF preprocessing
  • page_idx: Page number, starting from 0
  • page_size: Page width and height [width, height]
  • images: Image block information list
  • tables: Table block information list
  • interline_equations: Interline formula block information list
  • discarded_blocks: Block information to be discarded
  • para_blocks: Content block results after segmentation
"},{"location":"reference/output_files/#block-structure-hierarchy","title":"Block Structure Hierarchy","text":"
Level 1 blocks (table | image)\n\u2514\u2500\u2500 Level 2 blocks\n    \u2514\u2500\u2500 Lines\n        \u2514\u2500\u2500 Spans\n
"},{"location":"reference/output_files/#level-1-block-fields","title":"Level 1 Block Fields","text":"
  • type: Block type: table or image
  • bbox: Rectangular box coordinates of the block [x0, y0, x1, y1]
  • blocks: List of contained level 2 blocks
"},{"location":"reference/output_files/#level-2-block-fields","title":"Level 2 Block Fields","text":"
  • type: Block type (see Level 2 Block Types)
  • bbox: Rectangular box coordinates of the block
  • lines: List of contained line information
"},{"location":"reference/output_files/#level-2-block-types","title":"Level 2 Block Types","text":"
  • image_body: Image body
  • image_caption: Image caption text
  • image_footnote: Image footnote
  • table_body: Table body
  • table_caption: Table caption text
  • table_footnote: Table footnote
  • text: Text block
  • title: Title block
  • index: Index block
  • list: List block
  • interline_equation: Interline formula block
"},{"location":"reference/output_files/#line-and-span-structure","title":"Line and Span Structure","text":"

Line fields:

  • bbox: Rectangular box coordinates of the line
  • spans: List of contained spans

Span fields:

  • bbox: Rectangular box coordinates of the span
  • type: Span type (image, table, text, inline_equation, interline_equation)
  • content | img_path: Text content or image path
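
The block, line, and span hierarchy described above can be walked recursively. The following is a minimal sketch that collects the text content of one pdf_info entry; it assumes only the field names documented here, and the helper names iter_spans and page_text are illustrative.

```python
def iter_spans(block):
    # Level 1 blocks (table | image) nest level 2 blocks under 'blocks';
    # level 2 blocks hold 'lines', and each line holds 'spans'.
    for child in block.get('blocks', []):
        yield from iter_spans(child)
    for line in block.get('lines', []):
        yield from line.get('spans', [])


def page_text(page):
    # Joins the content of all spans in para_blocks; spans that only
    # carry an img_path (images, tables) are skipped.
    parts = []
    for block in page.get('para_blocks', []):
        for span in iter_spans(block):
            if 'content' in span:
                parts.append(span['content'])
    return ''.join(parts)
```

Applied to each entry of pdf_info in turn, this yields the page text in the order the blocks are stored.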

"},{"location":"reference/output_files/#sample-data_1","title":"Sample Data","text":"
{\n    \"pdf_info\": [\n        {\n            \"preproc_blocks\": [\n                {\n                    \"type\": \"text\",\n                    \"bbox\": [\n                        52,\n                        61.956024169921875,\n                        294,\n                        82.99800872802734\n                    ],\n                    \"lines\": [\n                        {\n                            \"bbox\": [\n                                52,\n                                61.956024169921875,\n                                294,\n                                72.0000228881836\n                            ],\n                            \"spans\": [\n                                {\n                                    \"bbox\": [\n                                        54.0,\n                                        61.956024169921875,\n                                        296.2261657714844,\n                                        72.0000228881836\n                                    ],\n                                    \"content\": \"dependent on the service headway and the reliability of the departure \",\n                                    \"type\": \"text\",\n                                    \"score\": 1.0\n                                }\n                            ]\n                        }\n                    ]\n                }\n            ],\n            \"layout_bboxes\": [\n                {\n                    \"layout_bbox\": [\n                        52,\n                        61,\n                        294,\n                        731\n                    ],\n                    \"layout_label\": \"V\",\n                    \"sub_layout\": []\n                }\n            ],\n            \"page_idx\": 0,\n            \"page_size\": [\n                612.0,\n                792.0\n            ],\n            \"_layout_tree\": [],\n            \"images\": [],\n            \"tables\": [],\n     
       \"interline_equations\": [],\n            \"discarded_blocks\": [],\n            \"para_blocks\": [\n                {\n                    \"type\": \"text\",\n                    \"bbox\": [\n                        52,\n                        61.956024169921875,\n                        294,\n                        82.99800872802734\n                    ],\n                    \"lines\": [\n                        {\n                            \"bbox\": [\n                                52,\n                                61.956024169921875,\n                                294,\n                                72.0000228881836\n                            ],\n                            \"spans\": [\n                                {\n                                    \"bbox\": [\n                                        54.0,\n                                        61.956024169921875,\n                                        296.2261657714844,\n                                        72.0000228881836\n                                    ],\n                                    \"content\": \"dependent on the service headway and the reliability of the departure \",\n                                    \"type\": \"text\",\n                                    \"score\": 1.0\n                                }\n                            ]\n                        }\n                    ]\n                }\n            ]\n        }\n    ],\n    \"_backend\": \"pipeline\",\n    \"_version_name\": \"0.6.1\"\n}\n
"},{"location":"reference/output_files/#content-list-content_listjson","title":"Content List (content_list.json)","text":"

File naming format: {original_filename}_content_list.json

"},{"location":"reference/output_files/#functionality","title":"Functionality","text":"

This is a simplified version of middle.json that stores all readable content blocks in reading order as a flat structure, removing complex layout information for easier subsequent processing.

"},{"location":"reference/output_files/#content-types","title":"Content Types","text":"
  • image: Image
  • table: Table
  • text: Text/Title
  • equation: Interline formula
"},{"location":"reference/output_files/#text-level-identification","title":"Text Level Identification","text":"

Text levels are distinguished through the text_level field:

  • No text_level or text_level: 0: Body text
  • text_level: 1: Level 1 heading
  • text_level: 2: Level 2 heading
  • And so on...
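
As a small illustration of the rule above, a text block can be turned into a Markdown line by treating text_level as the heading depth. This is a sketch, not how MinerU itself renders Markdown; the function name is hypothetical.

```python
def to_markdown_line(item):
    # A missing text_level or text_level 0 means body text;
    # 1, 2, ... are heading levels.
    level = item.get('text_level', 0)
    text = item.get('text', '').strip()
    if level > 0:
        return '#' * level + ' ' + text
    return text
```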
"},{"location":"reference/output_files/#common-fields","title":"Common Fields","text":"
  • All content blocks include a page_idx field indicating the page number (starting from 0).
  • All content blocks include a bbox field representing the bounding box coordinates of the content block [x0, y0, x1, y1], mapped to a range of 0-1000.
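
To draw or crop a block, the normalized bbox has to be mapped back to page units. The sketch below assumes the mapping is a plain linear scaling of x by page width and y by page height onto 0-1000; the text above only states the target range, so treat that as an assumption. The page size would come from another source, such as page_size in middle.json.

```python
def bbox_to_page_units(bbox, page_width, page_height, scale=1000):
    # Assumption: x was scaled by page width and y by page height
    # onto the 0-1000 range.
    x0, y0, x1, y1 = bbox
    return [x0 * page_width / scale, y0 * page_height / scale,
            x1 * page_width / scale, y1 * page_height / scale]
```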
"},{"location":"reference/output_files/#sample-data_2","title":"Sample Data","text":"
[\n        {\n        \"type\": \"text\",\n        \"text\": \"The response of flow duration curves to afforestation \",\n        \"text_level\": 1, \n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],\n        \"page_idx\": 0\n    },\n    {\n        \"type\": \"image\",\n        \"img_path\": \"images/a8ecda1c69b27e4f79fce1589175a9d721cbdc1cf78b4cc06a015f3746f6b9d8.jpg\",\n        \"image_caption\": [\n            \"Fig. 1. Annual flow duration curves of daily flows from Pine Creek, Australia, 1989\u20132000. \"\n        ],\n        \"image_footnote\": [],\n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],\n        \"page_idx\": 1\n    },\n    {\n        \"type\": \"equation\",\n        \"img_path\": \"images/181ea56ef185060d04bf4e274685f3e072e922e7b839f093d482c29bf89b71e8.jpg\",\n        \"text\": \"$$\\nQ _ { \\\\% } = f ( P ) + g ( T )\\n$$\",\n        \"text_format\": \"latex\",\n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],\n        \"page_idx\": 2\n    },\n    {\n        \"type\": \"table\",\n        \"img_path\": \"images/e3cb413394a475e555807ffdad913435940ec637873d673ee1b039e3bc3496d0.jpg\",\n        \"table_caption\": [\n            \"Table 2 Significance of the rainfall and time terms \"\n        ],\n        \"table_footnote\": [\n            \"indicates that the rainfall term was significant at the $5 \\\\%$ level, $T$ indicates that the time term was significant at the $5 \\\\%$ level, \\\\* represents significance at the $10 \\\\%$ level, and na denotes too few data points for meaningful analysis. 
\"\n        ],\n        \"table_body\": \"<html><body><table><tr><td rowspan=\\\"2\\\">Site</td><td colspan=\\\"10\\\">Percentile</td></tr><tr><td>10</td><td>20</td><td>30</td><td>40</td><td>50</td><td>60</td><td>70</td><td>80</td><td>90</td><td>100</td></tr><tr><td>Traralgon Ck</td><td>P</td><td>P,*</td><td>P</td><td>P</td><td>P,</td><td>P,</td><td>P,</td><td>P,</td><td>P</td><td>P</td></tr><tr><td>Redhill</td><td>P,T</td><td>P,T</td><td>\uff0c*</td><td>**</td><td>P.T</td><td>P,*</td><td>P*</td><td>P*</td><td>*</td><td>\uff0c*</td></tr><tr><td>Pine Ck</td><td></td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td><td>T</td><td>na</td><td>na</td></tr><tr><td>Stewarts Ck 5</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P.T</td><td>P.T</td><td>P,T</td><td>na</td><td>na</td><td>na</td></tr><tr><td>Glendhu 2</td><td>P</td><td>P,T</td><td>P,*</td><td>P,T</td><td>P.T</td><td>P,ns</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td></tr><tr><td>Cathedral Peak 2</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Cathedral Peak 3</td><td>P.T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Lambrechtsbos A</td><td>P,T</td><td>P</td><td>P</td><td>P,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>T</td></tr><tr><td>Lambrechtsbos B</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td></tr><tr><td>Biesievlei</td><td>P,T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>*,T</td><td>T</td><td>T</td><td>P,T</td><td>P,T</td></tr></table></body></html>\",\n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],  \n        \"page_idx\": 5\n    }\n]\n
"},{"location":"reference/output_files/#vlm-backend-output-results","title":"VLM Backend Output Results","text":""},{"location":"reference/output_files/#model-inference-results-modeljson_1","title":"Model Inference Results (model.json)","text":"

File naming format: {original_filename}_model.json

"},{"location":"reference/output_files/#file-format-description","title":"File format description","text":"
  • Two-level nested list: outer list = pages; inner list = content blocks of that page
  • Each block is a dict with at least: type, bbox, angle, content (some types add extra fields like score, block_tags, content_tags, format)
  • Designed for direct, raw model inspection
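A minimal sketch of walking the two-level structure described above. The helper names load_model_json and count_block_types are hypothetical; only the nesting (pages, then blocks) and the type field come from the format description.

```python
import json
from collections import Counter


def load_model_json(path):
    '''Read a {original_filename}_model.json file into nested Python lists.'''
    with open(path, encoding='utf-8') as f:
        return json.load(f)


def count_block_types(pages):
    '''Count blocks per type across all pages.'''
    counts = Counter()
    for blocks in pages:        # outer list: one entry per page
        for block in blocks:    # inner list: content blocks of that page
            counts[block['type']] += 1
    return counts


# Tiny inline sample shaped like the VLM model.json output.
sample = [[{'type': 'header'}, {'type': 'title'}], [{'type': 'text'}]]
print(count_block_types(sample))
```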
"},{"location":"reference/output_files/#supported-content-types-type-field-values","title":"Supported content types (type field values)","text":"
{\n  \"text\": \"Plain text\",\n  \"title\": \"Title\",\n  \"equation\": \"Display (interline) formula\",\n  \"image\": \"Image\",\n  \"image_caption\": \"Image caption\",\n  \"image_footnote\": \"Image footnote\",\n  \"table\": \"Table\",\n  \"table_caption\": \"Table caption\",\n  \"table_footnote\": \"Table footnote\",\n  \"phonetic\": \"Phonetic annotation\",\n  \"code\": \"Code block\",\n  \"code_caption\": \"Code caption\",\n  \"ref_text\": \"Reference / citation entry\",\n  \"algorithm\": \"Algorithm block (treated as code subtype)\",\n  \"list\": \"List container\",\n  \"header\": \"Page header\",\n  \"footer\": \"Page footer\",\n  \"page_number\": \"Page number\",\n  \"aside_text\": \"Side / margin note\",\n  \"page_footnote\": \"Page footnote\"\n}\n
"},{"location":"reference/output_files/#coordinate-system","title":"Coordinate system","text":"
  • bbox = [x0, y0, x1, y1] (top-left, bottom-right)
  • Origin at top-left of the page
  • All coordinates are normalized to the range [0, 1] (fractions of the page width and height)
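To make the coordinate system concrete, here is a small sketch that maps a normalized model.json bbox to integer pixel coordinates. The name to_pixel_bbox and the page size arguments are assumptions for illustration; MinerU does not provide this helper.

```python
def to_pixel_bbox(block, page_width, page_height):
    '''Map a block bbox in [0, 1] to rounded pixel coordinates.'''
    x0, y0, x1, y1 = block['bbox']
    return (
        round(x0 * page_width),
        round(y0 * page_height),
        round(x1 * page_width),
        round(y1 * page_height),
    )


# Block shaped like the title entry in the sample data; 1000 x 1000 is an assumed page size.
title = {'type': 'title', 'bbox': [0.157, 0.228, 0.833, 0.253]}
print(to_pixel_bbox(title, 1000, 1000))
```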
"},{"location":"reference/output_files/#sample-data_3","title":"Sample data","text":"
[\n  [\n    {\n      \"type\": \"header\",\n      \"bbox\": [0.077, 0.095, 0.18, 0.181],\n      \"angle\": 0,\n      \"score\": null,\n      \"block_tags\": null,\n      \"content\": \"ELSEVIER\",\n      \"format\": null,\n      \"content_tags\": null\n    },\n    {\n      \"type\": \"title\",\n      \"bbox\": [0.157, 0.228, 0.833, 0.253],\n      \"angle\": 0,\n      \"score\": null,\n      \"block_tags\": null,\n      \"content\": \"The response of flow duration curves to afforestation\",\n      \"format\": null,\n      \"content_tags\": null\n    }\n  ]\n]\n
"},{"location":"reference/output_files/#intermediate-processing-results-middlejson_1","title":"Intermediate Processing Results (middle.json)","text":"

File naming format: {original_filename}_middle.json

Structure is broadly similar to the pipeline backend, but with these differences:

  • list becomes a second\u2011level block; a new sub_type field distinguishes list categories:
    • text: ordinary list
    • ref_text: reference / bibliography style list
  • New code block type with sub_type (a code block always has at least a code_body and may optionally have a code_caption):
    • code
    • algorithm
  • discarded_blocks may contain additional types:
    • header
    • footer
    • page_number
    • aside_text
    • page_footnote
  • All blocks include an angle field indicating rotation (one of 0, 90, 180, 270).
"},{"location":"reference/output_files/#examples","title":"Examples","text":"
  • Example: list block

    {\n  \"bbox\": [174,155,818,333],\n  \"type\": \"list\",\n  \"angle\": 0,\n  \"index\": 11,\n  \"blocks\": [\n    {\n      \"bbox\": [174,157,311,175],\n      \"type\": \"text\",\n      \"angle\": 0,\n      \"lines\": [\n        {\n          \"bbox\": [174,157,311,175],\n            \"spans\": [\n              {\n                \"bbox\": [174,157,311,175],\n                \"type\": \"text\",\n                \"content\": \"H.1 Introduction\"\n              }\n            ]\n        }\n      ],\n      \"index\": 3\n    },\n    {\n      \"bbox\": [175,182,464,229],\n      \"type\": \"text\",\n      \"angle\": 0,\n      \"lines\": [\n        {\n          \"bbox\": [175,182,464,229],\n          \"spans\": [\n            {\n              \"bbox\": [175,182,464,229],\n              \"type\": \"text\",\n              \"content\": \"H.2 Example: Divide by Zero without Exception Handling\"\n            }\n          ]\n        }\n      ],\n      \"index\": 4\n    }\n  ],\n  \"sub_type\": \"text\"\n}\n
  • Example: code block with optional caption:

    {\n  \"type\": \"code\",\n  \"bbox\": [114,780,885,1231],\n  \"blocks\": [\n    {\n      \"bbox\": [114,780,885,1231],\n      \"lines\": [\n        {\n          \"bbox\": [114,780,885,1231],\n          \"spans\": [\n            {\n              \"bbox\": [114,780,885,1231],\n              \"type\": \"text\",\n              \"content\": \"1 // Fig. H.1: DivideByZeroNoExceptionHandling.java  \\n2 // Integer division without exception handling.  \\n3 import java.util.Scanner;  \\n4  \\n5 public class DivideByZeroNoExceptionHandling  \\n6 {  \\n7 // demonstrates throwing an exception when a divide-by-zero occurs  \\n8 public static int quotient( int numerator, int denominator )  \\n9 {  \\n10 return numerator / denominator; // possible division by zero  \\n11 } // end method quotient  \\n12  \\n13 public static void main(String[] args)  \\n14 {  \\n15 Scanner scanner = new Scanner(System.in); // scanner for input  \\n16  \\n17 System.out.print(\\\"Please enter an integer numerator: \\\");  \\n18 int numerator = scanner.nextInt();  \\n19 System.out.print(\\\"Please enter an integer denominator: \\\");  \\n20 int denominator = scanner.nextInt();  \\n21\"\n            }\n          ]\n        }\n      ],\n      \"index\": 17,\n      \"angle\": 0,\n      \"type\": \"code_body\"\n    },\n    {\n      \"bbox\": [867,160,1280,189],\n      \"lines\": [\n        {\n          \"bbox\": [867,160,1280,189],\n          \"spans\": [\n            {\n              \"bbox\": [867,160,1280,189],\n              \"type\": \"text\",\n              \"content\": \"Algorithm 1 Modules for MCTSteg\"\n            }\n          ]\n        }\n      ],\n      \"index\": 19,\n      \"angle\": 0,\n      \"type\": \"code_caption\"\n    }\n  ],\n  \"index\": 17,\n  \"sub_type\": \"code\"\n}\n
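The nesting in the code block example can be flattened with a short helper. This is only a sketch: span_texts and split_code_block are hypothetical names, and the code relies solely on the blocks, lines, spans, content, type and sub_type fields shown in the example.

```python
def span_texts(block):
    '''Collect the content of every span in a second-level block.'''
    return [
        span['content']
        for line in block.get('lines', [])
        for span in line.get('spans', [])
    ]


def split_code_block(code_block):
    '''Separate a middle.json code block into body and caption texts.'''
    body, caption = [], []
    for sub in code_block.get('blocks', []):
        if sub.get('type') == 'code_body':
            body.extend(span_texts(sub))
        elif sub.get('type') == 'code_caption':
            caption.extend(span_texts(sub))
    return {'sub_type': code_block.get('sub_type'), 'body': body, 'caption': caption}
```

A code block without a caption simply yields an empty caption list.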
"},{"location":"reference/output_files/#content-list-content_listjson_1","title":"Content List (content_list.json)","text":"

File naming format: {original_filename}_content_list.json

Based on the pipeline format, with these VLM-specific extensions:

  • New code type with sub_type (code | algorithm):
    • Fields: code_body (string), optional code_caption (list of strings)
  • New list type with sub_type (text | ref_text):
    • Field: list_items (array of strings)
  • All discarded_blocks entries are also output (e.g., headers, footers, page numbers, margin notes, page footnotes).
  • Existing types (image, table, text, equation) remain unchanged.
  • bbox still uses the 0\u20131000 normalized coordinate mapping.
"},{"location":"reference/output_files/#examples_1","title":"Examples","text":"

Example: code (algorithm) entry

{\n  \"type\": \"code\",\n  \"sub_type\": \"algorithm\",\n  \"code_caption\": [\"Algorithm 1 Modules for MCTSteg\"],\n  \"code_body\": \"1: function GETCOORDINATE(d)  \\n2:  $x \\\\gets d / l$ ,  $y \\\\gets d$  mod  $l$   \\n3: return  $(x, y)$   \\n4: end function  \\n5: function BESTCHILD(v)  \\n6:  $C \\\\gets$  child set of  $v$   \\n7:  $v' \\\\gets \\\\arg \\\\max_{c \\\\in C} \\\\mathrm{UCTScore}(c)$   \\n8:  $v'.n \\\\gets v'.n + 1$   \\n9: return  $v'$   \\n10: end function  \\n11: function BACK PROPAGATE(v)  \\n12: Calculate  $R$  using Equation 11  \\n13: while  $v$  is not a root node do  \\n14:  $v.r \\\\gets v.r + R$ ,  $v \\\\gets v.p$   \\n15: end while  \\n16: end function  \\n17: function RANDOMSEARCH(v)  \\n18: while  $v$  is not a leaf node do  \\n19: Randomly select an untried action  $a \\\\in A(v)$   \\n20: Create a new node  $v'$   \\n21:  $(x, y) \\\\gets \\\\mathrm{GETCOORDINATE}(v'.d)$   \\n22:  $v'.p \\\\gets v$ ,  $v'.d \\\\gets v.d + 1$ ,  $v'.\\\\Gamma \\\\gets v.\\\\Gamma$   \\n23:  $v'.\\\\gamma_{x,y} \\\\gets a$   \\n24: if  $a = -1$  then  \\n25:  $v.lc \\\\gets v'$   \\n26: else if  $a = 0$  then  \\n27:  $v.mc \\\\gets v'$   \\n28: else  \\n29:  $v.rc \\\\gets v'$   \\n30: end if  \\n31:  $v \\\\gets v'$   \\n32: end while  \\n33: return  $v$   \\n34: end function  \\n35: function SEARCH(v)  \\n36: while  $v$  is fully expanded do  \\n37:  $v \\\\gets$  BESTCHILD(v)  \\n38: end while  \\n39: if  $v$  is not a leaf node then  \\n40:  $v \\\\gets$  RANDOMSEARCH(v)  \\n41: end if  \\n42: return  $v$   \\n43: end function\",\n  \"bbox\": [510,87,881,740],\n  \"page_idx\": 0\n}\n

Example: list (text) entry

{\n  \"type\": \"list\",\n  \"sub_type\": \"text\",\n  \"list_items\": [\n    \"H.1 Introduction\",\n    \"H.2 Example: Divide by Zero without Exception Handling\",\n    \"H.3 Example: Divide by Zero with Exception Handling\",\n    \"H.4 Summary\"\n  ],\n  \"bbox\": [174,155,818,333],\n  \"page_idx\": 0\n}\n

Example: discarded blocks output

[\n  {\n    \"type\": \"header\",\n    \"text\": \"Journal of Hydrology 310 (2005) 253-265\",\n    \"bbox\": [363,164,623,177],\n    \"page_idx\": 0\n  },\n  {\n    \"type\": \"page_footnote\",\n    \"text\": \"* Corresponding author. Address: Forest Science Centre, Department of Sustainability and Environment, P.O. Box 137, Heidelberg, Vic. 3084, Australia. Tel.: +61 3 9450 8719; fax: +61 3 9450 8644.\",\n    \"bbox\": [71,815,915,841],\n    \"page_idx\": 0\n  }\n]\n
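The VLM-specific entries above can be reduced to plain text along these lines. This is a sketch under assumptions: entry_to_text and main_text are hypothetical helpers, the separators are arbitrary, and entries without textual fields (for example images) are skipped.

```python
# Types that the VLM content list emits for discarded blocks.
DISCARDED_TYPES = {'header', 'footer', 'page_number', 'aside_text', 'page_footnote'}


def entry_to_text(entry):
    '''Return a plain-text rendering of one content_list.json entry.'''
    kind = entry.get('type')
    if kind == 'code':
        caption = ' '.join(entry.get('code_caption', []))
        body = entry.get('code_body', '')
        return caption + ': ' + body if caption else body
    if kind == 'list':
        return '; '.join(entry.get('list_items', []))
    return entry.get('text', '')


def main_text(entries):
    '''Keep body content only: drop discarded block types and empty texts.'''
    texts = []
    for entry in entries:
        if entry.get('type') in DISCARDED_TYPES:
            continue
        text = entry_to_text(entry)
        if text:
            texts.append(text)
    return texts
```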
"},{"location":"reference/output_files/#summary","title":"Summary","text":"

The above files constitute MinerU's complete output results. Users can choose appropriate files for subsequent processing based on their needs:

  • Model outputs (Use raw outputs):

    • model.json
  • Debugging and verification (Use visualization files):

    • layout.pdf
    • span.pdf
  • Content extraction (Use simplified files):

    • *.md
    • content_list.json
  • Secondary development (Use structured files):

    • middle.json
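For scripts that pick one of these files, the naming formats given earlier can be turned into a small path helper. The sketch below makes two assumptions that are not stated in this document: {original_filename} is the input name without its extension, and the files sit directly in the directory you pass in. The names SUFFIXES and output_file are hypothetical.

```python
from pathlib import Path

# Suffixes taken from the documented naming formats.
SUFFIXES = {
    'model': '_model.json',
    'middle': '_middle.json',
    'content_list': '_content_list.json',
}


def output_file(output_dir, original_filename, kind):
    '''Build the expected path of one JSON output file.'''
    stem = Path(original_filename).stem   # ASSUMPTION: extension is dropped
    return Path(output_dir) / (stem + SUFFIXES[kind])


print(output_file('output', 'demo1.pdf', 'content_list'))
```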
"},{"location":"usage/","title":"Usage","text":""},{"location":"usage/#usage-guide","title":"Usage Guide","text":"

This section explains how to use the project, moving from basic to advanced usage through the following sections:

"},{"location":"usage/#table-of-contents","title":"Table of Contents","text":"
  • Quick Usage - Quick setup and basic usage
  • Model Source Configuration - Detailed configuration instructions for model sources
  • Command Line Tools - Detailed parameter descriptions for command line tools
  • Advanced Optimization Parameters - Advanced parameter descriptions for command line tool adaptation
"},{"location":"usage/#getting-started","title":"Getting Started","text":"

We recommend reading the documentation in the order listed above, which will help you better understand and use the project features.

If you encounter issues during usage, please check the FAQ.

"},{"location":"usage/advanced_cli_parameters/","title":"Advanced CLI Parameters","text":""},{"location":"usage/advanced_cli_parameters/#advanced-command-line-parameters","title":"Advanced Command Line Parameters","text":""},{"location":"usage/advanced_cli_parameters/#pass-through-of-inference-engine-parameters","title":"Pass-through of inference engine parameters","text":""},{"location":"usage/advanced_cli_parameters/#vllm-acceleration-parameter-optimization","title":"vllm Acceleration Parameter Optimization","text":"

Tip

If you can already use vllm normally for accelerated VLM model inference but still want to further improve inference speed, you can try the following parameters:

  • If you have multiple graphics cards, you can use vllm's multi-card parallel mode to increase throughput: --data-parallel-size 2
"},{"location":"usage/advanced_cli_parameters/#parameter-passing-instructions","title":"Parameter Passing Instructions","text":"

Tip

  • All officially supported vllm/lmdeploy parameters can be passed to MinerU through command line arguments, including the following commands: mineru, mineru-openai-server, mineru-gradio, mineru-api
  • If you want to learn more about vllm parameter usage, please refer to the vllm official documentation
  • If you want to learn more about lmdeploy parameter usage, please refer to the lmdeploy official documentation
"},{"location":"usage/advanced_cli_parameters/#gpu-device-selection-and-configuration","title":"GPU Device Selection and Configuration","text":""},{"location":"usage/advanced_cli_parameters/#cuda_visible_devices-basic-usage","title":"CUDA_VISIBLE_DEVICES Basic Usage","text":"

Tip

  • In any situation, you can specify visible GPU devices by adding the CUDA_VISIBLE_DEVICES environment variable at the beginning of the command line. For example:
    CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>\n
  • This specification method is effective for all command line calls, including mineru, mineru-openai-server, mineru-gradio, and mineru-api, and applies to both pipeline and vlm backends.
"},{"location":"usage/advanced_cli_parameters/#common-device-configuration-examples","title":"Common Device Configuration Examples","text":"

Tip

Here are some common CUDA_VISIBLE_DEVICES setting examples:

CUDA_VISIBLE_DEVICES=1  # Only device 1 will be seen\nCUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible\nCUDA_VISIBLE_DEVICES=\"0,1\"  # Same as above, quotation marks are optional\nCUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked\nCUDA_VISIBLE_DEVICES=\"\"  # No GPU will be visible\n
"},{"location":"usage/advanced_cli_parameters/#practical-application-scenarios","title":"Practical Application Scenarios","text":"

Tip

Here are some possible usage scenarios:

  • If you have multiple graphics cards and want to start openai-server on cards 0 and 1 with multi-card parallelism, you can use the following command:

    CUDA_VISIBLE_DEVICES=0,1 mineru-openai-server --engine vllm --port 30000 --data-parallel-size 2\n
  • If you have multiple graphics cards and need to start two fastapi services on cards 0 and 1, listening on different ports respectively, you can use the following commands:

    # In terminal 1\nCUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000\n# In terminal 2\nCUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001\n
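The two-terminal setup above can also be scripted. The following sketch only builds the commands and environments; api_launch_plan is a hypothetical helper, and the one-port-per-GPU numbering is an assumption that mirrors the example.

```python
import os


def api_launch_plan(gpu_ids, base_port=8000, host='127.0.0.1'):
    '''Return one (command, environment) pair per GPU.'''
    plan = []
    for offset, gpu in enumerate(gpu_ids):
        # Each service only sees its own GPU.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        cmd = ['mineru-api', '--host', host, '--port', str(base_port + offset)]
        plan.append((cmd, env))
    return plan


# To actually start the services (requires MinerU to be installed):
# import subprocess
# procs = [subprocess.Popen(cmd, env=env) for cmd, env in api_launch_plan([0, 1])]
```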
"},{"location":"usage/cli_tools/","title":"CLI Tools","text":""},{"location":"usage/cli_tools/#command-line-tools-usage-instructions","title":"Command Line Tools Usage Instructions","text":""},{"location":"usage/cli_tools/#view-help-information","title":"View Help Information","text":"

To view help information for the MinerU command line tools, use the --help parameter. The help output of each tool is shown below:

mineru --help\nUsage: mineru [OPTIONS]\n\nOptions:\n  -v, --version                   Show version and exit\n  -p, --path PATH                 Input file path or directory (required)\n  -o, --output PATH               Output directory (required)\n  --api-url TEXT                  MinerU FastAPI base URL; if omitted, `mineru` starts a temporary local `mineru-api`\n  -m, --method [auto|txt|ocr]     Parsing method: auto (default), txt, ocr (pipeline and hybrid* backend only)\n  -b, --backend [pipeline|hybrid-auto-engine|hybrid-http-client|vlm-auto-engine|vlm-http-client]\n                                  Parsing backend (default: hybrid-auto-engine)\n  -l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|th|el|latin|arabic|east_slavic|cyrillic|devanagari]\n                                  Specify document language (improves OCR accuracy, pipeline and hybrid* backend only)\n  -u, --url TEXT                  OpenAI-compatible backend URL passed through to the server when using http-client\n  -s, --start INTEGER             Starting page number for parsing (0-based)\n  -e, --end INTEGER               Ending page number for parsing (0-based)\n  -f, --formula BOOLEAN           Enable formula parsing (default: enabled)\n  -t, --table BOOLEAN             Enable table parsing (default: enabled)\n  --help                          Show help information\n
mineru-api --help\nUsage: mineru-api [OPTIONS]\n\nOptions:\n  --host TEXT     Server host (default: 127.0.0.1)\n  --port INTEGER  Server port (default: 8000)\n  --reload        Enable auto-reload (development mode)\n  --help          Show this message and exit.\n
mineru-gradio --help\nUsage: mineru-gradio [OPTIONS]\n\nOptions:\n  --enable-example BOOLEAN        Enable example files for input. The example\n                                  files to be input need to be placed in the\n                                  `example` folder within the directory where\n                                  the command is currently executed.\n  --enable-http-client BOOLEAN    Enable http-client backend to link openai-\n                                  compatible servers.\n  --enable-api BOOLEAN            Enable gradio API for serving the\n                                  application.\n  --max-convert-pages INTEGER     Set the maximum number of pages to convert\n                                  from PDF to Markdown.\n  --server-name TEXT              Set the server name for the Gradio app.\n  --server-port INTEGER           Set the server port for the Gradio app.\n  --latex-delimiters-type [a|b|all]\n                                  Set the type of LaTeX delimiters to use in\n                                  Markdown rendering: 'a' for type '$', 'b' for\n                                  type '()[]', 'all' for both types.\n  --help                          Show this message and exit.\n
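Because -s and -e take 0-based page numbers, a thin wrapper can help when you think in 1-based pages. This is a sketch only: mineru_command is a hypothetical helper, and treating the -e value as an inclusive end page is an assumption that the help text does not confirm.

```python
def mineru_command(input_path, output_dir, backend=None, first_page=None, last_page=None):
    '''Build a mineru argument list from 1-based page numbers.'''
    cmd = ['mineru', '-p', str(input_path), '-o', str(output_dir)]
    if backend is not None:
        cmd += ['-b', backend]
    if first_page is not None:
        cmd += ['-s', str(first_page - 1)]   # 1-based -> 0-based
    if last_page is not None:
        cmd += ['-e', str(last_page - 1)]    # ASSUMPTION: end page is inclusive
    return cmd


print(' '.join(mineru_command('demo.pdf', 'output', backend='pipeline', first_page=1, last_page=3)))
```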
"},{"location":"usage/cli_tools/#environment-variables-description","title":"Environment Variables Description","text":"

Note

Starting from this version, mineru is an orchestration client built on top of mineru-api:

  • Without --api-url, the CLI launches a temporary local mineru-api
  • With --api-url, the CLI connects to that FastAPI service directly
  • --url is no longer the MinerU API address; it is the OpenAI-compatible backend URL used by server-side vlm/hybrid-http-client

Some parameters of MinerU command line tools have equivalent environment variable configurations. Generally, environment variable configurations have higher priority than command line parameters and take effect across all command line tools. Here are the environment variables and their descriptions:

  • MINERU_TOOLS_CONFIG_JSON:

    • Used to specify configuration file path
    • Defaults to mineru.json in the user directory; set this environment variable to use a different configuration file path.
  • MINERU_FORMULA_ENABLE:

    • Used to enable formula parsing
    • Defaults to true; set it to false to disable formula parsing.
  • MINERU_FORMULA_CH_SUPPORT:

    • Used to enable Chinese formula parsing optimization (experimental feature)
    • Default is false, can be set to true via environment variable to enable Chinese formula parsing optimization.
    • Only effective for pipeline backend.
  • MINERU_TABLE_ENABLE:

    • Used to enable table parsing
    • Default is true, can be set to false via environment variable to disable table parsing.
  • MINERU_TABLE_MERGE_ENABLE:

    • Used to enable table merging functionality
    • Default is true, can be set to false via environment variable to disable table merging functionality.
  • MINERU_PDF_RENDER_TIMEOUT:

    • Used to set the timeout (in seconds) for rendering PDFs to images.
    • Default is 300 seconds; you can set a different value via an environment variable to adjust the rendering timeout.
    • Only effective on Linux and macOS systems.
  • MINERU_PDF_RENDER_THREADS:

    • Used to set the number of threads used when rendering PDFs to images.
    • Default is 4; you can set a different value via an environment variable to adjust the number of threads for image rendering.
    • Only effective on Linux and macOS systems.
  • MINERU_INTRA_OP_NUM_THREADS:

    • Used to set the intra_op thread count for ONNX models, affects the computation speed of individual operators
    • Default is -1 (auto-select), can be set to other values via environment variable to adjust the thread count.
  • MINERU_INTER_OP_NUM_THREADS:

    • Used to set the inter_op thread count for ONNX models, affects the parallel execution of multiple operators
    • Default is -1 (auto-select), can be set to other values via environment variable to adjust the thread count.
  • MINERU_HYBRID_BATCH_RATIO:

    • Used to set the batch ratio for small model processing in hybrid-* backends.
    • Commonly used in hybrid-http-client, it allows adjusting the VRAM usage of a single client by controlling the batch ratio of small models.
    • Single client VRAM size and the corresponding MINERU_HYBRID_BATCH_RATIO value:
      • <= 6 GB: 8
      • <= 4.5 GB: 4
      • <= 3 GB: 2
      • <= 2.5 GB: 1
  • MINERU_HYBRID_FORCE_PIPELINE_ENABLE:

    • Used to force the text extraction part in hybrid-* backends to be processed using small models.
    • Defaults to false. Can be set to true via environment variable to enable this feature, thereby reducing hallucinations in certain extreme cases.
  • MINERU_VL_MODEL_NAME:

    • Used to specify the model name for the vlm/hybrid backend, allowing you to designate the model required for MinerU to run when multiple models exist on a remote openai-server.
  • MINERU_VL_API_KEY:

    • Used to specify the API Key for the vlm/hybrid backend, enabling authentication on the remote openai-server.
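As an illustration of setting these variables from Python, the sketch below picks MINERU_HYBRID_BATCH_RATIO from the VRAM guidance listed above and places it in an environment for a child process. The helpers suggest_batch_ratio and hybrid_client_env are hypothetical, and leaving the variable unset above 6 GB is an assumption, since the guidance does not cover that case.

```python
import os

# (maximum VRAM in GB, ratio) pairs from the guidance above, smallest first.
BATCH_RATIO_TABLE = [(2.5, 1), (3, 2), (4.5, 4), (6, 8)]


def suggest_batch_ratio(vram_gb):
    '''Return the ratio for the smallest VRAM tier that fits, else None.'''
    for limit, ratio in BATCH_RATIO_TABLE:
        if vram_gb <= limit:
            return ratio
    return None   # above 6 GB: not covered by the guidance


def hybrid_client_env(vram_gb, base_env=None):
    '''Copy an environment and add MINERU_HYBRID_BATCH_RATIO when known.'''
    env = dict(os.environ if base_env is None else base_env)
    ratio = suggest_batch_ratio(vram_gb)
    if ratio is not None:
        env['MINERU_HYBRID_BATCH_RATIO'] = str(ratio)
    return env
```

The resulting dictionary can be passed as env= to subprocess.run when launching mineru with a hybrid-http-client backend.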
"},{"location":"usage/model_source/","title":"Model Source","text":""},{"location":"usage/model_source/#model-source-documentation","title":"Model Source Documentation","text":"

MinerU uses HuggingFace and ModelScope as model repositories. Users can switch model sources or use local models as needed.

  • HuggingFace is the default model source, providing excellent loading speed and high stability globally.
  • ModelScope is the best choice for users in mainland China. It provides SDK modules that are seamlessly compatible with the hf SDK and suits users who cannot access HuggingFace.
"},{"location":"usage/model_source/#methods-to-switch-model-sources","title":"Methods to Switch Model Sources","text":""},{"location":"usage/model_source/#switch-via-command-line-parameters","title":"Switch via Command Line Parameters","text":"

Currently, only the mineru command line tool supports switching model sources through command line parameters. Other command line tools such as mineru-api, mineru-gradio, etc., do not support this yet.

mineru -p <input_path> -o <output_path> --source modelscope\n
"},{"location":"usage/model_source/#switch-via-environment-variables","title":"Switch via Environment Variables","text":"

You can switch model sources by setting environment variables in any situation. This applies to all command line tools and API calls.

export MINERU_MODEL_SOURCE=modelscope\n
or
import os\nos.environ[\"MINERU_MODEL_SOURCE\"] = \"modelscope\"\n

Tip

Model sources set through environment variables take effect in the current terminal session until the terminal is closed or the environment variable is modified. They have higher priority than command line parameters: if both are set, the command line parameters are ignored.

"},{"location":"usage/model_source/#using-local-models","title":"Using Local Models","text":""},{"location":"usage/model_source/#1-download-models-to-local-storage","title":"1. Download Models to Local Storage","text":"
mineru-models-download --help\n
or use the interactive command line tool to select model downloads:
mineru-models-download\n

Note

  • After download completion, the model path will be output in the current terminal window and automatically written to mineru.json in the user directory.
  • You can also create it by copying the configuration template file to your user directory and renaming it to mineru.json.
  • After downloading models locally, you can freely move the model folder to other locations while updating the model path in mineru.json.
  • If you deploy the model folder to another server, please ensure you move the mineru.json file to the user directory of the new device and configure the model path correctly.
  • If you need to update model files, you can run the mineru-models-download command again. Model updates do not support custom paths currently - if you haven't moved the local model folder, model files will be incrementally updated; if you have moved the model folder, model files will be re-downloaded to the default location and mineru.json will be updated.
"},{"location":"usage/model_source/#2-use-local-models-for-parsing","title":"2. Use Local Models for Parsing","text":"
mineru -p <input_path> -o <output_path> --source local\n
or enable through environment variables:
export MINERU_MODEL_SOURCE=local\nmineru -p <input_path> -o <output_path>\n
"},{"location":"usage/quick_usage/","title":"Quick Usage","text":""},{"location":"usage/quick_usage/#using-mineru","title":"Using MinerU","text":""},{"location":"usage/quick_usage/#quick-model-source-configuration","title":"Quick Model Source Configuration","text":"

MinerU uses huggingface as the default model source. If users cannot access huggingface due to network restrictions, they can conveniently switch the model source to modelscope through environment variables:

export MINERU_MODEL_SOURCE=modelscope\n
For more information about model source configuration and custom local model paths, please refer to the Model Source Documentation."},{"location":"usage/quick_usage/#quick-usage-via-command-line","title":"Quick Usage via Command Line","text":"

MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line:

mineru -p <input_path> -o <output_path>\n

Tip

  • <input_path>: Local PDF/image file or directory
  • <output_path>: Output directory
  • Without --api-url, the CLI launches a temporary local mineru-api
  • With --api-url, the CLI connects to an existing local or remote FastAPI service directly

For more information about output files, please refer to Output File Documentation.

Note

The command line tool will automatically attempt cuda/mps acceleration on Linux and macOS systems. Windows users who need cuda acceleration should visit the PyTorch official website to select the appropriate command for their cuda version to install acceleration-enabled torch and torchvision.

If you need to adjust parsing options through custom parameters, you can also check the more detailed Command Line Tools Usage Instructions in the documentation.

"},{"location":"usage/quick_usage/#advanced-usage-via-api-webui-http-clientserver","title":"Advanced Usage via API, WebUI, http-client/server","text":"
  • Direct Python API calls: Python Usage Example
  • FastAPI calls:
    mineru-api --host 0.0.0.0 --port 8000\n

    Tip

    Access http://127.0.0.1:8000/docs in your browser to view the API documentation.

    • Health endpoint: GET /health Returns protocol_version, processing_window_size, max_concurrent_requests, and task stats
    • Asynchronous task submission endpoint: POST /tasks
    • Synchronous parsing endpoint: POST /file_parse
    • Task query endpoints: GET /tasks/{task_id}, GET /tasks/{task_id}/result
    • API outputs are controlled by the server and written to ./output by default

    POST /tasks returns immediately with a task_id. POST /file_parse uses the same task manager internally, waits for the task to finish, and then returns the final result synchronously. Tasks are tracked only in-process for a single mineru-api instance. Task status is not preserved across service restarts, --reload, or multi-process deployments. Completed or failed tasks are retained for 24 hours by default, then their task state and output directory are cleaned automatically. After cleanup, task status and result endpoints return 404. Use MINERU_API_TASK_RETENTION_SECONDS and MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS to adjust retention and cleanup polling intervals.

    Asynchronous task submission example:

    curl -X POST http://127.0.0.1:8000/tasks \\\n  -F \"files=@demo/pdfs/demo1.pdf\" \\\n  -F \"return_md=true\"\n

    Synchronous parsing example:

    curl -X POST http://127.0.0.1:8000/file_parse \\\n  -F \"files=@demo/pdfs/demo1.pdf\" \\\n  -F \"return_md=true\" \\\n  -F \"response_format_zip=true\" \\\n  -F \"return_original_file=true\"\n

    Poll task status and fetch results:

    curl http://127.0.0.1:8000/tasks/<task_id>\ncurl http://127.0.0.1:8000/tasks/<task_id>/result\ncurl http://127.0.0.1:8000/health\n
  • Start Gradio WebUI visual frontend:

    mineru-gradio --server-name 0.0.0.0 --server-port 7860\n

    Tip

    • Access http://127.0.0.1:7860 in your browser to use the Gradio WebUI.
  • Using http-client/server method:

    # Start openai compatible server (requires vllm or lmdeploy environment)\nmineru-openai-server --port 30000\n

    Tip

    In another terminal, connect to openai server via http client

    mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000\n

Note

All officially supported vllm/lmdeploy parameters can be passed to MinerU through command line arguments, including the following commands: mineru, mineru-openai-server, mineru-gradio, mineru-api. Commonly used vllm/lmdeploy parameters and their usage are compiled in Advanced Command Line Parameters.
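The asynchronous flow described earlier in this section (submit with POST /tasks, then query GET /tasks/{task_id}) can be sketched as a small polling loop. The response schema is not documented here, so the status field and its values completed and failed are assumptions; get_json and wait_for_task are hypothetical helpers built on the Python standard library.

```python
import json
import time
import urllib.error
import urllib.request


def get_json(url):
    '''GET a URL and decode the JSON body; return None on HTTP 404.'''
    try:
        with urllib.request.urlopen(url) as resp:
            return json.loads(resp.read().decode('utf-8'))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def wait_for_task(base_url, task_id, fetch=get_json, interval=2.0, max_polls=30):
    '''Poll the task status endpoint until the task finishes.'''
    status_url = f'{base_url}/tasks/{task_id}'
    for _ in range(max_polls):
        info = fetch(status_url)
        if info is None:
            # 404: unknown task, or its state was already cleaned up.
            raise LookupError('task not found')
        # ASSUMPTION: the payload has a status field with these values.
        if info.get('status') in ('completed', 'failed'):
            return info
        time.sleep(interval)
    raise TimeoutError('task did not finish in time')
```

Once the task has finished, the result can be fetched from GET /tasks/{task_id}/result in the same way.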

"},{"location":"usage/quick_usage/#extending-mineru-functionality-with-configuration-files","title":"Extending MinerU Functionality with Configuration Files","text":"

MinerU works out of the box, and it also supports extending functionality through configuration files. You can edit the mineru.json file in your user directory to add custom configurations.

Important

The mineru.json file will be automatically generated when you use the built-in model download command mineru-models-download, or you can create it by copying the configuration template file to your user directory and renaming it to mineru.json.

Here are some available configuration options:

  • latex-delimiter-config:

    • Used to configure LaTeX formula delimiters
    • Defaults to the $ symbol; it can be changed to other symbols or strings as needed.
  • llm-aided-config:

    • Used to configure parameters for LLM-assisted title hierarchy
    • Compatible with all LLM models supporting openai protocol, defaults to using Alibaba Cloud Bailian's qwen3-next-80b-a3b-instruct model.
    • You need to configure your own API key and set enable to true to enable this feature.
    • If your API provider does not support the enable_thinking parameter, please manually remove it.
      • For example, in your configuration file, the llm-aided-config section may look like:
        \"llm-aided-config\": {\n   \"api_key\": \"your_api_key\",\n   \"base_url\": \"https://dashscope.aliyuncs.com/compatible-mode/v1\",\n   \"model\": \"qwen3-next-80b-a3b-instruct\",\n   \"enable_thinking\": false,\n   \"enable\": false\n}\n
      • To remove the enable_thinking parameter, simply delete the line containing \"enable_thinking\": false, resulting in:
        \"llm-aided-config\": {\n   \"api_key\": \"your_api_key\",\n   \"base_url\": \"https://dashscope.aliyuncs.com/compatible-mode/v1\",\n   \"model\": \"qwen3-next-80b-a3b-instruct\",\n   \"enable\": false\n}\n
  • models-dir:

    • Used to specify local model storage directory
    • Please specify model directories for pipeline and vlm backends separately.
    • After specifying the directory, you can use the local models by setting the environment variable: export MINERU_MODEL_SOURCE=local.
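
Removing enable_thinking from llm-aided-config, as described above, can also be scripted. This is a minimal sketch under stated assumptions: the file is taken to be mineru.json directly under the home directory, and the helper names are illustrative, not part of MinerU.

```python
import json
from pathlib import Path

# ASSUMPTION: 'user directory' means the home directory of the current user.
CONFIG_PATH = Path.home() / 'mineru.json'


def drop_enable_thinking(config):
    # Return a copy of the config whose llm-aided-config section has no
    # enable_thinking key; every other key is left untouched.
    updated = dict(config)
    if 'llm-aided-config' in updated:
        section = dict(updated['llm-aided-config'])
        section.pop('enable_thinking', None)
        updated['llm-aided-config'] = section
    return updated


def rewrite_config(path=CONFIG_PATH):
    # Load the file, patch it, and write it back in place.
    config = json.loads(path.read_text(encoding='utf-8'))
    patched = drop_enable_thinking(config)
    path.write_text(json.dumps(patched, indent=4, ensure_ascii=False),
                    encoding='utf-8')
```

The patch function works on a copy, so it can be checked against a sample dictionary before rewrite_config touches the real file.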
"},{"location":"zh/","title":"MinerU","text":"

\ud83d\ude80 MinerU official website \u2192 \u2705 Online version with no installation \u2705 Full-featured client \u2705 Online developer API. Skip the deployment hassle and get every product form in one click!

\ud83d\udc4b join us on Discord and WeChat

"},{"location":"zh/#_1","title":"\u9879\u76ee\u7b80\u4ecb","text":"

MinerU is a tool that converts PDFs into machine-readable formats (such as markdown and JSON), so that content can easily be extracted into any format. MinerU was born during the pre-training of InternLM. We focus on solving symbol conversion problems in scientific literature and hope to contribute to technological development in the era of large models. Compared with well-known commercial products, MinerU is still young. If you run into a problem or the results fall short of expectations, please submit an issue and attach the relevant PDF.

"},{"location":"zh/#_2","title":"\u4e3b\u8981\u529f\u80fd","text":"
  • Remove headers, footers, footnotes, page numbers, and similar elements to keep the text semantically coherent
  • Output text in human reading order, for single-column, multi-column, and complex layouts
  • Preserve the structure of the original document, including headings, paragraphs, and lists
  • Extract images, image descriptions, tables, table titles, and footnotes
  • Automatically recognize formulas in the document and convert them to LaTeX
  • Automatically recognize tables in the document and convert them to HTML
  • Automatically detect scanned PDFs and garbled PDFs, and enable OCR for them
  • OCR supports detection and recognition of 109 languages
  • Support multiple output formats, such as multimodal and NLP Markdown, JSON sorted by reading order, and an information-rich intermediate format
  • Support multiple visualization results, including layout visualization and span visualization, for efficient output confirmation and quality inspection
  • Run in a pure CPU environment, with GPU (CUDA) / NPU (CANN) / MPS acceleration also supported
  • Compatible with Windows, Linux, and Mac platforms
"},{"location":"zh/#_3","title":"\u4f7f\u7528\u6307\u5357","text":"
  • Quick Start Guide
  • Detailed Usage Instructions
"},{"location":"zh/demo/","title":"\u5728\u7ebf\u6f14\u793a","text":""},{"location":"zh/faq/","title":"\u5e38\u89c1\u95ee\u9898\u89e3\u7b54","text":""},{"location":"zh/faq/#_1","title":"\u5e38\u89c1\u95ee\u9898\u89e3\u7b54","text":"

If your question is not listed here, you can also talk to the AI assistant on DeepWiki, which resolves most common problems.

If you still cannot solve the problem, you can join the community through Discord or WeChat to talk with other users and developers.

On Ubuntu 22.04 under WSL2, the error ImportError: libGL.so.1: cannot open shared object file: No such file or directory appears

Ubuntu 22.04 under WSL2 lacks the libgl library. Install it with the following command:

sudo apt-get install libgl1-mesa-glx\n

Reference: #388
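
Whether the library from the error message can be loaded may be checked from Python before and after installing the package. This is a small sketch; the function name is illustrative and the check only confirms that the dynamic loader can find libGL.so.1.

```python
import ctypes


def libgl_available(name='libGL.so.1'):
    # ctypes.CDLL raises OSError when the shared object cannot be loaded,
    # which is the same condition behind the ImportError above.
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True
```

A result of False means the apt-get command above is still needed on that system.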

When installed and used on Linux, part of the text is missing from the parsing result.

Since version 2.0, MinerU uses pypdfium2 instead of pymupdf as the PDF page rendering engine to resolve the AGPLv3 license issue. On some Linux distributions, missing CJK fonts can cause part of the text to be lost when a PDF is rendered to images. To fix this, install the noto font packages with the following commands, which work on Ubuntu/Debian systems:

sudo apt update\nsudo apt install fonts-noto-core\nsudo apt install fonts-noto-cjk\nfc-cache -fv\n
You can also build the image with our Docker deployment method; the image includes the font packages above by default.

Reference: #2915

"},{"location":"zh/quick_start/","title":"\u5feb\u901f\u5165\u95e8","text":""},{"location":"zh/quick_start/#_1","title":"\u5feb\u901f\u5165\u95e8","text":"

If you run into any installation problem, please check the FAQ first.

"},{"location":"zh/quick_start/#_2","title":"\u5728\u7ebf\u4f53\u9a8c","text":""},{"location":"zh/quick_start/#_3","title":"\u5b98\u7f51\u5728\u7ebf\u5e94\u7528","text":"

The official online version has the same features as the client, with a polished interface and rich functionality. Login is required.

"},{"location":"zh/quick_start/#gradiodemo","title":"\u57fa\u4e8eGradio\u7684\u5728\u7ebfdemo","text":"

A WebUI built with Gradio. The interface is simple, contains only the core parsing features, and requires no login.

"},{"location":"zh/quick_start/#_4","title":"\u672c\u5730\u90e8\u7f72","text":"

Warning

Read before installing: supported hardware and software environments

To keep the project stable and reliable, we optimize and test only specific hardware and software environments during development. Users who deploy and run the project on the recommended system configurations get the best performance and the fewest compatibility problems.

By concentrating resources and effort on the mainline environments, our team can fix potential bugs more efficiently and deliver new features promptly.

In non-mainline environments, the variety of hardware and software configurations and the compatibility problems of third-party dependencies mean that we cannot guarantee the project is 100% usable. Users who want to run the project in a non-recommended environment should first read the documentation and the FAQ carefully, because most problems already have a solution in the FAQ. We also encourage the community to report problems so that we can gradually widen the range of supported environments.

Cells are listed in column order; a cell may span several adjacent backends.

Parsing backend: pipeline | *-auto-engine (hybrid, vlm) | *-http-client (hybrid, vlm)
Backend features: good compatibility | higher hardware requirements | for OpenAI-compatible servers [2]
Accuracy [1]: 82+ | 90+
Operating system: Linux [3] / Windows [4] / macOS [5]
Pure CPU platform support: \u2705 | \u274c | \u2705
GPU acceleration support: Volta or later architecture GPU, or Apple Silicon | not required
Minimum VRAM: 6GB | 10GB | 8GB | 3GB
RAM: at least 16GB, 32GB or more recommended | 8GB
Disk space: 20GB or more, SSD recommended | 2GB
Python version: 3.10-3.13

[1] Accuracy is the End-to-End Evaluation Overall score on OmniDocBench (v1.5), measured with the latest version of MinerU.
[2] A server compatible with the OpenAI API, such as a local model server deployed with an inference framework like vLLM/SGLang/LMDeploy, or a remote model service.
[3] On Linux, only distributions released in 2019 or later are supported.
[4] Because the key dependency ray does not support Python 3.13 on Windows, only versions 3.10~3.12 are supported there.
[5] macOS requires version 14.0 or later.
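
The Python version limits in the table and in footnote 4 can be checked before installing. A minimal sketch; the function name is illustrative and not part of MinerU.

```python
import platform
import sys


def python_supported(version=None, system=None):
    # Table: Python 3.10-3.13; footnote 4: on Windows only 3.10~3.12.
    major, minor = (version or sys.version_info)[:2]
    system = system or platform.system()
    upper_minor = 12 if system == 'Windows' else 13
    return major == 3 and 10 <= minor <= upper_minor
```

Called without arguments, it checks the running interpreter and operating system.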

Tip

Besides the mainstream environments and platforms above, we also record the support status of other platforms as reported by community users; see Other Accelerator Adaptations for details. If you would like to share your own environment adaptation experience with the community, you are welcome to post it in show-and-tell or submit a PR to the Other Accelerator Adaptations documentation.

"},{"location":"zh/quick_start/#mineru","title":"\u5b89\u88c5 MinerU","text":""},{"location":"zh/quick_start/#pipuvmineru","title":"\u4f7f\u7528pip\u6216uv\u5b89\u88c5MinerU","text":"
pip install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple\npip install uv -i https://mirrors.aliyun.com/pypi/simple\nuv pip install -U \"mineru[all]\" -i https://mirrors.aliyun.com/pypi/simple \n
"},{"location":"zh/quick_start/#mineru_1","title":"\u901a\u8fc7\u6e90\u7801\u5b89\u88c5MinerU","text":"
git clone https://github.com/opendatalab/MinerU.git\ncd MinerU\nuv pip install -e .[all] -i https://mirrors.aliyun.com/pypi/simple\n

Tip

mineru[all] includes all core features, is compatible with Windows / Linux / macOS, and suits the vast majority of users. If you need to choose a specific inference framework for the vlm model, or only plan to install the lightweight client on an edge device, see the Extension Modules Installation Guide.

"},{"location":"zh/quick_start/#dockermineru","title":"\u4f7f\u7528docker\u90e8\u7f72Mineru","text":"

MinerU provides a convenient Docker deployment method, which helps set up the environment quickly and avoids some tricky environment compatibility problems. See the Docker Deployment Instructions in the documentation.

"},{"location":"zh/quick_start/#mineru_2","title":"\u4f7f\u7528 MinerU","text":"

Tip

By default, parsing uses models hosted on huggingface. The required model files are downloaded automatically on first use, and later runs load the locally cached models. If you cannot access huggingface, switch to the mirror source in China with the following command:

export MINERU_MODEL_SOURCE=modelscope\n

If your device meets the GPU acceleration requirements in the table above, you can parse documents with a simple command:

mineru -p <input_path> -o <output_path>\n
If your device does not meet the GPU acceleration requirements, specify the pipeline backend to run in a pure CPU environment:
mineru -p <input_path> -o <output_path> -b pipeline\n

You can use MinerU to parse PDFs through the command line, the API, the WebUI, and other methods; see the Usage Guide for details.
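
The command-line calls above can be wrapped in a small Python helper, for example to process files from a script. This is a sketch that only shells out to the mineru command shown in this section; the function names are illustrative and it does not use any MinerU Python API.

```python
import os
import subprocess


def build_mineru_command(input_path, output_path, backend=None):
    # Mirrors: mineru -p <input_path> -o <output_path> [-b <backend>]
    command = ['mineru', '-p', str(input_path), '-o', str(output_path)]
    if backend:
        command += ['-b', backend]
    return command


def run_mineru(input_path, output_path, backend=None, model_source=None):
    # model_source (for example 'modelscope') is exported as
    # MINERU_MODEL_SOURCE for the child process only.
    env = dict(os.environ)
    if model_source:
        env['MINERU_MODEL_SOURCE'] = model_source
    command = build_mineru_command(input_path, output_path, backend)
    return subprocess.run(command, env=env, check=True)
```

For a CPU-only machine, run_mineru('demo.pdf', 'out', backend='pipeline') corresponds to the second command above; the file names here are placeholders.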

"},{"location":"zh/quick_start/docker_deployment/","title":"Docker\u90e8\u7f72","text":""},{"location":"zh/quick_start/docker_deployment/#dockermineru","title":"\u4f7f\u7528docker\u90e8\u7f72Mineru","text":"

MinerU provides a convenient Docker deployment method, which helps set up the environment quickly and avoids some tricky environment compatibility problems.

"},{"location":"zh/quick_start/docker_deployment/#dockerfile","title":"\u4f7f\u7528 Dockerfile \u6784\u5efa\u955c\u50cf","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/Dockerfile\ndocker build -t mineru:latest -f Dockerfile .\n
"},{"location":"zh/quick_start/docker_deployment/#docker","title":"Docker\u8bf4\u660e","text":"

The MinerU Docker image uses vllm/vllm-openai as its base image, so the vllm inference acceleration framework and the required dependencies are included by default. On a device that meets the requirements, you can therefore use vllm directly to accelerate VLM model inference.

Note

The requirements for accelerating VLM model inference with vllm are:

  • The device has a GPU of Volta or later architecture with at least 8G of available VRAM.
  • The GPU driver on the host supports CUDA 12.9.1 or later; check the driver version with the nvidia-smi command.
  • The Docker container can access the GPU devices of the host.
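
The driver requirement above can be checked by reading the CUDA version that nvidia-smi prints in its header. This is a rough sketch: nvidia-smi reports only major.minor, so the comparison is against 12.9 rather than 12.9.1, and the function names are illustrative.

```python
import re
import subprocess


def parse_cuda_version(smi_output):
    # The nvidia-smi header contains a field such as 'CUDA Version: 12.9'.
    match = re.search('CUDA Version: *([0-9]+)[.]([0-9]+)', smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def driver_supports(smi_output, required=(12, 9)):
    # True when the reported CUDA version is at least the required one.
    found = parse_cuda_version(smi_output)
    return found is not None and found >= required


def check_host():
    # Runs nvidia-smi on the host; raises if the command is missing or fails.
    output = subprocess.run(['nvidia-smi'], capture_output=True,
                            text=True, check=True).stdout
    return driver_supports(output)
```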
"},{"location":"zh/quick_start/docker_deployment/#docker_1","title":"\u542f\u52a8 Docker \u5bb9\u5668","text":"
docker run --gpus all \\\n  --shm-size 32g \\\n  -p 30000:30000 -p 7860:7860 -p 8000:8000 \\\n  --ipc=host \\\n  -it mineru:latest \\\n  /bin/bash\n

After running this command, you enter the interactive terminal of the Docker container, with several ports mapped for services you may want to use. You can run MinerU commands directly inside the container. You can also start a MinerU service directly by replacing /bin/bash with a service start command; see Start Services via Command for details.

"},{"location":"zh/quick_start/docker_deployment/#docker-compose","title":"\u901a\u8fc7 Docker Compose \u76f4\u63a5\u542f\u52a8\u670d\u52a1","text":"

We provide a compose.yaml file that you can use to start MinerU services quickly.

# Download the compose.yaml file\nwget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml\n

Note

  • The compose.yaml file contains configurations for several MinerU services; start only the services you need.
  • Different services may have additional parameters, which you can view and edit in the compose.yaml file.
  • Because the vllm inference acceleration framework pre-allocates VRAM, you may not be able to run several vllm services on the same machine at once. Before starting the vlm-openai-server service or using the vlm-vllm-engine backend, make sure other services that may use VRAM have been stopped.
"},{"location":"zh/quick_start/docker_deployment/#openai","title":"\u542f\u52a8 openai\u517c\u5bb9\u63a5\u53e3 \u670d\u52a1","text":"

Then connect to the openai-server through the vlm-http-client backend.

docker compose -f compose.yaml --profile openai-server up -d\n

Tip

In another terminal, connect to the openai server through the http client (only a CPU and a network connection are needed, not a vllm environment)

mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://<server_ip>:30000\n
"},{"location":"zh/quick_start/docker_deployment/#web-api","title":"\u542f\u52a8 Web API \u670d\u52a1","text":"
docker compose -f compose.yaml --profile api up -d\n

Tip

Open http://<server_ip>:8000/docs in your browser to view the API documentation.

"},{"location":"zh/quick_start/docker_deployment/#gradio-webui","title":"\u542f\u52a8 Gradio WebUI \u670d\u52a1","text":"
docker compose -f compose.yaml --profile gradio up -d\n

Tip

  • Open http://<server_ip>:7860 in your browser to use the Gradio WebUI.
"},{"location":"zh/quick_start/extension_modules/","title":"\u6269\u5c55\u6a21\u5757\u5b89\u88c5","text":""},{"location":"zh/quick_start/extension_modules/#mineru","title":"MinerU \u6269\u5c55\u6a21\u5757\u5b89\u88c5\u6307\u5357","text":"

MinerU lets you install extension modules on demand to add features or support specific model backends.

"},{"location":"zh/quick_start/extension_modules/#_1","title":"\u5e38\u89c1\u573a\u666f","text":""},{"location":"zh/quick_start/extension_modules/#_2","title":"\u6838\u5fc3\u529f\u80fd\u5b89\u88c5","text":"

The core module is the core dependency of MinerU and contains every functional module except vllm/lmdeploy. Installing it ensures that the basic features of MinerU work.

uv pip install \"mineru[core]\"\n
"},{"location":"zh/quick_start/extension_modules/#vllm-vlm","title":"\u4f7f\u7528vllm\u52a0\u901f VLM \u6a21\u578b\u63a8\u7406","text":"

Note

vllm and lmdeploy accelerate vlm inference to almost the same degree and are used in almost the same way. Choose one of them according to your situation; installing both modules at once is not recommended because of potential dependency conflicts.

The vllm module accelerates VLM model inference on GPUs of Volta or later architecture (8G of VRAM or more). Installing it can significantly speed up model inference.

uv pip install \"mineru[core,vllm]\"\n

Tip

If an error occurs while installing the extension package that includes vllm, consult the official vllm documentation to resolve it, or deploy the image with Docker instead.

"},{"location":"zh/quick_start/extension_modules/#lmdeploy-vlm","title":"\u4f7f\u7528lmdeploy\u52a0\u901f VLM \u6a21\u578b\u63a8\u7406","text":"

Note

vllm and lmdeploy accelerate vlm inference to almost the same degree and are used in almost the same way. Choose one of them according to your situation; installing both modules at once is not recommended because of potential dependency conflicts.

The lmdeploy module accelerates VLM model inference on GPUs of Volta or later architecture (8G of VRAM or more). Installing it can significantly speed up model inference.

uv pip install \"mineru[core,lmdeploy]\"\n

Tip

If an error occurs while installing the extension package that includes lmdeploy, consult the official lmdeploy documentation to resolve it.

"},{"location":"zh/quick_start/extension_modules/#clientopenai-vlm-http-client","title":"\u5b89\u88c5\u8f7b\u91cf\u7248client\u8fde\u63a5\u517c\u5bb9openai\u670d\u52a1\u5668\u4f7f\u7528 (\u9002\u7528vlm-http-client\u6a21\u5f0f)","text":"

If you need a lightweight client on an edge device that connects to an openai-compatible server in vlm mode, install the base mineru package. It is very lightweight and suits devices that have only a CPU and a network connection.

uv pip install mineru\nmineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:30000\n
"},{"location":"zh/quick_start/extension_modules/#clientopenai-hybrid-http-client","title":"\u5b89\u88c5\u8f7b\u91cf\u7248client\u8fde\u63a5\u517c\u5bb9openai\u670d\u52a1\u5668\u4f7f\u7528 (\u9002\u7528hybrid-http-client\u6a21\u5f0f)","text":"

If you need a lightweight client on an edge device that connects to an openai-compatible server in hybrid mode, install the mineru pipeline extension package. It is relatively lightweight, works on devices that have only a CPU and a network connection, and runs faster on devices that support GPU acceleration.

uv pip install \"mineru[pipeline]\"\nmineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000\n
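
The scenarios in this guide map to package specifiers as summarized in the sketch below. The scenario labels and the function name are illustrative only; the specifiers are the ones used in the install commands above.

```python
# Scenario labels are illustrative; the package specifiers come from the
# install commands in this guide.
INSTALL_SPECS = {
    'core': 'mineru[core]',
    'vllm': 'mineru[core,vllm]',
    'lmdeploy': 'mineru[core,lmdeploy]',
    'vlm-http-client': 'mineru',
    'hybrid-http-client': 'mineru[pipeline]',
}


def install_command(scenario):
    # Build the uv command for one scenario; unknown labels raise KeyError.
    return ['uv', 'pip', 'install', INSTALL_SPECS[scenario]]
```

The vllm and lmdeploy entries are kept separate because the guide advises against installing both.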
"},{"location":"zh/reference/","title":"\u53c2\u8003\u8d44\u6599","text":""},{"location":"zh/reference/#_1","title":"\u53c2\u8003\u6587\u6863","text":"

This chapter provides detailed reference material for the MinerU project: technical specifications, API documentation, output file format descriptions, and version history.

"},{"location":"zh/reference/#_2","title":"\u76ee\u5f55","text":"
  • Output Files - a detailed description of all output files and their formats
  • Changelog - version update history and release notes
"},{"location":"zh/reference/#_3","title":"\u6587\u6863\u6982\u89c8","text":""},{"location":"zh/reference/#_4","title":"\u8f93\u51fa\u6587\u4ef6\u8bf4\u660e","text":"

Understanding the output files that MinerU generates is essential for using the tool effectively. The output file documentation covers:

  • Visual debugging files: help you understand the document parsing process
  • Structured data files: contain detailed parsing results for further processing
  • File format specifications: a detailed description of each output file type
"},{"location":"zh/reference/#_5","title":"\u66f4\u65b0\u65e5\u5fd7","text":"

The changelog records how MinerU has evolved, including:

  • Version updates: new features and improvements in each version
  • Bug fixes: problems resolved in each version
  • Breaking changes: important changes that may affect your usage
  • Deprecations: features that are being phased out
"},{"location":"zh/reference/changelog/","title":"\u66f4\u65b0\u65e5\u5fd7","text":""},{"location":"zh/reference/changelog/#_1","title":"\u66f4\u65b0\u65e5\u5fd7","text":"

This document records the update history of MinerU 2.6.7 and earlier versions. For updates in the latest versions, see the project README.

"},{"location":"zh/reference/changelog/#26","title":"2.6 \u7cfb\u5217\u7248\u672c","text":""},{"location":"zh/reference/changelog/#267-20251212","title":"2.6.7 (2025/12/12)","text":"
  • Bug fix: #4168
"},{"location":"zh/reference/changelog/#266-20251202","title":"2.6.6 (2025/12/02)","text":"

Ascend adaptation improvements

  • Optimized the initialization flow of the command-line tool so that the vlm-vllm-engine backend of the Ascend adaptation is available in the command-line tool.
  • Updated the adaptation documentation for Atlas 300I Duo (310p) devices.

mineru-api tool improvements

  • Added descriptive text to the mineru-api interface parameters to make the interface documentation more readable.
  • The automatically generated interface documentation page can be turned on or off with the environment variable MINERU_API_ENABLE_FASTAPI_DOCS; it is enabled by default.
  • Added a concurrency option for the vlm-vllm-async-engine, vlm-lmdeploy-engine, and vlm-http-client backends. The environment variable MINERU_API_MAX_CONCURRENT_REQUESTS sets the maximum number of concurrent requests to the api interface; by default the number is unlimited.
"},{"location":"zh/reference/changelog/#265-20251126","title":"2.6.5 (2025/11/26)","text":"
  • Added support for the new backend vlm-lmdeploy-engine. It is used in a similar way to vlm-vllm-(async)engine but uses lmdeploy as the inference engine and, unlike vllm, also supports native inference acceleration on Windows.
  • Added adaptation support for the domestic compute platforms Ascend/npu, T-Head/ppu, and MetaX/maca. On these platforms users can run the pipeline and vlm models and accelerate vlm inference with the vllm/lmdeploy engines; see Other Accelerator Adaptations for usage details.
  • Adapting to domestic platforms is not easy. We have done our best to make the adaptations complete and stable, but stability, compatibility, and accuracy-alignment problems may remain. Please choose a suitable environment and scenario according to the traffic-light status shown on the adaptation documentation page.
  • If you run into a problem with a domestic platform adaptation that the documentation does not cover, please report it in the designated discussions thread so that other users can find the solution.
"},{"location":"zh/reference/changelog/#264-20251104","title":"2.6.4 (2025/11/04)","text":"
  • Added a timeout for rendering PDF pages to images. The default is 300 seconds and can be changed with the environment variable MINERU_PDF_RENDER_TIMEOUT, which prevents some abnormal PDF files from blocking the rendering process for a long time.
  • Added a CPU thread count option for onnx models. The default is the number of system CPU cores and can be changed with the environment variables MINERU_INTRA_OP_NUM_THREADS and MINERU_INTER_OP_NUM_THREADS, which reduces contention for CPU resources in high-concurrency scenarios.
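
The three environment variables above can be collected in one place before MinerU is started. This is a minimal sketch, assuming the variables take plain integer values (the timeout in seconds, as the default of 300 seconds suggests); the function names are illustrative.

```python
import os


def tuning_env(render_timeout_s=None, intra_threads=None, inter_threads=None):
    # Only variables that are explicitly given are set, so the documented
    # defaults (300 seconds, system CPU core count) apply otherwise.
    settings = {
        'MINERU_PDF_RENDER_TIMEOUT': render_timeout_s,
        'MINERU_INTRA_OP_NUM_THREADS': intra_threads,
        'MINERU_INTER_OP_NUM_THREADS': inter_threads,
    }
    return {name: str(value) for name, value in settings.items()
            if value is not None}


def apply_env(env):
    # Export into the current process before MinerU is started from it.
    os.environ.update(env)
```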
"},{"location":"zh/reference/changelog/#263-20251031","title":"2.6.3 (2025/10/31)","text":"
  • Added support for the new backend vlm-mlx-engine, which uses MLX to accelerate MinerU2.5 model inference on Apple Silicon devices. Compared with the vlm-transformers backend, vlm-mlx-engine is 100%~200% faster.
  • Bug fixes: #3849 #3859
"},{"location":"zh/reference/changelog/#262-20251024","title":"2.6.2 (2025/10/24)","text":"

pipeline backend improvements

  • Added experimental support for Chinese formulas, enabled by setting the environment variable export MINERU_FORMULA_CH_SUPPORT=1. This feature may slightly reduce MFR speed and cause some long formulas to fail recognition, so enable it only when Chinese formulas need to be parsed. To turn it off, set the environment variable to 0.
  • OCR speed improved substantially, by 200%~300%; thanks to @cjsdurj for the optimization.
  • The OCR models improve the accuracy and coverage of Latin-script recognition, and the Cyrillic (cyrillic), Arabic (arabic), Devanagari (devanagari), Telugu (te), and Tamil (ta) language families are updated to ppocr-v5, with accuracy more than 40% higher than the previous-generation models.

vlm backend improvements

  • Improved the matching logic for table_caption and table_footnote, raising the matching accuracy and reading-order quality of table titles and footnotes when a page contains several consecutive tables.
  • Reduced CPU usage under high concurrency with the vllm backend, lowering the load on the server.
  • Adapted to vllm 0.11.0.

General improvements

  • Improved cross-page table merging and added support for merging continuation tables across pages, which improves table merging in multi-column merge scenarios.
  • Added the environment variable MINERU_TABLE_MERGE_ENABLE for table merging. Table merging is enabled by default and can be turned off by setting the variable to 0.
"},{"location":"zh/reference/changelog/#25","title":"2.5 \u7cfb\u5217\u7248\u672c","text":""},{"location":"zh/reference/changelog/#254-20250926","title":"2.5.4 (2025/09/26)","text":"
  • \ud83c\udf89\ud83c\udf89 The MinerU2.5 technical report is now available. Read it for a full picture of the model architecture, training strategy, data engineering, and evaluation results.
  • Fixed an issue where some PDF files were identified as AI files and could not be parsed.
"},{"location":"zh/reference/changelog/#253-20250920","title":"2.5.3 (2025/09/20)","text":"
  • Adjusted dependency version ranges so that GPUs with Turing and earlier architectures can use vLLM to accelerate MinerU2.5 model inference.
  • Compatibility fixes in the pipeline backend for torch 2.8.0.
  • Lowered the default concurrency of the vLLM async backend to reduce server load and avoid connection-closed errors under heavy load.
  • For more compatibility details, see the announcement.
"},{"location":"zh/reference/changelog/#252-20250919","title":"2.5.2 (2025/09/19)","text":"

We are officially releasing MinerU2.5, currently the strongest multimodal large model for document parsing. With only 1.2B parameters, MinerU2.5 surpasses top multimodal models such as Gemini2.5-Pro, GPT-4o, and Qwen2.5-VL-72B across the board in accuracy on the OmniDocBench document parsing benchmark, and is significantly ahead of mainstream dedicated document parsing models (such as dots.ocr, MonkeyOCR, and PP-StructureV3).

The model has been published on HuggingFace and ModelScope and is available for download.

Core highlights

  • Extreme efficiency, SOTA performance: at a lightweight 1.2B scale, it achieves SOTA performance beyond models with tens or even hundreds of billions of parameters, redefining the efficiency of document parsing.
  • Advanced architecture, leading across the board: by combining \"two-stage inference\" (decoupling layout analysis from content recognition) with a native high-resolution architecture, it reaches SOTA level in five areas: layout analysis, text recognition, formula recognition, table recognition, and reading order.

Key capability improvements

  • Layout detection: more complete results that accurately cover non-body content such as headers, footers, and page numbers, with more precise element localization and more natural format restoration (for example lists and references).
  • Table parsing: greatly improved parsing of rotated tables, borderless or sparsely ruled tables, and long, difficult tables.
  • Formula recognition: significantly higher accuracy on mixed Chinese-English formulas and complex long formulas, greatly improving the parsing of mathematical documents.

Repository changes

In addition, along with the release of vlm 2.5, we made some changes to the repository:

  • The vlm backend is upgraded to version 2.5 and supports the MinerU2.5 model. It is no longer compatible with the MinerU2.0-2505-0.9B model; the last version that supports the 2.0 model is mineru-2.2.2.
  • vlm inference code has been moved to mineru_vl_utils, reducing coupling with the main mineru repository and making independent iteration easier.
  • The vlm accelerated inference framework has switched from sglang to vllm, with full compatibility with the vllm ecosystem, so users can run and accelerate the MinerU2.5 model on any platform that supports vllm.
  • Because of the major vlm model upgrade and its support for more layout types, we adjusted the structure of the intermediate file middle.json and the result file content_list.json. See the documentation for details.

Other repository optimizations

  • Removed the file-extension whitelist check on input files. When the input is a PDF document or an image, the file extension no longer matters, which improves ease of use.
"},{"location":"zh/reference/changelog/#22-24","title":"2.2 - 2.4 \u7cfb\u5217\u7248\u672c","text":""},{"location":"zh/reference/changelog/#222-20250910","title":"2.2.2 (2025/09/10)","text":"
  • Fixed an issue where a failure of the new table recognition model on some tables affected the overall parsing task.
"},{"location":"zh/reference/changelog/#221-20250908","title":"2.2.1 (2025/09/08)","text":"
  • Fixed an issue where some newly added models were not downloaded when using the model download command.
"},{"location":"zh/reference/changelog/#220-20250905","title":"2.2.0 (2025/09/05)","text":"

Major updates

  • This release focuses on table parsing accuracy. By introducing a new bordered-table recognition model and a brand-new hybrid table structure parsing algorithm, the table recognition capability of the pipeline backend is significantly improved.
  • We also added support for merging tables across pages. This feature works with both the pipeline and vlm backends and further improves the completeness and accuracy of table parsing.

Other updates

  • The pipeline backend can now parse tables rotated by 270 degrees, so tables in three orientations (0/90/270 degrees) are supported.
  • Added OCR support for Thai and Greek to the pipeline and updated the English OCR model to the latest version: English recognition accuracy improved by 11%, the Thai recognition model reaches 82.68% accuracy, and the Greek recognition model reaches 89.28% (by PPOCRv5).
  • Added a bbox field (mapped to the 0-1000 range) to the output content_list.json, so users can read the position of each content block directly.
  • Removed the pipeline_old_linux install option. Older Linux systems such as Centos 7 are no longer supported, which allows better support for uv commands such as sync and run.
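
To make the 0-1000 mapping of the bbox field concrete, the sketch below converts a normalized box back to pixel coordinates. It assumes `bbox` is ordered `[x0, y0, x1, y1]` and that each coordinate was scaled by the page width or height; both points are assumptions for illustration, and the function name `bbox_to_pixels` is hypothetical.

```python
def bbox_to_pixels(bbox, page_width, page_height):
    # Assumed layout: [x0, y0, x1, y1], each value in the 0-1000 range.
    x0, y0, x1, y1 = bbox
    return [
        round(x0 / 1000 * page_width),
        round(y0 / 1000 * page_height),
        round(x1 / 1000 * page_width),
        round(y1 / 1000 * page_height),
    ]
```

Working in a fixed 0-1000 range keeps the coordinates independent of the resolution at which a page was rendered, so the same box can be drawn on an image of any size.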
"},{"location":"zh/reference/changelog/#21","title":"2.1 \u7cfb\u5217\u7248\u672c","text":""},{"location":"zh/reference/changelog/#2110-20250801","title":"2.1.10 (2025/08/01)","text":"
  • Fixed parsing results in the pipeline backend that did not match expectations because of overlapping blocks #3232
"},{"location":"zh/reference/changelog/#219-20250730","title":"2.1.9 (2025/07/30)","text":"
  • Adapted to transformers 4.54.1
"},{"location":"zh/reference/changelog/#218-20250728","title":"2.1.8 (2025/07/28)","text":"
  • Adapted to sglang 0.4.9.post5
"},{"location":"zh/reference/changelog/#217-20250727","title":"2.1.7 (2025/07/27)","text":"
  • Adapted to transformers 4.54.0
"},{"location":"zh/reference/changelog/#216-20250726","title":"2.1.6 (2025/07/26)","text":"
  • Fixed abnormal tables when the vlm backend parses some handwritten documents
  • Fixed drifting visualization boxes when a document is rotated #3175
"},{"location":"zh/reference/changelog/#215-20250724","title":"2.1.5 (2025/07/24)","text":"
  • Adapted to sglang 0.4.9 and upgraded the dockerfile base image to sglang 0.4.9.post3
"},{"location":"zh/reference/changelog/#214-20250723","title":"2.1.4 (2025/07/23)","text":"

Bug fixes

  • Fixed excessive GPU memory consumption in the MFR step of the pipeline backend in some cases #2771
  • Fixed inaccurate matching between image/table and caption/footnote in some cases #3129
"},{"location":"zh/reference/changelog/#211-20250716","title":"2.1.1 (2025/07/16)","text":"

Bug fixes

  • Fixed loss of text block content that could occur in the pipeline in some cases #3005
  • Fixed sglang-client requiring unnecessary packages such as torch #2968
  • Updated the dockerfile to fix incomplete parsed text caused by missing fonts on linux #2915

Usability updates

  • Updated compose.yaml so users can start the sglang-server, mineru-api, and mineru-gradio services directly
  • Launched a brand-new online documentation site and simplified the readme for a better documentation experience
"},{"location":"zh/reference/changelog/#210-20250705","title":"2.1.0 (2025/07/05)","text":"

This is the first major update of MinerU 2. It includes a large number of new features and improvements, along with many performance optimizations, usability improvements, and bug fixes. The details are as follows:

Performance optimizations

  • Greatly improved preprocessing speed for documents at certain resolutions (long side around 2000 pixels)
  • Greatly improved post-processing speed when the pipeline backend batch-processes many documents with few pages (<10)
  • Layout analysis in the pipeline backend is about 20% faster

Usability improvements

  • Built-in, ready-to-use fastapi service and gradio webui; see the documentation for detailed usage
  • Adapted sglang to version 0.4.8, greatly lowering the GPU memory requirement of the vlm-sglang backend, which can now run on GPUs with as little as 8G of memory (Turing architecture and later)
  • Added sglang parameter passthrough to all commands, so the sglang-engine backend accepts all sglang parameters, the same as sglang-server
  • Support for configuration-file-based feature extensions, including custom formula delimiters, enabling heading levels, and a custom local model directory; see the documentation for detailed usage

New features

  • The pipeline backend is updated with the PP-OCRv5 multilingual text recognition model, supporting text recognition in 37 languages including French, Spanish, Portuguese, Russian, and Korean, with an average accuracy gain of more than 30%. Details
  • The pipeline backend adds limited support for vertical text
"},{"location":"zh/reference/changelog/#20","title":"2.0 \u7cfb\u5217\u7248\u672c","text":""},{"location":"zh/reference/changelog/#206-20250620","title":"2.0.6 (2025/06/20)","text":"
  • Fixed parsing interruptions in vlm mode caused by occasional invalid block content
  • Fixed parsing interruptions in vlm mode caused by incomplete table structures
"},{"location":"zh/reference/changelog/#205-20250617","title":"2.0.5 (2025/06/17)","text":"
  • Fixed an issue where models still had to be downloaded in sglang-client mode
  • Fixed an issue where sglang-client mode depended on packages such as torch that are not needed at runtime
  • Fixed an issue where only the first instance took effect when starting multiple sglang-client instances with different urls in the same process
"},{"location":"zh/reference/changelog/#203-20250615","title":"2.0.3 (2025/06/15)","text":"
  • Fixed incorrect key-value updates in the configuration file when the download model type is set to all
  • Fixed an issue where the formula and table feature switches had no effect in command-line mode, so the features could not be turned off
  • Fixed a compatibility issue with sglang 0.4.7 in sglang-engine mode
  • Updated the Dockerfile and related installation documentation for deploying the full version of MinerU in an sglang environment
"},{"location":"zh/reference/changelog/#200-20250613","title":"2.0.0 (2025/06/13)","text":"

New architecture

MinerU 2.0 has been deeply refactored in both code structure and interaction design, significantly improving usability, maintainability, and extensibility.

  • Removed third-party dependency restrictions: the dependency on pymupdf has been removed completely, moving the project toward a more open and license-compliant open-source direction.
  • Works out of the box with easy configuration: there is no need to edit a JSON configuration file by hand, and the vast majority of parameters can be set directly from the command line or the API.
  • Automatic model management: a new mechanism downloads and updates models automatically, so users can deploy models without manual intervention.
  • Offline-deployment friendly: a built-in model download command supports deployment in fully disconnected environments.
  • Leaner code structure: thousands of lines of redundant code were removed and class inheritance was simplified, significantly improving readability and development efficiency.
  • Unified intermediate format output: the standardized middle_json format is compatible with most secondary-development scenarios built on it, ensuring seamless migration for ecosystem applications.

New model

MinerU 2.0 integrates our latest small-parameter, high-performance multimodal document parsing model, delivering fast, high-accuracy end-to-end document understanding.

  • Small model, big capability: with fewer than 1B parameters, it surpasses traditional 72B-class vision-language models (VLM) in parsing accuracy.
  • Multiple functions in one: a single model covers core tasks including multilingual recognition, handwriting recognition, layout analysis, table parsing, formula recognition, and reading-order sorting.
  • Extreme inference speed: with sglang acceleration on a single NVIDIA 4090, peak throughput exceeds 10,000 token/s, easily handling large-scale document processing.
  • Online demo: you can try our new VLM model on MinerU.net, Hugging Face, and ModelScope

Incompatible changes

To improve the overall architecture and long-term maintainability, this release contains some incompatible changes:

  • The Python package name has changed from magic-pdf to mineru, and the command-line tool has also changed from magic-pdf to mineru. Please update your scripts and commands accordingly.
  • For reasons of modular design and ecosystem consistency, MinerU 2.0 no longer bundles the LibreOffice document conversion module. To process Office documents, we recommend converting them to PDF first through a separately deployed LibreOffice service and then parsing the result.
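
A minimal sketch of the conversion step, assuming LibreOffice is installed locally and its `soffice` executable is on PATH (a local install rather than a separately deployed service, chosen only to keep the example short). The function names `build_convert_command` and `convert_to_pdf` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(office_file, out_dir):
    # soffice runs headless and writes <stem>.pdf into out_dir.
    return ['soffice', '--headless', '--convert-to', 'pdf',
            '--outdir', str(out_dir), str(office_file)]


def convert_to_pdf(office_file, out_dir):
    # Run the conversion and return the path where the PDF is expected.
    subprocess.run(build_convert_command(office_file, out_dir), check=True)
    return Path(out_dir) / (Path(office_file).stem + '.pdf')
```

The resulting PDF can then be passed to the mineru command in place of the original Office file.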
"},{"location":"zh/reference/changelog/#1x","title":"1.x \u7cfb\u5217\u5386\u53f2\u7248\u672c","text":""},{"location":"zh/reference/changelog/#1312-20250524","title":"1.3.12 (2025/05/24)","text":"

Added support for the ppocrv5 models: the ch_server model is updated to PP-OCRv5_rec_server and the ch_lite model is updated to PP-OCRv5_rec_mobile (model update required)

  • In testing, ppocrv5(server) showed some improvement on handwritten documents but slightly lower accuracy than v4_server_doc on other document types, so the default ch model is unchanged and remains PP-OCRv4_server_rec_doc.
  • Because ppocrv5 is stronger at handwriting and special characters, you can manually select the ppocrv5 model for mixed Japanese and Traditional Chinese content and for handwritten documents.
  • You can select the model with the lang parameter, lang='ch_server' (python api) or --lang ch_server (command line):
  • ch: PP-OCRv4_rec_server_doc (default) (mixed Chinese, English, Japanese, and Traditional Chinese / 15k-character dictionary)
  • ch_server: PP-OCRv5_rec_server (mixed Chinese, English, Japanese, and Traditional Chinese plus handwriting / 18k-character dictionary)
  • ch_lite: PP-OCRv5_rec_mobile (mixed Chinese, English, Japanese, and Traditional Chinese plus handwriting / 18k-character dictionary)
  • ch_server_v4: PP-OCRv4_rec_server (mixed Chinese and English / 6k-character dictionary)
  • ch_lite_v4: PP-OCRv4_rec_mobile (mixed Chinese and English / 6k-character dictionary)
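
The mapping above can be written down as a small lookup table. This is only an illustrative sketch: `OCR_MODELS` and `resolve_ocr_model` are hypothetical names, not MinerU APIs, and the real selection logic may differ.

```python
# Keys and model names are copied from the list above.
OCR_MODELS = {
    'ch': 'PP-OCRv4_rec_server_doc',
    'ch_server': 'PP-OCRv5_rec_server',
    'ch_lite': 'PP-OCRv5_rec_mobile',
    'ch_server_v4': 'PP-OCRv4_rec_server',
    'ch_lite_v4': 'PP-OCRv4_rec_mobile',
}


def resolve_ocr_model(lang='ch'):
    # Hypothetical lookup; unknown keys raise instead of silently falling back.
    if lang not in OCR_MODELS:
        raise ValueError('unsupported lang: ' + lang)
    return OCR_MODELS[lang]
```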

Added support for handwritten documents: by improving how layout detection recognizes handwritten text regions, handwritten documents can now be parsed

  • This feature is supported by default and needs no extra configuration
  • As described above, you can manually select the ppocrv5 model for better results on handwritten documents

The huggingface and modelscope demos have been updated to a version that supports handwriting recognition and the ppocrv5 models, and can be tried online

"},{"location":"zh/reference/changelog/#1310-20250429","title":"1.3.10 (2025/04/29)","text":"
  • Added support for custom formula delimiters, configured through the latex-delimiter-config item in the magic-pdf.json file in the user directory.
"},{"location":"zh/reference/changelog/#139-20250427","title":"1.3.9 (2025/04/27)","text":"
  • Improved formula parsing, raising the success rate of formula rendering
"},{"location":"zh/reference/changelog/#138-20250423","title":"1.3.8 (2025/04/23)","text":"

The default ocr model (ch) is updated to PP-OCRv4_server_rec_doc (model update required)

  • PP-OCRv4_server_rec_doc is built on PP-OCRv4_server_rec and trained on a mix of additional Chinese document data and PP-OCR training data. It adds recognition of some Traditional Chinese characters, Japanese, and special characters, and supports more than 15,000 characters. Besides better recognition of document text, general text recognition is also improved.
  • PP-OCRv4_server_rec_doc/PP-OCRv4_server_rec/PP-OCRv4_mobile_rec performance comparison
  • In our verification, the PP-OCRv4_server_rec_doc model is clearly more accurate on Chinese, English, Japanese, and Traditional Chinese, whether a single language or a mix, at a speed comparable to PP-OCRv4_server_rec, making it suitable for the vast majority of scenarios.
  • In a small number of pure-English cases PP-OCRv4_server_rec_doc may merge adjacent words, whereas PP-OCRv4_server_rec performs better there. We therefore kept the PP-OCRv4_server_rec model, which can be selected with the parameter lang='ch_server' (python api) or --lang ch_server (command line).
"},{"location":"zh/reference/changelog/#137-20250422","title":"1.3.7 (2025/04/22)","text":"
  • Fixed the lang parameter being ignored when the table parsing model is initialized
  • Fixed a large slowdown of ocr and table parsing in cpu mode
"},{"location":"zh/reference/changelog/#134-20250416","title":"1.3.4 (2025/04/16)","text":"
  • Slightly improved ocr-det speed by removing some useless blocks
  • Fixed in-page ordering errors caused by footnotes in some cases
"},{"location":"zh/reference/changelog/#132-20250412","title":"1.3.2 (2025/04/12)","text":"
  • Fixed incompatible dependency versions when installing in a python3.13 environment on windows
  • Reduced memory usage during batch inference
  • Improved parsing of tables rotated by 90 degrees
  • Improved parsing of very large tables in financial report samples
  • Fixed occasional merged words in English text regions when no OCR language is specified (model update required)
"},{"location":"zh/reference/changelog/#131-20250408","title":"1.3.1 (2025/04/08)","text":"

Fixed some compatibility issues

  • Added support for python 3.13
  • Made a final adaptation for some outdated linux systems (such as centos7); continued support in later versions is no longer guaranteed. Installation instructions
"},{"location":"zh/reference/changelog/#130-20250403","title":"1.3.0 (2025/04/03)","text":"

Installation and compatibility optimizations

  • Resolved compatibility issues caused by detectron2 by removing the use of layoutlmv3 in layout
  • Extended torch compatibility to 2.2~2.6 (excluding 2.5)
  • CUDA 11.8/12.4/12.6/12.8 are supported (the cuda version is determined by torch), resolving compatibility issues for some users with 50-series and H-series GPUs
  • Extended python compatibility to 3.10~3.12, fixing the automatic downgrade to 0.6.1 when installing in a non-3.10 environment
  • Streamlined offline deployment: after a successful deployment, no model files need to be downloaded from the network

Performance optimizations

  • Improved parsing speed for batches of small files by supporting batch processing of multiple pdf files (script example). Compared with version 1.0.1, formula parsing is up to more than 1400% faster and overall parsing is up to more than 500% faster.
  • Reduced GPU memory usage and improved parsing speed by optimizing how the mfr model is loaded and used (re-run the model download process to get the incremental model file update)
  • Reduced GPU memory usage; the project now runs with as little as 6GB
  • Improved running speed on mps devices

Parsing quality optimizations

  • Updated the mfr model to unimernet(2503), fixing lost line breaks in multi-line formulas

Usability optimizations

  • Replaced the paddle framework and paddleocr in the project entirely with paddleocr2torch, resolving conflicts between paddle and torch as well as thread-safety issues caused by the paddle framework
  • Added a real-time progress bar to the parsing process, so parsing progress can be tracked precisely
"},{"location":"zh/reference/changelog/#121-20250303","title":"1.2.1 (2025/03/03)","text":"

Fixed some issues

  • Fixed punctuation being affected by the full-width to half-width conversion of letters and digits
  • Fixed inaccurate caption matching in some cases
  • Fixed missing formula spans in some cases
"},{"location":"zh/reference/changelog/#120-20250224","title":"1.2.0 (2025/02/24)","text":"

In this release we fixed some issues and improved parsing efficiency and accuracy:

Performance optimizations

  • Faster classification of pdf documents in auto mode

Parsing optimizations

  • Improved the parsing logic for documents with watermarks, significantly improving results on such documents
  • Improved the logic for matching multiple images/tables with captions on a single page, raising the accuracy of image-text matching in complex layouts

Bug fixes

  • Fixed errors caused by image/table spans being placed into text blocks in some cases
  • Fixed empty title blocks in some cases
"},{"location":"zh/reference/changelog/#110-20250122","title":"1.1.0 (2025/01/22)","text":"

This release focuses on parsing accuracy and efficiency:

Model upgrades (re-run the model download process to get the incremental model file update)

  • The layout recognition model is upgraded to the latest doclayout_yolo(2501) model, improving layout recognition accuracy
  • The formula parsing model is upgraded to the latest unimernet(2501) model, improving formula recognition accuracy

Performance optimizations

  • On devices that meet certain requirements (16GB+ of GPU memory), overall parsing speed is more than 50% higher thanks to reduced resource usage and a restructured processing pipeline

Parsing quality optimizations

  • Added a heading-level feature to the online demos (mineru.net / huggingface / modelscope) (beta, enabled by default), which assigns levels to headings and makes documents more structured
"},{"location":"zh/reference/changelog/#101-20250110","title":"1.0.1 (2025/01/10)","text":"

This is our first official release. Through extensive refactoring, it introduces a brand-new API, broader compatibility, and a new automatic language identification feature:

New API

  • For the data-side API, we introduced the Dataset class, designed to provide a powerful and flexible data processing framework. The framework currently supports multiple document formats, including images (.jpg and .png), PDF, Word (.doc and .docx), and PowerPoint (.ppt and .pptx), so that data processing tasks from simple to complex are supported effectively.
  • For the user-side API, we designed MinerU's processing workflow as a series of composable Stages. Each Stage represents a specific processing step; users can define new Stages to suit their needs and combine them to build customized data processing workflows.

Broader compatibility

  • Optimized the dependency environment and configuration options to ensure stable, efficient operation on ARM-architecture Linux systems.
  • Deep adaptation for Huawei Ascend NPU acceleration, in response to domestic IT innovation (Xinchuang) requirements, providing independently controllable high-performance computing capability and supporting the localization and development of AI application platforms. NPU acceleration tutorial

Automatic language identification

  • With the newly introduced language identification model, setting lang to auto during document parsing automatically selects the appropriate OCR language model, improving the accuracy of scanned-document parsing.
"},{"location":"zh/reference/changelog/#0x","title":"0.x series historical versions","text":""},{"location":"zh/reference/changelog/#0100-20241122","title":"0.10.0 (2024/11/22)","text":"

By introducing hybrid OCR text extraction:

  • Significantly improved parsing results in complex text-distribution scenarios such as dense formulas, irregular span regions, and text rendered as images
  • Combines the accurate content extraction and higher speed of text mode with the more accurate span/line region recognition of OCR mode
"},{"location":"zh/reference/changelog/#093-20241115","title":"0.9.3 (2024/11/15)","text":"

Integrated RapidTable for table recognition, making single-table parsing more than 10 times faster with higher accuracy and lower VRAM usage

"},{"location":"zh/reference/changelog/#092-20241106","title":"0.9.2 (2024/11/06)","text":"

Integrated the StructTable-InternVL2-1B model for table recognition

"},{"location":"zh/reference/changelog/#090-20241031","title":"0.9.0 (2024/10/31)","text":"

This is a brand-new version with extensive code refactoring. It resolves numerous issues, improves performance, lowers hardware requirements, and improves usability:

  • Refactored the sorting module to use layoutreader for reading-order sorting, ensuring very high accuracy across various layouts
  • Refactored the paragraph concatenation module to achieve good results across columns, pages, figures, and tables
  • Refactored list and table-of-contents recognition, greatly improving the accuracy of list blocks and table-of-contents blocks and the parsing of the corresponding text paragraphs
  • Refactored the logic that matches figures and tables with descriptive text, greatly improving the accuracy of matching captions and footnotes to figures and tables and reducing the loss rate of descriptive text to nearly 0
  • Added multilingual OCR support, covering detection and recognition of 84 languages; see the OCR language support list
  • Added VRAM reclamation logic and other VRAM optimizations, greatly lowering VRAM requirements. The VRAM requirement with all acceleration features except table acceleration enabled (layout/formula/OCR) dropped from 16GB to 8GB, and with all acceleration features enabled it dropped from 24GB to 10GB
  • Optimized the feature switches in the configuration file and added a standalone formula detection switch; disabling formula detection when it is not needed can greatly improve speed and parsing results
  • Integrated PDF-Extract-Kit 1.0
  • Added the self-developed doclayout_yolo model, which is more than 10 times faster than the previous solution with comparable parsing results and can be switched freely with layoutlmv3 via the configuration file
  • Upgraded formula parsing to unimernet 0.2.1, improving formula parsing accuracy while greatly reducing VRAM requirements
  • Because PDF-Extract-Kit 1.0 moved to a new repository, models must be downloaded again; see How to Download Models for the steps
"},{"location":"zh/reference/changelog/#081-20240927","title":"0.8.1 (2024/09/27)","text":"

Fixed some bugs and provided a locally deployable version of the online demo along with a front-end interface

"},{"location":"zh/reference/changelog/#080-20240909","title":"0.8.0 (2024/09/09)","text":"

Added support for quick deployment via Dockerfile and launched demos on huggingface and modelscope

"},{"location":"zh/reference/changelog/#071-20240830","title":"0.7.1 (2024/08/30)","text":"

Integrated the paddle tablemaster table recognition feature

"},{"location":"zh/reference/changelog/#070b1-20240809","title":"0.7.0b1 (2024/08/09)","text":"

Simplified the installation steps for better usability and added table recognition

"},{"location":"zh/reference/changelog/#062b1-20240801","title":"0.6.2b1 (2024/08/01)","text":"

Resolved dependency conflicts and improved the installation documentation

"},{"location":"zh/reference/changelog/#20240705","title":"Initial open-source release (2024/07/05)","text":"

First open-source release of the MinerU project

"},{"location":"zh/reference/output_files/","title":"Output File Format","text":""},{"location":"zh/reference/output_files/#mineru","title":"MinerU Output Files","text":""},{"location":"zh/reference/output_files/#_1","title":"Overview","text":"

After the mineru command runs, in addition to the main markdown file it generates several auxiliary files for debugging, quality inspection, and further processing. These files include:

  • Visual debugging files: help users intuitively understand the document parsing process and results
  • Structured data files: contain detailed parsing data that can be used for secondary development

The purpose and format of each file are described in detail below.

"},{"location":"zh/reference/output_files/#_2","title":"Visual Debugging Files","text":""},{"location":"zh/reference/output_files/#layoutpdf","title":"Layout Analysis File (layout.pdf)","text":"

File naming format: {original_filename}_layout.pdf

Functionality:

  • Visualizes the layout analysis results of each page
  • The number in the upper-right corner of each detection box indicates the reading order
  • Different background color blocks distinguish different types of content blocks

Use cases:

  • Check whether the layout analysis is correct
  • Confirm that the reading order is reasonable
  • Debug layout-related issues

"},{"location":"zh/reference/output_files/#spanpdf","title":"Text Span File (span.pdf)","text":"

Note

Applies only to the pipeline backend

File naming format: {original_filename}_span.pdf

Functionality:

  • Annotates page content with differently colored outlines according to span type
  • Used for quality inspection and troubleshooting

Use cases:

  • Quickly troubleshoot missing text
  • Check inline formula recognition
  • Verify text segmentation accuracy

"},{"location":"zh/reference/output_files/#_3","title":"Structured Data Files","text":"

Important

The output of the vlm backend changed significantly in version 2.5 and is not compatible with the pipeline version. If you plan to do secondary development based on the structured output, please read this document carefully.

"},{"location":"zh/reference/output_files/#pipeline","title":"pipeline Backend Output","text":""},{"location":"zh/reference/output_files/#modeljson","title":"Model Inference Results (model.json)","text":"

File naming format: {original_filename}_model.json

"},{"location":"zh/reference/output_files/#_4","title":"Data Structure Definition","text":"
from pydantic import BaseModel, Field\nfrom enum import IntEnum\n\nclass CategoryType(IntEnum):\n    \"\"\"Content category enum\"\"\"\n    title = 0               # Title\n    plain_text = 1          # Text\n    abandon = 2             # Includes headers, footers, page numbers, and page annotations\n    figure = 3              # Image\n    figure_caption = 4      # Image caption\n    table = 5               # Table\n    table_caption = 6       # Table caption\n    table_footnote = 7      # Table footnote\n    isolate_formula = 8     # Interline formula\n    formula_caption = 9     # Interline formula label\n    embedding = 13          # Inline formula\n    isolated = 14           # Interline formula\n    text = 15               # OCR recognition result\n\nclass PageInfo(BaseModel):\n    \"\"\"Page information\"\"\"\n    page_no: int = Field(description=\"Page index; the first page is 0\", ge=0)\n    height: int = Field(description=\"Page height\", gt=0)\n    width: int = Field(description=\"Page width\", ge=0)\n\nclass ObjectInferenceResult(BaseModel):\n    \"\"\"Object recognition result\"\"\"\n    category_id: CategoryType = Field(description=\"Category\", ge=0)\n    poly: list[float] = Field(description=\"Quadrilateral coordinates in the format [x0,y0,x1,y1,x2,y2,x3,y3]\")\n    score: float = Field(description=\"Confidence of the inference result\")\n    latex: str | None = Field(description=\"LaTeX parsing result\", default=None)\n    html: str | None = Field(description=\"HTML parsing result\", default=None)\n\nclass PageInferenceResults(BaseModel):\n    \"\"\"Page inference results\"\"\"\n    layout_dets: list[ObjectInferenceResult] = 
Field(description=\"Page recognition results\")\n    page_info: PageInfo = Field(description=\"Page metadata\")\n\n# Complete inference results\ninference_result: list[PageInferenceResults] = []\n
"},{"location":"zh/reference/output_files/#_5","title":"Coordinate System","text":"

poly coordinate format: [x0, y0, x1, y1, x2, y2, x3, y3]

  • The values are the coordinates of the top-left, top-right, bottom-right, and bottom-left points, in that order
  • The coordinate origin is at the top-left corner of the page
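
The eight-number poly layout can be reduced to an axis-aligned [x0, y0, x1, y1] box when downstream code only needs a rectangle. The following is a minimal sketch, not part of MinerU: the helper name poly_to_bbox is hypothetical, and the sample values are rounded for illustration.

```python
def poly_to_bbox(poly):
    # poly holds four (x, y) corners in the order top-left, top-right,
    # bottom-right, bottom-left, with the origin at the page's top-left.
    # poly_to_bbox is a hypothetical helper, not a MinerU function.
    if len(poly) != 8:
        raise ValueError('poly must contain exactly 8 numbers')
    xs = poly[0::2]  # x0, x1, x2, x3
    ys = poly[1::2]  # y0, y1, y2, y3
    # min/max keeps the result valid even if the quadrilateral is skewed.
    return [min(xs), min(ys), max(xs), max(ys)]


# Illustrative values, rounded to two decimals.
example_poly = [99.19, 100.31, 730.37, 100.31, 730.37, 245.81, 99.19, 245.81]
example_bbox = poly_to_bbox(example_poly)
```

Taking the minimum and maximum of each axis, rather than reading only the first and third corners, is a defensive choice for quadrilaterals that are not perfectly axis-aligned.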

"},{"location":"zh/reference/output_files/#_6","title":"Example Data","text":"
[\n    {\n        \"layout_dets\": [\n            {\n                \"category_id\": 2,\n                \"poly\": [\n                    99.1906967163086,\n                    100.3119125366211,\n                    730.3707885742188,\n                    100.3119125366211,\n                    730.3707885742188,\n                    245.81326293945312,\n                    99.1906967163086,\n                    245.81326293945312\n                ],\n                \"score\": 0.9999997615814209\n            }\n        ],\n        \"page_info\": {\n            \"page_no\": 0,\n            \"height\": 2339,\n            \"width\": 1654\n        }\n    },\n    {\n        \"layout_dets\": [\n            {\n                \"category_id\": 5,\n                \"poly\": [\n                    99.13092803955078,\n                    2210.680419921875,\n                    497.3183898925781,\n                    2210.680419921875,\n                    497.3183898925781,\n                    2264.78076171875,\n                    99.13092803955078,\n                    2264.78076171875\n                ],\n                \"score\": 0.9999997019767761\n            }\n        ],\n        \"page_info\": {\n            \"page_no\": 1,\n            \"height\": 2339,\n            \"width\": 1654\n        }\n    }\n]\n
"},{"location":"zh/reference/output_files/#middlejson","title":"Intermediate Processing Results (middle.json)","text":"

File naming format: {original_filename}_middle.json

"},{"location":"zh/reference/output_files/#_7","title":"Top-Level Structure","text":"Field name | Type | Description; pdf_info | list[dict] | Array of parsing results for each page; _backend | string | Parsing mode: pipeline or vlm; _version_name | string | MinerU version number"},{"location":"zh/reference/output_files/#pdf_info","title":"Page Information Structure (pdf_info)","text":"Field name | Description; preproc_blocks | Unsegmented intermediate results after PDF preprocessing; page_idx | Page number, starting from 0; page_size | Page width and height [width, height]; images | List of image block information; tables | List of table block information; interline_equations | List of interline equation block information; discarded_blocks | Information on blocks to be discarded; para_blocks | Content block results after segmentation"},{"location":"zh/reference/output_files/#_8","title":"Block Structure Hierarchy","text":"
Level-1 block (table | image)\n\u2514\u2500\u2500 Level-2 block\n    \u2514\u2500\u2500 Line (line)\n        \u2514\u2500\u2500 Span (span)\n
"},{"location":"zh/reference/output_files/#_9","title":"Level-1 Block Fields","text":"Field name | Description; type | Block type: table or image; bbox | Rectangular bounding box coordinates of the block [x0, y0, x1, y1]; blocks | List of contained level-2 blocks"},{"location":"zh/reference/output_files/#_10","title":"Level-2 Block Fields","text":"Field name | Description; type | Block type (see the table below); bbox | Rectangular bounding box coordinates of the block; lines | List of contained line information"},{"location":"zh/reference/output_files/#_11","title":"Level-2 Block Types","text":"Type | Description; image_body | Image body; image_caption | Image caption text; image_footnote | Image footnote; table_body | Table body; table_caption | Table caption text; table_footnote | Table footnote; text | Text block; title | Title block; index | Table-of-contents block; list | List block; interline_equation | Interline equation block"},{"location":"zh/reference/output_files/#_12","title":"Line and Span Structure","text":"

Line (line) fields: - bbox: rectangular bounding box coordinates of the line - spans: list of contained spans

Span (span) fields: - bbox: rectangular bounding box coordinates of the span - type: span type (image, table, text, inline_equation, interline_equation) - content | img_path: text content or image path
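
The block, line, and span nesting described above can be walked with a small recursive helper. This is a sketch under stated assumptions rather than MinerU code: the function names iter_spans and collect_text are hypothetical, and the sample page is a hand-built dict that only mimics the documented field names.

```python
def iter_spans(block):
    # Level-1 table/image blocks nest level-2 blocks under 'blocks';
    # level-2 blocks hold 'lines', and each line holds 'spans'.
    for child in block.get('blocks', []):
        yield from iter_spans(child)
    for line in block.get('lines', []):
        yield from line.get('spans', [])


def collect_text(page):
    # Gather the 'content' of text-like spans from para_blocks, in order.
    # Image and table spans carry img_path instead and are skipped here.
    texts = []
    for block in page.get('para_blocks', []):
        for span in iter_spans(block):
            if span.get('type') in ('text', 'inline_equation') and 'content' in span:
                texts.append(span['content'])
    return texts


# Hand-built sample that mimics one element of pdf_info (illustrative only).
example_page = {
    'para_blocks': [
        {'type': 'text',
         'lines': [{'spans': [{'type': 'text', 'content': 'dependent on the service headway '}]}]},
        {'type': 'table',
         'blocks': [{'type': 'table_caption',
                     'lines': [{'spans': [{'type': 'text', 'content': 'Table 1'}]}]}]},
    ]
}
example_texts = collect_text(example_page)
```

Handling both blocks and lines in the same function lets one helper cover level-1 and level-2 blocks without checking the block type first.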

"},{"location":"zh/reference/output_files/#_13","title":"Example Data","text":"
{\n    \"pdf_info\": [\n        {\n            \"preproc_blocks\": [\n                {\n                    \"type\": \"text\",\n                    \"bbox\": [\n                        52,\n                        61.956024169921875,\n                        294,\n                        82.99800872802734\n                    ],\n                    \"lines\": [\n                        {\n                            \"bbox\": [\n                                52,\n                                61.956024169921875,\n                                294,\n                                72.0000228881836\n                            ],\n                            \"spans\": [\n                                {\n                                    \"bbox\": [\n                                        54.0,\n                                        61.956024169921875,\n                                        296.2261657714844,\n                                        72.0000228881836\n                                    ],\n                                    \"content\": \"dependent on the service headway and the reliability of the departure \",\n                                    \"type\": \"text\",\n                                    \"score\": 1.0\n                                }\n                            ]\n                        }\n                    ]\n                }\n            ],\n            \"layout_bboxes\": [\n                {\n                    \"layout_bbox\": [\n                        52,\n                        61,\n                        294,\n                        731\n                    ],\n                    \"layout_label\": \"V\",\n                    \"sub_layout\": []\n                }\n            ],\n            \"page_idx\": 0,\n            \"page_size\": [\n                612.0,\n                792.0\n            ],\n            \"_layout_tree\": [],\n            \"images\": [],\n            \"tables\": [],\n     
       \"interline_equations\": [],\n            \"discarded_blocks\": [],\n            \"para_blocks\": [\n                {\n                    \"type\": \"text\",\n                    \"bbox\": [\n                        52,\n                        61.956024169921875,\n                        294,\n                        82.99800872802734\n                    ],\n                    \"lines\": [\n                        {\n                            \"bbox\": [\n                                52,\n                                61.956024169921875,\n                                294,\n                                72.0000228881836\n                            ],\n                            \"spans\": [\n                                {\n                                    \"bbox\": [\n                                        54.0,\n                                        61.956024169921875,\n                                        296.2261657714844,\n                                        72.0000228881836\n                                    ],\n                                    \"content\": \"dependent on the service headway and the reliability of the departure \",\n                                    \"type\": \"text\",\n                                    \"score\": 1.0\n                                }\n                            ]\n                        }\n                    ]\n                }\n            ]\n        }\n    ],\n    \"_backend\": \"pipeline\",\n    \"_version_name\": \"0.6.1\"\n}\n
"},{"location":"zh/reference/output_files/#content_listjson","title":"Content List (content_list.json)","text":"

File naming format: {original_filename}_content_list.json

"},{"location":"zh/reference/output_files/#_14","title":"Functionality","text":"

This is a simplified version of middle.json that stores all readable content blocks flat, in reading order, with the complex layout information removed to simplify downstream processing.

"},{"location":"zh/reference/output_files/#_15","title":"Content Types","text":"Type | Description; image | Image; table | Table; text | Text/title; equation | Interline equation"},{"location":"zh/reference/output_files/#_16","title":"Text Level Identification","text":"

The text_level field distinguishes text levels:

  • No text_level or text_level: 0: body text
  • text_level: 1: level-1 heading
  • text_level: 2: level-2 heading
  • And so on...
"},{"location":"zh/reference/output_files/#_17","title":"Common Fields","text":"
  • Every content block contains a page_idx field indicating the page it is on (starting from 0).
  • Every content block contains a bbox field giving the block's bounding box coordinates [x0, y0, x1, y1], mapped to the 0-1000 range.
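
The text_level and bbox conventions above can be applied as in the following sketch. It is illustrative only: the helper names are hypothetical, and the scaling assumes that x values were normalized by page width and y values by page height, which this document does not state explicitly.

```python
def to_markdown_line(item):
    # A missing text_level, or text_level 0, means body text;
    # level N is rendered here as a markdown heading with N hash marks.
    level = item.get('text_level', 0)
    text = item.get('text', '').strip()
    return '#' * level + ' ' + text if level > 0 else text


def bbox_to_page_units(bbox, page_width, page_height):
    # ASSUMPTION: x is scaled by page width and y by page height
    # when mapping the 0-1000 range back to page units.
    x0, y0, x1, y1 = bbox
    return [
        x0 * page_width / 1000,
        y0 * page_height / 1000,
        x1 * page_width / 1000,
        y1 * page_height / 1000,
    ]


# Hand-built entry using the documented field names (illustrative only).
example_item = {
    'type': 'text',
    'text': 'The response of flow duration curves to afforestation ',
    'text_level': 1,
    'bbox': [62, 480, 946, 904],
    'page_idx': 0,
}
example_line = to_markdown_line(example_item)
example_box = bbox_to_page_units(example_item['bbox'], 612, 792)
```

The page size for the conversion would come from another source, such as page_size in middle.json, since content_list.json entries do not carry it.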
"},{"location":"zh/reference/output_files/#_18","title":"Example Data","text":"
[\n        {\n        \"type\": \"text\",\n        \"text\": \"The response of flow duration curves to afforestation \",\n        \"text_level\": 1, \n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],\n        \"page_idx\": 0\n    },\n    {\n        \"type\": \"image\",\n        \"img_path\": \"images/a8ecda1c69b27e4f79fce1589175a9d721cbdc1cf78b4cc06a015f3746f6b9d8.jpg\",\n        \"image_caption\": [\n            \"Fig. 1. Annual flow duration curves of daily flows from Pine Creek, Australia, 1989\u20132000. \"\n        ],\n        \"image_footnote\": [],\n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],\n        \"page_idx\": 1\n    },\n    {\n        \"type\": \"equation\",\n        \"img_path\": \"images/181ea56ef185060d04bf4e274685f3e072e922e7b839f093d482c29bf89b71e8.jpg\",\n        \"text\": \"$$\\nQ _ { \\\\% } = f ( P ) + g ( T )\\n$$\",\n        \"text_format\": \"latex\",\n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],\n        \"page_idx\": 2\n    },\n    {\n        \"type\": \"table\",\n        \"img_path\": \"images/e3cb413394a475e555807ffdad913435940ec637873d673ee1b039e3bc3496d0.jpg\",\n        \"table_caption\": [\n            \"Table 2 Significance of the rainfall and time terms \"\n        ],\n        \"table_footnote\": [\n            \"indicates that the rainfall term was significant at the $5 \\\\%$ level, $T$ indicates that the time term was significant at the $5 \\\\%$ level, \\\\* represents significance at the $10 \\\\%$ level, and na denotes too few data points for meaningful analysis. 
\"\n        ],\n        \"table_body\": \"<html><body><table><tr><td rowspan=\\\"2\\\">Site</td><td colspan=\\\"10\\\">Percentile</td></tr><tr><td>10</td><td>20</td><td>30</td><td>40</td><td>50</td><td>60</td><td>70</td><td>80</td><td>90</td><td>100</td></tr><tr><td>Traralgon Ck</td><td>P</td><td>P,*</td><td>P</td><td>P</td><td>P,</td><td>P,</td><td>P,</td><td>P,</td><td>P</td><td>P</td></tr><tr><td>Redhill</td><td>P,T</td><td>P,T</td><td>\uff0c*</td><td>**</td><td>P.T</td><td>P,*</td><td>P*</td><td>P*</td><td>*</td><td>\uff0c*</td></tr><tr><td>Pine Ck</td><td></td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td><td>T</td><td>na</td><td>na</td></tr><tr><td>Stewarts Ck 5</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P.T</td><td>P.T</td><td>P,T</td><td>na</td><td>na</td><td>na</td></tr><tr><td>Glendhu 2</td><td>P</td><td>P,T</td><td>P,*</td><td>P,T</td><td>P.T</td><td>P,ns</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td></tr><tr><td>Cathedral Peak 2</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Cathedral Peak 3</td><td>P.T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Lambrechtsbos A</td><td>P,T</td><td>P</td><td>P</td><td>P,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>T</td></tr><tr><td>Lambrechtsbos B</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td></tr><tr><td>Biesievlei</td><td>P,T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>*,T</td><td>T</td><td>T</td><td>P,T</td><td>P,T</td></tr></table></body></html>\",\n        \"bbox\": [\n            62,\n            480,\n            946,\n            904\n        ],  \n        \"page_idx\": 5\n    }\n]\n
"},{"location":"zh/reference/output_files/#vlm","title":"VLM Backend Output","text":""},{"location":"zh/reference/output_files/#modeljson_1","title":"Model Inference Results (model.json)","text":"

File naming format: {original_filename}_model.json

"},{"location":"zh/reference/output_files/#_19","title":"File Format Description","text":"
  • This file is the raw output of the VLM model. It contains two nested lists: the outer list represents pages and the inner list represents the content blocks of each page
  • Each content block is a dict containing the type, bbox, angle, and content fields
"},{"location":"zh/reference/output_files/#_20","title":"Supported Content Types","text":"
{\n    \"text\": \"Text\",\n    \"title\": \"Title\", \n    \"equation\": \"Interline equation\",\n    \"image\": \"Image\",\n    \"image_caption\": \"Image caption\",\n    \"image_footnote\": \"Image footnote\",\n    \"table\": \"Table\",\n    \"table_caption\": \"Table caption\",\n    \"table_footnote\": \"Table footnote\",\n    \"phonetic\": \"Phonetic notation (pinyin)\",\n    \"code\": \"Code block\",\n    \"code_caption\": \"Code caption\",\n    \"ref_text\": \"Reference\",\n    \"algorithm\": \"Algorithm block\",\n    \"list\": \"List\",\n    \"header\": \"Page header\",\n    \"footer\": \"Page footer\",\n    \"page_number\": \"Page number\",\n    \"aside_text\": \"Marginal note beside the binding line\", \n    \"page_footnote\": \"Page footnote\"\n}\n
"},{"location":"zh/reference/output_files/#_21","title":"Coordinate System","text":"

bbox coordinate format: [x0, y0, x1, y1]

  • The values are the coordinates of the top-left and bottom-right points
  • The coordinate origin is at the top-left corner of the page
  • Coordinates are relative to the original page dimensions, expressed as fractions in the range 0-1
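
Because the VLM model.json stores pages as an outer list and blocks as an inner list, with bbox values in the 0-1 range, a consumer usually needs to filter blocks and rescale coordinates. The sketch below assumes hypothetical helper names and a caller-supplied page size; it is not MinerU's own code.

```python
def blocks_of_type(pages, block_type):
    # pages is the outer list; each element is the list of blocks on one page.
    # Returns (page_index, block) pairs for blocks of the requested type.
    return [
        (page_idx, block)
        for page_idx, page in enumerate(pages)
        for block in page
        if block.get('type') == block_type
    ]


def to_absolute_bbox(bbox, page_width, page_height):
    # bbox values are fractions of the page size, origin at the top-left.
    x0, y0, x1, y1 = bbox
    return [x0 * page_width, y0 * page_height, x1 * page_width, y1 * page_height]


# Hand-built sample with the documented fields (illustrative only).
example_pages = [[
    {'type': 'header', 'bbox': [0.077, 0.095, 0.18, 0.181], 'angle': 0, 'content': 'ELSEVIER'},
    {'type': 'title', 'bbox': [0.157, 0.228, 0.833, 0.253], 'angle': 0,
     'content': 'The response of flow duration curves to afforestation'},
]]
example_titles = blocks_of_type(example_pages, 'title')
# The 1000 x 2000 page size is an arbitrary value chosen for the example.
example_abs = to_absolute_bbox(example_titles[0][1]['bbox'], 1000, 2000)
```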
"},{"location":"zh/reference/output_files/#_22","title":"Example Data","text":"
[\n    [\n        {\n            \"type\": \"header\",\n            \"bbox\": [\n                0.077,\n                0.095,\n                0.18,\n                0.181\n            ],\n            \"angle\": 0,\n            \"score\": null,\n            \"block_tags\": null,\n            \"content\": \"ELSEVIER\",\n            \"format\": null,\n            \"content_tags\": null\n        },\n        {\n            \"type\": \"title\",\n            \"bbox\": [\n                0.157,\n                0.228,\n                0.833,\n                0.253\n            ],\n            \"angle\": 0,\n            \"score\": null,\n            \"block_tags\": null,\n            \"content\": \"The response of flow duration curves to afforestation\",\n            \"format\": null,\n            \"content_tags\": null\n        }\n    ]\n]\n
"},{"location":"zh/reference/output_files/#middlejson_1","title":"Intermediate Processing Results (middle.json)","text":"

File naming format: {original_filename}_middle.json

"},{"location":"zh/reference/output_files/#_23","title":"File Format Description","text":"

The middle.json file of the vlm backend is structured similarly to that of the pipeline backend, with the following differences:

  • list becomes a level-2 block, with a new sub_type field that distinguishes the list type:

    • text (text type)
    • ref_text (reference type)
  • A code block type is added, with two values of \"sub_type\":

    • code and algorithm
    • A code block contains at least code_body and optionally code_caption
  • Elements in discarded_blocks can have the following additional types:

    • header (page header)
    • footer (page footer)
    • page_number (page number)
    • aside_text (binding-line text)
    • page_footnote (footnote)
  • All blocks gain an angle field indicating the rotation angle: 0, 90, 180, or 270
"},{"location":"zh/reference/output_files/#_24","title":"\u793a\u4f8b\u6570\u636e","text":"
  • list block example
    {\n    \"bbox\": [\n        174,\n        155,\n        818,\n        333\n    ],\n    \"type\": \"list\",\n    \"angle\": 0,\n    \"index\": 11,\n    \"blocks\": [\n        {\n            \"bbox\": [\n                174,\n                157,\n                311,\n                175\n            ],\n            \"type\": \"text\",\n            \"angle\": 0,\n            \"lines\": [\n                {\n                    \"bbox\": [\n                        174,\n                        157,\n                        311,\n                        175\n                    ],\n                    \"spans\": [\n                        {\n                            \"bbox\": [\n                                174,\n                                157,\n                                311,\n                                175\n                            ],\n                            \"type\": \"text\",\n                            \"content\": \"H.1 Introduction\"\n                        }\n                    ]\n                }\n            ],\n            \"index\": 3\n        },\n        {\n            \"bbox\": [\n                175,\n                182,\n                464,\n                229\n            ],\n            \"type\": \"text\",\n            \"angle\": 0,\n            \"lines\": [\n                {\n                    \"bbox\": [\n                        175,\n                        182,\n                        464,\n                        229\n                    ],\n                    \"spans\": [\n                        {\n                            \"bbox\": [\n                                175,\n                                182,\n                                464,\n                                229\n                            ],\n                            \"type\": \"text\",\n                            \"content\": \"H.2 Example: Divide by Zero without Exception Handling\"\n                        }\n        
            ]\n                }\n            ],\n            \"index\": 4\n        }\n    ],\n    \"sub_type\": \"text\"\n}\n
  • code block example
    {\n    \"type\": \"code\",\n    \"bbox\": [\n        114,\n        780,\n        885,\n        1231\n    ],\n    \"blocks\": [\n        {\n            \"bbox\": [\n                114,\n                780,\n                885,\n                1231\n            ],\n            \"lines\": [\n                {\n                    \"bbox\": [\n                        114,\n                        780,\n                        885,\n                        1231\n                    ],\n                    \"spans\": [\n                        {\n                            \"bbox\": [\n                                114,\n                                780,\n                                885,\n                                1231\n                            ],\n                            \"type\": \"text\",\n                            \"content\": \"1 // Fig. H.1: DivideByZeroNoExceptionHandling.java  \\n2 // Integer division without exception handling.  \\n3 import java.util.Scanner;  \\n4  \\n5 public class DivideByZeroNoExceptionHandling  \\n6 {  \\n7 // demonstrates throwing an exception when a divide-by-zero occurs  \\n8 public static int quotient( int numerator, int denominator )  \\n9 {  \\n10 return numerator / denominator; // possible division by zero  \\n11 } // end method quotient  \\n12  \\n13 public static void main(String[] args)  \\n14 {  \\n15 Scanner scanner = new Scanner(System.in); // scanner for input  \\n16  \\n17 System.out.print(\\\"Please enter an integer numerator: \\\");  \\n18 int numerator = scanner.nextInt();  \\n19 System.out.print(\\\"Please enter an integer denominator: \\\");  \\n20 int denominator = scanner.nextInt();  \\n21\"\n                        }\n                    ]\n                }\n            ],\n            \"index\": 17,\n            \"angle\": 0,\n            \"type\": \"code_body\"\n        },\n        {\n            \"bbox\": [\n                867,\n                160,\n                1280,\n        
        189\n            ],\n            \"lines\": [\n                {\n                    \"bbox\": [\n                        867,\n                        160,\n                        1280,\n                        189\n                    ],\n                    \"spans\": [\n                        {\n                            \"bbox\": [\n                                867,\n                                160,\n                                1280,\n                                189\n                            ],\n                            \"type\": \"text\",\n                            \"content\": \"Algorithm 1 Modules for MCTSteg\"\n                        }\n                    ]\n                }\n            ],\n            \"index\": 19,\n            \"angle\": 0,\n            \"type\": \"code_caption\"\n        }\n    ],\n    \"index\": 17,\n    \"sub_type\": \"code\"\n}\n
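As a rough illustration of how the nested structure above might be consumed, the sketch below pulls the body text and the optional caption text out of a code block. The helper names block_text and split_code_block and the shortened example dictionary are hypothetical; only the field names (blocks, lines, spans, content, type) come from the sample data.

```python
def block_text(block):
    # Join the content of every span in every line of a leaf block.
    parts = []
    for line in block.get('lines', []):
        for span in line.get('spans', []):
            parts.append(span.get('content', ''))
    return ''.join(parts)


def split_code_block(code_block):
    # Return (body, caption) for a block whose type is 'code'.
    # The caption stays None when no code_caption child is present.
    body, caption = '', None
    for child in code_block.get('blocks', []):
        if child.get('type') == 'code_body':
            body = block_text(child)
        elif child.get('type') == 'code_caption':
            caption = block_text(child)
    return body, caption


# Shortened, made-up example in the shape of the sample above.
example = {
    'type': 'code',
    'sub_type': 'code',
    'blocks': [
        {'type': 'code_body',
         'lines': [{'spans': [{'type': 'text', 'content': 'return a / b;'}]}]},
        {'type': 'code_caption',
         'lines': [{'spans': [{'type': 'text', 'content': 'Fig. H.1'}]}]},
    ],
}
print(split_code_block(example))
```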
"},{"location":"zh/reference/output_files/#content_listjson_1","title":"\u5185\u5bb9\u5217\u8868 (content_list.json)","text":"

File naming format: {original_filename}_content_list.json

"},{"location":"zh/reference/output_files/#_25","title":"File Format Description","text":"

The content_list.json produced by the vlm backend has a structure similar to that of the pipeline backend; following the middle.json changes described above, it was adjusted as follows:

  • A new code type is added, with two possible \"sub_type\" values:

    • code and algorithm
    • It contains at least a code_body and optionally a code_caption
  • A new list type is added, with two possible \"sub_type\" values:

    • text
    • ref_text
  • The content of all discarded_blocks is now included in the output:

    • header
    • footer
    • page_number
    • aside_text
    • page_footnote
"},{"location":"zh/reference/output_files/#_26","title":"Sample Data","text":"
  • code type content
    {\n    \"type\": \"code\",\n    \"sub_type\": \"algorithm\",\n    \"code_caption\": [\n        \"Algorithm 1 Modules for MCTSteg\"\n    ],\n    \"code_body\": \"1: function GETCOORDINATE(d)  \\n2:  $x \\\\gets d / l$ ,  $y \\\\gets d$  mod  $l$   \\n3: return  $(x, y)$   \\n4: end function  \\n5: function BESTCHILD(v)  \\n6:  $C \\\\gets$  child set of  $v$   \\n7:  $v' \\\\gets \\\\arg \\\\max_{c \\\\in C} \\\\mathrm{UCTScore}(c)$   \\n8:  $v'.n \\\\gets v'.n + 1$   \\n9: return  $v'$   \\n10: end function  \\n11: function BACK PROPAGATE(v)  \\n12: Calculate  $R$  using Equation 11  \\n13: while  $v$  is not a root node do  \\n14:  $v.r \\\\gets v.r + R$ ,  $v \\\\gets v.p$   \\n15: end while  \\n16: end function  \\n17: function RANDOMSEARCH(v)  \\n18: while  $v$  is not a leaf node do  \\n19: Randomly select an untried action  $a \\\\in A(v)$   \\n20: Create a new node  $v'$   \\n21:  $(x, y) \\\\gets \\\\mathrm{GETCOORDINATE}(v'.d)$   \\n22:  $v'.p \\\\gets v$ ,  $v'.d \\\\gets v.d + 1$ ,  $v'.\\\\Gamma \\\\gets v.\\\\Gamma$   \\n23:  $v'.\\\\gamma_{x,y} \\\\gets a$   \\n24: if  $a = -1$  then  \\n25:  $v.lc \\\\gets v'$   \\n26: else if  $a = 0$  then  \\n27:  $v.mc \\\\gets v'$   \\n28: else  \\n29:  $v.rc \\\\gets v'$   \\n30: end if  \\n31:  $v \\\\gets v'$   \\n32: end while  \\n33: return  $v$   \\n34: end function  \\n35: function SEARCH(v)  \\n36: while  $v$  is fully expanded do  \\n37:  $v \\\\gets$  BESTCHILD(v)  \\n38: end while  \\n39: if  $v$  is not a leaf node then  \\n40:  $v \\\\gets$  RANDOMSEARCH(v)  \\n41: end if  \\n42: return  $v$   \\n43: end function\",\n    \"bbox\": [\n        510,\n        87,\n        881,\n        740\n    ],\n    \"page_idx\": 0\n}\n
  • list type content
    {\n    \"type\": \"list\",\n    \"sub_type\": \"text\",\n    \"list_items\": [\n        \"H.1 Introduction\",\n        \"H.2 Example: Divide by Zero without Exception Handling\",\n        \"H.3 Example: Divide by Zero with Exception Handling\",\n        \"H.4 Summary\"\n    ],\n    \"bbox\": [\n        174,\n        155,\n        818,\n        333\n    ],\n    \"page_idx\": 0\n}\n
  • discarded type content
    [{\n    \"type\": \"header\",\n    \"text\": \"Journal of Hydrology 310 (2005) 253-265\",\n    \"bbox\": [\n        363,\n        164,\n        623,\n        177\n    ],\n    \"page_idx\": 0\n},\n{\n    \"type\": \"page_footnote\",\n    \"text\": \"* Corresponding author. Address: Forest Science Centre, Department of Sustainability and Environment, P.O. Box 137, Heidelberg, Vic. 3084, Australia. Tel.: +61 3 9450 8719; fax: +61 3 9450 8644.\",\n    \"bbox\": [\n        71,\n        815,\n        915,\n        841\n    ],\n    \"page_idx\": 0\n}]\n
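One possible way to flatten these entries into plain text is sketched below. The function name entry_to_text and the joining separators are assumptions made for illustration; the field names (code_caption, code_body, list_items, text) are taken from the samples above.

```python
def entry_to_text(entry):
    kind = entry.get('type')
    if kind == 'code':
        # code entries carry a list of captions and a single body string
        caption = ' '.join(entry.get('code_caption', []))
        body = entry.get('code_body', '')
        return (caption + ': ' + body) if caption else body
    if kind == 'list':
        # list entries carry their items as a list of strings
        return '; '.join(entry.get('list_items', []))
    # discarded entries such as header or page_footnote carry a text field
    return entry.get('text', '')


# Abbreviated entries in the shape of the samples above.
samples = [
    {'type': 'code', 'sub_type': 'algorithm',
     'code_caption': ['Algorithm 1 Modules for MCTSteg'],
     'code_body': '1: function GETCOORDINATE(d)'},
    {'type': 'list', 'sub_type': 'text',
     'list_items': ['H.1 Introduction', 'H.4 Summary']},
    {'type': 'header', 'text': 'Journal of Hydrology 310 (2005) 253-265'},
]
for sample in samples:
    print(entry_to_text(sample))
```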
"},{"location":"zh/reference/output_files/#_27","title":"\u603b\u7ed3","text":"

The files above make up MinerU's complete output. Choose the file that fits your downstream task:

  • Model output (use the raw output):

    • model.json
  • Debugging and validation (use the visualization files):

    • layout.pdf
    • span.pdf
  • Content extraction (use the simplified files):

    • *.md
    • content_list.json
  • Secondary development (use the structured file):

    • middle.json
"},{"location":"zh/usage/","title":"Usage Guide","text":""},{"location":"zh/usage/#_1","title":"Usage Guide","text":"

This chapter explains how to use the project in full. The sections below take you step by step from basic to advanced usage:

"},{"location":"zh/usage/#_2","title":"Table of Contents","text":"
  • Local Deployment
    • Quick Usage - getting started and basic usage
    • Model Source Configuration - detailed model source configuration
    • Command Line Tools - detailed parameter reference for the command line tools
    • Advanced CLI Parameters - advanced parameters for the command line tools
  • Other Accelerator Adaptations (\ud83d\ude80 officially supported / \u2764\ufe0f community contributed)
    • \u6607\u817e Ascend \ud83d\ude80
    • \u5e73\u5934\u54e5 T-Head \ud83d\ude80
    • \u6c90\u66e6 METAX \ud83d\ude80
    • \u6d77\u5149 Hygon \ud83d\ude80
    • \u71e7\u539f Enflame \ud83d\ude80
    • \u6469\u5c14\u7ebf\u7a0b MooreThreads \ud83d\ude80
    • \u5929\u6570\u667a\u82af IluvatarCorex \ud83d\ude80
    • \u5bd2\u6b66\u7eaa Cambricon \ud83d\ude80
    • \u6606\u4ed1\u82af Kunlunxin \ud83d\ude80
    • \u592a\u521d\u5143\u7881 Tecorigin \u2764\ufe0f
    • \u58c1\u4ede Biren \u2764\ufe0f
    • AMD #3662 \u2764\ufe0f
    • \u701a\u535a VastAI #4237 \u2764\ufe0f
  • Plugins and Ecosystem
    • Cherry Studio
    • Sider
    • Dify
    • n8n
    • Coze
    • FastGPT
    • ModelWhale
    • DingTalk
    • DataFlow
    • BISHENG
    • RagFlow
"},{"location":"zh/usage/#_3","title":"Getting Started","text":"

We recommend reading the documentation in the order above to better understand and use the project's features.

If you run into problems, please check the FAQ

"},{"location":"zh/usage/advanced_cli_parameters/","title":"Advanced CLI Parameters","text":""},{"location":"zh/usage/advanced_cli_parameters/#_1","title":"Advanced Command Line Parameters","text":""},{"location":"zh/usage/advanced_cli_parameters/#_2","title":"Passing Parameters Through to the Inference Engine","text":""},{"location":"zh/usage/advanced_cli_parameters/#vllm","title":"vllm Acceleration Parameter Tuning","text":"

Tip

If vllm-accelerated inference for the vlm model already works and you want more speed, try the following parameter:

  • If you have more than one GPU, you can use vllm's multi-GPU parallel mode to increase throughput: --data-parallel-size 2
"},{"location":"zh/usage/advanced_cli_parameters/#_3","title":"Parameter Passing Notes","text":"

Tip

  • All parameters officially supported by vllm/lmdeploy can be passed to MinerU as command line arguments, with any of the following commands: mineru, mineru-openai-server, mineru-gradio, mineru-api
  • For more on how to use the vllm parameters, see the vllm official documentation
  • For more on how to use the lmdeploy parameters, see the lmdeploy official documentation
"},{"location":"zh/usage/advanced_cli_parameters/#gpu","title":"GPU Device Selection and Configuration","text":""},{"location":"zh/usage/advanced_cli_parameters/#cuda_visible_devices","title":"CUDA_VISIBLE_DEVICES Basic Usage","text":"

Tip

  • In any case, you can choose which GPU devices are visible by prefixing the command with the CUDA_VISIBLE_DEVICES environment variable:
    CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>\n
  • This works for every command line invocation, including mineru, mineru-openai-server, mineru-gradio, and mineru-api, and applies to both the pipeline and vlm backends.
"},{"location":"zh/usage/advanced_cli_parameters/#_4","title":"Common Device Configuration Examples","text":"

Tip

Here are some common CUDA_VISIBLE_DEVICES settings:

CUDA_VISIBLE_DEVICES=1  # Only device 1 will be seen\nCUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible\nCUDA_VISIBLE_DEVICES=\"0,1\"  # Same as above, quotation marks are optional\nCUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked\nCUDA_VISIBLE_DEVICES=\"\"  # No GPU will be visible\n
"},{"location":"zh/usage/advanced_cli_parameters/#_5","title":"Practical Scenarios","text":"

Tip

Some possible usage scenarios:

  • If you have multiple GPUs and want to start openai-server on GPU 0 and GPU 1 with multi-GPU parallelism, use the following command:

    CUDA_VISIBLE_DEVICES=0,1 mineru-openai-server --engine vllm --port 30000 --data-parallel-size 2\n
  • If you have multiple GPUs and want to start two fastapi services on GPU 0 and GPU 1, each listening on a different port, use the following commands:

    # In terminal 1\nCUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000\n# In terminal 2\nCUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001\n
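The two-terminal setup above could also be scripted. The sketch below only builds the environment and argument list for each service, using the --host and --port options shown above; the helper name api_launch_plan is hypothetical, and actually starting the processes (for example with subprocess.Popen) is left to the caller.

```python
import os


def api_launch_plan(gpu_ids, host='127.0.0.1', base_port=8000):
    # One (env, argv) pair per GPU; ports count up from base_port.
    plan = []
    for offset, gpu in enumerate(gpu_ids):
        # Copy the current environment and pin this service to a single GPU.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        argv = ['mineru-api', '--host', host, '--port', str(base_port + offset)]
        plan.append((env, argv))
    return plan


for env, argv in api_launch_plan([0, 1]):
    # subprocess.Popen(argv, env=env) would start the service; this only prints.
    print('CUDA_VISIBLE_DEVICES=' + env['CUDA_VISIBLE_DEVICES'], ' '.join(argv))
```

Keeping the plan separate from process creation makes the port and device assignment easy to inspect before anything is started.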
"},{"location":"zh/usage/cli_tools/","title":"\u547d\u4ee4\u884c\u5de5\u5177","text":""},{"location":"zh/usage/cli_tools/#_1","title":"\u547d\u4ee4\u884c\u5de5\u5177\u4f7f\u7528\u8bf4\u660e","text":""},{"location":"zh/usage/cli_tools/#_2","title":"\u67e5\u770b\u5e2e\u52a9\u4fe1\u606f","text":"

To see help for MinerU's command line tools, use the --help option. Sample help output for each tool:

mineru --help\nUsage: mineru [OPTIONS]\n\nOptions:\n  -v, --version                   Show the version and exit\n  -p, --path PATH                 Input file path or directory (required)\n  -o, --output PATH               Output directory (required)\n  --api-url TEXT                  MinerU FastAPI service address; if omitted, a temporary local mineru-api is started automatically\n  -m, --method [auto|txt|ocr]     Parsing method: auto (default), txt, ocr (pipeline and hybrid* backends only)\n  -b, --backend [pipeline|hybrid-auto-engine|hybrid-http-client|vlm-auto-engine|vlm-http-client]\n                                  Parsing backend (default: hybrid-auto-engine)\n  -l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|th|el|latin|arabic|east_slavic|cyrillic|devanagari]\n                                  Document language (can improve OCR accuracy; pipeline and hybrid* backends only)\n  -u, --url TEXT                  OpenAI-compatible address passed to the server-side backend when using http-client\n  -s, --start INTEGER             Page number to start parsing from (0-based)\n  -e, --end INTEGER               Page number to stop parsing at (0-based)\n  -f, --formula BOOLEAN           Whether to enable formula parsing (enabled by default)\n  -t, --table BOOLEAN             Whether to enable table parsing (enabled by default)\n  --help                          Show help information\n
mineru-api --help\nUsage: mineru-api [OPTIONS]\n\nOptions:\n  --host TEXT     Server host address (default: 127.0.0.1)\n  --port INTEGER  Server port (default: 8000)\n  --reload        Enable auto-reload (development mode)\n  --help          Show this help message and exit\n
mineru-gradio --help\nUsage: mineru-gradio [OPTIONS]\n\nOptions:\n  --enable-example BOOLEAN        Enable example file input (the example files must be placed in the\n                                  `example` folder under the directory where the command is run)\n  --enable-http-client BOOLEAN    Enable the HTTP client option among the backend choices\n  --enable-api BOOLEAN            Enable the Gradio API for serving the application\n  --max-convert-pages INTEGER     Set the maximum number of pages to convert from PDF to Markdown\n  --server-name TEXT              Set the server hostname for the Gradio app\n  --server-port INTEGER           Set the server port for the Gradio app\n  --latex-delimiters-type [a|b|all]\n                                  Set the LaTeX delimiter type used in Markdown rendering\n                                  ('a' for the '$' type, 'b' for the '()[]' type,\n                                  'all' for both types)\n  --help                          Show this help message and exit\n
"},{"location":"zh/usage/cli_tools/#_3","title":"Environment Variables","text":"

Note

Starting with the current version, mineru is an orchestration client built on mineru-api:

  • When --api-url is not given, the CLI automatically starts a temporary local mineru-api
  • When --api-url is given, the CLI connects directly to that FastAPI service
  • --url no longer denotes the MinerU API address; it is the OpenAI-compatible address required by the server-side vlm/hybrid-http-client backends

Some MinerU command line options have equivalent environment variables. Environment variables generally take precedence over command line options and apply to all command line tools. The commonly used environment variables are described below:

  • MINERU_TOOLS_CONFIG_JSON:

    • Specifies the configuration file path
    • Defaults to mineru.json in the user's home directory; set the variable to point to a different configuration file.
  • MINERU_FORMULA_ENABLE:

    • Enables formula parsing
    • Defaults to true; set it to false to disable formula parsing.
  • MINERU_FORMULA_CH_SUPPORT:

    • Enables optimized parsing of formulas containing Chinese (experimental feature)
    • Defaults to false; set it to true to enable the optimization.
    • Only takes effect with the pipeline backend.
  • MINERU_TABLE_ENABLE:

    • Enables table parsing
    • Defaults to true; set it to false to disable table parsing.
  • MINERU_TABLE_MERGE_ENABLE:

    • Enables table merging
    • Defaults to true; set it to false to disable table merging.
  • MINERU_PDF_RENDER_TIMEOUT:

    • Sets the timeout, in seconds, for rendering PDF pages to images
    • Defaults to 300 seconds; set it to another value to adjust the rendering timeout.
    • Only takes effect on Linux and macOS.
  • MINERU_PDF_RENDER_THREADS:

    • Sets the number of threads used when rendering PDF pages to images
    • Defaults to 4; set it to another value to adjust the rendering thread count.
    • Only takes effect on Linux and macOS.
  • MINERU_INTRA_OP_NUM_THREADS:

    • Sets the intra_op thread count for onnx models, which affects the computation speed of a single operator
    • Defaults to -1 (chosen automatically); set it to another value to adjust the thread count.
  • MINERU_INTER_OP_NUM_THREADS:

    • Sets the inter_op thread count for onnx models, which affects the parallel execution of multiple operators
    • Defaults to -1 (chosen automatically); set it to another value to adjust the thread count.
  • MINERU_HYBRID_BATCH_RATIO:

    • Sets the batch multiplier for small-model processing in the hybrid-* backends
    • Mostly used with hybrid-http-client, where the small-model batch multiplier controls the VRAM usage of a single client
    • Single-client VRAM size and the corresponding MINERU_HYBRID_BATCH_RATIO:
      <= 6 GB: 8
      <= 4.5 GB: 4
      <= 3 GB: 2
      <= 2.5 GB: 1
  • MINERU_HYBRID_FORCE_PIPELINE_ENABLE:

    • Forces the text extraction part of the hybrid-* backends to be handled by the small models
    • Defaults to false; set it to true to enable this, which can reduce hallucinations in some extreme cases.
  • MINERU_VL_MODEL_NAME:

    • Specifies the model name used by the vlm/hybrid backends, so you can select the model MinerU needs on a remote openai-server that hosts several models.
  • MINERU_VL_API_KEY:

    • Specifies the API key used by the vlm/hybrid backends, so you can authenticate against a remote openai-server.
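The VRAM rows listed above under MINERU_HYBRID_BATCH_RATIO can be read as a simple lookup. The sketch below assumes one interpretation, namely that a row applies when its VRAM figure fits within the memory you are willing to give a single client; the names RATIO_TABLE and pick_batch_ratio are hypothetical and not part of MinerU.

```python
# (VRAM figure in GB, ratio) rows copied from the list above, largest first.
RATIO_TABLE = [(6.0, 8), (4.5, 4), (3.0, 2), (2.5, 1)]


def pick_batch_ratio(vram_budget_gb):
    # Assumed reading: pick the first row whose VRAM figure fits in the budget.
    for required_gb, ratio in RATIO_TABLE:
        if vram_budget_gb >= required_gb:
            return ratio
    # Below the smallest row the list gives no value.
    return None


print(pick_batch_ratio(5))
```

The chosen value would then be exported as MINERU_HYBRID_BATCH_RATIO before the client is started.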
"},{"location":"zh/usage/model_source/","title":"\u6a21\u578b\u6e90\u914d\u7f6e","text":""},{"location":"zh/usage/model_source/#_1","title":"\u6a21\u578b\u6e90\u8bf4\u660e","text":"

MinerU uses HuggingFace and ModelScope as model repositories. You can switch the model source or use local models as needed.

  • HuggingFace is the default model source and offers fast loading and high stability worldwide.
  • ModelScope is the best choice for users in mainland China; it provides a seamlessly compatible SDK module and suits users who cannot access HuggingFace.
"},{"location":"zh/usage/model_source/#_2","title":"How to Switch the Model Source","text":""},{"location":"zh/usage/model_source/#_3","title":"Switching via Command Line Option","text":"

Currently only the mineru command line tool supports switching the model source via a command line option; other tools such as mineru-api and mineru-gradio do not yet support it.

mineru -p <input_path> -o <output_path> --source modelscope\n
"},{"location":"zh/usage/model_source/#_4","title":"Switching via Environment Variable","text":"

You can always switch the model source by setting an environment variable; this works for all command line tools and API calls.

export MINERU_MODEL_SOURCE=modelscope\n
or
import os\nos.environ[\"MINERU_MODEL_SOURCE\"] = \"modelscope\"\n

Tip

A model source set through the environment variable stays in effect for the current terminal session until the terminal is closed or the variable is changed. It also takes precedence over the command line option: if both are set, the command line option is ignored.
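The precedence described in this Tip can be pictured with a small sketch. This is an illustration of the documented rule only, not MinerU's actual code: the function name effective_model_source is hypothetical, and the fallback name huggingface is an assumption based on the stated default.

```python
import os


def effective_model_source(cli_source=None, environ=None):
    # Illustration only: MINERU_MODEL_SOURCE, when set, overrides the --source value.
    environ = os.environ if environ is None else environ
    # The fallback name 'huggingface' is assumed from the documented default source.
    return environ.get('MINERU_MODEL_SOURCE') or cli_source or 'huggingface'


# The environment variable wins even though the CLI value says local.
print(effective_model_source('local', {'MINERU_MODEL_SOURCE': 'modelscope'}))
```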

"},{"location":"zh/usage/model_source/#_5","title":"Using Local Models","text":""},{"location":"zh/usage/model_source/#1","title":"1. Download Models Locally","text":"
mineru-models-download --help\n
Or use the interactive command line tool to choose which models to download:
mineru-models-download\n

Note

  • When the download finishes, the model path is printed in the current terminal window and automatically written to mineru.json in the user's home directory.
  • You can also create the configuration file by copying the configuration template file to your home directory and renaming it to mineru.json.
  • After the models are downloaded, you can move the model folder elsewhere, but you must then update the model path in mineru.json.
  • If you deploy the model folder to another server, make sure to also move mineru.json to the user's home directory on the new machine and set the model path correctly.
  • To update the model files, run mineru-models-download again. Updating does not yet support custom paths: if you have not moved the local model folder, the files are updated incrementally; if you have moved it, the files are downloaded again to the default location and mineru.json is updated.
"},{"location":"zh/usage/model_source/#2","title":"2. Parse Using Local Models","text":"
mineru -p <input_path> -o <output_path> --source local\n
Or enable it via the environment variable:
export MINERU_MODEL_SOURCE=local\nmineru -p <input_path> -o <output_path>\n
"},{"location":"zh/usage/quick_usage/","title":"Quick Usage","text":""},{"location":"zh/usage/quick_usage/#mineru","title":"Using MinerU","text":""},{"location":"zh/usage/quick_usage/#_1","title":"Quick Model Source Configuration","text":"

MinerU uses huggingface as the default model source. If your network cannot reach huggingface, you can conveniently switch the model source to modelscope with an environment variable:

export MINERU_MODEL_SOURCE=modelscope\n
For more on model source configuration and custom local model paths, see the Model Source documentation."},{"location":"zh/usage/quick_usage/#_2","title":"Quick Usage via Command Line","text":"

MinerU ships with command line tools, so you can quickly parse PDFs from the command line:

mineru -p <input_path> -o <output_path>\n

Tip

  • <input_path>: a local PDF or image file, or a directory
  • <output_path>: the output directory
  • When --api-url is not passed, the CLI automatically starts a temporary local mineru-api
  • When --api-url is passed, the CLI connects directly to a remote or already-running local FastAPI service

For more information about the output files, see the Output Files documentation.
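
The options above can be assembled programmatically when MinerU is driven from a script. The following is a minimal Python sketch, not part of MinerU itself: build_mineru_command and build_environment are hypothetical helper names, and the sketch assumes that --api-url takes the base URL of a running mineru-api service as its value.

```python
import os
import shlex


def build_mineru_command(input_path, output_path, api_url=None):
    # Assemble the argv list for the mineru CLI from the documented options.
    cmd = ['mineru', '-p', input_path, '-o', output_path]
    if api_url is not None:
        # Assumption: --api-url is followed by the service base URL.
        cmd += ['--api-url', api_url]
    return cmd


def build_environment(model_source=None):
    # Copy the current environment and optionally set MINERU_MODEL_SOURCE.
    env = dict(os.environ)
    if model_source is not None:
        env['MINERU_MODEL_SOURCE'] = model_source
    return env


cmd = build_mineru_command('demo/pdfs/demo1.pdf', 'output',
                           api_url='http://127.0.0.1:8000')
# Show the command line that would be executed.
print(shlex.join(cmd))
```

To actually run the command, the list and environment can be passed to subprocess.run(cmd, env=build_environment('modelscope'), check=True). Leaving api_url unset matches the default behaviour described above, where the CLI starts its own temporary local mineru-api.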

Note

On Linux and macOS, the command-line tool automatically tries to use cuda/mps acceleration. Windows users who need cuda acceleration should visit the PyTorch website and pick the install command that matches their cuda version to install acceleration-enabled builds of torch and torchvision.

If you need to adjust parsing options with custom parameters, see the more detailed command-line tool usage guide in the documentation.

"},{"location":"zh/usage/quick_usage/#apiwebuihttp-clientserver","title":"Advanced usage via api, webui, and http-client/server","text":"
  • Call directly through the Python API: Python usage example
  • Call through FastAPI:
    mineru-api --host 0.0.0.0 --port 8000\n

    Tip

    Open http://127.0.0.1:8000/docs in a browser to view the API documentation.

    • Health check endpoint: GET /health returns service information such as protocol_version, processing_window_size, and max_concurrent_requests
    • Asynchronous task submission endpoint: POST /tasks
    • Synchronous parsing endpoint: POST /file_parse
    • Task query endpoints: GET /tasks/{task_id} and GET /tasks/{task_id}/result
    • The API output directory is fixed by the server and defaults to ./output

    POST /tasks returns a task_id immediately; POST /file_parse submits to the same task manager internally, waits for the task to finish, and then returns the final result synchronously. Task state is kept in memory inside a single process, so after a service restart, a --reload hot reload, or a multi-process deployment, historical task status is not guaranteed to remain queryable. By default a task is retained for 24 hours after it completes or fails, after which its status and output directory are cleaned up automatically; querying the task status or result after cleanup returns 404. The retention period and the cleanup polling interval can be adjusted with the environment variables MINERU_API_TASK_RETENTION_SECONDS and MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS.

    Asynchronous task submission example:

    curl -X POST http://127.0.0.1:8000/tasks \\\n  -F \"files=@demo/pdfs/demo1.pdf\" \\\n  -F \"return_md=true\"\n

    Synchronous parsing example:

    curl -X POST http://127.0.0.1:8000/file_parse \\\n  -F \"files=@demo/pdfs/demo1.pdf\" \\\n  -F \"return_md=true\" \\\n  -F \"response_format_zip=true\" \\\n  -F \"return_original_file=true\"\n

    Poll task status and result:

    curl http://127.0.0.1:8000/tasks/<task_id>\ncurl http://127.0.0.1:8000/tasks/<task_id>/result\ncurl http://127.0.0.1:8000/health\n
  • Start the gradio webui visual frontend:

    mineru-gradio --server-name 0.0.0.0 --server-port 7860\n

    Tip

    • Open http://127.0.0.1:7860 in a browser to use the Gradio WebUI.
  • Call through http-client/server:

    # Start the openai-compatible server (requires a vllm or lmdeploy environment)\nmineru-openai-server --port 30000\n

    Tip

    In another terminal, connect to the openai server through the http client

    mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000\n
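
The asynchronous flow of the FastAPI service (submit with POST /tasks, then query GET /tasks/{task_id} until the task is done) can be wrapped in a small polling loop. The sketch below uses only the Python standard library. The helper names fetch_task_status and wait_for_task are hypothetical, and because the layout of the status response is not described here, the caller supplies an is_finished predicate instead of the sketch assuming any field names.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_task_status(base_url, task_id):
    # GET /tasks/{task_id}; returns the decoded JSON body, or None on 404.
    url = base_url.rstrip('/') + '/tasks/' + task_id
    try:
        with urllib.request.urlopen(url) as response:
            return json.loads(response.read().decode('utf-8'))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Unknown task, or its state was already cleaned up.
            return None
        raise


def wait_for_task(get_status, is_finished, interval=2.0, timeout=600.0,
                  sleep=time.sleep):
    # Call get_status() repeatedly until is_finished(payload) is true.
    waited = 0.0
    while True:
        payload = get_status()
        if payload is None:
            raise LookupError('task not found (unknown id or cleaned up)')
        if is_finished(payload):
            return payload
        if waited >= timeout:
            raise TimeoutError('task did not finish in time')
        sleep(interval)
        waited += interval
```

A call might look like wait_for_task(lambda: fetch_task_status('http://127.0.0.1:8000', task_id), is_finished), where is_finished inspects whatever completion field the /docs page of your deployment documents. Since task state lives in a single process and is removed after the retention period, a 404 is treated as a terminal condition rather than retried.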

Note

All parameters officially supported by vllm/lmdeploy can be passed to MinerU as command-line arguments. This applies to the following commands: mineru, mineru-openai-server, mineru-gradio, and mineru-api. We have collected some commonly used vllm/lmdeploy parameters and how to use them in the Advanced Command Line Parameters documentation.

"},{"location":"zh/usage/quick_usage/#mineru_1","title":"Extending MinerU with a configuration file","text":"

MinerU works out of the box, but it also supports extending its functionality through a configuration file. You can add custom configuration by editing the mineru.json file in your user directory.

Important

The mineru.json file is generated automatically when you run the built-in model download command mineru-models-download. You can also create it by copying the configuration template file to your user directory and renaming it to mineru.json.

Some of the available configuration options:

  • latex-delimiter-config:

    • Configures the delimiters for LaTeX formulas
    • Defaults to the $ symbol; can be changed to other symbols or strings as needed.
  • llm-aided-config:

    • Configures the parameters for LLM-aided heading-level classification; compatible with any LLM that supports the openai protocol
    • Uses the qwen3-next-80b-a3b-instruct model from Alibaba Cloud Bailian by default
    • You need to configure your own API key and set enable to true to turn this feature on
    • If your API provider does not support the enable_thinking parameter, remove it manually
      • For example, the llm-aided-config section of your configuration file may look like this:
        \"llm-aided-config\": {\n   \"api_key\": \"your_api_key\",\n   \"base_url\": \"https://dashscope.aliyuncs.com/compatible-mode/v1\",\n   \"model\": \"qwen3-next-80b-a3b-instruct\",\n   \"enable_thinking\": false,\n   \"enable\": false\n}\n
      • To remove the enable_thinking parameter, delete the line containing \"enable_thinking\": false. The result looks like this:
        \"llm-aided-config\": {\n   \"api_key\": \"your_api_key\",\n   \"base_url\": \"https://dashscope.aliyuncs.com/compatible-mode/v1\",\n   \"model\": \"qwen3-next-80b-a3b-instruct\",\n   \"enable\": false\n}\n
  • models-dir:

    • Specifies the local model storage directories; set a model directory for the pipeline backend and for the vlm backend separately.
    • After specifying the directories, you can use the local models by setting the environment variable export MINERU_MODEL_SOURCE=local.
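
The manual edit described for llm-aided-config can also be scripted. The following Python sketch removes enable_thinking from a loaded configuration without touching the other keys. The helper names drop_enable_thinking and rewrite_config are hypothetical, and the sketch assumes the user directory is the home directory returned by Path.home().

```python
import copy
import json
from pathlib import Path


def drop_enable_thinking(config):
    # Return a copy of the config without llm-aided-config.enable_thinking.
    updated = copy.deepcopy(config)
    llm_section = updated.get('llm-aided-config')
    if isinstance(llm_section, dict):
        llm_section.pop('enable_thinking', None)
    return updated


def rewrite_config(path):
    # Load mineru.json, drop the parameter, and write the file back.
    config = json.loads(path.read_text(encoding='utf-8'))
    cleaned = drop_enable_thinking(config)
    path.write_text(json.dumps(cleaned, indent=4, ensure_ascii=False),
                    encoding='utf-8')


example = {
    'llm-aided-config': {
        'api_key': 'your_api_key',
        'base_url': 'https://dashscope.aliyuncs.com/compatible-mode/v1',
        'model': 'qwen3-next-80b-a3b-instruct',
        'enable_thinking': False,
        'enable': False,
    }
}
# List the keys that remain after the parameter is removed.
print(sorted(drop_enable_thinking(example)['llm-aided-config']))
```

Under the stated assumption, rewrite_config(Path.home() / 'mineru.json') applies the change to the real file. Working on a deep copy keeps the original dictionary intact, so the result can be inspected before anything is written.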
"},{"location":"zh/usage/acceleration_cards/AMD/","title":"AMD","text":""},{"location":"zh/usage/acceleration_cards/AMD/#tritonrocm-vllmpipelinelayoutdoclayout-yolo","title":"\u57fa\u4e8eTriton\u7684ROCm \u4e0d\u540c\u540e\u7aef\u5b9e\u73b0\u4f18\u5316\uff0c\u57fa\u672c\u5b9e\u73b0vllm\u540e\u7aef\u6b63\u5e38\u63a8\u7406\uff0c\u4ee5\u53capipeline\u540e\u7aef\u4e2d\u7b2c\u4e00\u6b65layout\u7528\u7684DocLayout-YOLO","text":"

If you already have a complete Python environment with vllm and mineru, skip straight to step 5. The same approach can serve as a reference for execution problems on other GPUs: profile first to find which operator is the problem, then implement it with the triton backend. In my tests the results are roughly on par with the official MinerU website. Not many people use AMD, so I am sharing this with everyone in the comments section.

"},{"location":"zh/usage/acceleration_cards/AMD/#1","title":"1. Results","text":"

An additional speed test on a 200-page PDF of a Python programming book reaches 1.99it/s:
Two Step Extraction: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 200/200 [01:40<00:00, 1.99it/s]

Below are the earlier test results on a 14-page academic paper, on a 7900xtx with mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-vllm-engine true. The speed is roughly 1.6-1.8s/it; I did not benchmark carefully and only tried two documents. The second approach, which replaces the original dot product with a matrix multiplication, speeds this up further to about 1.3s/it. After optimization, most of the operator time is spent in hipblast (which cannot be improved further) and in the vllm triton backend, at roughly 25% each; the vllm triton backend can only wait for upstream optimization. The doclayout-yolo layout speed rises from 1.6it/s to 15it/s. Note that the input PDF size has to be cached first, because triton requires cached sizes. The main goal is to keep the model input and output interfaces unchanged with minimal code changes. An example using the -b vlm-vllm-engine mode follows.

Test results after replacing the original dot product with the 5D matrix multiplication:
2025-10-05 15:45:12.985 | INFO | mineru.backend.vlm.vlm_analyze:get_model:128 - get vllm-engine predictor cost: 18.45s
Adding requests: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 14/14 [00:01<00:00, 12.20it/s]
Processed prompts: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 14/14 [00:08<00:00, 1.56it/s, est. speed input: 2174.18 toks/s, output: 791.87 toks/s]
Adding requests: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 278/278 [00:00<00:00, 323.03it/s]
Processed prompts: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 278/278 [00:07<00:00, 37.63it/s, est. speed input: 5264.66 toks/s, output: 2733.31 toks/s]

Test with mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-vllm-engine true:
2025-10-05 15:46:55.953 | WARNING | mineru.cli.common:convert_pdf_bytes_to_bytes_by_pypdfium2:54 - end_page_id is out of range, use pdf_docs length
Two Step Extraction: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 14/14 [00:18<00:00, 1.30s/it]

"},{"location":"zh/usage/acceleration_cards/AMD/#2","title":"2. Cause","text":"

AMD RDNA GPUs have a severe performance problem with the vllm backend. The cause is that one operator in vllm's qwen2_vl.py has no corresponding rocm kernel implementation, so the convolution falls back to a very slow path; a single execution took 12s. Specifically, the MIOpen library lacks an optimized kernel for the particular Conv3d (bfloat16) used in the model. The dilated convolution in DocLayout-YOLO's g2l_crm.py has the same problem, and even the professional CDNA MI210 does not solve it, so both are handled together here.

"},{"location":"zh/usage/acceleration_cards/AMD/#3","title":"3. Environment","text":"

System: Ubuntu 24.04.3
Kernel: Linux 6.14.0-33-generic
ROCm version: 7.0.1
Python environment:
  python 3.12
  pytorch-triton-rocm 3.5.0+gitbbb06c03
  torch 2.10.0.dev20251001+rocm7.0
  torchvision 0.25.0.dev20251003+rocm7.0
  vllm 0.11.0rc2.dev198+g736fbf4c8.rocm701
Other versions work as well; the procedure is the same.

"},{"location":"zh/usage/acceleration_cards/AMD/#4","title":"4. Prerequisite environment setup","text":"
uv venv --python python3.12\nsource .venv/bin/activate\nuv pip install --pre torch torchvision   -i https://pypi.tuna.tsinghua.edu.cn/simple/   --extra-index-url https://download.pytorch.org/whl/nightly/rocm7.0\nuv pip install pip\n# Use pip rather than uv pip from here on, to avoid overwriting the local pytorch\npip install -U \"mineru[core]\" -i https://pypi.mirrors.ustc.edu.cn/simple/\n
For vllm installation, refer to the official vLLM manual
# Manually install aiter, vllm, amd-smi, etc.; pick a location to clone into and cd there first\ngit clone --recursive https://github.com/ROCm/aiter.git\ncd aiter\ngit submodule sync; git submodule update --init --recursive\npython setup.py develop\ncd ..\ngit clone https://github.com/vllm-project/vllm.git\ncd vllm/\n# Copy amd_smi into the vllm checkout (the current directory)\ncp -r /opt/rocm/share/amd_smi .\npip install amd_smi/\npip install --upgrade numba \\\n    scipy \\\n    huggingface-hub[cli,hf_transfer] \\\n    setuptools_scm\npip install -r requirements/rocm.txt\nexport PYTORCH_ROCM_ARCH=\"gfx1100\"   # Set according to your GPU architecture: rocminfo | grep gfx\npython setup.py develop\n
"},{"location":"zh/usage/acceleration_cards/AMD/#5vllmtriton","title":"5. Adding the key triton operator to vllm","text":""},{"location":"zh/usage/acceleration_cards/AMD/#1518sit7900xtx13sitamd-gpu","title":"Two solutions are given here. The first is the optimization mentioned above, reaching 1.5 to 1.8s/it. The second manually rewrites the operator as a matrix multiplication; it definitely works on the 7900xtx at roughly 1.3s/it, and on other AMD GPUs it is also faster than the first solution, though not necessarily the fastest possible implementation, and the hand-tuned parts may need adjusting.","text":"

Note: use pip to uninstall the triton-backend flash_attn. After many attempts it still raised errors and proved too problematic, so simply do not use it.

# Locate your vllm install location XXX\npip show vllm\n
The key changes are in XXX/vllm/model_executor/models/qwen2_vl.py: 1. Below line 33 of qwen2_vl.py, add from .qwen2_vl_vision_kernels import triton_conv3d_patchify
from collections.abc import Iterable, Mapping, Sequence\nfrom functools import partial\nfrom typing import Annotated, Any, Callable, Literal, Optional, Union\n\nimport torch\nimport torch.nn as nn\nimport torch.nn.functional as F\nfrom .qwen2_vl_vision_kernels import triton_conv3d_patchify\n
From here there are two options, option 1 (steps 2.1 and 3.1) and option 2 (steps 2.2 and 3.2); implement either one.

Option 1, step 2.1: at line 498 of qwen2_vl.py, modify class Qwen2VisionPatchEmbed(nn.Module). PS: this is the module for which AMD has no ready-made kernel, which causes the fallback.

class Qwen2VisionPatchEmbed(nn.Module):\n\n    def __init__(\n        self,\n        patch_size: int = 14,\n        temporal_patch_size: int = 2,\n        in_channels: int = 3,\n        embed_dim: int = 1152,\n    ) -> None:\n        super().__init__()\n        self.patch_size = patch_size\n        self.temporal_patch_size = temporal_patch_size\n        self.embed_dim = embed_dim\n\n        kernel_size = (temporal_patch_size, patch_size, patch_size)\n        self.proj = nn.Conv3d(in_channels,\n                              embed_dim,\n                              kernel_size=kernel_size,\n                              stride=kernel_size,\n                              bias=False)\n    def forward(self, x: torch.Tensor) -> torch.Tensor:\n        L, C = x.shape\n        x_reshaped = x.view(L, -1, self.temporal_patch_size, self.patch_size,\n                            self.patch_size)\n\n        # Call your custom Triton kernel instead of self.proj\n        x_out = triton_conv3d_patchify(x_reshaped, self.proj.weight)\n\n        # The output of our kernel is already the correct shape [L, embed_dim]\n        return x_out\n
Step 3.1: create qwen2_vl_vision_kernels.py in the XXX/vllm/model_executor/models/ directory and implement the kernel with triton
import torch\nfrom vllm.triton_utils import tl, triton\n\n@triton.jit\ndef _conv3d_patchify_kernel(\n    # Pointers to tensors\n    X, W, Y,\n    # Tensor dimensions\n    N, C_in, D_in, H_in, W_in,\n    C_out, KD, KH, KW,\n    # Stride and padding for memory access\n    stride_xn, stride_xc, stride_xd, stride_xh, stride_xw,\n    stride_wn, stride_wc, stride_wd, stride_wh, stride_ww,\n    stride_yn, stride_yc,\n    # Triton-specific metaparameters\n    BLOCK_SIZE: tl.constexpr,\n):\n    \"\"\"\n    Triton kernel for a non-overlapping 3D patching convolution.\n    Each kernel instance computes one output value for one patch.\n    \"\"\"\n    # Get the program IDs for the N (patch) and C_out (output channel) dimensions\n    pid_n = tl.program_id(0)  # The index of the patch we are processing\n    pid_cout = tl.program_id(1) # The index of the output channel we are computing\n\n    # --- Calculate memory pointers ---\n    # Pointer to the start of the current input patch\n    x_ptr = X + (pid_n * stride_xn)\n    # Pointer to the start of the current filter (weight)\n    w_ptr = W + (pid_cout * stride_wn)\n    # Pointer to where the output will be stored\n    y_ptr = Y + (pid_n * stride_yn + pid_cout * stride_yc)\n\n    # --- Perform the convolution (element-wise product and sum) ---\n    # This is a dot product between the flattened patch and the flattened filter.\n    accumulator = tl.zeros((BLOCK_SIZE,), dtype=tl.float32)\n\n    # Iterate over the elements of the patch/filter\n    for c_offset in range(0, C_in):\n        for d_offset in range(0, KD):\n            for h_offset in range(0, KH):\n                # Unrolled loop for the innermost dimension (width) for performance\n                for w_offset in range(0, KW, BLOCK_SIZE):\n                    # Create masks to handle cases where KW is not a multiple of BLOCK_SIZE\n                    w_range = w_offset + tl.arange(0, BLOCK_SIZE)\n                    w_mask = w_range < KW\n\n                    # Calculate 
offsets to load data\n                    patch_offset = (c_offset * stride_xc + d_offset * stride_xd +\n                                    h_offset * stride_xh + w_range * stride_xw)\n                    filter_offset = (c_offset * stride_wc + d_offset * stride_wd +\n                                     h_offset * stride_wh + w_range * stride_ww)\n\n                    # Load patch and filter data, applying masks\n                    patch_vals = tl.load(x_ptr + patch_offset, mask=w_mask, other=0.0)\n                    filter_vals = tl.load(w_ptr + filter_offset, mask=w_mask, other=0.0)\n\n                    # Multiply and accumulate\n                    accumulator += patch_vals.to(tl.float32) * filter_vals.to(tl.float32)\n\n    # Sum the accumulator block and store the single output value\n    output_val = tl.sum(accumulator, axis=0)\n    tl.store(y_ptr, output_val)\n\n\ndef triton_conv3d_patchify(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:\n    \"\"\"\n    Python wrapper for the 3D patching convolution Triton kernel.\n    \"\"\"\n    # Get tensor dimensions\n    N, C_in, D_in, H_in, W_in = x.shape\n    C_out, _, KD, KH, KW = weight.shape\n\n    # Create the output tensor\n    # The output of this specific conv is (N, C_out, 1, 1, 1), which we squeeze\n    Y = torch.empty((N, C_out), dtype=x.dtype, device=x.device)\n\n    # Define the grid for launching the Triton kernel\n    # Each kernel instance handles one patch (N) for one output channel (C_out)\n    grid = (N, C_out)\n\n    # Launch the kernel\n    # We pass all strides to make the kernel flexible\n    _conv3d_patchify_kernel[grid](\n        x, weight, Y,\n        N, C_in, D_in, H_in, W_in,\n        C_out, KD, KH, KW,\n        x.stride(0), x.stride(1), x.stride(2), x.stride(3), x.stride(4),\n        weight.stride(0), weight.stride(1), weight.stride(2), weight.stride(3), weight.stride(4),\n        Y.stride(0), Y.stride(1),\n        BLOCK_SIZE=16, # A reasonable default, can be tuned\n    
)\n\n    return Y\n

Option 2, step 2.2: at line 498 of qwen2_vl.py, modify class Qwen2VisionPatchEmbed(nn.Module). PS: this is the module for which AMD has no ready-made kernel, which causes the fallback; here the 5D tensor is handled in one step by rewriting the convolution as a matrix multiplication.

class Qwen2VisionPatchEmbed(nn.Module):\n\n    def __init__(\n        self,\n        patch_size: int = 14,\n        temporal_patch_size: int = 2,\n        in_channels: int = 3,\n        embed_dim: int = 1152,\n    ) -> None:\n        super().__init__()\n        self.patch_size = patch_size\n        self.temporal_patch_size = temporal_patch_size\n        self.embed_dim = embed_dim\n\n        kernel_size = (temporal_patch_size, patch_size, patch_size)\n\n        self.proj = nn.Conv3d(in_channels,\n                              embed_dim,\n                              kernel_size=kernel_size,\n                              stride=kernel_size,\n                              bias=False)\n\n    def forward(self, x: torch.Tensor) -> torch.Tensor:\n        L, C = x.shape\n        x_reshaped_5d = x.view(L, -1, self.temporal_patch_size, self.patch_size,\n                               self.patch_size)\n\n        return triton_conv3d_patchify(x_reshaped_5d, self.proj.weight)\n
Step 3.2: create qwen2_vl_vision_kernels.py in the XXX/vllm/model_executor/models/ directory and implement the kernel with triton
import torch\nfrom vllm.triton_utils import tl, triton\n\n@triton.jit\ndef _conv_gemm_kernel(\n    A, B, C, M, N, K,\n    stride_am, stride_ak,\n    stride_bk, stride_bn,\n    stride_cm, stride_cn,\n    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr,\n):\n    pid_m = tl.program_id(0)\n    pid_n = tl.program_id(1)\n    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)\n    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)\n    offs_k = tl.arange(0, BLOCK_K)\n    a_ptrs = A + (offs_m[:, None] * stride_am + offs_k[None, :] * stride_ak)\n    b_ptrs = B + (offs_k[:, None] * stride_bk + offs_n[None, :] * stride_bn)\n    accumulator = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)\n    for k in range(0, K, BLOCK_K):\n        a = tl.load(a_ptrs, mask=(offs_m[:, None] < M) & (offs_k[None, :] < K), other=0.0)\n        b = tl.load(b_ptrs, mask=(offs_k[:, None] < K) & (offs_n[None, :] < N), other=0.0)\n        accumulator += tl.dot(a, b)\n        a_ptrs += BLOCK_K * stride_ak\n        b_ptrs += BLOCK_K * stride_bk\n        offs_k += BLOCK_K\n    c = accumulator.to(C.dtype.element_ty)\n    offs_cm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)\n    offs_cn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)\n    c_ptrs = C + stride_cm * offs_cm[:, None] + stride_cn * offs_cn[None, :]\n    c_mask = (offs_cm[:, None] < M) & (offs_cn[None, :] < N)\n    tl.store(c_ptrs, c, mask=c_mask)\n\ndef triton_conv3d_patchify(x_5d: torch.Tensor, weight_5d: torch.Tensor) -> torch.Tensor:\n    N_patches, _, _, _, _ = x_5d.shape\n    C_out, _, _, _, _ = weight_5d.shape\n    A = x_5d.view(N_patches, -1)\n    B = weight_5d.view(C_out, -1).transpose(0, 1).contiguous()\n    M, K = A.shape\n    _K, N = B.shape\n    assert K == _K\n    C = torch.empty((M, N), device=A.device, dtype=A.dtype)\n\n    # --- 
Hand-tuned configuration for the 7900xtx; the best combination for other GPUs may need to be found manually, since autotune on AMD has effectively no effect ---\n    best_config = {\n        'BLOCK_M': 128,\n        'BLOCK_N': 128,\n        'BLOCK_K': 32,\n    }\n    num_stages = 4\n    num_warps = 8\n\n    grid = (triton.cdiv(M, best_config['BLOCK_M']),\n            triton.cdiv(N, best_config['BLOCK_N']))\n\n    _conv_gemm_kernel[grid](\n        A, B, C,\n        M, N, K,\n        A.stride(0), A.stride(1),\n        B.stride(0), B.stride(1),\n        C.stride(0), C.stride(1),\n        **best_config,\n        num_stages=num_stages,\n        num_warps=num_warps\n    )\n\n    return C\n
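
Option 2 relies on the fact that this Conv3d uses a kernel size equal to its stride, and each input row is exactly one kernel-sized patch. Each output value is therefore a dot product between the flattened patch and one flattened filter, and the whole layer becomes a matrix multiplication. The following pure-Python sketch checks that equivalence on tiny made-up sizes. It is an illustration only, not the Triton kernel above, and all helper names are hypothetical.

```python
import itertools
import random


def flatten(block):
    # Row-major flatten of a nested [c][d][h][w] list, like a 1-D tensor view.
    return [v for plane in block for row in plane for line in row for v in line]


def conv_patch(patch, filters):
    # Direct form: for each filter, sum patch * filter over every position.
    dims = (len(patch), len(patch[0]), len(patch[0][0]), len(patch[0][0][0]))
    outputs = []
    for filt in filters:
        total = 0
        for c, d, h, w in itertools.product(*(range(n) for n in dims)):
            total += patch[c][d][h][w] * filt[c][d][h][w]
        outputs.append(total)
    return outputs


def matmul_patch(patch, filters):
    # GEMM form: a length-K row times a K x C_out matrix, K = c * d * h * w.
    row = flatten(patch)
    return [sum(x * y for x, y in zip(row, flatten(filt))) for filt in filters]


def random_block(rng, c, d, h, w):
    # Small integer entries keep the comparison exact.
    return [[[[rng.randint(-3, 3) for _ in range(w)] for _ in range(h)]
             for _ in range(d)] for _ in range(c)]


rng = random.Random(0)
patch = random_block(rng, 3, 2, 2, 2)                        # one input patch
filters = [random_block(rng, 3, 2, 2, 2) for _ in range(4)]  # four channels
# Compare the direct convolution with the flattened dot-product form.
print(conv_patch(patch, filters) == matmul_patch(patch, filters))
```

In the real module the same reshaping turns the input into an (N_patches, K) matrix and the weights into a (K, C_out) matrix, which is what the A and B views in triton_conv3d_patchify express.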

4. After closing the terminal, running mineru-gradio again raises a LoRA error; modify the code to skip it

pip show mineru_vl_utils\n

Open XXX/mineru_vl_utils/vlm_client/vllm_async_engine_client.py and change line 58, self.tokenizer = vllm_async_llm.tokenizer.get_lora_tokenizer(), to:

        try:\n            self.tokenizer = vllm_async_llm.tokenizer.get_lora_tokenizer()\n        except AttributeError:\n            # If get_lora_tokenizer does not exist, use the original tokenizer directly\n            self.tokenizer = vllm_async_llm.tokenizer\n

Finally, set these two environment variables and you are ready to go

export MINERU_MODEL_SOURCE=modelscope\nexport TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1\n
"},{"location":"zh/usage/acceleration_cards/AMD/#6vllmpipeline-layoutdoclayout-yolo","title":"6. The vllm backend now works; what follows is the dilated-convolution problem in the doclayout-yolo model used for layout in the pipeline","text":""},{"location":"zh/usage/acceleration_cards/AMD/#doclayout-yolo-pipeline","title":"I posted an answer under DocLayout-YOLO, so the pipeline dilated-convolution problem is not repeated here; click the link to read it.","text":"

Locate your doclayout-yolo installation as shown below, then edit the files described in the linked reply

pip show doclayout-yolo\n
"},{"location":"zh/usage/acceleration_cards/Ascend/","title":"Ascend","text":""},{"location":"zh/usage/acceleration_cards/Ascend/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: CTyunOS 22.06  \ncpu: Kunpeng-920 (aarch64)  \nnpu: Ascend 910B2  \ndriver: 23.0.3 \ndocker: 20.10.12\n
"},{"location":"zh/usage/acceleration_cards/Ascend/#2","title":"2. Environment preparation","text":"

Note

Ascend accelerator cards support VLM inference acceleration with either vllm or lmdeploy. Choose one to install and use according to your needs:

"},{"location":"zh/usage/acceleration_cards/Ascend/#21-dockerfile-vllm","title":"2.1 Build the image with a Dockerfile (vllm)","text":"

Tip

ascend-vllm supports the following devices:

  • Atlas A2 training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
  • Atlas 800I A2 inference series (Atlas 800I A2)
  • Atlas A3 training series (Atlas 800T A3, Atlas 900 A3 SuperPoD, Atlas 9000 A3 SuperPoD)
  • Atlas 800I A3 inference series (Atlas 800I A3)
  • [Experimental] Atlas 300I inference series (Atlas 300I Duo)

The third line of the Dockerfile specifies the ascend-vllm base image. The default tag is the A2-compatible version, for example v0.11.0

  • To use the A3-compatible version, change the tag on the third line to v0.11.0-a3 before running the build.
  • To use the Atlas 300I Duo-compatible version, change the tag on the third line to v0.10.0rc1-310p before running the build.
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/npu.Dockerfile\ndocker build --network=host -t mineru:npu-vllm-latest -f npu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Ascend/#22-dockerfile-lmdeploy","title":"2.2 Build the image with a Dockerfile (lmdeploy)","text":"

Tip

ascend-lmdeploy supports the following devices:

  • Atlas A2 training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
  • Atlas 800I A2 inference series (Atlas 800I A2)

If your device belongs to the Atlas A3 series or the Atlas 300I Duo series, use the vllm image instead.

wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/npu.Dockerfile\n# Switch the base image from vllm to lmdeploy\nsed -i '3s/^/# /' npu.Dockerfile && sed -i '5s/^# //' npu.Dockerfile\ndocker build --network=host -t mineru:npu-lmdeploy-latest -f npu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Ascend/#3-docker","title":"3. Start the Docker container","text":"
docker run -u root --name mineru_docker --privileged=true \\\n    --ipc=host \\\n    --network=host \\\n    --device=/dev/davinci0 \\\n    --device=/dev/davinci1 \\\n    --device=/dev/davinci2 \\\n    --device=/dev/davinci3 \\\n    --device=/dev/davinci4 \\\n    --device=/dev/davinci5 \\\n    --device=/dev/davinci6 \\\n    --device=/dev/davinci7 \\\n    --device=/dev/davinci_manager \\\n    --device=/dev/devmm_svm \\\n    --device=/dev/hisi_hdc \\\n    -v /var/log/npu/:/usr/slog \\\n    -v /usr/local/dcmi:/usr/local/dcmi \\\n    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \\\n    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \\\n    -e VLLM_WORKER_MULTIPROC_METHOD=spawn \\\n    -e MINERU_MODEL_SOURCE=local \\\n    -e MINERU_LMDEPLOY_DEVICE=ascend \\\n    -it mineru:npu-vllm-latest \\\n    /bin/bash\n

Tip

Choose the vllm or lmdeploy image according to your situation. To use lmdeploy, replace mineru:npu-vllm-latest in the command above with mineru:npu-lmdeploy-latest.

After running this command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.
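The docker run command above passes eight /dev/davinci devices. For a host with a different number of cards, the flags could be generated rather than typed. This is a sketch under the assumption that device nodes are numbered contiguously from 0; the helper name davinci_device_flags is hypothetical.

```shell
# Print one --device flag per card, for card IDs 0..COUNT-1.
# Assumption: device nodes are named /dev/davinci0, /dev/davinci1, ...
davinci_device_flags() {
  i=0
  flags=''
  while [ "$i" -lt "$1" ]; do
    flags="$flags --device=/dev/davinci$i"
    i=$((i + 1))
  done
  # Strip the leading space before printing.
  printf '%s\n' "${flags# }"
}

davinci_device_flags 2
```

In a docker run command, the output would be used unquoted, as in $(davinci_device_flags 4), so that the shell splits it into separate arguments.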

Note

Because 310p accelerator cards support neither graph mode nor bf16 precision, append --enforce-eager --dtype float16 to any vllm-related command when using them. For example:

mineru-openai-server --port 30000 --enforce-eager --dtype float16\n
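If one launch script has to serve several card types, the extra flags can be made conditional. This is a minimal sketch, assuming you track the chip family yourself: the helper name vllm_extra_flags and the family labels passed to it are hypothetical and are not values that MinerU reads.

```shell
# Print the extra vllm flags needed for a given chip family.
# Only 310p needs them, per the note above; other labels print nothing.
vllm_extra_flags() {
  case "$1" in
    310p) printf '%s\n' '--enforce-eager --dtype float16' ;;
    *) : ;;
  esac
}

vllm_extra_flags 310p
```

A launch line would then read mineru-openai-server --port 30000 $(vllm_extra_flags 310p), with the substitution left unquoted so the flags are split into separate arguments.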
"},{"location":"zh/usage/acceleration_cards/Ascend/#4","title":"4. \u6ce8\u610f\u4e8b\u9879","text":"

The table below shows MinerU's support for Ascend accelerator cards in different environments:

| Usage scenario | vllm container | lmdeploy container |
|---|---|---|
| Command-line tool (mineru): pipeline | 🟢 | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟢 | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟢 | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟢 | 🟢 |
| openai-server service (mineru-openai-server) | 🟢 | 🟢 |
| Data parallelism (--data-parallel-size/--dp) | 🟢 | 🔴 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which NPU cards are visible works much as it does for NVIDIA GPUs; see ASCEND_RT_VISIBLE_DEVICES.
  • On Ascend platforms, run npu-smi info to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
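As a sketch of the tip above: ASCEND_RT_VISIBLE_DEVICES takes a comma-separated list of card IDs, which can be assembled from the idle IDs reported by npu-smi info. The helper name join_ids is hypothetical, and the IDs 0 and 1 are arbitrary examples.

```shell
# Join card IDs with commas, e.g. for ASCEND_RT_VISIBLE_DEVICES.
# The body runs in a subshell so the IFS change does not leak out.
join_ids() (
  IFS=,
  printf '%s\n' "$*"
)

join_ids 0 1
```

The result can prefix a single command, for example ASCEND_RT_VISIBLE_DEVICES=$(join_ids 0 1) mineru-openai-server --port 30000, so that only the listed cards are visible to that process.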
"},{"location":"zh/usage/acceleration_cards/Biren/","title":"Biren","text":""},{"location":"zh/usage/acceleration_cards/Biren/#1","title":"1. \u6d4b\u8bd5\u5e73\u53f0","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.4 LTS\ncpu: Intel x86-64\ngpu: Biren 106C\ndriver: 1.10.0\ndocker: 28.0.4\n
"},{"location":"zh/usage/acceleration_cards/Biren/#2","title":"2. Environment setup","text":""},{"location":"zh/usage/acceleration_cards/Biren/#21-vllm","title":"2.1 Download and load the image (vllm)","text":"
wget http://birentech.com/xxx/MinerU/mineru-vllm.tar  # To obtain the download link, contact Biren staff (email: MonaLiu@birentech.com)\ndocker load -i mineru-vllm.tar\n
"},{"location":"zh/usage/acceleration_cards/Biren/#3-docker","title":"3. Start the Docker container","text":"
docker run -it --name mineru_docker \\\n    --privileged \\\n    --network=host \\\n    --shm-size=100G \\\n    -e MINERU_MODEL_SOURCE=local \\\n    -e MINERU_DEVICE_MODEL=supa \\\n    -e SHAPE_TRANSFORM_GRANK=true \\\n    mineru:biren-vllm-latest \\\n    /bin/bash\n

After running this command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.

"},{"location":"zh/usage/acceleration_cards/Biren/#4","title":"4. Notes","text":"

The table below shows MinerU's support for Biren accelerator cards in different environments:

| Usage scenario | vllm container |
|---|---|
| Command-line tool (mineru): pipeline | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | 🟢 |
| Data parallelism (--data-parallel-size) | 🔴 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which Biren cards are visible works much as it does for NVIDIA GPUs. Follow the Specifying GPU Devices section, replacing the environment variable CUDA_VISIBLE_DEVICES with SUPA_VISIBLE_DEVICES.
  • On Biren platforms, run brsmi to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/Cambricon/","title":"Cambricon","text":""},{"location":"zh/usage/acceleration_cards/Cambricon/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  \ncpu: Hygon Hygon C86 7490\nmlu: MLU590-M9D\ndriver: v6.2.11\ndocker: 28.3.0\n
"},{"location":"zh/usage/acceleration_cards/Cambricon/#2","title":"2. Environment setup","text":"

Note

Cambricon accelerator cards can use either lmdeploy or vllm to accelerate VLM inference. Install and use whichever fits your needs:

"},{"location":"zh/usage/acceleration_cards/Cambricon/#21-dockerfile-lmdeploy","title":"2.1 Build the image with a Dockerfile (lmdeploy)","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/mlu.Dockerfile\ndocker build --network=host -t mineru:mlu-lmdeploy-latest -f mlu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Cambricon/#22-dockerfile-vllm","title":"2.2 Build the image with a Dockerfile (vllm)","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/mlu.Dockerfile\n# Switch the base image from lmdeploy to vllm\nsed -i -e '3,4s/^/# /' -e '6,7s/^# //' mlu.Dockerfile\ndocker build --network=host -t mineru:mlu-vllm-latest -f mlu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Cambricon/#3-docker","title":"3. Start the Docker container","text":"
docker run --name mineru_docker \\\n   --privileged \\\n   --ipc=host \\\n   --network=host \\\n   --shm-size=400g \\\n   --ulimit memlock=-1 \\\n   -v /dev:/dev \\\n   -v /lib/modules:/lib/modules:ro \\\n   -v /usr/bin/cnmon:/usr/bin/cnmon \\\n   -e MINERU_MODEL_SOURCE=local \\\n   -e MINERU_LMDEPLOY_DEVICE=camb \\\n   -it mineru:mlu-lmdeploy-latest \\\n   /bin/bash\n

Tip

Choose the vllm or lmdeploy image according to your situation. To use vllm, do the following:

  • Replace mineru:mlu-lmdeploy-latest in the command above with mineru:mlu-vllm-latest.

  • After entering the container, switch to the venv with the following command:

    source /torch/venv3/pytorch_infer/bin/activate\n
  • Once the switch succeeds, the command prompt is prefixed with (pytorch_infer), which indicates that you are in the vllm virtual environment.

After running the docker run command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.

"},{"location":"zh/usage/acceleration_cards/Cambricon/#4","title":"4. Notes","text":"

Note

Compatibility note: Cambricon's support for the vLLM v1 engine is still incomplete, so MinerU currently uses the v0 engine on this platform. As a result, vLLM's Async Engine has compatibility issues that may prevent some usage scenarios from working. We will keep tracking Cambricon's progress on vLLM v1 support and adapt and optimize MinerU accordingly.

The table below shows MinerU's support for Cambricon accelerator cards in different environments:

Tip

  • The 🟡 status for lmdeploy means that batch parsing with a folder as input does not work; parsing a single input file works normally.
  • The 🟡 status for vllm means that accuracy has not yet been aligned, so unexpected results may occur in some scenarios.

| Usage scenario | vllm container | lmdeploy container |
|---|---|---|
| Command-line tool (mineru): pipeline | 🟢 | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟡 | 🟡 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟡 | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🔴 | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟡 | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🔴 | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟡 | 🟢 |
| openai-server service (mineru-openai-server) | 🟡 | 🟢 |
| Data parallelism (--data-parallel-size/--dp) | 🔴 | 🔴 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which Cambricon cards are visible works much as it does for NVIDIA GPUs. Follow the Specifying GPU Devices section, replacing the environment variable CUDA_VISIBLE_DEVICES with MLU_VISIBLE_DEVICES.
  • On Cambricon platforms, run cnmon to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/Enflame/","title":"Enflame","text":""},{"location":"zh/usage/acceleration_cards/Enflame/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.4 LTS  \ncpu: Intel x86-64\ngcu: Enflame S60 \ndriver: 1.7.0.9\ndocker: 28.0.1\n
"},{"location":"zh/usage/acceleration_cards/Enflame/#2","title":"2. Environment setup","text":""},{"location":"zh/usage/acceleration_cards/Enflame/#21-dockerfile","title":"2.1 Build the image with a Dockerfile","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/gcu.Dockerfile\ndocker build --network=host -t mineru:gcu-vllm-latest -f gcu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Enflame/#3-docker","title":"3. Start the Docker container","text":"
docker run -u root --name mineru_docker \\\n    --network=host \\\n    --ipc=host \\\n    --privileged \\\n    -e MINERU_MODEL_SOURCE=local \\\n    -it mineru:gcu-vllm-latest \\\n    /bin/bash\n

After running this command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.

"},{"location":"zh/usage/acceleration_cards/Enflame/#4","title":"4. Notes","text":"

The table below shows MinerU's support for Enflame accelerator cards in different environments:

| Usage scenario | vllm container |
|---|---|
| Command-line tool (mineru): pipeline | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | 🟢 |
| Data parallelism (--data-parallel-size) | 🔴 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which GCU cards are visible works much as it does for NVIDIA GPUs. Follow the Specifying GPU Devices section, replacing the environment variable CUDA_VISIBLE_DEVICES with TOPS_VISIBLE_DEVICES.
  • On Enflame platforms, run efsmi to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/Hygon/","title":"Hygon","text":""},{"location":"zh/usage/acceleration_cards/Hygon/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.3 LTS  \ncpu: Hygon C86-4G(x86-64)\ndcu: BW200\ndriver: 6.3.13-V1.12.0a\ndocker: 20.10.24\n
"},{"location":"zh/usage/acceleration_cards/Hygon/#2","title":"2. Environment setup","text":""},{"location":"zh/usage/acceleration_cards/Hygon/#21-dockerfile","title":"2.1 Build the image with a Dockerfile","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/dcu.Dockerfile\ndocker build --network=host -t mineru:dcu-vllm-latest -f dcu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Hygon/#3-docker","title":"3. Start the Docker container","text":"
docker run -u root --name mineru_docker \\\n    --network=host \\\n    --ipc=host \\\n    --shm-size=16G \\\n    --device=/dev/kfd \\\n    --device=/dev/mkfd \\\n    --device=/dev/dri \\\n    -v /opt/hyhal:/opt/hyhal \\\n    --group-add video \\\n    --cap-add=SYS_PTRACE \\\n    --security-opt seccomp=unconfined \\\n    -e MINERU_MODEL_SOURCE=local \\\n    -it mineru:dcu-vllm-latest \\\n    /bin/bash\n

After running this command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.

"},{"location":"zh/usage/acceleration_cards/Hygon/#4","title":"4. Notes","text":"

The table below shows MinerU's support for Hygon accelerator cards in different environments:

| Usage scenario | vllm container |
|---|---|
| Command-line tool (mineru): pipeline | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | 🟢 |
| Data parallelism (--data-parallel-size) | 🟢 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which DCU cards are visible works much as it does for AMD GPUs; see GPU isolation techniques.
  • On Hygon platforms, run hy-smi to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/IluvatarCorex/","title":"IluvatarCorex","text":""},{"location":"zh/usage/acceleration_cards/IluvatarCorex/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  \ncpu: Intel x86-64\ngpu: Iluvatar BI-V150\ndriver: 4.4.0\ndocker: 28.1.1\n
"},{"location":"zh/usage/acceleration_cards/IluvatarCorex/#2","title":"2. Environment setup","text":""},{"location":"zh/usage/acceleration_cards/IluvatarCorex/#21-dockerfile","title":"2.1 Build the image with a Dockerfile","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/corex.Dockerfile\ndocker build --network=host -t mineru:corex-vllm-latest -f corex.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/IluvatarCorex/#3-docker","title":"3. Start the Docker container","text":"
docker run --name mineru_docker \\\n   -v /usr/src:/usr/src \\\n   -v /lib/modules:/lib/modules \\\n   -v /dev:/dev \\\n   --privileged \\\n   --cap-add=ALL \\\n   --pid=host \\\n   --group-add video \\\n   --network=host \\\n   --shm-size '400gb' \\\n   --ulimit memlock=-1 \\\n   --security-opt seccomp=unconfined \\\n   --security-opt apparmor=unconfined \\\n   -e VLLM_ENFORCE_CUDA_GRAPH=1 \\\n   -e MINERU_MODEL_SOURCE=local \\\n   -e MINERU_VLLM_DEVICE=corex \\\n   -it mineru:corex-vllm-latest \\\n   /bin/bash\n

After running this command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.

"},{"location":"zh/usage/acceleration_cards/IluvatarCorex/#4","title":"4. Notes","text":"

Tip

With vllm as the inference engine, the Iluvatar setup may currently fail to release device memory after the service stops. If this happens, restart the Docker container to free the memory.

The table below shows MinerU's support for Iluvatar accelerator cards in different environments:

| Usage scenario | vllm container |
|---|---|
| Command-line tool (mineru): pipeline | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | 🟢 |
| Data parallelism (--data-parallel-size) | 🟢 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which Iluvatar cards are visible works much as it does for NVIDIA GPUs; follow the Specifying GPU Devices section.
  • On Iluvatar platforms, run ixsmi to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/Kunlunxin/","title":"Kunlunxin","text":""},{"location":"zh/usage/acceleration_cards/Kunlunxin/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  \ncpu: Intel x86-64\nxpu: P800\ndriver: 515.58\ndocker: 20.10.5\n
"},{"location":"zh/usage/acceleration_cards/Kunlunxin/#2","title":"2. Environment setup","text":""},{"location":"zh/usage/acceleration_cards/Kunlunxin/#21-dockerfile-vllm","title":"2.1 Build the image with a Dockerfile (vllm)","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/kxpu.Dockerfile\ndocker build --network=host -t mineru:kxpu-vllm-latest -f kxpu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/Kunlunxin/#3-docker","title":"3. Start the Docker container","text":"
docker run -u root --name mineru_docker \\\n    --device=/dev/xpu0:/dev/xpu0 \\\n    --device=/dev/xpu1:/dev/xpu1 \\\n    --device=/dev/xpu2:/dev/xpu2 \\\n    --device=/dev/xpu3:/dev/xpu3 \\\n    --device=/dev/xpu4:/dev/xpu4 \\\n    --device=/dev/xpu5:/dev/xpu5 \\\n    --device=/dev/xpu6:/dev/xpu6 \\\n    --device=/dev/xpu7:/dev/xpu7 \\\n    --device=/dev/xpuctrl:/dev/xpuctrl \\\n    --net=host \\\n    --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \\\n    --tmpfs /dev/shm:rw,nosuid,nodev,exec,size=32g \\\n    -v /home/users/vllm-kunlun:/home/vllm-kunlun \\\n    -v /usr/local/bin/xpu-smi:/usr/local/bin/xpu-smi \\\n    -w /workspace \\\n    -e MINERU_MODEL_SOURCE=local \\\n    -e MINERU_FORMULA_CH_SUPPORT=true \\\n    -e MINERU_VLLM_DEVICE=kxpu \\\n    -it mineru:kxpu-vllm-latest \\\n    /bin/bash\n

After running this command, you enter an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service startup command; for details, see Starting Services via Command.

"},{"location":"zh/usage/acceleration_cards/Kunlunxin/#4","title":"4. Notes","text":"

The table below shows MinerU's support for Kunlunxin accelerator cards in different environments:

| Usage scenario | vllm container |
|---|---|
| Command-line tool (mineru): pipeline | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-auto-engine | 🟢 |
| Command-line tool (mineru): <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api): pipeline | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-auto-engine | 🟢 |
| FastAPI service (mineru-api): <vlm/hybrid>-http-client | 🟢 |
| Gradio interface (mineru-gradio): pipeline | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-auto-engine | 🟢 |
| Gradio interface (mineru-gradio): <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | 🟢 |
| Data parallelism (--data-parallel-size) | 🔴 |

Legend:
  • 🟢: Supported; runs stably, with accuracy essentially matching NVIDIA GPUs.
  • 🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat.
  • 🔴: Not supported; cannot run, or accuracy differs significantly.

Tip

  • Selecting which Kunlunxin cards are visible works much as it does for NVIDIA GPUs. Follow the Specifying GPU Devices section, replacing the environment variable CUDA_VISIBLE_DEVICES with XPU_VISIBLE_DEVICES.
  • On Kunlunxin platforms, run xpu-smi to check accelerator usage, then specify the IDs of idle cards as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/METAX/","title":"METAX","text":""},{"location":"zh/usage/acceleration_cards/METAX/#1","title":"1. Test platform","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04   \ncpu: INTEL x86_64\ngpu: C500  \ndriver: 2.12.13\ndocker: 28.1.1\n
"},{"location":"zh/usage/acceleration_cards/METAX/#2","title":"2. Environment setup","text":"

Note

maca accelerator cards can use either vllm or lmdeploy to accelerate VLM inference. Install and use whichever fits your needs:

"},{"location":"zh/usage/acceleration_cards/METAX/#21-metaxvllm","title":"2.1 Build the vllm environment image on top of the official metax base image","text":"
  1. Pull the base image from the official metax registry
    • 1.1 Image source: https://developer.metax-tech.com/softnova/docker
    • 1.2 On the image site, select the AI category, choose vllm as the package type, and choose ubuntu as the operating system
    • 1.3 Find the vllm:maca.ai3.1.0.7-torch2.6-py310-ubuntu22.04-amd64 image, copy its pull command, and run it in a local terminal
  2. Build the image with a Dockerfile (vllm)
    wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/maca.Dockerfile\ndocker build --network=host -t mineru:maca-vllm-latest -f maca.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/METAX/#22-dockerfile-lmdeploy","title":"2.2 Build the image with a Dockerfile (lmdeploy)","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/maca.Dockerfile\n# Switch the base image from vllm to lmdeploy\nsed -i '3s/^/# /' maca.Dockerfile && sed -i '5s/^# //' maca.Dockerfile\ndocker build --network=host -t mineru:maca-lmdeploy-latest -f maca.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/METAX/#3-docker","title":"3. Start the Docker container","text":"
docker run --ipc host \\\n   --cap-add SYS_PTRACE \\\n   --privileged=true \\\n   --device=/dev/mem \\\n   --device=/dev/dri \\\n   --device=/dev/mxcd \\\n   --device=/dev/infiniband \\\n   --group-add video \\\n   --network=host \\\n   --shm-size '100gb' \\\n   --ulimit memlock=-1 \\\n   --security-opt seccomp=unconfined \\\n   --security-opt apparmor=unconfined \\\n   --name mineru_docker \\\n   -v /datapool:/datapool \\\n   -e MINERU_MODEL_SOURCE=local \\\n   -e MINERU_LMDEPLOY_DEVICE=maca \\\n   -it mineru:maca-vllm-latest \\\n   /bin/bash\n

Tip

Choose the vllm or lmdeploy image according to your needs. To use lmdeploy, replace mineru:maca-vllm-latest in the command above with mineru:maca-lmdeploy-latest.

After running this command, you are placed in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service start command; see the section on starting services via commands for details.

"},{"location":"zh/usage/acceleration_cards/METAX/#4","title":"4. \u6ce8\u610f\u4e8b\u9879","text":"

MinerU support for maca accelerator cards in each environment is shown in the table below:

Use case | Backend | Container: vllm | Container: lmdeploy
CLI tool (mineru) | pipeline | \ud83d\udfe2 | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-auto-engine | \ud83d\udfe2 | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-http-client | \ud83d\udfe2 | \ud83d\udfe2
fastapi service (mineru-api) | pipeline | \ud83d\udfe2 | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-auto-engine | \ud83d\udfe2 | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-http-client | \ud83d\udfe2 | \ud83d\udfe2
gradio UI (mineru-gradio) | pipeline | \ud83d\udfe2 | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-auto-engine | \ud83d\udfe2 | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-http-client | \ud83d\udfe2 | \ud83d\udfe2
openai-server service (mineru-openai-server) | - | \ud83d\udfe2 | \ud83d\udfe2
Data parallel (--data-parallel-size/--dp) | - | \ud83d\udd34 | \ud83d\udd34

Legend:
\ud83d\udfe2: Supported; runs fairly stably, with accuracy essentially the same as on NVIDIA GPUs
\ud83d\udfe1: Supported but less stable; errors may occur in some scenarios, or accuracy may differ somewhat
\ud83d\udd34: Not supported; cannot run, or accuracy differs significantly

Tip

  • MACA accelerator cards are selected in much the same way as NVIDIA GPUs; see the section on using specified GPU devices.
  • On the METAX platform, use the mx-smi command to check accelerator card usage, and specify idle card IDs as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/MooreThreads/","title":"MooreThreads","text":""},{"location":"zh/usage/acceleration_cards/MooreThreads/#1","title":"1. \u6d4b\u8bd5\u5e73\u53f0","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.4 LTS  \ncpu: Intel x86-64\ndcu: MTT S4000\ndriver: 3.0.0-rc-KuaE2.0\ndocker: 24.0.7\n
"},{"location":"zh/usage/acceleration_cards/MooreThreads/#2","title":"2. \u73af\u5883\u51c6\u5907","text":""},{"location":"zh/usage/acceleration_cards/MooreThreads/#21-dockerfile","title":"2.1 \u4f7f\u7528 Dockerfile \u6784\u5efa\u955c\u50cf","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/musa.Dockerfile\ndocker build --network=host -t mineru:musa-vllm-latest -f musa.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/MooreThreads/#3-docker","title":"3. \u542f\u52a8 Docker \u5bb9\u5668","text":"
docker run -u root --name mineru_docker \\\n    --network=host \\\n    --ipc=host \\\n    --shm-size=80g \\\n    --privileged \\\n    -e MTHREADS_VISIBLE_DEVICES=all \\\n    -e MINERU_VLLM_DEVICE=musa \\\n    -e MINERU_MODEL_SOURCE=local \\\n    -it mineru:musa-vllm-latest \\\n    /bin/bash\n

After running this command, you are placed in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service start command; see the section on starting services via commands for details.
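
As a sketch of the second option, the function below reuses the docker run flags from this section and swaps /bin/bash for a service command. The service command (mineru-api --host 0.0.0.0 --port 8000), the function name, and the DRY_RUN switch are assumptions made for illustration; check the service start documentation for the options your MinerU version accepts.

```shell
# Sketch: start a MinerU service instead of an interactive shell.
# Assumption: the service command and its host/port values are illustrative.
start_mineru_service() {
    # With DRY_RUN set, print the command instead of running it.
    ${DRY_RUN:+echo} docker run -u root --name mineru_docker \
        --network=host \
        --ipc=host \
        --shm-size=80g \
        --privileged \
        -e MTHREADS_VISIBLE_DEVICES=all \
        -e MINERU_VLLM_DEVICE=musa \
        -e MINERU_MODEL_SOURCE=local \
        -it mineru:musa-vllm-latest \
        mineru-api --host 0.0.0.0 --port 8000
}

# Review the command first; call the function without DRY_RUN to run it.
DRY_RUN=1 start_mineru_service
```

Printing the command before running it makes it easy to confirm that only the final argument differs from the interactive variant.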

"},{"location":"zh/usage/acceleration_cards/MooreThreads/#4","title":"4. \u6ce8\u610f\u4e8b\u9879","text":"

MinerU support for MooreThreads accelerator cards in each environment is shown in the table below:

Note

Compatibility note: MooreThreads support for the vLLM v1 engine is still incomplete, so MinerU currently uses the v0 engine as its adaptation. Because of this limitation, vLLM's Async Engine has compatibility problems, which may prevent some use cases from running correctly. We will keep tracking MooreThreads' progress on vLLM v1 engine support and adapt and optimize MinerU accordingly.

Use case | Backend | Container: vllm
CLI tool (mineru) | pipeline | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-auto-engine | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-http-client | \ud83d\udfe2
fastapi service (mineru-api) | pipeline | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-auto-engine | \ud83d\udd34
fastapi service (mineru-api) | <vlm/hybrid>-http-client | \ud83d\udfe2
gradio UI (mineru-gradio) | pipeline | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-auto-engine | \ud83d\udd34
gradio UI (mineru-gradio) | <vlm/hybrid>-http-client | \ud83d\udfe2
openai-server service (mineru-openai-server) | - | \ud83d\udfe2
Data parallel (--data-parallel-size) | - | \ud83d\udd34

Legend:
\ud83d\udfe2: Supported; runs fairly stably, with accuracy essentially the same as on NVIDIA GPUs
\ud83d\udfe1: Supported but less stable; errors may occur in some scenarios, or accuracy may differ somewhat
\ud83d\udd34: Not supported; cannot run, or accuracy differs significantly

Tip

  • MooreThreads accelerator cards are selected in much the same way as NVIDIA GPUs; see the GPU enumeration section.
  • On the MooreThreads platform, use the mthreads-gmi command to check accelerator card usage, and specify idle card IDs as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/THead/","title":"THead","text":""},{"location":"zh/usage/acceleration_cards/THead/#1","title":"1. \u6d4b\u8bd5\u5e73\u53f0","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04   \ncpu: INTEL x86_64\nppu: ZW810E  \ndriver: 1.4.0\ndocker: 26.1.4\n
"},{"location":"zh/usage/acceleration_cards/THead/#2","title":"2. \u73af\u5883\u51c6\u5907","text":"

Note

ppu accelerator cards support VLM inference acceleration with either vllm or lmdeploy. Install and use whichever one fits your needs:

"},{"location":"zh/usage/acceleration_cards/THead/#21-dockerfile-vllm","title":"2.1 \u4f7f\u7528 Dockerfile \u6784\u5efa\u955c\u50cf \uff08vllm\uff09","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/ppu.Dockerfile\ndocker build --network=host -t mineru:ppu-vllm-latest -f ppu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/THead/#22-dockerfile-lmdeploy","title":"2.2 \u4f7f\u7528 Dockerfile \u6784\u5efa\u955c\u50cf \uff08lmdeploy\uff09","text":"
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/ppu.Dockerfile\n# Switch the base image from vllm to lmdeploy\nsed -i '3s/^/# /' ppu.Dockerfile && sed -i '5s/^# //' ppu.Dockerfile\ndocker build --network=host -t mineru:ppu-lmdeploy-latest -f ppu.Dockerfile .\n
"},{"location":"zh/usage/acceleration_cards/THead/#3-docker","title":"3. \u542f\u52a8 Docker \u5bb9\u5668","text":"
docker run --privileged=true \\\n  --name mineru_docker \\\n  --device=/dev/alixpu \\\n  --device=/dev/alixpu_ctl \\\n  --ipc=host \\\n  --network=host \\\n  --ulimit memlock=-1 \\\n  --ulimit stack=67108864 \\\n  --shm-size=500g \\\n  -v /mnt:/mnt \\\n  -v /datapool:/datapool \\\n  -v /var/run/docker.sock:/var/run/docker.sock \\\n  -e MINERU_MODEL_SOURCE=local \\\n  -it mineru:ppu-vllm-latest \\\n  /bin/bash\n

Tip

Choose the vllm or lmdeploy image according to your needs. To use lmdeploy, replace mineru:ppu-vllm-latest in the command above with mineru:ppu-lmdeploy-latest.

After running this command, you are placed in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also start a MinerU service directly by replacing /bin/bash with a service start command; see the section on starting services via commands for details.

"},{"location":"zh/usage/acceleration_cards/THead/#4","title":"4. \u6ce8\u610f\u4e8b\u9879","text":"

MinerU support for ppu accelerator cards in each environment is shown in the table below:

Use case | Backend | Container: vllm | Container: lmdeploy
CLI tool (mineru) | pipeline | \ud83d\udfe2 | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-auto-engine | \ud83d\udfe2 | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-http-client | \ud83d\udfe2 | \ud83d\udfe2
fastapi service (mineru-api) | pipeline | \ud83d\udfe2 | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-auto-engine | \ud83d\udfe2 | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-http-client | \ud83d\udfe2 | \ud83d\udfe2
gradio UI (mineru-gradio) | pipeline | \ud83d\udfe2 | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-auto-engine | \ud83d\udfe2 | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-http-client | \ud83d\udfe2 | \ud83d\udfe2
openai-server service (mineru-openai-server) | - | \ud83d\udfe2 | \ud83d\udfe2
Data parallel (--data-parallel-size/--dp) | - | \ud83d\udd34 | \ud83d\udd34

Legend:
\ud83d\udfe2: Supported; runs fairly stably, with accuracy essentially the same as on NVIDIA GPUs
\ud83d\udfe1: Supported but less stable; errors may occur in some scenarios, or accuracy may differ somewhat
\ud83d\udd34: Not supported; cannot run, or accuracy differs significantly

Tip

  • PPU accelerator cards are selected in much the same way as NVIDIA GPUs; see the section on using specified GPU devices.
  • On the T-Head platform, use the ppu-smi command to check accelerator card usage, and specify idle card IDs as needed to avoid resource conflicts.
"},{"location":"zh/usage/acceleration_cards/Tecorigin/","title":"Tecorigin","text":""},{"location":"zh/usage/acceleration_cards/Tecorigin/#1","title":"1. \u6d4b\u8bd5\u5e73\u53f0","text":"

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  \ncpu: AMD EPYC (amd64)\ngpu: T100\ndriver: 3.0.0\ndocker: 28.0.4\n
"},{"location":"zh/usage/acceleration_cards/Tecorigin/#2","title":"2. \u73af\u5883\u51c6\u5907","text":""},{"location":"zh/usage/acceleration_cards/Tecorigin/#21-vllm","title":"2.1 \u4e0b\u8f7d\u5e76\u52a0\u8f7d\u955c\u50cf \uff08vllm\uff09","text":"
wget http://wb.tecorigin.com:8082/repository/teco-customer-repo/Course/MinerU/mineru-vllm.tar\n\ndocker load -i mineru-vllm.tar\n
"},{"location":"zh/usage/acceleration_cards/Tecorigin/#3-docker","title":"3. \u542f\u52a8 Docker \u5bb9\u5668","text":"
docker run -dit --name mineru_docker \\\n    --privileged \\\n    --cap-add SYS_PTRACE \\\n    --cap-add SYS_ADMIN \\\n    --network=host \\\n    --shm-size=500G \\\n    mineru:sdaa-vllm-latest \\\n    /bin/bash\n

Tip

To use the vllm environment:

  • After entering the container, switch to the conda environment with the following command:

conda activate vllm_env_py310\n
  • After switching, the prompt shows the (vllm_env_py310) prefix, which indicates that you are in the vllm virtual environment.

The command above starts the container in the background (the -d flag). To get an interactive terminal inside it, run docker exec -it mineru_docker /bin/bash; you can then run MinerU commands directly in the container. You can also start a MinerU service directly by replacing /bin/bash in the docker run command with a service start command; see the section on starting services via commands for details.

"},{"location":"zh/usage/acceleration_cards/Tecorigin/#4","title":"4. \u6ce8\u610f\u4e8b\u9879","text":"

MinerU support for Tecorigin accelerator cards in each environment is shown in the table below:

Use case | Backend | Container: vllm
CLI tool (mineru) | pipeline | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-auto-engine | \ud83d\udfe2
CLI tool (mineru) | <vlm/hybrid>-http-client | \ud83d\udfe2
fastapi service (mineru-api) | pipeline | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-auto-engine | \ud83d\udfe2
fastapi service (mineru-api) | <vlm/hybrid>-http-client | \ud83d\udfe2
gradio UI (mineru-gradio) | pipeline | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-auto-engine | \ud83d\udfe2
gradio UI (mineru-gradio) | <vlm/hybrid>-http-client | \ud83d\udfe2
openai-server service (mineru-openai-server) | - | \ud83d\udfe2
Data parallel (--data-parallel-size) | - | \ud83d\udd34

Legend:
\ud83d\udfe2: Supported; runs fairly stably, with accuracy essentially the same as on NVIDIA GPUs
\ud83d\udfe1: Supported but less stable; errors may occur in some scenarios, or accuracy may differ somewhat
\ud83d\udd34: Not supported; cannot run, or accuracy differs significantly

Tip

  • Tecorigin accelerator cards are selected in much the same way as NVIDIA GPUs; see the section on using specified GPU devices, and replace the environment variable CUDA_VISIBLE_DEVICES with SDAA_VISIBLE_DEVICES.
  • On the Tecorigin platform, use the teco-smi -c command to check accelerator card usage, and specify idle card IDs as needed to avoid resource conflicts.
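
A minimal sketch of that substitution, assuming cards 0 and 1 are idle. The input file demo.pdf, the function name, and the DRY_RUN switch are hypothetical and used only for illustration; the -p and -o options are the mineru input and output options used elsewhere in these docs.

```shell
# Assumption: cards 0 and 1 are idle (check with teco-smi -c first).
export SDAA_VISIBLE_DEVICES=0,1

parse_on_selected_cards() {
    # With DRY_RUN set, print the command instead of running it.
    # demo.pdf is a placeholder input file.
    ${DRY_RUN:+echo} mineru -p demo.pdf -o ./output
}

# Review the command first; call the function without DRY_RUN inside the container.
DRY_RUN=1 parse_on_selected_cards
```

Exporting the variable once applies it to every later MinerU command in the same shell session.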
"},{"location":"zh/usage/acceleration_cards/VastAI/","title":"VastAI","text":""},{"location":"zh/usage/acceleration_cards/VastAI/#1","title":"1. \u701a\u535a\u534a\u5bfc\u4f53","text":"
  • Official website: https://www.vastaitech.com
  • Model zoo: https://github.com/Vastai/VastModelZOO
"},{"location":"zh/usage/acceleration_cards/VastAI/#2","title":"2. \u6d4b\u8bd5\u5e73\u53f0","text":"
  • The platform used to test this guide is listed below for reference
    os: Ubuntu-22.04.3-LTS-x86_64\ncpu: Hygon C86-4G\ngpu: VA16 / VA1L / VA10L\ntorch: 2.8.0+cpu\ntorch-vacc: 1.3.3.777\nvllm: 0.11.1.dev0+gb8b302cde.d20251030.cpu\nvllm-vacc: 0.11.0.777\ndriver: 00.25.12.30 d3_3_v2_9_a3_1 a76bf37 20251230\ndocker: 28.1.1\n
"},{"location":"zh/usage/acceleration_cards/VastAI/#3","title":"3. \u73af\u5883\u51c6\u5907","text":"
  • Pull the vllm_vacc base image

    sudo docker pull harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2\n
  • Start the container

    sudo docker run -it \\\n    --privileged=true \\\n    --shm-size=256g \\\n    --name vllm_service \\\n    --ipc=host \\\n    --network=host \\\n    harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2 bash\n
  • Install MinerU

  • Follow the official installation guide: README_zh-CN.md#\u5b89\u88c5-mineru

    # Enter the container\n# sudo docker exec -it vllm_service bash\n\n# Optional PyPI mirrors\n# https://mirrors.163.com/pypi/simple/\n# https://mirrors.aliyun.com/pypi/simple/\n# https://pypi.mirrors.ustc.edu.cn/simple/\n# https://pypi.tuna.tsinghua.edu.cn/simple/\n# https://mirror.baidu.com/pypi/simple\n\n# Install MinerU from source\ngit clone https://github.com/opendatalab/MinerU.git\ncd MinerU\ngit checkout 8c4b3ef3a20b11ddac9903f25124d24ea82639b5\npip install -e .[core] -i https://mirrors.aliyun.com/pypi/simple\n\n# Or install MinerU with pip\npip install -U \"mineru[core]==2.7.0\" -i https://mirrors.aliyun.com/pypi/simple\n

Note

  • The vllm_vacc base image already includes torch, vllm, and related dependencies
  • As of 2025/12/31, VastAI supports MinerU up to the then-latest version 2.7.0 (master branch commit 8c4b3ef3)
  • Similar to CUDA_VISIBLE_DEVICES on NVIDIA hardware, VACC_VISIBLE_DEVICES selects the visible compute card IDs on VastAI hardware, for example -e VACC_VISIBLE_DEVICES=0,1,2,3
  • Set an appropriate shared memory size with --shm-size
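
Putting the last two notes together, the container start command from this section can carry the device selection directly. The sketch below assumes cards 0 to 3 are the ones you want visible; the function name and the DRY_RUN switch are hypothetical and exist only so the command can be reviewed before it runs.

```shell
# Sketch: container start command with explicit device selection.
start_vacc_container() {
    # With DRY_RUN set, print the command instead of running it.
    ${DRY_RUN:+echo} sudo docker run -it \
        --privileged=true \
        --shm-size=256g \
        --name vllm_service \
        --ipc=host \
        --network=host \
        -e VACC_VISIBLE_DEVICES=0,1,2,3 \
        harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2 bash
}

# Review the command first; call the function without DRY_RUN to run it.
DRY_RUN=1 start_vacc_container
```

Adjust the card IDs and the --shm-size value to match the host before running the command for real.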
"},{"location":"zh/usage/acceleration_cards/VastAI/#4-mineru","title":"4. MinerU\u529f\u80fd","text":"

Note

  • VastAI accelerator cards support VLM inference acceleration only through the vlm-auto-engine and vlm-http-client backends
  • Enter the container

    sudo docker exec -it vllm_service bash\n
  • Use MinerU

    • Prepare the models; see the official description: model_source.md

    • Option 1: vlm-auto-engine

      export MINERU_MODEL_SOURCE=modelscope\n\n# Step 1: start a MinerU parsing task with the `vlm-auto-engine` backend\nmineru -p image.png \\\n-o ./output \\\n-b vlm-auto-engine \\\n--http-timeout 1200 \\\n--tensor-parallel-size 2 \\\n--enforce_eager \\\n--trust-remote-code \\\n--max-model-len 16384\n
    • Option 2: vlm-http-client

      # Step 1: start the vLLM API server\nvllm serve /root/.cache/modelscope/hub/models/OpenDataLab/MinerU2.5-2509-1.2B \\\n--tensor-parallel-size 2 \\\n--trust-remote-code \\\n--enforce_eager \\\n--port 8090 \\\n--max-model-len 16384 \\\n--served-model-name MinerU2.5-2509-1.2B\n\n# Step 2: start a MinerU parsing task with the `vlm-http-client` backend\nmineru -p demo/pdfs/demo1.pdf \\\n-o ./output \\\n-b vlm-http-client \\\n-u http://127.0.0.1:8090 \\\n--http-timeout 1200\n

Note

  • Append the --enforce_eager option to every vllm-related command you run
"},{"location":"zh/usage/acceleration_cards/VastAI/#5","title":"5. \u6ce8\u610f\u4e8b\u9879","text":"

MinerU support on VastAI accelerator cards is shown in the table below:

Use case | Backend | Support
CLI tool (mineru) | pipeline | \ud83d\udd34
CLI tool (mineru) | hybrid-http-client | \ud83d\udd34
CLI tool (mineru) | hybrid-auto-engine | \ud83d\udd34
CLI tool (mineru) | vlm-auto-engine | \ud83d\udfe2
CLI tool (mineru) | vlm-http-client | \ud83d\udfe2
fastapi service (mineru-api) | pipeline | \ud83d\udd34
fastapi service (mineru-api) | hybrid-http-client | \ud83d\udd34
fastapi service (mineru-api) | hybrid-auto-engine | \ud83d\udd34
fastapi service (mineru-api) | vlm-auto-engine | \ud83d\udfe2
fastapi service (mineru-api) | vlm-http-client | \ud83d\udfe2
gradio UI (mineru-gradio) | pipeline | \ud83d\udd34
gradio UI (mineru-gradio) | hybrid-http-client | \ud83d\udd34
gradio UI (mineru-gradio) | hybrid-auto-engine | \ud83d\udd34
gradio UI (mineru-gradio) | vlm-auto-engine | \ud83d\udfe2
gradio UI (mineru-gradio) | vlm-http-client | \ud83d\udfe2
openai-server service (mineru-openai-server) | - | \ud83d\udfe2
Tensor parallel (--tensor-parallel-size) | - | \ud83d\udfe2
Data parallel (--data-parallel-size) | - | \ud83d\udd34

Note

  • \ud83d\udfe2: Supported; runs fairly stably, with accuracy essentially the same as on NVIDIA GPUs
  • \ud83d\udfe1: Supported but less stable; errors may occur in some scenarios, or accuracy may differ somewhat
  • \ud83d\udd34: Not supported; cannot run, or accuracy differs significantly
  • vlm-auto-engine: VastAI supports only the vLLM backend
"},{"location":"zh/usage/plugin/BISHENG/","title":"BISHENG \u7b80\u4ecb","text":""},{"location":"zh/usage/plugin/BISHENG/#bisheng","title":"BISHENG \u7b80\u4ecb","text":"

BISHENG (\u6bd5\u6607) is an open-source LLM application development platform focused on enterprise scenarios. It is already used by many industry-leading organizations and Fortune Global 500 companies. Bi Sheng (\u6bd5\u6607) was the inventor of movable-type printing, which greatly advanced the spread of human knowledge. The BISHENG team hopes the platform can likewise provide strong support for the broad adoption of intelligent applications.

  • Official website: https://bisheng.dataelem.com/
  • MinerU plugin work in the BISHENG project: https://github.com/dataelement/bisheng/pulls

Special thanks to @pzc163

"},{"location":"zh/usage/plugin/Cherry_Studio/","title":"Cherry Studio \u7b80\u4ecb","text":""},{"location":"zh/usage/plugin/Cherry_Studio/#cherry-studio","title":"Cherry Studio \u7b80\u4ecb","text":"

Cherry Studio is a multi-model AI client that runs on Windows, macOS, and Linux. It integrates mainstream AI cloud services such as OpenAI, DeepSeek, Gemini, and Anthropic, and also supports running local models, so users can switch freely between AI models.

MinerU's document parsing is now integrated into Cherry Studio's knowledge base and chat interactions, making document processing and information retrieval more convenient.

  • Cherry Studio official website: https://www.cherry-ai.com/
"},{"location":"zh/usage/plugin/Cherry_Studio/#mineru-cherry-studio","title":"MinerU \u5728 Cherry Studio \u4e2d\u7684\u4f7f\u7528\u65b9\u6cd5","text":""},{"location":"zh/usage/plugin/Cherry_Studio/#cherry-studio_1","title":"\u8fdb\u5165 Cherry Studio \u8bbe\u7f6e","text":"

a. Open the Cherry Studio application

b. Click the \"Settings\" button in the lower-left corner to open the settings page

c. In the left-hand menu, select \"MCP Servers\"

The MCP server configuration panel on the right lists the existing MCP servers. Click the \"Add Server\" button in the upper-right corner to create a new MCP service, or click an existing service to edit its configuration.

"},{"location":"zh/usage/plugin/Cherry_Studio/#mineru-mcp","title":"\u6dfb\u52a0 MinerU-MCP \u914d\u7f6e","text":"

After clicking \"Add Server\", a configuration form appears. Fill it in as follows:

a. Name: enter \"MinerU-MCP\" or any other name you prefer

b. Description: optional, for example \"Tool for converting documents to Markdown\"

c. Type: select \"Standard input/output (stdio)\"

d. Command: enter uvx

e. Arguments: enter mineru-mcp

f. Environment variables: add the following

MINERU_API_BASE=https://mineru.net\nMINERU_API_KEY=your-api-key\nOUTPUT_DIR=./downloads\nUSE_LOCAL_API=false\nLOCAL_MINERU_API_BASE=http://localhost:8888\n

The uvx command installs and runs mineru-mcp automatically, so you do not need to install the mineru-mcp package by hand beforehand. This is the simplest way to configure it.

"},{"location":"zh/usage/plugin/Cherry_Studio/#_1","title":"\u4fdd\u5b58\u914d\u7f6e","text":"

Once everything looks correct, click the \"Save\" button in the upper-right corner to finish. After saving, the MinerU-MCP service you just added appears in the MCP server list.

"},{"location":"zh/usage/plugin/Cherry_Studio/#cherry-studio-mineru-mcp","title":"\u4f7f\u7528 Cherry Studio \u4e2d\u7684 MinerU MCP","text":"

Once configuration is complete, you can use the MinerU MCP tools in Cherry Studio conversations. Prompts like the ones below ask the model to call the MinerU MCP tools; the model recognizes the task and calls the matching tool automatically.

"},{"location":"zh/usage/plugin/Cherry_Studio/#1-url","title":"\u793a\u4f8b 1: \u4f7f\u7528 URL \u8f6c\u6362\u6587\u6863","text":"

User input:

Please use MinerU MCP to convert the PDF document at the following URL to Markdown: https://example.com/sample.pdf\n

Steps the model performs:

The model recognizes this as a document conversion task and calls the parse_documents tool with these arguments:

{\"file_sources\": \"https://example.com/sample.pdf\"}\n

When the tool finishes, the model reports the conversion result.

"},{"location":"zh/usage/plugin/Cherry_Studio/#2","title":"\u793a\u4f8b 2: \u8f6c\u6362\u672c\u5730\u6587\u6863","text":"

User input:

Please use MinerU-MCP to convert the local file D://sample.pdf to Markdown\n

Steps the model performs:

The model recognizes this as a local document conversion task and calls the parse_documents tool with these arguments:

{\"file_sources\": \"D://sample.pdf\"}\n

"},{"location":"zh/usage/plugin/Cherry_Studio/#3-ocr","title":"\u793a\u4f8b 3: \u542f\u7528 OCR \u5904\u7406\u626b\u63cf\u6587\u6863","text":"

User input:

Please use MinerU-MCP to convert the scanned PDF document at the following URL to Markdown, with OCR enabled:\nhttps://example.com/scanned.pdf\n

Steps the model performs:

The model recognizes this as a document conversion task that needs OCR and calls the parse_documents tool with the OCR parameter enabled:

{\"file_sources\": \"https://example.com/scanned.pdf\",\"enable_ocr\": true}\n

"},{"location":"zh/usage/plugin/Cherry_Studio/#4","title":"\u793a\u4f8b 4: \u5b8c\u6574\u5bf9\u8bdd\u6d41\u7a0b","text":"

The following is an example of a complete conversation:

User:

I have a PDF of an academic paper at https://arxiv.org/pdf/2303.08774.pdf. Can you convert it to Markdown for me?\n

Model:

I can convert this academic paper to Markdown for you. I will use the MinerU-MCP tool to handle this task.\n[The model calls the parse_documents tool; processing...]\nThe paper has been converted to Markdown. The result is shown below:\n# The Capacity of Diffusion Models to Memorize and Generate Training Data\n## Abstract\nRecent diffusion models can generate high-quality images that are nearly indistinguishable from real ones...\n[Paper content shown...]\n

"},{"location":"zh/usage/plugin/Cherry_Studio/#_2","title":"\u5de5\u5177\u53c2\u6570\u8be6\u89e3","text":"

During use, the model selects suitable tools and parameters automatically based on your instructions. The parameters of the main tools are described below:

"},{"location":"zh/usage/plugin/Cherry_Studio/#parse_documents","title":"\u25cf parse_documents \u5de5\u5177\u53c2\u6570","text":""},{"location":"zh/usage/plugin/Cherry_Studio/#get_ocr_languages","title":"\u25cf get_ocr_languages \u5de5\u5177\u53c2\u6570","text":"

Takes no parameters; returns the list of languages supported by OCR.

"},{"location":"zh/usage/plugin/Cherry_Studio/#_3","title":"\u9ad8\u7ea7\u7528\u6cd5","text":""},{"location":"zh/usage/plugin/Cherry_Studio/#_4","title":"\u6307\u5b9a\u8bed\u8a00\u548c\u9875\u7801\u8303\u56f4","text":"

User input:

Please use MinerU MCP to convert the document at the following URL to Markdown, processing only pages 5-10 and setting the language to Chinese: https://example.com/document.pdf\n

The model uses the parse_documents tool and sets the language parameter to \"ch\" and the page_ranges parameter to \"5-10\".
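
For this request, the tool arguments would look roughly like the following. This is a sketch that follows the JSON shape of the earlier examples; the parameter names file_sources, language, and page_ranges come from this page, while the shell variable name is only illustrative.

```shell
# Arguments the model is expected to pass to parse_documents (illustrative).
ARGS='{"file_sources": "https://example.com/document.pdf", "language": "ch", "page_ranges": "5-10"}'

# Print the arguments so they can be compared with the tool call shown in the chat.
printf '%s\n' "$ARGS"
```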

"},{"location":"zh/usage/plugin/Cherry_Studio/#_5","title":"\u6279\u91cf\u5904\u7406\u591a\u4e2a\u6587\u6863","text":"

User input:

Please use MinerU-MCP to convert the documents at the following URLs to Markdown:\nhttps://example.com/doc1.pdf\nhttps://example.com/doc2.pdf\nhttps://example.com/doc3.pdf\n

The model calls the parse_documents tool and passes the URLs to the file_sources parameter as a comma-separated list.
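
The comma-separated value can be sketched as follows. The URLs are the placeholders from the prompt above, and the variable names are hypothetical; the snippet only shows how the single file_sources string is assembled.

```shell
# Placeholder URLs taken from the prompt above.
urls='https://example.com/doc1.pdf https://example.com/doc2.pdf https://example.com/doc3.pdf'

# Join the space-separated URLs with commas.
file_sources=$(printf '%s' "$urls" | tr ' ' ',')

# Print the resulting parse_documents arguments.
printf '{"file_sources": "%s"}\n' "$file_sources"
```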

"},{"location":"zh/usage/plugin/Cherry_Studio/#_6","title":"\u6ce8\u610f\u4e8b\u9879","text":"

\u25cf With USE_LOCAL_API=true, parsing uses the locally configured API

\u25cf With USE_LOCAL_API=false, parsing uses the official MinerU API

\u25cf Large documents can take a long time to process; please be patient

\u25cf If you run into timeouts, consider processing documents in batches or switching to local API mode

"},{"location":"zh/usage/plugin/Cherry_Studio/#_7","title":"Common problems and solutions","text":""},{"location":"zh/usage/plugin/Cherry_Studio/#mcp","title":"The MCP service fails to start","text":"

Problem: running uv run -m mineru.cli fails with an error.

Solution:

\u25cf Make sure the virtual environment is activated

\u25cf Check that all dependencies are installed

\u25cf Try running python -m mineru.cli instead

"},{"location":"zh/usage/plugin/Cherry_Studio/#_8","title":"File conversion fails","text":"

Problem: the file uploads successfully but conversion fails.

Solution:

\u25cf Check that the file format is supported

\u25cf Confirm that the API key is correct

\u25cf Check the MCP service logs for detailed error information

"},{"location":"zh/usage/plugin/Cherry_Studio/#_9","title":"File path problems","text":"

Problem: a file-not-found error is reported when the parse_documents tool processes a local file.

Solution: use an absolute path, or a relative path that is correct with respect to the directory the server runs in.

"},{"location":"zh/usage/plugin/Cherry_Studio/#mcp_1","title":"MCP service call timeouts","text":"

Problem: calling the parse_documents tool fails with Error calling tool 'parse_documents': MCP error -32001: Request timed out.

Solution: this usually happens with large documents or an unstable network. In some MCP clients (such as Cursor), a timeout can leave the MCP service impossible to call again until the client is restarted. Recent versions of Cursor may show that MCP is being called when the call has not actually succeeded. Recommendations:

\u25cf Wait for an official fix: this is a known issue in the Cursor client, so waiting for a fix from Cursor is recommended

\u25cf Process small files: where possible, process only a few small files and avoid large documents that cause timeouts

\u25cf Process in batches: split multiple files across several requests, with only one or two files per request

\u25cf Increase the timeout setting (if the client supports it)

\u25cf If the service cannot be called again after a timeout, restart the MCP client

\u25cf If timeouts keep recurring, check your network connection or consider using local API mode
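
The batching advice above (one or two files per request) can be sketched as follows. This is only an illustration: parse stands for whatever function actually issues the parse_documents call and is supplied by the caller, and both helper names are hypothetical.

```python
def batches(items, size=2):
    # Yield consecutive groups of at most size items.
    if size < 1:
        raise ValueError('size must be at least 1')
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(files, parse, size=2):
    # Call parse once per small group. Groups that time out are collected
    # so they can be retried later instead of aborting the whole run.
    results, failed = [], []
    for group in batches(files, size):
        try:
            results.append(parse(','.join(group)))
        except TimeoutError:
            failed.extend(group)
    return results, failed


def fake_parse(file_sources):
    # Stand-in for a real tool call; it times out on one specific file.
    if 'slow.pdf' in file_sources:
        raise TimeoutError('request timed out')
    return 'parsed ' + file_sources


done, retry = process_in_batches(['a.pdf', 'b.pdf', 'slow.pdf'], fake_parse)
print(done, retry)
```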

"},{"location":"zh/usage/plugin/Coze/","title":"Introduction to Coze","text":""},{"location":"zh/usage/plugin/Coze/#coze","title":"Introduction to Coze","text":"

Coze (Chinese name: \u6263\u5b50) is a zero-code AI application development platform from ByteDance. With or without programming experience, users can quickly create chatbots, agents, AI applications and plugins on the platform and deploy them to social platforms and instant messaging applications.

The MinerU plugin is now available in the Coze plugin store. It brings MinerU's document parsing capabilities to the agents and workflows that users build, speeding up the development of their AI applications.

  • Coze official website: https://www.coze.cn/
  • MinerU plugin for Coze: https://www.coze.cn/store/plugin/7527957359730360354
"},{"location":"zh/usage/plugin/Coze/#mineru-coze","title":"Using MinerU in Coze","text":""},{"location":"zh/usage/plugin/Coze/#coze_1","title":"Coze: integrating the plugin","text":"
  • Go to the Coze development platform at https://www.coze.cn/home
"},{"location":"zh/usage/plugin/Coze/#_1","title":"Agent","text":""},{"location":"zh/usage/plugin/Coze/#-","title":"Workspace -> Project development -> Create -> Create agent -> Create -> Enter a project name","text":""},{"location":"zh/usage/plugin/Coze/#-mineru","title":"Plugin configuration -> Add plugin -> Search for MinerU","text":""},{"location":"zh/usage/plugin/Coze/#parse_file","title":"Add the parse_file tool (online version)","text":""},{"location":"zh/usage/plugin/Coze/#mineru-api-key","title":"Select the MinerU plugin -> Edit parameters -> Enter the API key","text":"

Remember to turn off display of the url and token

"},{"location":"zh/usage/plugin/Coze/#_2","title":"Debug the agent","text":""},{"location":"zh/usage/plugin/Coze/#_3","title":"Workflow","text":"

Use MinerU through a workflow

"},{"location":"zh/usage/plugin/Coze/#-_1","title":"Workflow -> Create workflow","text":""},{"location":"zh/usage/plugin/Coze/#-mineru-","title":"Workflow plugin configuration -> Add plugin -> Search for MinerU -> Add","text":""},{"location":"zh/usage/plugin/Coze/#mineru-api-key_1","title":"Select the MinerU plugin -> Edit parameters -> Enter the API key","text":""},{"location":"zh/usage/plugin/Coze/#-input-mineru","title":"Select the start node -> Set the input type to file -> Connect it to the mineru node","text":""},{"location":"zh/usage/plugin/Coze/#-mineru-output-mineru-parse_filetext","title":"Select the end node -> Connect it to the mineru node -> Set output to parse_file.text of the mineru node","text":""},{"location":"zh/usage/plugin/Coze/#-_2","title":"Upload a file -> Test run","text":""},{"location":"zh/usage/plugin/Coze/#-_3","title":"Publish -> Add to the current agent","text":""},{"location":"zh/usage/plugin/Coze/#mineru-","title":"Remove the mineru plugin -> Debug","text":""},{"location":"zh/usage/plugin/DataFlow/","title":"Introduction to the \u5143\u67a2\u667a\u6c47 ADP Intelligent Data Platform","text":""},{"location":"zh/usage/plugin/DataFlow/#adp","title":"Introduction to the \u5143\u67a2\u667a\u6c47 ADP Intelligent Data Platform","text":"

The \u5143\u67a2\u667a\u6c47 ADP Intelligent Data Platform is built on a self-developed AI database and the DataFlow data preparation framework. It aims to help enterprises manage, search and process massive amounts of data efficiently. Through systematic, automated data governance it lowers the expertise needed to train models and agents, and helps enterprises get value from their private data in real business scenarios so that AI applications actually reach production.

MinerU is now deeply integrated into the DataFlow module of the \u5143\u67a2\u667a\u6c47 ADP Intelligent Data Platform, whose data parsing service is powered by MinerU, a document corpus extraction engine.

  • Official website: https://adp.originhub.tech/agent
  • MinerU FastGPT plugin download: https://cloud.fastgpt.io/dashboard/systemPlugin?type=productivity
"},{"location":"zh/usage/plugin/Dify/","title":"Introduction to Dify","text":""},{"location":"zh/usage/plugin/Dify/#dify","title":"Introduction to Dify","text":"

Dify is an open-source large language model (LLM) application development platform designed to simplify and speed up the creation and deployment of generative AI applications. It combines the ideas of Backend-as-a-Service (BaaS) and LLMOps, giving developers a user-friendly interface and capable tools that lower the barrier to AI application development.

The MinerU plugin, developed jointly by MinerU and Dify, is now available in the Dify Marketplace. It provides document parsing for the workflows that users build.

  • Dify official website: https://dify.ai/zh
  • MinerU plugin for Dify: https://marketplace.dify.ai/plugins/langgenius/mineru
"},{"location":"zh/usage/plugin/Dify/#mineru-dify","title":"Using MinerU in Dify","text":""},{"location":"zh/usage/plugin/Dify/#mineru-dify-v040","title":"1. Highlights of the new MinerU Dify plugin (v0.4.0)","text":"
  • Full MinerU2 support: compatible with the latest MinerU2 features, giving access to its document parsing capabilities.
  • Flexible deployment: supports both the official online API and locally deployed APIs (and is backward compatible with 1.x versions).
  • Workflow-ready: gives Dify agents the ability to \u201cread and write\u201d documents so they can handle complex tasks.
"},{"location":"zh/usage/plugin/Dify/#_1","title":"2. Hands-on: two examples to get you started","text":"

The two typical scenarios below show what the new plugin can do in practice.

"},{"location":"zh/usage/plugin/Dify/#_2","title":"Preparation","text":"
  1. Install the MinerU plugin from the Dify plugin page (the same applies to a privately deployed Dify)

  2. Fill in the API URL and related settings

A token is required when using the official API. It can be left blank when using a locally deployed API.

"},{"location":"zh/usage/plugin/Dify/#chat-pdf","title":"Example 1: parse a single file and build a Chat PDF application","text":"

Want to chat with your documents using AI? The following steps show how.

"},{"location":"zh/usage/plugin/Dify/#chatflow","title":"Step 1: Create a blank application and choose \u201cChatflow\u201d","text":"

Enter the application name and description

"},{"location":"zh/usage/plugin/Dify/#_3","title":"Step 2: In the initial template, select the \u201cStart\u201d node","text":"

Set the field type to single file, enter a variable name (input_file here), and set the supported file types to documents and images

"},{"location":"zh/usage/plugin/Dify/#mineru","title":"Step 3: Add a tool node, the MinerU plugin, to parse the file uploaded in the Start node","text":""},{"location":"zh/usage/plugin/Dify/#mineru-input_file","title":"Step 4: Set the MinerU input variable to the input_file added in the Start node","text":""},{"location":"zh/usage/plugin/Dify/#llm","title":"Step 5: Configure the LLM","text":"

After selecting the \u201cLLM\u201d node, if no model is available you need to install one separately from the plugin marketplace (Deepseek is used here as an example)

For \u201cContext\u201d, select the MinerU output variable text (the document in Markdown format after MinerU has parsed it)

In the \u201cSYSTEM\u201d area, write a prompt that suits your needs, for example \u201cExtract the answer to the user's question from Parse File text\u201d

"},{"location":"zh/usage/plugin/Dify/#_4","title":"Step 6: Preview, upload a file and ask the bot about the document","text":"

A simple document Q&A application, Chat PDF, is now complete. Click \u201cPreview\u201d to see how it works.

The result looks like this:

"},{"location":"zh/usage/plugin/Dify/#_5","title":"Step 7: Publish and test","text":"

Save and publish your application. Now upload a PDF or an image and you can chat with it freely.

"},{"location":"zh/usage/plugin/Dify/#s3","title":"Example 2: batch-process documents automatically and upload them to S3","text":"

Need to process and archive a large number of documents? The MinerU plugin can handle that as well.

"},{"location":"zh/usage/plugin/Dify/#botos3","title":"Step 1: Install the botos3 plugin","text":""},{"location":"zh/usage/plugin/Dify/#s3-bucket","title":"Step 2: Configure the S3 bucket","text":""},{"location":"zh/usage/plugin/Dify/#_6","title":"Step 3: Create a workflow","text":"

Set the field type to \u201cfile list\u201d, enter a variable name (input_files here), and set the supported file types to documents and images

"},{"location":"zh/usage/plugin/Dify/#_7","title":"Step 4: Add an \u201cIteration\u201d node","text":"

After the \u201cStart\u201d node, add an \u201cIteration\u201d node and configure the MinerU node inside it. Set the iteration input to upload_files from the Start node. Leave the iteration output empty for now. Once the whole iteration is configured, set it to full_zip_url from the MinerU Parse File node.

Set the MinerU input parameter file to the iterator's item

"},{"location":"zh/usage/plugin/Dify/#mineru_1","title":"Step 5: Add an intermediate \u201cCode\u201d node to convert the MinerU parsing result","text":"

Input variables (the names must match those defined in the code)

  • text: select the output variable text of MinerU Parse File
  • uploadFiles: select the file list upload_files of the \u201cStart\u201d node. It is used to look up the original file name by the iteration index
  • index: the iteration index. Select the iterator's index

Output variables (the names must match those defined in the code)

  • fileName: String
  • base64: String

Choose JavaScript as the language and write the conversion code:

This content cannot currently be displayed outside Feishu Docs

The Python version is as follows:

This content cannot currently be displayed outside Feishu Docs
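
The original code for this step was not carried over to this page. The following is not that code. It is a minimal Python sketch of what a conversion node with the inputs and outputs listed above could look like. It assumes that the Code node calls a function named main with the input variables as arguments, and that each entry of uploadFiles is a dict with a filename key. Both assumptions should be checked against your Dify version.

```python
import base64


def main(text: str, uploadFiles: list, index: int) -> dict:
    # Assumption: each uploaded file is a dict that carries a 'filename' key.
    original = uploadFiles[index].get('filename', 'document_' + str(index))
    # Drop the last extension, falling back to the full name for dotfiles.
    stem = original.rsplit('.', 1)[0] or original
    # Encode the Markdown text so it can be handed to the upload node.
    encoded = base64.b64encode(text.encode('utf-8')).decode('ascii')
    return {'fileName': stem + '.md', 'base64': encoded}


result = main('# Title', [{'filename': 'report.pdf'}], 0)
print(result['fileName'])
```

Returning the name and the payload together keeps the node's two outputs (fileName and base64) in step with each other, which is what the upload step that follows relies on.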

"},{"location":"zh/usage/plugin/Dify/#botos3_1","title":"Step 6: Configure the Botos3 plugin to upload the content","text":"

Add a Botos3 tool node and choose \u201cUpload base64 via s3\u201d

For the file base64, select the base64 field output by the Code node (named \u201cConvert MINERU MD text\u201d in the screenshot)

For the S3 object key, enter the path where the file is stored. The bucket name was already entered on the botos3 plugin configuration page, so only the directory under the bucket is needed here. Select fileName from the Code node (named \u201cConvert MINERU MD text\u201d in the screenshot)

"},{"location":"zh/usage/plugin/Dify/#_8","title":"Step 7: Preview the result","text":"

Connect the End node. A simple workflow that uploads to S3 is now configured. Click \u201cRun\u201d to see the result:

"},{"location":"zh/usage/plugin/Dify/#vis3","title":"Step 8: View the documents with Vis3","text":"

After the run finishes, you can use Vis3 to check whether the parsed md files were uploaded to the S3 bucket. For Vis3 usage, see:

New open-source tool: Vis3, a visualization tool for large-model data. Enter your AK/SK to preview S3 data directly, open JSON, video and images instantly, and use it with local files too

"},{"location":"zh/usage/plugin/DingTalk/","title":"Introduction to DingTalk","text":""},{"location":"zh/usage/plugin/DingTalk/#_1","title":"Introduction to DingTalk","text":"

DingTalk (\u9489\u9489) is an enterprise-grade intelligent mobile office platform built by Alibaba Group, serving as a collaboration and application development platform for organizations in the digital economy. It brings together instant messaging, DingTalk Docs, \u9489\u95ea\u4f1a (flash meetings), \u9489\u76d8 (cloud drive), Teambition, OA approval, smart HR, \u9489\u5de5\u724c (work badges), the workbench and other features, aiming at a simple, efficient, secure and intelligent digital way of working. It supports the digitalization of both organizations and business operations, covering end-to-end management of people, finance, assets, affairs, production, supply, sales and inventory.

With the SaaS software on the DingTalk Open Platform, enterprises can build digital applications at low cost and integrate all of their digital systems. DingTalk also offers more than 2000 API endpoints, providing an open and compatible environment for digital transformation. Users who cannot code can use low-code tools to build CRM, ERP, OA, project management, inventory and similar systems.

Products such as DingTalk Docs and AI Tables have already deeply integrated MinerU's capabilities and expose document parsing to ecosystem developers through the Open Platform, laying a solid technical and scenario foundation for the joint development of DLU.

  • DingTalk official website: https://www.dingtalk.com/
"},{"location":"zh/usage/plugin/FastGPT/","title":"Introduction to FastGPT","text":""},{"location":"zh/usage/plugin/FastGPT/#fastgpt","title":"Introduction to FastGPT","text":"

FastGPT is a knowledge base Q&A system built on large language models (LLMs). It combines intelligent conversation with visual orchestration to make AI application development simple and natural. Developers and business users alike can build their own AI applications with it.

The MinerU plugin is now available in the Coze plugin store. It brings MinerU's document parsing capabilities to the agents and workflows that users build, speeding up the development of their AI applications.

  • Official website: https://fastgpt.cn
  • MinerU FastGPT plugin download: https://cloud.fastgpt.io/dashboard/systemPlugin?type=productivity
"},{"location":"zh/usage/plugin/ModelWhale/","title":"Introduction to ModelWhale","text":""},{"location":"zh/usage/plugin/ModelWhale/#modelwhale","title":"Introduction to ModelWhale","text":"

ModelWhale is a cloud collaboration tool for data science. It gives data practitioners a ready-to-use cloud analysis environment with two interfaces, interactive Jupyter Notebook and drag-and-drop Canvas. It helps researchers and educators deal with heavy low-level engineering work, data that is hard to use securely, and results that are hard to share and reproduce. ModelWhale comes in three editions for different scenarios: Basic, Professional and Team.

The MinerU plugin is now available in ModelWhale. It brings MinerU's document parsing capabilities to the agents and workflows that users build, speeding up the development of their AI applications.

  • ModelWhale official website: https://www.modelwhale.com/pricing?scroll=1
  • MinerU on ModelWhale: https://www.heywhale.com/org/7b38d/workspace/iframe?url=https://www.heywhale.com/api/model/services/68089d360b1519a862ccb9b4/app/
"},{"location":"zh/usage/plugin/RagFlow/","title":"RagFlow","text":""},{"location":"zh/usage/plugin/RagFlow/#ragflow","title":"RAGFlow","text":"

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine and application platform. It combines deep document understanding, automated RAG workflows and large-model calls, covering the full pipeline from complex data processing through knowledge retrieval to augmented generation. It aims to give enterprises and developers a one-stop service for building intelligent Q&A, and supports building and deploying large-model applications in complex scenarios.

MinerU is now deeply integrated into the online version of the RAGFlow knowledge base as a built-in PDF parser, providing reliable document parsing for users building knowledge bases. For locally deployed versions, see the tutorial below.

Try it at: https://demo.ragflow.io/

"},{"location":"zh/usage/plugin/RagFlow/#ragflow-mineru","title":"Tutorial: how to use MinerU in RAGFlow","text":""},{"location":"zh/usage/plugin/RagFlow/#_1","title":"1. Installation and configuration","text":"

We recommend deploying RAGFlow locally with docker so that MinerU can be used as its parsing tool. After installing RAGFlow, do the following:

  1. Check the version:

Make sure your RAGFlow version is >= v0.21.1.

  2. Update the .env file:

To make sure the change is applied cleanly, first stop the services by running docker compose down in a terminal (cmd).

Open the .env file, add the following two lines at the end of the file, and save it.

HF_ENDPOINT=https://hf-mirror.com\nMINERU_EXECUTABLE=/ragflow/uv_tools/.venv/bin/mineru\n
  3. Start the services and enter the container:

In the terminal, start the services again: docker compose up -d

Once all services are Running or Healthy, run the following command to enter the core RAGFlow container:

docker compose exec ragflow-cpu bash\n

(Your command prompt will change from C:\\...> to root@...)

  4. Install MinerU inside the container:

    Inside the container, run the following 5 commands in order

mkdir uv_tools\ncd uv_tools\nuv venv .venv\nsource .venv/bin/activate\nuv pip install -U \"mineru[core]\" -i https://mirrors.aliyun.com/pypi/simple\n
  5. Exit and restart:

When the installation finishes, type exit and press Enter.

Run the restart command so that RAGFlow loads the newly installed MinerU

docker compose restart ragflow-cpu\n
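
The .env update in the steps above can also be scripted. The following is a minimal sketch, assuming the file sits at a path you pass in. The helper ensure_env_entries is hypothetical, and only the two variable names and values come from this page. It adds each entry only if its key is not already set, so running it twice is harmless.

```python
from pathlib import Path

# The two entries named in the steps above.
REQUIRED = {
    'HF_ENDPOINT': 'https://hf-mirror.com',
    'MINERU_EXECUTABLE': '/ragflow/uv_tools/.venv/bin/mineru',
}


def ensure_env_entries(env_path, required=REQUIRED):
    # Append any missing KEY=VALUE lines and return the keys that were added.
    path = Path(env_path)
    lines = path.read_text(encoding='utf-8').splitlines() if path.exists() else []
    present = set()
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith('#') and '=' in stripped:
            present.add(stripped.split('=', 1)[0].strip())
    added = [key for key in required if key not in present]
    lines.extend(key + '=' + required[key] for key in added)
    # Rewrite the file line by line so the last line always ends cleanly.
    with path.open('w', encoding='utf-8') as handle:
        for line in lines:
            print(line, file=handle)
    return added
```

A typical call would be ensure_env_entries('.env') from the directory that holds the RAGFlow docker compose files, made while the services are stopped.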
"},{"location":"zh/usage/plugin/RagFlow/#_2","title":"2. Where to enable MinerU","text":"

After the local deployment is complete, enable MinerU by opening the configuration page of the relevant RAGFlow knowledge base and selecting MinerU as the default PDF parser. (Note: the online version of RAGFlow already has the MinerU plugin built in for advanced PDF parsing, and it is used the same way.)

Entry point and configuration steps:

  1. Open the knowledge base configuration:
    • In the knowledge base management view, select the knowledge base you want to configure (for example, the \"content\" knowledge base in the screenshot).
    • In the left navigation bar of the knowledge base detail page, click the Configuration tab.
  2. Locate the PDF parser setting:
    • Scroll down to the \u201cIngestion pipeline\u201d section.
    • In this section you will see an option named PDF Parser.
  3. Select MinerU:
    • Click the drop-down menu next to PDF Parser.
    • Choose MinerU from the available options.
  4. Save the changes:
    • After making the selection, be sure to click the Save button at the bottom of the page so that the change takes effect.

"},{"location":"zh/usage/plugin/Sider/","title":"Introduction to Sider","text":""},{"location":"zh/usage/plugin/Sider/#sider","title":"Introduction to Sider","text":"

Sider is a browser sidebar AI assistant extension. It opens an always-available panel on the right side of the page and brings conversational AI (such as GPT, Claude and Gemini) to whatever page you are browsing. Its core purpose is to make reading, writing, translation, search and summarization more efficient, working closely with the content of the page.

Sider has deeply integrated MinerU's capabilities into its Wisebase module. Wisebase is an AI-driven knowledge base: you can upload PDFs and other file types to build a personal library for efficient knowledge management, and MinerU helps parse those files and extract their information accurately.

  • Sider official website: https://sider.ai/zh-CN/chat
  • Sider with MinerU integration: https://sider.ai/zh-CN/wisebase
"},{"location":"zh/usage/plugin/n8n/","title":"Introduction to n8n","text":""},{"location":"zh/usage/plugin/n8n/#n8n","title":"Introduction to n8n","text":"

n8n is an application development platform centered on low-code workflow automation. Many companies use its flexible node configuration to automate business processes. Through a visual interface and code extensibility, it helps users connect applications and services and build complex automated workflows, lowering the barrier to entry.

MinerU now packages its document parsing capabilities as an n8n node, so users can handle complex document parsing tasks more conveniently when building workflows.

  • n8n official website: https://n8n.io/
  • MinerU n8n plugin download: https://www.npmjs.com/package/n8n-nodes-mineru
"},{"location":"zh/usage/plugin/n8n/#mineru-n8n","title":"MinerU \u5728 n8n \u4e2d\u7684\u4f7f\u7528\u65b9\u6cd5","text":""},{"location":"zh/usage/plugin/n8n/#step1-node","title":"step1 \u8fdb\u5165\u793e\u533anode\u5b89\u88c5\u754c\u9762","text":""},{"location":"zh/usage/plugin/n8n/#step2-n8n-nodes-mineru","title":"step2 \u5b89\u88c5 n8n-nodes-mineru \u8282\u70b9","text":"

[Image: assets/images/n8n_2.png]

"},{"location":"zh/usage/plugin/n8n/#step3-n8n-nodes-mineru-api-key","title":"step3 \u65b0\u5efa\u5de5\u4f5c\u6d41\uff0c\u6dfb\u52a0 n8n-nodes-mineru \u8282\u70b9\uff0c\u5e76\u8bbe\u7f6e api key","text":""},{"location":"zh/usage/plugin/n8n/#n8n_1","title":"n8n\u4f7f\u7528\u8282\u70b9\u6587\u6863","text":"

https://www.npmjs.com/package/n8n-nodes-mineru

"},{"location":"zh/usage/plugin/n8n/#_1","title":"\u5728\u5de5\u4f5c\u6d41\u5185\u96c6\u6210\u89e3\u538b\u529f\u80fd","text":""},{"location":"zh/usage/plugin/n8n/#json","title":"\u5bfc\u5165 json \u6a21\u677f","text":"


"},{"location":"zh/usage/plugin/n8n/#url","title":"\u914d\u7f6e \u51ed\u8bc1\u548c\u6587\u6863url","text":""},{"location":"zh/usage/plugin/n8n/#_2","title":"\u6839\u636e\u5404\u81ea\u7684\u9700\u6c42\u914d\u7f6e\u6240\u9700\u7684\u8f93\u51fa","text":""},{"location":"zh/usage/plugin/n8n/#_3","title":"\u8c03\u8bd5","text":""}]} \ No newline at end of file diff --git a/sitemap.xml b/sitemap.xml new file mode 100644 index 00000000..86de3f1c --- /dev/null +++ b/sitemap.xml @@ -0,0 +1,343 @@ + + + + https://opendatalab.github.io/MinerU/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/demo/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/faq/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/quick_start/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/quick_start/docker_deployment/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/quick_start/extension_modules/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/reference/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/reference/changelog/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/reference/output_files/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/usage/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/usage/advanced_cli_parameters/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/usage/cli_tools/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/usage/model_source/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/usage/quick_usage/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/demo/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/faq/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/quick_start/ + 2026-03-26 + daily + + + + + 
https://opendatalab.github.io/MinerU/zh/quick_start/docker_deployment/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/quick_start/extension_modules/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/reference/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/reference/changelog/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/reference/output_files/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/usage/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/usage/advanced_cli_parameters/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/usage/cli_tools/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/usage/model_source/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/usage/quick_usage/ + 2026-03-26 + daily + + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/AMD/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Ascend/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Biren/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Cambricon/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Enflame/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Hygon/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/IluvatarCorex/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Kunlunxin/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/METAX/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/MooreThreads/ + 2026-03-26 + daily + + + + 
https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/THead/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/Tecorigin/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/acceleration_cards/VastAI/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/BISHENG/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/Cherry_Studio/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/Coze/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/DataFlow/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/Dify/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/DingTalk/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/FastGPT/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/ModelWhale/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/RagFlow/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/Sider/ + 2026-03-26 + daily + + + + https://opendatalab.github.io/MinerU/zh/usage/plugin/n8n/ + 2026-03-26 + daily + + + \ No newline at end of file diff --git a/sitemap.xml.gz b/sitemap.xml.gz new file mode 100644 index 00000000..ce5bd17f Binary files /dev/null and b/sitemap.xml.gz differ diff --git a/usage/advanced_cli_parameters/index.html b/usage/advanced_cli_parameters/index.html new file mode 100644 index 00000000..8113b49a --- /dev/null +++ b/usage/advanced_cli_parameters/index.html @@ -0,0 +1,1967 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Advanced CLI Parameters - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Advanced Command Line Parameters

+

Pass-through of inference engine parameters

+

vllm Acceleration Parameter Optimization

+
+

Tip

+

If vllm already accelerates VLM model inference for you and you want to improve inference speed further, try the following parameters:

+
    +
  • If you have multiple graphics cards, you can use vllm's multi-card parallel mode to increase throughput: --data-parallel-size 2
  • +
+
+

Parameter Passing Instructions

+
+

Tip

+
    +
  • All officially supported vllm/lmdeploy parameters can be passed to MinerU as command line arguments. This works with the following commands: mineru, mineru-openai-server, mineru-gradio, mineru-api
  • +
  • If you want to learn more about vllm parameter usage, please refer to the vllm official documentation
  • +
  • If you want to learn more about lmdeploy parameter usage, please refer to the lmdeploy official documentation
  • +
+
+

GPU Device Selection and Configuration

+

CUDA_VISIBLE_DEVICES Basic Usage

+
+

Tip

+
    +
  • You can always restrict the visible GPU devices by prefixing the command with the CUDA_VISIBLE_DEVICES environment variable. For example: +
    CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
    +
  • +
  • This method works for all command line tools, including mineru, mineru-openai-server, mineru-gradio, and mineru-api, and applies to both pipeline and vlm backends.
  • +
+
+

Common Device Configuration Examples

+
+

Tip

+

Here are some common CUDA_VISIBLE_DEVICES setting examples: +

CUDA_VISIBLE_DEVICES=1  # Only device 1 will be visible
+CUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible
+CUDA_VISIBLE_DEVICES="0,1"  # Same as above, quotation marks are optional
+CUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked
+CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
+
+
+
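One detail helps when reading the values above: inside the process, CUDA renumbers the visible devices starting from 0, so with CUDA_VISIBLE_DEVICES=0,2,3 physical device 2 appears as device 1. A minimal Python sketch of that mapping, assuming a plain comma-separated list of integer indices (the helper name `logical_to_physical` is hypothetical and not part of MinerU; UUID-style entries are not handled):

```python
def logical_to_physical(cuda_visible_devices: str) -> dict[int, int]:
    """Map the in-process device index to the physical GPU index.

    Hypothetical helper for illustration only. It assumes a comma-separated
    list of integer indices and does not handle GPU UUIDs.
    """
    # Quotation marks are optional in the shell, so tolerate them here too.
    value = cuda_visible_devices.strip().strip('"')
    if not value:
        return {}  # empty string: no GPU is visible
    physical = [int(part) for part in value.split(",")]
    # Visible devices are renumbered 0, 1, 2, ... in the order they are listed.
    return dict(enumerate(physical))
```

For example, `logical_to_physical("0,2,3")` maps in-process device 1 to physical GPU 2, which is why device 1 is the one that gets masked.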

Practical Application Scenarios

+
+

Tip

+

Here are some possible usage scenarios:

+
    +
  • +

    To start openai-server with multi-card parallelism on cards 0 and 1 of a multi-GPU machine, use the following command: +

    CUDA_VISIBLE_DEVICES=0,1 mineru-openai-server --engine vllm --port 30000 --data-parallel-size 2
    +
    +
  • +
  • +

    To run two FastAPI services on cards 0 and 1 of a multi-GPU machine, each listening on its own port, use the following commands: +

    # In terminal 1
    +CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
    +# In terminal 2
    +CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
    +
    +
  • +
+
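The one-service-per-card scenario above can also be scripted instead of opening one terminal per GPU. The following is a sketch under stated assumptions, not an official launcher: the function name `per_gpu_api_plan` and the port-numbering scheme (base port plus offset) are illustrative choices, while the `mineru-api --host --port` flags are the ones shown in the commands above.

```python
import os


def per_gpu_api_plan(gpu_ids, host="127.0.0.1", base_port=8000):
    """Return one (env, argv) pair per GPU, each service on its own port.

    Hypothetical helper: it only prepares the environment and the argument
    list and does not start any process.
    """
    plan = []
    for offset, gpu in enumerate(gpu_ids):
        # Copy the current environment and pin this service to a single GPU.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        argv = ["mineru-api", "--host", host, "--port", str(base_port + offset)]
        plan.append((env, argv))
    return plan


if __name__ == "__main__":
    for env, argv in per_gpu_api_plan([0, 1]):
        print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']}", " ".join(argv))
```

Each pair can then be passed to `subprocess.Popen(argv, env=env)` to start the services in the background.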
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/usage/cli_tools/index.html b/usage/cli_tools/index.html new file mode 100644 index 00000000..1d626d27 --- /dev/null +++ b/usage/cli_tools/index.html @@ -0,0 +1,1978 @@ + + + + + + + + + + + + + + + + + + + + + + + + + CLI Tools - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Command Line Tools Usage Instructions

+

View Help Information

+

To view help information for the MinerU command line tools, use the --help parameter. The help output of each tool is shown below: +

mineru --help
+Usage: mineru [OPTIONS]
+
+Options:
+  -v, --version                   Show version and exit
+  -p, --path PATH                 Input file path or directory (required)
+  -o, --output PATH               Output directory (required)
+  --api-url TEXT                  MinerU FastAPI base URL; if omitted, `mineru` starts a temporary local `mineru-api`
+  -m, --method [auto|txt|ocr]     Parsing method: auto (default), txt, ocr (pipeline and hybrid* backend only)
+  -b, --backend [pipeline|hybrid-auto-engine|hybrid-http-client|vlm-auto-engine|vlm-http-client]
+                                  Parsing backend (default: hybrid-auto-engine)
+  -l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|th|el|latin|arabic|east_slavic|cyrillic|devanagari]
+                                  Specify document language (improves OCR accuracy, pipeline and hybrid* backend only)
+  -u, --url TEXT                  OpenAI-compatible backend URL passed through to the server when using http-client
+  -s, --start INTEGER             Starting page number for parsing (0-based)
+  -e, --end INTEGER               Ending page number for parsing (0-based)
+  -f, --formula BOOLEAN           Enable formula parsing (default: enabled)
+  -t, --table BOOLEAN             Enable table parsing (default: enabled)
+  --help                          Show help information
+
+
mineru-api --help
+Usage: mineru-api [OPTIONS]
+
+Options:
+  --host TEXT     Server host (default: 127.0.0.1)
+  --port INTEGER  Server port (default: 8000)
+  --reload        Enable auto-reload (development mode)
+  --help          Show this message and exit.
+
+
mineru-gradio --help
+Usage: mineru-gradio [OPTIONS]
+
+Options:
+  --enable-example BOOLEAN        Enable example files for input. The example
+                                  files to be input need to be placed in the
+                                  `example` folder within the directory where
+                                  the command is currently executed.
+  --enable-http-client BOOLEAN    Enable http-client backend to link openai-
+                                  compatible servers.
+  --enable-api BOOLEAN            Enable gradio API for serving the
+                                  application.
+  --max-convert-pages INTEGER     Set the maximum number of pages to convert
+                                  from PDF to Markdown.
+  --server-name TEXT              Set the server name for the Gradio app.
+  --server-port INTEGER           Set the server port for the Gradio app.
+  --latex-delimiters-type [a|b|all]
+                                  Set the type of LaTeX delimiters to use in
+                                  Markdown rendering: 'a' for type '$', 'b' for
+                                  type '()[]', 'all' for both types.
+  --help                          Show this message and exit.
+
+
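Because `-s` and `-e` take 0-based page numbers, it is easy to be off by one when working from the page numbers shown in a PDF viewer. A small sketch of building a `mineru` argument list from 1-based page numbers follows. The helper name `build_mineru_argv` is hypothetical, and only flags listed in the help output above are used.

```python
def build_mineru_argv(input_path, output_path, backend=None,
                      first_page=None, last_page=None):
    """Build a `mineru` argument list from 1-based page numbers.

    Hypothetical helper for illustration: it converts human-friendly page
    numbers to the 0-based values that -s and -e expect.
    """
    for page in (first_page, last_page):
        if page is not None and page < 1:
            raise ValueError("page numbers are 1-based and must be >= 1")
    argv = ["mineru", "-p", str(input_path), "-o", str(output_path)]
    if backend is not None:
        argv += ["-b", backend]
    if first_page is not None:
        argv += ["-s", str(first_page - 1)]  # -s is 0-based
    if last_page is not None:
        argv += ["-e", str(last_page - 1)]  # -e is 0-based
    return argv
```

The resulting list can be handed to `subprocess.run(argv, check=True)`.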

Environment Variables Description

+
+

Note

+

Starting from this version, mineru is an orchestration client built on top of mineru-api:

  • Without --api-url, the CLI launches a temporary local mineru-api
  • With --api-url, the CLI connects to that FastAPI service directly
  • --url is no longer the MinerU API address; it is the OpenAI-compatible backend URL used by server-side vlm/hybrid-http-client

+
+

Some parameters of the MinerU command line tools have equivalent environment variables. Generally, environment variables take priority over command line parameters and apply to all command line tools. The environment variables and their descriptions are listed below:

+
    +
  • +

    MINERU_TOOLS_CONFIG_JSON:

    +
      +
    • Used to specify configuration file path
    • +
    • Defaults to mineru.json in the user directory; set this variable to use a different configuration file path.
    • +
    +
  • +
  • +

    MINERU_FORMULA_ENABLE:

    +
      +
    • Used to enable formula parsing
    • +
    • Defaults to true; set it to false to disable formula parsing.
    • +
    +
  • +
  • +

    MINERU_FORMULA_CH_SUPPORT:

    +
      +
    • Used to enable Chinese formula parsing optimization (experimental feature)
    • +
    • Defaults to false; set it to true to enable Chinese formula parsing optimization.
    • +
    • Only effective for pipeline backend.
    • +
    +
  • +
  • +

    MINERU_TABLE_ENABLE:

    +
      +
    • Used to enable table parsing
    • +
    • Defaults to true; set it to false to disable table parsing.
    • +
    +
  • +
  • +

    MINERU_TABLE_MERGE_ENABLE:

    +
      +
    • Used to enable table merging functionality
    • +
    • Defaults to true; set it to false to disable table merging.
    • +
    +
  • +
  • +

    MINERU_PDF_RENDER_TIMEOUT:

    +
      +
    • Used to set the timeout (in seconds) for rendering PDFs to images.
    • +
    • Default is 300 seconds; you can set a different value via an environment variable to adjust the rendering timeout.
    • +
    • Only effective on Linux and macOS systems.
    • +
    +
  • +
  • +

    MINERU_PDF_RENDER_THREADS:

    +
      +
    • Used to set the number of threads used when rendering PDFs to images.
    • +
    • Default is 4; you can set a different value via an environment variable to adjust the number of threads for image rendering.
    • +
    • Only effective on Linux and macOS systems.
    • +
    +
  • +
  • +

    MINERU_INTRA_OP_NUM_THREADS:

    +
      +
    • Used to set the intra_op thread count for ONNX models, which affects the computation speed of individual operators.
    • +
    • Defaults to -1 (auto-select); set it to another value to adjust the thread count.
    • +
    +
  • +
  • +

    MINERU_INTER_OP_NUM_THREADS:

    +
      +
    • Used to set the inter_op thread count for ONNX models, which affects the parallel execution of multiple operators.
    • +
    • Defaults to -1 (auto-select); set it to another value to adjust the thread count.
    • +
    +
  • +
  • +

    MINERU_HYBRID_BATCH_RATIO:

    +
      +
    • Used to set the batch ratio for small model processing in hybrid-* backends.
    • +
    • Commonly used in hybrid-http-client, it allows adjusting the VRAM usage of a single client by controlling the batch ratio of small models.
    • +
    • + + + + + + + + + + + + + + + + + + + + + + + + + +
      Single Client VRAM Size | MINERU_HYBRID_BATCH_RATIO
      <= 6 GB | 8
      <= 4.5 GB | 4
      <= 3 GB | 2
      <= 2.5 GB | 1
      +
    • +
    +
  • +
  • +

    MINERU_HYBRID_FORCE_PIPELINE_ENABLE:

    +
      +
    • Used to force the text extraction part in hybrid-* backends to be processed using small models.
    • +
    • Defaults to false. Can be set to true via environment variable to enable this feature, thereby reducing hallucinations in certain extreme cases.
    • +
    +
  • +
  • +

    MINERU_VL_MODEL_NAME:

    +
      +
    • Used to specify the model name for the vlm/hybrid backend, allowing you to designate the model required for MinerU to run when multiple models exist on a remote openai-server.
    • +
    +
  • +
  • +

    MINERU_VL_API_KEY:

    +
      +
    • Used to specify the API Key for the vlm/hybrid backend, enabling authentication on the remote openai-server.
    • +
    +
  • +
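The VRAM table given under MINERU_HYBRID_BATCH_RATIO above can be read as a simple lookup: pick the smallest VRAM limit that still covers the client. A minimal sketch, assuming the table rows are exhaustive for small cards (the function name `suggested_batch_ratio` is hypothetical, and returning `None` above 6 GB simply means "leave the variable unset", since the table gives no value there):

```python
# (maximum VRAM in GB, ratio) rows copied from the table, smallest limit first.
_RATIO_TABLE = [(2.5, 1), (3.0, 2), (4.5, 4), (6.0, 8)]


def suggested_batch_ratio(vram_gb: float):
    """Return the table's MINERU_HYBRID_BATCH_RATIO for a client's VRAM size.

    Hypothetical helper for illustration; returns None when the table has no
    matching row (more than 6 GB), in which case the variable stays unset.
    """
    for limit, ratio in _RATIO_TABLE:
        if vram_gb <= limit:
            return ratio
    return None


def batch_ratio_env(vram_gb: float) -> dict:
    """Build the extra environment entries for a client with the given VRAM."""
    ratio = suggested_batch_ratio(vram_gb)
    return {} if ratio is None else {"MINERU_HYBRID_BATCH_RATIO": str(ratio)}
```

The dictionary from `batch_ratio_env` can be merged into the environment of a `mineru` process that uses the hybrid-http-client backend.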
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/usage/index.html b/usage/index.html new file mode 100644 index 00000000..6b27b733 --- /dev/null +++ b/usage/index.html @@ -0,0 +1,1690 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Usage - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Usage Guide

+

This section explains how to use the project, progressing from basic to advanced usage through the following sections:

+

Table of Contents

+ +

Getting Started

+

We recommend reading the documentation in the order listed above, which will help you better understand and use the project features.

+

If you encounter issues during usage, please check the FAQ.

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/usage/model_source/index.html b/usage/model_source/index.html new file mode 100644 index 00000000..e88d5dfe --- /dev/null +++ b/usage/model_source/index.html @@ -0,0 +1,1935 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Model Source - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Model Source Documentation

+

MinerU uses HuggingFace and ModelScope as model repositories. Users can switch model sources or use local models as needed.

+
    +
  • HuggingFace is the default model source, providing excellent loading speed and high stability globally.
  • +
  • ModelScope is the best choice for users in mainland China. It provides SDK modules that are seamlessly compatible with the HuggingFace SDK, making it suitable for users who cannot access HuggingFace.
  • +
+

Methods to Switch Model Sources

+

Switch via Command Line Parameters

+

Currently, only the mineru command line tool supports switching model sources through command line parameters. Other command line tools such as mineru-api, mineru-gradio, etc., do not support this yet. +

mineru -p <input_path> -o <output_path> --source modelscope
+
+

Switch via Environment Variables

+

You can switch model sources by setting environment variables in any situation. This applies to all command line tools and API calls. +

export MINERU_MODEL_SOURCE=modelscope
+
+or +
import os
+os.environ["MINERU_MODEL_SOURCE"] = "modelscope"
+
+
+

Tip

+

A model source set through an environment variable takes effect in the current terminal session until the terminal is closed or the variable is changed. It has higher priority than command line parameters: if both are set, the command line parameter is ignored.

+
+
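The precedence rule in the tip above can be summarized in a few lines. This is only an illustration of the documented behavior, not MinerU's actual implementation: the function name `effective_model_source` is hypothetical, and the literal default value `"huggingface"` is an assumption based on HuggingFace being described as the default model source.

```python
import os


def effective_model_source(cli_source=None, environ=None):
    """Resolve the model source the way the tip describes it.

    Illustration only: the environment variable wins over the --source
    command line value, and "huggingface" (assumed literal) is the fallback.
    """
    env = os.environ if environ is None else environ
    return env.get("MINERU_MODEL_SOURCE") or cli_source or "huggingface"
```

With `MINERU_MODEL_SOURCE=modelscope` exported, passing `--source local` would therefore still resolve to `modelscope`.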

Using Local Models

+

1. Download Models to Local Storage

+

mineru-models-download --help
+
+or use the interactive command line tool to select model downloads: +
mineru-models-download
+
+
+

Note

+
    +
  • After download completion, the model path will be output in the current terminal window and automatically written to mineru.json in the user directory.
  • +
  • You can also create it by copying the configuration template file to your user directory and renaming it to mineru.json.
  • +
  • After downloading models locally, you can freely move the model folder to other locations while updating the model path in mineru.json.
  • +
  • If you deploy the model folder to another server, please ensure you move the mineru.json file to the user directory of the new device and configure the model path correctly.
  • +
  • If you need to update model files, you can run the mineru-models-download command again. Model updates do not support custom paths currently - if you haven't moved the local model folder, model files will be incrementally updated; if you have moved the model folder, model files will be re-downloaded to the default location and mineru.json will be updated.
  • +
+
+

2. Use Local Models for Parsing

+

mineru -p <input_path> -o <output_path> --source local
+
+or enable through environment variables: +
export MINERU_MODEL_SOURCE=local
+mineru -p <input_path> -o <output_path>
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/usage/quick_usage/index.html b/usage/quick_usage/index.html new file mode 100644 index 00000000..de84d1a1 --- /dev/null +++ b/usage/quick_usage/index.html @@ -0,0 +1,1974 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Quick Usage - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Using MinerU

+

Quick Model Source Configuration

+

MinerU uses huggingface as the default model source. If users cannot access huggingface due to network restrictions, they can conveniently switch the model source to modelscope through environment variables: +

export MINERU_MODEL_SOURCE=modelscope
+
+For more information about model source configuration and custom local model paths, please refer to the Model Source Documentation. +

Quick Usage via Command Line

+

MinerU has built-in command line tools that allow users to quickly use MinerU for PDF parsing through the command line: +

mineru -p <input_path> -o <output_path>
+
+
+

Tip

+
    +
  • <input_path>: Local PDF/image file or directory
  • +
  • <output_path>: Output directory
  • +
  • Without --api-url, the CLI launches a temporary local mineru-api
  • +
  • With --api-url, the CLI connects to an existing local or remote FastAPI service directly
  • +
+

For more information about output files, please refer to Output File Documentation.

+
+
+

Note

+

The command line tool automatically attempts cuda/mps acceleration on Linux and macOS systems. Windows users who need cuda acceleration should visit the PyTorch official website and select the install command that matches their cuda version to install acceleration-enabled torch and torchvision.

+
+

If you need to adjust parsing options through custom parameters, you can also check the more detailed Command Line Tools Usage Instructions in the documentation.

+

Advanced Usage via API, WebUI, http-client/server

+
    +
  • Direct Python API calls: Python Usage Example
  • +
  • FastAPI calls: +
    mineru-api --host 0.0.0.0 --port 8000
    +
    +

    Tip

    +

    Access http://127.0.0.1:8000/docs in your browser to view the API documentation.

    +
      +
    • Health endpoint: GET /health (returns protocol_version, processing_window_size, max_concurrent_requests, and task stats)
    • +
    • Asynchronous task submission endpoint: POST /tasks
    • +
    • Synchronous parsing endpoint: POST /file_parse
    • +
    • Task query endpoints: GET /tasks/{task_id}, GET /tasks/{task_id}/result
    • +
    • API outputs are controlled by the server and written to ./output by default
    • +
    +

    POST /tasks returns immediately with a task_id. POST /file_parse uses the same task manager internally, waits for the task to finish, and then returns the final result synchronously.

    Tasks are tracked only in-process for a single mineru-api instance. Task status is not preserved across service restarts, --reload, or multi-process deployments.

    Completed or failed tasks are retained for 24 hours by default, then their task state and output directory are cleaned automatically. After cleanup, task status and result endpoints return 404.

    Use MINERU_API_TASK_RETENTION_SECONDS and MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS to adjust retention and cleanup polling intervals.

    +

    Asynchronous task submission example: +

    curl -X POST http://127.0.0.1:8000/tasks \
    +  -F "files=@demo/pdfs/demo1.pdf" \
    +  -F "return_md=true"
    +
    +

    Synchronous parsing example: +

    curl -X POST http://127.0.0.1:8000/file_parse \
    +  -F "files=@demo/pdfs/demo1.pdf" \
    +  -F "return_md=true" \
    +  -F "response_format_zip=true" \
    +  -F "return_original_file=true"
    +
    +

    Poll task status and fetch results: +

    curl http://127.0.0.1:8000/tasks/<task_id>
    +curl http://127.0.0.1:8000/tasks/<task_id>/result
    +curl http://127.0.0.1:8000/health
    +
    +
    +
  • +
  • +

    Start Gradio WebUI visual frontend: +

    mineru-gradio --server-name 0.0.0.0 --server-port 7860
    +
    +
    +

    Tip

    +
      +
    • Access http://127.0.0.1:7860 in your browser to use the Gradio WebUI.
    • +
    +
    +
  • +
  • +

    Using http-client/server method: +

    # Start openai compatible server (requires vllm or lmdeploy environment)
    +mineru-openai-server --port 30000
    +
    +
    +

    Tip

    +

    In another terminal, connect to the OpenAI-compatible server via the http client: +

    mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
    +
    +
    +
  • +
+
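The asynchronous flow described in the FastAPI item above (submit with POST /tasks, then poll GET /tasks/{task_id} and fetch GET /tasks/{task_id}/result) can be sketched in Python as follows. Several parts are assumptions: the status payload schema is not documented here, so the caller supplies an `is_done` predicate instead of this sketch guessing field names; the result is assumed to be JSON; and the function names are hypothetical.

```python
import json
import time
import urllib.request


def task_urls(base_url: str, task_id: str) -> dict:
    """Build the task endpoints listed in the documentation."""
    base = base_url.rstrip("/")
    return {
        "status": f"{base}/tasks/{task_id}",
        "result": f"{base}/tasks/{task_id}/result",
    }


def fetch_json(url: str, timeout: float = 30.0):
    """GET a URL and decode a JSON body (assumes the endpoint returns JSON).

    urlopen raises urllib.error.HTTPError on a 404, which is what the
    endpoints return once a task has been cleaned up.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def poll_task(base_url, task_id, is_done, fetch=fetch_json,
              interval=2.0, max_polls=150):
    """Poll the status endpoint until is_done(status) is true, then fetch the result.

    is_done is supplied by the caller because the status schema is not
    specified in this document.
    """
    urls = task_urls(base_url, task_id)
    for _ in range(max_polls):
        if is_done(fetch(urls["status"])):
            return fetch(urls["result"])
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Keeping `fetch` injectable makes the loop easy to exercise without a running mineru-api instance. Because task state is kept in-process, polling must target the same instance that accepted the task.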
+

Note

+

All officially supported vllm/lmdeploy parameters can be passed to MinerU as command line arguments. This works with the following commands: mineru, mineru-openai-server, mineru-gradio, mineru-api. Commonly used vllm/lmdeploy parameters and usage methods are collected in Advanced Command Line Parameters.

+
+

Extending MinerU Functionality with Configuration Files

+

MinerU works out of the box, but it also supports extending functionality through configuration files. You can edit the mineru.json file in your user directory to add custom configurations.

+
+

Important

+

The mineru.json file will be automatically generated when you use the built-in model download command mineru-models-download, or you can create it by copying the configuration template file to your user directory and renaming it to mineru.json.

+
+

Here are some available configuration options:

+
    +
  • +

    latex-delimiter-config:

    +
      +
    • Used to configure LaTeX formula delimiters
    • +
    • Defaults to the $ symbol; it can be changed to other symbols or strings as needed.
    • +
    +
  • +
  • +

    llm-aided-config:

    +
      +
    • Used to configure parameters for LLM-assisted title hierarchy
    • +
    • Compatible with all LLM models that support the openai protocol; defaults to Alibaba Cloud Bailian's qwen3-next-80b-a3b-instruct model.
    • +
    • You need to configure your own API key and set enable to true to enable this feature.
    • +
    • If your API provider does not support the enable_thinking parameter, please manually remove it.
        +
      • For example, in your configuration file, the llm-aided-config section may look like: +
        "llm-aided-config": {
        +   "api_key": "your_api_key",
        +   "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        +   "model": "qwen3-next-80b-a3b-instruct",
        +   "enable_thinking": false,
        +   "enable": false
        +}
        +
      • +
      • To remove the enable_thinking parameter, simply delete the line containing "enable_thinking": false, resulting in: +
        "llm-aided-config": {
        +   "api_key": "your_api_key",
        +   "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        +   "model": "qwen3-next-80b-a3b-instruct",
        +   "enable": false
        +}
        +
      • +
      +
    • +
    +
  • +
  • +

    models-dir:

    +
      +
    • Used to specify local model storage directory
    • +
    • Please specify model directories for pipeline and vlm backends separately.
    • +
    • After specifying the directory, you can use local models by configuring the environment variable export MINERU_MODEL_SOURCE=local.
    • +
    +
  • +
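The manual edit described under llm-aided-config above (deleting the enable_thinking line) can also be done programmatically. The following is a sketch, assuming mineru.json is a regular JSON file and that "user directory" means the home directory returned by `Path.home()`. The function names are hypothetical, while the keys are the ones shown in the example configuration.

```python
import copy
import json
from pathlib import Path


def without_enable_thinking(config: dict) -> dict:
    """Return a copy of the config with llm-aided-config.enable_thinking removed."""
    updated = copy.deepcopy(config)
    # pop(..., None) is a no-op when the key or the whole section is absent.
    updated.get("llm-aided-config", {}).pop("enable_thinking", None)
    return updated


def rewrite_config(path: Path = Path.home() / "mineru.json") -> None:
    """Rewrite mineru.json in place (assumed location: the user's home directory)."""
    config = json.loads(path.read_text(encoding="utf-8"))
    cleaned = without_enable_thinking(config)
    path.write_text(json.dumps(cleaned, indent=4, ensure_ascii=False) + "\n",
                    encoding="utf-8")
```

Working on a copy keeps the original dictionary untouched, which makes it easy to compare the two versions before writing the file back.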
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/demo/index.html b/zh/demo/index.html new file mode 100644 index 00000000..367b65b8 --- /dev/null +++ b/zh/demo/index.html @@ -0,0 +1,1658 @@ + + + + + + + + + + + + + + + + + + + + + + + 在线演示 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Online Demo

+ + + + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/faq/index.html b/zh/faq/index.html new file mode 100644 index 00000000..868ac295 --- /dev/null +++ b/zh/faq/index.html @@ -0,0 +1,1700 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 常见问题解答 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Frequently Asked Questions

+

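If your issue is not listed, you can also use DeepWiki to chat with an AI assistant, which can resolve most common problems.

+

If you still cannot resolve the problem, you can join the community via Discord or WeChat to talk with other users and developers.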

+
+ImportError: libGL.so.1: cannot open shared object file: No such file or directory when running on Ubuntu 22.04 in WSL2 +

Ubuntu 22.04 in WSL2 lacks the libgl library. Install it with the following command:

+
sudo apt-get install libgl1-mesa-glx
+
+

Reference: #388
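Before installing anything, it can help to confirm that the library really is missing. The following is a minimal sketch using only the Python standard library; the helper name `has_libgl` is hypothetical, and a positive result only means the dynamic linker can locate some libGL, not that every dependent package will import cleanly.

```python
import ctypes.util


def has_libgl() -> bool:
    """Return True if the dynamic linker can locate a libGL library."""
    return ctypes.util.find_library("GL") is not None


if not has_libgl():
    print("libGL not found; try: sudo apt-get install libgl1-mesa-glx")
```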

+
+
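+When installed and used on Linux, some text is missing from the parsing results. +

Since version 2.0, MinerU uses pypdfium2 instead of pymupdf as the PDF page rendering engine to resolve the AGPLv3 licensing issue. On some Linux distributions, missing CJK fonts can cause part of the text to be lost when PDF pages are rendered to images. +To fix this, install the noto font packages with the following commands, which work on Ubuntu/Debian systems: +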

sudo apt update
+sudo apt install fonts-noto-core
+sudo apt install fonts-noto-cjk
+fc-cache -fv
+
+Alternatively, build the image using our Docker deployment method; the image includes these font packages by default. +

Reference: #2915

+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/index.html b/zh/index.html new file mode 100644 index 00000000..b2651521 --- /dev/null +++ b/zh/index.html @@ -0,0 +1,1741 @@ + + + + + + + + + + + + + + + + + + + + + + + MinerU - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

MinerU

+ +
+ +

+ +

+
+ + + +


+ + +

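Project Introduction

+

MinerU is a tool that converts PDFs into machine-readable formats (such as Markdown and JSON), from which content can easily be extracted into any other format. +MinerU was born during the pre-training of InternLM. We concentrate on solving symbol conversion in scientific literature and hope to contribute to technological development in the era of large models. +Compared with well-known commercial products, MinerU is still young. If you run into problems or the results fall short of expectations, please submit an issue and attach the relevant PDF.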

+

+

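Key Features

+
    +
  • Removes headers, footers, footnotes, page numbers, and similar elements to keep the text semantically coherent
  • +
  • Outputs text in human reading order, for single-column, multi-column, and complex layouts
  • +
  • Preserves the structure of the original document, including headings, paragraphs, and lists
  • +
  • Extracts images, image captions, tables, table captions, and footnotes
  • +
  • Automatically recognizes formulas in the document and converts them to LaTeX
  • +
  • Automatically recognizes tables in the document and converts them to HTML
  • +
  • Automatically detects scanned PDFs and garbled PDFs and enables OCR
  • +
  • OCR supports detection and recognition of 109 languages
  • +
  • Supports multiple output formats, such as multimodal and NLP Markdown, JSON sorted by reading order, and an information-rich intermediate format
  • +
  • Supports multiple visualizations, including layout and span visualization, for efficient review and quality checking of the output
  • +
  • Runs in pure-CPU environments and supports GPU (CUDA) / NPU (CANN) / MPS acceleration
  • +
  • Compatible with Windows, Linux, and Mac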
  • +
+

Usage Guide

+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/quick_start/docker_deployment/index.html b/zh/quick_start/docker_deployment/index.html new file mode 100644 index 00000000..3cdf1cf3 --- /dev/null +++ b/zh/quick_start/docker_deployment/index.html @@ -0,0 +1,1963 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Docker部署 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

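Deploying MinerU with Docker

+

MinerU provides a convenient Docker deployment method that helps you set up the environment quickly and avoids some tricky compatibility problems.

+

Build the image from the Dockerfile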

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/Dockerfile
+docker build -t mineru:latest -f Dockerfile .
+
+

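Docker notes

+

MinerU's Docker image uses vllm/vllm-openai as its base image, so the vllm inference acceleration framework and its required dependencies are included by default. On devices that meet the requirements, you can use vllm directly to accelerate VLM model inference.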

+
+

Note

+

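The requirements for accelerating VLM model inference with vllm are:

+
    +
  • The device has a GPU of the Volta architecture or later, with at least 8 GB of available VRAM.
  • +
  • The host GPU driver supports CUDA 12.9.1 or later; check the driver version with the nvidia-smi command.
  • +
  • The Docker container can access the host's GPU devices.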
  • +
+
+
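The first requirement above can be expressed as a small check. This is a sketch under stated assumptions: the function name is hypothetical, the caller supplies the GPU's CUDA compute capability and available VRAM (for example, read from `nvidia-smi`), and Volta is taken to mean compute capability 7.0. The driver and container-access requirements are not covered here.

```python
def meets_vllm_requirements(compute_capability: tuple[int, int],
                            free_vram_gb: float) -> bool:
    """Check the GPU architecture and VRAM conditions for vllm acceleration.

    Volta corresponds to CUDA compute capability 7.0, so anything at or
    above (7, 0) counts as "Volta or later".
    """
    return compute_capability >= (7, 0) and free_vram_gb >= 8


print(meets_vllm_requirements((8, 6), 24))  # Ampere-class card with 24 GB
```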

Start the Docker container

+
docker run --gpus all \
+  --shm-size 32g \
+  -p 30000:30000 -p 7860:7860 -p 8000:8000 \
+  --ipc=host \
+  -it mineru:latest \
+  /bin/bash
+
+

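After running this command you are placed in an interactive terminal inside the Docker container, with several ports mapped for services you may want to use. You can run MinerU commands directly inside the container. +You can also start a MinerU service directly by replacing /bin/bash with a service startup command; see "Starting services via command" for details.

+

Start services directly with Docker Compose

+

We provide a compose.yaml file that you can use to start MinerU services quickly.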

+
# Download the compose.yaml file
+wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/compose.yaml
+
+
+

Note

+
    +
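  • The compose.yaml file contains configurations for several MinerU services; start only the ones you need.
  • +
  • Different services may take additional parameters, which you can view and edit in compose.yaml.
  • +
  • Because the vllm inference framework pre-allocates VRAM, you may not be able to run multiple vllm services on the same machine at once. Before starting the vlm-openai-server service or using the vlm-vllm-engine backend, make sure other services that may use VRAM are stopped.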
  • +
+
+
+

Start the OpenAI-compatible API service

+

and connect to openai-server through the vlm-http-client backend +

docker compose -f compose.yaml --profile openai-server up -d
+
+
+

Tip

+

In another terminal, connect to the openai server with the http client (only a CPU and network access are needed, no vllm environment) +

mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://<server_ip>:30000
+
+
+
+

Start the Web API service

+
docker compose -f compose.yaml --profile api up -d
+
+
+

Tip

+

Visit http://<server_ip>:8000/docs in a browser to view the API documentation.

+
+
+

Start the Gradio WebUI service

+
docker compose -f compose.yaml --profile gradio up -d
+
+
+

Tip

+
    +
  • Visit http://<server_ip>:7860 in a browser to use the Gradio WebUI.
  • +
+
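The three Compose profiles on this page each expose one port. As a quick reference, the mapping can be sketched as a small helper; the function and dictionary names are hypothetical, while the ports and the `/docs` path are the ones quoted in the tips above.

```python
# Profile name -> port, as quoted in the tips on this page.
SERVICE_PORTS = {"openai-server": 30000, "api": 8000, "gradio": 7860}


def service_url(profile: str, server_ip: str) -> str:
    """Return the URL to open or pass to a client for a given Compose profile."""
    port = SERVICE_PORTS[profile]
    # Only the Web API profile documents a dedicated path (/docs).
    path = "/docs" if profile == "api" else ""
    return f"http://{server_ip}:{port}{path}"


print(service_url("gradio", "192.168.1.20"))
```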
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/quick_start/extension_modules/index.html b/zh/quick_start/extension_modules/index.html new file mode 100644 index 00000000..2b470aac --- /dev/null +++ b/zh/quick_start/extension_modules/index.html @@ -0,0 +1,1921 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 扩展模块安装 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

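MinerU Extension Module Installation Guide

+

MinerU supports installing extension modules on demand, to add functionality or to support specific model backends.

+

Common Scenarios

+

Core functionality installation

+

The core module is MinerU's core dependency and contains every functional module except vllm/lmdeploy. Installing it ensures that MinerU's basic functionality works. +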

uv pip install "mineru[core]"
+
+
+

Accelerate VLM model inference with vllm

+
+

Note

+

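vllm and lmdeploy offer nearly identical acceleration and usage for vlm inference. Choose one of them according to your situation; installing both is not recommended, to avoid potential dependency conflicts.

+
+

The vllm module accelerates VLM model inference and suits GPUs of the Volta architecture or later (8 GB of VRAM or more). Installing it can significantly speed up model inference. +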

uv pip install "mineru[core,vllm]"
+
+
+

Tip

+

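If an error occurs while installing an extra that includes vllm, consult the official vllm documentation, or deploy the image with Docker instead.

+
+
+

Accelerate VLM model inference with lmdeploy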

+
+

Note

+

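vllm and lmdeploy offer nearly identical acceleration and usage for vlm inference. Choose one of them according to your situation; installing both is not recommended, to avoid potential dependency conflicts.

+
+

The lmdeploy module accelerates VLM model inference and suits GPUs of the Volta architecture or later (8 GB of VRAM or more). Installing it can significantly speed up model inference. +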

uv pip install "mineru[core,lmdeploy]"
+
+
+

Tip

+

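If an error occurs while installing an extra that includes lmdeploy, consult the official lmdeploy documentation.

+
+
+

Install the lightweight client to connect to an OpenAI-compatible server (vlm-http-client mode)

+

If you need a lightweight client on an edge device that connects to an OpenAI-compatible server in vlm mode, install the base mineru package. It is very lightweight and suits devices with only a CPU and a network connection. +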

uv pip install mineru
+mineru -p <input_path> -o <output_path> -b vlm-http-client -u http://127.0.0.1:30000
+
+
+

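Install the lightweight client to connect to an OpenAI-compatible server (hybrid-http-client mode)

+

If you need a lightweight client on an edge device that connects to an OpenAI-compatible server in hybrid mode, install mineru's pipeline extra. It is relatively lightweight, works on devices with only a CPU and a network connection, and runs faster on devices with GPU acceleration. +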

uv pip install "mineru[pipeline]"
+mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
+
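The scenarios on this page reduce to a lookup from use case to install target. A minimal sketch follows; the scenario keys and function name are hypothetical labels, and the package specifiers are the ones shown in the commands above.

```python
# Scenario label (hypothetical) -> pip requirement shown on this page.
INSTALL_TARGETS = {
    "core": "mineru[core]",
    "vllm": "mineru[core,vllm]",
    "lmdeploy": "mineru[core,lmdeploy]",
    "vlm-http-client": "mineru",
    "hybrid-http-client": "mineru[pipeline]",
}


def install_command(scenario: str) -> list[str]:
    """Return the `uv pip install` argv for a scenario from this guide."""
    return ["uv", "pip", "install", INSTALL_TARGETS[scenario]]


print(" ".join(install_command("vllm")))
```

Returning an argv list rather than a shell string sidesteps quoting of the square brackets when the command is passed to `subprocess.run`.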
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/quick_start/index.html b/zh/quick_start/index.html new file mode 100644 index 00000000..09f507d5 --- /dev/null +++ b/zh/quick_start/index.html @@ -0,0 +1,1811 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 快速入门 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

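Quick Start

+

If you run into any installation problem, check the FAQ first

+

Online Experience

+

Official online web application

+

The official online version matches the client in functionality, with a polished interface and rich features; login is required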

+
    +
  • OpenDataLab
  • +
+

Gradio-based online demo

+

A WebUI built with Gradio, with a simple interface that includes only the core parsing functionality; no login required

+
    +
  • ModelScope
  • +
  • HuggingFace
  • +
+

Local Deployment

+
+

Warning

+

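Read before installing: supported hardware and software environments

+

To ensure the stability and reliability of the project, we optimize and test only against specific hardware and software environments during development. Users who deploy and run the project on a recommended configuration get the best performance and the fewest compatibility issues.

+

By focusing resources and effort on mainline environments, our team can resolve potential bugs more efficiently and deliver new features promptly.

+

In non-mainline environments, the diversity of hardware and software configurations and compatibility issues in third-party dependencies mean we cannot guarantee 100% availability. Users who want to run the project in a non-recommended environment should first read the documentation and the FAQ carefully, since most problems already have solutions there. Beyond that, we encourage the community to report issues so that we can gradually broaden support.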

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
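Parsing backend | pipeline | *-auto-engine | *-http-client
hybrid | vlm | hybrid | vlm
Backend features | Good compatibility | Higher hardware requirements | For OpenAI-compatible servers²
Accuracy¹ | 82+ | 90+
Operating system | Linux³ / Windows⁴ / macOS⁵
Pure-CPU platform support
GPU acceleration support | Volta or later GPU, or Apple Silicon | Not required
Minimum VRAM | 6GB | 10GB | 8GB | 3GB
RAM | At least 16GB, 32GB or more recommended | 8GB
Disk space | 20GB or more, SSD recommended | 2GB
Python version | 3.10-3.13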
+ +

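1 Accuracy is the End-to-End Evaluation Overall score on OmniDocBench (v1.5), measured with the latest MinerU version
+2 A server compatible with the OpenAI API, such as a local model server deployed with an inference framework like vLLM/SGLang/LMDeploy, or a remote model service
+3 Linux is supported only for distributions released in 2019 or later
+4 Because the key dependency ray does not support Python 3.13 on Windows, only Python 3.10 to 3.12 is supported there
+5 macOS 14.0 or later is required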
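The Python version rule from the table and footnote 4 can be checked programmatically. This is a minimal sketch; the function name is hypothetical, and only the version ranges stated above are encoded (3.10 to 3.13 in general, 3.10 to 3.12 on Windows).

```python
import sys


def python_supported(version: tuple[int, int], platform: str) -> bool:
    """Apply the documented range: 3.10-3.13, capped at 3.12 on Windows."""
    upper = (3, 12) if platform.startswith("win") else (3, 13)
    return (3, 10) <= version <= upper


# Check the interpreter that is running this script.
print(python_supported(tuple(sys.version_info[:2]), sys.platform))
```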

+
+

Tip

+

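Beyond the mainstream environments and platforms above, we also record the support status of other platforms reported by community users; see "Other accelerator card adaptation" for details.
+If you would like to share your own environment adaptation experience with the community, you are welcome to submit it through show-and-tell or open a PR against the "Other accelerator card adaptation" documentation.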

+
+

Install MinerU

+

Install MinerU with pip or uv

+
pip install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple
+pip install uv -i https://mirrors.aliyun.com/pypi/simple
+uv pip install -U "mineru[all]" -i https://mirrors.aliyun.com/pypi/simple 
+
+

Install MinerU from source

+
git clone https://github.com/opendatalab/MinerU.git
+cd MinerU
+uv pip install -e ".[all]" -i https://mirrors.aliyun.com/pypi/simple
+
+
+

Tip

+

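mineru[all] includes all core features, is compatible with Windows / Linux / macOS, and suits the vast majority of users. +If you need to choose the inference framework for the vlm model, or only plan to install a lightweight client on an edge device, see the Extension Module Installation Guide.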

+
+
+

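Deploy MinerU with Docker

+

MinerU provides a convenient Docker deployment method that helps you set up the environment quickly and avoids some tricky compatibility problems. +See the Docker deployment instructions in the documentation.

+
+

Using MinerU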

+
+

Tip

+

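By default, models hosted on huggingface are used for parsing. The required model files are downloaded automatically on first use and loaded from the local cache afterwards. If you cannot access huggingface, switch to the China-hosted mirror source with the following command: +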

export MINERU_MODEL_SOURCE=modelscope
+
+
+

If your device meets the GPU acceleration requirements in the table above, you can parse documents with a simple command: +

mineru -p <input_path> -o <output_path>
+
+If your device does not meet the GPU acceleration requirements, specify the pipeline backend to run in a pure-CPU environment: +
mineru -p <input_path> -o <output_path> -b pipeline
+
+

You can use MinerU for PDF parsing through the command line, the API, the WebUI, and more; see the Usage Guide for details.

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/reference/changelog/index.html b/zh/reference/changelog/index.html new file mode 100644 index 00000000..29ed67d7 --- /dev/null +++ b/zh/reference/changelog/index.html @@ -0,0 +1,3236 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 更新日志 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Changelog

+

This document records the update history of MinerU 2.6.7 and earlier. For updates in the latest versions, see the project README.

+
+

2.6 Series

+

2.6.7 (2025/12/12)

+
    +
  • Bug fix: #4168
  • +
+

2.6.6 (2025/12/02)

+

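Ascend adaptation improvements

+
    +
  • Optimized the command-line tool's initialization flow so that the vlm-vllm-engine backend in the Ascend adaptation is usable from the command line.
  • +
  • Updated the adaptation documentation for Atlas 300I Duo (310p) devices.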
  • +
+

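mineru-api tool improvements

+
    +
  • Added descriptive text to the mineru-api interface parameters to make the API documentation more readable.
  • +
  • The environment variable MINERU_API_ENABLE_FASTAPI_DOCS controls whether the auto-generated API documentation page is enabled; it is enabled by default.
  • +
  • Added a concurrency option for the vlm-vllm-async-engine, vlm-lmdeploy-engine, and vlm-http-client backends. The environment variable MINERU_API_MAX_CONCURRENT_REQUESTS sets the maximum number of concurrent API requests; the default is unlimited.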
  • +
+

2.6.5 (2025/11/26)

+
    +
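  • Added the new vlm-lmdeploy-engine backend. It is used like vlm-vllm-(async)engine but runs on lmdeploy as the inference engine and, unlike vllm, also supports native inference acceleration on Windows.
  • +
  • Added support for the domestic compute platforms Ascend/npu, T-Head/ppu, and MetaX/maca. Users can run the pipeline and vlm models on these platforms and accelerate vlm inference with the vllm/lmdeploy engines; see "Other accelerator card adaptation" for usage.
  • +
  • Adapting to domestic platforms is not easy. We have tried to make the adaptations complete and stable, but stability, compatibility, and accuracy-alignment issues may remain. Choose a suitable environment and scenario based on the traffic-light status shown on the adaptation documentation pages.
  • +
  • If you hit a problem with a domestic platform adaptation that the documentation does not cover, please report it in the designated discussions thread so that other users can find the solution.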
  • +
+

2.6.4 (2025/11/04)

+
    +
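  • Added a timeout for rendering PDF pages to images. The default is 300 seconds, configurable through the environment variable MINERU_PDF_RENDER_TIMEOUT, so that abnormal PDF files cannot block rendering for a long time.
  • +
  • Added CPU thread-count options for onnx models. The default is the number of system CPU cores, configurable through the environment variables MINERU_INTRA_OP_NUM_THREADS and MINERU_INTER_OP_NUM_THREADS, to reduce CPU contention under high concurrency.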
  • +
+

2.6.3 (2025/10/31)

+
    +
  • Added the new vlm-mlx-engine backend, which uses MLX to accelerate MinerU2.5 model inference on Apple Silicon devices. Compared with the vlm-transformers backend, vlm-mlx-engine is 100%~200% faster.
  • +
  • Bug fixes: #3849 #3859
  • +
+

2.6.2 (2025/10/24)

+

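pipeline backend improvements

+
    +
  • Added experimental support for Chinese formulas, enabled with the environment variable export MINERU_FORMULA_CH_SUPPORT=1. This feature may slightly reduce MFR speed and cause some long formulas to fail recognition, so enable it only when you need to parse Chinese formulas. To disable it, set the variable to 0.
  • +
  • OCR speed greatly improved, by 200%~300%. Thanks to @cjsdurj for the optimization.
  • +
  • Improved the accuracy and coverage of Latin-script recognition in the OCR model, and updated the Cyrillic (cyrillic), Arabic (arabic), Devanagari (devanagari), Telugu (te), and Tamil (ta) models to ppocr-v5, with accuracy more than 40% higher than the previous generation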
  • +
+

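vlm backend improvements

+
    +
  • Improved the matching logic for table_caption and table_footnote, raising caption and footnote matching accuracy and reading-order quality when a page contains several consecutive tables
  • +
  • Reduced CPU usage under high concurrency with the vllm backend, lowering server load
  • +
  • Adapted to vllm 0.11.0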
  • +
+

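General improvements

+
    +
  • Improved cross-page table merging and added support for merging cross-page continuation tables, improving results when multiple columns are merged
  • +
  • Added the environment variable MINERU_TABLE_MERGE_ENABLE for table merging. Table merging is enabled by default and can be disabled by setting the variable to 0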
  • +
+
+

2.5 Series

+

2.5.4 (2025/09/26)

+
    +
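  • 🎉🎉 The MinerU2.5 technical report is now available. Read it for a full account of the model architecture, training strategy, data engineering, and evaluation results.
  • +
  • Fixed an issue where some pdf files were identified as ai files and could not be parsed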
  • +
+

2.5.3 (2025/09/20)

+
    +
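  • Adjusted dependency version ranges so that GPUs of the Turing architecture and earlier can use vLLM to accelerate MinerU2.5 model inference.
  • +
  • Compatibility fixes for torch 2.8.0 in the pipeline backend.
  • +
  • Lowered the default concurrency of the vLLM async backend to reduce server load and avoid connection closures under heavy pressure.
  • +
  • See the announcement for more compatibility details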
  • +
+

2.5.2 (2025/09/19)

+

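We are officially releasing MinerU2.5, currently the strongest multimodal large model for document parsing. With only 1.2B parameters, MinerU2.5 surpasses top multimodal models such as Gemini2.5-Pro, GPT-4o, and Qwen2.5-VL-72B across the board on the OmniDocBench document parsing benchmark, and leads mainstream document-parsing-specific models (such as dots.ocr, MonkeyOCR, and PP-StructureV3) by a clear margin.

+

The model is published on HuggingFace and ModelScope. You are welcome to download and use it!

+

Highlights

+
    +
  • Extreme efficiency, SOTA performance: at a lightweight 1.2B scale, it achieves SOTA performance beyond models with tens or even hundreds of billions of parameters, redefining the efficiency of document parsing.
  • +
  • Advanced architecture, leading across the board: by combining "two-stage inference" (decoupling layout analysis from content recognition) with a native high-resolution architecture, it reaches SOTA in layout analysis, text recognition, formula recognition, table recognition, and reading order.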
  • +
+

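Key capability improvements

+
    +
  • Layout detection: more complete results that accurately cover non-body content such as headers, footers, and page numbers, with more precise element localization and more natural format restoration (for example lists and references).
  • +
  • Table parsing: greatly improved parsing of rotated tables, borderless and sparsely ruled tables, and long, difficult tables.
  • +
  • Formula recognition: significantly higher accuracy on mixed Chinese-English formulas and complex long formulas, greatly improving parsing of mathematical documents.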
  • +
+

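Repository changes

+

In addition, alongside the vlm 2.5 release we made some changes to the repository:

+
    +
  • The vlm backend is upgraded to version 2.5 and supports the MinerU2.5 model. It is no longer compatible with the MinerU2.0-2505-0.9B model; the last version that supports the 2.0 model is mineru-2.2.2.
  • +
  • vlm inference code has moved to mineru_vl_utils, reducing coupling with the main mineru repository and making independent iteration easier.
  • +
  • The vlm inference acceleration framework switched from sglang to vllm, with full compatibility with the vllm ecosystem, so users can run and accelerate the MinerU2.5 model on any platform that supports vllm.
  • +
  • Because the major vlm model upgrade supports more layout types, we adjusted the structure of the intermediate file middle.json and the result file content_list.json; see the documentation for details.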
  • +
+

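Other repository improvements

+
    +
  • Removed the file-extension whitelist check on input files. When the input is a PDF document or an image, the extension no longer matters, which improves usability.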
  • +
+
+

2.2 - 2.4 Series

+

2.2.2 (2025/09/10)

+
    +
  • Fixed an issue where a failure of the new table recognition model on some tables affected the whole parsing task
  • +
+

2.2.1 (2025/09/08)

+
    +
  • Fixed an issue where some newly added models were not downloaded by the model download command
  • +
+

2.2.0 (2025/09/05)

+

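Major updates

+
    +
  • This release focuses on table parsing accuracy. A new ruled-table recognition model and a new hybrid table structure parsing algorithm significantly improve table recognition in the pipeline backend.
  • +
  • We also added support for merging cross-page tables, available in both the pipeline and vlm backends, which further improves the completeness and accuracy of table parsing.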
  • +
+

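Other updates

+
    +
  • The pipeline backend can now parse tables rotated by 270 degrees, so tables at 0/90/270 degrees are supported
  • +
  • pipeline adds OCR support for Thai and Greek and updates the English OCR model to the latest version. English recognition accuracy improves by 11%, the Thai model reaches 82.68% accuracy, and the Greek model reaches 89.28% (by PPOCRv5)
  • +
  • Added a bbox field (mapped to the 0-1000 range) to the output content_list.json, so users can read the position of each content block directly
  • +
  • Removed the pipeline_old_linux install option. Old Linux systems such as CentOS 7 are no longer supported, which allows better support for uv's sync/run commands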
  • +
+
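The `bbox` field added to `content_list.json` is mapped to a 0-1000 range, so it has to be rescaled to get page coordinates. The sketch below assumes a plain linear mapping on each axis and an `[x0, y0, x1, y1]` ordering; both are assumptions for illustration, and the function name is hypothetical.

```python
def denormalize_bbox(bbox: list[float], page_width: float,
                     page_height: float) -> tuple[float, float, float, float]:
    """Rescale a 0-1000 normalized [x0, y0, x1, y1] box to page units."""
    x0, y0, x1, y1 = bbox
    return (x0 * page_width / 1000, y0 * page_height / 1000,
            x1 * page_width / 1000, y1 * page_height / 1000)


# A box covering the lower half of a 612 x 792 page.
print(denormalize_bbox([0, 500, 1000, 1000], 612, 792))
```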
+

2.1 Series

+

2.1.10 (2025/08/01)

+
    +
  • Fixed parsing results in the pipeline backend that deviated from expectations because of overlapping blocks #3232
  • +
+

2.1.9 (2025/07/30)

+
    +
  • Adapted to transformers 4.54.1
  • +
+

2.1.8 (2025/07/28)

+
    +
  • Adapted to sglang 0.4.9.post5
  • +
+

2.1.7 (2025/07/27)

+
    +
  • Adapted to transformers 4.54.0
  • +
+

2.1.6 (2025/07/26)

+
    +
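  • Fixed table anomalies when the vlm backend parses some handwritten documents
  • +
  • Fixed visualization boxes drifting out of position when the document is rotated #3175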
  • +
+

2.1.5 (2025/07/24)

+
    +
  • Adapted to sglang 0.4.9 and upgraded the dockerfile base image to sglang 0.4.9.post3
  • +
+

2.1.4 (2025/07/23)

+

Bug fixes

+
    +
  • Fixed excessive VRAM consumption in the MFR step of the pipeline backend in some cases #2771
  • +
  • Fixed inaccurate matching between image/table and caption/footnote in some cases #3129
  • +
+

2.1.1 (2025/07/16)

+

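Bug fixes

+
    +
  • Fixed text block content that could be lost by pipeline in some cases #3005
  • +
  • Fixed sglang-client requiring unnecessary packages such as torch #2968
  • +
  • Updated the dockerfile to fix incomplete parsed text caused by missing fonts on linux #2915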
  • +
+

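Usability updates

+
    +
  • Updated compose.yaml so users can directly start the sglang-server, mineru-api, and mineru-gradio services
  • +
  • Launched a new online documentation site and simplified the readme for a better documentation experience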
  • +
+

2.1.0 (2025/07/05)

+

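This is the first major update of MinerU 2. It brings many new features and improvements, including numerous performance optimizations, experience improvements, and bug fixes:

+

Performance optimizations

+
    +
  • Greatly improved preprocessing speed for documents at certain resolutions (long side around 2000 pixels)
  • +
  • Greatly improved post-processing speed when the pipeline backend batch-processes many documents with few pages (<10)
  • +
  • Layout analysis in the pipeline backend is about 20% faster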
  • +
+

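Experience improvements

+
    +
  • Built-in, ready-to-use fastapi service and gradio webui; see the documentation for details
  • +
  • sglang adapted to version 0.4.8, greatly reducing the VRAM requirement of the vlm-sglang backend, which can now run on GPUs with as little as 8 GB of VRAM (Turing architecture or later)
  • +
  • Added sglang parameter pass-through to all commands, so the sglang-engine backend accepts all sglang parameters just like sglang-server
  • +
  • Support for configuration-file-based feature extensions, including custom formula delimiters, heading level detection, and a custom local model directory; see the documentation for details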
  • +
+

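New features

+
    +
  • The pipeline backend is updated to the PP-OCRv5 multilingual text recognition model, which supports 37 languages including French, Spanish, Portuguese, Russian, and Korean, with an average accuracy gain of more than 30%. Details
  • +
  • The pipeline backend adds limited support for vertical text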
  • +
+
+

2.0 Series

+

2.0.6 (2025/06/20)

+
    +
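  • Fixed parsing interruptions in vlm mode caused by occasional invalid block content
  • +
  • Fixed parsing interruptions in vlm mode caused by incomplete table structures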
  • +
+

2.0.5 (2025/06/17)

+
    +
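  • Fixed sglang-client mode still needing to download models
  • +
  • Fixed sglang-client mode depending on packages such as torch that are not needed at runtime
  • +
  • Fixed only the first instance taking effect when several sglang-client instances are started with different urls in the same process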
  • +
+

2.0.3 (2025/06/15)

+
    +
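  • Fixed a key-value update error in the configuration file when the download model type is set to all
  • +
  • Fixed the formula and table switches having no effect in command-line mode, which made the features impossible to disable
  • +
  • Fixed a compatibility issue with sglang 0.4.7 in sglang-engine mode
  • +
  • Updated the Dockerfile and installation documentation for deploying the full MinerU in an sglang environment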
  • +
+

2.0.0 (2025/06/13)

+

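New architecture

+

MinerU 2.0 was deeply refactored in code structure and interaction, significantly improving usability, maintainability, and extensibility.

+
    +
  • Third-party dependency restrictions removed: the pymupdf dependency is completely removed, moving the project toward a more open and compliant open-source direction.
  • +
  • Works out of the box, easy to configure: no need to edit a JSON configuration file by hand; most parameters can be set directly from the command line or API.
  • +
  • Automatic model management: a new automatic model download and update mechanism lets users deploy models without manual intervention.
  • +
  • Offline-deployment friendly: a built-in model download command supports fully air-gapped deployments.
  • +
  • Leaner code structure: thousands of lines of redundant code removed and class inheritance simplified, significantly improving readability and development efficiency.
  • +
  • Unified intermediate format: a standardized middle_json format is adopted, compatible with most secondary development built on it, so ecosystem workloads can migrate smoothly.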
  • +
+

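New model

+

MinerU 2.0 integrates our latest small, high-performance multimodal document parsing model, delivering fast, accurate end-to-end document understanding.

+
    +
  • Small model, big capability: with fewer than 1B parameters, it surpasses traditional 72B-class vision-language models (VLM) in parsing accuracy.
  • +
  • Many functions in one: a single model covers multilingual recognition, handwriting recognition, layout analysis, table parsing, formula recognition, reading-order sorting, and other core tasks.
  • +
  • Extreme inference speed: accelerated by sglang on a single NVIDIA 4090, it reaches a peak throughput above 10,000 token/s, easily handling large-scale document processing.
  • +
  • Online demo: you can try our new VLM model on MinerU.net, Hugging Face, and ModelScope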
  • +
+

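Incompatible changes

+

To improve the overall architecture and long-term maintainability, this release contains some incompatible changes:

+
    +
  • The Python package name changed from magic-pdf to mineru, and the command-line tool changed from magic-pdf to mineru as well. Update your scripts and commands accordingly.
  • +
  • For the sake of modular design and ecosystem consistency, MinerU 2.0 no longer bundles the LibreOffice document conversion module. To process Office documents, convert them to PDF first with a separately deployed LibreOffice service, then parse them.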
  • +
+
+

1.x Series (historical versions)

+

1.3.12 (2025/05/24)

+

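Added support for the ppocrv5 model: the ch_server model is updated to PP-OCRv5_rec_server and the ch_lite model to PP-OCRv5_rec_mobile (model update required)

+
    +
  • In testing, ppocrv5(server) improved results on handwritten documents somewhat but was slightly less accurate than v4_server_doc on other document types, so the default ch model is unchanged and remains PP-OCRv4_server_rec_doc
  • +
  • Because ppocrv5 strengthens recognition of handwriting and special characters, you can manually select the ppocrv5 model for mixed Japanese/Traditional Chinese content and for handwritten documents
  • +
  • You can choose the model with the lang parameter, lang='ch_server' (python api) or --lang ch_server (command line):
  • +
  • ch: PP-OCRv4_rec_server_doc (default) (Chinese/English/Japanese/Traditional Chinese mixed, 15k-character dictionary)
  • +
  • ch_server: PP-OCRv5_rec_server (Chinese/English/Japanese/Traditional Chinese mixed plus handwriting, 18k-character dictionary)
  • +
  • ch_lite: PP-OCRv5_rec_mobile (Chinese/English/Japanese/Traditional Chinese mixed plus handwriting, 18k-character dictionary)
  • +
  • ch_server_v4: PP-OCRv4_rec_server (Chinese/English mixed, 6k-character dictionary)
  • +
  • ch_lite_v4: PP-OCRv4_rec_mobile (Chinese/English mixed, 6k-character dictionary)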
  • +
+

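Added support for handwritten documents: with improved layout recognition of handwritten text regions, handwritten documents can now be parsed

+
    +
  • This feature is on by default and needs no extra configuration
  • +
  • As described above, you can manually select the ppocrv5 model for better results on handwritten documents
  • +
+

The huggingface and modelscope demos have been updated to versions that support handwriting recognition and the ppocrv5 model, and can be tried online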

+

1.3.10 (2025/04/29)

+
    +
  • Custom formula delimiters are supported by editing the latex-delimiter-config item in the magic-pdf.json file in the user directory.
  • +
+

1.3.9 (2025/04/27)

+
    +
  • Improved formula parsing, raising the success rate of formula rendering
  • +
+

1.3.8 (2025/04/23)

+

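The default ocr model (ch) is updated to PP-OCRv4_server_rec_doc (model update required)

+
    +
  • PP-OCRv4_server_rec_doc builds on PP-OCRv4_server_rec and is trained on a mix of additional Chinese document data and PP-OCR training data. It adds recognition of some Traditional Chinese characters, Japanese, and special characters, supports more than 15,000 characters, and improves general text recognition as well as document text recognition.
  • +
  • PP-OCRv4_server_rec_doc/PP-OCRv4_server_rec/PP-OCRv4_mobile_rec performance comparison
  • +
  • Verification shows that PP-OCRv4_server_rec_doc is clearly more accurate on Chinese, English, Japanese, and Traditional Chinese text, alone or mixed, at a speed comparable to PP-OCRv4_server_rec, making it suitable for most scenarios.
  • +
  • PP-OCRv4_server_rec_doc may run words together in a small number of pure-English cases, where PP-OCRv4_server_rec performs better. We therefore kept the PP-OCRv4_server_rec model, which can be selected with lang='ch_server' (python api) or --lang ch_server (command line).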
  • +
+

1.3.7 (2025/04/22)

+
    +
  • Fixed the lang parameter being ignored when the table parsing model is initialized
  • +
  • Fixed a large slowdown of ocr and table parsing in cpu mode
  • +
+

1.3.4 (2025/04/16)

+
    +
  • Slightly sped up ocr-det by removing some useless blocks
  • +
  • Fixed in-page ordering errors caused by footnote in some cases
  • +
+

1.3.2 (2025/04/12)

+
    +
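  • Fixed incompatible dependency versions when installing under python 3.13 on windows
  • +
  • Reduced memory usage during batch inference
  • +
  • Improved parsing of tables rotated by 90 degrees
  • +
  • Improved parsing of very large tables in financial report samples
  • +
  • Fixed occasional run-together words in English text regions when no OCR language is specified (model update required)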
  • +
+

1.3.1 (2025/04/08)

+

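Fixed some compatibility issues

+
    +
  • Support for python 3.13
  • +
  • Made a final adaptation for some outdated linux systems (such as centos7); continued support in later versions is no longer guaranteed. Installation instructions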
  • +
+

1.3.0 (2025/04/03)

+

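Installation and compatibility improvements

+
    +
  • Resolved the compatibility problems caused by detectron2 by removing the use of layoutlmv3 in layout
  • +
  • torch compatibility extended to 2.2~2.6 (except 2.5)
  • +
  • cuda 11.8/12.4/12.6/12.8 supported (the cuda version is determined by torch), resolving compatibility problems for some users with 50-series and H-series GPUs
  • +
  • python compatibility extended to 3.10~3.12, fixing the automatic downgrade to 0.6.1 when installing outside 3.10
  • +
  • Streamlined offline deployment: after a successful deployment no model files need to be downloaded over the network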
  • +
+

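Performance optimizations

+
    +
  • Batch processing of multiple pdf files (sample script) speeds up parsing of many small files (compared with 1.0.1, formula parsing is up to more than 1400% faster and overall parsing up to more than 500% faster)
  • +
  • Optimized loading and use of the mfr model, reducing VRAM usage and speeding up parsing (re-run the model download process to get the incremental model update)
  • +
  • Reduced VRAM usage; the project now runs with as little as 6GB
  • +
  • Improved running speed on mps devices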
  • +
+

Parsing quality improvements

+
    +
  • The mfr model is updated to unimernet(2503), fixing lost line breaks in multi-line formulas
  • +
+

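Usability improvements

+
    +
  • paddleocr2torch now completely replaces the paddle framework and paddleocr in the project, resolving the conflict between paddle and torch and the thread-safety problems caused by the paddle framework
  • +
  • Added a real-time progress bar to the parsing process, so progress is visible while you wait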
  • +
+

1.2.1 (2025/03/03)

+

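Fixed some issues

+
    +
  • Fixed punctuation being affected when converting full-width letters and digits to half-width
  • +
  • Fixed inaccurate caption matching in some cases
  • +
  • Fixed lost formula spans in some cases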
  • +
+

1.2.0 (2025/02/24)

+

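In this version we fixed some issues and improved parsing efficiency and accuracy:

+

Performance optimizations

+
    +
  • Faster classification of pdf documents in auto mode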
  • +
+

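Parsing improvements

+
    +
  • Improved the parsing logic for watermarked documents, significantly improving their results
  • +
  • Improved the matching logic between multiple images/tables and captions on a single page, raising image-text matching accuracy in complex layouts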
  • +
+

Bug fixes

+
    +
  • Fixed an exception caused by image/table spans being filled into a textblock in some cases
  • +
  • Fixed empty title blocks in some cases
  • +
+

1.1.0 (2025/01/22)

+

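In this version we focused on improving parsing accuracy and efficiency:

+

Model upgrades (re-run the model download process to get the incremental model update)

+
    +
  • The layout recognition model is upgraded to the latest doclayout_yolo(2501) model, improving layout recognition accuracy
  • +
  • The formula parsing model is upgraded to the latest unimernet(2501) model, improving formula recognition accuracy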
  • +
+

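Performance optimizations

+
    +
  • On devices that meet certain requirements (16GB+ VRAM), optimized resource usage and a restructured processing pipeline make overall parsing more than 50% faster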
  • +
+

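Parsing quality improvements

+
    +
  • Added heading level detection to the online demos (mineru.net / huggingface / modelscope) (beta, enabled by default), which assigns levels to headings and makes documents more structured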
  • +
+

1.0.1 (2025/01/10)

+

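This is our first official release. Through extensive refactoring it brings a new API, broader compatibility, and new automatic language detection:

+

New API

+
    +
  • For the data-side API we introduced the Dataset class, intended as a powerful and flexible data processing framework. It currently supports several document formats, including images (.jpg and .png), PDF, Word (.doc and .docx), and PowerPoint (.ppt and .pptx), so that both simple and complex data processing tasks are covered.
  • +
  • For the user-side API we designed MinerU's processing flow as a series of composable Stages. Each Stage represents one processing step; users can define new Stages as needed and combine them to build custom data processing pipelines.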
  • +
+

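Broader compatibility

+
    +
  • Optimized dependencies and configuration for stable, efficient operation on ARM-based Linux systems.
  • +
  • Deep adaptation to Huawei Ascend NPU acceleration, providing self-controlled high-performance computing in line with domestic IT requirements and supporting the localization of AI application platforms. NPU acceleration tutorial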
  • +
+

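Automatic language detection

+
    +
  • With a new language detection model, setting lang to auto during document parsing automatically selects a suitable OCR language model, improving accuracy on scanned documents.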
  • +
+
+

0.x Series (historical versions)

+

0.10.0 (2024/11/22)

+

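By introducing hybrid OCR text extraction:

+
    +
  • Significantly better parsing in complex text layouts, such as formula-dense pages, irregular span regions, and text rendered as images
  • +
  • Combines the accurate, faster content extraction of text mode with the more precise span/line region recognition of OCR mode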
  • +
+

0.9.3 (2024/11/15)

+

Integrated RapidTable for table recognition: single-table parsing is more than 10 times faster, with higher accuracy and lower VRAM usage

+

0.9.2 (2024/11/06)

+

Integrated the StructTable-InternVL2-1B model for table recognition

+

0.9.0 (2024/10/31)

+

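This is a new version with extensive code refactoring. It resolves many issues, improves performance, lowers hardware requirements, and is easier to use:

+
    +
  • Refactored the sorting module to use layoutreader for reading-order sorting, giving very high accuracy across layouts
  • +
  • Refactored the paragraph stitching module, which now stitches paragraphs well across columns, pages, figures, and tables
  • +
  • Refactored list and table-of-contents recognition, greatly improving the accuracy of list and table-of-contents blocks and the parsing of their text paragraphs
  • +
  • Refactored the matching logic between figures, tables, and descriptive text, greatly improving the matching of captions and footnotes to figures and tables and reducing the loss rate of descriptive text to nearly 0
  • +
  • Added multilingual OCR support for detection and recognition of 84 languages; see the OCR language support list
  • +
  • Added VRAM reclamation and other VRAM optimizations that greatly lower requirements. With all acceleration except table acceleration enabled (layout/formula/OCR), the requirement drops from 16GB to 8GB; with all acceleration enabled, it drops from 24GB to 10GB
  • +
  • Improved the feature switches in the configuration file and added a separate formula detection switch; turning formula detection off when it is not needed greatly improves speed and parsing quality
  • +
  • Integrated PDF-Extract-Kit 1.0
  • +
  • Added the self-developed doclayout_yolo model, which is more than 10 times faster than the previous solution at similar parsing quality and can be switched with layoutlmv3 through the configuration file
  • +
  • Upgraded formula parsing to unimernet 0.2.1, improving accuracy while greatly lowering VRAM requirements
  • +
  • PDF-Extract-Kit 1.0 moved to a new repository, so models must be downloaded again; see "How to download models" for the steps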
  • +
+

0.8.1 (2024/09/27)

+

Fixed some bugs, and released a locally deployable version of the online demo together with a front-end interface

+

0.8.0 (2024/09/09)

+

Added quick deployment with a Dockerfile, and launched the huggingface and modelscope demos

+

0.7.1 (2024/08/30)

+

Integrated paddle tablemaster table recognition

+

0.7.0b1 (2024/08/09)

+

Simplified installation for better usability and added table recognition

+

0.6.2b1 (2024/08/01)

+

Resolved dependency conflicts and improved the installation documentation

+

Initial open-source release (2024/07/05)

+

The MinerU project was released as open source for the first time

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/reference/index.html b/zh/reference/index.html new file mode 100644 index 00000000..16959e37 --- /dev/null +++ b/zh/reference/index.html @@ -0,0 +1,1701 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 参考资料 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

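Reference Documentation

+

This section provides detailed reference material for the MinerU project: technical specifications, API documentation, output file format descriptions, and version history.

+

Contents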

+ +

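Documentation Overview

+

Output files

+

Understanding the output files MinerU generates is essential for using the tool effectively. The output file documentation provides:

+
    +
  • Visual debugging files: help you understand the document parsing process
  • +
  • Structured data files: contain detailed parsing results for further processing
  • +
  • File format specifications: a detailed description of each output file type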
  • +
+

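Changelog

+

The changelog records how MinerU has evolved, including:

+
    +
  • Version updates: new features and improvements in each version
  • +
  • Bug fixes: issues resolved in each version
  • +
  • Breaking changes: important changes that may affect your usage
  • +
  • Deprecations: features that are being phased out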
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/reference/output_files/index.html b/zh/reference/output_files/index.html new file mode 100644 index 00000000..b8b8bd97 --- /dev/null +++ b/zh/reference/output_files/index.html @@ -0,0 +1,3497 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 输出文件格式 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + 跳转至 + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

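MinerU Output Files

+

Overview

+

After the mineru command runs, it produces the main markdown file plus several auxiliary files for debugging, quality checks, and further processing. These include:

+
    +
  • Visual debugging files: help users see the parsing process and results at a glance
  • +
  • Structured data files: contain detailed parsing data for secondary development
  • +
+

The purpose and format of each file are described below.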

+

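Visual debugging files

+

Layout analysis file (layout.pdf)

+

File naming format: {original filename}_layout.pdf

+

Description:

+
    +
  • Visualizes the layout analysis result of each page
  • +
  • The number at the top-right corner of each detection box indicates reading order
  • +
  • Different background colors distinguish different types of content blocks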
  • +
+

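Use cases:

+
    +
  • Check whether the layout analysis is correct
  • +
  • Confirm that the reading order is reasonable
  • +
  • Debug layout-related problems
  • +
+

layout page example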

+

Text span file (span.pdf)

+
+

Note

+

Applies to the pipeline backend only

+
+

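File naming format: {original filename}_span.pdf

+

Description:

+
    +
  • Marks page content with outline boxes whose color depends on the span type
  • +
  • Used for quality checks and troubleshooting
  • +
+

Use cases:

+
    +
  • Quickly track down missing text
  • +
  • Check inline formula recognition
  • +
  • Verify text segmentation accuracy
  • +
+

span page example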

+

Structured data files

+
+

Important

+

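The output of the vlm backend changed substantially in version 2.5 and is not compatible with the pipeline output. If you build on the structured output, read this document carefully.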

+
+

pipeline backend output

+

Model inference result (model.json)

+

File naming format: {original filename}_model.json

+
Data structure definition
+
from pydantic import BaseModel, Field
+from enum import IntEnum
+
+class CategoryType(IntEnum):
+    """Content category enumeration"""
+    title = 0               # Title
+    plain_text = 1          # Text
+    abandon = 2             # Headers, footers, page numbers, and page annotations
+    figure = 3              # Figure
+    figure_caption = 4      # Figure caption
+    table = 5               # Table
+    table_caption = 6       # Table caption
+    table_footnote = 7      # Table footnote
+    isolate_formula = 8     # Display (interline) formula
+    formula_caption = 9     # Label of a display formula
+    embedding = 13          # Inline formula
+    isolated = 14           # Display (interline) formula
+    text = 15               # OCR recognition result
+
+class PageInfo(BaseModel):
+    """Page information"""
+    page_no: int = Field(description="Page index; the first page is 0", ge=0)
+    height: int = Field(description="Page height", gt=0)
+    width: int = Field(description="Page width", ge=0)
+
+class ObjectInferenceResult(BaseModel):
+    """Object recognition result"""
+    category_id: CategoryType = Field(description="Category", ge=0)
+    poly: list[float] = Field(description="Quadrilateral coordinates, formatted as [x0,y0,x1,y1,x2,y2,x3,y3]")
+    score: float = Field(description="Confidence of the inference result")
+    latex: str | None = Field(description="LaTeX parsing result", default=None)
+    html: str | None = Field(description="HTML parsing result", default=None)
+
+class PageInferenceResults(BaseModel):
+    """Page inference results"""
+    layout_dets: list[ObjectInferenceResult] = Field(description="Page recognition results")
+    page_info: PageInfo = Field(description="Page metadata")
+
+# Complete inference result
+inference_result: list[PageInferenceResults] = []
+
+
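Coordinate system
+

poly coordinate format: [x0, y0, x1, y1, x2, y2, x3, y3]

+
    +
  • The values are the coordinates of the top-left, top-right, bottom-right, and bottom-left points, in that order
  • +
  • The origin is at the top-left corner of the page
  • +
+

poly coordinate diagram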

+
Example data
+
[
+    {
+        "layout_dets": [
+            {
+                "category_id": 2,
+                "poly": [
+                    99.1906967163086,
+                    100.3119125366211,
+                    730.3707885742188,
+                    100.3119125366211,
+                    730.3707885742188,
+                    245.81326293945312,
+                    99.1906967163086,
+                    245.81326293945312
+                ],
+                "score": 0.9999997615814209
+            }
+        ],
+        "page_info": {
+            "page_no": 0,
+            "height": 2339,
+            "width": 1654
+        }
+    },
+    {
+        "layout_dets": [
+            {
+                "category_id": 5,
+                "poly": [
+                    99.13092803955078,
+                    2210.680419921875,
+                    497.3183898925781,
+                    2210.680419921875,
+                    497.3183898925781,
+                    2264.78076171875,
+                    99.13092803955078,
+                    2264.78076171875
+                ],
+                "score": 0.9999997019767761
+            }
+        ],
+        "page_info": {
+            "page_no": 1,
+            "height": 2339,
+            "width": 1654
+        }
+    }
+]
+
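Downstream code often wants an axis-aligned rectangle rather than the eight-value `poly`. A minimal sketch of the conversion, based on the corner order described above (top-left, top-right, bottom-right, bottom-left, origin at the page's top-left); the function name is hypothetical.

```python
def poly_to_bbox(poly: list[float]) -> tuple[float, float, float, float]:
    """Convert [x0,y0,x1,y1,x2,y2,x3,y3] to an axis-aligned (x_min, y_min, x_max, y_max)."""
    xs = poly[0::2]  # x coordinates of the four corners
    ys = poly[1::2]  # y coordinates of the four corners
    return (min(xs), min(ys), max(xs), max(ys))


# Corners given as top-left, top-right, bottom-right, bottom-left.
print(poly_to_bbox([10, 20, 110, 20, 110, 70, 10, 70]))
```

Taking min/max rather than reading fixed positions keeps the result correct even if a detection is slightly skewed.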
+

Intermediate processing result (middle.json)

+

File naming format: {original filename}_middle.json

+
Top-level structure
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field | Type | Description
pdf_info | list[dict] | Array of parsing results, one per page
_backend | string | Parsing mode: pipeline or vlm
_version_name | string | MinerU version number
+
Page information structure (pdf_info)
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
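Field | Description
preproc_blocks | Unsegmented intermediate result after PDF preprocessing
page_idx | Page index, starting from 0
page_size | Page width and height [width, height]
images | List of image block information
tables | List of table block information
interline_equations | List of display (interline) formula block information
discarded_blocks | Block information to be discarded
para_blocks | Content blocks after paragraph segmentation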
+
Block hierarchy
+
First-level block (table | image)
+└── Second-level block
+    └── Line (line)
+        └── Span (span)
+
+
First-level block fields
+ + + + + + + + + + + + + + + + + + + + + +
Field | Description
type | Block type: table or image
bbox | Rectangle coordinates of the block [x0, y0, x1, y1]
blocks | List of contained second-level blocks
+
Second-level block fields
+ + + + + + + + + + + + + + + + + + + + + +
Field | Description
type | Block type (see the table below)
bbox | Rectangle coordinates of the block
lines | List of contained line information
+
Second-level block types
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
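Type | Description
image_body | Image body
image_caption | Image caption text
image_footnote | Image footnote
table_body | Table body
table_caption | Table caption text
table_footnote | Table footnote
text | Text block
title | Title block
index | Table-of-contents block
list | List block
interline_equation | Display (interline) formula block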
+
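Line and span structure
+

Line (line) fields: +- bbox: rectangle coordinates of the line +- spans: list of contained spans

+

Span (span) fields: +- bbox: rectangle coordinates of the span +- type: span type (image, table, text, inline_equation, interline_equation) +- content | img_path: text content or image path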

+
Example data
+
{
+    "pdf_info": [
+        {
+            "preproc_blocks": [
+                {
+                    "type": "text",
+                    "bbox": [
+                        52,
+                        61.956024169921875,
+                        294,
+                        82.99800872802734
+                    ],
+                    "lines": [
+                        {
+                            "bbox": [
+                                52,
+                                61.956024169921875,
+                                294,
+                                72.0000228881836
+                            ],
+                            "spans": [
+                                {
+                                    "bbox": [
+                                        54.0,
+                                        61.956024169921875,
+                                        296.2261657714844,
+                                        72.0000228881836
+                                    ],
+                                    "content": "dependent on the service headway and the reliability of the departure ",
+                                    "type": "text",
+                                    "score": 1.0
+                                }
+                            ]
+                        }
+                    ]
+                }
+            ],
+            "layout_bboxes": [
+                {
+                    "layout_bbox": [
+                        52,
+                        61,
+                        294,
+                        731
+                    ],
+                    "layout_label": "V",
+                    "sub_layout": []
+                }
+            ],
+            "page_idx": 0,
+            "page_size": [
+                612.0,
+                792.0
+            ],
+            "_layout_tree": [],
+            "images": [],
+            "tables": [],
+            "interline_equations": [],
+            "discarded_blocks": [],
+            "para_blocks": [
+                {
+                    "type": "text",
+                    "bbox": [
+                        52,
+                        61.956024169921875,
+                        294,
+                        82.99800872802734
+                    ],
+                    "lines": [
+                        {
+                            "bbox": [
+                                52,
+                                61.956024169921875,
+                                294,
+                                72.0000228881836
+                            ],
+                            "spans": [
+                                {
+                                    "bbox": [
+                                        54.0,
+                                        61.956024169921875,
+                                        296.2261657714844,
+                                        72.0000228881836
+                                    ],
+                                    "content": "dependent on the service headway and the reliability of the departure ",
+                                    "type": "text",
+                                    "score": 1.0
+                                }
+                            ]
+                        }
+                    ]
+                }
+            ]
+        }
+    ],
+    "_backend": "pipeline",
+    "_version_name": "0.6.1"
+}
+
+
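As a minimal sketch of how this structure can be traversed, the following Python collects the text of every span under para_blocks, page by page. It relies only on the fields shown in the sample above (pdf_info, para_blocks, lines, spans, content, page_idx). The helper name iter_span_text is hypothetical and is not part of MinerU.

```python
import json

# A trimmed copy of the sample above; in practice this would come from
# json.load() on a {original filename}_middle.json file.
SAMPLE = json.loads("""
{
  "pdf_info": [
    {
      "page_idx": 0,
      "para_blocks": [
        {
          "type": "text",
          "lines": [
            {"spans": [
              {"type": "text", "content": "dependent on the service headway "},
              {"type": "text", "content": "and the reliability of the departure "}
            ]}
          ]
        }
      ]
    }
  ]
}
""")


def iter_span_text(middle):
    """Yield (page_idx, block_type, text) for each block in para_blocks.

    Hypothetical helper: it only handles blocks that carry "lines"
    directly, as in the sample; blocks without "lines" yield "".
    """
    for page in middle.get("pdf_info", []):
        for block in page.get("para_blocks", []):
            parts = []
            for line in block.get("lines", []):
                for span in line.get("spans", []):
                    # Text-like spans carry "content"; image spans carry "img_path".
                    parts.append(span.get("content", ""))
            yield page.get("page_idx"), block.get("type"), "".join(parts)


rows = list(iter_span_text(SAMPLE))
for page_idx, block_type, text in rows:
    print(page_idx, block_type, repr(text))
```

The same loop works on preproc_blocks, since both lists share the block, line, span nesting.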

Content list (content_list.json)

+

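File naming format: {original filename}_content_list.json

+
Description
+

This is a simplified version of middle.json. It stores all readable content blocks as a flat list in reading order and drops the complex layout information, which makes downstream processing easier.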

+
Content types
+ + + + + + + + + + + + + + + + + + + + + + + + + +
| Type | Description |
| --- | --- |
| image | Image |
| table | Table |
| text | Text / heading |
| equation | Interline equation |
+
Text level identifier
+

The text_level field distinguishes the hierarchy level of a text block:

+
    +
  • No text_level field, or text_level: 0: body text
  • +
  • text_level: 1: level-1 heading
  • +
  • text_level: 2: level-2 heading
  • +
  • and so on...
  • +
+
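One way to use text_level is to turn heading blocks into Markdown headings. The sketch below is illustrative only: to_markdown_line is a hypothetical helper, and treating a missing text_level the same as 0 is an assumption.

```python
def to_markdown_line(block):
    """Render one text block of content_list.json as a Markdown line.

    Hypothetical helper: a missing text_level is treated like 0 (body text).
    """
    level = block.get("text_level", 0)
    text = block.get("text", "").strip()
    if level and level > 0:
        # text_level 1 -> "#", text_level 2 -> "##", and so on.
        return "#" * level + " " + text
    return text


blocks = [
    {"type": "text", "text": "The response of flow duration curves to afforestation ", "text_level": 1},
    {"type": "text", "text": "Body paragraph without a level."},
    {"type": "text", "text": "Methods", "text_level": 2},
]
lines = [to_markdown_line(b) for b in blocks]
print("\n".join(lines))
```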
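Common fields
+
    +
  • Every content block has a page_idx field giving the page it appears on (0-based).
  • +
  • Every content block has a bbox field giving its bounding box [x0, y0, x1, y1], mapped into the 0-1000 range.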
  • +
+
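Because bbox values are normalized to 0-1000, they have to be scaled before they can be drawn on a page. The sketch below assumes a linear mapping per axis (x relative to the page width, y relative to the page height), which the text does not state explicitly. The function name and the 612 x 792 page size are illustrative; content_list.json itself does not store the page size.

```python
def bbox_to_page_units(bbox, page_width, page_height, scale=1000):
    """Map a [x0, y0, x1, y1] bbox from the 0-1000 range to page units.

    Assumes a linear mapping per axis: x is relative to the page width and
    y to the page height. This is an assumption, not confirmed by the text.
    """
    x0, y0, x1, y1 = bbox
    return [
        x0 / scale * page_width,
        y0 / scale * page_height,
        x1 / scale * page_width,
        y1 / scale * page_height,
    ]


# Illustrative page size (the page_size shown in the pipeline middle.json sample).
result = bbox_to_page_units([62, 480, 946, 904], 612.0, 792.0)
print([round(v, 2) for v in result])
```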
Sample data
+
[
+        {
+        "type": "text",
+        "text": "The response of flow duration curves to afforestation ",
+        "text_level": 1, 
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],
+        "page_idx": 0
+    },
+    {
+        "type": "image",
+        "img_path": "images/a8ecda1c69b27e4f79fce1589175a9d721cbdc1cf78b4cc06a015f3746f6b9d8.jpg",
+        "image_caption": [
+            "Fig. 1. Annual flow duration curves of daily flows from Pine Creek, Australia, 1989–2000. "
+        ],
+        "image_footnote": [],
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],
+        "page_idx": 1
+    },
+    {
+        "type": "equation",
+        "img_path": "images/181ea56ef185060d04bf4e274685f3e072e922e7b839f093d482c29bf89b71e8.jpg",
+        "text": "$$\nQ _ { \\% } = f ( P ) + g ( T )\n$$",
+        "text_format": "latex",
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],
+        "page_idx": 2
+    },
+    {
+        "type": "table",
+        "img_path": "images/e3cb413394a475e555807ffdad913435940ec637873d673ee1b039e3bc3496d0.jpg",
+        "table_caption": [
+            "Table 2 Significance of the rainfall and time terms "
+        ],
+        "table_footnote": [
+            "indicates that the rainfall term was significant at the $5 \\%$ level, $T$ indicates that the time term was significant at the $5 \\%$ level, \\* represents significance at the $10 \\%$ level, and na denotes too few data points for meaningful analysis. "
+        ],
+        "table_body": "<html><body><table><tr><td rowspan=\"2\">Site</td><td colspan=\"10\">Percentile</td></tr><tr><td>10</td><td>20</td><td>30</td><td>40</td><td>50</td><td>60</td><td>70</td><td>80</td><td>90</td><td>100</td></tr><tr><td>Traralgon Ck</td><td>P</td><td>P,*</td><td>P</td><td>P</td><td>P,</td><td>P,</td><td>P,</td><td>P,</td><td>P</td><td>P</td></tr><tr><td>Redhill</td><td>P,T</td><td>P,T</td><td>,*</td><td>**</td><td>P.T</td><td>P,*</td><td>P*</td><td>P*</td><td>*</td><td>,*</td></tr><tr><td>Pine Ck</td><td></td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td><td>T</td><td>na</td><td>na</td></tr><tr><td>Stewarts Ck 5</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P.T</td><td>P.T</td><td>P,T</td><td>na</td><td>na</td><td>na</td></tr><tr><td>Glendhu 2</td><td>P</td><td>P,T</td><td>P,*</td><td>P,T</td><td>P.T</td><td>P,ns</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td></tr><tr><td>Cathedral Peak 2</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Cathedral Peak 3</td><td>P.T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td></tr><tr><td>Lambrechtsbos A</td><td>P,T</td><td>P</td><td>P</td><td>P,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>*,T</td><td>T</td></tr><tr><td>Lambrechtsbos B</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>P,T</td><td>T</td><td>T</td></tr><tr><td>Biesievlei</td><td>P,T</td><td>P.T</td><td>P,T</td><td>P,T</td><td>*,T</td><td>*,T</td><td>T</td><td>T</td><td>P,T</td><td>P,T</td></tr></table></body></html>",
+        "bbox": [
+            62,
+            480,
+            946,
+            904
+        ],  
+        "page_idx": 5
+    }
+]
+
+

VLM backend output

+

Model inference results (model.json)

+

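File naming format: {original filename}_model.json

+
File format
+
    +
  • This file holds the raw output of the VLM model as a two-level nested list: the outer list represents pages, and each inner list holds the content blocks of that page
  • +
  • Each content block is a dict with the fields type, bbox, angle, and content
  • +
+
Supported content types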
+
{
+    "text": "Text",
+    "title": "Title",
+    "equation": "Interline equation",
+    "image": "Image",
+    "image_caption": "Image caption",
+    "image_footnote": "Image footnote",
+    "table": "Table",
+    "table_caption": "Table caption",
+    "table_footnote": "Table footnote",
+    "phonetic": "Phonetic annotation (pinyin)",
+    "code": "Code block",
+    "code_caption": "Code caption",
+    "ref_text": "Reference",
+    "algorithm": "Algorithm block",
+    "list": "List",
+    "header": "Page header",
+    "footer": "Page footer",
+    "page_number": "Page number",
+    "aside_text": "Marginal text along the binding edge",
+    "page_footnote": "Page footnote"
+}
+
+
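Coordinate system
+

bbox coordinate format: [x0, y0, x1, y1]

+
    +
  • The values are the coordinates of the top-left and bottom-right corners
  • +
  • The origin is the top-left corner of the page
  • +
  • Coordinates are fractions of the original page size, in the range 0-1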
  • +
+
Sample data
+
[
+    [
+        {
+            "type": "header",
+            "bbox": [
+                0.077,
+                0.095,
+                0.18,
+                0.181
+            ],
+            "angle": 0,
+            "score": null,
+            "block_tags": null,
+            "content": "ELSEVIER",
+            "format": null,
+            "content_tags": null
+        },
+        {
+            "type": "title",
+            "bbox": [
+                0.157,
+                0.228,
+                0.833,
+                0.253
+            ],
+            "angle": 0,
+            "score": null,
+            "block_tags": null,
+            "content": "The response of flow duration curves to afforestation",
+            "format": null,
+            "content_tags": null
+        }
+    ]
+]
+
+
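A minimal sketch of reading this nested list: it walks pages and blocks and converts the fractional bbox into pixel coordinates. The width and height arguments are caller-supplied assumptions (model.json stores only fractions), and blocks_in_pixels is a hypothetical helper name.

```python
import json

# Two blocks copied from the sample above (one page).
MODEL_OUTPUT = json.loads("""
[
  [
    {"type": "header", "bbox": [0.077, 0.095, 0.18, 0.181], "angle": 0, "content": "ELSEVIER"},
    {"type": "title", "bbox": [0.157, 0.228, 0.833, 0.253], "angle": 0,
     "content": "The response of flow duration curves to afforestation"}
  ]
]
""")


def blocks_in_pixels(model_output, width, height):
    """Return (page_index, type, pixel_bbox, content) for every block.

    width and height are the pixel size of the rendered page image; they are
    caller-supplied assumptions, since model.json stores only fractions.
    """
    rows = []
    for page_index, page_blocks in enumerate(model_output):  # outer list: pages
        for block in page_blocks:                            # inner list: blocks
            x0, y0, x1, y1 = block["bbox"]
            pixel_bbox = (
                round(x0 * width), round(y0 * height),
                round(x1 * width), round(y1 * height),
            )
            rows.append((page_index, block["type"], pixel_bbox, block.get("content")))
    return rows


rows = blocks_in_pixels(MODEL_OUTPUT, width=1000, height=2000)
for row in rows:
    print(row)
```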

Intermediate processing results (middle.json)

+

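File naming format: {original filename}_middle.json

+
File format
+

The middle.json produced by the vlm backend has a structure similar to that of the pipeline backend, with the following differences: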

+
    +
  • +

    list becomes a second-level block, with an added sub_type field that distinguishes the list type:

    +
      +
    • text (plain text list)
    • +
    • ref_text (reference list)
    • +
    +
  • +
  • +

    A code block type is added, with two possible "sub_type" values:

    +
      +
    • The two values are code and algorithm
    • +
    • A code block contains at least a code_body, and optionally a code_caption
    • +
    +
  • +
  • +

    The following type values are added for elements of discarded_blocks:

    +
      +
    • header (page header)
    • +
    • footer (page footer)
    • +
    • page_number (page number)
    • +
    • aside_text (marginal text along the binding edge)
    • +
    • page_footnote (page footnote)
    • +
    +
  • +
  • Every block gains an angle field giving its rotation angle: 0, 90, 180, or 270
  • +
+
Sample data
+
    +
  • list block example +
    {
    +    "bbox": [
    +        174,
    +        155,
    +        818,
    +        333
    +    ],
    +    "type": "list",
    +    "angle": 0,
    +    "index": 11,
    +    "blocks": [
    +        {
    +            "bbox": [
    +                174,
    +                157,
    +                311,
    +                175
    +            ],
    +            "type": "text",
    +            "angle": 0,
    +            "lines": [
    +                {
    +                    "bbox": [
    +                        174,
    +                        157,
    +                        311,
    +                        175
    +                    ],
    +                    "spans": [
    +                        {
    +                            "bbox": [
    +                                174,
    +                                157,
    +                                311,
    +                                175
    +                            ],
    +                            "type": "text",
    +                            "content": "H.1 Introduction"
    +                        }
    +                    ]
    +                }
    +            ],
    +            "index": 3
    +        },
    +        {
    +            "bbox": [
    +                175,
    +                182,
    +                464,
    +                229
    +            ],
    +            "type": "text",
    +            "angle": 0,
    +            "lines": [
    +                {
    +                    "bbox": [
    +                        175,
    +                        182,
    +                        464,
    +                        229
    +                    ],
    +                    "spans": [
    +                        {
    +                            "bbox": [
    +                                175,
    +                                182,
    +                                464,
    +                                229
    +                            ],
    +                            "type": "text",
    +                            "content": "H.2 Example: Divide by Zero without Exception Handling"
    +                        }
    +                    ]
    +                }
    +            ],
    +            "index": 4
    +        }
    +    ],
    +    "sub_type": "text"
    +}
    +
  • +
  • code block example +
    {
    +    "type": "code",
    +    "bbox": [
    +        114,
    +        780,
    +        885,
    +        1231
    +    ],
    +    "blocks": [
    +        {
    +            "bbox": [
    +                114,
    +                780,
    +                885,
    +                1231
    +            ],
    +            "lines": [
    +                {
    +                    "bbox": [
    +                        114,
    +                        780,
    +                        885,
    +                        1231
    +                    ],
    +                    "spans": [
    +                        {
    +                            "bbox": [
    +                                114,
    +                                780,
    +                                885,
    +                                1231
    +                            ],
    +                            "type": "text",
    +                            "content": "1 // Fig. H.1: DivideByZeroNoExceptionHandling.java  \n2 // Integer division without exception handling.  \n3 import java.util.Scanner;  \n4  \n5 public class DivideByZeroNoExceptionHandling  \n6 {  \n7 // demonstrates throwing an exception when a divide-by-zero occurs  \n8 public static int quotient( int numerator, int denominator )  \n9 {  \n10 return numerator / denominator; // possible division by zero  \n11 } // end method quotient  \n12  \n13 public static void main(String[] args)  \n14 {  \n15 Scanner scanner = new Scanner(System.in); // scanner for input  \n16  \n17 System.out.print(\"Please enter an integer numerator: \");  \n18 int numerator = scanner.nextInt();  \n19 System.out.print(\"Please enter an integer denominator: \");  \n20 int denominator = scanner.nextInt();  \n21"
    +                        }
    +                    ]
    +                }
    +            ],
    +            "index": 17,
    +            "angle": 0,
    +            "type": "code_body"
    +        },
    +        {
    +            "bbox": [
    +                867,
    +                160,
    +                1280,
    +                189
    +            ],
    +            "lines": [
    +                {
    +                    "bbox": [
    +                        867,
    +                        160,
    +                        1280,
    +                        189
    +                    ],
    +                    "spans": [
    +                        {
    +                            "bbox": [
    +                                867,
    +                                160,
    +                                1280,
    +                                189
    +                            ],
    +                            "type": "text",
    +                            "content": "Algorithm 1 Modules for MCTSteg"
    +                        }
    +                    ]
    +                }
    +            ],
    +            "index": 19,
    +            "angle": 0,
    +            "type": "code_caption"
    +        }
    +    ],
    +    "index": 17,
    +    "sub_type": "code"
    +}
    +
  • +
+
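Since list and code are second-level blocks, their text sits one level deeper than in a plain text block. The following sketch, based only on the two examples above, descends into "blocks" when present and otherwise reads lines, spans, content. block_text is a hypothetical helper name.

```python
def block_text(block):
    """Return the text of each leaf block under `block`, in order.

    Hypothetical helper based only on the two examples above: second-level
    blocks (list, code) keep their children under "blocks", and leaf
    blocks keep their text under lines -> spans -> content.
    """
    if "blocks" in block:
        return [text for child in block["blocks"] for text in block_text(child)]
    parts = [
        span.get("content", "")
        for line in block.get("lines", [])
        for span in line.get("spans", [])
    ]
    return ["".join(parts)]


list_block = {
    "type": "list",
    "sub_type": "text",
    "blocks": [
        {"type": "text",
         "lines": [{"spans": [{"type": "text", "content": "H.1 Introduction"}]}]},
        {"type": "text",
         "lines": [{"spans": [{"type": "text",
                               "content": "H.2 Example: Divide by Zero without Exception Handling"}]}]},
    ],
}
items = block_text(list_block)
print(items)
```

For a code block the same call returns one string per child, so code_body and code_caption can be told apart by reading each child's type before flattening.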

Content list (content_list.json)

+

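File naming format: {original filename}_content_list.json

+
File format
+

The content_list.json produced by the vlm backend has a structure similar to that of the pipeline backend. To follow the middle.json changes described above, it was adjusted as follows: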

+
    +
  • +

    A code type is added, with two possible "sub_type" values:

    +
      +
    • The two values are code and algorithm
    • +
    • A code entry contains at least a code_body, and optionally a code_caption
    • +
    +
  • +
  • +

    A list type is added, with two possible "sub_type" values:

    +
      +
    • text
    • +
    • ref_text
    • +
    +
  • +
  • +

    The content of all discarded_blocks is now included in the output:

    +
      +
    • header
    • +
    • footer
    • +
    • page_number
    • +
    • aside_text
    • +
    • page_footnote
    • +
    +
  • +
+
Sample data
+
    +
  • code type content +
    {
    +    "type": "code",
    +    "sub_type": "algorithm",
    +    "code_caption": [
    +        "Algorithm 1 Modules for MCTSteg"
    +    ],
    +    "code_body": "1: function GETCOORDINATE(d)  \n2:  $x \\gets d / l$ ,  $y \\gets d$  mod  $l$   \n3: return  $(x, y)$   \n4: end function  \n5: function BESTCHILD(v)  \n6:  $C \\gets$  child set of  $v$   \n7:  $v' \\gets \\arg \\max_{c \\in C} \\mathrm{UCTScore}(c)$   \n8:  $v'.n \\gets v'.n + 1$   \n9: return  $v'$   \n10: end function  \n11: function BACK PROPAGATE(v)  \n12: Calculate  $R$  using Equation 11  \n13: while  $v$  is not a root node do  \n14:  $v.r \\gets v.r + R$ ,  $v \\gets v.p$   \n15: end while  \n16: end function  \n17: function RANDOMSEARCH(v)  \n18: while  $v$  is not a leaf node do  \n19: Randomly select an untried action  $a \\in A(v)$   \n20: Create a new node  $v'$   \n21:  $(x, y) \\gets \\mathrm{GETCOORDINATE}(v'.d)$   \n22:  $v'.p \\gets v$ ,  $v'.d \\gets v.d + 1$ ,  $v'.\\Gamma \\gets v.\\Gamma$   \n23:  $v'.\\gamma_{x,y} \\gets a$   \n24: if  $a = -1$  then  \n25:  $v.lc \\gets v'$   \n26: else if  $a = 0$  then  \n27:  $v.mc \\gets v'$   \n28: else  \n29:  $v.rc \\gets v'$   \n30: end if  \n31:  $v \\gets v'$   \n32: end while  \n33: return  $v$   \n34: end function  \n35: function SEARCH(v)  \n36: while  $v$  is fully expanded do  \n37:  $v \\gets$  BESTCHILD(v)  \n38: end while  \n39: if  $v$  is not a leaf node then  \n40:  $v \\gets$  RANDOMSEARCH(v)  \n41: end if  \n42: return  $v$   \n43: end function",
    +    "bbox": [
    +        510,
    +        87,
    +        881,
    +        740
    +    ],
    +    "page_idx": 0
    +}
    +
  • +
  • list type content +
    {
    +    "type": "list",
    +    "sub_type": "text",
    +    "list_items": [
    +        "H.1 Introduction",
    +        "H.2 Example: Divide by Zero without Exception Handling",
    +        "H.3 Example: Divide by Zero with Exception Handling",
    +        "H.4 Summary"
    +    ],
    +    "bbox": [
    +        174,
    +        155,
    +        818,
    +        333
    +    ],
    +    "page_idx": 0
    +}
    +
  • +
  • discarded type content +
    [{
    +    "type": "header",
    +    "text": "Journal of Hydrology 310 (2005) 253-265",
    +    "bbox": [
    +        363,
    +        164,
    +        623,
    +        177
    +    ],
    +    "page_idx": 0
    +},
    +{
    +    "type": "page_footnote",
    +    "text": "* Corresponding author. Address: Forest Science Centre, Department of Sustainability and Environment, P.O. Box 137, Heidelberg, Vic. 3084, Australia. Tel.: +61 3 9450 8719; fax: +61 3 9450 8644.",
    +    "bbox": [
    +        71,
    +        815,
    +        915,
    +        841
    +    ],
    +    "page_idx": 0
    +}]
    +
  • +
+
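A short sketch of consuming these entries: it skips the discarded types listed above and renders list and code entries as plain text. The field names come from the samples; render_entry and the decision to drop discarded entries are illustrative choices, not MinerU behavior.

```python
DISCARDED_TYPES = {"header", "footer", "page_number", "aside_text", "page_footnote"}


def render_entry(entry):
    """Return a plain-text rendering of one content_list entry, or None to skip it."""
    kind = entry.get("type")
    if kind in DISCARDED_TYPES:
        return None  # page furniture, dropped in this sketch
    if kind == "list":
        return "\n".join("- " + item for item in entry.get("list_items", []))
    if kind == "code":
        caption = " ".join(entry.get("code_caption", []))
        body = entry.get("code_body", "")
        return (caption + "\n" + body) if caption else body
    return entry.get("text", "")


content_list = [
    {"type": "header", "text": "Journal of Hydrology 310 (2005) 253-265", "page_idx": 0},
    {"type": "list", "sub_type": "text",
     "list_items": ["H.1 Introduction", "H.4 Summary"], "page_idx": 0},
    {"type": "code", "sub_type": "algorithm",
     "code_caption": ["Algorithm 1 Modules for MCTSteg"],
     "code_body": "1: function GETCOORDINATE(d)", "page_idx": 0},
]
rendered = [text for text in map(render_entry, content_list) if text is not None]
print("\n\n".join(rendered))
```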

Summary

+

The files above make up MinerU's complete output. Choose the file that fits your downstream task:

+
    +
  • +

    Model output (use the raw output):

    +
      +
    • model.json
    • +
    +
  • +
  • +

    Debugging and verification (use the visualization files):

    +
      +
    • layout.pdf
    • +
    • span.pdf
    • +
    +
  • +
  • +

    Content extraction (use the simplified files):

    +
      +
    • *.md
    • +
    • content_list.json
    • +
    +
  • +
  • +

    Secondary development (use the structured file):

    +
      +
    • middle.json
    • +
    +
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/AMD/index.html b/zh/usage/acceleration_cards/AMD/index.html new file mode 100644 index 00000000..ab0e0c80 --- /dev/null +++ b/zh/usage/acceleration_cards/AMD/index.html @@ -0,0 +1,1969 @@ + + + + + + + + + + + + + + + + + + + + + AMD - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

AMD

+ +

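This page describes Triton-based optimizations for the different backends on ROCm. They get inference with the vllm backend working properly, and also cover DocLayout-YOLO, which the pipeline backend uses for its first step (layout).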

+

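If you already have a complete Python environment with vllm and mineru, skip straight to step 5.
+The same approach can serve as a reference for execution problems on other GPUs: profile first to locate the problematic operator, then reimplement that operator with Triton.
+In my tests the results were roughly on par with the official MinerU site. Not many people run on AMD, so I am sharing this here in the comments.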

+

1. Results

+

As an additional speed test, a 200-page PDF of a Python programming book reaches 1.99 it/s:
+Two Step Extraction: 100%|████████████████████████████████████████| 200/200 [01:40<00:00, 1.99it/s]

+

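Below are the earlier results on a 14-page academic paper:
+On a 7900xtx, mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-vllm-engine true runs at roughly 1.6-1.8 s/it. This was not measured carefully; I only tried two documents. The second approach, which replaces the original dot products with a matrix multiplication, improves this further to 1.3 s/it. After optimization, most of the operator time is spent in hipblast (which cannot be improved further) and in the vllm triton backend, at roughly 25% each. The vllm triton backend will have to wait for upstream optimization.
+The doclayout-yolo layout speed rises from 1.6 it/s to 15 it/s. Note that the input PDF size has to be cached first; triton must cache per shape, so there is no way around it. The main goal was to keep the model's input and output interface unchanged with minimal code changes.
+The example below uses the -b vlm-vllm-engine mode.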

+
+

Results with the optimization that replaces the original dot products with a matrix multiplication over the 5D tensor:
+2025-10-05 15:45:12.985 | INFO | mineru.backend.vlm.vlm_analyze:get_model:128 - get vllm-engine predictor cost: 18.45s
+Adding requests: 100%|████████████████████████████████████████████████████████████████████████████████| 14/14 [00:01<00:00, 12.20it/s]
+Processed prompts: 100%|█████████████████████| 14/14 [00:08<00:00, 1.56it/s, est. speed input: 2174.18 toks/s, output: 791.87 toks/s]
+Adding requests: 100%|█████████████████████████████████████████████████████████████████████████████| 278/278 [00:00<00:00, 323.03it/s]
+Processed prompts: 100%|██████████████████| 278/278 [00:07<00:00, 37.63it/s, est. speed input: 5264.66 toks/s, output: 2733.31 toks/s]

+

Test with mineru-gradio --server-name 0.0.0.0 --server-port 7860 --enable-vllm-engine true:
+2025-10-05 15:46:55.953 | WARNING | mineru.cli.common:convert_pdf_bytes_to_bytes_by_pypdfium2:54 - end_page_id is out of range, use pdf_docs length
+Two Step Extraction: 100%|████████████████████████████████████████████████████████████████████████████| 14/14 [00:18<00:00, 1.30s/it]

+
+

2. Root cause

+

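On AMD RDNA, the vllm backend has a severe performance problem. One operator in vllm's qwen2_vl.py has no matching ROCm kernel, so the convolution falls back to a very slow path: a single execution took 12 s. Specifically, the MIOpen library lacks an optimized kernel for the particular Conv3d (bfloat16) used by the model.
+The dilated convolution in DocLayout-YOLO's g2l_crm.py has the same problem, and even the professional CDNA MI210 does not solve it.
+Both are handled here together.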

+
+

3. Environment

+

System: Ubuntu 24.04.3 Kernel: Linux 6.14.0-33-generic ROCm version: 7.0.1
+Python environment:
+python 3.12
+pytorch-triton-rocm 3.5.0+gitbbb06c03
+torch 2.10.0.dev20251001+rocm7.0
+torchvision 0.25.0.dev20251003+rocm7.0
+vllm 0.11.0rc2.dev198+g736fbf4c8.rocm701
+The exact versions do not matter; the procedure is the same.

+
+

4. Installing the prerequisites

+

uv venv --python python3.12
+source .venv/bin/activate
+uv pip install --pre torch torchvision   -i https://pypi.tuna.tsinghua.edu.cn/simple/   --extra-index-url https://download.pytorch.org/whl/nightly/rocm7.0
+uv pip install pip
+# Use pip here instead of uv pip, to avoid overwriting the local pytorch install
+pip install -U "mineru[core]" -i https://pypi.mirrors.ustc.edu.cn/simple/
+
+To install vllm, follow the official manual: Vllm +
# Install aiter, vllm, amd-smi, etc. manually: pick a location to clone into, then cd into that directory
+git clone --recursive https://github.com/ROCm/aiter.git
+cd aiter
+git submodule sync; git submodule update --init --recursive
+python setup.py develop
+cd ..
+git clone https://github.com/vllm-project/vllm.git
+cd vllm/
+cp -r /opt/rocm/share/amd_smi .   # copy into the current vllm/ checkout (the original used the author's own path ~/Pytorch/vllm/)
+pip install amd_smi/
+pip install --upgrade numba \
+    scipy \
+    huggingface-hub[cli,hf_transfer] \
+    setuptools_scm
+pip install -r requirements/rocm.txt
+export PYTORCH_ROCM_ARCH="gfx1100"   # set to your own GPU architecture; find it with: rocminfo | grep gfx
+python setup.py develop
+
+
+

5. Adding the key Triton operator to vllm

+

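Two solutions are given here. Scheme 1 is the optimization mentioned above, reaching 1.5 to 1.8 s/it. Scheme 2 hand-optimizes the operator into a matrix multiplication; it definitely works on the 7900xtx, at about 1.3 s/it. On other AMD GPUs it is also faster than Scheme 1, but it is not necessarily the fastest possible implementation, and the hand-tuned parameters may need adjusting.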

+

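Note: I uninstalled the triton-backend flash_attn with pip. After a long round of attempts it still raised errors; the problems are serious enough that it is easiest simply not to use it +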

# Locate your vllm install path (referred to as XXX below)
+pip show vllm
+
+Key changes
+In XXX/vllm/model_executor/models/qwen2_vl.py:
+1. Below line 33 of qwen2_vl.py, add from .qwen2_vl_vision_kernels import triton_conv3d_patchify +
from collections.abc import Iterable, Mapping, Sequence
+from functools import partial
+from typing import Annotated, Any, Callable, Literal, Optional, Union
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from .qwen2_vl_vision_kernels import triton_conv3d_patchify
+
+The remaining steps split into Scheme 1 (2.1 and 3.1) and Scheme 2 (2.2 and 3.2); implement only one of them +
+

Scheme 1
+2.1 At line 498 of qwen2_vl.py, class Qwen2VisionPatchEmbed(nn.Module). P.S. this is the operator for which AMD has no ready-made kernel, which causes the fallback +

class Qwen2VisionPatchEmbed(nn.Module):
+
+    def __init__(
+        self,
+        patch_size: int = 14,
+        temporal_patch_size: int = 2,
+        in_channels: int = 3,
+        embed_dim: int = 1152,
+    ) -> None:
+        super().__init__()
+        self.patch_size = patch_size
+        self.temporal_patch_size = temporal_patch_size
+        self.embed_dim = embed_dim
+
+        kernel_size = (temporal_patch_size, patch_size, patch_size)
+        self.proj = nn.Conv3d(in_channels,
+                              embed_dim,
+                              kernel_size=kernel_size,
+                              stride=kernel_size,
+                              bias=False)
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        L, C = x.shape
+        x_reshaped = x.view(L, -1, self.temporal_patch_size, self.patch_size,
+                            self.patch_size)
+
+        # Call your custom Triton kernel instead of self.proj
+        x_out = triton_conv3d_patchify(x_reshaped, self.proj.weight)
+
+        # The output of our kernel is already the correct shape [L, embed_dim]
+        return x_out
+
+3.1 In the XXX/vllm/model_executor/models/ directory, create the file qwen2_vl_vision_kernels.py with the Triton implementation +
import torch
+from vllm.triton_utils import tl, triton
+
+@triton.jit
+def _conv3d_patchify_kernel(
+    # Pointers to tensors
+    X, W, Y,
+    # Tensor dimensions
+    N, C_in, D_in, H_in, W_in,
+    C_out, KD, KH, KW,
+    # Strides for memory access (this kernel uses no padding)
+    stride_xn, stride_xc, stride_xd, stride_xh, stride_xw,
+    stride_wn, stride_wc, stride_wd, stride_wh, stride_ww,
+    stride_yn, stride_yc,
+    # Triton-specific metaparameters
+    BLOCK_SIZE: tl.constexpr,
+):
+    """
+    Triton kernel for a non-overlapping 3D patching convolution.
+    Each kernel instance computes one output value for one patch.
+    """
+    # Get the program IDs for the N (patch) and C_out (output channel) dimensions
+    pid_n = tl.program_id(0)  # The index of the patch we are processing
+    pid_cout = tl.program_id(1) # The index of the output channel we are computing
+
+    # --- Calculate memory pointers ---
+    # Pointer to the start of the current input patch
+    x_ptr = X + (pid_n * stride_xn)
+    # Pointer to the start of the current filter (weight)
+    w_ptr = W + (pid_cout * stride_wn)
+    # Pointer to where the output will be stored
+    y_ptr = Y + (pid_n * stride_yn + pid_cout * stride_yc)
+
+    # --- Perform the convolution (element-wise product and sum) ---
+    # This is a dot product between the flattened patch and the flattened filter.
+    accumulator = tl.zeros((BLOCK_SIZE,), dtype=tl.float32)
+
+    # Iterate over the elements of the patch/filter
+    for c_offset in range(0, C_in):
+        for d_offset in range(0, KD):
+            for h_offset in range(0, KH):
+                # Process the innermost dimension (width) in vector chunks of BLOCK_SIZE
+                for w_offset in range(0, KW, BLOCK_SIZE):
+                    # Create masks to handle cases where KW is not a multiple of BLOCK_SIZE
+                    w_range = w_offset + tl.arange(0, BLOCK_SIZE)
+                    w_mask = w_range < KW
+
+                    # Calculate offsets to load data
+                    patch_offset = (c_offset * stride_xc + d_offset * stride_xd +
+                                    h_offset * stride_xh + w_range * stride_xw)
+                    filter_offset = (c_offset * stride_wc + d_offset * stride_wd +
+                                     h_offset * stride_wh + w_range * stride_ww)
+
+                    # Load patch and filter data, applying masks
+                    patch_vals = tl.load(x_ptr + patch_offset, mask=w_mask, other=0.0)
+                    filter_vals = tl.load(w_ptr + filter_offset, mask=w_mask, other=0.0)
+
+                    # Multiply and accumulate
+                    accumulator += patch_vals.to(tl.float32) * filter_vals.to(tl.float32)
+
+    # Sum the accumulator block and store the single output value
+    output_val = tl.sum(accumulator, axis=0)
+    tl.store(y_ptr, output_val)
+
+
+def triton_conv3d_patchify(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
+    """
+    Python wrapper for the 3D patching convolution Triton kernel.
+    """
+    # Get tensor dimensions
+    N, C_in, D_in, H_in, W_in = x.shape
+    C_out, _, KD, KH, KW = weight.shape
+
+    # Create the output tensor
+    # The output of this specific conv is (N, C_out, 1, 1, 1), which we squeeze
+    Y = torch.empty((N, C_out), dtype=x.dtype, device=x.device)
+
+    # Define the grid for launching the Triton kernel
+    # Each kernel instance handles one patch (N) for one output channel (C_out)
+    grid = (N, C_out)
+
+    # Launch the kernel
+    # We pass all strides to make the kernel flexible
+    _conv3d_patchify_kernel[grid](
+        x, weight, Y,
+        N, C_in, D_in, H_in, W_in,
+        C_out, KD, KH, KW,
+        x.stride(0), x.stride(1), x.stride(2), x.stride(3), x.stride(4),
+        weight.stride(0), weight.stride(1), weight.stride(2), weight.stride(3), weight.stride(4),
+        Y.stride(0), Y.stride(1),
+        BLOCK_SIZE=16, # A reasonable default, can be tuned
+    )
+
+    return Y
+
+
+

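Scheme 2
+2.2 At line 498 of qwen2_vl.py, class Qwen2VisionPatchEmbed(nn.Module). P.S. this is the operator for which AMD has no ready-made kernel, which causes the fallback. Here the 5D tensor is handled in one step by rewriting the convolution as a matrix multiplication +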

class Qwen2VisionPatchEmbed(nn.Module):
+
+    def __init__(
+        self,
+        patch_size: int = 14,
+        temporal_patch_size: int = 2,
+        in_channels: int = 3,
+        embed_dim: int = 1152,
+    ) -> None:
+        super().__init__()
+        self.patch_size = patch_size
+        self.temporal_patch_size = temporal_patch_size
+        self.embed_dim = embed_dim
+
+        kernel_size = (temporal_patch_size, patch_size, patch_size)
+
+        self.proj = nn.Conv3d(in_channels,
+                              embed_dim,
+                              kernel_size=kernel_size,
+                              stride=kernel_size,
+                              bias=False)
+
+    def forward(self, x: torch.Tensor) -> torch.Tensor:
+        L, C = x.shape
+        x_reshaped_5d = x.view(L, -1, self.temporal_patch_size, self.patch_size,
+                               self.patch_size)
+
+        return triton_conv3d_patchify(x_reshaped_5d, self.proj.weight)
+
+3.2 In the XXX/vllm/model_executor/models/ directory, create the file qwen2_vl_vision_kernels.py with the Triton implementation +
import torch
+from vllm.triton_utils import tl, triton
+
+@triton.jit
+def _conv_gemm_kernel(
+    A, B, C, M, N, K,
+    stride_am, stride_ak,
+    stride_bk, stride_bn,
+    stride_cm, stride_cn,
+    BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr,
+):
+    pid_m = tl.program_id(0)
+    pid_n = tl.program_id(1)
+    offs_m = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
+    offs_n = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
+    offs_k = tl.arange(0, BLOCK_K)
+    a_ptrs = A + (offs_m[:, None] * stride_am + offs_k[None, :] * stride_ak)
+    b_ptrs = B + (offs_k[:, None] * stride_bk + offs_n[None, :] * stride_bn)
+    accumulator = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.float32)
+    for k in range(0, K, BLOCK_K):
+        a = tl.load(a_ptrs, mask=(offs_m[:, None] < M) & (offs_k[None, :] < K), other=0.0)
+        b = tl.load(b_ptrs, mask=(offs_k[:, None] < K) & (offs_n[None, :] < N), other=0.0)
+        accumulator += tl.dot(a, b)
+        a_ptrs += BLOCK_K * stride_ak
+        b_ptrs += BLOCK_K * stride_bk
+        offs_k += BLOCK_K
+    c = accumulator.to(C.dtype.element_ty)
+    offs_cm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)
+    offs_cn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)
+    c_ptrs = C + stride_cm * offs_cm[:, None] + stride_cn * offs_cn[None, :]
+    c_mask = (offs_cm[:, None] < M) & (offs_cn[None, :] < N)
+    tl.store(c_ptrs, c, mask=c_mask)
+
+def triton_conv3d_patchify(x_5d: torch.Tensor, weight_5d: torch.Tensor) -> torch.Tensor:
+    N_patches, _, _, _, _ = x_5d.shape
+    C_out, _, _, _, _ = weight_5d.shape
+    A = x_5d.view(N_patches, -1)
+    B = weight_5d.view(C_out, -1).transpose(0, 1).contiguous()
+    M, K = A.shape
+    _K, N = B.shape
+    assert K == _K
+    C = torch.empty((M, N), device=A.device, dtype=A.dtype)
+
+    # --- Hand-tuned configuration for the 7900xtx. Other GPUs may need their own best combination; autotune brought no benefit on AMD ---
+    best_config = {
+        'BLOCK_M': 128,
+        'BLOCK_N': 128,
+        'BLOCK_K': 32,
+    }
+    num_stages = 4
+    num_warps = 8
+
+    grid = (triton.cdiv(M, best_config['BLOCK_M']),
+            triton.cdiv(N, best_config['BLOCK_N']))
+
+    _conv_gemm_kernel[grid](
+        A, B, C,
+        M, N, K,
+        A.stride(0), A.stride(1),
+        B.stride(0), B.stride(1),
+        C.stride(0), C.stride(1),
+        **best_config,
+        num_stages=num_stages,
+        num_warps=num_warps
+    )
+
+    return C
+
+
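Scheme 2 relies on the fact that a convolution whose stride equals its kernel size, applied to an input exactly one kernel in extent, is one dot product per output channel, so a batch of patches becomes a matrix multiplication of flattened patches with flattened weights. The pure-Python sketch below checks this on toy sizes. All helper names are hypothetical, the sizes are not the model's, and it uses neither Triton nor PyTorch.

```python
import itertools
import random


def conv3d_single_patch(patch, weight):
    """Direct convolution of one C x D x H x W patch with C_out filters of
    the same C x D x H x W extent (kernel == stride == patch size)."""
    C, D, H, W = len(patch), len(patch[0]), len(patch[0][0]), len(patch[0][0][0])
    out = []
    for filt in weight:
        acc = 0.0
        for c, d, h, w in itertools.product(range(C), range(D), range(H), range(W)):
            acc += patch[c][d][h][w] * filt[c][d][h][w]
        out.append(acc)
    return out


def flatten(t):
    """Flatten a nested list in row-major order (like tensor.view(-1))."""
    if isinstance(t, list):
        return [x for sub in t for x in flatten(sub)]
    return [t]


def gemm_patchify(patches, weight):
    """Same computation written as a matrix product A (N x K) @ B (K x C_out)."""
    A = [flatten(p) for p in patches]
    B_rows = [flatten(f) for f in weight]  # C_out x K, i.e. B transposed
    return [[sum(a * b for a, b in zip(row, col)) for col in B_rows] for row in A]


def rand_tensor(shape, rng):
    """Build a nested list of the given shape with values in [-1, 1]."""
    if len(shape) == 1:
        return [rng.uniform(-1, 1) for _ in range(shape[0])]
    return [rand_tensor(shape[1:], rng) for _ in range(shape[0])]


rng = random.Random(0)
C, D, H, W, C_OUT, N = 3, 2, 4, 4, 5, 6  # toy sizes, not the model's
patches = rand_tensor((N, C, D, H, W), rng)
weight = rand_tensor((C_OUT, C, D, H, W), rng)

direct = [conv3d_single_patch(p, weight) for p in patches]
via_gemm = gemm_patchify(patches, weight)
max_diff = max(abs(x - y) for r1, r2 in zip(direct, via_gemm) for x, y in zip(r1, r2))
print("max abs difference:", max_diff)
```

With the default constructor arguments above (in_channels 3, temporal_patch_size 2, patch_size 14), K is 3 x 2 x 14 x 14 = 1176, which is not a multiple of BLOCK_K = 32. That is why the kernel masks its loads along K.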
+

4. After closing the terminal, running mineru-gradio again raises a LoRA error; patch the code to skip it +

pip show mineru_vl_utils
+
+

Open XXX/mineru_vl_utils/vlm_client/vllm_async_engine_client.py and change line 58, self.tokenizer = vllm_async_llm.tokenizer.get_lora_tokenizer(), to: +

        try:
+            self.tokenizer = vllm_async_llm.tokenizer.get_lora_tokenizer()
+        except AttributeError:
+            # If get_lora_tokenizer is not available, use the original tokenizer directly
+            self.tokenizer = vllm_async_llm.tokenizer
+
+

Finally, set these two environment variables and you are ready to go +

export MINERU_MODEL_SOURCE=modelscope
+export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
+
+
+

6. With the vllm backend working, next is the dilated-convolution problem in the doclayout-yolo model used for layout in the pipeline backend

+

I posted an answer under DocLayout-YOLO, so the pipeline's dilated-convolution problem is not repeated here; follow the link for details.

+

Find your doclayout-yolo install location as follows, then edit the files described in the linked reply +

pip show doclayout-yolo
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Ascend/index.html b/zh/usage/acceleration_cards/Ascend/index.html new file mode 100644 index 00000000..71837cdf --- /dev/null +++ b/zh/usage/acceleration_cards/Ascend/index.html @@ -0,0 +1,1805 @@ + + + + + + + + + + + + + + + + + + + + + Ascend - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Ascend

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: CTyunOS 22.06  
+cpu: Kunpeng-920 (aarch64)  
+npu: Ascend 910B2  
+driver: 23.0.3 
+docker: 20.10.12
+
+

2. Environment setup

+
+

Note

+

Ascend accelerator cards support VLM inference acceleration with either vllm or lmdeploy. Install and use whichever fits your needs:

+
+

2.1 Build the image from the Dockerfile (vllm)

+
+

Tip

+

ascend-vllm supports the following devices:

+
    +
  • Atlas A2 training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
  • +
  • Atlas 800I A2 inference series (Atlas 800I A2)
  • +
  • Atlas A3 training series (Atlas 800T A3, Atlas 900 A3 SuperPoD, Atlas 9000 A3 SuperPoD)
  • +
  • Atlas 800I A3 inference series (Atlas 800I A3)
  • +
  • [Experimental] Atlas 300I inference series (Atlas 300I Duo)
  • +
+

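Line 3 of the Dockerfile specifies the ascend-vllm base image. The default tag is the A2 build, for example v0.11.0.

+
    +
  • To use the A3 build, change the tag on line 3 to v0.11.0-a3 before building.
  • +
  • To use the Atlas 300I Duo build, change the tag on line 3 to v0.10.0rc1-310p before building.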
  • +
+
+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/npu.Dockerfile
+docker build --network=host -t mineru:npu-vllm-latest -f npu.Dockerfile .
+
+

2.2 Build the image from the Dockerfile (lmdeploy)

+
+

Tip

+

ascend-lmdeploy supports the following devices:

+
    +
  • Atlas A2 training series (Atlas 800T A2, Atlas 900 A2 PoD, Atlas 200T A2 Box16, Atlas 300T A2)
  • +
  • Atlas 800I A2 inference series (Atlas 800I A2)
  • +
+

If your device belongs to the Atlas A3 series or the Atlas 300I Duo series, use the vllm image instead.

+
+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/npu.Dockerfile
+# Switch the base image from vllm to lmdeploy
+sed -i '3s/^/# /' npu.Dockerfile && sed -i '5s/^# //' npu.Dockerfile
+docker build --network=host -t mineru:npu-lmdeploy-latest -f npu.Dockerfile .
+
+
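To see what the two sed expressions do before touching the real file, they can be tried on a scratch file. The sketch below is only an illustration: the five placeholder lines are hypothetical and do not reproduce npu.Dockerfile; only the line positions (3 and 5) come from the command above. It assumes GNU sed, whose -i option edits in place without a backup suffix.

```shell
# Scratch file with made-up contents; only line positions 3 and 5 matter here.
tmp="$(mktemp)"
printf '%s\n' \
  '# line 1 (placeholder)' \
  '# line 2 (placeholder)' \
  'FROM vllm-base-image-placeholder' \
  '# line 4 (placeholder)' \
  '# FROM lmdeploy-base-image-placeholder' > "$tmp"

# Same expressions as in the build steps: comment out line 3, uncomment line 5.
sed -i '3s/^/# /' "$tmp" && sed -i '5s/^# //' "$tmp"

line3="$(sed -n '3p' "$tmp")"
line5="$(sed -n '5p' "$tmp")"
echo "line 3: $line3"
echo "line 5: $line5"
rm -f "$tmp"
```

After the edit, line 3 carries a leading "# " and line 5 has lost it, which is how the active base image is swapped.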

3. Start the Docker container

+
docker run -u root --name mineru_docker --privileged=true \
+    --ipc=host \
+    --network=host \
+    --device=/dev/davinci0 \
+    --device=/dev/davinci1 \
+    --device=/dev/davinci2 \
+    --device=/dev/davinci3 \
+    --device=/dev/davinci4 \
+    --device=/dev/davinci5 \
+    --device=/dev/davinci6 \
+    --device=/dev/davinci7 \
+    --device=/dev/davinci_manager \
+    --device=/dev/devmm_svm \
+    --device=/dev/hisi_hdc \
+    -v /var/log/npu/:/usr/slog \
+    -v /usr/local/dcmi:/usr/local/dcmi \
+    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
+    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
+    -e VLLM_WORKER_MULTIPROC_METHOD=spawn \
+    -e MINERU_MODEL_SOURCE=local \
+    -e MINERU_LMDEPLOY_DEVICE=ascend \
+    -it mineru:npu-vllm-latest \
+    /bin/bash
+
+
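The command above passes all eight davinci devices into the container. If only some cards should be exposed, the --device flags can be generated from a list of IDs. This is a small sketch and not part of MinerU: the helper name davinci_flags and the chosen IDs are made up for illustration.

```shell
# Hypothetical helper: print one --device flag per card ID given as an argument.
davinci_flags() {
  flags=""
  for id in "$@"; do
    flags="${flags:+$flags }--device=/dev/davinci${id}"
  done
  printf '%s\n' "$flags"
}

# Example: expose cards 0 and 1 only.
DEVICE_FLAGS="$(davinci_flags 0 1)"
echo "$DEVICE_FLAGS"
```

The result can be placed unquoted in the docker run command (so that it splits into separate arguments) in place of the eight fixed --device=/dev/davinciN lines; the davinci_manager, devmm_svm and hisi_hdc devices stay as they are.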
+

Tip

+

Choose the vllm or lmdeploy image as appropriate. To use lmdeploy, replace mineru:npu-vllm-latest in the command above with mineru:npu-lmdeploy-latest.

+
+

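After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.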

+
+

Note

+

Because 310p cards support neither graph mode nor bf16 precision, append --enforce-eager --dtype float16 to any vllm-related command when using them. For example:

mineru-openai-server --port 30000 --enforce-eager --dtype float16
+
+
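If one start script is shared between 310p cards and other cards, the extra flags can be added conditionally. A minimal sketch, assuming a variable named NPU_SERIES that you set yourself (it is not a MinerU or vllm setting); the command is printed rather than executed so the sketch is safe to try anywhere.

```shell
# NPU_SERIES is a hypothetical, user-defined variable, e.g. "310p" or "910b".
NPU_SERIES="${NPU_SERIES:-310p}"

EXTRA_ARGS=""
if [ "$NPU_SERIES" = "310p" ]; then
  # Flags required on 310p cards according to the note above.
  EXTRA_ARGS="--enforce-eager --dtype float16"
fi

# Build the command line; the extra flags are appended only when non-empty.
CMD="mineru-openai-server --port 30000${EXTRA_ARGS:+ $EXTRA_ARGS}"
echo "$CMD"
```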
+

4. Notes

+

The table below shows MinerU's support for Ascend accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
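| Use case | | Container environment: vllm | Container environment: lmdeploy |
| --- | --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 | 🟢 |
| Data parallelism (--data-parallel-size/--dp) | | 🟢 | 🔴 |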
+ +

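Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly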

+
+

Tip

+
    +
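  • Selecting which NPU cards are visible works much as it does on NVIDIA GPUs; see ASCEND_RT_VISIBLE_DEVICES.
  • +
  • On Ascend, run npu-smi info to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.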
  • +
+
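As a small illustration of the tip above, the variable can be exported in the shell before starting MinerU. The card IDs 0 and 1 are only an example; choose idle cards based on the output of npu-smi info.

```shell
# Make only cards 0 and 1 visible to programs started from this shell.
# The IDs are examples, not a recommendation.
export ASCEND_RT_VISIBLE_DEVICES=0,1
echo "ASCEND_RT_VISIBLE_DEVICES=${ASCEND_RT_VISIBLE_DEVICES}"
```

Any MinerU command started afterwards from the same shell inherits this setting.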
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Biren/index.html b/zh/usage/acceleration_cards/Biren/index.html new file mode 100644 index 00000000..f36ef3a4 --- /dev/null +++ b/zh/usage/acceleration_cards/Biren/index.html @@ -0,0 +1,1734 @@ + + + + + + + + + + + + + + + + + + + + + Biren - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Biren

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.4 LTS
+cpu: Intel x86-64
+gpu: Biren 106C
+driver: 1.10.0
+docker: 28.0.4
+
+

2. Environment setup

+

2.1 Download and load the image (vllm)

+
wget http://birentech.com/xxx/MinerU/mineru-vllm.tar  # Contact Biren staff to obtain the actual link (email: MonaLiu@birentech.com)
+docker load -i mineru-vllm.tar
+
+

3. Start the Docker container

+
docker run -it --name mineru_docker \
+    --privileged \
+    --network=host \
+    --shm-size=100G \
+    -e MINERU_MODEL_SOURCE=local \
+    -e MINERU_DEVICE_MODEL=supa \
+    -e SHAPE_TRANSFORM_GRANK=true \
+    mineru:biren-vllm-latest \
+    /bin/bash
+
+

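After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.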

+

4. Notes

+

The table below shows MinerU's support for Biren accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
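| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🔴 |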
+ +

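Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly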

+
+

Tip

+
    +
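  • Selecting which Biren cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices, replacing the environment variable CUDA_VISIBLE_DEVICES with SUPA_VISIBLE_DEVICES.
  • +
  • On Biren, run brsmi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.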
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Cambricon/index.html b/zh/usage/acceleration_cards/Cambricon/index.html new file mode 100644 index 00000000..41eb5aac --- /dev/null +++ b/zh/usage/acceleration_cards/Cambricon/index.html @@ -0,0 +1,1790 @@ + + + + + + + + + + + + + + + + + + + + + Cambricon - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Cambricon

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  
+cpu: Hygon Hygon C86 7490
+mlu: MLU590-M9D
+driver: v6.2.11
+docker: 28.3.0
+
+

2. Environment setup

+
+

Note

+

Cambricon accelerator cards support VLM inference acceleration with either lmdeploy or vllm. Install and use whichever fits your needs:

+
+

2.1 Build the image from the Dockerfile (lmdeploy)

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/mlu.Dockerfile
+docker build --network=host -t mineru:mlu-lmdeploy-latest -f mlu.Dockerfile .
+
+

2.2 Build the image from the Dockerfile (vllm)

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/mlu.Dockerfile
+# Switch the base image from lmdeploy to vllm
+sed -i -e '3,4s/^/# /' -e '6,7s/^# //' mlu.Dockerfile
+docker build --network=host -t mineru:mlu-vllm-latest -f mlu.Dockerfile .
+
+
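The sed call above uses two -e expressions with line ranges: lines 3 to 4 are commented out and lines 6 to 7 are uncommented. A dry run on a scratch file shows the effect. The seven placeholder lines below are hypothetical and do not reproduce mlu.Dockerfile; without -i, sed prints the result instead of editing the file.

```shell
# Scratch file with made-up contents; only the line positions matter.
tmp="$(mktemp)"
printf '%s\n' \
  '# placeholder 1' \
  '# placeholder 2' \
  'lmdeploy-line-a' \
  'lmdeploy-line-b' \
  '# placeholder 5' \
  '# vllm-line-a' \
  '# vllm-line-b' > "$tmp"

# Dry run: same expressions as in the build steps, but without -i.
result="$(sed -e '3,4s/^/# /' -e '6,7s/^# //' "$tmp")"
printf '%s\n' "$result"
rm -f "$tmp"
```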

3. Start the Docker container

+
docker run --name mineru_docker \
+   --privileged \
+   --ipc=host \
+   --network=host \
+   --shm-size=400g \
+   --ulimit memlock=-1 \
+   -v /dev:/dev \
+   -v /lib/modules:/lib/modules:ro \
+   -v /usr/bin/cnmon:/usr/bin/cnmon \
+   -e MINERU_MODEL_SOURCE=local \
+   -e MINERU_LMDEPLOY_DEVICE=camb \
+   -it mineru:mlu-lmdeploy-latest \
+   /bin/bash
+
+
+

Tip

+

Choose the vllm or lmdeploy image as appropriate. To use vllm, do the following:

+
    +
  • +

    Replace mineru:mlu-lmdeploy-latest in the command above with mineru:mlu-vllm-latest.

    +
  • +
  • +

    After entering the container, switch to the venv environment with the following command:

    source /torch/venv3/pytorch_infer/bin/activate
    +
    +
  • +
  • +

    Once the switch succeeds, the prompt is prefixed with (pytorch_infer), which indicates that you are in the vllm virtual environment.

    +
  • +
+
+

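After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.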

+

4. Notes

+
+

Note

+

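Compatibility note: Cambricon's support for the vLLM v1 engine is still incomplete, so MinerU currently uses the v0 engine on this platform. As a result, vLLM's async engine has compatibility issues that may prevent some use cases from working. We will keep tracking Cambricon's progress on vLLM v1 support and adapt and optimize MinerU accordingly.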

+
+

The table below shows MinerU's support for Cambricon accelerator cards in different environments:

+
+

Tip

+
    +
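  • The yellow status for lmdeploy means that batch parsing with a folder as input does not work; a single input file behaves normally.
  • +
  • The yellow status for vllm means that accuracy is not yet aligned, so unexpected results may appear in some scenarios.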
  • +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
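| Use case | | Container environment: vllm | Container environment: lmdeploy |
| --- | --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟡 | 🟡 |
| | <vlm/hybrid>-http-client | 🟡 | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🔴 | 🟢 |
| | <vlm/hybrid>-http-client | 🟡 | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🔴 | 🟢 |
| | <vlm/hybrid>-http-client | 🟡 | 🟢 |
| openai-server service (mineru-openai-server) | | 🟡 | 🟢 |
| Data parallelism (--data-parallel-size/--dp) | | 🔴 | 🔴 |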
+ +

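Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly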

+
+

Tip

+
    +
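  • Selecting which Cambricon cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices, replacing the environment variable CUDA_VISIBLE_DEVICES with MLU_VISIBLE_DEVICES.
  • +
  • On Cambricon, run cnmon to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.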
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Enflame/index.html b/zh/usage/acceleration_cards/Enflame/index.html new file mode 100644 index 00000000..60bfbc98 --- /dev/null +++ b/zh/usage/acceleration_cards/Enflame/index.html @@ -0,0 +1,1732 @@ + + + + + + + + + + + + + + + + + + + + + Enflame - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Enflame

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.4 LTS  
+cpu: Intel x86-64
+gcu: Enflame S60 
+driver: 1.7.0.9
+docker: 28.0.1
+
+

2. Environment setup

+

2.1 Build the image from the Dockerfile

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/gcu.Dockerfile
+docker build --network=host -t mineru:gcu-vllm-latest -f gcu.Dockerfile .
+
+

3. Start the Docker container

+
docker run -u root --name mineru_docker \
+    --network=host \
+    --ipc=host \
+    --privileged \
+    -e MINERU_MODEL_SOURCE=local \
+    -it mineru:gcu-vllm-latest \
+    /bin/bash
+
+

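After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.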

+

4. Notes

+

The table below shows MinerU's support for Enflame accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
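| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🔴 |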
+ +

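Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly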

+
+

Tip

+
    +
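  • Selecting which GCU cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices, replacing the environment variable CUDA_VISIBLE_DEVICES with TOPS_VISIBLE_DEVICES.
  • +
  • On Enflame, run efsmi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.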
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Hygon/index.html b/zh/usage/acceleration_cards/Hygon/index.html new file mode 100644 index 00000000..d1b3f867 --- /dev/null +++ b/zh/usage/acceleration_cards/Hygon/index.html @@ -0,0 +1,1738 @@ + + + + + + + + + + + + + + + + + + + + + Hygon - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Hygon

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.3 LTS  
+cpu: Hygon C86-4G(x86-64)
+dcu: BW200
+driver: 6.3.13-V1.12.0a
+docker: 20.10.24
+
+

2. Environment setup

+

2.1 Build the image from the Dockerfile

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/dcu.Dockerfile
+docker build --network=host -t mineru:dcu-vllm-latest -f dcu.Dockerfile .
+
+

3. Start the Docker container

+
docker run -u root --name mineru_docker \
+    --network=host \
+    --ipc=host \
+    --shm-size=16G \
+    --device=/dev/kfd \
+    --device=/dev/mkfd \
+    --device=/dev/dri \
+    -v /opt/hyhal:/opt/hyhal \
+    --group-add video \
+    --cap-add=SYS_PTRACE \
+    --security-opt seccomp=unconfined \
+    -e MINERU_MODEL_SOURCE=local \
+    -it mineru:dcu-vllm-latest \
+    /bin/bash
+
+

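After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.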

+

4. Notes

+

The table below shows MinerU's support for Hygon accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
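| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🟢 |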
+ +

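Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly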

+
+

Tip

+
    +
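  • Selecting which DCU cards are visible works much as it does on AMD GPUs; see GPU isolation techniques.
  • +
  • On Hygon, run hy-smi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.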
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/IluvatarCorex/index.html b/zh/usage/acceleration_cards/IluvatarCorex/index.html new file mode 100644 index 00000000..706cd3c2 --- /dev/null +++ b/zh/usage/acceleration_cards/IluvatarCorex/index.html @@ -0,0 +1,1746 @@ + + + + + + + + + + + + + + + + + + + + + IluvatarCorex - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

IluvatarCorex

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  
+cpu: Intel x86-64
+gpu: Iluvatar BI-V150
+driver: 4.4.0
+docker: 28.1.1
+
+

2. Environment setup

+

2.1 Build the image from the Dockerfile

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/corex.Dockerfile
+docker build --network=host -t mineru:corex-vllm-latest -f corex.Dockerfile .
+
+

3. Start the Docker container

+
docker run --name mineru_docker \
+   -v /usr/src:/usr/src \
+   -v /lib/modules:/lib/modules \
+   -v /dev:/dev \
+   --privileged \
+   --cap-add=ALL \
+   --pid=host \
+   --group-add video \
+   --network=host \
+   --shm-size '400gb' \
+   --ulimit memlock=-1 \
+   --security-opt seccomp=unconfined \
+   --security-opt apparmor=unconfined \
+   -e VLLM_ENFORCE_CUDA_GRAPH=1 \
+   -e MINERU_MODEL_SOURCE=local \
+   -e MINERU_VLLM_DEVICE=corex \
+   -it mineru:corex-vllm-latest \
+   /bin/bash
+
+

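After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.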

+

4. Notes

+
+

Tip

+

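With the current Iluvatar setup, when vllm is the inference engine, GPU memory may not be released after the service stops. If this happens, restart the Docker container to free the memory.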

+
+

The table below shows MinerU's support for Iluvatar accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
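| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🟢 |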
+ +

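Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly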

+
+

Tip

+
    +
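  • Selecting which Iluvatar cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices.
  • +
  • On Iluvatar, run ixsmi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.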
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Kunlunxin/index.html b/zh/usage/acceleration_cards/Kunlunxin/index.html new file mode 100644 index 00000000..a2e06871 --- /dev/null +++ b/zh/usage/acceleration_cards/Kunlunxin/index.html @@ -0,0 +1,1747 @@ + + + + + + + + + + + + + + + + + + + + + Kunlunxin - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Kunlunxin

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  
+cpu: Intel x86-64
+xpu: P800
+driver: 515.58
+docker: 20.10.5
+
+

2. Environment setup

+

2.1 Build the image from the Dockerfile (vllm)

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/kxpu.Dockerfile
+docker build --network=host -t mineru:kxpu-vllm-latest -f kxpu.Dockerfile .
+
+

3. Start the Docker container

+
docker run -u root --name mineru_docker \
+    --device=/dev/xpu0:/dev/xpu0 \
+    --device=/dev/xpu1:/dev/xpu1 \
+    --device=/dev/xpu2:/dev/xpu2 \
+    --device=/dev/xpu3:/dev/xpu3 \
+    --device=/dev/xpu4:/dev/xpu4 \
+    --device=/dev/xpu5:/dev/xpu5 \
+    --device=/dev/xpu6:/dev/xpu6 \
+    --device=/dev/xpu7:/dev/xpu7 \
+    --device=/dev/xpuctrl:/dev/xpuctrl \
+    --net=host \
+    --cap-add=SYS_PTRACE --security-opt seccomp=unconfined \
+    --tmpfs /dev/shm:rw,nosuid,nodev,exec,size=32g \
+    -v /home/users/vllm-kunlun:/home/vllm-kunlun \
+    -v /usr/local/bin/xpu-smi:/usr/local/bin/xpu-smi \
+    -w /workspace \
+    -e MINERU_MODEL_SOURCE=local \
+    -e MINERU_FORMULA_CH_SUPPORT=true \
+    -e MINERU_VLLM_DEVICE=kxpu \
+    -it mineru:kxpu-vllm-latest \
+    /bin/bash
+
+

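After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.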

+

4. Notes

+

The table below shows MinerU's support for Kunlunxin accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
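| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🔴 |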
+ +

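Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly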

+
+

Tip

+
    +
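  • Selecting which Kunlunxin cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices, replacing the environment variable CUDA_VISIBLE_DEVICES with XPU_VISIBLE_DEVICES.
  • +
  • On Kunlunxin, run xpu-smi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.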
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/METAX/index.html b/zh/usage/acceleration_cards/METAX/index.html new file mode 100644 index 00000000..d5f187f6 --- /dev/null +++ b/zh/usage/acceleration_cards/METAX/index.html @@ -0,0 +1,1778 @@ + + + + + + + + + + + + + + + + + + + + + METAX - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

METAX

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04   
+cpu: INTEL x86_64
+gpu: C500  
+driver: 2.12.13
+docker: 28.1.1
+
+

2. Environment setup

+
+

Note

+

maca accelerator cards support VLM inference acceleration with either vllm or lmdeploy. Install and use whichever fits your needs:

+
+

2.1 Build the vllm image on top of the official metax base image

+
    +
  1. Pull the base image from the official metax registry
      +
    • 1.1 Image source: https://developer.metax-tech.com/softnova/docker
    • +
    • 1.2 On the image site, select the AI category, choose vllm as the package type and ubuntu as the operating system
    • +
    • 1.3 Find the image vllm:maca.ai3.1.0.7-torch2.6-py310-ubuntu22.04-amd64, copy its pull command, and run it in a local terminal
    • +
    +
  2. +
  3. Build the image from the Dockerfile (vllm)
    wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/maca.Dockerfile
    +docker build --network=host -t mineru:maca-vllm-latest -f maca.Dockerfile .
    +
  4. +
+

2.2 Build the image from the Dockerfile (lmdeploy)

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/maca.Dockerfile
+# Switch the base image from vllm to lmdeploy
+sed -i '3s/^/# /' maca.Dockerfile && sed -i '5s/^# //' maca.Dockerfile
+docker build --network=host -t mineru:maca-lmdeploy-latest -f maca.Dockerfile .
+
+

3. Start the Docker container

+
docker run --ipc host \
+   --cap-add SYS_PTRACE \
+   --privileged=true \
+   --device=/dev/mem \
+   --device=/dev/dri \
+   --device=/dev/mxcd \
+   --device=/dev/infiniband \
+   --group-add video \
+   --network=host \
+   --shm-size '100gb' \
+   --ulimit memlock=-1 \
+   --security-opt seccomp=unconfined \
+   --security-opt apparmor=unconfined \
+   --name mineru_docker \
+   -v /datapool:/datapool \
+   -e MINERU_MODEL_SOURCE=local \
+   -e MINERU_LMDEPLOY_DEVICE=maca \
+   -it mineru:maca-vllm-latest \
+   /bin/bash
+
+
+

Tip

+

Choose the vllm or lmdeploy image as appropriate. To use lmdeploy, replace mineru:maca-vllm-latest in the command above with mineru:maca-lmdeploy-latest.

+
+

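After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.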

+

4. Notes

+

The table below shows MinerU's support for maca accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
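| Use case | | Container environment: vllm | Container environment: lmdeploy |
| --- | --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 | 🟢 |
| Data parallelism (--data-parallel-size/--dp) | | 🔴 | 🔴 |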
+ +

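Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly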

+
+

Tip

+
    +
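  • Selecting which MACA cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices.
  • +
  • On METAX, run mx-smi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.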
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/MooreThreads/index.html b/zh/usage/acceleration_cards/MooreThreads/index.html new file mode 100644 index 00000000..d46e3c81 --- /dev/null +++ b/zh/usage/acceleration_cards/MooreThreads/index.html @@ -0,0 +1,1740 @@ + + + + + + + + + + + + + + + + + + + + + MooreThreads - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

MooreThreads

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.4 LTS  
+cpu: Intel x86-64
+dcu: MTT S4000
+driver: 3.0.0-rc-KuaE2.0
+docker: 24.0.7
+
+

2. Environment setup

+

2.1 Build the image from the Dockerfile

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/musa.Dockerfile
+docker build --network=host -t mineru:musa-vllm-latest -f musa.Dockerfile .
+
+

3. Start the Docker container

+
docker run -u root --name mineru_docker \
+    --network=host \
+    --ipc=host \
+    --shm-size=80g \
+    --privileged \
+    -e MTHREADS_VISIBLE_DEVICES=all \
+    -e MINERU_VLLM_DEVICE=musa \
+    -e MINERU_MODEL_SOURCE=local \
+    -it mineru:musa-vllm-latest \
+    /bin/bash
+
+

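After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.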

+

4. Notes

+

The table below shows MinerU's support for MooreThreads accelerator cards in different environments:

+
+

Note

+

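Compatibility note: MooreThreads' support for the vLLM v1 engine is still incomplete, so MinerU currently uses the v0 engine on this platform. As a result, vLLM's async engine has compatibility issues that may prevent some use cases from working. We will keep tracking MooreThreads' progress on vLLM v1 support and adapt and optimize MinerU accordingly.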

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
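| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🔴 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🔴 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🔴 |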
+ +

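Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly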

+
+

Tip

+
    +
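  • Selecting which MooreThreads cards are visible works much as it does on NVIDIA GPUs; see GPU enumeration.
  • +
  • On MooreThreads, run mthreads-gmi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.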
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/THead/index.html b/zh/usage/acceleration_cards/THead/index.html new file mode 100644 index 00000000..a9b47115 --- /dev/null +++ b/zh/usage/acceleration_cards/THead/index.html @@ -0,0 +1,1765 @@ + + + + + + + + + + + + + + + + + + + + + THead - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

THead

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04   
+cpu: INTEL x86_64
+ppu: ZW810E  
+driver: 1.4.0
+docker: 26.1.4
+
+

2. Environment setup

+
+

Note

+

ppu accelerator cards support VLM inference acceleration with either vllm or lmdeploy. Install and use whichever fits your needs:

+
+

2.1 Build the image from the Dockerfile (vllm)

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/ppu.Dockerfile
+docker build --network=host -t mineru:ppu-vllm-latest -f ppu.Dockerfile .
+
+

2.2 Build the image from the Dockerfile (lmdeploy)

+
wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/docker/china/ppu.Dockerfile
+# Switch the base image from vllm to lmdeploy
+sed -i '3s/^/# /' ppu.Dockerfile && sed -i '5s/^# //' ppu.Dockerfile
+docker build --network=host -t mineru:ppu-lmdeploy-latest -f ppu.Dockerfile .
+
+

3. Start the Docker container

+
docker run --privileged=true \
+  --name mineru_docker \
+  --device=/dev/alixpu \
+  --device=/dev/alixpu_ctl \
+  --ipc=host \
+  --network=host \
+  --ulimit memlock=-1 \
+  --ulimit stack=67108864 \
+  --shm-size=500g \
+  -v /mnt:/mnt \
+  -v /datapool:/datapool \
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  -e MINERU_MODEL_SOURCE=local \
+  -it mineru:ppu-vllm-latest \
+  /bin/bash
+
+
+

Tip

+

Choose the vllm or lmdeploy image as appropriate. To use lmdeploy, replace mineru:ppu-vllm-latest in the command above with mineru:ppu-lmdeploy-latest.

+
+

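After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.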

+

4. Notes

+

The table below shows MinerU's support for ppu accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
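| Use case | | Container environment: vllm | Container environment: lmdeploy |
| --- | --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 | 🟢 |
| Data parallelism (--data-parallel-size/--dp) | | 🔴 | 🔴 |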
+ +

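Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly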

+
+

Tip

+
    +
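  • Selecting which PPU cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices.
  • +
  • On T-Head, run ppu-smi to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.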
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/Tecorigin/index.html b/zh/usage/acceleration_cards/Tecorigin/index.html new file mode 100644 index 00000000..48d014fe --- /dev/null +++ b/zh/usage/acceleration_cards/Tecorigin/index.html @@ -0,0 +1,1744 @@ + + + + + + + + + + + + + + + + + + + + + Tecorigin - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

Tecorigin

+ +

1. Test platform

+

The platform used to test this guide is listed below for reference:

os: Ubuntu 22.04.5 LTS  
+cpu: AMD EPYC (amd64)
+gpu: T100
+driver: 3.0.0
+docker: 28.0.4
+
+

2. Environment setup

+

2.1 Download and load the image (vllm)

+
wget http://wb.tecorigin.com:8082/repository/teco-customer-repo/Course/MinerU/mineru-vllm.tar
+
+docker load -i mineru-vllm.tar
+
+

3. Start the Docker container

+
docker run -dit --name mineru_docker \
+    --privileged \
+    --cap-add SYS_PTRACE \
+    --cap-add SYS_ADMIN \
+    --network=host \
+    --shm-size=500G \
+    mineru:sdaa-vllm-latest \
+    /bin/bash
+
+
+

Tip

+

To use the vllm environment, do the following:
  • After entering the container, switch to the conda environment with the following command:

conda activate vllm_env_py310
+
+
    +
  • Once the switch succeeds, the prompt is prefixed with (vllm_env_py310), which indicates that you are in the vllm virtual environment.
  • +
+
+

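After running this command you will be in an interactive terminal inside the Docker container, where you can run MinerU commands directly. You can also replace /bin/bash with a service start command to launch a MinerU service; see Starting services via command for details.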

+

4. Notes

+

The table below shows MinerU's support for Tecorigin accelerator cards in different environments:

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
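| Use case | | Container environment: vllm |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| FastAPI service (mineru-api) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| Gradio UI (mineru-gradio) | pipeline | 🟢 |
| | <vlm/hybrid>-auto-engine | 🟢 |
| | <vlm/hybrid>-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🔴 |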
+ +

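Notes:
+🟢: Supported; runs stably, with accuracy essentially matching Nvidia GPUs
+🟡: Supported but less stable; may fail in some scenarios, or accuracy differs somewhat
+🔴: Not supported; cannot run, or accuracy differs significantly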

+
+

Tip

+
    +
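  • Selecting which Tecorigin cards are visible works much as it does on NVIDIA GPUs; see the section Using specific GPU devices, replacing the environment variable CUDA_VISIBLE_DEVICES with SDAA_VISIBLE_DEVICES.
  • +
  • On Tecorigin, run teco-smi -c to check accelerator usage, and specify idle card IDs as needed to avoid resource conflicts.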
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/acceleration_cards/VastAI/index.html b/zh/usage/acceleration_cards/VastAI/index.html new file mode 100644 index 00000000..df57bbb9 --- /dev/null +++ b/zh/usage/acceleration_cards/VastAI/index.html @@ -0,0 +1,1866 @@ + + + + + + + + + + + + + + + + + + + + + VastAI - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

VastAI

+ +

1. Vastai Technologies (瀚博半导体)

+

vastaitech

+
    +
  • Official website: https://www.vastaitech.com
  • +
  • Model hub: https://github.com/Vastai/VastModelZOO
  • +
+

2. Test platform

+
    +
  • The platform used to test this guide is listed below for reference:
    os: Ubuntu-22.04.3-LTS-x86_64
    +cpu: Hygon C86-4G
    +gpu: VA16 / VA1L / VA10L
    +torch: 2.8.0+cpu
    +torch-vacc: 1.3.3.777
    +vllm: 0.11.1.dev0+gb8b302cde.d20251030.cpu
    +vllm-vacc: 0.11.0.777
    +driver: 00.25.12.30 d3_3_v2_9_a3_1 a76bf37 20251230
    +docker: 28.1.1
    +
  • +
+

3. Environment setup

+
    +
  • +

    Pull the vllm_vacc base image:

    sudo docker pull harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2
    +
    +
  • +
  • +

    Start the container:

    sudo docker run -it \
    +    --privileged=true \
    +    --shm-size=256g \
    +    --name vllm_service \
    +    --ipc=host \
    +    --network=host \
    +    harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2 bash
    +
    +
  • +
  • +

    Install MinerU

    +
  • +
  • +

    Follow the official documentation to install: README_zh-CN.md#安装-mineru

    +
    ```bash
    +# Enter the container
    +# sudo docker exec -it vllm_service bash
    +
    +# Optional PyPI mirrors
    +# https://mirrors.163.com/pypi/simple/
    +# https://mirrors.aliyun.com/pypi/simple/
    +# https://pypi.mirrors.ustc.edu.cn/simple/
    +# https://pypi.tuna.tsinghua.edu.cn/simple/
    +# https://mirror.baidu.com/pypi/simple
    +
    +# Install MinerU from source
    +git clone https://github.com/opendatalab/MinerU.git
    +cd MinerU
    +git checkout 8c4b3ef3a20b11ddac9903f25124d24ea82639b5
    +pip install -e ".[core]" -i https://mirrors.aliyun.com/pypi/simple
    +
    +# 或使用pip安装MinerU
    +pip install -U "mineru[core]==2.7.0" -i https://mirrors.aliyun.com/pypi/simple
    +```
    +
    +
  • +
+
+

Note

+
    +
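  • The vllm_vacc base image already includes torch, vllm, and related dependencies.
  • +
  • As of 2025/12/31, VastAI supports MinerU up to the latest version 2.7.0 (master branch, commit 8c4b3ef3).
  • +
  • Analogous to CUDA_VISIBLE_DEVICES on NVIDIA hardware, VACC_VISIBLE_DEVICES selects the visible compute card IDs on VastAI hardware, e.g. -e VACC_VISIBLE_DEVICES=0,1,2,3.
  • +
  • Set --shm-size to a suitable shared-memory size.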
  • +
+
+
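The container start command and the `VACC_VISIBLE_DEVICES` note above can be combined. The following is a minimal sketch, not part of the VastAI tooling: the helper name `docker_run_command` is hypothetical, and it only assembles the argument list shown in this guide with the extra `-e` option added.

```python
import shlex

# Image tag taken from this guide.
IMAGE = "harbor.vastaitech.com/ai_deliver/vllm_vacc:VVI-25.12.SP2"


def docker_run_command(card_ids, shm_size="256g", name="vllm_service"):
    """Build the `docker run` arguments from this guide, limited to the given card IDs.

    Hypothetical helper for illustration; it does not run anything.
    """
    if not card_ids:
        raise ValueError("at least one card ID is required")
    visible = ",".join(str(card) for card in card_ids)
    return [
        "sudo", "docker", "run", "-it",
        "--privileged=true",
        f"--shm-size={shm_size}",
        "--name", name,
        "--ipc=host",
        "--network=host",
        "-e", f"VACC_VISIBLE_DEVICES={visible}",
        IMAGE, "bash",
    ]


# Print a copy-pasteable command line for cards 0 to 3.
print(shlex.join(docker_run_command([0, 1, 2, 3])))
```

Building the list first and printing it keeps the card selection explicit, so a free card ID can be substituted without retyping the whole command.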

4. MinerU功能

+
+

Note

+
    +
  • VastAI accelerator cards only support VLM inference acceleration through vlm-auto-engine or vlm-http-client.
  • +
+
+
    +
  • +

    进入容器 +

    sudo docker exec -it vllm_service bash
    +
    +
  • +
  • +

    使用MinerU

    +
      +
    • +

      模型准备,参考官方介绍:model_source.md

      +
    • +
    • +

      方式一:vlm-auto-engine

      +
      export MINERU_MODEL_SOURCE=modelscope
      +
      +# step1, 以`vlm-auto-engine`方式启动MinerU解析任务
      +mineru -p image.png \
      +-o ./output \
      +-b vlm-auto-engine \
      +--http-timeout 1200 \
      +--tensor-parallel-size 2 \
      +--enforce_eager \
      +--trust-remote-code \
      +--max-model-len 16384
      +
      +
    • +
    • +

      方式二:vlm-http-client

      +
      # step1, 启动vLLM API server
      +vllm serve /root/.cache/modelscope/hub/models/OpenDataLab/MinerU2.5-2509-1.2B \
      +--tensor-parallel-size 2 \
      +--trust-remote-code \
      +--enforce_eager \
      +--port 8090 \
      +--max-model-len 16384 \
      +--served-model-name MinerU2.5-2509-1.2B
      +
      +# step2,以`vlm-http-client`方式启动MinerU解析任务
      +mineru -p demo/pdfs/demo1.pdf \
      +-o ./output \
      +-b vlm-http-client \
      +-u http://127.0.0.1:8090 \
      +--http-timeout 1200
      +
      +
    • +
    +
  • +
+
+
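In the vlm-http-client mode above, the server port appears in both commands (`--port 8090` and `-u http://127.0.0.1:8090`). The sketch below is an illustration only, with hypothetical helper names; it derives both argument lists from one port value so they cannot drift apart. All flags are copied from the commands in this guide.

```python
import shlex

# Model path and served name as used in this guide.
MODEL_PATH = "/root/.cache/modelscope/hub/models/OpenDataLab/MinerU2.5-2509-1.2B"
SERVED_NAME = "MinerU2.5-2509-1.2B"


def vllm_serve_cmd(port=8090, tensor_parallel_size=2, max_model_len=16384):
    """Step 1: argument list for the vLLM API server."""
    return [
        "vllm", "serve", MODEL_PATH,
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--trust-remote-code",
        "--enforce_eager",
        "--port", str(port),
        "--max-model-len", str(max_model_len),
        "--served-model-name", SERVED_NAME,
    ]


def mineru_client_cmd(input_path, output_dir, port=8090, http_timeout=1200):
    """Step 2: argument list for the MinerU client pointing at the same port."""
    return [
        "mineru", "-p", input_path,
        "-o", output_dir,
        "-b", "vlm-http-client",
        "-u", f"http://127.0.0.1:{port}",
        "--http-timeout", str(http_timeout),
    ]


print(shlex.join(vllm_serve_cmd()))
print(shlex.join(mineru_client_cmd("demo/pdfs/demo1.pdf", "./output")))
```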

Note

+
    +
  • Append the --enforce_eager flag to every vllm-related command.
  • +
+
+

5. 注意事项

+

VastAI accelerator support for MinerU is summarized in the table below:

| Use case | Backend | Support |
| --- | --- | --- |
| Command-line tool (mineru) | pipeline | 🔴 |
| | hybrid-http-client | 🔴 |
| | hybrid-auto-engine | 🔴 |
| | vlm-auto-engine | 🟢 |
| | vlm-http-client | 🟢 |
| fastapi service (mineru-api) | pipeline | 🔴 |
| | hybrid-http-client | 🔴 |
| | hybrid-auto-engine | 🔴 |
| | vlm-auto-engine | 🟢 |
| | vlm-http-client | 🟢 |
| gradio interface (mineru-gradio) | pipeline | 🔴 |
| | hybrid-http-client | 🔴 |
| | hybrid-auto-engine | 🔴 |
| | vlm-auto-engine | 🟢 |
| | vlm-http-client | 🟢 |
| openai-server service (mineru-openai-server) | | 🟢 |
| Tensor parallelism (--tensor-parallel-size) | | 🟢 |
| Data parallelism (--data-parallel-size) | | 🔴 |
+ +
+

Note

+
    +
  • 🟢: 支持,运行较稳定,精度与NVIDIA GPU基本一致
  • +
  • 🟡: 支持但较不稳定,在某些场景下可能出现异常,或精度存在一定差异
  • +
  • 🔴: 不支持,无法运行,或精度存在较大差异
  • +
  • vlm-auto-engine:VastAI仅支持vLLM后端
  • +
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/advanced_cli_parameters/index.html b/zh/usage/advanced_cli_parameters/index.html new file mode 100644 index 00000000..ea4b9514 --- /dev/null +++ b/zh/usage/advanced_cli_parameters/index.html @@ -0,0 +1,1967 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 命令行进阶参数 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

命令行参数进阶

+

推理引擎参数透传

+

vllm 加速参数优化

+
+

Tip

+

如果您已经可以正常使用vllm对vlm模型进行加速推理,但仍然希望进一步提升推理速度,可以尝试以下参数:

+
    +
  • If you have multiple GPUs, you can use vllm's multi-GPU parallel mode to increase throughput: --data-parallel-size 2
  • +
+
+

参数传递说明

+
+

Tip

+
    +
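  • All parameters officially supported by vllm/lmdeploy can be passed to MinerU as command-line arguments. This applies to the following commands: mineru, mineru-openai-server, mineru-gradio, mineru-api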
  • +
  • 如果您想了解更多有关vllm的参数使用方法,请参考 vllm官方文档
  • +
  • 如果您想了解更多有关lmdeploy的参数使用方法,请参考 lmdeploy官方文档
  • +
+
+

GPU 设备选择与配置

+

CUDA_VISIBLE_DEVICES 基本用法

+
+

Tip

+
    +
  • 任何情况下,您都可以通过在命令行的开头添加CUDA_VISIBLE_DEVICES 环境变量来指定可见的 GPU 设备: +
    CUDA_VISIBLE_DEVICES=1 mineru -p <input_path> -o <output_path>
    +
  • +
  • This works for every command-line invocation, including mineru, mineru-openai-server, mineru-gradio, and mineru-api, and applies to both the pipeline and vlm backends.
  • +
+
+

常见设备配置示例

+
+

Tip

+

以下是一些常见的 CUDA_VISIBLE_DEVICES 设置示例: +

CUDA_VISIBLE_DEVICES=1  # Only device 1 will be seen
+CUDA_VISIBLE_DEVICES=0,1  # Devices 0 and 1 will be visible
+CUDA_VISIBLE_DEVICES="0,1"  # Same as above, quotation marks are optional
+CUDA_VISIBLE_DEVICES=0,2,3  # Devices 0, 2, 3 will be visible; device 1 is masked
+CUDA_VISIBLE_DEVICES=""  # No GPU will be visible
+
+
+
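The examples above follow one simple rule: a comma-separated list of device indices, with the empty string meaning no GPU. The sketch below mirrors only that rule for illustration; the function name `visible_devices` is hypothetical, and the real CUDA runtime accepts more forms (for example device UUIDs) that are not handled here. Note also that in a shell the quotation marks are removed before the program sees the value, and that leaving the variable unset is different from setting it to the empty string.

```python
def visible_devices(value):
    """Return the device indices named by a simple CUDA_VISIBLE_DEVICES value.

    Illustration of the examples in this section only; integer indices
    separated by commas, optional surrounding double quotes.
    """
    value = value.strip().strip('"')
    if value == "":
        return []  # empty string: no GPU is visible
    return [int(part) for part in value.split(",")]


for example in ["1", "0,1", '"0,1"', "0,2,3", ""]:
    print(repr(example), "->", visible_devices(example))
```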

实际应用场景

+
+

Tip

+

以下是一些可能的使用场景:

+
    +
  • +

    如果您有多张显卡,需要指定卡0和卡1,并使用多卡并行来启动openai-server,可以使用以下命令: +

    CUDA_VISIBLE_DEVICES=0,1 mineru-openai-server --engine vllm --port 30000 --data-parallel-size 2
    +
    +
  • +
  • +

    如果您有多张显卡,需要在卡0和卡1上启动两个fastapi服务,并分别监听不同的端口,可以使用以下命令: +

    # 在终端1中
    +CUDA_VISIBLE_DEVICES=0 mineru-api --host 127.0.0.1 --port 8000
    +# 在终端2中
    +CUDA_VISIBLE_DEVICES=1 mineru-api --host 127.0.0.1 --port 8001
    +
    +
  • +
+
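The two-terminal pattern above (one `mineru-api` process per GPU, each on its own port) can also be scripted. This is a minimal sketch under stated assumptions: `api_service_specs` is a hypothetical helper, and the port numbering (base port plus position in the list) is chosen here for illustration. It only prepares the environment and argument list for each service and starts nothing by itself.

```python
import os


def api_service_specs(gpu_ids, host="127.0.0.1", base_port=8000):
    """Return one (env, cmd) pair per GPU, each bound to its own port."""
    specs = []
    for offset, gpu in enumerate(gpu_ids):
        # Copy the current environment and pin this service to a single GPU.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        cmd = ["mineru-api", "--host", host, "--port", str(base_port + offset)]
        specs.append((env, cmd))
    return specs


for env, cmd in api_service_specs([0, 1]):
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(cmd))
```

Each pair can then be handed to `subprocess.Popen(cmd, env=env)` if the services should be started from Python rather than from separate terminals.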
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/cli_tools/index.html b/zh/usage/cli_tools/index.html new file mode 100644 index 00000000..5cb81bcb --- /dev/null +++ b/zh/usage/cli_tools/index.html @@ -0,0 +1,1973 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 命令行工具 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

命令行工具使用说明

+

查看帮助信息

+

要查看 MinerU 命令行工具的帮助信息,可以使用 --help 参数。以下是各个命令行工具的帮助信息示例: +

mineru --help
+Usage: mineru [OPTIONS]
+
+Options:
+  -v, --version                   Show the version and exit
+  -p, --path PATH                 Input file path or directory (required)
+  -o, --output PATH               Output directory (required)
+  --api-url TEXT                  MinerU FastAPI service URL; if omitted, a temporary local mineru-api is started automatically
+  -m, --method [auto|txt|ocr]     Parsing method: auto (default), txt, ocr (pipeline and hybrid* backends only)
+  -b, --backend [pipeline|hybrid-auto-engine|hybrid-http-client|vlm-auto-engine|vlm-http-client]
+                                  Parsing backend (default: hybrid-auto-engine)
+  -l, --lang [ch|ch_server|ch_lite|en|korean|japan|chinese_cht|ta|te|ka|th|el|latin|arabic|east_slavic|cyrillic|devanagari]
+                                  Document language (can improve OCR accuracy; pipeline and hybrid* backends only)
+  -u, --url TEXT                  OpenAI-compatible URL passed to the server-side backend when using http-client
+  -s, --start INTEGER             First page to parse (0-based)
+  -e, --end INTEGER               Last page to parse (0-based)
+  -f, --formula BOOLEAN           Enable formula parsing (on by default)
+  -t, --table BOOLEAN             Enable table parsing (on by default)
+  --help                          Show help information
+
+
mineru-api --help
+Usage: mineru-api [OPTIONS]
+
+Options:
+  --host TEXT     服务器主机地址(默认:127.0.0.1)
+  --port INTEGER  服务器端口(默认:8000)
+  --reload        启用自动重载(开发模式)
+  --help          显示此帮助信息并退出
+
+
mineru-gradio --help
+Usage: mineru-gradio [OPTIONS]
+
+Options:
+  --enable-example BOOLEAN        启用示例文件输入(需要将示例文件放置在当前
+                                  执行命令目录下的 `example` 文件夹中)
+  --enable-http-client BOOLEAN    在后端选项中启用 HTTP 客户端选项
+  --enable-api BOOLEAN            启用 Gradio API 以提供应用程序服务
+  --max-convert-pages INTEGER     设置从 PDF 转换为 Markdown 的最大页数
+  --server-name TEXT              设置 Gradio 应用程序的服务器主机名
+  --server-port INTEGER           设置 Gradio 应用程序的服务器端口
+  --latex-delimiters-type [a|b|all]
+                                  设置在 Markdown 渲染中使用的 LaTeX 分隔符类型
+                                  ('a' 表示 '$' 类型,'b' 表示 '()[]' 类型,
+                                  'all' 表示两种类型都使用)
+  --help                          显示此帮助信息并退出
+
+

环境变量说明

+
+

Note

+

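Starting with the current version, mineru is an orchestration client built on mineru-api:
- When --api-url is not passed, the CLI automatically starts a temporary local mineru-api.
- When --api-url is passed, the CLI connects directly to that FastAPI service.
- --url no longer denotes the MinerU API address; it is the OpenAI-compatible URL required by the server-side vlm/hybrid-http-client.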

+
+

MinerU命令行工具的某些参数存在相同功能的环境变量配置,通常环境变量配置的优先级高于命令行参数,且在所有命令行工具中都生效。 +以下是常用的环境变量及其说明:

+
    +
  • +

    MINERU_TOOLS_CONFIG_JSON

    +
      +
    • 用于指定配置文件路径
    • +
    • 默认为用户目录下的mineru.json,可通过环境变量指定其他配置文件路径。
    • +
    +
  • +
  • +

    MINERU_FORMULA_ENABLE

    +
      +
    • 用于启用公式解析
    • +
    • 默认为true,可通过环境变量设置为false来禁用公式解析。
    • +
    +
  • +
  • +

    MINERU_FORMULA_CH_SUPPORT

    +
      +
    • 用于启用中文公式解析优化(实验性功能)
    • +
    • 默认为false,可通过环境变量设置为true来启用中文公式解析优化。
    • +
    • 仅对pipeline后端生效。
    • +
    +
  • +
  • +

    MINERU_TABLE_ENABLE

    +
      +
    • 用于启用表格解析
    • +
    • 默认为true,可通过环境变量设置为false来禁用表格解析。
    • +
    +
  • +
  • +

    MINERU_TABLE_MERGE_ENABLE

    +
      +
    • 用于启用表格合并功能
    • +
    • 默认为true,可通过环境变量设置为false来禁用表格合并功能。
    • +
    +
  • +
  • +

    MINERU_PDF_RENDER_TIMEOUT

    +
      +
    • 用于设置将PDF渲染为图片的超时时间(秒)
    • +
    • 默认为300秒,可通过环境变量设置为其他值以调整渲染图片的超时时间。
    • +
    • 仅在linux和macOS系统中生效。
    • +
    +
  • +
  • +

    MINERU_PDF_RENDER_THREADS

    +
      +
    • 用于设置将PDF渲染为图片时使用的线程数
    • +
    • 默认为4,可通过环境变量设置为其他值以调整渲染图片时的线程数。
    • +
    • 仅在linux和macOS系统中生效。
    • +
    +
  • +
  • +

    MINERU_INTRA_OP_NUM_THREADS

    +
      +
    • 用于设置onnx模型的intra_op线程数,影响单个算子的计算速度
    • +
    • 默认为-1(自动选择),可通过环境变量设置为其他值以调整线程数。
    • +
    +
  • +
  • +

    MINERU_INTER_OP_NUM_THREADS

    +
      +
    • 用于设置onnx模型的inter_op线程数,影响多个算子的并行执行
    • +
    • 默认为-1(自动选择),可通过环境变量设置为其他值以调整线程数。
    • +
    +
  • +
  • +

    MINERU_HYBRID_BATCH_RATIO

    +
      +
    • 用于设置 hybrid-* 后端中 小模型处理的batch倍率
    • +
    • 在hybrid-http-client中较为常用,可以通过控制小模型的batch倍率来调整单个客户端的显存占用量
    • +
    • + + + + + + + + + + + + + + + + + + + + + + + + + +
      | VRAM of a single client | MINERU_HYBRID_BATCH_RATIO |
      | --- | --- |
      | <= 6 GB | 8 |
      | <= 4.5 GB | 4 |
      | <= 3 GB | 2 |
      | <= 2.5 GB | 1 |
      +
    • +
    +
  • +
  • +

    MINERU_HYBRID_FORCE_PIPELINE_ENABLE

    +
      +
    • 用于强制将 hybrid-* 后端中的 文本提取部分使用 小模型 进行处理
    • +
    • 默认为false,可通过环境变量设置为true来启用该功能,从而在某些极端情况下减少幻觉的发生。
    • +
    +
  • +
  • +

    MINERU_VL_MODEL_NAME

    +
      +
    • 用于指定 vlm/hybrid 后端使用的模型名称,这将允许您在同时存在多个模型的远程openai-server中指定 MinerU 运行所需的模型。
    • +
    +
  • +
  • +

    MINERU_VL_API_KEY

    +
      +
    • 用于指定 vlm/hybrid 后端使用的API Key,这将允许您在远程openai-server中进行身份验证。
    • +
    +
  • +
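The MINERU_HYBRID_BATCH_RATIO table in the list above maps a client's VRAM to a suggested ratio. The lookup can be sketched as follows; `suggested_batch_ratio` is a hypothetical helper, not a MinerU function. The table gives no value above 6 GB, so the sketch returns `None` in that case rather than guessing.

```python
# (upper VRAM limit in GB, suggested ratio), smallest limit first.
# Values are copied from the MINERU_HYBRID_BATCH_RATIO table.
_RATIO_TABLE = [(2.5, 1), (3.0, 2), (4.5, 4), (6.0, 8)]


def suggested_batch_ratio(vram_gb):
    """Return the table's ratio for a client with `vram_gb` GB of VRAM.

    Returns None above 6 GB, where the table does not specify a value.
    """
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    for limit, ratio in _RATIO_TABLE:
        if vram_gb <= limit:
            return ratio
    return None


for size in (2, 3, 4, 6, 8):
    print(size, "GB ->", suggested_batch_ratio(size))
```

The returned value would then be exported as `MINERU_HYBRID_BATCH_RATIO` before starting the client.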
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/index.html b/zh/usage/index.html new file mode 100644 index 00000000..5d9d9b35 --- /dev/null +++ b/zh/usage/index.html @@ -0,0 +1,1723 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 使用指南 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

使用指南

+

本章节提供了项目的完整使用说明。我们将通过以下几个部分,帮助您从基础到进阶逐步掌握项目的使用方法:

+

目录

+ +

开始使用

+

建议按照上述顺序阅读文档,这样可以帮助您更好地理解和使用项目功能。

+

如果您在使用过程中遇到问题,请查看 FAQ

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/model_source/index.html b/zh/usage/model_source/index.html new file mode 100644 index 00000000..e9fe7539 --- /dev/null +++ b/zh/usage/model_source/index.html @@ -0,0 +1,1935 @@ + + + + + + + + + + + + + + + + + + + + + + + + + 模型源配置 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

模型源说明

+

MinerU uses HuggingFace and ModelScope as model repositories; users can switch the model source or use local models as needed.

+
    +
  • HuggingFace 是默认的模型源,在全球范围内提供了优异的加载速度和极高稳定性。
  • +
  • ModelScope 是中国大陆地区用户的最佳选择,提供了无缝兼容的SDK模块,适用于无法访问HuggingFace的用户。
  • +
+

模型源的切换方法

+

通过命令行参数切换

+

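Currently only the mineru command-line tool supports switching the model source through a command-line argument; other tools such as mineru-api and mineru-gradio do not support it yet. +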

mineru -p <input_path> -o <output_path> --source modelscope
+
+

通过环境变量切换

+

在任何情况下可以通过设置环境变量来切换模型源,这适用于所有命令行工具和API调用。 +

export MINERU_MODEL_SOURCE=modelscope
+
+或 +
import os
+os.environ["MINERU_MODEL_SOURCE"] = "modelscope"
+
+
+

Tip

+

通过环境变量设置的模型源会在当前终端会话中生效,直到终端关闭或环境变量被修改。且优先级高于命令行参数,如同时设置了命令行参数和环境变量,命令行参数将被忽略。

+
+

使用本地模型

+

1. 下载模型到本地

+

mineru-models-download --help
+
+或使用交互式命令行工具选择模型下载: +
mineru-models-download
+
+
+

Note

+
    +
  • 下载完成后,模型路径会在当前终端窗口输出,并自动写入用户目录下的 mineru.json
  • +
  • 您也可以通过将配置模板文件复制到用户目录下并重命名为 mineru.json 来创建配置文件。
  • +
  • 模型下载到本地后,您可以自由移动模型文件夹到其他位置,同时需要在 mineru.json 中更新模型路径。
  • +
  • 如您将模型文件夹部署到其他服务器上,请确保将 mineru.json文件一同移动到新设备的用户目录中并正确配置模型路径。
  • +
  • 如您需要更新模型文件,可以再次运行 mineru-models-download 命令,模型更新暂不支持自定义路径,如您没有移动本地模型文件夹,模型文件会增量更新;如您移动了模型文件夹,模型文件会重新下载到默认位置并更新mineru.json
  • +
+
+
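Because the notes above depend on where `mineru.json` lives, it can help to check which file will be used before moving models around. The sketch below is an assumption-laden illustration: `mineru_config_path` is a hypothetical helper, it relies on the `MINERU_TOOLS_CONFIG_JSON` variable described in the command-line tools page, and it takes an override path as given. How MinerU itself treats a relative override is not stated in this document.

```python
import os
from pathlib import Path


def mineru_config_path(environ=None):
    """Return the config file location implied by the environment.

    Uses MINERU_TOOLS_CONFIG_JSON if set, otherwise mineru.json in the
    user's home directory. Sketch only; not MinerU's own resolution code.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("MINERU_TOOLS_CONFIG_JSON")
    if override:
        return Path(override).expanduser()
    return Path.home() / "mineru.json"


path = mineru_config_path()
print(path, "(exists)" if path.exists() else "(not found)")
```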

2. 使用本地模型进行解析

+

mineru -p <input_path> -o <output_path> --source local
+
+或通过环境变量启用: +
export MINERU_MODEL_SOURCE=local
+mineru -p <input_path> -o <output_path>
+
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/BISHENG/index.html b/zh/usage/plugin/BISHENG/index.html new file mode 100644 index 00000000..aba95cce --- /dev/null +++ b/zh/usage/plugin/BISHENG/index.html @@ -0,0 +1,1639 @@ + + + + + + + + + + + + + + + + + + + + + BISHENG 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

BISHENG 简介

+

BISHENG毕昇 是一款开源 LLM应用开发平台,主攻企业场景, 已有大量行业头部组织及世界500强企业在使用。“毕昇”是活字印刷术的发明人,活字印刷术为人类知识的传递起到了巨大的推动作用。BISHENG毕昇团队希望“BISHENG毕昇”同样能够为智能应用的广泛落地提供有力支撑。

+

+
    +
  • 官网地址:https://bisheng.dataelem.com/
  • +
  • MinerU plugin project in BISHENG: https://github.com/dataelement/bisheng/pulls
  • +
+

特别鸣谢 @pzc163

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/Cherry_Studio/index.html b/zh/usage/plugin/Cherry_Studio/index.html new file mode 100644 index 00000000..a66676d1 --- /dev/null +++ b/zh/usage/plugin/Cherry_Studio/index.html @@ -0,0 +1,1759 @@ + + + + + + + + + + + + + + + + + + + + + Cherry Studio 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Cherry Studio 简介

+

Cherry Studio 是一款功能强大的多模型 AI 客户端软件,支持 Windows、macOS 和 Linux 等多平台运行,集成了 OpenAI、DeepSeek、Gemini、Anthropic 等主流 AI 云服务,同时支持本地模型运行,用户可以灵活切换不同的AI模型。

+

目前,MinerU 强大的文档解析能力已深度集成到 Cherry Studio 的知识库与对话交互中,为用户带来更便捷的文档处理与信息获取体验。

+

img

+
    +
  • Cherry Studio 官网地址:https://www.cherry-ai.com/
  • +
+

MinerU 在 Cherry Studio 中的使用方法

+

进入 Cherry Studio 设置

+

a. 打开 Cherry Studio 应用程序

+

b. 点击左下角的"设置"按钮,进入设置页面

+

c. 在左侧菜单中,选择"MCP 服务器"

+

在右侧的 MCP 服务器配置界面中,您可以看到已有的 MCP 服务器列表。点击右上角的"添加服务器"按钮来创建新的 MCP 服务,或者点击现有服务来编辑配置。

+

添加 MinerU-MCP 配置

+

点击"添加服务器"后,您将看到一个配置表单。请按以下步骤填写:

+

a. 名称:输入"MinerU-MCP"或您喜欢的其他名称

+

b. 描述:可选,如"文档转换为Markdown工具"

+

c. 类型:选择"标准输入/输出(stdio)"

+

d. 命令:输入 uvx

+

e. 参数:输入 mineru-mcp

+

f. 环境变量:添加以下环境变量

+
MINERU_API_BASE=https://mineru.net
+MINERU_API_KEY=您的API密钥
+OUTPUT_DIR=./downloads
+USE_LOCAL_API=false
+LOCAL_MINERU_API_BASE=http://localhost:8888
+
+

使用 uvx 命令可以自动处理 mineru-mcp 的安装和运行,无需预先手动安装 mineru-mcp 包。这是最简单的配置方式。

+

保存配置

+

确认无误后,点击界面右上角的"保存"按钮完成配置。保存后,MCP 服务器列表中会显示您刚刚添加的 MinerU-MCP 服务。

+

img

+

img

+

使用 Cherry Studio 中的 MinerU MCP

+

一旦配置完成,您可以在 Cherry Studio 中的对话中使用 MinerU MCP 工具。在 Cherry Studio 中,您可以使用如下提示让模型调用 MinerU MCP 工具。模型会自动识别任务并调用相应的工具。

+

示例 1: 使用 URL 转换文档

+

用户输入:

+
请使用 MinerU MCP 将以下 URL 的 PDF 文档转换为 Markdown 格式:https://example.com/sample.pdf
+
+

模型将执行的步骤:

+

模型识别这是文档转换任务,并调用 parse_documents 工具,参数为:

+
{"file_sources": "https://example.com/sample.pdf"}
+
+

工具处理完成后,模型会告知您转换结果。

+

img

+

示例 2: 转换本地文档

+

用户输入:

+
请使用 MinerU-MCP 将本地的 D://sample.pdf 文件转换为 Markdown 格式
+
+

模型将执行的步骤:

+

模型识别这是本地文档转换任务,调用 parse_documents 工具,参数为:

+
{"file_sources": "D://sample.pdf"}
+
+

img

+

示例 3: 启用 OCR 处理扫描文档

+

用户输入:

+
请使用 MinerU-MCP 将以下 URL 的扫描版 PDF 文档转换为 Markdown 格式,并启用 OCR 功能:
+https://example.com/scanned.pdf
+
+

模型将执行的步骤:

+

模型识别这是需要 OCR 处理的文档转换任务,调用 parse_documents 工具,并启用 OCR 参数:

+
{"file_sources": "https://example.com/scanned.pdf","enable_ocr": true}
+
+

img

+

示例 4: 完整对话流程

+

以下是一个完整的对话流程示例:

+

用户:

+
我有一份学术论文的 PDF,网址是 https://arxiv.org/pdf/2303.08774.pdf,能帮我转换成 Markdown 格式吗?
+
+

模型:

+
我可以帮您将这份学术论文转换为 Markdown 格式。我将使用 MinerU-MCP 工具来处理这个任务。
+[模型调用 parse_documents 工具,处理过程中...]
+论文已成功转换为 Markdown 格式。转换结果如下:
+# GPT-4 Technical Report
+## Abstract
+We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs...
+[显示论文内容...]
+
+

img

+

工具参数详解

+

在使用过程中,模型会根据您的指令自动选择合适的工具和参数。以下是主要工具的参数说明:

+

parse_documents 工具参数

+

img

+

get_ocr_languages 工具参数

+

无需参数,用于获取OCR支持的语言列表。

+

高级用法

+

指定语言和页码范围

+

用户输入:

+
请使用 MinerU MCP 将以下 URL 的文档转换为 Markdown 格式,只处理第 5-10 页,并指定语言为中文:https://example.com/document.pdf
+
+

模型会使用 parse_documents 工具,并设置 language 参数为 "ch",page_ranges 参数为 "5-10"。

+

批量处理多个文档

+

用户输入:

+
请使用 MinerU-MCP 将以下多个 URL 的文档转换为 Markdown 格式:
+https://example.com/doc1.pdf
+https://example.com/doc2.pdf
+https://example.com/doc3.pdf
+
+

模型会调用 parse_documents 工具,并将多个 URL 以逗号分隔传入 file_sources 参数。
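The `parse_documents` arguments used in the examples above (`file_sources`, `enable_ocr`, `language`, `page_ranges`) can be assembled as in the sketch below. The helper name `parse_documents_args` is hypothetical, and only the parameters that appear in this page are included; the model normally builds this payload itself, so the code is meant to show the shape of the arguments.

```python
import json


def parse_documents_args(sources, enable_ocr=False, language=None, page_ranges=None):
    """Build the argument object for the parse_documents tool.

    Multiple sources are joined with commas, as described for batch processing.
    """
    if not sources:
        raise ValueError("at least one file source is required")
    args = {"file_sources": ",".join(sources)}
    if enable_ocr:
        args["enable_ocr"] = True
    if language:
        args["language"] = language
    if page_ranges:
        args["page_ranges"] = page_ranges
    return args


batch = parse_documents_args([
    "https://example.com/doc1.pdf",
    "https://example.com/doc2.pdf",
    "https://example.com/doc3.pdf",
])
print(json.dumps(batch))
print(json.dumps(parse_documents_args(
    ["https://example.com/document.pdf"], language="ch", page_ranges="5-10")))
```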

+

注意事项

+

● 当设置 USE_LOCAL_API=true 时,使用本地配置的API进行解析

+

● 当设置 USE_LOCAL_API=false 时,会使用 MinerU 官网的API进行解析

+

● 处理大型文档可能需要较长时间,请耐心等待

+

● 如果遇到超时问题,请考虑分批处理文档或使用本地API模式
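The `USE_LOCAL_API` switch described above chooses between the two base URLs configured earlier. The sketch below illustrates that described behavior only; it is not the mineru-mcp implementation, `select_api_base` is a hypothetical name, and treating any value other than "true" as false is an assumption.

```python
def select_api_base(env):
    """Pick the API base URL according to USE_LOCAL_API.

    `env` is a mapping holding the variables from the MCP configuration.
    """
    use_local = env.get("USE_LOCAL_API", "false").strip().lower() == "true"
    key = "LOCAL_MINERU_API_BASE" if use_local else "MINERU_API_BASE"
    return env[key]


config = {
    "MINERU_API_BASE": "https://mineru.net",
    "LOCAL_MINERU_API_BASE": "http://localhost:8888",
    "USE_LOCAL_API": "false",
}
print(select_api_base(config))
```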

+

常见问题与解决方案

+

无法启动 MCP 服务

+

问题:运行 uv run -m mineru.cli时报错。

+

解决方案

+

● 确保已激活虚拟环境

+

● 检查是否已安装所有依赖

+

● 尝试使用 python -m mineru.cli命令替代

+

文件转换失败

+

问题:文件上传成功但转换失败。

+

解决方案

+

● 检查文件格式是否受支持

+

● 确认API密钥是否正确

+

● 查看MCP服务日志获取详细错误信息

+

文件路径问题

+

问题:使用 parse_documents 工具处理本地文件时报找不到文件错误。

+

解决方案:请确保使用绝对路径,或者相对于服务器运行目录的正确相对路径。

+

MCP 服务调用超时问题

+

问题:调用 parse_documents 工具时出现 Error calling tool 'parse_documents': MCP error -32001: Request timed out 错误。

+

解决方案:这个问题常见于处理大型文档或网络不稳定的情况。在某些 MCP 客户端(如 Cursor)中,超时后可能导致无法再次调用 MCP 服务,需要重启客户端。最新版本的 Cursor 中可能会显示正在调用 MCP,但实际上没有真正调用成功。建议:

+

● 等待官方修复:这是Cursor客户端的已知问题,建议等待Cursor官方修复

+

● 处理小文件:尽量只处理少量小文件,避免处理大型文档导致超时

+

● 分批处理:将多个文件分成多次请求处理,每次只处理一两个文件

+

● 增加超时时间设置(如果客户端支持)

+

● 对于超时后无法再次调用的问题,需要重启 MCP 客户端

+

● 如果反复出现超时,请检查网络连接或考虑使用本地 API 模式

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/Coze/index.html b/zh/usage/plugin/Coze/index.html new file mode 100644 index 00000000..af397639 --- /dev/null +++ b/zh/usage/plugin/Coze/index.html @@ -0,0 +1,1686 @@ + + + + + + + + + + + + + + + + + + + + + Coze 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Coze 简介

+

Coze(中文版名称:扣子) 是字节跳动推出的零代码 AI 应用开发平台。无论用户是否有编程经验,都可以通过该平台快速创建各种类型的聊天机器人、智能体、AI 应用和插件,并将其部署在社交平台和即时聊天应用程序中。

+

目前,MinerU 插件已在 Coze 插件商店上线,通过其强大的文档解析能力,为用户搭建智能体与工作流提供文档解析能力,加快用户 AI 应用的开发。

+

img

+
    +
  • 扣子官网地址:https://www.coze.cn/
  • +
  • MinerU 扣子插件下载地址:https://www.coze.cn/store/plugin/7527957359730360354
  • +
+

MinerU 在 Coze 中的使用方法

+

Coze:集成应用

+
    +
  • 进入 https://www.coze.cn/home coze 开发平台
  • +
+

智能体

+

工作空间 -> 项目开发 -> 创建 -> 创建智能体 -> 创建 -> 输入项目名

+

img

+

img

+

插件配置 -> 添加 插件 -> 搜索 MinerU

+

img

+

添加 parse_file 工具(在线版)

+

img

+

选择 MinerU 插件 -> 编辑参数 -> 填写 api key

+

img

+

img

+
+

记得关闭 url 和 token 显示

+
+

调试 智能体

+

img

+

工作流

+
+

Use MinerU in a workflow

+
+

工作流 -> 创建工作流

+

img

+

img

+

工作流插件配置 -> 添加 插件 -> 搜索 MinerU -> 添加

+

img

+

img

+

选择MinerU 插件 -> 编辑参数 -> 填写 api key

+

img

+

选择开始节点 -> 配置 input 类型为文件类型 -> 连接到 mineru 节点

+

img

+

img

+

选择结束节点 -> 连接到 mineru 节点 -> 配置 output 输出为 mineru 节点的 parse_file.text

+

img

+

img

+

上传文件 -> 试运行

+

img

+

img

+

发布 -> 添加到当前智能体

+

img

+

img

+

移除 mineru 插件 -> 调试

+

img

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/DataFlow/index.html b/zh/usage/plugin/DataFlow/index.html new file mode 100644 index 00000000..a31ab2a6 --- /dev/null +++ b/zh/usage/plugin/DataFlow/index.html @@ -0,0 +1,1640 @@ + + + + + + + + + + + + + + + + + + + + + 元枢智汇 ADP 智能数据平台 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

元枢智汇 ADP 智能数据平台 简介

+

元枢智汇 ADP 智能数据平台基于自研 AI 数据库和 DataFlow数据准备框架打造,旨在帮助企业高效管理、检索、处理海量数据,并通过体系化、自动化数据治理降低模型/智能体训练的专业门槛,帮助企业结合业务场景发挥私有数据的价值,真正落地AI应用。

+

目前,MinerU 已深度集成于元枢智汇 ADP 智能数据平台的 DataFlow 模块中,其数据解析服务由文档语料提取引擎 MinerU 提供支持。

+

+

+
    +
  • 官网地址:https://adp.originhub.tech/agent
  • +
  • MinerU FastGPT plugin download: https://cloud.fastgpt.io/dashboard/systemPlugin?type=productivity
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/Dify/index.html b/zh/usage/plugin/Dify/index.html new file mode 100644 index 00000000..076c22f1 --- /dev/null +++ b/zh/usage/plugin/Dify/index.html @@ -0,0 +1,1733 @@ + + + + + + + + + + + + + + + + + + + + + Dify 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Dify 简介

+

Dify 是一个开源的大语言模型(LLM)应用开发平台,旨在简化和加速生成式 AI 应用的创建和部署。它结合了后端即服务(BaaS)和 LLMOps 的理念,为开发者提供了用户友好的界面和强大的工具,有效降低了 AI 应用开发的门槛。

+

目前 MinerU 与 Dify 联合研发的 MinerU 插件已在 Dify 市场上架,帮助用户搭建工作流,提供文档解析的工作。

+

img

+
    +
  • Dify 官网地址:https://dify.ai/zh
  • +
  • MinerU Dify 插件下载地址:https://marketplace.dify.ai/plugins/langgenius/mineru
  • +
+

MinerU 在 Dify 中的使用方法

+

一、新版MinerU Dify插件亮点 (v0.4.0)

+
    +
  • 完美适配MinerU2:全面兼容MinerU2的最新功能,释放顶尖的文档解析能力。
  • +
  • 超高灵活性:同时支持官方在线API和本地化部署的API(并向下兼容 1.x 版本)。
  • +
  • 赋能工作流:让Dify的Agent拥有强大的文档“读写”能力,轻松处理复杂任务。
  • +
+

二、实战演练:两个案例带你快速上手

+

空谈不如实战。下面我们通过两个典型场景,向你展示新版插件的强大之处。

+

准备

+
    +
  1. +

    在Dify插件页面安装MinerU插件(私有化部署的Dify同理)

    +
  2. +
  3. +

    填写API URL等信息

    +
  4. +
+

img

+

使用官方API时令牌(Token)必须提供👆,使用本地部署API时令牌可不填写👇

+

img

+

案例一:解析单文件,搭建Chat PDF应用

+

想借助AI与你的文档对话吗?跟着下面几步,轻松实现

+

第一步:创建空白应用,选择“Chatflow”

+

输入应用名称与描述

+

img

+

第二步:创建的初始模板中,选择“开始”节点

+

字段类型选为单文件,填写变量名称(此处填为input_file),支持文档类型选为文档与图片

+

img

+

第三步:添加工具节点——MinerU插件来解析上一步开始节点上传的文件

+

img

+

第四步:设置MinerU的输入变量,选择上一步开始节点添加的 input_file

+

img

+

第五步:配置LLM模型

+

选择“LLM”节点后,如果没有模型可用,需要单独在插件市场安装(这里使用 Deepseek作为示例)

+

“上下文”选择MinerU的输出变量 text(MinerU解析文档后的markdown格式)

+

img

+

在“SYSTEM”区域根据实际需求填写提示词,可如图填写“在Parse File text中提取用户的问题答案”

+

img

+

第六步:预览,上传文件并提问机器人关于文档的内容

+

至此一个简单的文档问答应用Chat PDF搭建完成,点击“预览”,查看效果如何👇

+

img

+

结果如下:

+

img

+

第七步:发布与测试

+

保存并发布你的应用。现在,上传一份PDF或图片,你就可以和它自由对话了!

+

img

+

案例二:自动化批量处理文档,并上传至云端S3

+

需要处理大量文档并归档?MinerU 插件同样能胜任

+

第一步:安装 botos3 插件

+

img

+

第二步:配置 S3 bucket

+

img

+

第三步:创建工作流

+

选择字段类型为“文件列表”,填写变量名称(此处填为input_files),支持的文档类型选为文档与图片

+

img

+

第四步:添加“迭代”

+

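Add an "Iteration" node after the "Start" node and configure the MinerU node inside it. Set the iteration input to the Start node's upload_files. Leave the output empty for now; once the whole iteration is configured, select full_zip_url from the MinerU Parse File node.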

+

img

+

将MinerU的输入参数file选择为迭代器的 item

+

img

+

img

+

第五步:增加中间节点“代码执行”来转换MinerU的解析结果

+

输入变量(变量名称需与代码定义一致)

+
    +
  • text:选择MinerU Parse File的输出变量text
  • +
  • uploadFiles:选择“开始”节点的文件列表upload_files,用来根据迭代的index索引下标找到对应的原始文件名
  • +
  • index:迭代的下标索引,选择迭代器的index
  • +
+

输出变量(变量名称需与代码定义一致)

+
    +
  • fileName:String
  • +
  • base64:String
  • +
+

img

+

代码选择JavaScript,编写转换代码:

+

暂时无法在飞书文档外展示此内容

+

以下为Python版本:

+

暂时无法在飞书文档外展示此内容
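The original conversion code for this step is not available in this page. As a stand-in, here is a minimal sketch of what such a code node could look like, using the input names (`text`, `uploadFiles`, `index`) and output names (`fileName`, `base64`) listed above. It rests on assumptions: that the code node calls a function named `main` with the inputs as keyword arguments, and that each entry of `uploadFiles` exposes the original file name under `filename`. Both should be checked against the Dify version in use.

```python
import base64
from pathlib import PurePosixPath


def main(text: str, uploadFiles: list, index: int) -> dict:
    """Turn MinerU's Markdown output into a named, base64-encoded payload."""
    original = uploadFiles[index]
    # Assumption: the original name is available as "filename",
    # either as a dict key or as an attribute.
    if isinstance(original, dict):
        name = original["filename"]
    else:
        name = getattr(original, "filename")
    stem = PurePosixPath(name).stem  # drop the last extension, e.g. ".pdf"
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return {"fileName": f"{stem}.md", "base64": encoded}


print(main("# Title", [{"filename": "report.pdf"}], 0)["fileName"])
```

The `fileName` output feeds the S3 object key and the `base64` output feeds the file content in the upload step that follows.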

+

第六步:配置 Botos3 插件来上传内容

+

添加工具节点Botos3,选择“通过s3上传base64”

+

img

+

文件base64选择代码执行(图中为转换MINERU MD文本)输出的base64字段

+

img

+

S3对象key,S3 对象key填写文件存储的路径,在 botos3 插件配置界面已经填写了 bucket 名称,这里只需要填写在bucket下存储的目录即可。选择代码执行(图中为转换MINERU MD文本)fileName

+

img

+

第七步:预览效果

+

连接结束节点,至此,一个简单的上传到s3的工作流配置完成,点击“运行”看看效果👇:

+

img

+

img

+

第八步:Vis3查看文档

+

运行结束,可通过vis3来查看S3桶内是否已上传解析后的md文件,Vis3使用可参考

+

新工具开源!Vis3大模型数据可视化利器:填 AK/SK 直接预览 S3 数据,JSON/视频/图片秒开!本地文件也可用

+

img

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/DingTalk/index.html b/zh/usage/plugin/DingTalk/index.html new file mode 100644 index 00000000..6c40e236 --- /dev/null +++ b/zh/usage/plugin/DingTalk/index.html @@ -0,0 +1,1639 @@ + + + + + + + + + + + + + + + + + + + + + 钉钉简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

钉钉简介

+

钉钉(DingTalk)是阿里巴巴集团打造的企业级智能移动办公平台,是数字经济时代的企业组织协同办公和应用开发平台。钉钉整合了 IM 即时沟通、钉钉文档、钉闪会、钉盘、Teambition、OA审批、智能人事、钉工牌、工作台等功能,旨在实现简单、高效、安全、智能的数字化工作方式。它支持企业组织数字化和业务数字化,覆盖“人、财、物、事、产、供、销、存”的全链路管理。

+

通过钉钉开放平台上的SaaS软件,企业可低成本搭建数字化应用,整合所有数字化系统。此外,钉钉提供超过2000个API接口,为企业数字化转型提供开放兼容环境。不会代码的用户也可利用低代码工具构建CRM、ERP、OA、项目管理、进销存等系统。

+

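DingTalk Docs, AI Tables, and other DingTalk products have already integrated MinerU deeply, and document parsing is offered to ecosystem developers through the open platform, which provides a solid technical and scenario foundation for the joint development of DLU.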

+

+
    +
  • 钉钉官网:https://www.dingtalk.com/
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/FastGPT/index.html b/zh/usage/plugin/FastGPT/index.html new file mode 100644 index 00000000..3a89d01e --- /dev/null +++ b/zh/usage/plugin/FastGPT/index.html @@ -0,0 +1,1640 @@ + + + + + + + + + + + + + + + + + + + + + FastGPT 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

FastGPT 简介

+

FastGPT 是一个基于 LLM 大语言模型的知识库问答系统,将智能对话与可视化编排完美结合,让 AI 应用开发变得简单自然。无论您是开发者还是业务人员,都能轻松打造专属的 AI 应用。

+

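Currently, the MinerU plugin is available in FastGPT as a system plugin. Its document parsing capability supports users who build agents and workflows and speeds up AI application development.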

+

img

+

img

+
    +
  • 官网地址:https://fastgpt.cn
  • +
  • MinerU FastGPT plugin download: https://cloud.fastgpt.io/dashboard/systemPlugin?type=productivity
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/ModelWhale/index.html b/zh/usage/plugin/ModelWhale/index.html new file mode 100644 index 00000000..80b9d9f7 --- /dev/null +++ b/zh/usage/plugin/ModelWhale/index.html @@ -0,0 +1,1641 @@ + + + + + + + + + + + + + + + + + + + + + ModelWhale 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

ModelWhale 简介

+

ModelWhale是一款高效率的数据科学云端协作工具,为数据工作者提供了即开即用的云端分析环境,Jupyter Notebook 交互式和Canvas 拖拽式两种分析界面,帮助科研者、教育工作者解决底层工程繁复、数据难以安全应用、成果流转复现困难等问题。基于不同使用场景,ModelWhale 为用户提供三个产品版本,分别是基础版、专业版、团队版。

+

Currently, the MinerU plugin is available in ModelWhale. Its document parsing capability supports users who build agents and workflows and speeds up AI application development.

+

images/DingTalk_01.png

+

+

+
    +
  • ModelWhale website: https://www.modelwhale.com/pricing?scroll=1
  • +
  • MinerU 在ModelWhale 的使用地址:https://www.heywhale.com/org/7b38d/workspace/iframe?url=https://www.heywhale.com/api/model/services/68089d360b1519a862ccb9b4/app/
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/RagFlow/index.html b/zh/usage/plugin/RagFlow/index.html new file mode 100644 index 00000000..731b3182 --- /dev/null +++ b/zh/usage/plugin/RagFlow/index.html @@ -0,0 +1,1697 @@ + + + + + + + + + + + + + + + + + + + + + RagFlow - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

RagFlow

+ +

RAGFlow

+

RAGFlow 是一款开源 RAG(Retrieval-Augmented Generation)引擎与应用平台,深度融合了深度文档理解、自动化 RAG 工作流与大模型调用,打通了复杂数据处理、知识检索、增强生成的全流程,旨在为企业及开发者提供一站式智能问答开发服务,并支持各类复杂场景下大模型的构建与应用落地。

+

目前,MinerU 已深度集成至 RAGFlow 知识库在线版本,作为内置 PDF 文档解析器,为用户知识库搭建提供专业、可靠的文档解析支持。本地部署版本部署使用方式详见下方使用教程。

+

使用可访问:https://demo.ragflow.io/

+

img

+

使用教程:如何在 RAGFlow 中使用 MinerU

+

一、安装配置

+

首先,我们建议您通过 docker 的形式在本地部署 RagFlow 以方便使用 MinerU 插件作为解析工具。在安装完 RagFlow 后执行:

+
    +
  1. 版本检查:
  2. +
+

确保你的RAGFlow版本 >= v0.21.1

+
    +
  1. 更新 .env 文件:
  2. +
+

为了确保服务能被平稳修改,建议先在 cmd 运行 docker compose down 停掉服务。

+

打开 .env 文件,在文件的末尾,添加这两行代码,保存文件。

+
HF_ENDPOINT=https://hf-mirror.com
+MINERU_EXECUTABLE=/ragflow/uv_tools/.venv/bin/mineru
+
+
    +
  1. 启动并进入容器:
  2. +
+

cmd 中,重新启动服务:docker compose up -d

+

等待服务全部 RunningHealthy 后,运行以下命令进入RAGFlow的核心容器:

+
docker compose exec ragflow-cpu bash
+
+

(Your command prompt will change from C:\...> to root@...)

+
    +
  1. +

    Install MinerU inside the container:

    +

    在容器内部,依次运行以下 5 条命令

    +
  2. +
+
mkdir uv_tools
+cd uv_tools
+uv venv .venv
+source .venv/bin/activate
+uv pip install -U "mineru[core]" -i https://mirrors.aliyun.com/pypi/simple
+
+
    +
  1. 退出并重启:
  2. +
+

安装完成后,输入 exit 并按回车。

+

运行重启命令,让 RAGFlow 加载刚装好的 MinerU

+
docker compose restart ragflow-cpu
+
+

二、使用入口

+

在本地部署完毕后,要启用 MinerU,您需要在进入 RagFlow 特定知识库的配置页面并选择 MinerU 作为默认的 PDF 解析器。(注:RagFlow 在线版中已经内置了 MinerU 插件为您提供了高级的 PDF 文件解析能力,使用方式与此一致。)

+

入口和配置步骤:

+
    +
  1. 进入知识库配置:
  2. +
  3. 首先,在您的知识库管理界面,选择您需要配置的特定知识库(例如图示中的 "content" 知识库)。
  4. +
  5. 在知识库详情页面的左侧导航栏中,点击【配置】选项卡。
  6. +
  7. 定位 PDF 解析器设置:
  8. +
  9. 向下滚动页面,找到“Ingestion pipeline”(摄取管道)设置部分。
  10. +
  11. 在此部分中,您会看到一个名为【PDF解析器】(PDF Parser)的选项。
  12. +
  13. 选择 MinerU:
  14. +
  15. 点击【PDF解析器】旁边的下拉菜单。
  16. +
  17. 从可用选项中,选择【MinerU】。
  18. +
  19. 保存修改:
  20. +
  21. 完成选择后,请务必点击页面底部的【保存】按钮,以使更改生效。
  22. +
+

img

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/Sider/index.html b/zh/usage/plugin/Sider/index.html new file mode 100644 index 00000000..59c76026 --- /dev/null +++ b/zh/usage/plugin/Sider/index.html @@ -0,0 +1,1639 @@ + + + + + + + + + + + + + + + + + + + + + Sider 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Sider 简介

+

Sider 是一款浏览器侧边栏类的 AI 助手扩展,主要在网页右侧开启一个“随处可用”的智能面板,将对话式 AI(如 GPT、Claude、Gemini 等)带到你正在浏览的任何页面中。它的核心定位是:提升阅读、写作、翻译、检索与总结效率,并与网页内容深度联动。

+

目前,Sider在 Wisebase 模块中深度集成了 MinerU 的相关功能。该模块是一个由AI驱动的知识库,您可以通过上传 PDF 等各类型文件,构建个人图书馆以实现高效的知识管理,MinerU 可以帮助您更好地解析此类文件,精准地提取文件中的信息。

+

img

+
    +
  • Sider 官网地址:https://sider.ai/zh-CN/chat
  • +
  • 使用集成 MinerU 相关功能的 Sider 地址:https://sider.ai/zh-CN/wisebase
  • +
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/plugin/n8n/index.html b/zh/usage/plugin/n8n/index.html new file mode 100644 index 00000000..65ed215b --- /dev/null +++ b/zh/usage/plugin/n8n/index.html @@ -0,0 +1,1661 @@ + + + + + + + + + + + + + + + + + + + + + n8n 简介 - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Introduction to n8n

+

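n8n is an application development platform centered on low-code workflow automation. Many enterprises use its flexible node configuration to automate business processes. Through a visual interface and code extensibility, it helps users connect applications and services and build complex automated workflows, lowering the barrier to entry.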

+

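MinerU's document parsing capability is now packaged as an n8n node, so users can handle complex document parsing tasks more conveniently when building workflows.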

+

img

+
    +
  • n8n official website: https://n8n.io/
  • +
  • MinerU n8n plugin download: https://www.npmjs.com/package/n8n-nodes-mineru
  • +
+

Using MinerU in n8n

+

Step 1: Open the community node installation page

+

img

+

Step 2: Install the n8n-nodes-mineru node

+

img (assets/images/n8n_2.png)

+

Step 3: Create a new workflow, add the n8n-nodes-mineru node, and set the API key

+

img

+

img

+

img

+

img

+

n8n node usage documentation

+

https://www.npmjs.com/package/n8n-nodes-mineru

+

Integrating decompression into a workflow

+

Import the JSON template

+


+

img

+

Configure the credentials and the document URL

+

img

+

Configure the outputs you need

+

img

+

Debug

+

img

+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file diff --git a/zh/usage/quick_usage/index.html b/zh/usage/quick_usage/index.html new file mode 100644 index 00000000..0ece0c7e --- /dev/null +++ b/zh/usage/quick_usage/index.html @@ -0,0 +1,1973 @@ + + + + + + + + + + + + + + + + + + + + + + + + + Basic Usage - MinerU + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + Skip to content + + +
+
+ +
+ + + + +
+ + +
+ +
+ + + + + + + + + +
+
+ + + +
+
+
+ + + + + + + + + +
+
+
+ + + + +
+
+ + + + + + + +

Using MinerU

+

Quick model source configuration

+

MinerU uses huggingface as its default model source. If your network cannot access huggingface, you can switch the model source to modelscope with an environment variable: +

export MINERU_MODEL_SOURCE=modelscope
+
+For more information on model source configuration and custom local model paths, see the Model Source Documentation. +
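
For scripts that launch MinerU from Python, the same switch can be made by setting the variable in the environment passed to the child process. This is a minimal sketch: the helper name `env_with_model_source` is hypothetical and not part of MinerU, and the accepted values are only those mentioned on this page (`huggingface` as the default source, `modelscope`, and `local`).

```python
import os


def env_with_model_source(source: str = "modelscope") -> dict:
    """Return a copy of the current environment with MINERU_MODEL_SOURCE set.

    Hypothetical helper for illustration. The variable name comes from the
    documentation above; the allowed set is an assumption based on the
    sources this page mentions.
    """
    allowed = {"huggingface", "modelscope", "local"}
    if source not in allowed:
        raise ValueError(f"unsupported model source: {source!r}")
    env = os.environ.copy()  # do not mutate the parent process environment
    env["MINERU_MODEL_SOURCE"] = source
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when starting any MinerU command.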

Quick usage via the command line

+

MinerU has a built-in command-line tool for quick PDF parsing: +

mineru -p <input_path> -o <output_path>
+
+
+

Tip

+
    +
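  • <input_path>: a local PDF/image file or directory
  • +
  • <output_path>: the output directory
  • +
  • Without --api-url, the CLI automatically starts a temporary local mineru-api
  • +
  • With --api-url, the CLI connects directly to a remote or existing local FastAPI service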
  • +
+

For more information about the output files, see the Output File Documentation.
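
When the CLI is driven from a script, the invocation above can be assembled as an argument list. The sketch below uses only the `-p`, `-o`, and `--api-url` options described in this section; the function name `build_mineru_command` is hypothetical.

```python
from typing import List, Optional


def build_mineru_command(input_path: str, output_path: str,
                         api_url: Optional[str] = None) -> List[str]:
    """Build the argument list for `mineru -p <input_path> -o <output_path>`.

    Hypothetical helper for illustration. When api_url is None the flag is
    omitted, which per the tip above lets the CLI start a temporary local
    mineru-api by itself.
    """
    cmd = ["mineru", "-p", input_path, "-o", output_path]
    if api_url is not None:
        # Connect to a remote or already running local FastAPI service.
        cmd += ["--api-url", api_url]
    return cmd
```

The resulting list can be run with `subprocess.run(cmd, check=True)`; passing a list instead of a shell string avoids quoting problems with paths that contain spaces.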

+
+
+

Note

+

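The command-line tool automatically attempts cuda/mps acceleration on Linux and macOS. Windows users who need cuda acceleration +should visit the PyTorch website and choose the command matching their cuda version to install acceleration-enabled builds of torch and torchvision.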

+
+

To adjust parsing options with custom parameters, see the more detailed Command Line Tools Usage Instructions in the documentation.

+

Advanced usage via API, WebUI, and http-client/server

+
    +
  • Call directly through the Python API: Python usage example
  • +
  • Call through FastAPI: +
    mineru-api --host 0.0.0.0 --port 8000
    +
    +

    Tip

    +

    Visit http://127.0.0.1:8000/docs in a browser to view the API documentation.

    +
      +
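    • Health check endpoint: GET /health + returns service information such as protocol_version, processing_window_size, and max_concurrent_requests
    • +
    • Asynchronous task submission endpoint: POST /tasks
    • +
    • Synchronous parsing endpoint: POST /file_parse
    • +
    • Task query endpoints: GET /tasks/{task_id} and GET /tasks/{task_id}/result
    • +
    • The API output directory is fixed by the server; by default output is written to ./output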
    • +
    +

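    POST /tasks returns a task_id immediately. POST /file_parse submits to the same task manager internally, waits for the task to complete, and then returns the final result synchronously. +Tasks are kept as in-process state in a single process, so historical task status is not guaranteed to remain queryable after a service restart, a --reload hot reload, or a multi-process deployment. +By default, a task is retained for 24 hours after it completes or fails, after which its state and output directory are cleaned up automatically; requesting the task status or result after cleanup returns 404. +The retention duration and the cleanup polling interval can be adjusted with the environment variables MINERU_API_TASK_RETENTION_SECONDS and MINERU_API_TASK_CLEANUP_INTERVAL_SECONDS.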

    +

    Asynchronous task submission example: +

    curl -X POST http://127.0.0.1:8000/tasks \
    +  -F "files=@demo/pdfs/demo1.pdf" \
    +  -F "return_md=true"
    +
    +

    Synchronous parsing example: +

    curl -X POST http://127.0.0.1:8000/file_parse \
    +  -F "files=@demo/pdfs/demo1.pdf" \
    +  -F "return_md=true" \
    +  -F "response_format_zip=true" \
    +  -F "return_original_file=true"
    +
    +

    Poll task status and result: +

    curl http://127.0.0.1:8000/tasks/<task_id>
    +curl http://127.0.0.1:8000/tasks/<task_id>/result
    +curl http://127.0.0.1:8000/health
    +
    +
    +
  • +
  • +

    Start the Gradio WebUI visual frontend: +

    mineru-gradio --server-name 0.0.0.0 --server-port 7860
    +
    +
    +

    Tip

    +
      +
    • Visit http://127.0.0.1:7860 in a browser to use the Gradio WebUI.
    • +
    +
    +
  • +
  • +

    Call via http-client/server: +

    # Start the OpenAI-compatible server (requires a vllm or lmdeploy environment)
    +mineru-openai-server --port 30000
    +
    +
    +

    Tip

    +

    In another terminal, connect to the OpenAI server through the HTTP client +

    mineru -p <input_path> -o <output_path> -b hybrid-http-client -u http://127.0.0.1:30000
    +
    +
    +
  • +
+
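
The asynchronous flow of the FastAPI service described above (submit with POST /tasks, then poll GET /tasks/{task_id} and fetch GET /tasks/{task_id}/result) can be sketched in Python as follows. This is an illustration under stated assumptions, not MinerU's client: the function names are hypothetical, responses are assumed to be JSON, and because this page does not document the status payload, the caller supplies the `is_finished` predicate.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str) -> dict:
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_task(base_url: str, task_id: str,
                  is_finished: Callable[[dict], bool],
                  fetch: Callable[[str], dict] = fetch_json,
                  interval: float = 2.0,
                  timeout: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll GET /tasks/{task_id} until is_finished(status) is true,
    then return the body of GET /tasks/{task_id}/result.

    Hypothetical helper: the shape of the status payload is not given in
    this document, so the completion check is left to the caller.
    """
    status_url = f"{base_url.rstrip('/')}/tasks/{task_id}"
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(status_url)
        if is_finished(status):
            return fetch(f"{status_url}/result")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(interval)  # wait before the next status request
```

Because task state is held in a single process and removed after the retention period, a client like this should treat a 404 (raised by `urllib` as `HTTPError`) as "task unknown or already cleaned up" rather than retrying indefinitely.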
+

Note

+

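All parameters officially supported by vllm/lmdeploy can be passed to MinerU as command-line arguments of the following commands: mineru, mineru-openai-server, mineru-gradio, and mineru-api. +We have compiled commonly used vllm/lmdeploy parameters and their usage in the Advanced Command Line Parameters documentation.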

+
+

Extending MinerU with a configuration file

+

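MinerU works out of the box, but it also supports extending functionality through a configuration file. You can add custom configuration by editing the mineru.json file in your user directory.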

+
+

Important

+

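The mineru.json file is generated automatically when you run the built-in model download command mineru-models-download. You can also create it by copying the configuration template file to your user directory and renaming it to mineru.json.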
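
To inspect the current configuration from a script, the file can be read with the standard library. This is a minimal sketch that assumes "user directory" means the home directory returned by `Path.home()`; the function name `load_mineru_config` is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def load_mineru_config(path: Optional[Path] = None) -> dict:
    """Load mineru.json and return its contents as a dict.

    Hypothetical helper. Defaults to ~/mineru.json, assuming the user
    directory is the home directory; returns an empty dict when the file
    has not been generated yet.
    """
    config_path = path if path is not None else Path.home() / "mineru.json"
    if not config_path.exists():
        return {}
    with config_path.open(encoding="utf-8") as f:
        return json.load(f)
```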

+
+

Some of the available configuration options:

+
    +
  • +

    latex-delimiter-config

    +
      +
    • Configures the delimiters for LaTeX formulas
    • +
    • Defaults to the $ symbol; can be changed to other symbols or strings as needed.
    • +
    +
  • +
  • +

    llm-aided-config

    +
      +
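    • Configures the parameters for LLM-aided heading-level classification; compatible with all LLM models that support the openai protocol
    • +
    • Uses the Alibaba Cloud Bailian qwen3-next-80b-a3b-instruct model by default
    • +
    • You need to configure your own API key and set enable to true to turn this feature on
    • +
    • If your API provider does not support the enable_thinking parameter, remove it manually
        +
      • For example, the llm-aided-config section of your configuration file may look like this: +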
        "llm-aided-config": {
        +   "api_key": "your_api_key",
        +   "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        +   "model": "qwen3-next-80b-a3b-instruct",
        +   "enable_thinking": false,
        +   "enable": false
        +}
        +
      • +
      • To remove the enable_thinking parameter, delete the line containing "enable_thinking": false. The result is: +
        "llm-aided-config": {
        +   "api_key": "your_api_key",
        +   "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        +   "model": "qwen3-next-80b-a3b-instruct",
        +   "enable": false
        +}
        +
      • +
      +
    • +
    +
  • +
  • +

    models-dir

    +
      +
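    • Specifies the local model storage directory; specify separate model directories for the pipeline and vlm backends.
    • +
    • After specifying the directories, you can use local models by setting the environment variable export MINERU_MODEL_SOURCE=local.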
    • +
    +
  • +
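
The llm-aided-config changes described above (set an API key, set enable to true, and optionally drop enable_thinking) can also be applied programmatically to a loaded configuration. The sketch below works on a plain dict with the keys shown in the JSON samples; the function name `enable_llm_aided` is hypothetical and not part of MinerU.

```python
import copy


def enable_llm_aided(config: dict, api_key: str,
                     keep_enable_thinking: bool = True) -> dict:
    """Return a copy of config with the llm-aided-config section enabled.

    Hypothetical helper for illustration. Set keep_enable_thinking=False
    when the API provider does not support the enable_thinking parameter.
    """
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    section = updated.setdefault("llm-aided-config", {})
    section["api_key"] = api_key
    section["enable"] = True
    if not keep_enable_thinking:
        section.pop("enable_thinking", None)
    return updated
```

Writing the result back with `json.dump(..., indent=4, ensure_ascii=False)` keeps the file readable; back up the original mineru.json first, since other sections of the file are managed by MinerU.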
+ + + + + + + + + + + + + + + +
+
+ + + + + +
+ + + +
+ + + +
+
+
+
+ + + + + + + + + + \ No newline at end of file