Compare commits


17 Commits

Author SHA1 Message Date
Xiaomeng Zhao
dde90293f1 Merge pull request #2355 from opendatalab/master
update version
2025-04-23 18:50:01 +08:00
myhloli
a24b9ed8fd Merge remote-tracking branch 'origin/master' 2025-04-23 18:48:46 +08:00
myhloli
e0dc6c8473 docs(README): update changelog for version 1.3.8 release 2025-04-23 18:48:32 +08:00
myhloli
801d3ade19 Update version.py with new version 2025-04-23 10:41:07 +00:00
Xiaomeng Zhao
6b7a861e8f Merge pull request #2354 from opendatalab/release-1.3.8
Release 1.3.8
2025-04-23 18:38:42 +08:00
Xiaomeng Zhao
9fbaee9e89 Merge pull request #2353 from myhloli/dev
test(table): update test_rapidtable.py to handle SegLink text variations
2025-04-23 18:27:20 +08:00
myhloli
61fa95d4e0 test(table): update test_rapidtable.py to handle SegLink text variations
- Modify assertion for first cell text to check for 'SegLink' instead of exact match
- This change accommodates variations in SegLink text format
2025-04-23 18:26:19 +08:00
Xiaomeng Zhao
5c232f0587 Merge pull request #2352 from myhloli/dev
feat(ocr): add new Chinese OCR model and update language support
2025-04-23 18:15:25 +08:00
myhloli
45f5082613 refactor(ocr): update device parameter handling in paddleocr2pytorch
- Replace get_device() function call with direct 'device' variable usage
- Simplify device configuration in OCR model initialization
2025-04-23 18:13:58 +08:00
myhloli
4f88fcaa51 feat(ocr): add new Chinese OCR model and update language support
- Add new Chinese OCR model (ch_PP-OCRv4_rec_server_doc_infer) for server-side use
- Update language support in app.py to include new Chinese model
- Modify models_config.yml to add new model configuration
2025-04-23 18:06:12 +08:00
Xiaomeng Zhao
3cf1ea1f5b Merge pull request #2316 from opendatalab/master
master->dev
2025-04-22 19:28:21 +08:00
myhloli
d874563e38 Update version.py with new version 2025-04-22 11:27:25 +00:00
Xiaomeng Zhao
55fcb7387f Merge pull request #2315 from opendatalab/release-1.3.7
Release 1.3.7
2025-04-22 19:26:03 +08:00
myhloli
4d5fd0ee55 Update version.py with new version 2025-04-21 06:45:36 +00:00
Xiaomeng Zhao
601b44bfe0 Merge pull request #2298 from opendatalab/release-1.3.6
Release 1.3.6
2025-04-21 14:37:23 +08:00
Xiaomeng Zhao
6fbbe3e6f0 Merge pull request #2274 from opendatalab/dev
docs: update issue templates and disable blank issues
2025-04-17 18:46:05 +08:00
github-actions[bot]
19fd2cfa37 @vloum has signed the CLA in opendatalab/MinerU#2267 2025-04-17 03:55:12 +00:00
11 changed files with 15685 additions and 6 deletions

View File

@@ -48,6 +48,12 @@ Easier to use: Just grab MinerU Desktop. No coding, no login, just a simple inte
</div>
# Changelog
- 2025/04/23 1.3.8 Released
- The default `ocr` model (`ch`) has been updated to `PP-OCRv4_server_rec_doc` (model update required)
- `PP-OCRv4_server_rec_doc` is trained on a mix of additional Chinese document data and the PP-OCR training data, enhancing recognition of some traditional Chinese characters, Japanese, and special characters. It supports over 15,000 recognizable characters, improving document text recognition while also boosting general-purpose text recognition.
- [Performance comparison between PP-OCRv4_server_rec_doc, PP-OCRv4_server_rec, and PP-OCRv4_mobile_rec](https://paddlepaddle.github.io/PaddleX/latest/en/module_usage/tutorials/ocr_modules/text_recognition.html#ii-supported-model-list)
- Verified results show that the `PP-OCRv4_server_rec_doc` model significantly improves accuracy in both single-language (`Chinese`, `English`, `Japanese`, `Traditional Chinese`) and mixed-language scenarios, with speed comparable to `PP-OCRv4_server_rec`, making it suitable for most use cases.
- In a small number of pure-English scenarios, the `PP-OCRv4_server_rec_doc` model may concatenate adjacent words, whereas `PP-OCRv4_server_rec` performs better in such cases. We have therefore retained the `PP-OCRv4_server_rec` model, which users can invoke by passing `lang='ch_server'` (Python API) or `--lang ch_server` (CLI).
- 2025/04/22 1.3.7 Released
- Fixed the issue where the `lang` parameter was ineffective during table parsing model initialization.
- Fixed the significant slowdown in OCR and table parsing speed in `cpu` mode.
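The model-selection guidance in the 1.3.8 notes can be condensed into a small helper. This is a hypothetical sketch, not part of MinerU's API; only the `ch`/`ch_server` lang codes themselves come from the release notes:

```python
def choose_ocr_lang(pure_english_document: bool) -> str:
    """Hypothetical helper reflecting the 1.3.8 guidance above."""
    if pure_english_document:
        # The retained PP-OCRv4_server_rec model avoids the word-
        # concatenation issue the new doc model can show on pure English.
        return 'ch_server'
    # Default: PP-OCRv4_server_rec_doc, stronger on documents and on
    # mixed Chinese/English/Japanese/Traditional Chinese content.
    return 'ch'
```

On the command line the same choice corresponds to passing `--lang ch_server` instead of relying on the default `ch`.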

View File

@@ -47,6 +47,12 @@
</div>
# Changelog
- 2025/04/23 1.3.8 released
- The default `ocr` model (`ch`) has been updated to `PP-OCRv4_server_rec_doc` (model update required)
- `PP-OCRv4_server_rec_doc` is trained on top of `PP-OCRv4_server_rec` using a mix of more Chinese document data and the PP-OCR training data. It adds recognition of some traditional Chinese characters, Japanese, and special characters, supports 15,000+ recognizable characters, and improves both document-related and general text recognition.
- [Performance comparison of PP-OCRv4_server_rec_doc / PP-OCRv4_server_rec / PP-OCRv4_mobile_rec](https://paddlepaddle.github.io/PaddleX/latest/module_usage/tutorials/ocr_modules/text_recognition.html#_3)
- Verified results show that `PP-OCRv4_server_rec_doc` delivers a clear accuracy improvement in both single-language (Chinese, English, Japanese, Traditional Chinese) and mixed-language scenarios, with speed comparable to `PP-OCRv4_server_rec`, making it suitable for the vast majority of use cases.
- In a small number of pure-English scenarios, `PP-OCRv4_server_rec_doc` may concatenate adjacent words, whereas `PP-OCRv4_server_rec` performs better there; we have therefore retained the `PP-OCRv4_server_rec` model, which can be invoked with `lang='ch_server'` (Python API) or `--lang ch_server` (CLI).
- 2025/04/22 1.3.7 released
- Fixed the `lang` parameter being ineffective during table-parsing model initialization
- Fixed the significant slowdown of OCR and table parsing in `cpu` mode

View File

@@ -1 +1 @@
-__version__ = "1.3.5"
+__version__ = "1.3.8"

View File

@@ -55,7 +55,8 @@ class PytorchPaddleOCR(TextSystem):
         self.lang = kwargs.get('lang', 'ch')
         device = get_device()
-        if device == 'cpu' and self.lang == 'ch':
+        if device == 'cpu' and self.lang in ['ch', 'ch_server']:
             logger.warning("The current device in use is CPU. To ensure the speed of parsing, the language is automatically switched to ch_lite.")
             self.lang = 'ch_lite'
         if self.lang in latin_lang:
@@ -79,7 +80,7 @@ class PytorchPaddleOCR(TextSystem):
         kwargs['rec_char_dict_path'] = os.path.join(root_dir, 'pytorchocr', 'utils', 'resources', 'dict', dict_file)
         # kwargs['rec_batch_num'] = 8
-        kwargs['device'] = get_device()
+        kwargs['device'] = device
         default_args = vars(args)
         default_args.update(kwargs)
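The two hunks above make one behavioral change: `get_device()` is called once and its result reused, and on CPU the new `ch_server` code falls back to the lighter model just as `ch` does. A standalone sketch of that logic (simplified; the real `PytorchPaddleOCR` also loads models and dictionaries):

```python
def resolve_ocr_settings(device: str, lang: str) -> tuple:
    """Sketch of the patched fallback: on CPU, both 'ch' and
    'ch_server' are switched to the lighter 'ch_lite' model so
    parsing speed stays acceptable."""
    if device == 'cpu' and lang in ['ch', 'ch_server']:
        lang = 'ch_lite'
    # The resolved device is reused (kwargs['device'] = device),
    # rather than calling get_device() a second time.
    return device, lang
```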

View File

@@ -171,6 +171,31 @@ ch_PP-OCRv4_rec_server_infer:
          nrtr_dim: 384
          max_text_length: 25
+ch_PP-OCRv4_rec_server_doc_infer:
+  model_type: rec
+  algorithm: SVTR_HGNet
+  Transform:
+  Backbone:
+    name: PPHGNet_small
+  Head:
+    name: MultiHead
+    out_channels_list:
+      CTCLabelDecode: 15631
+    head_list:
+      - CTCHead:
+          Neck:
+            name: svtr
+            dims: 120
+            depth: 2
+            hidden_dims: 120
+            kernel_size: [ 1, 3 ]
+            use_guide: True
+          Head:
+            fc_decay: 0.00001
+      - NRTRHead:
+          nrtr_dim: 384
+          max_text_length: 25
 chinese_cht_PP-OCRv3_rec_infer:
   model_type: rec
   algorithm: SVTR

View File

@@ -3,10 +3,14 @@ lang:
     det: ch_PP-OCRv3_det_infer.pth
     rec: ch_PP-OCRv4_rec_infer.pth
     dict: ppocr_keys_v1.txt
-  ch:
+  ch_server:
     det: ch_PP-OCRv3_det_infer.pth
     rec: ch_PP-OCRv4_rec_server_infer.pth
     dict: ppocr_keys_v1.txt
+  ch:
+    det: ch_PP-OCRv3_det_infer.pth
+    rec: ch_PP-OCRv4_rec_server_doc_infer.pth
+    dict: ppocrv4_doc_dict.txt
   en:
     det: en_PP-OCRv3_det_infer.pth
     rec: en_PP-OCRv4_rec_infer.pth
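The hunk above renames the old server mapping to `ch_server` and repoints the default `ch` key at the new doc model. A sketch of the resulting lookup, with the mapping reproduced as a plain dict (the real code loads it from this YAML file):

```python
# Mirrors the lang section of the config shown above (illustrative only).
LANG_MODELS = {
    'ch_server': {
        'det': 'ch_PP-OCRv3_det_infer.pth',
        'rec': 'ch_PP-OCRv4_rec_server_infer.pth',
        'dict': 'ppocr_keys_v1.txt',
    },
    'ch': {
        'det': 'ch_PP-OCRv3_det_infer.pth',
        'rec': 'ch_PP-OCRv4_rec_server_doc_infer.pth',
        'dict': 'ppocrv4_doc_dict.txt',
    },
}

def rec_model_for(lang: str) -> str:
    # Hypothetical helper: resolve the recognition weights for a lang code.
    return LANG_MODELS[lang]['rec']
```

Note that the new `ch` entry also switches to a new character dictionary (`ppocrv4_doc_dict.txt`), matching the doc model's enlarged character set.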

View File

@@ -158,7 +158,7 @@ devanagari_lang = [
'hi', 'mr', 'ne', 'bh', 'mai', 'ang', 'bho', 'mah', 'sck', 'new', 'gom', # noqa: E126
'sa', 'bgc'
]
-other_lang = ['ch', 'en', 'korean', 'japan', 'chinese_cht', 'ta', 'te', 'ka']
+other_lang = ['ch', 'ch_lite', 'ch_server', 'en', 'korean', 'japan', 'chinese_cht', 'ta', 'te', 'ka']
add_lang = ['latin', 'arabic', 'cyrillic', 'devanagari']
# all_lang = ['', 'auto']
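These lists route a lang code to its dictionary group; the hunk simply registers the new `ch_lite`/`ch_server` codes in `other_lang`. A minimal sketch of that routing, reproducing only the lists visible in this hunk (the `lang_group` helper is hypothetical):

```python
devanagari_lang = [
    'hi', 'mr', 'ne', 'bh', 'mai', 'ang', 'bho', 'mah', 'sck', 'new', 'gom',
    'sa', 'bgc'
]
other_lang = ['ch', 'ch_lite', 'ch_server', 'en', 'korean', 'japan',
              'chinese_cht', 'ta', 'te', 'ka']

def lang_group(lang: str) -> str:
    # Route a lang code to its group; codes missing from every list
    # would previously fail lookup, which is why ch_server is added here.
    if lang in devanagari_lang:
        return 'devanagari'
    if lang in other_lang:
        return 'other'
    return 'unknown'
```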

View File

@@ -239,6 +239,14 @@
"created_at": "2025-04-14T10:40:54Z",
"repoId": 765083837,
"pullRequestNo": 2226
},
+    {
+      "name": "vloum",
+      "id": 75369577,
+      "comment_id": 2811669681,
+      "created_at": "2025-04-17T03:54:59Z",
+      "repoId": 765083837,
+      "pullRequestNo": 2267
+    }
]
}

View File

@@ -41,7 +41,7 @@ class TestppTableModel(unittest.TestCase):
# 检查第一行数据
first_row = tree.xpath('//table/tr[2]/td')
assert len(first_row) == 5, "First row should have 5 cells"
-assert first_row[0].text and first_row[0].text.strip() == "SegLink[26]", "First cell should be 'SegLink[26]'"
+assert first_row[0].text and 'SegLink' in first_row[0].text.strip(), "First cell should contain 'SegLink'"
assert first_row[1].text and first_row[1].text.strip() == "70.0", "Second cell should be '70.0'"
assert first_row[2].text and first_row[2].text.strip() == "86.0", "Third cell should be '86.0'"
assert first_row[3].text and first_row[3].text.strip() == "77.0", "Fourth cell should be '77.0'"
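The relaxed first-cell assertion checks for a substring rather than an exact match, so spacing variants like `SegLink [26]` vs `SegLink[26]` no longer fail the test. A self-contained illustration of the pattern, using stdlib `xml.etree.ElementTree` in place of the test's lxml parser:

```python
import xml.etree.ElementTree as ET

# Minimal table resembling the one the test parses (illustrative only).
html = ("<table><tr><td>Method</td><td>P</td></tr>"
        "<tr><td>SegLink [26]</td><td>70.0</td></tr></table>")
tree = ET.fromstring(html)
first_row = tree.findall('./tr')[1].findall('td')

# An exact comparison against "SegLink[26]" would fail on the spaced
# variant; the substring check tolerates both renderings.
assert first_row[0].text and 'SegLink' in first_row[0].text.strip()
# Numeric cells keep the strict equality check.
assert first_row[1].text and first_row[1].text.strip() == '70.0'
```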