opendatalab/MinerU
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Python · AGPL-3.0
Issues
PDF table parsing produces missing content and formatting errors
#3595 opened by freedomlxx - 4
Official online demo: in vlm-vllm-async-engine mode, borderless PDF tables are recognized as empty tables
#3612 opened by lc345 - 4
MinerU 2.5 raises TypeError: Qwen2VLForConditionalGeneration.__init__() got an unexpected keyword argument 'dtype'
#3575 opened by l878619717 - 3
GPU memory is not released after conversion with mineru-api in pipeline mode
#3617 opened by samwellshi - 1
Why are the dots in section numbers not recognized? For example, 5.2.1.6 is recognized as 5216
#3611 opened by intothephone - 2
Version 2.5.3: magika misidentifies file types
#3583 opened by PascalZh - 5
pdf incorrectly rejected as .ai file
#3605 opened by DarrenCook - 4
Parsing a PDF file hangs immediately
#3600 opened by freedomlxx - 2
MinerU 2.5 cannot be run with the demo.py file
#3598 opened by ChineseWTAO - 1
ModuleNotFoundError: No module named 'mineru_vl_utils'
#3596 opened by Arvin-928 - 1
Wrong reading order
#3591 opened by AtiqurRahmanAni - 1
Why does vLLM inference recognize only the text content, with no tags output for tables?
#3590 opened by fxbzyj - 4
MinerU detects the wrong PDF page count; consistently reproducible in both the web version and the client
#3586 opened by lNeverl - 1
Version 2.5.3: some table content is lost after parsing
#3587 opened by AmyShuiOrBing - 2
Parsing fails after offline deployment
#3584 opened by chetaofeng - 4
Gradio is stuck at uploading
#3581 opened by bondijoe27 - 1
LiteLLM usage configuration
#3580 opened by peiyaoli - 2
Is there a Docker image that supports CUDA 12.4?
#3576 opened by SuperZhanggy - 2
PDF preview pane is empty when parsing a PDF file through the Gradio interface
#3577 opened by niboliang - 6
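VLM mode: recognition quality differs greatly between the web version and local deployment; pipeline mode: image caption positions are misidentified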
#3568 opened by calwd0mn - 1
Layout analysis in VLM 2.5 tends to identify the title of posters and receipts as a header
#3570 opened by LRHstudy - 2
Bold subheadings on the same line as a text block are not recognized
#3530 opened by fndmyyy - 1
FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead.
#3566 opened by shiiuen01 - 3
The logic for cross-page tables in content_list.json has many problems
#3561 opened by finley0066 - 6
Tables split across pages are recognized incompletely
#3476 opened by LunaJin-lang - 2
Hallucination for tables
#3487 opened by weissenbacherpwc - 2
Ascend 910B + aarch64: parsing is very slow after starting mineru
#3521 opened by penond - 1
Ascend 910B: which torch_npu version should be paired with mineru 2.5.0 + pipeline?
#3528 opened by penond - 7
Character-level repetition in vlm-sglang-engine inference
#3525 opened by DaiJianghai - 5
CUDA12.4 not support 2.5.2
#3543 opened by kevinhonor - 3
Tables in images and their content are not recognized
#3536 opened by iicaicai - 1
Error information show in Gradio WebUI
#3540 opened by Simonqujian78 - 5
VLM produces syntax errors when parsing chemical formulas
#3539 opened by Doge2077 - 4
Extra '-' symbols appear in math formulas
#3531 opened by Doge2077 - 1
Hallucination when parsing display formulas that contain dots
#3524 opened by Rundong-Li - 1
Unordered list was detected as text block in VLM mode
#3522 opened by Doge2077 - 2
After starting mineru, MFR Predict progress stays stuck at 0%, even when using a proxy
#3494 opened by penond - 2
Ascend 910B + aarch64 architecture: how should mineru be deployed?
#3485 opened by penond - 2
Repeated escape of '<' '>' symbols in html table
#3520 opened by Doge2077 - 2
PDFium has thread-safety issues during batch PDF parsing with Celery's thread mode
#3484 opened by Isfate - 3
Table is recognized as an image and returned as an image (markdown image tag)
#3482 opened by wozai604 - 1
Dependency library error: Pure virtual function called
#3481 opened by cq-ldg - 1
Dependency library error: Pure virtual function called
#3480 opened by cq-ldg - 5
Neither pipeline nor vlm mode correctly locates tables containing chemical structural formulas
#3478 opened by sralvins