adithya-s-k/omniparse

How do I export Markdown after parsing?

Opened this issue · 1 comment

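Once parsing finishes, copying either the Markdown or the JSON freezes the page, and I couldn't find an option to save the result directly.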

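You can call the API yourself, read the returned JSON, pull out the text and images fields, and save each one separately. Here is how I did it with Python's requests library: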

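import base64
import os
from io import BytesIO

import requests
from PIL import Image

#### api request ####
url = 'http://localhost:8000/parse_document'
file_path = 'test.pdf'
out_dir = './parse_results'
os.makedirs(out_dir, exist_ok=True)  # open() below fails if the directory is missing

with open(file_path, 'rb') as f:
    files = {'file': f}
    response = requests.post(url, files=files)
response.raise_for_status()  # stop on an HTTP error instead of parsing an error body

result = response.json()

#### save markdown ####
text = result['text']
# write as UTF-8 explicitly so non-ASCII text survives on any platform
with open(os.path.join(out_dir, 'test.md'), 'w', encoding='utf-8') as f:
    f.write(text)

#### save images ####
for raw in result['images']:
    raw_decode = base64.b64decode(raw['image'])  # images come back base64-encoded
    image_name = raw['image_name']
    # re-encode through Pillow and save as PNG under the name the server returned
    Image.open(BytesIO(raw_decode)).save(os.path.join(out_dir, image_name), 'PNG')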
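
If you'd rather not depend on Pillow, the saving step can be sketched as a small function that writes the decoded bytes straight to disk. This is only a sketch: `save_parse_result`, `out_dir` and `md_name` are names made up for illustration, and it assumes the response has the same shape the script above reads (a `text` string plus an `images` list whose items carry `image` as base64 and `image_name`).

```python
import base64
from pathlib import Path


def save_parse_result(result, out_dir, md_name="output.md"):
    """Write `text` to a Markdown file and each entry of `images` to disk.

    Hypothetical helper. Assumes `result` looks like
    {"text": str, "images": [{"image": <base64 str>, "image_name": str}, ...]}.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    md_path = out / md_name
    md_path.write_text(result["text"], encoding="utf-8")

    image_paths = []
    for item in result.get("images", []):
        # Keep only the last path component so a name like "../x.png"
        # cannot end up outside out_dir.
        name = Path(item["image_name"]).name
        path = out / name
        # Write the decoded bytes unchanged (no re-encoding to PNG).
        path.write_bytes(base64.b64decode(item["image"]))
        image_paths.append(path)
    return md_path, image_paths
```

With the `result` dict from the request above, calling `save_parse_result(result, "./parse_results", "test.md")` should produce the same files as the script. The one difference is that images keep whatever format the server sent, since they are not re-saved as PNG.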