How do I export Markdown after parsing?
Wizarzz commented
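Once parsing finishes, copying either the Markdown or the JSON freezes the page, and I can't find an option to save the result directly.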
ohownew commented
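You can call the API yourself, read the returned JSON, parse the text and images fields, and save each one separately. Here is an implementation using Python's requests library: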
import base64
import os
from io import BytesIO

import requests
from PIL import Image

#### api request ####
url = 'http://localhost:8000/parse_document'
file_path = 'test.pdf'
with open(file_path, 'rb') as f:
    files = {'file': f}
    response = requests.post(url, files=files)
response.raise_for_status()  # stop early if the server returned an error
result = response.json()

#### save markdown ####
out_dir = './parse_results'
os.makedirs(out_dir, exist_ok=True)  # open() fails if the directory is missing
text = result['text']
with open(f'{out_dir}/test.md', 'w', encoding='utf-8') as f:
    f.write(text)

#### save images ####
for raw in result['images']:
    raw_decode = base64.b64decode(raw['image'])  # images arrive base64-encoded
    image_name = raw['image_name']
    Image.open(BytesIO(raw_decode)).save(f'{out_dir}/{image_name}', 'PNG')
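If you don't need PIL to re-encode the images, the saving step can also be sketched as a small helper that writes the decoded bytes as they are. This is only a sketch: it assumes the response has the same text and images fields used above, and the function name `save_parse_result` and the output file name `result.md` are made up for illustration, not part of the project's API.

```python
import base64
from pathlib import Path


def save_parse_result(result, out_dir="parse_results"):
    """Write the Markdown text and every image from a parsed response to out_dir.

    Assumes result["text"] is a string and result["images"] is a list of
    {"image_name": ..., "image": <base64 string>} entries.
    Returns the list of paths that were written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []

    md_path = out / "result.md"  # hypothetical output name
    md_path.write_text(result["text"], encoding="utf-8")
    written.append(md_path)

    for item in result.get("images", []):
        # Keep only the final path component so a name like "../x.png"
        # cannot end up outside out_dir.
        name = Path(item["image_name"]).name
        img_path = out / name
        # Write the decoded bytes unchanged instead of re-encoding with PIL.
        img_path.write_bytes(base64.b64decode(item["image"]))
        written.append(img_path)

    return written
```

With the request code above, you would call it as `save_parse_result(response.json())`. Writing the raw bytes keeps whatever format the server produced, so the file extension in image_name stays accurate.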