nlmatics/nlm-ingestor
This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
Python · Apache-2.0
Issues
JSON Decode error when
#84 opened by shumin018 - 2
For anyone hoping to deploy this as a lambda
#56 opened by dgonier - 1
Cannot connect to the Docker daemon at unix:///run/user/86364/docker.sock. Is the docker daemon running?
#82 opened by Tizzzzy - 1
Other language than English
#88 opened by jackNhat - 1
Docker expose port needs to be corrected
#69 opened by kjoth - 0
NLTK Data directory not found
#85 opened by jinkjonks - 1
Bug: Markdown parsing error
#83 opened by jamesvillarrubia - 1
Connection error with Docker run
#67 opened by kjoth - 5
Discussions
#80 opened by rmast - 21
KeyError: 'style'
#72 opened by RaphSte - 0
Dependency versions too strict
#46 opened by choyuansu - 2
Latest docker image not working locally
#65 opened by mvennela - 0
Tika server is running OCR twice?
#76 opened by le-codeur-rapide - 0
bug: missing text from parsed pdf
#75 opened by fede-bello - 1
llm sherpa deployed in eks cluster with 4vcpu and 16gb ram not working properly
#74 opened by gireesh99 - 4
nlm-ingestor is SUPER SLOW
#39 opened by pashpashpash - 4
KeyError: 'return_dict'
#24 opened by ZengJin123 - 0
BBOX information
#66 opened by TheMrguiller - 3
Not able to install nlm_ingestor
#30 opened by sli701 - 0
Missing Tests?
#63 opened by ramarnat - 1
docker image is not producing any result
#49 opened by craldaz - 0
Receiving an error 'urllib3.exceptions.LocationValueError: No host specified.'
#61 opened by anirudh-gapblue - 2
Error when parsing a PDF
#44 opened by kaulshashank - 1
Is it possible to run this fully local, so sensitive PII PDFs don't leave the network?
#57 opened by AIMads - 1
memory leaks
#29 opened by ZengJin123 - 1
How to deploy this thing in production building image with the docker file giving error.
#58 opened by aman-vink - 0
Lost pages
#55 opened by sailxjx - 2
bbox error in BlockRender
#52 opened by livelxw - 0
Trivially small chunks returned
#51 opened by thelazydogsback - 0
Issue with finding tables and sections
#45 opened by Aviral-tech - 5
Suggestions for Fast Production Server
#37 opened by yashpatel21 - 0
box_style not being taken into account
#41 opened by mikecook69 - 0
Disable rules/parenthesized header
#38 opened by mikecook69 - 2
Encoding error with non-ASCII character.
#33 opened by jamesvillarrubia - 0
PDF extraction
#32 opened by Amy-raj - 0
HTML AND XML INGESTOR
#25 opened by drewskidang - 0