CatchTheTornado/text-extract-api
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
Python, GPL-3.0
Stargazers
- guireis1 (Brazil)
- denisfitz57
- shi-yan (Solar system)
- chrysanthos (Cyprus)
- bdittmer (San Francisco, CA)
- smartyhouses
- mr-dogos
- leprosus
- abandurka
- 7Max7
- FunctionalPerformanceSystems
- ya-developer
- Persimmon-Bromide
- Sem951
- LostElysium (Russian Federation)
- AnatoliyAvr
- jaxbag
- tomdeleu
- LightUtopia2024
- Lifailon
- davidmigloz (Amsterdam)
- tomwagner (Prague)
- ParAmbula
- cyler88
- ssdaniel24
- daswer123
- bdebever (France)
- matgren (Gdańsk)
- Baroshem (Wrocław)
- Pucekcom
- RayedB (Lille, France)
- become-iron
- itcms (POLAND)
- jarektkaczyk (Singapore)
- ogbozoyan
- sekju (Poland)