CatchTheTornado/text-extract-api
Document (PDF, Word, PPTX, ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
Python, GPL-3.0
Stargazers
- gbodra (São Paulo, Brazil)
- mjaskolski (Gdynia, Poland)
- brunoaureli
- tkarwatka (Poland)
- votruk (Berlin)
- kurdi-dev (As Sulaymaniyah, Kurdistan Region, Iraq)
- jkobus (Warsaw, Poland)
- guidofrascadore (Rome)
- cigolpl
- Don-Yin (London, London)
- JanPuzio (Warsaw)
- krzysztofspilka (Europe)
- MaciejKadula
- lukgru
- lebedyncrs (Poland, Lublin)
- martwozniak
- konrad-pawlus (Krakow)
- avdeev (Moscow)
- LaVanguard (Poland)
- witek1902 (Poland)
- dmikushin (Lausanne, CH)
- xabru
- nickurt
- tomusdrw (Poland)
- GooRoo (Stuttgart, Germany)
- meal (Warsaw, Poland)
- Dart55
- prorusnet
- iamweird
- ssztemberg (Warsaw)
- scipunch
- wwsilin (Moscow, Russia)
- drahnr (CEST)
- Haider-Ali-DS (Lahore, Pakistan)
- piotrkwiecinski (Catanzaro)
- gopalanj