A Python pipeline tool and plugin ecosystem for processing technical documents. Process papers from arXiv, SemanticScholar, or PDF with GROBID and LangChain, and listen to them as a podcast. Customize your own pipelines.

Primary language: Python. License: MIT.

Papercast

An extensible pipeline tool and plugin ecosystem for processing technical documents. Written in Python.

Features

Feature: Examples
Add documents in multiple formats, from popular sources: PDF, LaTeX, ArXiv, SemanticScholar
Flexible text extraction: GROBID (more coming soon, or write your own)
Flexible text narration: OSX say command (more coming soon, or write your own)
Publish to multiple endpoints: self-hosted RSS podcast using GitHub Pages, or any other endpoint you can think of
Run anywhere: local machine, remote server, or cloud (AWS, GCP, Azure, etc.)
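
The idea behind the features above, chaining a document source, a text extractor, and a narrator into one pipeline, can be sketched in plain Python. This is an illustrative sketch only: it does not use Papercast's actual API, and every name in it (`Document`, `Step`, `collect_pdf`, `extract_text`, `narrate`, `run_pipeline`) is hypothetical. The steps are stand-ins that only record file names; they do not download, parse, or synthesize anything.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical container passed between steps; not a Papercast class.
@dataclass
class Document:
    source: str
    data: Dict[str, str] = field(default_factory=dict)


# A step takes a Document and returns it, usually with new fields added.
Step = Callable[[Document], Document]


def collect_pdf(doc: Document) -> Document:
    # Stand-in for a collector (PDF, LaTeX, ArXiv, SemanticScholar).
    doc.data["pdf_path"] = f"{doc.source}.pdf"
    return doc


def extract_text(doc: Document) -> Document:
    # Stand-in for a text extractor such as GROBID.
    doc.data["text"] = f"text extracted from {doc.data['pdf_path']}"
    return doc


def narrate(doc: Document) -> Document:
    # Stand-in for a narrator such as the OSX say command.
    doc.data["audio_path"] = f"{doc.source}.mp3"
    return doc


def run_pipeline(steps: List[Step], doc: Document) -> Document:
    # Apply each step in order, feeding the output of one into the next.
    for step in steps:
        doc = step(doc)
    return doc


result = run_pipeline([collect_pdf, extract_text, narrate], Document("paper"))
print(result.data)
```

Because each step in this sketch is just a function from `Document` to `Document`, swapping in a different extractor or narrator means replacing one entry in the list, which is the kind of flexibility the "write your own" entries describe.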

More Info