In the fast-paced world of academia and research, keeping up with the latest scientific papers can be a daunting task. While reading remains the traditional method of consuming this knowledge, not everyone finds it the most effective or convenient. That's where our Text-to-Speech (TTS) for Scientific Papers comes into play.
- Auditory Learning: Some individuals are auditory learners, meaning they grasp and retain information better when they hear it. Converting scientific papers to audio caters to this learning style, helping them understand and recall complex concepts more easily.
- Multitasking: In today's busy world, not everyone has the luxury of sitting down to read a paper. With TTS, users can listen to the latest research while driving, cooking, working out, or doing chores. This not only uses time more efficiently but also integrates continuous learning into daily routines.
- Accessibility: For those with visual impairments or other conditions that make reading difficult, TTS offers an alternative way to access and consume the vast world of scientific knowledge.
By providing a platform that converts dense and intricate scientific texts into audible content, we hope to make research more accessible, digestible, and convenient for everyone, regardless of their preferred learning method or lifestyle.
pip install git+https://github.com/inc0/science_read.git
Simple usage example
scienceread https://www.biorxiv.org/content/10.1101/2023.08.05.552127v1.full paper.wav
Currently the project only supports bioRxiv papers whose full text is available on the web, as in the example above. Click the "Full Text" tab on the paper's page and copy the URL.
The magic is done by the Bark model, a state-of-the-art TTS model.
Processing a paper can take a long time: the example paper took about 30 minutes to convert in full on an Nvidia 4090. For that reason we also added a --small flag that uses a smaller version of the model. It is noticeably faster and needs much less compute, but it also gets "confused" more often and produces more artifacts.
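Part of the reason long papers are slow is that Bark generates audio in short clips, so the text of a paper has to be fed to the model piece by piece. The sketch below is not taken from this project's code; it only illustrates one way such splitting could work. The function name `split_into_chunks` and the `max_chars` limit of 200 are assumptions made for illustration.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Group sentences into chunks of at most max_chars characters.

    Hypothetical helper: max_chars is an illustrative limit, not a Bark setting.
    A single sentence longer than max_chars is kept whole, not cut mid-sentence.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow the chunk, so start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be passed to the TTS model in turn and the resulting clips concatenated into a single audio file. A smaller limit gives the model shorter prompts but means more model calls per paper.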
This is a very early version of the tool. All contributions are welcome. There are a number of TODO comments in the code, and larger problems are described in the issue tracker.