A better site reader for Linux that reads web pages aloud from the command line. It uses state-of-the-art voice synthesis from Mozilla TTS.
Ekho tries to read only the text that is mostly unique to each page of a site. A class I call HashIndex keeps track of text it has already encountered on a site, so Ekho doesn't become repetitive on the same website. The first page you have it read from a site will always take the longest, because at that point Ekho knows nothing yet about which text recurs on that site.

The script can read any page except those that load their content with JavaScript, so not every page on the web is available. Pages that require a login can't be read either, because Ekho fetches the page without any session.

Synthesis and playback run on two separate threads so that playback is fairly smooth and continuous. The synthesized audio is rendered to wav files under the render/ folder.
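
The two ideas described above, skipping text already seen on a site and overlapping synthesis with playback, can be sketched roughly as follows. This is only an illustration under assumptions, not the code in ekho.py: the HashIndex shown here is a hypothetical stand-in that hashes whitespace-normalised text per site, and read_aloud, synthesize and play are made-up names for a simple producer/consumer arrangement built on the standard library.

```python
import hashlib
import queue
import threading


class HashIndex:
    """Hypothetical stand-in: remembers which text was already seen per site."""

    def __init__(self):
        self._seen = {}  # site -> set of text digests

    def is_new(self, site, text):
        """Return True the first time `text` is seen for `site`, else False."""
        # Normalise whitespace and case so trivial variations hash the same.
        normalised = " ".join(text.split()).lower()
        digest = hashlib.sha1(normalised.encode("utf-8")).hexdigest()
        seen = self._seen.setdefault(site, set())
        if digest in seen:
            return False
        seen.add(digest)
        return True


def read_aloud(chunks, synthesize, play):
    """Synthesize on a worker thread while the calling thread plays clips."""
    clips = queue.Queue(maxsize=4)  # small buffer of clips ready to play
    done = object()                 # sentinel marking the end of the stream

    def producer():
        try:
            for chunk in chunks:
                clips.put(synthesize(chunk))
        finally:
            clips.put(done)  # always unblock the consumer

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()
    while True:
        clip = clips.get()
        if clip is done:
            break
        play(clip)
    worker.join()
```

With this arrangement the next clip is being synthesized while the current one plays, which is what keeps playback continuous as long as synthesis is at least as fast as speech. On the first page of a site every paragraph passes is_new, which matches the observation that the first page takes the longest to read.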
python3 ekho.py https://www.ibm.com/cloud/learn/what-is-artificial-intelligence
- Mozilla TTS
  - pip install TTS
  - or from their GitHub: https://github.com/mozilla/TTS
- Beautiful Soup 4: Parse/Search HTML documents from text
  - pip install beautifulsoup4
- Gensim: Text summarization tool
  - pip install gensim
- PlaySound: Simple module that plays the wav files generated by TTS
  - pip install playsound
- You need a speech model and a vocoder model from Mozilla TTS
  - I used Tacotron2 and Multiband MelGAN
- Change the path on line 249 of ekho.py to point to the directory where your TTS models are located
- If you are using different models, make sure to change the model_path, model_config, vocoder_path, and vocoder_config variables defined just after the path variable
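
For orientation, the variables named above might be laid out roughly like this. The directory and all file names below are placeholders invented for illustration, not the actual values in ekho.py, and missing_model_files is a hypothetical helper; substitute the names of the model files you downloaded.

```python
import os

# Placeholder directory: point this at wherever your Mozilla TTS models live.
path = os.path.expanduser("~/tts_models")

# Placeholder file names: use the names of the files you actually downloaded.
model_path = os.path.join(path, "tts_model.pth.tar")
model_config = os.path.join(path, "tts_config.json")
vocoder_path = os.path.join(path, "vocoder_model.pth.tar")
vocoder_config = os.path.join(path, "vocoder_config.json")


def missing_model_files(*files):
    """Return the subset of `files` that does not exist on disk."""
    return [f for f in files if not os.path.isfile(f)]


if __name__ == "__main__":
    # Report missing files before any model is loaded.
    for f in missing_model_files(model_path, model_config,
                                 vocoder_path, vocoder_config):
        print("Missing model file:", f)
```

Checking the four paths up front gives a clearer message than a failure partway through model loading.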
There is significant stuttering in the audio when using Tacotron2 with Multiband MelGAN. I'm not sure why this happens, but some examples have been placed at https://github.com/newsbubbles/tts_bugs where you can hear the problem.
- Make a setup script so that everything gets installed with one command.