To scrape the Bible in Marathi, start Jupyter Notebook:
jupyter notebook
Then open notebooks/scraper.ipynb, which contains all the relevant information for scraping.
Once you have the audio and text files, perform sentence-level alignment with aligner.py. To see its options, run:
python aligner.py -h
To chunk the audio files using the syncmap file generated by the alignment script, use audio_chunk.py. To see its options, run:
python audio_chunk.py -h
This writes out the audio chunks corresponding to the sentence-aligned text.
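For illustration only, here is a minimal sketch of how chunking from a syncmap could work. It is not the code in audio_chunk.py. It assumes the syncmap is a JSON file with a "fragments" list whose entries carry "begin", "end" (seconds, as strings) and "lines", and that the audio is an uncompressed WAV file; the function name chunk_wav and the output file names are hypothetical.

```python
import json
import wave
from pathlib import Path


def chunk_wav(wav_path, syncmap_path, out_dir):
    """Cut one WAV file into per-fragment WAV files using syncmap timings.

    Returns a list of (chunk_path, text) pairs, one per non-empty fragment.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumed layout (hypothetical):
    # {"fragments": [{"begin": "0.000", "end": "2.500", "lines": ["..."]}, ...]}
    with open(syncmap_path, encoding="utf-8") as f:
        fragments = json.load(f)["fragments"]

    written = []
    with wave.open(str(wav_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for i, frag in enumerate(fragments):
            # Convert seconds to frame indices, clamped to the file length.
            start = min(int(float(frag["begin"]) * rate), total)
            end = min(int(float(frag["end"]) * rate), total)
            if end <= start:
                continue  # skip empty or out-of-range fragments
            src.setpos(start)
            frames = src.readframes(end - start)
            out_path = out_dir / f"chunk_{i:04d}.wav"
            with wave.open(str(out_path), "wb") as dst:
                dst.setparams(params)  # frame count in the header is fixed on close
                dst.writeframes(frames)
            written.append((out_path, " ".join(frag["lines"])))
    return written
```

Each returned pair links a chunk file to its sentence text, which is the pairing the later steps rely on.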
To filter IPAs from the audio-ipas file, use filter_ipas.py. To see its options, run:
python filter_ipas.py --help
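The filtering criteria are defined by filter_ipas.py itself. As a rough sketch of what such a filter might look like, the example below assumes each entry is an (audio path, IPA string) pair and drops entries whose IPA field is empty or still contains Devanagari characters or digits. Both the pair layout and the criteria are assumptions made for illustration, and filter_rows and has_devanagari are hypothetical names.

```python
def has_devanagari(text):
    """Return True if any character falls in the Devanagari Unicode block."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def filter_rows(rows):
    """Keep (audio, ipa) pairs whose IPA string looks fully transcribed.

    Hypothetical criteria: non-empty, no leftover Devanagari, no digits.
    """
    kept = []
    for audio, ipa in rows:
        ipa = ipa.strip()
        if not ipa:
            continue  # nothing was transcribed
        if has_devanagari(ipa) or any(ch.isdigit() for ch in ipa):
            continue  # likely an incomplete conversion
        kept.append((audio, ipa))
    return kept


rows = [
    ("chunk_0000.wav", "nəməskaːr"),
    ("chunk_0001.wav", ""),
    ("chunk_0002.wav", "नमस्कार"),
]
print(filter_rows(rows))
```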