avinashvarna/audio_alignment

Handling multiple audio/text files

avinashvarna opened this issue · 6 comments

I was looking at vishvAsa's email regarding ~1k mahAbhArata recordings and texts being available, e.g. here, and thinking that it might be a good addition to this site. The complicating factor appears to be that there are multiple audio files per adhyAya. For generating the alignment, it is easy to locally concatenate the files into one larger file, but it is not clear what the easiest way to present it on the website is. Any thoughts @hrishikeshrt @shreevatsa?

For Rāmāyaṇa, we represent each Sarga as one unit. I think the same should be done for Mahābhārata, using the Adhyāya as the unit and concatenating the text and audio files for each Adhyāya.
Is the text consistent with the audio it corresponds to? If they refer to two different critical versions (BORI / Kumbhakonam), there might be discrepancies in the text.

Is there a risk of somewhat "losing information" by concatenating? That is, if the text and audio have already been manually aligned at a crude level (several ślokas together, say), then the automatic alignment ought to be strictly a refinement of it (not cross the manually determined boundaries).
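One way to keep the automatic alignment a strict refinement of a crude manual one is to align each audio file separately and then shift the results onto a single timeline. The following is only a minimal sketch of that idea, assuming the text can already be split per audio file and that each file's duration is known; `merge_alignments` and its `(text, begin, end)` tuples are hypothetical names, not part of this repository.

```python
from itertools import accumulate


def merge_alignments(per_file, durations):
    """Shift per-file alignments onto the timeline of the concatenated audio.

    per_file  -- one list per audio file of (text, begin, end) tuples,
                 with times in seconds relative to the start of that file
    durations -- duration in seconds of each audio file, in playback order
    """
    if len(per_file) != len(durations):
        raise ValueError("need exactly one duration per audio file")

    # Start time of each file inside the concatenated audio.
    starts = [0.0] + list(accumulate(durations))[:-1]

    merged = []
    for offset, fragments in zip(starts, per_file):
        for text, begin, end in fragments:
            # Fragments never cross a file boundary, because each file
            # was aligned on its own.
            merged.append((text, begin + offset, end + offset))
    return merged
```

Because every fragment is produced from a single file, no fragment can straddle the manually determined boundaries.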

But however the alignment is done, when presenting it on the page, the pages should be presented in whatever division makes sense for reading the text (the adhyāya say). Clicking on a certain region of text can take the user to the corresponding region of the corresponding audio file.

@hrishikeshrt, from @vishvAsa's site, it appears that the recordings are based on the Gita Press edition. There may be some minor discrepancies, but it may still be valuable to have the audio alignment.

I agree with both of you that presenting adhyAya makes sense. The easiest way I can think of is to concatenate the audio files corresponding to one adhyAya into one mp3, but that would require us to host a similar collection on archive instead of just using the current files. Adding logic to take the user to the corresponding region of the corresponding file may be possible, but I don't know how much work that would entail.
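As a rough illustration of the second option (aligning against a locally concatenated file but still playing the original files), a position on the concatenated timeline can be mapped back to a file and an offset within it. This is a sketch under the assumption that the duration of each original file is known; `locate` is a hypothetical helper.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(global_time, durations):
    """Map a time in the concatenated audio to (file_index, local_time).

    durations -- duration in seconds of each original file, in order
    """
    if not durations:
        raise ValueError("at least one audio file is required")

    # Start time of each file inside the concatenated audio.
    starts = [0.0] + list(accumulate(durations))[:-1]

    # Last file whose start is <= global_time, clamped to a valid index.
    index = bisect_right(starts, global_time) - 1
    index = max(0, min(index, len(durations) - 1))
    return index, global_time - starts[index]
```

For example, with files of 10, 20 and 5 seconds, `locate(12.5, [10.0, 20.0, 5.0])` gives `(1, 2.5)`: the second file, 2.5 seconds in.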

@shreevatsa As far as I can see the audio clips don't seem to have any annotation regarding which groups of shlokas they correspond to.

Are there any downsides to hosting the concatenated audios?

> Adding logic to take the user to the corresponding region of the corresponding file may be possible, but I don't know how much work that would entail.

If this is to be done, we need one more data attribute on every word: a file identifier of sorts. Currently we have data-begin and data-end; we could add, say, data-file, which would identify the corresponding audio element.
However, I am not a fan of the idea of showing multiple audios on one page.
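To make the proposal concrete, here is a small sketch of how the page generator might emit a word with the extra attribute. The `data-begin` and `data-end` attributes are the ones mentioned above; `data-file` is the suggested addition, and the `word_span` helper and the `<span>` wrapper are assumptions for illustration only.

```python
from html import escape


def word_span(word, begin, end, file_id):
    """Render one aligned word as a span carrying its timing and audio file.

    begin, end -- times in seconds relative to the start of the audio file
    file_id    -- identifier of the audio file (e.g. its name or URL)
    """
    return (
        f'<span data-begin="{begin:.3f}" data-end="{end:.3f}" '
        f'data-file="{escape(str(file_id), quote=True)}">'
        f"{escape(word)}</span>"
    )
```

A client-side script could then read `data-file` from the clicked word to decide which audio to play, and `data-begin` to decide where to seek.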

> Are there any downsides to hosting the concatenated audios?

Hosting costs / space, I imagine? Free hosting providers like GitHub pages usually have an informal limit…

> If this is to be done, we need one more data tag on every word, a file identifier of sorts, (currently we have data-begin and data-end, say we have data-file which will identify the current audio element) However, I am not a fan of the idea of showing multiple audios on one page.

We can (and probably should) have only a single audio element on the page, which will play different audio files as needed (some JS changing the src attribute of the <audio> element).

@mvnpavan88, in the future, please open a separate issue for new topics, so as to keep discussions on point. If you don't mind, please delete your comment above, and I will do the same for this post later, so that this issue can be focused on the main topic.

Please use <first_name>.<last_name>@ gmail. com to contact me.