RenShuhuai-Andy/TimeChat

Inference with audio

lakshya-frontera opened this issue · 2 comments

Thank you for this amazing work.

I have been trying to run the inference script (i.e. demo.ipynb), but no function in it takes an ASR transcript along with the video. It would be great if you could point me to the function that also takes an ASR transcript for answer generation, or provide that script.

same question

Hi, thanks for your interest.

We currently have code for ASR available for pre-processing purposes (see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/docs/DATA.md#automatic-speech-transcription).

I agree that it would be beneficial to integrate this into a function for easier use. I plan to add this feature when I have some free time. Alternatively, you're welcome to contribute it yourself. Let me know if you're interested!
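
Until such a function exists, one possible workaround is to fold the pre-computed transcript into the text query yourself. The snippet below is only a minimal sketch under stated assumptions: `format_asr_prompt` is a hypothetical helper, not part of TimeChat, and the `(start_sec, end_sec, text)` segment layout and the "Transcribed speech:" prefix are illustrative choices rather than the format the model was trained with.

```python
from typing import List, Tuple

# Assumed segment layout: (start_sec, end_sec, text). Adapt this to whatever
# your ASR pre-processing step actually produces.
Segment = Tuple[float, float, str]


def format_asr_prompt(segments: List[Segment], question: str) -> str:
    """Prepend timestamped transcript lines to a user question.

    Hypothetical helper for illustration only; not a TimeChat API.
    """
    lines = []
    # Sort by start time so the transcript reads chronologically.
    for start, end, text in sorted(segments):
        text = text.strip()
        if not text:
            continue  # skip empty or whitespace-only segments
        lines.append(f"{start:.1f}s - {end:.1f}s: {text}")
    if not lines:
        # No usable speech: fall back to the plain question.
        return question
    transcript = "\n".join(lines)
    return f"Transcribed speech:\n{transcript}\n\n{question}"


if __name__ == "__main__":
    segments = [(5.0, 7.5, " add salt "), (0.0, 2.0, "hello")]
    print(format_asr_prompt(segments, "What is the person cooking?"))
```

The resulting string could then be passed wherever the demo currently takes the user's text query. Whether the model makes good use of a transcript supplied this way is untested here and would need to be checked against the prompt format used during training.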