Inference with audio

Question

lakshya-frontera opened this issue 8 months ago · 2 comments

Thank you for this amazing work.

I have been trying to run the inference script (i.e. demo.ipynb), but none of its functions takes an ASR transcript along with the video. Could you point me to the function that also takes an ASR transcript for answer generation, or provide a script that does?

Answer 1 · 2024-05-06T09:24:36.000Z

Same question.

Answer 2 · 2024-05-08T14:41:54.000Z

Hi, thanks for your interest.

ASR code is currently available for pre-processing purposes (see https://github.com/RenShuhuai-Andy/TimeChat/blob/master/docs/DATA.md#automatic-speech-transcription).

I agree that it would be beneficial to integrate this into a function for easier use, and I plan to add the feature when I have some free time. Alternatively, you are welcome to contribute it yourself. Let me know if you're interested!
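
In the meantime, here is a minimal sketch of what such a helper might look like. It is not TimeChat's actual API: the function names, the `(start, end, text)` segment format, and the prompt template are all assumptions made for illustration. It only turns timestamped ASR segments (as produced by a pre-processing step) into a text prompt that could then be passed to whatever question-answering call you already use in demo.ipynb.

```python
from typing import List, Tuple

# Hypothetical segment type: (start_seconds, end_seconds, text).
# The real pre-processing output may be shaped differently.
Segment = Tuple[float, float, str]


def format_asr_segments(segments: List[Segment], max_chars: int = 1000) -> str:
    """Render timestamped segments one per line, stopping before max_chars is exceeded."""
    lines = []
    used = 0
    for start, end, text in segments:
        text = text.strip()
        if not text:
            continue  # skip segments with no recognized speech
        line = f"[{start:.1f}s - {end:.1f}s] {text}"
        if used + len(line) + 1 > max_chars:
            break  # keep the prompt within a rough character budget
        lines.append(line)
        used += len(line) + 1  # +1 accounts for the newline separator
    return "\n".join(lines)


def build_prompt_with_asr(question: str, segments: List[Segment]) -> str:
    """Prepend the transcript to the question; fall back to the bare question."""
    transcript = format_asr_segments(segments)
    if not transcript:
        return question
    # Assumed template, not the format the model was trained with.
    return f"Transcribed speech:\n{transcript}\n\nQuestion: {question}"
```

The character budget is a crude stand-in for a token limit, and the template would likely need to match the format used during training for the transcript to help.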