What result should be obtained under normal circumstances after preprocessing?

Question

Xuzhejia opened this issue a year ago · 6 comments

What result should I expect after preprocessing under normal circumstances?
Currently I only have a 0 KB file named lrs3_train_transcript_lengths_seg24s.csv. Is this correct?

Answer 1 · 2024-01-09T05:03:38.000Z

After preprocessing, you should see the training files and their tokens listed in "lrs3_train_transcript_lengths_seg24s.csv". The file should not be 0 KB.
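
A quick way to check the output, as a minimal sketch: `summarize_csv` below is a hypothetical helper, not part of the repository, and it assumes only that the file is a plain CSV with one row per training file. It reports the file size and the number of non-empty rows.

```python
import csv
import os


def summarize_csv(path):
    """Return (size_in_bytes, row_count) for a CSV file.

    Hypothetical helper for illustration; it assumes only that the
    preprocessing output is a plain CSV with one row per training file.
    """
    size = os.path.getsize(path)
    with open(path, newline="") as f:
        # Count rows that contain at least one non-empty field.
        rows = sum(
            1 for row in csv.reader(f) if any(field.strip() for field in row)
        )
    return size, rows


if __name__ == "__main__":
    path = "lrs3_train_transcript_lengths_seg24s.csv"
    if os.path.exists(path):
        size, rows = summarize_csv(path)
        print(f"{path}: {size} bytes, {rows} rows")
        if size == 0:
            print("The file is empty, so preprocessing wrote no entries.")
    else:
        print(f"{path} not found in the current directory")
```

A size of 0 bytes means no entries were written at all, which points to a failure earlier in preprocessing rather than a formatting problem in the CSV.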

Answer 2 · 2024-01-09T06:06:26.000Z

I'm not sure whether I entered the wrong command.
Could you share the command you used, for reference?
Thank you.

Answer 3 · 2024-02-20T02:33:57.000Z

My problem was in the data_module.py file, which tried to read an mp4 file and raised an error at this line:
    waveform, sample_rate = torchaudio.load(data_filename, normalize=True)
This caused the subsequent steps to be skipped. Is this normal, and how can it be resolved?
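
Since the failed load causes later steps to be skipped, it can help to see exactly which files fail and with what error. The following is a minimal diagnostic sketch, not code from the repository: `find_unreadable` and its `load_fn` argument are hypothetical names, and `load_fn` can be any callable that raises on failure.

```python
def find_unreadable(paths, load_fn):
    """Call load_fn on each path and collect (path, error) for every failure.

    Hypothetical diagnostic helper: load_fn is any callable that raises
    an exception when a file cannot be read.
    """
    failures = []
    for path in paths:
        try:
            load_fn(path)
        except Exception as exc:  # broad on purpose: this is a diagnostic pass
            failures.append((path, f"{type(exc).__name__}: {exc}"))
    return failures
```

For example, passing `lambda p: torchaudio.load(p, normalize=True)` as `load_fn` together with a list of mp4 paths would show whether every file fails in the same way, which suggests an audio backend issue rather than a corrupt file.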

Answer 4 · 2024-02-20T08:45:34.000Z

I ran into the same problem and am still looking for a solution.

Answer 5 · 2024-02-20T10:05:09.000Z

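    # Requires: import os, torchaudio; from moviepy.editor import VideoFileClip (moviepy 1.x)
    def load_audio(self, data_filename):
        # Write the audio track to an .mp3 next to the .mp4, then load that file.
        audio_filename = os.path.splitext(data_filename)[0] + ".mp3"
        video = VideoFileClip(data_filename)
        video.audio.write_audiofile(audio_filename, logger=None)
        waveform, sample_rate = torchaudio.load(audio_filename, normalize=True)
        return waveform, sample_rate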

This is my solution.

Answer 6 · 2024-02-20T10:43:30.000Z

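import io

import ffmpeg  # the ffmpeg-python package
import torchaudio


def load_audio(self, data_filename):
    # Fail early if ffprobe cannot read the file (raises ffmpeg.Error).
    ffmpeg.probe(data_filename)
    # Decode the audio track to 16 kHz, 16-bit PCM WAV and pipe it to stdout.
    process = (
        ffmpeg.input(data_filename)
        .output('-', format='wav', acodec='pcm_s16le', ar='16000')
        .run_async(pipe_stdout=True, quiet=True)
    )
    stdout, _ = process.communicate()

    # Load the in-memory WAV bytes instead of the mp4 file itself.
    waveform, sample_rate = torchaudio.load(io.BytesIO(stdout), normalize=True)
    return waveform, sample_rate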

This also works.
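
One thing to check before trying the ffmpeg-based version: the ffmpeg-python package is only a wrapper that launches the `ffmpeg` command-line program, so that program has to be installed separately and be reachable on PATH. A small check, as a sketch (`find_ffmpeg` is a hypothetical helper name):

```python
import shutil


def find_ffmpeg(executable="ffmpeg"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    print(path if path else "ffmpeg not found on PATH")
```

If this prints that ffmpeg was not found, the `load_audio` version above cannot start the decoding process, whatever the state of the mp4 files.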