speechbrain/speechbrain

A few unoptimised pieces of code (augmentation and masking)

TParcollet opened this issue · 2 comments

Describe the bug

While profiling a conformer transducer, I realised that a few functions were taking far more time than expected. I can't share the profiling trace here for confidentiality reasons, but I can share the code used to create it. In short, the resampling function in the LibriSpeech augmentation pipeline takes 10% of every step (forward AND backward included). This resampling function must be changed.
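For anyone wanting to check this kind of figure on their own setup, here is a minimal sketch of measuring what share of a training step one function takes, using only the standard-library `cProfile`. The functions `resample`, `forward_backward` and `train_step` are hypothetical stand-ins (simulated with `time.sleep`), not SpeechBrain code, and `share_of_step` is a helper made up for this illustration.

```python
import cProfile
import pstats
import time


def resample(batch):
    """Hypothetical stand-in for the resampling augmentation."""
    time.sleep(0.02)
    return batch


def forward_backward(batch):
    """Hypothetical stand-in for the model forward and backward pass."""
    time.sleep(0.05)


def train_step(batch):
    forward_backward(resample(batch))


def share_of_step(func_name, n_steps=5):
    """Return the fraction of profiled time spent inside `func_name`."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_steps):
        train_step([0.0] * 16)
    profiler.disable()

    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    cumulative = sum(
        entry[3] for key, entry in stats.stats.items() if key[2] == func_name
    )
    return cumulative / stats.total_tt


if __name__ == "__main__":
    print(f"resample share of step: {share_of_step('resample'):.0%}")
```

Note that `cProfile` only sees CPU-side wall time. For a model running on GPU, kernels launch asynchronously, so a GPU-aware tool such as `torch.profiler` is probably the better choice there.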

The make_transformer_src_mask function is also taking a significant amount of time: around 1/3 of a full conformer model inference.
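The issue does not say why the mask creation is slow. Purely as an illustration, and assuming the cost comes from filling a chunked attention mask with a Python loop, the sketch below compares a loop-based version with a vectorised one built from a single broadcasted comparison. Both `chunk_mask_loop` and `chunk_mask_vectorized` are hypothetical helpers written for this example; they are not the actual `make_transformer_src_mask` implementation nor the fix that was merged.

```python
import torch


def chunk_mask_loop(seq_len: int, chunk_size: int) -> torch.Tensor:
    """Reference version: fills the mask one chunk at a time in Python."""
    mask = torch.ones(seq_len, seq_len, dtype=torch.bool)  # True = masked
    for start in range(0, seq_len, chunk_size):
        end = min(start + chunk_size, seq_len)
        # Frames in this chunk may attend to every frame up to the chunk end.
        mask[start:end, :end] = False
    return mask


def chunk_mask_vectorized(
    seq_len: int, chunk_size: int, device=None
) -> torch.Tensor:
    """Same mask from one broadcasted comparison, with no Python loop."""
    positions = torch.arange(seq_len, device=device)
    chunk_idx = torch.div(positions, chunk_size, rounding_mode="floor")
    # Query row i masks key column j when j lies in a later chunk than i.
    return chunk_idx.unsqueeze(0) > chunk_idx.unsqueeze(1)


if __name__ == "__main__":
    print(chunk_mask_vectorized(5, 2))
```

The vectorised version builds the mask directly on the target device in a handful of tensor operations, whereas the loop issues one slice assignment per chunk, which adds up for long sequences with small chunks.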

@asumagic may want to look into this with me.

Expected behaviour

Faster, better.

#2410 solved half of this issue; the mask creation problem still needs to be fixed before this can be closed.

Solved in #2426 and #2410. Thanks for raising this issue, guys! :)