speechbrain/speechbrain

[Feature Request]: Improve DAC interface

lucadellalib opened this issue · 0 comments

🚀 The feature

The DAC interface differs considerably from those of other audio token extractors (for example, EnCodec: https://github.com/speechbrain/speechbrain/blob/beb0ecedbcf261f4437166598e921c855cf62614/speechbrain/lobes/models/huggingface_transformers/encodec.py). For instance, there is no method for decoding a given token sequence into a waveform, so a workaround is needed to do that. A common interface (method names, return values, tensor dimension order, etc.) would improve modularity.

Solution outline

Refactor DAC to follow a common interface (EnCodec's is probably already good enough).
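As a rough illustration of what "a common interface" could mean here, the sketch below defines an abstract base class with a matching `encode`/`decode` pair. Everything in it is an assumption made for illustration: the names `AudioTokenizer` and `ToyTokenizer`, the method signatures, and the `[batch, frames, codebooks]` token layout are hypothetical and are not taken from the existing DAC or EnCodec classes. The toy codec is plain 8-bit uniform quantization on nested lists, standing in for a real model so the example runs without any dependencies.

```python
from abc import ABC, abstractmethod
from typing import List

# Hypothetical layouts (assumptions, not SpeechBrain's actual conventions):
#   waveform: [batch][time]
#   tokens:   [batch][frames][codebooks]
Waveform = List[List[float]]
Tokens = List[List[List[int]]]


class AudioTokenizer(ABC):
    """Hypothetical common interface that every token extractor would follow."""

    @abstractmethod
    def encode(self, wav: Waveform) -> Tokens:
        """Map a batch of waveforms to discrete tokens."""

    @abstractmethod
    def decode(self, tokens: Tokens) -> Waveform:
        """Map a batch of discrete tokens back to waveforms."""


class ToyTokenizer(AudioTokenizer):
    """Toy stand-in (neither DAC nor EnCodec): uniform quantization,
    one sample per frame, a single codebook."""

    levels = 256

    def encode(self, wav: Waveform) -> Tokens:
        top = self.levels - 1
        # Scale each sample from [-1, 1] to [0, top] and clamp to that range.
        return [
            [[min(top, max(0, round((s + 1.0) / 2.0 * top)))] for s in utt]
            for utt in wav
        ]

    def decode(self, tokens: Tokens) -> Waveform:
        top = self.levels - 1
        # Invert the scaling used in encode, reading the single codebook entry.
        return [[frame[0] / top * 2.0 - 1.0 for frame in utt] for utt in tokens]


if __name__ == "__main__":
    codec: AudioTokenizer = ToyTokenizer()
    tokens = codec.encode([[-1.0, 0.25, 1.0]])
    print(tokens)
    print(codec.decode(tokens))
```

With a shared base class like this, downstream code can call `decode(encode(wav))` on any extractor without model-specific workarounds, which is the modularity benefit described above.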

Additional context

No response