[Feature Request]: Improve DAC interface

Question

lucadellalib opened this issue 5 months ago · 0 comments

The DAC interface differs considerably from that of other audio token extractors (for example EnCodec: https://github.com/speechbrain/speechbrain/blob/beb0ecedbcf261f4437166598e921c855cf62614/speechbrain/lobes/models/huggingface_transformers/encodec.py). For instance, there is no method for decoding a given token sequence into a waveform, so a workaround is needed. A common interface (method names, return values, tensor dimension order, etc.) would improve modularity.

Refactor DAC to follow a common interface (EnCodec's is probably already good enough).
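
To make the request concrete, here is a rough sketch of what a shared interface could look like. All names (`AudioTokenizer`, `ToyTokenizer`, `encode`, `decode`) and the `[batch, frames, codebooks]` token layout are assumptions for illustration, not the existing SpeechBrain, EnCodec, or DAC API. Plain Python lists stand in for tensors so the snippet runs without any dependencies.

```python
from abc import ABC, abstractmethod
from typing import List

# Hypothetical type aliases; real implementations would use torch tensors.
Waveform = List[List[float]]    # [batch, time], samples in [-1, 1]
Tokens = List[List[List[int]]]  # [batch, frames, codebooks]


class AudioTokenizer(ABC):
    """Hypothetical common interface for audio token extractors."""

    @abstractmethod
    def encode(self, wav: Waveform) -> Tokens:
        """Map waveforms to discrete tokens."""

    @abstractmethod
    def decode(self, tokens: Tokens) -> Waveform:
        """Map discrete tokens back to waveforms."""


class ToyTokenizer(AudioTokenizer):
    """Toy stand-in for a codec: one codebook, one token per sample."""

    levels = 256  # uniform 8-bit quantization

    def encode(self, wav: Waveform) -> Tokens:
        tokens = []
        for utterance in wav:
            frames = []
            for sample in utterance:
                clipped = min(1.0, max(-1.0, sample))
                index = int((clipped + 1.0) / 2.0 * self.levels)
                frames.append([min(self.levels - 1, index)])
            tokens.append(frames)
        return tokens

    def decode(self, tokens: Tokens) -> Waveform:
        # Reconstruct each sample as the centre of its quantization bin.
        return [
            [(frame[0] + 0.5) / self.levels * 2.0 - 1.0 for frame in utterance]
            for utterance in tokens
        ]


codec: AudioTokenizer = ToyTokenizer()
wav = [[0.0, 0.5, -1.0]]
tokens = codec.encode(wav)
reconstructed = codec.decode(tokens)
```

With something along these lines, DAC and EnCodec wrappers could be swapped without changing the calling code, since both would expose `encode` and `decode` with the same dimension order.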
