[Feature Request]: Improve DAC interface
lucadellalib opened this issue · 0 comments
lucadellalib commented
The feature
The DAC interface differs considerably from those of the other audio token extractors (for example, EnCodec: https://github.com/speechbrain/speechbrain/blob/beb0ecedbcf261f4437166598e921c855cf62614/speechbrain/lobes/models/huggingface_transformers/encodec.py). For instance, there is no method for decoding a given token sequence into a waveform, so a workaround is needed to do that. A common interface (method names, return values, tensor dimension order, etc.) would improve modularity.
Solution outline
Refactor DAC to follow a common interface (EnCodec's is probably already good enough).
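
To make the idea concrete, here is a rough sketch of what a shared interface could look like. Everything in it is an assumption for illustration: `AudioTokenizer`, `ToyUniformCodec`, and `roundtrip` are hypothetical names, not existing SpeechBrain classes, and the toy codec works on plain Python lists rather than tensors so the snippet runs without any dependencies. The only point is that every extractor would expose the same `encode`/`decode` pair.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class AudioTokenizer(ABC):
    """Hypothetical common interface (illustrative, not SpeechBrain's API)."""

    @abstractmethod
    def encode(self, waveform: Sequence[float]) -> List[int]:
        """Map a waveform to a sequence of discrete tokens."""

    @abstractmethod
    def decode(self, tokens: Sequence[int]) -> List[float]:
        """Map a token sequence back to a waveform."""


class ToyUniformCodec(AudioTokenizer):
    """Toy stand-in for a real codec: uniform quantization of samples in [-1, 1]."""

    def __init__(self, num_levels: int = 256) -> None:
        self.num_levels = num_levels

    def encode(self, waveform: Sequence[float]) -> List[int]:
        top = self.num_levels - 1
        # Clamp each sample to [-1, 1], then map it to an integer level in [0, top].
        return [round((min(max(x, -1.0), 1.0) + 1.0) / 2.0 * top) for x in waveform]

    def decode(self, tokens: Sequence[int]) -> List[float]:
        top = self.num_levels - 1
        # Map each integer level back to a sample in [-1, 1].
        return [t / top * 2.0 - 1.0 for t in tokens]


def roundtrip(codec: AudioTokenizer, waveform: Sequence[float]) -> List[float]:
    """Works with any extractor implementing the interface, no workaround needed."""
    return codec.decode(codec.encode(waveform))
```

With such an interface, downstream code like `roundtrip` would not need to know whether it is talking to DAC, EnCodec, or another extractor.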
Additional context
No response