CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
Primary LanguagePython