audio-captioning
There are 27 repositories under the audio-captioning topic.
soham97/awesome-sound_event_detection
Reading list for research topics in Sound AI
Labbeti/aac-datasets
Audio Captioning datasets for PyTorch.
TheoCoombes/ClipCap
Using pretrained encoder and language models to generate captions from multimedia inputs.
audio-captioning/clotho-dataset
Python code for handling the Clotho dataset.
ilaria-manco/muscaps
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)
ilaria-manco/song-describer
Song Describer is a data collection platform for annotating music with textual descriptions.
an-tran528/wavetransformer
Code base for WaveTransformer: A novel architecture for automated audio captioning
audio-captioning/dcase-2020-baseline
Audio captioning baseline system for DCASE 2020 challenge.
Labbeti/aac-metrics
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
slSeanWU/beats-conformer-bart-audio-captioner
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation"
soham97/sound_ai_progress
Tracking the state of the art and recent results (bibliography) on sound tasks.
minguinho26/Prefix_AAC_ICASSP2023
Official implementation of "Prefix tuning for Automated Audio Captioning" (ICASSP 2023)
lukewys/dcase_2020_T6
2nd place solution for 2020 DCASE challenge task 6 audio captioning. http://dcase.community/challenge2020/task-automatic-audio-captioning-results#wuyusong2020_t6
blmoistawinde/fense
Fluency ENhanced Sentence-bert Evaluation (FENSE), a metric for audio caption evaluation, plus the benchmark datasets AudioCaps-Eval and Clotho-Eval.
ExplainableML/ZerAuCap
[NeurIPS 2023 - ML for Audio Workshop (Oral)] Zero-shot audio captioning with audio-language model guidance and audio context keywords
audio-captioning/caption-evaluation-tools
Tools for the evaluation of audio captioning.
Labbeti/conette-audio-captioning
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
iOPENCap/awesome-unimodal-training
Text-only or language-free training for multimodal tasks (image/audio/video captioning, retrieval, text2image)
Sreyan88/RECAP
Code for ICASSP 2024 Paper: RECAP: Retrieval-Augmented Audio Captioning
abikaki/DCASE-Workshop-Papers
Workshop on Detection and Classification of Acoustic Scenes and Events
satvik-dixit/mace
Code for the paper: MACE: Leveraging Audio for Evaluating Audio Captioning Systems
Labbeti/dcase2024-task6-baseline
DCASE2024 Challenge Task 6 baseline system (Automated Audio Captioning)
audio-captioning/clotho-dataloader
PyTorch dataloader for Clotho dataset.
paniquex/Automated_Audio_Captioning_DCASE2020
Solution for Task 6 of DCASE 2020
dr-costas/clotho-baseline-dataset
Code for use with the Clotho dataset
zelaki/wsac
Repository code for "Weakly Supervised Automated Audio Captioning via Text Only Training"
Labbeti/dcase2021task6
IRIT-UPS DCASE 2021 audio captioning system