Baseline for the IEEE ICME 2024 Grand Challenge (GC): Semi-supervised Acoustic Scene Classification under Domain Shift

Semi-supervised Acoustic Scene Classification under Domain Shift

Baseline for IEEE ICME 2024 Grand Challenge.

This challenge aims to push the boundaries of computational audition by tackling one of its most compelling problems: classifying acoustic scenes effectively under significant domain shift.

Challenge Website

ICME2024 GC
Challenge website
Google groups
Development dataset
Evaluation dataset

Updates

2024-FEB-05 The challenge has started and links to the development dataset and registration are available.

2024-MAR-10 Updated the train.py script to fix an issue in the macro-average accuracy calculation.

2024-MAR-15 The link to the evaluation dataset is available.
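
The macro-average accuracy mentioned in the 2024-MAR-10 update is commonly defined as the mean of the per-class accuracies, so that every scene class counts equally regardless of how many clips it has. The sketch below illustrates that standard definition only; it is not taken from train.py, and the function name and the example labels are assumptions made for illustration.

```python
from collections import defaultdict


def macro_average_accuracy(y_true, y_pred):
    """Mean of per-class accuracies, with every class weighted equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correct predictions per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)


# Illustrative, imbalanced example (labels are hypothetical).
y_true = ["bus", "bus", "bus", "bus", "park"]
y_pred = ["bus", "bus", "bus", "bus", "bus"]

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(overall)                                  # 0.8
print(macro_average_accuracy(y_true, y_pred))   # 0.5
```

The example shows why the two metrics differ: a model that always predicts the majority class reaches 0.8 overall accuracy but only 0.5 macro-average accuracy.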

Official Baseline

Step 1: Python Running Environment

conda create -n ASC python=3.10
conda activate ASC
git clone git@github.com:JishengBai/ICME2024ASC.git; cd ICME2024ASC
pip install -r requirement.txt

Step 2: Setup Dataset

This step includes dataset download, unzip, and feature extraction.

# Takes about an hour
python3 setup_data.py
# Our development dataset is available on Zenodo: https://zenodo.org/records/10616533.
# Our evaluation dataset was released on Mar 15, 2024.
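
The unzip part of this step can be sketched with the Python standard library. This is only an illustration, not the contents of setup_data.py: the function name `extract_archive` and the paths in the usage lines are hypothetical, and the download and feature-extraction parts are left out.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir):
    """Extract a zip archive into out_dir and return the extracted file paths.

    Hypothetical helper for illustration; setup_data.py may work differently.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    # Report files relative to out_dir, in a stable order.
    return sorted(
        p.relative_to(out_dir).as_posix()
        for p in out_dir.rglob("*")
        if p.is_file()
    )


if __name__ == "__main__":
    # Hypothetical paths; replace them with the archive downloaded from Zenodo.
    archive = Path("data/development.zip")
    if archive.exists():
        print(extract_archive(archive, "data/development"))
```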

Step 3: Train and Evaluate Model

# Model training, which includes the following three steps:
# (1) Training with limited labels; (2) Pseudo labeling; (3) Model training with pseudo labels.
# In total the training process takes about 30 minutes on a single NVIDIA 2080 Ti.
python train.py

# Model testing.
python test.py

You can find an example training log here
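
The pseudo-labeling stage in the training procedure above (step 2 of 3) can be illustrated with a small sketch. It assumes a confidence-threshold rule, which is one common way to pick pseudo labels; the selection rule actually used in train.py is not described here, and the function name, the threshold of 0.9, and the example probabilities are all assumptions.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample_index, class_index) pairs for confident predictions.

    `probabilities` is a list of per-class probability lists, one per
    unlabeled clip, as produced by a model trained on the labeled subset.
    The threshold value is an assumption, not the baseline's setting.
    """
    selected = []
    for idx, probs in enumerate(probabilities):
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Hypothetical model outputs for three unlabeled clips and three classes.
unlabeled_probs = [
    [0.95, 0.03, 0.02],  # confident: kept with class 0
    [0.40, 0.35, 0.25],  # uncertain: discarded
    [0.05, 0.05, 0.90],  # confident: kept with class 2
]

pseudo = select_pseudo_labels(unlabeled_probs, threshold=0.9)
print(pseudo)  # [(0, 0), (2, 2)]
```

In the third stage, the clips selected this way would be added to the labeled data with their predicted classes, and the model would be trained again on the combined set.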

Cite

@misc{bai2024description,
      title={Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift}, 
      author={Jisheng Bai and Mou Wang and Haohe Liu and Han Yin and Yafei Jia and Siwei Huang and Yutong Du and Dongzhe Zhang and Dongyuan Shi and Woon-Seng Gan and Mark D. Plumbley and Susanto Rahardja and Bin Xiang and Jianfeng Chen},
      year={2024},
      eprint={2402.02694},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}
@ARTICLE{9951400,
  author={Bai, Jisheng and Chen, Jianfeng and Wang, Mou and Ayub, Muhammad Saad and Yan, Qingli},
  journal={IEEE Transactions on Cognitive and Developmental Systems}, 
  title={A Squeeze-and-Excitation and Transformer-Based Cross-Task Model for Environmental Sound Recognition}, 
  year={2023},
  volume={15},
  number={3},
  pages={1501-1513},
  keywords={Task analysis;Acoustics;Computational modeling;Speech recognition;Transformers;Pattern recognition;Computer architecture;Attention mechanism;cross-task model;data augmentation;environmental sound recognition (ESR)},
  doi={10.1109/TCDS.2022.3222350}}

Organization

  • Northwestern Polytechnical University, China
  • Xi'an Lianfeng Acoustic Technologies Co., Ltd., China
  • Nanyang Technological University, Singapore
  • Institute of Acoustics, Chinese Academy of Sciences, China
  • University of Surrey, UK