AudioQR




YouTube Video for Audio QR Code

Abstract

Image-based quick response (QR) codes are widely used, but they create barriers for visually impaired people. With the goal of AI for good, this paper proposes AudioQR, a barrier-free QR coding mechanism for the visually impaired population via deep neural audio watermarks. Previous audio watermarking approaches are mainly based on handcrafted pipelines, which are less secure and difficult to apply in large-scale scenarios. In contrast, AudioQR is the first comprehensive end-to-end pipeline that hides watermarks in audio imperceptibly and robustly. To achieve this, we jointly train an encoder and decoder, where the encoder is structured as a concatenation of transposed convolutions and multi-receptive field fusion modules. Moreover, we customize the decoder training with a stochastic data augmentation chain to make the watermarked audio robust to different audio distortions, such as environmental background noise, room impulse response when played through the air, surrounding music, and Gaussian noise. Experimental results indicate that AudioQR can efficiently hide arbitrary information in audio without introducing a significant perceptible difference.
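
The abstract describes the encoder as a stack of transposed convolutions interleaved with multi-receptive field (MRF) fusion modules. The sketch below is a minimal, illustrative PyTorch version of such an encoder, in the spirit of a HiFi-GAN-style generator; the layer sizes, kernel/dilation choices, upsampling rates, and the way the message code is conditioned are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn


class MRFBlock(nn.Module):
    """Fuse parallel dilated-convolution branches with different receptive fields."""

    def __init__(self, channels, kernel_sizes=(3, 7, 11), dilations=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.LeakyReLU(0.1),
                nn.Conv1d(channels, channels, k, dilation=d, padding=(k - 1) * d // 2),
            )
            for k in kernel_sizes
            for d in dilations
        ])

    def forward(self, x):
        # Residual fusion: average the outputs of all receptive-field branches.
        return x + torch.stack([branch(x) for branch in self.branches]).mean(dim=0)


class WatermarkEncoder(nn.Module):
    """Upsample a frame-rate message code into a waveform-rate watermark residual."""

    def __init__(self, msg_dim=50, hidden=256, upsample_rates=(8, 8, 4)):
        super().__init__()
        self.pre = nn.Conv1d(msg_dim, hidden, kernel_size=7, padding=3)
        ups, mrfs = [], []
        ch = hidden
        for r in upsample_rates:
            ups.append(nn.ConvTranspose1d(ch, ch // 2, kernel_size=2 * r, stride=r, padding=r // 2))
            ch //= 2
            mrfs.append(MRFBlock(ch))
        self.ups = nn.ModuleList(ups)
        self.mrfs = nn.ModuleList(mrfs)
        self.post = nn.Conv1d(ch, 1, kernel_size=7, padding=3)

    def forward(self, msg_code):
        # msg_code: (batch, msg_dim, frames), e.g. the watermark message broadcast
        # over time, possibly fused with host-audio features (an assumption here).
        x = self.pre(msg_code)
        for up, mrf in zip(self.ups, self.mrfs):
            x = mrf(up(x))
        # The residual is added to the host waveform to form the watermarked audio.
        return torch.tanh(self.post(x))
```

With the assumed defaults, a (1, 50, 32) message code maps to a (1, 1, 8192) watermark residual, since the transposed convolutions upsample by a factor of 8 × 8 × 4 = 256.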



Demos

To appear.

Prepare the environment

pip3 install -r requirements.txt

Download the required datasets

bash dataset_download.sh

Run Training

python3 train.py -c configs/ljs_base.json -m AUGsimple_10-20_Dual_decoder_mixed_mel_10_20k_20k_agmt_1_2k_200 --ptb_type mixed --mel_w 10 --mel_start 20000 --mel_len 20000 --agmt_w 1 --agmt_start 2000 --agmt_len 200 --batch_size 64 --msg_dim 50 --max_step 1000000
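
The flags above (--ptb_type, --agmt_w, --agmt_start, --agmt_len, etc.) suggest that the perturbation type and augmentation schedule are configurable. The sketch below illustrates one possible stochastic distortion chain covering the distortion types named in the abstract (Gaussian noise, environment background, surrounding music, and room impulse response). The sampling probabilities, SNR ranges, and function names are assumptions for illustration, not the repository's implementation.

```python
import random

import torch
import torch.nn.functional as F


def mix_at_snr(signal, noise, snr_db):
    """Mix `noise` into `signal` at the given signal-to-noise ratio (dB)."""
    sig_pow = signal.pow(2).mean()
    noise_pow = noise.pow(2).mean().clamp_min(1e-8)
    scale = torch.sqrt(sig_pow / (noise_pow * 10 ** (snr_db / 10)))
    return signal + scale * noise


def stochastic_distortion(wav, background, music, rir, p=0.5):
    """Randomly distort a mono waveform tensor of shape (T,) before decoding.

    `background`, `music`, and `rir` are assumed to be pre-loaded mono tensors,
    with the first two at least as long as `wav`.
    """
    out = wav
    if random.random() < p:  # Gaussian noise
        out = mix_at_snr(out, torch.randn_like(out), snr_db=random.uniform(20, 40))
    if random.random() < p:  # environment background noise
        out = mix_at_snr(out, background[: out.numel()], snr_db=random.uniform(10, 30))
    if random.random() < p:  # surrounding music
        out = mix_at_snr(out, music[: out.numel()], snr_db=random.uniform(10, 30))
    if random.random() < p:  # room impulse response (over-the-air playback)
        kernel = rir.flip(0).view(1, 1, -1)  # flip to perform true convolution
        out = F.conv1d(out.view(1, 1, -1), kernel, padding=rir.numel() - 1)
        out = out.view(-1)[: wav.numel()]
    return out
```

Applying each distortion independently with probability p approximates a "mixed" perturbation regime, and because every operation here is differentiable with respect to the waveform, gradients from the decoder can still flow back to the encoder during joint training.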