SynthAC Demo

This is the demo for the paper "Synth-AC: Enhancing Audio Captioning with Synthetic Supervision".

Demo of image captions and audio captions

We provide examples of image captions alongside audio captions to show the implicit relation between image captions and acoustic scenes.

Image Caption | Audio Caption
A black car is near someone riding a bike | A man talking and a car passing by loudly
A barking dog looks over a ledge lined with Christmas lights | Dog barking and growling
A cat sleeping on a rock near a bike | A cat sleeps and snores

Synthetic data demo

Examples of the synthetic text-audio pairs are provided in synthetic_data.

Some interesting examples

We also provide the following generated examples:

  • Examples to show that the visual description (e.g., the color "white/black" in "a white/black car") does not affect the content of synthetic audio, at the path "examples about 'a white or black car'".
  • Examples of synthetic audio with the prompt "a ledge lined with Christmas lights", at the path "examples about 'a ledge lined with Christmas lights'".

License

This project is released under the CC BY-NC-ND 4.0 license.