3 Settings

We will release the detailed configuration for every experimental setting on the GRID and V2C Animation datasets, including the ".txt" files that list the reference audio.

1. Result of Dubbing Setting1

  • Text: “Yes, I’m the baby Jesus”

Ground Truth:

0_GT.mp4

👇👇 The generated results:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4

2. Result of Dubbing Setting2

2.1 The GRID Setting2 Results

(The reference audio comes from the same speaker in other video clips)

  • Reference audio:

    Grid_S10.mp4
  • Text: “place red with m eight now”


Ground Truth:

0_GT.mp4

👇👇 The generated results:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4

2.2 The V2C Animation Setting2 Results

(The reference audio comes from the same speaker in other video clips)

  • Reference audio:

    FrozenII@Anna.mp4
  • Text: “You are not responsible for their choices, Elsa.”


Ground Truth:

👇👇 The generated results:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4

3. Result of Dubbing Setting3

(Zero-shot Test)

3.1 Female voice actors dubbing female characters

  • Reference audio:

    Grid-S29.mp4
  • Text: “Why did you bring me here?”

  • Dubbing_Video_Raw (provides the silent video information)

    Frozen@Elsa.mp4

Ground Truth:

⚠️ There are no ground-truth (GT) result, since this test is zero-shot testing for unseen speakers.

πŸ‘‡πŸ‘‡ The Generated result:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4

3.2 Female voice actors dubbing male characters

  • Reference audio:

    Grid_S16.mp4
  • Text: “I thought you would understand.”

  • Dubbing_Video_Raw (provides the silent video information)

    Cloudy@Earl.mp4

Ground Truth:

⚠️ There are no ground-truth (GT) result, since this test is zero-shot testing for unseen speakers.

πŸ‘‡πŸ‘‡ The Generated result:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4

3.3 Male voice actors dubbing male characters

  • Reference audio:

    Grid_S32.mp4
  • Text: “I can't help. I can't help anyone.”

  • Dubbing_Video_Raw (provides the silent video information)

    Toy@Buzz.mp4

Ground Truth:

⚠️ There are no ground-truth (GT) result, since this test is zero-shot testing for unseen speakers.

πŸ‘‡πŸ‘‡ The Generated result:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4

3.4 Male voice actors dubbing female characters

  • Reference audio:

    Grid-S2.mp4
  • Text: “She was my whole world.”

  • Dubbing_Video_Raw (provides the silent video information)

    ToyII@Jessie_00_0607_00.mp4

Ground Truth:

⚠️ There are no ground-truth (GT) result, since this test is zero-shot testing for unseen speakers.

πŸ‘‡πŸ‘‡ The Generated result:

1_Fastspeech2.mp4
2_Stylespeech.mp4
3_V2C-Net.mp4
4_Zeroshot-TTS.mp4
5_FaceTTS.mp4
6_HPMDubbing.mp4
Our_StyleDubber.mp4