rafaelvalle/asrgen

Generate target samples


Thank you for your contribution. I have a few questions about the experiments and hope you can answer them.
First question:
In gan_synthesis.ipynb

audio = load_wav_to_torch('data_16khz/zcathy/cathy.wav', SAMPLING_RATE)
audio /= MAX_WAV_VALUE
audio = audio[None, :]
reference_mel = taco_stft.mel_spectrogram(audio)[0]
print(reference_mel.min(), reference_mel.max())

mel -= mel.min()
mel = mel / mel.max()
mel = mel * reference_mel.max()
print(mel.min(), mel.max())

Does mel = mel * reference_mel.max() match the generated fake audio to the real audio?
I don't quite understand how to use the trained G_NET to generate audio whose voiceprint matches the target.

Second question:
Is gan_attack.ipynb a target attack?
The target ID you set is 0. Can this be modified and replaced with another ID?

Looking forward to your reply!

1a) That rescales the generated mel-spectrogram to the range of the target mel-spectrogram: it is min-max normalized to [0, 1] and then multiplied by reference_mel.max().
1b) samples = G_net(noise) generates fake samples.

2) Yes, it is a targeted attack.
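
To illustrate point 1a, here is a minimal sketch of the three rescaling lines from the question, using NumPy arrays as a stand-in for the torch tensors in the notebook. The function name rescale_to_reference, the array shapes, and the random inputs are assumptions made for illustration only; they are not taken from the repository.

```python
import numpy as np


def rescale_to_reference(mel, reference_mel):
    """Min-max normalize `mel` to [0, 1], then scale so its max equals reference_mel.max().

    Hypothetical helper that mirrors the three lines quoted in the question.
    """
    mel = mel - mel.min()        # shift so the minimum becomes 0
    peak = mel.max()
    if peak > 0:                 # guard against a constant input (not in the original lines)
        mel = mel / peak         # the maximum becomes 1
    return mel * reference_mel.max()


# Random arrays standing in for a generated and a reference mel-spectrogram (assumed shapes).
rng = np.random.default_rng(0)
generated = rng.normal(size=(80, 100))
reference = rng.uniform(0.0, 4.0, size=(80, 120))

scaled = rescale_to_reference(generated, reference)
print(scaled.min(), scaled.max(), reference.max())
```

Note that only the maximum is matched: the result spans 0 to reference_mel.max(), regardless of the reference's minimum.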

Closing due to inactivity.