To listen to the samples, download all the files, extract the four compressed archives locally, and open index.html to view the audio demo page.

The samples cover our proposed W2VC model and three baseline models: VQ-VAE, CTC-VQ-VAE, and FragmentVC. Four conversion pairs are presented for each model. The source speakers are the male speaker “rms” and the female speaker “clb”; the target speakers are the male speaker “bdl” and the female speaker “slt”. The four conversions applied to each model are: rms→slt (male to female), rms→bdl (male to male), clb→slt (female to female), and clb→bdl (female to male).
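As a quick cross-check of the layout described above, the following minimal sketch enumerates the source/target combinations and the resulting model-by-pair grid. The speaker and model names come from the description; the function names `conversion_pairs` and `sample_grid` are hypothetical helpers introduced only for illustration, and nothing here assumes how the audio files themselves are named.

```python
from itertools import product

# Model and speaker names as listed in the description above.
MODELS = ["W2VC", "VQ-VAE", "CTC-VQ-VAE", "FragmentVC"]
SOURCES = {"rms": "male", "clb": "female"}   # source speakers
TARGETS = {"slt": "female", "bdl": "male"}   # target speakers


def conversion_pairs():
    """Return (source, target, description) for every source/target combination."""
    return [
        (src, tgt, f"{SOURCES[src]} to {TARGETS[tgt]}")
        for src, tgt in product(SOURCES, TARGETS)
    ]


def sample_grid():
    """Return one (model, source, target) entry per model and conversion pair."""
    return [(model, src, tgt) for model in MODELS for src, tgt, _ in conversion_pairs()]


if __name__ == "__main__":
    for src, tgt, desc in conversion_pairs():
        print(f"{src} -> {tgt} ({desc})")
    print(f"{len(sample_grid())} model/pair combinations in total")
```

With four models and four conversion pairs, the grid contains 16 model/pair combinations.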