
Testing from .wav failed


Hi, I ran the following test code to convert .wav -> mel with librosa and then used UnivNet with the pretrained checkpoint to invert the mel back to audio, but the results were extremely bad. Can you point out what I'm doing wrong? The input file is clean US English speech. Arguments: -p ./chkpt/ -c config/default_c16.yaml -i /Users/kelseyd/Documents/train/TF -o ./out

    for filename in tqdm.tqdm(glob.glob(os.path.join(args.input_folder, '*.wav'))):
        y, sr = librosa.load(filename, sr=24000)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, n_mels=100, fmin=0, fmax=12000)
        mel = torch.from_numpy(mel)

        if len(mel.shape) == 2:
            mel = mel.unsqueeze(0)

        audio = model.inference(mel)
        audio = audio.cpu().detach().numpy()
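
One thing that may be worth checking in the snippet above: a vocoder checkpoint generally only works with mels computed the same way as its training features. The call above leaves `hop_length` at librosa's default of 512 and `power` at its default of 2.0 (a power spectrogram), and applies no log compression. Below is a minimal sketch of a log-magnitude mel extraction. The values `hop_length=256`, `win_length=1024`, and the clamp floor `1e-5` are assumptions, not confirmed against this repository; they should be checked against config/default_c16.yaml and the repository's own mel code. The helper names `log_compress` and `wav_to_log_mel` are hypothetical.

```python
import numpy as np


def log_compress(mel, clip_val=1e-5):
    """Natural-log compression with a floor, so silent bins do not give -inf.

    clip_val=1e-5 is an assumed floor; use whatever the training code used.
    """
    return np.log(np.clip(mel, a_min=clip_val, a_max=None))


def wav_to_log_mel(path, sr=24000, n_fft=1024, hop_length=256,
                   win_length=1024, n_mels=100, fmin=0.0, fmax=12000.0):
    """Load a wav file and return a float32 log-magnitude mel, shape (n_mels, frames).

    hop_length and win_length are assumptions; read the real values from the
    model's config file rather than trusting these defaults.
    """
    import librosa  # imported lazily so log_compress works without librosa

    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length,
        win_length=win_length, n_mels=n_mels, fmin=fmin, fmax=fmax,
        power=1.0,  # magnitude spectrogram instead of the default power (2.0)
    )
    return log_compress(mel).astype(np.float32)
```

The result could then go through `torch.from_numpy(...)` and `unsqueeze(0)` exactly as in the snippet above. Even with matching parameters, librosa's framing and padding may differ from the STFT used at training time, so reusing the repository's own mel extraction, if it exposes one, is likely the safer route.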