soumik-kanad/diff2lip

I am stuck here.

Closed this issue · 9 comments

Executing command: python generate.py --attention_resolutions 32,16,8 --class_cond False --learn_sigma True --num_channels 128 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm False --predict_xstart False --diffusion_steps 1000 --noise_schedule linear --rescale_timesteps False --sampling_seed=7 --sampling_input_type=gt --sampling_ref_type=gt --timestep_respacing ddim25 --use_ddim True --model_path=D:/diff2lip-main/checkpoints/e7.24.1.3_model260000_paper.pt --nframes 5 --nrefer 1 --image_size 128 --sampling_batch_size=32 --face_hide_percentage 0.5 --use_ref=True --use_audio=True --audio_as_style=True --generate_from_filelist 0
MPI.COMM_WORLD.Get_rank() 0
os.environ["CUDA_VISIBLE_DEVICES"] 0
Logging to d2l_gen
creating model...
-vf: No such file or directory
Unrecognized option '2'.
Error splitting the argument list: Option not found
C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\librosa\util\decorators.py:88: UserWarning: PySoundFile failed. Trying audioread instead.
return f(*args, **kwargs)
Traceback (most recent call last):
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\librosa\core\audio.py", line 164, in load
y, sr_native = __soundfile_load(path, offset, duration, dtype)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\librosa\core\audio.py", line 195, in __soundfile_load
context = sf.SoundFile(path)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\soundfile.py", line 740, in __init__
self._file = self._open(file, mode_int, closefd)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\soundfile.py", line 1264, in _open
_error_check(_snd.sf_error(file_ptr),
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\soundfile.py", line 1455, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'd2l_gen\temp\audio.wav': System error.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "D:\diff2lip-main\generate.py", line 398, in <module>
main()
File "D:\diff2lip-main\generate.py", line 336, in main
generate(args.video_path, args.audio_path, model, diffusion, detector, args, out_path=args.out_path, save_orig=args.save_orig)
File "D:\diff2lip-main\generate.py", line 204, in generate
wrong_all_indiv_mels, wrong_audio_wavform = load_all_indiv_mels(audio_path, args)
File "D:\diff2lip-main\generate.py", line 47, in load_all_indiv_mels
wav = audio.load_wav(out_path, args.sample_rate)
File "D:\diff2lip-main\audio\audio.py", line 10, in load_wav
return librosa.core.load(path, sr=sr)[0]
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\librosa\util\decorators.py", line 88, in inner_f
return f(*args, **kwargs)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\librosa\core\audio.py", line 170, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\librosa\core\audio.py", line 226, in __audioread_load
reader = audioread.audio_open(path)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\audioread\__init__.py", line 127, in audio_open
return BackendClass(path)
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\audioread\rawread.py", line 59, in __init__
self._fh = open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'd2l_gen\temp\audio.wav'
Error: Command failed with exit code 1

Have you specified the audio file path properly?
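One quick way to rule that out before launching the script is to check that the input files actually exist. This is a minimal sketch; `missing_inputs` and the example paths are hypothetical, not part of the repo:

```python
import os

def missing_inputs(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Example (hypothetical paths):
# missing_inputs(["D:/diff2lip-main/input/video.mp4",
#                 "D:/diff2lip-main/input/audio.wav"])
```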

Yes, you need to simplify usage; that will be a lot easier. And now I get a torch error:

Executing command: python generate_dist.py --attention_resolutions 32,16,8 --class_cond False --learn_sigma True --num_channels 128 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm False --predict_xstart False --diffusion_steps 1000 --noise_schedule linear --rescale_timesteps False --sampling_seed=7 --sampling_input_type=gt --sampling_ref_type=gt --timestep_respacing ddim25 --use_ddim True --model_path=D:/diff2lip-main/checkpoints/e7.24.1.3_model260000_paper.pt --nframes 5 --nrefer 1 --image_size 128 --sampling_batch_size=32 --face_hide_percentage 0.5 --use_ref=True --use_audio=True --audio_as_style=True --generate_from_filelist 0
MPI.COMM_WORLD.Get_rank() 0
os.environ["CUDA_VISIBLE_DEVICES"] 0
Logging to d2l_gen
creating model...
Recovering from OOM error; New batch size: 32
Traceback (most recent call last):
File "D:\diff2lip-main\generate_dist.py", line 428, in <module>
main()
File "D:\diff2lip-main\generate_dist.py", line 366, in main
generate(args.video_path, args.audio_path, model, diffusion, detector, args, out_path=args.out_path, save_orig=args.save_orig)
File "D:\diff2lip-main\generate_dist.py", line 250, in generate
torch.cuda.synchronize()
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\torch\cuda\__init__.py", line 799, in synchronize
_lazy_init()
File "C:\Users\ggrov\anaconda3\envs\diff2lip\lib\site-packages\torch\cuda\__init__.py", line 293, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
Error: Command failed with exit code 1
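The "Torch not compiled with CUDA enabled" assertion means the installed PyTorch wheel is CPU-only. A quick way to check, and one possible fix (the cu118 index URL is only an example; pick the build matching your CUDA version from pytorch.org):

```shell
# Print the torch version and whether CUDA is usable; a "+cpu" version
# suffix or "False" here means a CPU-only build is installed.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

# Reinstall a CUDA-enabled build (example: CUDA 11.8 wheels).
pip uninstall -y torch
pip install torch --index-url https://download.pytorch.org/whl/cu118
```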

FileNotFoundError: [Errno 2] No such file or directory: 'd2l_gen\temp\0\audio.wav'
Error: Command failed with exit code 1

That issue occurs in the load_all_indiv_mels() function, which first extracts the audio into an "audio.wav" file and then converts it into a mel-spectrogram. The conversion to audio.wav is failing. I suspect this is because of an error in the ffmpeg command, or because ffmpeg is missing on your system.
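For reference, that extraction step can be sketched as below. These are hypothetical helpers, not the repo's actual code, and the exact ffmpeg flags diff2lip uses may differ. Passing the arguments as a list (rather than one shell string) avoids the quoting problems that produce errors like `-vf: No such file or directory` on Windows, and checking for ffmpeg up front fails with a clear message instead of a later FileNotFoundError on audio.wav:

```python
import shutil
import subprocess

def build_extract_cmd(video_path, out_path, sample_rate=16000):
    """Build an ffmpeg command: strip video (-vn), downmix to mono (-ac 1),
    and resample (-ar) to the given rate."""
    return ["ffmpeg", "-y", "-i", video_path,
            "-vn", "-ac", "1", "-ar", str(sample_rate), out_path]

def extract_wav(video_path, out_path, sample_rate=16000):
    """Extract audio from a video into a wav file, failing early if ffmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH; install it and retry")
    subprocess.run(build_extract_cmd(video_path, out_path, sample_rate),
                   check=True, capture_output=True)
```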

Also, the code has been tested on Linux only, and there are known distributed data parallel issues on the Windows platform right now. I suggest that you try the code out in a Google Colab notebook instead.

Did you make a Colab notebook for this?

@zachysaur I do have a Colab notebook for this now.