Regarding codebook
Hey, this is a fantastic repo that I found during my research over the last few weeks. I am trying to understand a few things in your code. Could you help me with the questions below?
- The codebook is missing. Will I get it after training the model? I have looked through the code, but I could not find where it is written.
- Can I use the same codebook that is present in CODEBOOK?
- After getting the BVH, is there any way to convert it into a human avatar image?
Waiting for the solution :)
Thanks
Sai
Dear Sai,
Sorry for the confusing code. You should use sample.py rather than inference.py; I have deleted main/mydiffusion_zeggs/inference.py. Also, this work does not use the codebook.
Best wishes.
Hi YoungSeng, thanks for your reply.
Following your suggestion, I started experimenting with sample.py.
- First, it worked fine with a file named in this format: 015_Happy_4_x_1_0.wav
- When I tried a plain name like 1.wav, sample.py threw the error below:
Traceback (most recent call last):
  File "/content/drive/MyDrive/DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py", line 418, in <module>
    main(config, save_dir, config.model_path, audio_path=None, mfcc_path=None, audiowavlm_path=config.audiowavlm_path, max_len=config.max_len)
  File "/content/drive/MyDrive/DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py", line 378, in main
    style = style2onehot[audiowavlm_path.split('/')[-1].split('_')[1]]
IndexError: list index out of range
Is there a particular format required for the input file name? Could you please help me with this?
Regarding Input Format
In what format should the input be provided, and what size and shape should the input file have? Could you please help with this as well?
Thanks
Sai
Dear Sai,
The code is a hard-coded demo. If you want to use your own audio, you can comment out the line that reads the style from the file name and uncomment any of the lines that follow it (DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py, lines 379 to 380 in 85f4096) to choose your own style and intensity, as in DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py, lines 20 to 27 in 85f4096.
Hope this will help you!
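The IndexError in the earlier traceback follows from how the style is read from the file name: the name is split on '_' and the second piece is used, so a name like 1.wav has no second piece. A minimal sketch of a more forgiving lookup is given here. The helper style_token_from_path and its default argument are hypothetical and not part of the repository. Unlike the original line, it also strips the file extension before splitting.

```python
import os


def style_token_from_path(path, default=None):
    """Return the second '_'-separated token of the file name, or `default`.

    Hypothetical helper (not in the repository): it mirrors the idea of
    `audiowavlm_path.split('/')[-1].split('_')[1]` but does not raise when
    the file name contains no underscore.
    """
    name = os.path.basename(path)          # e.g. "015_Happy_4_x_1_0.wav"
    stem, _ext = os.path.splitext(name)    # drop ".wav"
    parts = stem.split("_")
    if len(parts) < 2:
        # Names such as "1.wav" carry no style token; fall back to the default.
        return default
    return parts[1]
```

The returned token could then be used as the key into style2onehot, with the default set to whichever style you want for arbitrarily named files.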
Hi YoungSeng, thanks for your time and reply.
I am facing a shape error. Could you please state the shape and size that the input file needs to have?
Traceback (most recent call last):
  File "/content/drive/MyDrive/DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py", line 420, in <module>
    main(config, save_dir, config.model_path, audio_path=None, mfcc_path=None, audiowavlm_path=config.audiowavlm_path, max_len=config.max_len)
  File "/content/drive/MyDrive/DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py", line 384, in main
    inference(args, wavlm_model, mfcc, sample_fn, model, n_frames=max_len, smoothing=True, SG_filter=True, minibatch=True, skip_timesteps=0, style=style, seed=123456) # style2onehot['Happy']
  File "/content/drive/MyDrive/DiffuseStyleGesture/main/mydiffusion_zeggs/sample.py", line 233, in inference
    audio_reshape = torch.from_numpy(audio).to(torch.float32).reshape(num_subdivision, int(stride_poses * 16000 / 20)).to(mydevice).transpose(0, 1) # mfcc[:, :-2]
RuntimeError: shape '[4, 64000]' is invalid for input of size 237867
Looking forward :)
Model file: './model000450000.pt'
Thanks
Sai
It seems to be a problem with the shape of the audio. Did you set a max_len that is longer than the actual audio? You may try setting max_len to 0. If you still have this problem, please upload the audio file and I will check it.
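The numbers in the error message can be checked by hand. The reshape asks for num_subdivision rows of int(stride_poses * 16000 / 20) samples each, so the reported shape [4, 64000] needs exactly 4 * 64000 = 256000 samples, while the audio has 237867. The sketch below reproduces that arithmetic and reports how many zero samples would be needed to pad the audio to a whole number of chunks. The function names are hypothetical, stride_poses = 80 is inferred from the 64000 in the message, and reading 16000 as the sample rate and 20 as the frame rate is an assumption. Padding is only one possible workaround, not necessarily what sample.py does.

```python
def samples_per_chunk(stride_poses, sample_rate=16000, fps=20):
    """Audio samples covered by one chunk of `stride_poses` frames.

    Same expression as in the traceback: int(stride_poses * 16000 / 20).
    The parameter names sample_rate and fps are assumptions about what
    the constants 16000 and 20 mean.
    """
    return int(stride_poses * sample_rate / fps)


def padding_needed(n_samples, chunk_len):
    """Return (n_chunks, pad) so that n_samples + pad == n_chunks * chunk_len."""
    n_chunks = max(1, -(-n_samples // chunk_len))  # ceiling division, at least 1
    pad = n_chunks * chunk_len - n_samples
    return n_chunks, pad


chunk_len = samples_per_chunk(80)            # 64000, matching the error message
print(padding_needed(237867, chunk_len))     # (4, 18133)
```

A reshape to (rows, cols) only succeeds when the number of elements equals rows * cols, which is why any mismatch between max_len and the real audio length surfaces at this line.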
Hi YoungSeng, thanks for your time.
- I ran the code, and a BVH file was generated in "./sample_dir". Is there any way to convert it into a video such as mkv? I would like to convert the BVH directly into a rendered mp4 video of a person. Is that possible, and if so, what is the process? I will work on it.
Thanks
Sai
Hey Sai,
In practice, I highly recommend using Blender to visualize the bvh. Similar software includes Maya and MotionBuilder; I have tried them and found Blender friendlier. You can easily import audio, render video, or even write a script like Trimodal.
You can also get a video of the skeleton in Python; please refer to this issue.
There are also some repositories for visualization that you can try, such as PyMO, npybvh, and Python_BVH_viewer, although I don't really recommend them.
Good luck!
Hi YoungSeng,
I have tried a lot, but I still cannot work out how to convert this BVH file into a 3D video with audio. I need a little help. Is there any repo, model, or code that does what I need?
Thanks
Sai
I recommend the method I use:
- Download Blender (it is free!) and install it.
- Import the .bvh file; you can then play it:
- For rendering, set some parameters:
- Then render:
- To add audio:
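Before rendering and adding audio, it can help to confirm that the generated motion is as long as the audio clip, so the two line up in the final video. A BVH file states its frame count and frame time in the "Frames:" and "Frame Time:" lines of its MOTION section. The sketch below reads those two values using only the standard library. The function name bvh_motion_info is hypothetical, and the sample text in the usage lines is made up for illustration, not output from this repository.

```python
def bvh_motion_info(text):
    """Return (frame_count, frame_time_seconds) from the text of a BVH file.

    Hypothetical helper: scans for the standard "Frames:" and "Frame Time:"
    lines that open the MOTION section.
    """
    frames = None
    frame_time = None
    for line in text.splitlines():
        s = line.strip()
        if s.startswith("Frames:"):
            frames = int(s.split(":", 1)[1])
        elif s.startswith("Frame Time:"):
            frame_time = float(s.split(":", 1)[1])
            break  # motion data follows; no need to scan further
    if frames is None or frame_time is None:
        raise ValueError("no MOTION header found in BVH text")
    return frames, frame_time


# Illustrative input only; a real file would be read with open(path).read().
sample = "HIERARCHY\nROOT Hips\n{\n}\nMOTION\nFrames: 320\nFrame Time: 0.05\n"
frames, frame_time = bvh_motion_info(sample)
duration_seconds = frames * frame_time  # compare this with the audio length
```

The reciprocal of the frame time gives the frame rate to use in the render settings.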
I also encountered this problem. My audio is about 2 seconds long and I set max_len=0, but I still get this error:
Traceback (most recent call last):
  File "sample.py", line 442, in <module>
    main(config, save_dir, config.model_path, audio_path=None, mfcc_path=None, audiowavlm_path=config.audiowavlm_path, max_len=config.max_len)
  File "sample.py", line 406, in main
    inference(args, wavlm_model, mfcc, sample_fn, model, n_frames=max_len, smoothing=True, SG_filter=True, minibatch=True, skip_timesteps=0, style=style, seed=123456) # style2onehot['Happy']
  File "sample.py", line 237, in inference
    audio_reshape = torch.from_numpy(audio).to(torch.float32).reshape(num_subdivision, int(stride_poses * 16000 / 20)).to(mydevice).transpose(0, 1) # mfcc[:, :-2]
RuntimeError: shape '[4, 64000]' is invalid for input of size 36480
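For scale: 36480 samples at 16 kHz is about 2.3 seconds, whereas the shape [4, 64000] needs 256000 samples, or 16 seconds at 16 kHz, so this clip is much shorter than the reshape expects. A quick check of the file before running sample.py makes such a mismatch visible early. The sketch below uses the standard-library wave module, which reads uncompressed PCM WAV files only. The function fits_reshape is hypothetical, and its default rows and cols are simply the values from the error message above, not constants taken from the repository.

```python
import wave


def fits_reshape(path, rows=4, cols=64000):
    """Check whether a WAV file has exactly rows * cols frames.

    Returns (ok, n_frames, sample_rate). For mono audio, one frame is one
    sample; multi-channel files would need to be downmixed first.
    """
    with wave.open(path, "rb") as wf:
        n_frames = wf.getnframes()
        sample_rate = wf.getframerate()
    return n_frames == rows * cols, n_frames, sample_rate
```

Dividing n_frames by sample_rate gives the clip length in seconds, which can be compared with the 16 seconds implied by the error message.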