ChenFengYe/motion-latent-diffusion

How to verify whether the training of VAE is good?

LinghaoChan opened this issue · 5 comments

How to verify whether the training of VAE is good? Have you provided any code for the visualization of VAE training?

Hi LinghaoChan,

If you use the HumanML3D dataset, the normal range for the diffusion stage (text-to-motion task) is around [0.45, 1.0], and it should be [0.2, 0.4] for the VAE. For visualization, both the VAE and diffusion stages can use the same visualization scripts. You can refer to the "Details of training" section of the FAQ (GitHub README) and issues #5 and #9 for more details.
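As a quick sanity check, the rough ranges above can be encoded in a small helper. This is just an illustrative sketch (the function name and structure are not from the repo); the ranges are the ones quoted in this thread for HumanML3D:

```python
def metric_in_expected_range(value: float, stage: str) -> bool:
    """Return True if an evaluation value falls in the rough 'healthy'
    range quoted in this thread for HumanML3D. Illustrative only."""
    ranges = {
        "vae": (0.2, 0.4),         # first stage (VAE)
        "diffusion": (0.45, 1.0),  # second stage (text-to-motion)
    }
    lo, hi = ranges[stage]
    return lo <= value <= hi
```

A value well outside these ranges after training suggests something went wrong in that stage.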

P.S.

1. Set up blender - WIP

Refer to TEMOS-Rendering motions for Blender setup, then install the following dependencies:

YOUR_BLENDER_PYTHON_PATH/python -m pip install -r prepare/requirements_render.txt

2. (Optional) Render rigged cylinders

Run the following command using blender:

YOUR_BLENDER_PATH/blender --background --python render.py -- --cfg=./configs/render.yaml --dir=YOUR_NPY_FOLDER --mode=video --joint_type=HumanML3D

3. Create SMPL meshes with:

python -m fit --dir YOUR_NPY_FOLDER --save_folder TEMP_PLY_FOLDER --cuda

This outputs:

  • mesh npy file: the generated SMPL vertices with shape (nframe, 6890, 3)
  • ply files: the PLY mesh files for Blender or MeshLab
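For a quick inspection without Blender, the mesh npy above can be loaded with NumPy and a single frame dumped as a vertices-only ASCII PLY for MeshLab. This is a minimal sketch (the function and file names are placeholders, not from the repo's fit script):

```python
import numpy as np

def dump_frame_as_ply(verts: np.ndarray, frame: int, out_path: str) -> None:
    """Write one frame of a (nframe, num_vertices, 3) array as an
    ASCII PLY point cloud (vertices only, no faces). Illustrative sketch."""
    v = verts[frame]  # (num_vertices, 3)
    with open(out_path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(v)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("end_header\n")
        for x, y, z in v:
            f.write(f"{x} {y} {z}\n")
```

The PLY files produced by the fit script already include faces; this point-cloud dump is only a fast way to eyeball whether the vertex data looks reasonable.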

4. Render SMPL meshes

Run the following command to render SMPL using blender:

YOUR_BLENDER_PATH/blender --background --python render.py -- --cfg=./configs/render.yaml --dir=YOUR_NPY_FOLDER --mode=video --joint_type=HumanML3D

Optional parameters:

  • --mode=video: render an mp4 video
  • --mode=sequence: render the whole motion as a single png image

Hi, if your VAE results are not correct, please pay attention to issue #18. We have fixed a bug in the KL loss:

total += self._update_loss("kl_motion", rs_set['dist_m'], rs_set['dist_ref'])
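For context, the KL term above compares the motion posterior against a reference distribution. A minimal sketch of the closed-form KL for a diagonal Gaussian against a standard normal prior (illustrative only; the repo computes it between two learned distributions via `dist_m` and `dist_ref`, and this helper is not the repo's actual code):

```python
import numpy as np

def kl_to_standard_normal(mu: np.ndarray, logvar: np.ndarray) -> float:
    """KL( N(mu, sigma^2) || N(0, 1) ) per dimension, averaged.
    Closed form: 0.5 * (sigma^2 + mu^2 - 1 - log sigma^2)."""
    kl = 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar)
    return float(kl.mean())
```

The KL is zero exactly when the posterior matches the prior (mu = 0, logvar = 0) and grows as the latent distribution drifts away from it.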

fine, thx.

Hi. I noticed that LAMBDA_KL=0.0001, which is much smaller than the other LAMBDAs. Does it really matter when training the VAE? I trained the model with and without it, and both results seem good.

It is quite important for the second stage (the diffusion stage). The KL term regularizes the latent distribution, making the latent space meaningful. If you refer to other papers, the weight of the KL loss is usually set to a small value, such as 1e-3, 1e-4, or 1e-5.
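The way such a small weight enters the first-stage objective can be sketched as follows. The LAMBDA_KL value mirrors the config discussed above; the exact loss terms and names in the repo may differ:

```python
# Illustrative sketch of a weighted VAE objective, not the repo's actual code.
LAMBDA_REC = 1.0
LAMBDA_KL = 1e-4  # deliberately small: regularize the latent space
                  # without dominating the reconstruction term

def vae_loss(rec_loss: float, kl_loss: float) -> float:
    """Total first-stage loss. Even a tiny KL weight keeps the latent
    distribution close to the prior, which is what the diffusion stage
    later relies on."""
    return LAMBDA_REC * rec_loss + LAMBDA_KL * kl_loss
```

This is why reconstructions can look equally good with or without the KL term: its effect shows up mostly in how well-behaved the latent space is for the second stage, not in the first-stage reconstructions themselves.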