cleardusk/3DDFA

Training on higher expressions dimensions

zaverichintan5 opened this issue · 4 comments

I am trying to train the model with 199 shape dimensions and 29 expression dimensions.
I have the following questions:

  1. What normalisation is applied to the face-profiling data to produce param_all_norm.pkl? (I am subtracting the mean and dividing by the std of the params.)
  2. How do I fine-tune the WPDC loss with the VDC loss? Do I add the two losses with some weighting factor, i.e. a combined WPDC + VDC loss?

Is this the correct way of normalising the params?

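import os.path as osp
import numpy as np

# _load and d (the data directory) are defined elsewhere in my script
all_params = _load(osp.join(d, 'params_600k.mat'))

R = all_params['R']                        # rotation, 3 x 3 x N
t = all_params['t3d']                      # translation, 3 x N
t = np.expand_dims(t, axis=1)              # 3 x 1 x N
R_t = np.concatenate([R, t], axis=1)       # [R; t], 3 x 4 x N
R_t_flat = R_t.reshape(12, R_t.shape[2])   # 12 x N
shape_coeff = all_params['alpha']          # 199 shape dimensions
exp_coeff = all_params['alpha_exp']        # 29 expression dimensions

# Stack into 240 x N (12 + 199 + 29), one row per parameter dimension
params = np.concatenate((R_t_flat, shape_coeff, exp_coeff))
mean_params = params.mean(axis=1, keepdims=True)   # 240 x 1
std_params = params.std(axis=1, keepdims=True)     # 240 x 1
print(np.mean(params))
print(np.std(params))

# Normalise each dimension by its own mean and std
params_norm = (params - mean_params) / std_params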

As for normalization, each dimension of the parameter vector is normalized by its own statistics (its own mean and standard deviation), so that every dimension is rescaled to zero mean and unit variance. If params has shape Nx62, then mean and std each have shape 62. The 3DDFA_V2 paper describes this.
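The per-dimension normalization described above can be sketched as follows. This is a minimal illustration on synthetic data, not the repository's preprocessing script: the function names `normalize_params` and `denormalize_params`, the `eps` guard, and the random input are all assumptions made for the example.

```python
import numpy as np

def normalize_params(params, eps=1e-8):
    """Z-score each column of an (N, D) array with its own mean and std."""
    mean = params.mean(axis=0)          # shape (D,)
    std = params.std(axis=0)            # shape (D,)
    normed = (params - mean) / (std + eps)   # eps guards against a zero std
    return normed, mean, std

def denormalize_params(normed, mean, std, eps=1e-8):
    """Invert normalize_params, e.g. on network predictions."""
    return normed * (std + eps) + mean

# Synthetic stand-in for real parameters: N = 1000 samples, D = 62 dimensions
rng = np.random.default_rng(0)
params = rng.normal(loc=3.0, scale=5.0, size=(1000, 62))

normed, mean, std = normalize_params(params)
restored = denormalize_params(normed, mean, std)
```

If the parameters are stored as D x N instead (one row per dimension, as in the question), the same idea applies with `axis=1` and `keepdims=True`. The mean and std need to be saved alongside the normalized data so predictions can be mapped back to the original scale.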

As for the param concatenation: [R; t] has shape 3x4, where R is 3x3 and t is 3x1.
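A small sketch of that concatenation for a single sample, with placeholder values for R and t. The row-major flattening order used here is an assumption for illustration; whichever order is chosen has to be used consistently when the 12 values are reshaped back.

```python
import numpy as np

R = np.eye(3)                              # 3x3 rotation (identity as a placeholder)
t = np.array([[10.0], [20.0], [30.0]])     # 3x1 translation (placeholder values)

Rt = np.hstack([R, t])                     # [R; t], shape 3x4
pose12 = Rt.reshape(-1)                    # row-major flatten, length 12

# Decoding: reshape back to 3x4 and split the last column off again
Rt_back = pose12.reshape(3, 4)
R_back = Rt_back[:, :3]                    # 3x3
t_back = Rt_back[:, 3:]                    # 3x1
```

With this ordering the translation entries land at indices 3, 7 and 11 of the 12-vector. For a batch stored as 3 x 4 x N, `reshape(12, N)` in NumPy's default C order produces the same per-sample layout.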

As for joint training, you can refer to 3DDFA_V2: its meta-joint optimization section defines a vanilla joint loss named L_{vanilla-joint}.
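As a rough sketch of what "adding the losses with some factor" can look like, the snippet below combines two scalar loss values with a balancing weight. The function name `joint_loss`, the `weight` argument, and the magnitude-ratio default are illustrative assumptions, not code from 3DDFA or 3DDFA_V2; the exact definition of L_{vanilla-joint} should be taken from the 3DDFA_V2 paper.

```python
def joint_loss(l_wpdc, l_vdc, weight=None, eps=1e-12):
    """Combine two scalar losses as l_wpdc + weight * l_vdc.

    If weight is None, use the ratio of the loss magnitudes so that both
    terms contribute at a similar scale (an assumption for this sketch).
    In an autograd framework this ratio would be treated as a constant,
    not differentiated through.
    """
    if weight is None:
        weight = abs(l_wpdc) / (abs(l_vdc) + eps)
    return l_wpdc + weight * l_vdc

# Hypothetical loss values from one training step
fixed = joint_loss(2.0, 4.0, weight=0.5)   # fixed, hand-picked factor
balanced = joint_loss(2.0, 4.0)            # factor derived from the magnitudes
```

A fixed factor is simpler to reason about, while a magnitude-based factor avoids one term dominating when the two losses live on very different scales.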

Thanks for the quick reply.
Are the keypoints available for the augmented image dataset?
Or is there a link to the dataset used to train the model with higher expression dimensions?

Are you aware of this? The pose parameter in the 300W-LP dataset is 7-dimensional.
However, the implementation code for the 3DDFA paper regresses 12 pose dimensions directly.
I don't understand how the two are connected. Do you know? Many people have asked this question on GitHub, but I can't find an answer. Can you help me?