I have a question regarding the metrics output after training the model.
Hello! I became interested in this research and ran the demo as follows:
bash generative_modifier/script_generative_modifier.sh 'train' pretrain 1
As a result, the following metrics were output:
BertScore: hashcode distilbert-base-uncased_L5_no-idf_version=0.3.12(hug_trans=4.41.2)
[val] Epoch: [2999] Stats: avg_likelihood: -0.643 kld_a: 1.609 kld_b: -0.528 fid: 0.017 v2v_elbo: 2.243 v2v_dist_avg: 0.213 v2v_dist_top1: 0.125 v2v_dist_top6: 0.15 jts_elbo: 1.791 jts_dist_avg: 0.292 jts_dist_top1: 0.151 jts_dist_top6: 0.191 rot_elbo: 1.259 rot_dist_avg: 9.255 rot_dist_top1: 6.998 rot_dist_top6: 7.696 ret_r1_prec: 89.457 ret_r2_prec: 94.436 ret_r3_prec: 96.597 ret_multimodality: 0.407 ...
Could you please explain what each of these metrics means?
Thank you, and I would appreciate it if you could check this at your convenience!
Hello,
In what follows, I will refer to the paper. The 🚮 emoji marks metrics that were tried at some point during development but eventually abandoned; they should thus be taken with a pinch of salt. These metrics are computed here and here.
- `avg_likelihood`: log-likelihood of the generated texts, averaged over the dataset (computation here). 🚮
- `kld_a`: Kullback–Leibler divergence (KLD) between the fusion distributions of the pose editing model (parameterized by ($\mu_p, \Sigma_p$) in Figure 4), obtained when using the generated text vs. the original text annotation. 🚮
- `kld_b`: difference between two KLDs, each comparing the pose distribution (($\mu_b, \Sigma_b$) in Figure 4) with the fusion distribution (($\mu_p, \Sigma_p$) in Figure 4) of the pose editing model, computed once with the generated text and once with the original text annotation. Instead of comparing the effect of the text as in `kld_a`, `kld_b` was meant to measure the loss in alignment with the probabilistic view of pose B when using the generated text instead of the original one. 🚮
- `fid`: FID (as in Table 5) when using the generated text in the pose editing model. 🚮
- `(v2v|jts|rot)_elbo`: ELBO for vertices, joints and rotations (as in Table 5), when using the generated text in the pose editing model. 🚮
- `(v2v|jts|rot)_dist_avg`: reconstruction metrics from Table 6 (MPVE|MPJE|geodesic), averaged over 30 generated pose samples.
- `(v2v|jts|rot)_dist_top(1|6)`: reconstruction metrics from Table 6 (MPVE|MPJE|geodesic) using the "best" (closest to the GT pose) generated pose sample out of 30 (or, for 🚮 `top6`, using the average over the 6 best samples).
- `ret_r(1|2|3)_prec`: the R-precision metrics (Table 6, first 3 columns). 🚮
- `ret_multimodality`: (computation) originally proposed along with the R-precision metrics in TM2T [19], but based here on the cosine similarity instead of the Euclidean distance, since our evaluation retrieval model was trained with the former.
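For intuition on the `kld_*` metrics: both compare diagonal Gaussians, for which the KLD has a closed form. Here is a minimal numpy sketch (not the repo's code; function and argument names are mine), assuming distributions parameterized by a mean and a log-variance as is common for VAE-style models:

```python
import numpy as np

def kl_diag_gaussians(mu0, logvar0, mu1, logvar1):
    """Closed-form KL( N(mu0, diag(exp(logvar0))) || N(mu1, diag(exp(logvar1))) )."""
    var0, var1 = np.exp(logvar0), np.exp(logvar1)
    return 0.5 * np.sum(logvar1 - logvar0 + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)
```

`kld_a` would apply this between the fusion distributions obtained with the generated vs. original text; `kld_b` would take the difference of two such KLDs against the pose-B distribution.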
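The `dist_avg` / `dist_top(1|6)` family follows a simple best-of-K scheme: draw K pose samples (K=30 here), compute each sample's distance to the ground truth, then report the mean, the best, and the mean over the k best. A small sketch under assumed `(K, n_joints, 3)` joint positions and Euclidean distance (the actual code also handles vertices and rotation geodesics; names are mine):

```python
import numpy as np

def sample_distance_stats(samples, gt, topk=6):
    """samples: (K, J, 3) generated poses; gt: (J, 3) ground-truth pose.
    Returns (avg, top1, mean over the topk closest samples) of the
    mean per-joint Euclidean distance."""
    d = np.linalg.norm(samples - gt[None], axis=-1).mean(axis=-1)  # (K,) one distance per sample
    d_sorted = np.sort(d)
    return d.mean(), d_sorted[0], d_sorted[:topk].mean()
```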
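And for the retrieval metrics: R-precision at k checks, for each (text, pose) pair in a batch, whether the matching pose is ranked among the top k candidates by similarity to the text embedding. A minimal numpy sketch using cosine similarity, as mentioned above (this is an illustration, not the repo's implementation; names are mine):

```python
import numpy as np

def r_precision(text_emb, pose_emb, ks=(1, 2, 3)):
    """text_emb, pose_emb: (N, D); row i of each forms a matched pair.
    For each text, rank all N poses by cosine similarity and check
    whether the matching pose appears among the k most similar."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    p = pose_emb / np.linalg.norm(pose_emb, axis=1, keepdims=True)
    sim = t @ p.T  # (N, N) cosine similarities
    # rank of the matched pose for each text (0 = most similar)
    ranks = (sim >= np.diag(sim)[:, None]).sum(axis=1) - 1
    return {f"R@{k}": float((ranks < k).mean()) for k in ks}
```

`ret_multimodality` then measures, roughly, the average pairwise dissimilarity (here based on cosine similarity rather than Euclidean distance) between embeddings of several samples generated for the same input; see the linked computation for the exact definition.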