Question about mean model points

Q1. Your mean shapes differ from the SPD mean shapes. How did you obtain them?

Q2. May I ask why you added a shuffle when loading the mean model points?

CATRE/core/catre/datasets/data_loader.py

Line 376 in a0b42b5

if shuffle:

Hi,
Q1: The mean shape is derived from SPD. To adapt it to CATRE, we rescaled its range to [-1, 1].

Q2: There is no special reason. I've done related experiments and found that the shuffle operation won't affect the results.
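For readers wondering what such a rescaling can look like, here is a minimal sketch. It assumes the point cloud is first centered on its bounding box and then divided by a single (uniform) factor; the thread does not say whether CATRE recenters or scales per axis, and `rescale_to_unit_range` is a hypothetical name, not a function from the repository.

```python
import numpy as np


def rescale_to_unit_range(points):
    """Uniformly rescale an (N, 3) point cloud so all coordinates lie in [-1, 1].

    Assumption: center on the bounding box, then divide by the largest
    absolute coordinate. This is an illustration, not CATRE's actual code.
    """
    points = np.asarray(points, dtype=np.float32)
    # Bounding-box center, per axis.
    center = (points.max(axis=0) + points.min(axis=0)) / 2.0
    centered = points - center
    max_abs = np.abs(centered).max()
    if max_abs == 0:
        # Degenerate cloud (all points identical): nothing to scale.
        return centered
    return centered / max_abs


if __name__ == "__main__":
    cloud = np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]])
    print(rescale_to_unit_range(cloud))
```

A uniform factor keeps the aspect ratio of the shape; a per-axis factor would stretch every axis to the full [-1, 1] range instead.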

Thanks for your quick replies.

Regarding Q1, do you have any reason for rescaling the range to [-1, 1]? Does this improve the performance?

Because we also refine the estimated scale, the code multiplies the mean shape by s_est.

Thanks. Is this the part you mean?

CATRE/core/catre/engine/batch_test.py

Lines 85 to 90 in a0b42b5

    
    tfd_kps = transform_normed_pts_batch(
        batch["obj_kps"],
        r_est,
        t=None if cfg.INPUT.ZERO_CENTER_INPUT else t_est,
        scale=s_est,
    )

Yes.
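The body of `transform_normed_pts_batch` is not shown in this thread, so the following is only a sketch of what the call plausibly does: scale the normalized keypoints by the estimated size, rotate them, and optionally translate them. The name `transform_normed_pts_sketch`, the NumPy implementation, the per-axis `(B, 3)` shape of the scale, and the scale-rotate-translate order are all assumptions made for illustration.

```python
import numpy as np


def transform_normed_pts_sketch(pts, rot, t=None, scale=None):
    """Apply scale, rotation and optional translation to batched points.

    pts:   (B, N, 3) normalized points (e.g. a mean shape in [-1, 1])
    rot:   (B, 3, 3) rotation matrices
    t:     (B, 3) translations, or None to keep the points zero-centered
    scale: (B, 3) per-axis sizes, or None (assumed shape, see lead-in)
    """
    pts = np.asarray(pts, dtype=np.float32)
    if scale is not None:
        # Broadcast each sample's scale over its N points.
        pts = pts * np.asarray(scale, dtype=np.float32)[:, None, :]
    # Batched rotation: out[b, n] = rot[b] @ pts[b, n]
    out = np.einsum("bij,bnj->bni", np.asarray(rot, dtype=np.float32), pts)
    if t is not None:
        out = out + np.asarray(t, dtype=np.float32)[:, None, :]
    return out
```

Passing `t=None` mirrors the `ZERO_CENTER_INPUT` branch in the quoted call, where the keypoints are left centered at the origin.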

Thanks for your replies :)

@shanice-l Can you explain how you obtained mean_scale?

CATRE/ref/nocs.py

Lines 105 to 112 in a0b42b5

    
    mean_scale = {
        "bottle": 0.001 * np.array([87, 220, 89], dtype=np.float32),
        "bowl": 0.001 * np.array([165, 80, 165], dtype=np.float32),
        "camera": 0.001 * np.array([88, 128, 156], dtype=np.float32),
        "can": 0.001 * np.array([68, 146, 72], dtype=np.float32),
        "laptop": 0.001 * np.array([346, 200, 335], dtype=np.float32),
        "mug": 0.001 * np.array([146, 83, 114], dtype=np.float32),
    }

The values differ slightly from what I get when I compute the mean scale from the SPD mean shapes with:
np.abs(self.mean_model[class].max(0)) / 2

mean_scale = {
    "bottle": array([0.07624906, 0.21785153, 0.07224454]),
    "bowl": array([0.16847943, 0.06134908, 0.16105685]),
    "camera": array([0.09654312, 0.11487991, 0.16583854]),
    "can": array([0.11170839, 0.18665881, 0.11039044]),
    "laptop": array([0.1408229, 0.10234819, 0.16517451]),
    "mug": array([0.18962663, 0.11490806, 0.10203232]),
}
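When comparing numbers like these, it helps to be explicit about which quantity is being measured. The sketch below contrasts the full per-axis bounding-box extent (`max - min`) with the expression quoted above (`abs(max) / 2`). Both helper names are hypothetical, and whether the hard-coded table stores full extents is an assumption that this thread does not confirm.

```python
import numpy as np


def axis_extents(points):
    """Full bounding-box size per axis: max - min over an (N, 3) cloud."""
    points = np.asarray(points, dtype=np.float32)
    return points.max(axis=0) - points.min(axis=0)


def half_of_max(points):
    """The expression quoted in the thread: abs(per-axis max) / 2."""
    points = np.asarray(points, dtype=np.float32)
    return np.abs(points.max(axis=0)) / 2


if __name__ == "__main__":
    # Toy cloud, symmetric about the origin.
    cloud = np.array([[-1.0, -2.0, -3.0], [1.0, 2.0, 3.0]])
    print("extents:    ", axis_extents(cloud))
    print("half of max:", half_of_max(cloud))
```

For a cloud that is symmetric about the origin, the per-axis maximum is already half the extent, so `abs(max) / 2` yields a quarter of it. The two quantities only line up after accounting for that factor and for any normalization applied to the mean shape.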

The code is borrowed from GPV-Pose.
