zc-alexfan/arctic

Distortion for egocentric viewpoints

mks0601 opened this issue · 3 comments

Hi Alex, how are you doing?
I'd like to ask you about the distortion for the egocentric viewpoints.
I'm trying to visualize the projected vertices of the SMPL-X mesh in the egocentric viewpoints.
However, some vertices seem to be projected to wrong positions, even though I filtered out points with negative depth values.
Could you check the code below? After setting arctic_root_path and smplx_root_path, you can simply run it with python test.py.
When I run it, I get the visualized result below; some vertices are projected to weird positions.
I used the same distortion function as yours (distort_pts3d_all in common/transforms.py).

import os.path as osp
from glob import glob
import numpy as np
import cv2
import torch
import json
import smplx
from pytorch3d.io import load_obj

# This function is from https://github.com/zc-alexfan/arctic/blob/9f5770966350c66d8bf0ac3fd4cfde74434a109b/common/transforms.py#L82
def distort_pts3d_all(_pts_cam, dist_coeffs):
    # egocentric cameras commonly have heavy distortion;
    # this function transforms points from the undistorted camera coordinates
    # to distorted camera coordinates so that the 2D projection matches the pixels
    pts_cam = _pts_cam.clone().double()
    z = pts_cam[:, :, 2]
    is_valid = z > 1e-4
    z_inv = 1 / z

    x1 = pts_cam[:, :, 0] * z_inv
    y1 = pts_cam[:, :, 1] * z_inv

    # precalculations
    x1_2 = x1 * x1
    y1_2 = y1 * y1
    x1_y1 = x1 * y1
    r2 = x1_2 + y1_2
    r4 = r2 * r2
    r6 = r4 * r2

    r_dist = (1 + dist_coeffs[0] * r2 + dist_coeffs[1] * r4 + dist_coeffs[4] * r6) / (
        1 + dist_coeffs[5] * r2 + dist_coeffs[6] * r4 + dist_coeffs[7] * r6
    )

    # full (rational + tangential) distortion
    x2 = x1 * r_dist + 2 * dist_coeffs[2] * x1_y1 + dist_coeffs[3] * (r2 + 2 * x1_2)
    y2 = y1 * r_dist + 2 * dist_coeffs[3] * x1_y1 + dist_coeffs[2] * (r2 + 2 * y1_2)
    # denormalize for projection (which is a linear operation)
    cam_pts_dist = torch.stack([x2 * z, y2 * z, z], dim=2).float()
    return cam_pts_dist, is_valid

# path
arctic_root_path = '/data/ARCTIC/arctic/unpack/arctic_data/data' # there are 'images', 'meta', 'raw_seqs', and 'splits_json' folders in this directory
smplx_root_path = '/home/mks0601/workspace/human_model_files' # there is a 'smplx' folder in this directory
subject_name = 's01'
seq_name = 'box_grab_01'
cam_name = '0'
frame_idx = 70

# load files
with open(osp.join(arctic_root_path, 'meta', 'misc.json')) as f:
    db_info = json.load(f)
ego_cam_param = np.load(osp.join(arctic_root_path, 'raw_seqs', subject_name, seq_name + '.egocam.dist.npy'), allow_pickle=True)[()]
smplx_params = np.load(osp.join(arctic_root_path, 'raw_seqs', subject_name, seq_name + '.smplx.npy'), allow_pickle=True)[()]
img = cv2.imread(osp.join(arctic_root_path, 'images', subject_name, seq_name, cam_name, '%05d.jpg' % frame_idx))
v_template, _, _ = load_obj(osp.join(arctic_root_path, 'meta', 'subject_vtemplates', subject_name + '.obj'))
gender = db_info[subject_name]['gender']
smplx_layer = smplx.create(smplx_root_path, 'smplx', gender=gender, use_pca=False, flat_hand_mean=True, v_template=v_template)

# camera parameter
offset = db_info[subject_name]['ioi_offset']
frame_idx_offset = frame_idx - offset
cam_param = {'R': ego_cam_param['R_k_cam_np'][frame_idx_offset], \
            't': ego_cam_param['T_k_cam_np'][frame_idx_offset], \
            'focal': np.array([ego_cam_param['intrinsics'][0][0], ego_cam_param['intrinsics'][1][1]], dtype=np.float32), \
            'princpt': np.array([ego_cam_param['intrinsics'][0][2], ego_cam_param['intrinsics'][1][2]], dtype=np.float32), \
            'distortion': ego_cam_param['dist8']}
    
# get smplx vertices
smplx_param = {k: torch.FloatTensor(v[frame_idx_offset]).view(1,-1) for k,v in smplx_params.items()} # 'transl', 'global_orient', 'body_pose', 'jaw_pose', 'leye_pose', 'reye_pose', 'left_hand_pose', 'right_hand_pose'
output = smplx_layer(global_orient=smplx_param['global_orient'], body_pose=smplx_param['body_pose'], jaw_pose=smplx_param['jaw_pose'], leye_pose=smplx_param['leye_pose'], reye_pose=smplx_param['reye_pose'], left_hand_pose=smplx_param['left_hand_pose'], right_hand_pose=smplx_param['right_hand_pose'], transl=smplx_param['transl'])
xyz = output.vertices[0].detach().numpy() # world coordinate
xyz = np.dot(cam_param['R'], xyz.transpose(1,0)).transpose(1,0) + cam_param['t'].reshape(1,3) # camera coordinate
xyz, is_valid = distort_pts3d_all(torch.FloatTensor(xyz[None]), cam_param['distortion']) # distorted camera coordinate
xyz, is_valid = xyz[0].numpy(), is_valid[0].numpy()
x = xyz[:,0] / xyz[:,2] * cam_param['focal'][0] + cam_param['princpt'][0] # image coordinate
y = xyz[:,1] / xyz[:,2] * cam_param['focal'][1] + cam_param['princpt'][1] # image coordinate

# visualize
for i in range(len(x)):
    if is_valid[i]:
        img = cv2.circle(img, (int(x[i]), int(y[i])), 3, (255,0,0), -1)
cv2.imwrite(subject_name + '_' + seq_name + '_' + cam_name + '_' + str(frame_idx) + '.jpg', img)

[attached image: s01_box_grab_01_0_70.jpg — projected SMPL-X vertices overlaid on the egocentric frame, with some vertices landing at wrong positions]

FYI, if I project the camera-coordinate points to image space without the distortion function, there is no such problem.
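For reference, the working path is the same pipeline with the distortion step skipped. A minimal sketch reusing output, cam_param, and img from the script above (u/v are just renamed pixel coordinates):

# Plain pinhole projection, skipping distort_pts3d_all entirely.
xyz_cam = np.dot(cam_param['R'], output.vertices[0].detach().numpy().transpose(1, 0)).transpose(1, 0) + cam_param['t'].reshape(1, 3)
valid = xyz_cam[:, 2] > 1e-4  # keep only points in front of the camera
u = xyz_cam[:, 0] / xyz_cam[:, 2] * cam_param['focal'][0] + cam_param['princpt'][0]
v = xyz_cam[:, 1] / xyz_cam[:, 2] * cam_param['focal'][1] + cam_param['princpt'][1]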

Hi Gyeongsik,

The main reason is that hands and objects in the egocentric view are usually very close to the camera, so there is more distortion in pixel space. To address this, we use "vertex displacement" to "correct" the points in 3D using the distortion parameters so that they have a better 2D overlay. However, this approach assumes the points being distorted in 3D are not very close to the camera, which is not the case for SMPL-X (e.g., the head of SMPL-X).
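To make the failure mode concrete: the radial terms in distort_pts3d_all are polynomials in the normalized radius r² = (x/z)² + (y/z)², which blows up as a vertex approaches the camera. A minimal sketch with illustrative numbers (the 0.2 m off-axis offset is an assumption, not a dataset value):

# How r2 explodes near the camera for a fixed 0.2 m off-axis offset
# (illustrative numbers only, not taken from ARCTIC).
for z in [1.0, 0.5, 0.1, 0.05]:
    r2 = (0.2 / z) ** 2  # r2 = (x/z)^2 + (y/z)^2 with y = 0
    print(f"z = {z:.2f} m  ->  r2 = {r2:.2f}")
# z = 1.00 m gives r2 = 0.04, well inside the calibrated range;
# z = 0.05 m (roughly the SMPL-X head next to an egocentric camera)
# gives r2 = 16.00, so the rational polynomial extrapolates far outside
# the range it was fitted on and the "corrected" vertices project badly.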

You can ignore the distort_pts3d_all function, but then the projection won't take distortion into account. That said, vertex displacement is probably not the most suitable approach for the SMPL-X use case, so I am also curious whether you know of any other solutions (maybe check out AssemblyHands, since they have more heavily distorted images, or feed the distortion parameters into the renderer to render properly).
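One concrete option along those lines (a sketch, not something from the repo): the coefficient indexing in distort_pts3d_all is consistent with OpenCV's 8-coefficient rational model (k1, k2, p1, p2, k3, k4, k5, k6), so, assuming dist8 uses that ordering, you could undistort the image itself and then use the plain pinhole projection, never evaluating the polynomial at near-camera vertices:

# Possible workaround (a sketch, not the authors' method): undistort the
# image once, then project with the plain pinhole model.
# Assumes `dist8` follows OpenCV's rational-model ordering
# (k1, k2, p1, p2, k3, k4, k5, k6).
K = np.array([[cam_param['focal'][0], 0., cam_param['princpt'][0]],
              [0., cam_param['focal'][1], cam_param['princpt'][1]],
              [0., 0., 1.]], dtype=np.float64)
dist = np.asarray(cam_param['distortion'], dtype=np.float64).reshape(-1)  # 8 coefficients
img_undist = cv2.undistort(img, K, dist)  # OpenCV accepts 4/5/8/12/14 coefficients
# Then overlay the pinhole-projected vertices on img_undist instead of img.

The trade-off is that undistortion resamples the image (and crops or stretches the borders depending on the new camera matrix), but the projection math then stays linear at all depths.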

Awesome, thanks for checking!