sicxu/Deep3dPortrait

3D face not in the center

fungtion opened this issue · 12 comments

Thank you for your great work. I opened the .obj file produced in the 4th step using MeshLab, but I found that the face is not in the center circle. Is there something wrong?


And how do I project the 3D model onto the 2D plane? Maybe a silly question, though.

sicxu commented

> Thank you for your great work. I opened the .obj file produced in the 4th step using MeshLab, but I found that the face is not in the center circle. Is there something wrong?

You may double-click the shape to center it. Or you can remove the meaningless padded points and triangles before visualization. If you just want the view centered on the real geometry, a small recentering helper like the sketch below also works.
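A minimal sketch of such a helper (the function name is mine, and it assumes the padded points have already been dropped; it only shifts vertices for viewing):

```python
import numpy as np

def recenter_vertices(vertices):
    """Shift an [n, 3] vertex array so its centroid sits at the origin (viewing only)."""
    return vertices - vertices.mean(axis=0, keepdims=True)
```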

sicxu commented

> And how do I project the 3D model onto the 2D plane? Maybe a silly question, though.

You may take the following code as a reference. It is a perspective projection function following the OpenCV camera convention.

```python
import numpy as np

def projection_layer(face_shape, focal=1015.0, penter=[112.0, 112.0]):
    # we choose the focal length and camera position empirically
    camera_pos = np.reshape(np.array([0.0, 0.0, 10.0]), [1, 1, 3])  # camera position
    reverse_z = np.reshape(np.array([1.0, 0, 0, 0, 1, 0, 0, 0, -1.0]), [1, 3, 3])
    # projection matrix built from the focal length and principal point
    p_matrix = np.concatenate([[focal], [0.0], [penter[0]],
                               [0.0], [focal], [penter[1]],
                               [0.0], [0.0], [1.0]], axis=0)
    p_matrix = np.reshape(p_matrix, [1, 3, 3])
    # calculate face position in camera space
    face_shape = np.matmul(face_shape, reverse_z) + camera_pos
    # calculate projection of face vertices using perspective projection
    aug_projection = np.matmul(face_shape, np.transpose(p_matrix, [0, 2, 1]))
    face_projection = aug_projection[:, :, 0:2] / np.reshape(
        aug_projection[:, :, 2], [1, np.shape(aug_projection)[1], 1])
    return face_projection
```
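A quick usage sketch (the input here is random; in practice face_shape is the [1, n, 3] vertex array of the reconstructed mesh):

```python
face_shape = np.random.randn(1, 5, 3).astype(np.float32)  # 5 dummy vertices
uv = projection_layer(face_shape)
print(uv.shape)  # (1, 5, 2): per-vertex 2D image coordinates
```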

I want to use my own photos to generate a 3D portrait model. I learned that, in addition to the photos, several other input files are needed.
I have used the method in https://github.com/sicxu/Deep3dPortrait/issues/9 to generate the landmark.txt file, but I don't know how to generate the detection.txt files and the .mat file.
I am a novice, so I hope you can give me steps that are as detailed as possible. Thank you very much for your help.

> You may take the following code as a reference. It is a perspective projection function following the OpenCV camera convention. […]

@sicxu Thank you. I tried this code and can project the 3D face onto the 2D plane successfully, but a vacant region appears on the left side of the face.

I did the projection as follows:

  1. extract face_texture, face_xyz, hair_texture, hair_xyz, border_texture, border_xyz from the obj file
  2. project each *_xyz array onto the 2D plane using the code above
  3. fill the 2D plane with the corresponding texture values from 3D space

I think something may be wrong with the texture.
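Roughly, step 3 above does something like this (a simplified sketch, not the exact code; every pixel that no vertex lands on stays empty):

```python
import numpy as np

def splat(uv, colors, img_size=256):
    # uv: [n, 2] projected vertex coordinates, colors: [n, 3] per-vertex RGB
    img = np.zeros((img_size, img_size, 3), dtype=np.float32)
    px = np.clip(np.round(uv).astype(int), 0, img_size - 1)
    img[px[:, 1], px[:, 0]] = colors  # unhit pixels stay black, leaving holes
    return img
```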

sicxu commented

This code is just for per-vertex projection. If you want to render the 3D model to obtain a 2D image, please refer to a rendering pipeline. In this repo, we use tf_mesh_renderer to render the 3D model. You can simply pass the per-vertex RGB values as the depth argument of this function for your purpose.

```python
import numpy as np
import tensorflow as tf
# mesh_renderer is the function imported from the tf_mesh_renderer used in this repo

def create_renderer_graph(v_num=35709, t_num=70789, img_size=256):
    with tf.Graph().as_default() as graph:
        focal = tf.placeholder(dtype=tf.float32, shape=[1])
        center = tf.placeholder(dtype=tf.float32, shape=[1, 1, 2])
        depth = tf.placeholder(dtype=tf.float32, shape=[1, v_num, 3])
        vertex = tf.placeholder(dtype=tf.float32, shape=[1, v_num, 3])
        tri = tf.placeholder(dtype=tf.int32, shape=[1, t_num, 3])
        # field of view derived from the focal length and image size
        fov_y = 2 * tf.atan2(img_size // 2 * tf.ones_like(focal), focal) / np.pi * 180
        # offset of the principal point from the image center, normalized
        delta_center = tf.concat([(center - img_size // 2) / (img_size // 2),
                                  tf.zeros([center.shape[0], 1, 1])], axis=-1)
        camera_position = tf.constant([0, 0, 10.0])
        camera_lookat = tf.constant([0, 0, 0.0])
        camera_up = tf.constant([0, 1.0, 0])
        light_positions = tf.reshape(tf.constant([0, 0, 1e5]), [1, 1, 3])
        light_intensities = tf.zeros([1, 1, 3])
        depthmap = mesh_renderer(
            vertex, tri, tf.zeros_like(vertex), depth,
            camera_position=camera_position, camera_lookat=camera_lookat,
            camera_up=camera_up, light_positions=light_positions,
            light_intensities=light_intensities,
            image_width=img_size, image_height=img_size,
            fov_y=fov_y, far_clip=30.0, ambient_color=tf.ones([1, 3]),
            delta_center=delta_center)
    return graph, focal, center, depth, vertex, tri, depthmap
```
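A rough usage sketch of this graph (the feed values and array names are illustrative; vertices, triangles, and per_vertex_rgb would come from your own data, with triangle indices 0-based):

```python
import numpy as np
import tensorflow as tf

graph, focal_t, center_t, depth_t, vertex_t, tri_t, depthmap_t = create_renderer_graph()
with tf.Session(graph=graph) as sess:
    out = sess.run(depthmap_t, feed_dict={
        focal_t: np.array([1015.0]),             # empirical focal length
        center_t: np.array([[[112.0, 112.0]]]),  # principal point
        depth_t: per_vertex_rgb,                 # [1, 35709, 3] RGB passed as "depth"
        vertex_t: vertices,                      # [1, 35709, 3]
        tri_t: triangles,                        # [1, 70789, 3], 0-based indices
    })
rendered = out[0, :, :, :3]                      # drop the alpha channel
```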

sicxu commented

You may also use other modern renderers to achieve this, such as pyrender or pytorch3d.
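For example, a minimal pyrender sketch (the filename and intrinsics are illustrative; it assumes the mesh sits near the origin, as in this repo's coordinate convention, and that the obj carries per-vertex colors):

```python
import numpy as np
import trimesh
import pyrender

# load the obj; process=False keeps the vertex order and per-vertex colors intact
tm = trimesh.load('face_hairear.obj', process=False)   # illustrative filename
mesh = pyrender.Mesh.from_trimesh(tm)

scene = pyrender.Scene(ambient_light=np.ones(3))
scene.add(mesh)

# intrinsics matching the empirical values above (focal 1015, center 112)
camera = pyrender.IntrinsicsCamera(fx=1015.0, fy=1015.0, cx=112.0, cy=112.0)
pose = np.eye(4)
pose[2, 3] = 10.0          # camera at z = +10, looking down -z toward the mesh
scene.add(camera, pose=pose)

r = pyrender.OffscreenRenderer(viewport_width=224, viewport_height=224)
color, depth = r.render(scene, flags=pyrender.RenderFlags.FLAT)  # FLAT: use vertex colors
```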

Okay, thank you.

> This code is just for per-vertex projection. If you want to render the 3D model to obtain a 2D image, please refer to a rendering pipeline. In this repo, we use tf_mesh_renderer to render the 3D model. […]

@sicxu The face_tri indices saved in step 3 start from 1; they should start from 0 to render correctly with tf_mesh_renderer.

sicxu commented

Yes, you are correct. The results saved in step 3 are meant to be written out as .obj files for visualization, not to be used directly for rendering with tf_mesh_renderer.
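So when feeding those saved triangles to the renderer, a conversion like this is needed (face_tri here stands for the loaded 1-based index array):

```python
import numpy as np

tri = np.array(face_tri, dtype=np.int32) - 1  # .obj indices are 1-based; tf_mesh_renderer expects 0-based
```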

Hi @fungtion, can you share complete code for rendering the 3D object to a 2D image?
Thanks,

@make-j64 Sure, here is the code I used:

```python
import numpy as np
import tensorflow as tf
# mesh_renderer is the same tf_mesh_renderer function used in this repo
# obj_file: path to the obj saved earlier; rotation: an optional [1, 3, 3]
# rotation matrix (use np.eye(3)[None] for no rotation)

# parse the obj: "v x y z r g b" lines carry geometry + color, "f i j k" lines carry triangles
all_xyz, all_texture, all_tri = [], [], []
with open(obj_file, 'r') as f:
    for line in f.readlines():
        fields = line.strip('v f\n').split(' ')
        if len(fields) == 6:    # vertex line: position + color
            all_xyz.append([float(fields[0]), float(fields[1]), float(fields[2])])
            # colors are stored reversed, so flip them back
            all_texture.append([float(fields[5]), float(fields[4]), float(fields[3])])
        elif len(fields) == 3:  # face line: 1-based vertex indices
            all_tri.append([int(fields[0]), int(fields[1]), int(fields[2])])

# rendering
img_size = 256
with tf.Graph().as_default():
    all_texture = tf.constant(np.expand_dims(np.array(all_texture), 0), dtype=tf.float32)  # [1, n, 3]
    all_tri = np.array(all_tri) - 1                    # obj indices are 1-based
    all_xyz = np.expand_dims(np.array(all_xyz), 0)     # [1, n, 3]
    all_xyz = np.einsum('aij,ajk->aik', all_xyz, rotation)
    all_xyz = tf.constant(all_xyz, dtype=tf.float32)
    all_tri = tf.constant(np.expand_dims(all_tri, 0), dtype=tf.int32)  # [1, m, 3]
    # normals = tf.cast(tf.nn.l2_normalize(all_xyz, axis=1), tf.float32)
    normals = tf.zeros_like(all_xyz)

    v_num = all_xyz.get_shape()[1]
    t_num = all_tri.get_shape()[1]
    focal = tf.constant(1015.0, dtype=tf.float32)
    center = tf.constant(np.array([[[123.0, 105.0]]]), dtype=tf.float32)
    depth = all_texture        # per-vertex RGB passed through the depth argument
    vertex = all_xyz
    tri = all_tri
    fov_y = 2 * tf.atan2(img_size // 2 * tf.ones_like(focal), focal) / np.pi * 180
    delta_center = tf.concat([(center - img_size // 2) / (img_size // 2),
                              tf.zeros([center.shape[0], 1, 1])], axis=-1)
    camera_position = tf.constant([0, 0, 10.0])
    camera_lookat = tf.constant([0.0, 0.0, 0.0])
    camera_up = tf.constant([0, 1.0, 0])
    light_positions = tf.reshape(tf.constant([0, 0, 1e5]), [1, 1, 3])
    light_intensities = tf.zeros([1, 1, 3])
    depthmap = mesh_renderer(vertex, tri, normals, depth,
        camera_position=camera_position, camera_lookat=camera_lookat, camera_up=camera_up,
        light_positions=light_positions, light_intensities=light_intensities,
        image_width=img_size, image_height=img_size,
        fov_y=fov_y, far_clip=40.0, ambient_color=tf.ones([1, 3]), delta_center=delta_center)
    result = depthmap[0][:, :, :3].eval(session=tf.Session())
```
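To inspect the result, one could then scale and save it (a sketch; this assumes the obj colors were in [0, 1], and the channel order may need flipping depending on how the colors were stored):

```python
import cv2
import numpy as np

out = np.clip(result * 255.0, 0, 255).astype(np.uint8)
cv2.imwrite('render.png', out[:, :, ::-1])  # flip RGB -> BGR for OpenCV if needed
```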