sicxu/Deep3dPortrait

3D face not in the center

fungtion opened this issue · 12 comments

Thank you for your great work. I opened the .obj file produced in the 4th step using MeshLab, but I found that the face is not in the center circle. Is there something wrong?


And how do I project the 3D model onto the 2D plane? Maybe a silly question, though.

sicxu commented

> Thank you for your great work. I opened the .obj file produced in the 4th step using MeshLab, but I found that the face is not in the center circle. Is there something wrong?

You may double-click the shape to center it. Or you can remove the meaningless padded points and triangles before visualization. If you just want the view centered on the real geometry, a small recentering helper like the sketch below also works.
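A minimal sketch of such a helper (the function name is mine, and it assumes the padded points have already been dropped; it only shifts vertices for viewing):

```python
import numpy as np

def recenter_vertices(vertices):
    """Shift an [n, 3] vertex array so its centroid sits at the origin (viewing only)."""
    return vertices - vertices.mean(axis=0, keepdims=True)
```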

sicxu commented

> And how do I project the 3D model onto the 2D plane? Maybe a silly question, though.

You may take the following code as a reference. It is a perspective projection function following the OpenCV camera convention.

```python
import numpy as np

def projection_layer(face_shape, focal=1015.0, penter=[112.0, 112.0]):
    # we choose the focal length and camera position empirically
    camera_pos = np.reshape(np.array([0.0, 0.0, 10.0]), [1, 1, 3])  # camera position
    reverse_z = np.reshape(np.array([1.0, 0, 0, 0, 1, 0, 0, 0, -1.0]), [1, 3, 3])
    # projection matrix built from the focal length and principal point
    p_matrix = np.concatenate([[focal], [0.0], [penter[0]],
                               [0.0], [focal], [penter[1]],
                               [0.0], [0.0], [1.0]], axis=0)
    p_matrix = np.reshape(p_matrix, [1, 3, 3])
    # calculate face position in camera space
    face_shape = np.matmul(face_shape, reverse_z) + camera_pos
    # calculate projection of face vertices using perspective projection
    aug_projection = np.matmul(face_shape, np.transpose(p_matrix, [0, 2, 1]))
    face_projection = aug_projection[:, :, 0:2] / np.reshape(
        aug_projection[:, :, 2], [1, np.shape(aug_projection)[1], 1])
    return face_projection
```
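A quick usage sketch (the input here is random; in practice face_shape is the [1, n, 3] vertex array of the reconstructed mesh):

```python
face_shape = np.random.randn(1, 5, 3).astype(np.float32)  # 5 dummy vertices
uv = projection_layer(face_shape)
print(uv.shape)  # (1, 5, 2): per-vertex 2D image coordinates
```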

I want to use my own photos to generate a 3D portrait model. I learned that, in addition to the photos, several other input files are needed.
I have used the method in https://github.com/sicxu/Deep3dPortrait/issues/9 to generate the landmark.txt file, but I don't know how to generate the detection.txt files and the .mat file.
I am a novice, so I hope you can give me steps that are as detailed as possible. Thank you very much for your help.

> You may take the following code as a reference. It is a perspective projection function following the OpenCV camera convention. […]

@sicxu Thank you. I tried this code and can project the 3D face onto the 2D plane successfully, but a vacant region appears on the left side of the face.

I did the projection as follows:

  1. extract face_texture, face_xyz, hair_texture, hair_xyz, border_texture, border_xyz from the obj file
  2. project each *_xyz array onto the 2D plane using the code above
  3. fill the 2D plane with the corresponding texture values from 3D space

I think something may be wrong with the texture.
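Roughly, step 3 above does something like this (a simplified sketch, not the exact code; every pixel that no vertex lands on stays empty):

```python
import numpy as np

def splat(uv, colors, img_size=256):
    # uv: [n, 2] projected vertex coordinates, colors: [n, 3] per-vertex RGB
    img = np.zeros((img_size, img_size, 3), dtype=np.float32)
    px = np.clip(np.round(uv).astype(int), 0, img_size - 1)
    img[px[:, 1], px[:, 0]] = colors  # unhit pixels stay black, leaving holes
    return img
```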

sicxu commented

This code is just for per-vertex projection. If you want to render the 3D model to obtain a 2D image, please refer to a rendering pipeline. In this repo, we use tf_mesh_renderer to render the 3D model. You can simply pass the per-vertex RGB values as the depth argument of this function for your purpose.

```python
import numpy as np
import tensorflow as tf
# mesh_renderer is the function imported from the tf_mesh_renderer used in this repo

def create_renderer_graph(v_num=35709, t_num=70789, img_size=256):
    with tf.Graph().as_default() as graph:
        focal = tf.placeholder(dtype=tf.float32, shape=[1])
        center = tf.placeholder(dtype=tf.float32, shape=[1, 1, 2])
        depth = tf.placeholder(dtype=tf.float32, shape=[1, v_num, 3])
        vertex = tf.placeholder(dtype=tf.float32, shape=[1, v_num, 3])
        tri = tf.placeholder(dtype=tf.int32, shape=[1, t_num, 3])
        # field of view derived from the focal length and image size
        fov_y = 2 * tf.atan2(img_size // 2 * tf.ones_like(focal), focal) / np.pi * 180
        # offset of the principal point from the image center, normalized
        delta_center = tf.concat([(center - img_size // 2) / (img_size // 2),
                                  tf.zeros([center.shape[0], 1, 1])], axis=-1)
        camera_position = tf.constant([0, 0, 10.0])
        camera_lookat = tf.constant([0, 0, 0.0])
        camera_up = tf.constant([0, 1.0, 0])
        light_positions = tf.reshape(tf.constant([0, 0, 1e5]), [1, 1, 3])
        light_intensities = tf.zeros([1, 1, 3])
        depthmap = mesh_renderer(
            vertex, tri, tf.zeros_like(vertex), depth,
            camera_position=camera_position, camera_lookat=camera_lookat,
            camera_up=camera_up, light_positions=light_positions,
            light_intensities=light_intensities,
            image_width=img_size, image_height=img_size,
            fov_y=fov_y, far_clip=30.0, ambient_color=tf.ones([1, 3]),
            delta_center=delta_center)
    return graph, focal, center, depth, vertex, tri, depthmap
```
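A rough usage sketch of this graph (the feed values and array names are illustrative; vertices, triangles, and per_vertex_rgb would come from your own data, with triangle indices 0-based):

```python
import numpy as np
import tensorflow as tf

graph, focal_t, center_t, depth_t, vertex_t, tri_t, depthmap_t = create_renderer_graph()
with tf.Session(graph=graph) as sess:
    out = sess.run(depthmap_t, feed_dict={
        focal_t: np.array([1015.0]),             # empirical focal length
        center_t: np.array([[[112.0, 112.0]]]),  # principal point
        depth_t: per_vertex_rgb,                 # [1, 35709, 3] RGB passed as "depth"
        vertex_t: vertices,                      # [1, 35709, 3]
        tri_t: triangles,                        # [1, 70789, 3], 0-based indices
    })
rendered = out[0, :, :, :3]                      # drop the alpha channel
```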

sicxu commented

You may also use other modern renderers to achieve this, such as pyrender or pytorch3d.
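For example, a minimal pyrender sketch (the filename and intrinsics are illustrative; it assumes the mesh sits near the origin, as in this repo's coordinate convention, and that the obj carries per-vertex colors):

```python
import numpy as np
import trimesh
import pyrender

# load the obj; process=False keeps the vertex order and per-vertex colors intact
tm = trimesh.load('face_hairear.obj', process=False)   # illustrative filename
mesh = pyrender.Mesh.from_trimesh(tm)

scene = pyrender.Scene(ambient_light=np.ones(3))
scene.add(mesh)

# intrinsics matching the empirical values above (focal 1015, center 112)
camera = pyrender.IntrinsicsCamera(fx=1015.0, fy=1015.0, cx=112.0, cy=112.0)
pose = np.eye(4)
pose[2, 3] = 10.0          # camera at z = +10, looking down -z toward the mesh
scene.add(camera, pose=pose)

r = pyrender.OffscreenRenderer(viewport_width=224, viewport_height=224)
color, depth = r.render(scene, flags=pyrender.RenderFlags.FLAT)  # FLAT: use vertex colors
```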

Okay, thank you.

> This code is just for per-vertex projection. If you want to render the 3D model to obtain a 2D image, please refer to a rendering pipeline. In this repo, we use tf_mesh_renderer to render the 3D model. […]

@sicxu The face_tri indices saved in step 3 start from 1; they should start from 0 to render correctly with tf_mesh_renderer.

sicxu commented

Yes, you are correct. The results saved in step 3 are meant to be written out as .obj files for visualization, not to be used directly for rendering with tf_mesh_renderer.
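So when feeding those saved triangles to the renderer, a conversion like this is needed (face_tri here stands for the loaded 1-based index array):

```python
import numpy as np

tri = np.array(face_tri, dtype=np.int32) - 1  # .obj indices are 1-based; tf_mesh_renderer expects 0-based
```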

Hi @fungtion, can you share complete code for rendering the 3D object to a 2D image?
Thanks,

@make-j64 Sure, here is the code I used:

```python
import numpy as np
import tensorflow as tf
# mesh_renderer is the same tf_mesh_renderer function used in this repo
# obj_file: path to the obj saved earlier; rotation: an optional [1, 3, 3]
# rotation matrix (use np.eye(3)[None] for no rotation)

# parse the obj: "v x y z r g b" lines carry geometry + color, "f i j k" lines carry triangles
all_xyz, all_texture, all_tri = [], [], []
with open(obj_file, 'r') as f:
    for line in f.readlines():
        fields = line.strip('v f\n').split(' ')
        if len(fields) == 6:    # vertex line: position + color
            all_xyz.append([float(fields[0]), float(fields[1]), float(fields[2])])
            # colors are stored reversed, so flip them back
            all_texture.append([float(fields[5]), float(fields[4]), float(fields[3])])
        elif len(fields) == 3:  # face line: 1-based vertex indices
            all_tri.append([int(fields[0]), int(fields[1]), int(fields[2])])

# rendering
img_size = 256
with tf.Graph().as_default():
    all_texture = tf.constant(np.expand_dims(np.array(all_texture), 0), dtype=tf.float32)  # [1, n, 3]
    all_tri = np.array(all_tri) - 1                    # obj indices are 1-based
    all_xyz = np.expand_dims(np.array(all_xyz), 0)     # [1, n, 3]
    all_xyz = np.einsum('aij,ajk->aik', all_xyz, rotation)
    all_xyz = tf.constant(all_xyz, dtype=tf.float32)
    all_tri = tf.constant(np.expand_dims(all_tri, 0), dtype=tf.int32)  # [1, m, 3]
    # normals = tf.cast(tf.nn.l2_normalize(all_xyz, axis=1), tf.float32)
    normals = tf.zeros_like(all_xyz)

    v_num = all_xyz.get_shape()[1]
    t_num = all_tri.get_shape()[1]
    focal = tf.constant(1015.0, dtype=tf.float32)
    center = tf.constant(np.array([[[123.0, 105.0]]]), dtype=tf.float32)
    depth = all_texture        # per-vertex RGB passed through the depth argument
    vertex = all_xyz
    tri = all_tri
    fov_y = 2 * tf.atan2(img_size // 2 * tf.ones_like(focal), focal) / np.pi * 180
    delta_center = tf.concat([(center - img_size // 2) / (img_size // 2),
                              tf.zeros([center.shape[0], 1, 1])], axis=-1)
    camera_position = tf.constant([0, 0, 10.0])
    camera_lookat = tf.constant([0.0, 0.0, 0.0])
    camera_up = tf.constant([0, 1.0, 0])
    light_positions = tf.reshape(tf.constant([0, 0, 1e5]), [1, 1, 3])
    light_intensities = tf.zeros([1, 1, 3])
    depthmap = mesh_renderer(vertex, tri, normals, depth,
        camera_position=camera_position, camera_lookat=camera_lookat, camera_up=camera_up,
        light_positions=light_positions, light_intensities=light_intensities,
        image_width=img_size, image_height=img_size,
        fov_y=fov_y, far_clip=40.0, ambient_color=tf.ones([1, 3]), delta_center=delta_center)
    result = depthmap[0][:, :, :3].eval(session=tf.Session())
```
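To inspect the result, one could then scale and save it (a sketch; this assumes the obj colors were in [0, 1], and the channel order may need flipping depending on how the colors were stored):

```python
import cv2
import numpy as np

out = np.clip(result * 255.0, 0, 255).astype(np.uint8)
cv2.imwrite('render.png', out[:, :, ::-1])  # flip RGB -> BGR for OpenCV if needed
```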