zju3dv/object_nerf

Object Activation Library and codes

yyashpatel opened this issue · 7 comments

Hi!

How are the object activation codes defined according to the number of objects in the scene?
Say I have only 2 objects in the scene: what should the activation library be? Or if I have only 1 object, what should the activation library be?
So basically, how are the activation codes assigned to the individual objects in the scene?

In the code I saw that you use a torch embedding layer with input indices 1 to 5, so does this mean the code library gives 5 activation codes for the 5 objects (cube, dragon, face, candy, magic cube)? But which code belongs to which object?
If I only wanted to render 3 of the 5 objects, what should be done?

Could you please help me understand how the object codes are selected? @ybbbbt

Hi, the code index is selected according to the labels in the segmentation images, which are manually assigned during data preparation. You can modify the data loader to skip the decoupled learning of some objects.
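
Roughly speaking, the code library is a learnable embedding indexed by those labels. A simplified sketch (the module name and sizes here are illustrative, not the repo's exact configuration):

import torch
import torch.nn as nn

# Simplified sketch of a code library: one learnable code per instance label.
num_instances = 5   # e.g. cube, dragon, face, candy, magic cube
code_dim = 64       # dimensionality of each object code (illustrative)

embedding_instance = nn.Embedding(num_instances + 1, code_dim)  # row 0 left for background

# The label stored in the segmentation image picks the row, so label 3 always
# retrieves the same code for that object, in every view.
instance_ids = torch.tensor([1, 3, 5])
object_codes = embedding_instance(instance_ids)  # shape: (3, code_dim)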

Oh okay, I see you are using the labels (1, 2, ..., 5) to obtain the object codes. But if I want to edit my own scene captured from a set of images with my own objects (I have the mask images, i.e. pixel value = 1 at the object locations), how do I select the object codes for my own objects?

Thanks!

Hi yyashpatel,

I load the labeled segmentation masks at:

def get_instance_mask(self, instance_path, instance_id):
    # Load the labeled instance image and resize it to the target resolution,
    # using nearest-neighbor interpolation so the integer labels stay intact.
    instance = cv2.resize(
        cv2.imread(instance_path, cv2.IMREAD_ANYDEPTH),
        self.img_wh,
        interpolation=cv2.INTER_NEAREST,
    )
    # Build a boolean mask for a single ID or the union of several IDs.
    if isinstance(instance_id, int):
        mask = instance == instance_id
    elif isinstance(instance_id, list):
        mask = np.zeros_like(instance).astype(bool)
        for id in instance_id:
            mask = np.logical_or(mask, instance == id)
    return mask

Then, you can modify instance_id and val_instance_id in the config file to let the network know which instances (with the specified IDs) should be trained and validated (visualized in TensorBoard):

val_instance_id: 4
instance_id: [4, 6]
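
If you only have per-object binary masks, you can merge them into a single labeled instance image (the format get_instance_mask reads) by writing each object's chosen ID into its pixels. A rough sketch, with illustrative file names:

import cv2
import numpy as np

# Rough sketch: combine per-object binary masks into one instance-ID image.
# The IDs themselves are arbitrary, but each object must keep the same ID in every view.
object_masks = {
    4: "near_sofa_mask.png",    # ID 4 -> near sofa
    6: "center_sofa_mask.png",  # ID 6 -> center sofa
}

instance_img = None
for instance_id, mask_path in object_masks.items():
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE) > 0
    if instance_img is None:
        instance_img = np.zeros(mask.shape, dtype=np.uint16)
    instance_img[mask] = instance_id

# Saved as a 16-bit PNG so it can be read back with cv2.IMREAD_ANYDEPTH.
cv2.imwrite("frame_0000_instance.png", instance_img)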

Thanks, I get it!

One last question: can I change the numbers used as instance IDs? For example, in the config file you shared, 4 represents the near sofa and 6 represents the center sofa.

I have my own objects (a box and a book), so what instance IDs should I assign to them? I know you assigned the IDs based on the segmentation labels, but is there a basis for selecting the instance ID of an object, or can I assign any number?

I have the masks of all my objects, but I do not have segmentation labels, so on what basis should I pick the numbers for my instance IDs? Should I just assign any 2 numbers as instance IDs?

It would be great if you could clarify this. Thanks!

You can assign any numbers as IDs, as long as they match the input training data (i.e., instance_ids in the batch) and the number for each object identity is always consistent across views:

curr_frame_instance_ids += [sample["instance_ids"]]

During inference, we assign the learnable object code according to the input IDs:

if "instance_ids" in inputs:
ret_dict["embedding_instance"] = self.embedding_instance(
inputs["instance_ids"].squeeze()
)

For example, you can assign the box ID 1 and the book ID 2, and also make sure that each training ray's instance_ids in the dataloader is properly set to these values.
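
As a rough illustration (the names are approximate, not the exact dataloader code), the per-ray IDs can simply be read from the labeled instance image at each sampled pixel, so the embedding lookup above retrieves the right code per ray:

import torch

# Approximate sketch: look up the instance ID of each sampled ray from the
# labeled instance image (an H x W numpy array of integer labels).
def per_ray_instance_ids(instance_img, ray_rows, ray_cols):
    ids = instance_img[ray_rows, ray_cols]        # e.g. 1 for box pixels, 2 for book pixels, 0 for background
    return torch.from_numpy(ids.astype("int64"))  # long tensor, ready for the nn.Embedding lookup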

Please feel free to let me know if you have any questions.

Thanks for your reply. I understand it now.

Just one last thing: I want to train and render only the object, so I have made changes such that I only train the object branch. As a result, the color loss sometimes becomes None because of the line below:

https://github.com/zju3dv/object_nerf/blob/e8b1a7a5ab0596babdf4400a9e8908f1bfdcf990/models/losses.py#L75

Does this affect the learning of the network?
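
My understanding of the pattern there (an illustrative sketch, not the repo's exact code) is that the masked color loss is skipped when a batch contains no rays that hit the object mask:

import torch

# Illustrative sketch of the kind of guard that can make a loss term None.
def masked_color_loss(pred_rgb, gt_rgb, instance_mask):
    if instance_mask.sum() == 0:
        return None  # no object rays in this batch -> the color term is skipped
    return torch.mean((pred_rgb[instance_mask] - gt_rgb[instance_mask]) ** 2)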

Thanks for your help, I figured it out.