isl-org/PhotorealismEnhancement

How to create your own datasets?

Closed this issue · 2 comments

Could you please provide more details on how to prepare my own dataset? Specifically, how do I generate the "robust label map, gbuffer file, and ground truth label map"? I also could not find "dataset/generate_fake_gbuffers.py" in your repo, so I'm stuck at this point.

The robust label maps are generated from images via a pre-trained robust semantic segmentation network. The intuition here is to have a segmentation of the input image that is semantically meaningful and roughly consistent across synthetic and real data. This consistency is generally a challenge: methods trained on just one dataset from a narrow domain are unlikely to work well on another dataset. This is why we used a method that had been trained on multiple datasets and shown to generalize well. The maps are supposed to be one-hot encoded maps at the same resolution as the input image, with each channel representing a different semantic class.
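As a rough illustration of the one-hot encoding described above (a sketch, not code from this repo): assuming the segmentation network gives you an (H, W) array of integer class ids, the map can be built with NumPy as below. The function name `one_hot_label_map`, the channel-first layout, and the float32 dtype are assumptions made for this example; check what the data loader actually expects.

```python
import numpy as np

def one_hot_label_map(class_ids: np.ndarray, num_classes: int) -> np.ndarray:
    """Convert an (H, W) map of integer class ids to a (num_classes, H, W) one-hot map.

    Channel-first layout and float32 are assumptions for this sketch.
    """
    if class_ids.ndim != 2:
        raise ValueError("expected a 2D (H, W) array of class ids")
    if class_ids.min() < 0 or class_ids.max() >= num_classes:
        raise ValueError("class id outside [0, num_classes)")
    # Compare every pixel against each class index; broadcasting yields (C, H, W).
    classes = np.arange(num_classes)[:, None, None]
    return (classes == class_ids[None, :, :]).astype(np.float32)
```

Each pixel then has exactly one channel set to 1, and the spatial size matches the input image.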

The gbuffer file in our case was an npz-file containing multiple G-buffers that correspond to the input image at a pixel level. The G-buffers contained e.g., surface normals, the view vector reflected over the surface normal, distance to the camera, surface albedo, glossiness, or approximate irradiance. The intuition is to encode geometry, materials, and lighting information for each pixel in the input image such that the network does not need to learn how to extract this data from an image itself.
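To make the npz layout a bit more concrete, here is a minimal sketch of packing several per-pixel buffers into one file with NumPy. The key names (`normal`, `depth`, `albedo`), the (H, W, C) layout, and the helper `save_gbuffers` are placeholders chosen for this example; the keys and shapes the pipeline actually reads are not specified in this thread.

```python
import numpy as np

def save_gbuffers(file, buffers):
    """Save per-pixel G-buffers that share one resolution into a single npz file.

    `file` may be a path or a binary file object; `buffers` maps a name to an
    (H, W, C) array. Names and layout are assumptions for this sketch.
    """
    sizes = {name: arr.shape[:2] for name, arr in buffers.items()}
    if len(set(sizes.values())) != 1:
        raise ValueError(f"G-buffers must share one resolution, got: {sizes}")
    np.savez_compressed(file, **buffers)

# Tiny placeholder resolution; in practice use the size of the rendered image.
height, width = 4, 6
gbuffers = {
    "normal": np.zeros((height, width, 3), dtype=np.float32),  # surface normals
    "depth": np.ones((height, width, 1), dtype=np.float32),    # distance to camera
    "albedo": np.zeros((height, width, 3), dtype=np.float32),  # surface albedo
}
# save_gbuffers("frame_0001.npz", gbuffers)  # hypothetical output path
```

Loading the file back with `np.load` gives access to each buffer by its key, which makes it easy to check that every buffer lines up with the image pixel for pixel.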

The ground truth label map is a semantic segmentation obtained from the synthetic data. The goal here is to cluster pixels with approximately the same material/appearance properties. The resulting maps are then used by the network to process G-buffers in separate streams. Each stream is intended to roughly correspond to some class of materials (or objects, if derived from a semantic segmentation map). Concretely, the ground truth label maps would be one-hot encoded maps, like the robust label maps above.
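One possible way to get from a semantic segmentation to such material/appearance groups is a simple lookup table followed by one-hot encoding, sketched below. The class-to-group table here is entirely made up for illustration, and `group_label_map` is a hypothetical helper, not part of this repo; the right grouping depends on your simulator's classes.

```python
import numpy as np

# Hypothetical table: index = semantic class id, value = material group id.
# E.g. classes 0 and 1 share group 0, class 2 is group 1, classes 3 and 4 share group 2.
CLASS_TO_GROUP = np.array([0, 0, 1, 2, 2])

def group_label_map(class_ids: np.ndarray, class_to_group: np.ndarray) -> np.ndarray:
    """Map an (H, W) class-id image to a (num_groups, H, W) one-hot group map."""
    groups = class_to_group[class_ids]           # (H, W) group ids via table lookup
    num_groups = int(class_to_group.max()) + 1
    one_hot = np.eye(num_groups, dtype=np.float32)[groups]  # (H, W, num_groups)
    return one_hot.transpose(2, 0, 1)            # channel-first, an assumption
```

Each output channel then marks the pixels belonging to one group, which is the kind of mask a per-group stream could use.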

Sorry about the missing generate_fake_gbuffers.py. I'll add it shortly.

I have some questions about the ground truth label map.
What is the difference between the ground truth label map and the robust label map for synthetic data? I think they are the same.

The readme says "The pipeline expects for each dataset a txt file containing paths to all images. Each line should contain paths to image, robust label map, gbuffer file, and ground truth label map, all separated by commas." So for fake_dataset in the config file, are the robust label map and the ground truth label map the same here?
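For reference, a txt file in the format quoted from the readme could be produced with something like the sketch below. The helper names and the example paths and file extensions are hypothetical; only the column order and the comma separator come from the readme text.

```python
def format_dataset_line(image, robust_label, gbuffer, gt_label):
    """Build one line: image, robust label map, gbuffer file, ground truth label map."""
    paths = [str(p) for p in (image, robust_label, gbuffer, gt_label)]
    for p in paths:
        # A comma or newline inside a path would break a comma-separated line.
        if "," in p or "\n" in p:
            raise ValueError(f"path contains a comma or newline: {p!r}")
    return ",".join(paths)

def write_dataset_list(list_path, samples):
    """Write one comma-separated line per sample (each sample is a 4-tuple of paths)."""
    with open(list_path, "w") as f:
        for sample in samples:
            f.write(format_dataset_line(*sample) + "\n")

# Hypothetical paths, for illustration only.
example_line = format_dataset_line(
    "img/0001.png", "robust/0001.npz", "gbuf/0001.npz", "gt/0001.npz"
)
```

Here `example_line` is `img/0001.png,robust/0001.npz,gbuf/0001.npz,gt/0001.npz`, i.e. four paths in the order the readme lists them.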