luciddreamer-cvlab/LucidDreamer

Bug in the image condition preprocessing step

cmh1027 opened this issue · 0 comments

I think the mask generation in the preprocessing step of generate_pcd should be changed as follows (the commented-out lines are the current code, each followed by the proposed replacement):

if w_in/h_in > 1.1 or h_in/w_in > 1.1: # if there is a large gap between height and width, do inpainting
    in_res = max(w_in, h_in)
    # image_in, mask_in = np.zeros((in_res, in_res, 3), dtype=np.uint8), 255*np.ones((in_res, in_res, 3), dtype=np.uint8)
    image_in, mask_in = np.zeros((in_res, in_res, 3), dtype=np.uint8), np.zeros((in_res, in_res, 3), dtype=np.uint8)
    image_in[int(in_res/2-h_in/2):int(in_res/2+h_in/2), int(in_res/2-w_in/2):int(in_res/2+w_in/2)] = np.array(rgb_cond)
    # mask_in[int(in_res/2-h_in/2):int(in_res/2+h_in/2), int(in_res/2-w_in/2):int(in_res/2+w_in/2)] = 0
    mask_in[int(in_res/2-h_in/2):int(in_res/2+h_in/2), int(in_res/2-w_in/2):int(in_res/2+w_in/2)] = 255
    Image.fromarray(image_in).save("outputs/temp/image_in.png")
    Image.fromarray(mask_in).save("outputs/temp/mask_in.png")
    image2 = np.array(Image.fromarray(image_in).resize((self.cam.W, self.cam.H))).astype(float) / 255.0
    mask2 = np.array(Image.fromarray(mask_in).resize((self.cam.W, self.cam.H))).astype(float) / 255.0

    image_curr = self.rgb(
        prompt=prompt,
        image=image2,
        negative_prompt=negative_prompt, generator=generator,
        mask_image=mask2,   
    )

This is because the self.rgb function also inverts the mask before passing it on:
mask_pil = Image.fromarray(np.round((1 - mask_image) * 255.).astype(np.uint8))
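
To illustrate the effect, here is a minimal NumPy-only sketch of the two mask variants after the inversion above. The helper names `build_padded_mask` and `invert_like_self_rgb` are hypothetical and not part of the repository. The resize to `(self.cam.W, self.cam.H)` is skipped, and I am assuming the inpainting pipeline follows the usual convention that white (255) pixels are repainted and black (0) pixels are kept.

```python
import numpy as np

def build_padded_mask(h_in, w_in, keep_value):
    """Square mask where the centered input-image region is set to keep_value
    and the padding is set to the opposite value (hypothetical helper)."""
    in_res = max(h_in, w_in)
    mask = np.full((in_res, in_res, 3), 255 - keep_value, dtype=np.uint8)
    top, bottom = int(in_res / 2 - h_in / 2), int(in_res / 2 + h_in / 2)
    left, right = int(in_res / 2 - w_in / 2), int(in_res / 2 + w_in / 2)
    mask[top:bottom, left:right] = keep_value
    return mask

def invert_like_self_rgb(mask_uint8):
    """Mimics the scaling to [0, 1] and the inversion quoted from self.rgb."""
    mask_image = mask_uint8.astype(float) / 255.0
    return np.round((1 - mask_image) * 255.0).astype(np.uint8)

# A wide 4x8 input padded to 8x8: rows 2..5 hold the image, the rest is padding.
h_in, w_in = 4, 8
proposed = invert_like_self_rgb(build_padded_mask(h_in, w_in, keep_value=255))
current = invert_like_self_rgb(build_padded_mask(h_in, w_in, keep_value=0))

# Proposed mask: padding ends up white (repainted), image region black (kept).
print("proposed  padding:", proposed[0, 0, 0], " image:", proposed[3, 3, 0])
# Current mask: padding ends up black (kept), image region white (repainted).
print("current   padding:", current[0, 0, 0], " image:", current[3, 3, 0])
```

Under that assumption, the current code would ask the model to repaint the conditioning image and leave the black padding alone, which is the opposite of what the preprocessing step intends.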