Performance issue on green screen segmented Image
kashyappiyush1998 opened this issue · 11 comments
What is your input mask? Which one is the output?
It might even be easier if the entire background is green -- that way the algorithm can model the background. A class-agnostic algorithm might have difficulty separating FG/BG otherwise.
The first one is the input and the second one is the output. I want to build a pipeline with finer edges, using CascadePSP as the final step.
I cannot always guarantee that the background will be monotonous. I am using a UNet for background removal and then CascadePSP for better edges.
For background removal, does your UNet output a mask or an image?
It outputs a mask image, from which I extract the output image using PIL.
If I understand correctly, the pipeline is:
1. (input image) -> UNet -> initial mask
2. (input image + initial mask) -> CascadePSP -> final mask
It is better to look at the (input image + initial mask) for diagnosis. Can I take a look at them?
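For reference, that assumed two-stage pipeline could be sketched roughly as follows. run_unet is only a placeholder for the background-removal model (here it just loads a precomputed mask so the sketch runs end to end); the CascadePSP call follows the usage shown later in this thread.

import cv2
import segmentation_refinement as refine

def run_unet(image):
    # Placeholder for the actual UNet inference; it should return a
    # single-channel uint8 mask (0 = background, 255 = foreground).
    return cv2.imread('mask.png', cv2.IMREAD_GRAYSCALE)

image = cv2.imread('im.jpg')

# Stage 1: background-removal network produces a coarse initial mask
initial_mask = run_unet(image)

# Stage 2: CascadePSP refines (image, initial mask) into the final mask
refiner = refine.Refiner(device='cuda:0')  # device can also be 'cpu'
final_mask = refiner.refine(image, initial_mask, fast=False, L=900)

cv2.imwrite('final_mask.png', final_mask)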
Thanks for the images! I got a slightly different result than you:
Initial masked:
It is not perfect -- there is still a green boundary around the image. This is perhaps because it is an out-of-distribution image (we train only on natural images). Another reason is JPEG compression artifacts: if you zoom in closely, there is a darker green boundary near the portrait, which our algorithm might mistake for the actual edge. Refinement might still be of use -- in the original composited image, the green boundary is not consistent (over-segmented on the left shoulder, under-segmented on the right side of the neck), while in the refined image I think the mask consistently over-segments by 1.5 (or 2?) pixels.
Applying an erosion with a (5, 5) kernel
output = cv2.erode(output, kernel=np.ones((5, 5)))
makes the result look almost perfect to my eyes.
I also tried to play around with your initial mask (dilating/eroding it by 10 or even 20 pixels); our algorithm outputs a similar mask under these perturbations, so it might be robust enough for your application.
Thanks for your interest! I actually learnt something :)
Code for reference:
import cv2
import numpy as np
import segmentation_refinement as refine

image = cv2.imread('im.jpg')
mask = cv2.imread('mask.png', cv2.IMREAD_GRAYSCALE)

refiner = refine.Refiner(device='cuda:0') # device can also be 'cpu'
output = refiner.refine(image, mask, fast=False, L=900)

# Shrink the refined mask slightly to remove the residual green boundary
output = cv2.erode(output, kernel=np.ones((5, 5)))
cv2.imwrite('test/out.png', output)

# Composite the image with the initial and refined masks for comparison
mask = mask.astype(np.float32)/255
output = output.astype(np.float32)/255
cv2.imwrite('im_mask.png', image*mask[:,:,None])
cv2.imwrite('im_out.png', image*output[:,:,None])
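A rough sketch of the robustness check mentioned above -- dilating/eroding the initial mask by 10 or 20 pixels before refinement and comparing the results -- assuming the same im.jpg/mask.png files as in the code above (the kernel sizes and output filenames are just illustrative):

import cv2
import numpy as np
import segmentation_refinement as refine

image = cv2.imread('im.jpg')
mask = cv2.imread('mask.png', cv2.IMREAD_GRAYSCALE)
refiner = refine.Refiner(device='cuda:0')  # device can also be 'cpu'

# Refine the unperturbed mask as a reference
baseline = refiner.refine(image, mask, fast=False, L=900)
cv2.imwrite('refined_baseline.png', baseline)

# Perturb the initial mask by roughly 10 and 20 pixels and refine again;
# if refinement is robust, the results should stay close to the baseline.
for size in (10, 20):
    kernel = np.ones((size, size), np.uint8)
    for name, perturbed in (('dilated', cv2.dilate(mask, kernel)),
                            ('eroded', cv2.erode(mask, kernel))):
        refined = refiner.refine(image, perturbed, fast=False, L=900)
        cv2.imwrite(f'refined_{name}_{size}.png', refined)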
Thank you so much. This will be of great use.
But I have to implement this as a general model. Erosion and dilation may work for green/blue screen images, but if I apply them to every image, they will degrade the output quality for non-green-screen images or for images that already get better results from hair matting.
Well, the question was about green screen images, and the problems I have pointed out are all specific to green screen images. I don't observe the same issues in real-world images to start with, so I think you can use it directly without erosion for real-world images. It probably wouldn't work for matting though.
Ok, got it. Thanks!