xthan/VITON

Can't get results as shown in paper

I tested the model on some images, and I rotated the segment map as well, but I can't reproduce the results shown in the paper. I used the following repositories for the pose model and the human parser.

Pose Model : https://github.com/tensorboy/pytorch_Realtime_Multi-Person_Pose_Estimation
Human Parsing : https://github.com/Engineering-Course/LIP_JPPNet

image

I haven't gone through the code yet. Do I have to do anything more to get a good result? Has anyone got results matching those in the paper? Are there any restrictions on the outputs of the pose and human parsing models, such as a fixed resolution?

I found the reason for the bad results. It's the function 'process_segment_map' used in image pre-processing: part of the image (the segment map) gets cropped in this function. I don't understand why this function is used. @xthan Could you please tell me if there is some reason for using it? Also, the segment map is rotated in this function. If we remove this line, we don't have to rotate the segment map manually. Please correct me if I misunderstood anything here.
The following is the result I got.
image

@anilkumar2444 for the JPPNet, are you using the pretrained model that they provided? I got results that looked completely incorrect when using their model, so I'm just wondering if I should go ahead and train it myself.

I couldn't solve the Caffe dependency for the given model. Instead, I am using the following human parser, which is implemented in TensorFlow. It also provides a pretrained model.
https://github.com/Engineering-Course/LIP_JPPNet

@anilkumar2444 Thanks! I've tried their pretrained model - did it work well for you? My segmentation mask looks like this with their pretrained model... I'm not sure what went wrong.

@anilkumar2444 Hi, I'm working on VITON for a project. I have the same problem with the stage 1 result images. I found your comment where you say that you were able to resolve the problem by removing a line in 'process_segment_map'. Which line were you referring to?

@FabioTarocco This is the function I talked about, in the utils.py file. See the commented lines below.

    import numpy as np

    def process_segment_map(segment, h, w):
        """Extract segment maps."""
        segment = np.asarray(segment, dtype=np.uint8)
        # Original resize-and-crop logic, disabled so the map passes through unchanged:
        # if h >= w:
        #     segment = imresize(segment, (h, h), interp='nearest')
        #     segment = segment[:, :w]
        # else:
        #     segment = imresize(segment, (w, w), interp='nearest')
        #     segment = segment[:h, :]
        return segment
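
For anyone wondering why the commented-out branches lose part of the map, here is a minimal sketch of what they appear to do when the input is already h x w. The assumptions are mine and not from the repo: `nearest_resize` is a hypothetical pure-NumPy stand-in for `imresize(..., interp='nearest')`, `original_process_segment_map` is just my name for the un-commented version, and the 256 x 192 size and label value 5 are only example numbers.

```python
import numpy as np

def nearest_resize(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D label map.

    Hypothetical stand-in for imresize(..., interp='nearest'); exact
    rounding may differ from the real function.
    """
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source column for each output column
    return img[rows[:, None], cols]

def original_process_segment_map(segment, h, w):
    """The function with the commented-out lines restored (sketch)."""
    segment = np.asarray(segment, dtype=np.uint8)
    if h >= w:
        segment = nearest_resize(segment, h, h)   # stretch to a square
        segment = segment[:, :w]                  # keep only the left w columns
    else:
        segment = nearest_resize(segment, w, w)
        segment = segment[:h, :]                  # keep only the top h rows
    return segment

# Example map that is already h x w, with label 5 in the right-most column.
seg = np.zeros((256, 192), dtype=np.uint8)
seg[:, -1] = 5

out = original_process_segment_map(seg, 256, 192)
print(out.shape)          # shape is unchanged: (256, 192)
print((out == 5).any())   # False: the right-hand part was stretched out of the crop
```

So if the parser output already matches the image size, the stretch-then-crop step throws away the right-hand (or bottom) part of the map. That would be consistent with the cropping described above, though I can't say what input layout the original code expects.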

@anilkumar2444 I tried modifying the function in 'utils.py': I commented out everything except segment = np.asarray(....) and the return, but this is the result:
Before:
ProblemaCartonato
After:
capture

@anilkumar2444 Thanks for the .tar you sent me. Sorry for the late reply, I was busy with some exams.
I want to ask you a few more questions: Which version of TensorFlow did you use? And were there any casting issues in utils.py?