Suction grasp labelling
akash-greyorange opened this issue · 10 comments
Hello @andyzeng ,
I went through your work and it's really great to see such good development on the grasping side. I want to know more about grasp labeling in the case of a suction gripper. Can it currently take the suction cup diameter as input, and is the labeling automated?
Our grasp labeling interface for creating ground truth training labels is manual, and requires the user to paint over the image pixels. I don't think our interface takes the suction cup diameter as input. If by grasp labeling you meant grasp predicting - then yes, it is fully automated, and does not take the suction cup diameter as input.
Thanks for the quick response. No, by automated grasp labeling I meant generating ground truth labels for training the FCN, and that's what I am looking for. So this isn't related to this repo, but can you point me to any other sources for automatically labeling RGB-D images of objects and generating ground truth, ideally taking the suction cup diameter as input?
I see. Unfortunately I don't know of a system that does automatic ground truth grasp labeling for real RGB-D images. Perhaps I'm not understanding your question correctly -- but if you have that kind of system, doesn't it make the FCN obsolete?
Hello @andyzeng ,
Thanks for sharing your achievements.
I'm kind of curious what tools you used to label the dataset of RGB-D heightmaps. If it is convenient, could you share this tool or the details of the annotation process?
@andyzeng No, I don't think automated labeling of grasps on RGB-D images would make the FCN obsolete, since training the FCN on one set of objects and using the trained model to predict grasp quality on a different set of objects can still be beneficial. Please go through the following link. Dex-Net does automated labeling on RGB-D images and trains an FCN from that, so basically I was looking for similar sources to create an automated labeler that takes the suction cup diameter as input; sadly, Dex-Net hasn't released its dataset generation code. Please go through the link and get back to me if you know of any similar sources. Thank you.
@carsonzhu For labeling the images and heightmaps, we wrote a script in Python with OpenCV to “paint” the labels over the images. Unfortunately I’m not sure where the original script is, but it was very similar in spirit to this demo.
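For anyone looking to reproduce such a tool, below is a minimal sketch in the same spirit as the OpenCV demo mentioned above. It is not the original script; the file paths, brush size, and key bindings are placeholder choices. Left-drag paints positive pixels, right-drag paints negative pixels, `s` saves the label mask, `q` quits.

```python
# Minimal paint-style labeling sketch (not the authors' original script).
import cv2
import numpy as np

image = cv2.imread("color.png")               # placeholder path to an RGB(-D) image
labels = np.zeros(image.shape[:2], np.uint8)  # 0 = unlabeled, 1 = positive, 2 = negative
brush_radius = 5                              # arbitrary brush size in pixels
drawing = {"value": 0}

def paint(event, x, y, flags, param):
    # Start painting positives/negatives on button press, stop on release.
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing["value"] = 1
    elif event == cv2.EVENT_RBUTTONDOWN:
        drawing["value"] = 2
    elif event in (cv2.EVENT_LBUTTONUP, cv2.EVENT_RBUTTONUP):
        drawing["value"] = 0
    if drawing["value"] and event in (cv2.EVENT_MOUSEMOVE,
                                      cv2.EVENT_LBUTTONDOWN,
                                      cv2.EVENT_RBUTTONDOWN):
        cv2.circle(labels, (x, y), brush_radius, drawing["value"], -1)

cv2.namedWindow("label")
cv2.setMouseCallback("label", paint)
while True:
    overlay = image.copy()
    overlay[labels == 1] = (0, 255, 0)   # show positives in green
    overlay[labels == 2] = (0, 0, 255)   # show negatives in red
    cv2.imshow("label", cv2.addWeighted(image, 0.5, overlay, 0.5, 0))
    key = cv2.waitKey(20) & 0xFF
    if key == ord("s"):
        cv2.imwrite("labels.png", labels)
    elif key == ord("q"):
        break
cv2.destroyAllWindows()
```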
@akash-greyorange Thanks for the link - we are well aware of their work. Dex-Net performs automatic grasp labeling on synthetic data, then trains a CNN to transfer to real data. I mistook your question for asking about automatic ground truth grasp labeling on real RGB-D data, which is impossible. If automatically generating synthetic data labels was what you were looking for, this survey paper provides a solid overview of heuristic/analytic methods that function very well on synthetic data. A few of the methods also work on real data, but are far from being able to generate ground truth grasp labels on real data. Best of luck.
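To give a concrete sense of what such heuristic/analytic methods do on synthetic data, here is an illustrative sketch (not Dex-Net's code and not this repo's). One simple suction heuristic: back-project a rendered depth map into a point cloud, estimate surface normals, and label a pixel positive if the surface inside the suction cup footprint is nearly planar and the neighboring normals agree with the center normal. The camera intrinsics, cup radius, and thresholds below are placeholder values.

```python
# Sketch of an analytic suction-label heuristic for a *synthetic* depth image (meters).
import numpy as np

FX = FY = 600.0               # placeholder pinhole focal lengths (pixels)
CX, CY = 320.0, 240.0         # placeholder principal point
CUP_RADIUS_M = 0.015          # e.g. a 30 mm diameter suction cup

def surface_normals(depth):
    """Back-project the depth map and estimate per-pixel surface normals."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) * depth / FX
    y = (v - CY) * depth / FY
    pts = np.stack([x, y, depth], axis=-1)
    du = np.gradient(pts, axis=1)      # tangent along image columns
    dv = np.gradient(pts, axis=0)      # tangent along image rows
    n = np.cross(du, dv)
    n /= (np.linalg.norm(n, axis=-1, keepdims=True) + 1e-8)
    return pts, n

def suction_labels(depth, flatness_tol=0.0015, normal_tol=0.97, stride=4):
    """Binary label map: 1 where the surface under the cup footprint is nearly planar."""
    pts, normals = surface_normals(depth)
    h, w = depth.shape
    labels = np.zeros((h, w), np.uint8)
    for v in range(0, h, stride):
        for u in range(0, w, stride):
            z = depth[v, u]
            if z <= 0:
                continue
            r_px = int(np.ceil(CUP_RADIUS_M * FX / z))   # cup radius in pixels at this depth
            v0, v1 = max(0, v - r_px), min(h, v + r_px + 1)
            u0, u1 = max(0, u - r_px), min(w, u + r_px + 1)
            patch_pts = pts[v0:v1, u0:u1].reshape(-1, 3)
            n = normals[v, u]
            # Distance of each neighboring point from the tangent plane at the center.
            dist = np.abs((patch_pts - pts[v, u]) @ n)
            # Alignment of neighboring normals with the center normal.
            align = normals[v0:v1, u0:u1].reshape(-1, 3) @ n
            if dist.max() < flatness_tol and align.mean() > normal_tol:
                labels[v, u] = 1
    return labels
```

This only works reliably on clean rendered depth; as noted above, sensor noise and missing depth on real RGB-D images are exactly why such heuristics fall short of ground truth on real data.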
@andyzeng Thanks a lot for that paper reference. I will surely go through it for further development. Since you said automatic ground truth grasp labeling on real RGB-D data is not possible, can you please explain why? Is it because we don't have an exact segmentation mask of a real-world object, unlike a synthetic object dataset where the object's segmentation mask is available? So is grasp labeling on synthetic data and transferring it to real RGB-D data, as shown in Dex-Net, the only solution to this problem if I don't want to spend precious human labor time manually labeling grasps?
@andyzeng Thanks very much. I've got a good tool and it made it easier to label the images manually.
@andyzeng thanks for sharing such a great achievement. I'm running out of memory while trying to run this model. Is there any way to run it with minimal memory, or to run it on the CPU? I use a GPU with 2GB of memory. I'm very much new to Torch in Lua. Thanks for any answer.
@carsonzhu I am curious what tool you found and used to label the images manually, and how you labeled those images? Thanks.
Hello @andyzeng , thanks for sharing your nice work. I went through your work and read your paper, but could not understand how a label represents its corresponding parallel-jaw grasp. In your paper I found the sentence describing the labelling process: "where each positive and negative grasp label is represented by a pixel on the heightmap as well as a corresponding angle parallel to the jaw motion of the gripper." If it does not bother you, can you briefly explain what positive and negative grasps are, and how the grey and black dots in a labelling image stand for a grasp? Thanks
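To make the quoted representation concrete: a positive label typically marks a pixel/angle pair at which a grasp is expected to succeed and a negative label one at which it is expected to fail, though the grey/black dot encoding itself is best confirmed by the authors. Below is an illustrative sketch (not code from the repo; the heightmap origin, resolution, and number of discrete rotations are placeholder assumptions) of how a label stored as (pixel row, pixel column, rotation index) could be decoded into a parallel-jaw grasp position and jaw-closing angle.

```python
# Hedged illustration of decoding a heightmap grasp label into a grasp pose.
import numpy as np

HEIGHTMAP_ORIGIN = np.array([-0.25, -0.25, 0.0])  # workspace corner in robot frame (m), placeholder
HEIGHTMAP_RES = 0.002                             # meters per heightmap pixel, placeholder
NUM_ROTATIONS = 16                                # discrete gripper orientations over 180 deg, placeholder

def decode_grasp_label(row, col, rot_idx, height_value):
    """Map a (row, col, rotation index) label plus the heightmap value to a grasp pose."""
    x = HEIGHTMAP_ORIGIN[0] + col * HEIGHTMAP_RES
    y = HEIGHTMAP_ORIGIN[1] + row * HEIGHTMAP_RES
    z = HEIGHTMAP_ORIGIN[2] + height_value        # surface height at that pixel
    theta = rot_idx * (np.pi / NUM_ROTATIONS)     # angle of the jaw-closing axis
    return np.array([x, y, z]), theta

# Example: a label at pixel (120, 85) with rotation bin 4 decodes to a grasp
# centered at that 3D point, with the jaws closing along direction theta.
position, theta = decode_grasp_label(120, 85, 4, height_value=0.03)
```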