Vision-based robotic grasping methods can be roughly divided by grasp type into two kinds: 2D planar grasp and 6DoF grasp. This repository summarizes methods from recent years, most of which use deep learning. Previous review papers are listed first.
[arXiv] 2019-A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms, [paper]
[arXiv] 2019-Vision-based Robotic Grasping from Object Localization, Pose Estimation, Grasp Detection to Motion Planning: A Review, [paper]
[MTI] 2018-Review of Deep Learning Methods in Robotic Grasp Detection, [paper]
[ToR] 2016-Data-Driven Grasp Synthesis - A Survey, [paper]
[RAS] 2012-An overview of 3D object grasp synthesis algorithms - A Survey, [paper]
Grasp Representation: The grasp is represented as an oriented 2D box, and the gripper is constrained to a single approach direction.
These methods directly regress the oriented 2D box from RGB or RGB-D images. With RGB-D input, the depth image is treated as an additional channel, so the pipeline is similar to that of RGB-based methods.
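As an illustration of the oriented 2D box, here is a minimal sketch that assumes the common five-parameter form (center, angle, width, height). The class name `GraspRectangle` and its field names are hypothetical and are not taken from any listed paper.

```python
import math
from dataclasses import dataclass


@dataclass
class GraspRectangle:
    """Oriented 2D grasp box in image coordinates (hypothetical container)."""
    cx: float      # box center, x in pixels
    cy: float      # box center, y in pixels
    theta: float   # rotation relative to the image x-axis, in radians
    width: float   # extent along the box's own x-axis (assumed: gripper opening)
    height: float  # extent along the box's own y-axis (assumed: finger size)

    def corners(self):
        """Return the four corners as (x, y) tuples, in order around the box."""
        c, s = math.cos(self.theta), math.sin(self.theta)
        half_w, half_h = self.width / 2.0, self.height / 2.0
        local = [(-half_w, -half_h), (half_w, -half_h),
                 (half_w, half_h), (-half_w, half_h)]
        # Rotate each local offset by theta, then shift to the box center.
        return [(self.cx + c * dx - s * dy, self.cy + s * dx + c * dy)
                for dx, dy in local]
```

For example, `GraspRectangle(10, 20, 0.0, 4, 2).corners()` gives the axis-aligned corners of a 4 x 2 box centered at (10, 20).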
2019:
[arXiv] Form2Fit: Learning Shape Priors for Generalizable Assembly from Disassembly, [paper] [code]
[IROS] GRIP: Generative Robust Inference and Perception for Semantic Robot Manipulation in Adversarial Environments, [paper]
[arXiv] Efficient Fully Convolution Neural Network for Generating Pixel Wise Robotic Grasps With High Resolution Images, [paper]
[arXiv] A Single Multi-Task Deep Neural Network with Post-Processing for Object Detection with Reasoning and Robotic Grasp Detection, [paper]
[IROS] ROI-based Robotic Grasp Detection for Object Overlapping Scenes, [paper]
[IROS] SilhoNet: An RGB Method for 6D Object Pose Estimation, [paper]
[ICRA] Multi-View Picking: Next-best-view Reaching for Improved Grasping in Clutter, [paper] [code]
2018:
[arXiv] Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Networks with High-Resolution Images, [paper]
[arXiv] Real-world Multi-object, Multi-grasp Detection, [paper]
[ICRA] Robotic Pick-and-Place of Novel Objects in Clutter with Multi-Affordance Grasping and Cross-Domain Image Matching, [paper] [code]
2017:
[IROS] Robotic Grasp Detection using Deep Convolutional Neural Networks, [paper]
2016:
[ICRA] Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours, [paper]
2015:
[ICRA] Real-time grasp detection using convolutional neural networks, [paper] [code]
2014:
[IJRR] Deep Learning for Detecting Robotic Grasps, [paper]
Datasets:
Cornell dataset: 1035 images of 280 different objects.
These methods obtain the grasp pose indirectly, in two steps: grasp candidate generation and grasp quality evaluation. The candidate with the highest score is selected as the final grasp.
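The two-step scheme (generate candidates, score them, keep the best) can be sketched as follows. This is only a toy under stated assumptions: the uniform sampler and the depth-based score are placeholders invented for illustration, standing in for the learned quality networks used in the papers listed here.

```python
import numpy as np


def sample_candidates(depth, num, rng):
    """Placeholder sampler: uniform (row, col, angle) grasp candidates."""
    h, w = depth.shape
    rows = rng.integers(0, h, size=num)
    cols = rng.integers(0, w, size=num)
    angles = rng.uniform(-np.pi / 2, np.pi / 2, size=num)
    return np.stack([rows, cols, angles], axis=1)


def toy_quality(depth, candidates):
    """Placeholder score: prefer pixels closer to the camera.

    A learned grasp quality network would take this function's place.
    """
    rows = candidates[:, 0].astype(int)
    cols = candidates[:, 1].astype(int)
    return -depth[rows, cols]


def select_grasp(depth, num=64, seed=0):
    """Return the highest-scoring candidate and its score."""
    rng = np.random.default_rng(seed)
    candidates = sample_candidates(depth, num, rng)
    scores = toy_quality(depth, candidates)
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])
```

The structure is the point here: the sampler and the scorer are independent, so either can be swapped without touching the selection step.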
2019:
[IROS] GQ-STN: Optimizing One-Shot Grasp Detection based on Robustness Classifier, [paper]
[ICRA] Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter, [paper]
[ICRA] MetaGrasp: Data Efficient Grasping by Affordance Interpreter Network, [paper]
[IROS] GlassLoc: Plenoptic Grasp Pose Detection in Transparent Clutter, [paper]
2018:
[RSS] Closing the Loop for Robotic Grasping: A Real-time, Generative Grasp Synthesis Approach, [paper]
[BMVC] EnsembleNet: Improving Grasp Detection using an Ensemble of Convolutional Neural Networks, [paper]
2017:
[RSS] Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics, [paper] [code]
Datasets:
Dex-Net, a synthetic dataset of 6.7 million point clouds, grasps, and robust analytic grasp metrics generated from thousands of 3D models.
Jacquard Dataset: Jacquard: A Large Scale Dataset for Robotic Grasp Detection, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, [paper]
To provide a better input for computing the oriented 2D box or generating the candidates, the target object's mask should be computed first. Current deep learning-based 2D detection and 2D segmentation methods can assist.
2019:
[IROS] Look Further to Recognize Better: Learning Shared Topics and Category-Specific Dictionaries for Open-Ended 3D Object Recognition, [paper]
[IROS] Recurrent Convolutional Fusion for RGB-D Object Recognition, [paper] [code]
[arXiv] A Review of methods for Textureless Object Recognition, [paper]
[ICCVW] An Annotation Saved is an Annotation Earned: Using Fully Synthetic Training for Object Detection, [paper]
[arXiv] Object Detection in 20 Years: A Survey, [paper]
2018:
[arXiv] YOLOv3: An Incremental Improvement, [paper] [code]
2016:
[CVPR] You only look once: Unified, real-time object detection, [paper] [code]
[TPAMI] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, [paper] [code]
[ECCV] SSD: Single Shot MultiBox Detector, [paper] [code]
2019:
[CASE] Deep Workpiece Region Segmentation for Bin Picking, [paper]
2017:
[ICCV] Mask R-CNN, [paper] [code]
[IROS] SegICP: Integrated Deep Semantic Segmentation and Pose Estimation, [paper]
Grasp Representation: The grasp is represented as a 6DoF pose in 3D, and the gripper can grasp the object from various angles. The input to this task is a 3D point cloud from RGB-D sensors, and the task contains two stages. In the first stage, the target object is extracted from the scene. In the second stage, if a 3D model of the object exists, the 6D pose of the object can be computed. If no 3D model exists, the 6DoF grasp pose is computed by other methods.
The straightforward way is to conduct 2D detection or segmentation and use the point cloud from the corresponding depth area. This part is already covered in Section 1.3. In the following, only 3D detection and 3D instance segmentation are summarized.
These methods can be divided into three kinds: RGB-based methods, point cloud-based methods, and fusion methods that consume both images and point clouds. Most of these works focus on autonomous driving.
Most of these methods estimate depth images from RGB images and then conduct 3D detection.
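To make the 6DoF representation concrete, the sketch below stores a pose as a 4x4 homogeneous transform. It then shows one possible use of an estimated object pose: a grasp defined in the object's model frame is mapped into the camera frame. That the grasp is annotated on the model is an assumption for illustration; the helper names are hypothetical.

```python
import numpy as np


def make_pose(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    pose = np.eye(4)
    pose[:3, :3] = rotation
    pose[:3, 3] = translation
    return pose


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])


def grasp_in_camera(T_cam_obj, T_obj_grasp):
    """Compose the object pose (camera frame) with a grasp given in the object frame."""
    return T_cam_obj @ T_obj_grasp
```

With an object rotated 90 degrees about z and located at (0.1, 0, 0.5), a grasp 0.05 along the object's x-axis lands at (0.1, 0.05, 0.5) in the camera frame.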
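Taking the point cloud from the depth area under a 2D mask can be sketched as below, assuming a pinhole camera model with known intrinsics. The function name and the parameters `fx`, `fy`, `cx`, `cy` are illustrative assumptions, and depth is assumed to be in metres with 0 marking missing values.

```python
import numpy as np


def masked_depth_to_points(depth, mask, fx, fy, cx, cy):
    """Back-project depth pixels inside a boolean mask to camera-frame 3D points.

    depth: (H, W) array, assumed in metres, 0 where the sensor has no reading.
    mask:  (H, W) boolean array from a 2D detector or segmenter.
    Returns an (N, 3) array of [x, y, z] points.
    """
    valid = mask & (depth > 0)
    v, u = np.nonzero(valid)      # pixel rows (v) and columns (u)
    z = depth[v, u]
    x = (u - cx) * z / fx         # pinhole model: u = fx * x / z + cx
    y = (v - cy) * z / fy         # pinhole model: v = fy * y / z + cy
    return np.stack([x, y, z], axis=1)
```

A bounding box from a 2D detector can be used the same way by filling the box region of `mask` with `True`.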
2019:
[IROS] Look Further to Recognize Better: Learning Shared Topics and Category-Specific Dictionaries for Open-Ended 3D Object Recognition, [paper]
[arXiv] Task-Aware Monocular Depth Estimation for 3D Object Detection, [paper]
[CVPR] Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving, [paper] [code]
[AAAI] MonoGRNet: A Geometric Reasoning Network for 3D Object Localization, [paper] [code]
[ICCV] Accurate Monocular 3D Object Detection via Color-Embedded 3D Reconstruction for Autonomous Driving, [paper]
[ICCV] M3D-RPN: Monocular 3D Region Proposal Network for Object Detection, [paper]
[ICCVW] Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud, [paper]
[arXiv] Monocular 3D Object Detection and Box Fitting Trained End-to-End Using Intersection-over-Union Loss, [paper]
[arXiv] Monocular 3D Object Detection via Geometric Reasoning on Keypoints, [paper]
These methods use only the 3D point cloud data.
2019:
[NeurIPSW] Patch Refinement -- Localized 3D Object Detection, [paper]
[CoRL] End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds, [paper]
[ICCV] Deep Hough Voting for 3D Object Detection in Point Clouds, [paper] [code]
[arXiv] Part-A2 Net: 3D Part-Aware and Aggregation Neural Network for Object Detection from Point Cloud, [paper]
[ICCV] STD: Sparse-to-Dense 3D Object Detector for Point Cloud, [paper]
[CVPR] PointPillars: Fast Encoders for Object Detection from Point Clouds, [paper]
[arXiv] StarNet: Targeted Computation for Object Detection in Point Clouds, [paper]
2018:
[CVPR] PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, [paper] [code]
[CVPR] PIXOR: Real-time 3D Object Detection from Point Clouds, [paper] [code]
[ECCVW] Complex-YOLO: Real-time 3D Object Detection on Point Clouds, [paper] [code]
[ECCVW] YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud, [paper]
These methods use both RGB images and depth images or point clouds. There are early fusion methods, late fusion methods, and dense fusion methods.
2019:
[ICCV] Transferable Semi-Supervised 3D Object Detection From RGB-D Data, [paper]
[arXiv] Adaptive and Azimuth-Aware Fusion Network of Multimodal Local Features for 3D Object Detection, [paper]
[arXiv] Frustum VoxNet for 3D object detection from RGB-D or Depth images, [paper]
[IROS] Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection, [paper]
[CVPR] Multi-Task Multi-Sensor Fusion for 3D Object Detection, [paper]
2018:
[CVPR] Frustum PointNets for 3D Object Detection from RGB-D Data, [paper] [code]
[ECCV] Deep Continuous Fusion for Multi-Sensor 3D Object Detection, [paper]
[IROS] Joint 3D Proposal Generation and Object Detection from View Aggregation, [paper] [code]
[CVPR] PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation, [paper]
[ICRA] A General Pipeline for 3D Detection of Vehicles, [paper]
2017:
[CVPR] Multi-View 3D Object Detection Network for Autonomous Driving, [paper] [code]
2019:
[arXiv] Addressing the Sim2Real Gap in Robotic 3D Object Classification, [paper]
[arXiv] Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds, [paper]
[IROS] LDLS: 3-D Object Segmentation Through Label Diffusion From 2-D Images, [paper]
[arXiv] GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud, [paper]
[CoRL] The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation, [paper] [code]
2018:
[arXiv] PointSIFT: A SIFT-like Network Module for 3D Point Cloud Semantic Segmentation, [paper]
Some of these works are taken from awesome-point-cloud-analysis by Yongcheng Liu; thanks to him.
2019:
[ICCV] DensePoint: Learning Densely Contextual Representation for Efficient Point Cloud Processing, [paper] [code]
[TOG] Dynamic Graph CNN for Learning on Point Clouds, [paper] [code]
[ICCV] DeepGCNs: Can GCNs Go as Deep as CNNs?, [paper] [code]
[ICCV] KPConv: Flexible and Deformable Convolution for Point Clouds, [paper] [code]
[MM] SRINet: Learning Strictly Rotation-Invariant Representations for Point Cloud Classification and Segmentation, [paper]
[CVPR] PointConv: Deep Convolutional Networks on 3D Point Clouds, [paper] [code]
[CVPR] PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing, [paper] [code]
[CVPR] Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN, [paper] [code]
[arXiv] SAWNet: A Spatially Aware Deep Neural Network for 3D Point Cloud Processing, [paper]
[arXiv] PyramNet: Point Cloud Pyramid Attention Network and Graph Embedding Module for Classification and Segmentation, [paper]
[ICCV] Interpolated Convolutional Networks for 3D Point Cloud Understanding, [paper]
[arXiv] A survey on Deep Learning Advances on Different 3D Data Representations, [paper]
2018:
[TOG] MCCNN: Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds, [paper] [code]
[NeurIPS] PointCNN: Convolution On X-Transformed Points, [paper] [code]
[CVPR] Mining Point Cloud Local Structures by Kernel Correlation and Graph Pooling, [paper] [code]
[CVPR] SO-Net: Self-Organizing Network for Point Cloud Analysis, [paper] [code]
[CVPR] SPLATNet: Sparse Lattice Networks for Point Cloud Processing, [paper] [code]
[arXiv] Point Convolutional Neural Networks by Extension Operators, [paper]
2017:
[ICCV] Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models, [paper] [code]
[CVPR] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, [paper] [code]
[NeurIPS] PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space, [paper] [code]
[CVPR] SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation, [paper]
These methods can be divided into four kinds: correspondence-based methods, template-based methods, voting-based methods, and regression-based methods.
[ECCVW] A Summary of the 4th International Workshop on Recovering 6D Object Pose, [paper]
2019:
[CVPR] Segmentation-driven 6D Object Pose Estimation, [paper]
2018:
[arXiv] Estimating 6D Pose From Localizing Designated Surface Keypoints, [paper]
2017:
[ICRA] 6-DoF Object Pose from Semantic Keypoints, [paper]
2012:
[3DIMPVT] 3D Object Detection and Localization using Multimodal Point Pair Features, [paper]
2019:
[arXiv] Real-time Background-aware 3D Textureless Object Pose Estimation, [paper]
2012:
[ACCV] Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes, [paper]
2018:
[TPAMI] Robust 3D Object Tracking from Monocular Images Using Stable Parts, [paper]
2014:
[ECCV] Learning 6d object pose estimation using 3d object coordinate, [paper]
[ECCV] Latent-class hough forests for 3d object detection and pose estimation, [paper]
2019:
[CoRL] Scene-level Pose Estimation for Multiple Instances of Densely Packed Objects, [paper]
[IROS] Learning to Estimate Pose and Shape of Hand-Held Objects from RGB Images, [paper]
[IROSW] Motion-Nets: 6D Tracking of Unknown Objects in Unseen Environments using RGB, [paper]
[ICCV] DPOD: 6D Pose Object Detector and Refiner, [paper]
[ICCV] Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation, [paper]
[ICCV] Explaining the Ambiguity of Object Detection and 6D Pose From Visual Data, [paper]
[arXiv] Active 6D Multi-Object Pose Estimation in Cluttered Scenarios with Deep Reinforcement Learning, [paper]
[arXiv] 6-PACK: Category-level 6D Pose Tracker with Anchor-Based Keypoints, [paper] [code]
[arXiv] Accurate 6D Object Pose Estimation by Pose Conditioned Mesh Reconstruction, [paper]
[arXiv] Learning Object Localization and 6D Pose Estimation from Simulation and Weakly Labeled Real Images, [paper]
[ICHR] Refining 6D Object Pose Predictions using Abstract Render-and-Compare, [paper]
[CVPR] Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation, [paper] [code]
[CVPR] DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion, [paper] [code]
[arXiv] Deep-6DPose: Recovering 6D Object Pose from a Single RGB Image, [paper]
2018:
[ECCV] Implicit 3D Orientation Learning for 6D Object Detection From RGB Images, [paper] [code]
[ECCV] DeepIM: Deep Iterative Matching for 6D Pose Estimation, [paper] [code]
[RSS] PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes, [paper] [code]
[IROS] Robust 6D Object Pose Estimation in Cluttered Scenes using Semantic Segmentation and Pose Regression Networks, [paper]
2017:
[ICCV] SSD-6D: Making rgb-based 3d detection and 6d pose estimation great again, [paper] [code]
2019:
[ICCV] CDPN: Coordinates-Based Disentangled Pose Network for Real-Time RGB-Based 6-DoF Object Pose Estimation, [paper]
[CVPR] PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation, [paper] [code]
2018:
[CVPR] Real-time seamless single shot 6d object pose prediction, [paper] [code]
2017:
[ICCV] BB8: a scalable, accurate, robust to partial occlusion method for predicting the 3d poses of challenging objects without using depth, [paper]
Datasets:
HomebrewedDB: RGB-D Dataset for 6D Pose Estimation of 3D Objects, ICCVW, 2019 [paper]
YCB Datasets: The YCB Object and Model Set: Towards Common Benchmarks for Manipulation Research, IEEE International Conference on Advanced Robotics (ICAR), 2015 [paper]
T-LESS Datasets: T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects, IEEE Winter Conference on Applications of Computer Vision (WACV), 2017 [paper]
The partial-view point cloud is aligned to the complete shape to obtain the 6D pose. Generally, coarse registration is conducted first to provide an initial alignment, and then dense registration methods such as ICP (Iterative Closest Point) are applied to obtain the final 6D pose.
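The refinement step can be sketched as a minimal point-to-point ICP in NumPy. This is an illustrative sketch only: the coarse alignment is assumed to be supplied as `R0`, `t0` (identity by default), nearest neighbours are found by brute force, and there is no outlier rejection or convergence check, all of which a practical implementation would handle differently.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t ~ dst[i] (SVD / Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(src, dst, R0=None, t0=None, iters=30):
    """Refine an initial alignment of src (N, 3) onto dst (M, 3)."""
    R = np.eye(3) if R0 is None else np.array(R0, dtype=float)
    t = np.zeros(3) if t0 is None else np.array(t0, dtype=float)
    for _ in range(iters):
        moved = src @ R.T + t
        # Brute-force nearest neighbour in dst for every moved source point.
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = dst[d2.argmin(axis=1)]
        # Re-estimate the full transform from the current correspondences.
        R, t = best_rigid_transform(src, nearest)
    return R, t
```

The brute-force search costs O(N*M) per iteration; a KD-tree is the usual replacement for larger clouds. ICP only converges to the correct pose when the initial alignment is close enough, which is why the coarse registration step comes first.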
2014:
[SGP] Super 4PCS: Fast Global Pointcloud Registration via Smart Indexing, [paper] [code]
2017:
[CVPR] 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions, [paper] [code]
2016:
[arXiv] Lessons from the Amazon Picking Challenge, [paper]
[arXiv] Team Delft's Robot Winner of the Amazon Picking Challenge 2016, [paper]
2019:
[arXiv] DeepICP: An End-to-End Deep Neural Network for 3D Point Cloud Registration, [paper]
[NeurIPS] PRNet: Self-Supervised Learning for Partial-to-Partial Registration, [paper]
[CVPR] PointNetLK: Robust & Efficient Point Cloud Registration using PointNet, [paper] [code]
[ICCV] End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans, [paper]
[arXiv] Iterative Matching Point, [paper]
[arXiv] Deep Closest Point: Learning Representations for Point Cloud Registration, [paper] [code]
[arXiv] PCRNet: Point Cloud Registration Network using PointNet Encoding, [paper] [code]
2017:
[ICRA] Multi-view Self-supervised Deep Learning for 6D Pose Estimation in the Amazon Picking Challenge, [paper] [code]
In this situation, no 3D model exists, and the 6-DoF grasps are estimated from the available partial data. This can be done by estimating directly from the partial-view point cloud, or indirectly after shape completion.
2019:
[IROS] Grasping Unknown Objects Based on Gripper Workspace Spheres, [paper]
[arXiv] Learning to Generate 6-DoF Grasp Poses with Reachability Awareness, [paper]
[CoRL] S4G: Amodal Single-view Single-Shot SE(3) Grasp Detection in Cluttered Scenes, [paper]
[ICCV] 6-DoF GraspNet: Variational Grasp Generation for Object Manipulation, [paper]
[ICRA] PointNetGPD: Detecting Grasp Configurations from Point Sets, [paper] [code]
2017:
[IJRR] Grasp Pose Detection in Point Clouds, [paper] [code]
2019:
[IROS] Detecting Robotic Affordances on Novel Objects with Regional Attention and Attributes, [paper]
[IROS] Learning Grasp Affordance Reasoning through Semantic Relations, [paper]
[arXiv] Automatic pre-grasps generation for unknown 3D objects, [paper]
[IECON] A novel object slicing based grasp planner for 3D object grasping using underactuated robot gripper, [paper]
2018:
[arXiv] Workspace Aware Online Grasp Planning, [paper]
2019:
[arXiv] ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation, [paper]
[arXiv] kPAM-SC: Generalizable Manipulation Planning using KeyPoint Affordance and Shape Completion, [paper] [code]
[arXiv] Data-Efficient Learning for Sim-to-Real Robotic Grasping using Deep Point Cloud Prediction Networks, [paper]
[arXiv] Inferring Occluded Geometry Improves Performance when Retrieving an Object from Dense Clutter, [paper]
[IROS] Robust Grasp Planning Over Uncertain Shape Completions, [paper]
[arXiv] Multi-Modal Geometric Learning for Grasping and Manipulation, [paper]
2018:
[ICRA] Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations, [paper]
[IROS] 3D Shape Perception from Monocular Vision, Touch, and Shape Priors, [paper]
2016:
[IROS] Shape Completion Enabled Robotic Grasping, [paper]
2019:
[CVIU] On the Benefit of Adversarial Training for Monocular Depth Estimation, [paper]
[ICCV] Learning Joint 2D-3D Representations for Depth Completion, [paper]
[ICCV] Deep Optics for Monocular Depth Estimation and 3D Object Detection, [paper]
[arXiv] Deep Classification Network for Monocular Depth Estimation, [paper]
[ICCV] Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints, [paper]
[arXiv] Image-based 3D Object Reconstruction: State-of-the-Art and Trends in the Deep Learning Era, [paper]
[arXiv] Real-time Vision-based Depth Reconstruction with NVidia Jetson, [paper]
[IROS] Self-supervised 3D Shape and Viewpoint Estimation from Single Images for Robotics, [paper]
[arXiv] Mesh R-CNN, [paper]
2018:
[NeurIPS] Learning to Reconstruct Shapes from Unseen Classes, [paper] [code]
[ECCV] Learning Shape Priors for Single-View 3D Completion and Reconstruction, [paper] [code]
[CVPR] Deep Depth Completion of a Single RGB-D Image, [paper] [code]
2019:
[arXiv] KETO: Learning Keypoint Representations for Tool Manipulation, [paper]
[arXiv] Learning Task-Oriented Grasping from Human Activity Datasets, [paper]
2019:
[arXiv] Using Synthetic Data and Deep Networks to Recognize Primitive Shapes for Object Grasping, [paper]
[ICRA] Transferring Grasp Configurations using Active Learning and Local Replanning, [paper]
2017:
[AIP] Fast grasping of unknown objects using principal component analysis, [paper]
2015:
[RAS] Category-based task specific grasping, [paper]
2019:
[arXiv] Non-Rigid Point Set Registration Networks, [paper] [code]
2018:
[RAL] Transferring Category-based Functional Grasping Skills by Latent Space Non-Rigid Registration, [paper]
[RAS] Learning Postural Synergies for Categorical Grasping through Shape Space Registration, [paper]
[RAS] Autonomous Dual-Arm Manipulation of Familiar Objects, [paper]
2019:
[IROS] Multi-step Pick-and-Place Tasks Using Object-centric Dense Correspondences, [code]
[arXiv] Unsupervised cycle-consistent deformation for shape matching, [paper]
[arXiv] ZoomOut: Spectral Upsampling for Efficient Shape Correspondence, [paper]
[C&G] Partial correspondence of 3D shapes using properties of the nearest-neighbor field, [paper]
2019:
[CVPR] PartNet: A Recursive Part Decomposition Network for Fine-grained and Hierarchical Shape Segmentation, [paper] [code]
[C&G] Autoencoder-based part clustering for part-in-whole retrieval of CAD models, [paper]
2016:
[SiggraphAsia] A Scalable Active Framework for Region Annotation in 3D Shape Collections, [paper]
2019:
[arXiv] UniGrasp: Learning a Unified Model to Grasp with N-Fingered Robotic Hands, [paper]
[ScienceRobotics] On the choice of grasp type and location when handing over an object, [paper]
[arXiv] Solving Rubik's Cube with a Robot Hand, [paper]
[IJARS] Fast geometry-based computation of grasping points on three-dimensional point clouds, [paper] [code]
[arXiv] Learning better generative models for dexterous, single-view grasping of novel objects, [paper]
[arXiv] DexPilot: Vision Based Teleoperation of Dexterous Robotic Hand-Arm System, [paper]
[IROS] Optimization Model for Planning Precision Grasps with Multi-Fingered Hands, [paper]
[IROS] Generating Grasp Poses for a High-DOF Gripper Using Neural Networks, [paper]
[arXiv] Deep Dynamics Models for Learning Dexterous Manipulation, [paper]
[CVPR] Learning joint reconstruction of hands and manipulated objects, [paper]
[CVPR] H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions, [paper]
[IROS] Efficient Grasp Planning and Execution with Multi-Fingered Hands by Surface Fitting, [paper]
[arXiv] Efficient Bimanual Manipulation Using Learned Task Schemas, [paper]
[ICRA] High-Fidelity Grasping in Virtual Reality using a Glove-based System, [paper] [code]
2019:
[arXiv] Self-supervised 6D Object Pose Estimation for Robot Manipulation, [paper]
[arXiv] Accept Synthetic Objects as Real: End-to-End Training of Attentive Deep Visuomotor Policies for Manipulation in Clutter, [paper]
[RSSW] Generative grasp synthesis from demonstration using parametric mixtures, [paper]
2018:
[RSS] Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision, [paper]
[CoRL] Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects, [paper]
[arXiv] Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation, [paper]
2017:
[arXiv] Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping, [paper]
2019:
[ICRA] Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks, [paper]
[CVPR] ContactDB: Analyzing and Predicting Grasp Contact via Thermal Imaging, [paper] [code]
2018:
[arXiv] Learning to Grasp without Seeing, [paper]
2019:
[IROS] Robot Learning of Shifting Objects for Grasping in Cluttered Environments, [paper] [code]
[arXiv] Learning Deep Parameterized Skills from Demonstration for Re-targetable Visuomotor Control, [paper]
[arXiv] Adversarial Skill Networks: Unsupervised Robot Skill Learning from Video, [paper]
[IROS] Learning Actions from Human Demonstration Video for Robotic Manipulation, [paper]
[RSSW] Generative grasp synthesis from demonstration using parametric mixtures, [paper]
2018:
[arXiv] Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation, [paper]
2019:
[arXiv] Dynamic Cloth Manipulation with Deep Reinforcement Learning, [paper]
[CoRL] Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning, [paper] [project]
[CoRL] Asynchronous Methods for Model-Based Reinforcement Learning, [paper]
[CoRL] Entity Abstraction in Visual Model-Based Reinforcement Learning, [paper]
[CoRL] Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation, [paper] [project]
[arXiv] Contextual Imagined Goals for Self-Supervised Robotic Learning, [paper]
[arXiv] Learning to Manipulate Deformable Objects without Demonstrations, [paper] [project]
[arXiv] A Deep Learning Approach to Grasping the Invisible, [paper]
[arXiv] Knowledge Induced Deep Q-Network for a Slide-to-Wall Object Grasping, [paper]
[arXiv] Quantile QT-Opt for Risk-Aware Vision-Based Robotic Grasping, [paper]
[arXiv] Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control, [paper]
[arXiv] Reinforcement Learning for Robotic Manipulation using Simulated Locomotion Demonstrations, [paper]
[arXiv] Self-Supervised Sim-to-Real Adaptation for Visual Robotic Manipulation, [paper]
[arXiv] Object Perception and Grasping in Open-Ended Domains, [paper]
[CoRL] ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots, [paper] [code]
[RSS] End-to-End Robotic Reinforcement Learning without Reward Engineering, [paper]
[arXiv] Learning to combine primitive skills: A step towards versatile robotic manipulation, [paper]
[CoRL] A Survey on Reproducibility by Evaluating Deep Reinforcement Learning Algorithms on Real-World Robots, [paper] [code]
[ICCAS] Deep Reinforcement Learning Based Robot Arm Manipulation with Efficient Training Data through Simulation, [paper]
[CVPR] CRAVES: Controlling Robotic Arm with a Vision-based Economic System, [paper] [code]
[Report] A Unified Framework for Manipulating Objects via Reinforcement Learning, [paper]
2018:
[IROS] Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning, [paper] [code]
[CoRL] QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, [paper]
[arXiv] Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods, [paper]
[arXiv] Pick and Place Without Geometric Object Models, [paper]
2017:
[arXiv] Deep Reinforcement Learning for Robotic Manipulation-The state of the art, [paper]
2016:
[IJRR] Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning, [paper]
2013:
[IJRR] Reinforcement learning in robotics: A survey, [paper]
2019:
[ICRA] Learning Driven Coarse-to-Fine Articulated Robot Tracking, [paper]
[CVPR] Craves: controlling robotic arm with a vision-based, economic system, [paper] [code]
2018:
[arXiv] Point-to-Pose Voting based Hand Pose Estimation using Residual Permutation Equivariant Layer, [paper]
2016:
[ICRA] Robot Arm Pose Estimation by Pixel-wise Regression of Joint Angles, [paper]
2014:
[ICRA] Robot Arm Pose Estimation through Pixel-Wise Part Classification, [paper]
Abhinav Gupta(CMU & FAIR): Robotics, machine learning
Andreas ten Pas(Northeastern University): Robotic Grasping, Deep Learning, Simulation-based Planning
Andy Zeng(Princeton University & Google Brain Robotics): 3D Deep Learning, Robotic Grasping
Animesh Garg(University of Toronto): Robotics, Reinforcement Learning
Cewu Lu(SJTU): Machine Vision
Charles Ruizhongtai Qi(Waymo(Google)): 3D Deep Learning
Danfei Xu(Stanford University): Robotics, Computer Vision
Dieter Fox(Nvidia & University of Washington): Robotics, Artificial intelligence, State Estimation
Fei-Fei Li(Stanford University): Computer Vision
Guofeng Zhang(ZJU): 3D Vision, SLAM
Hao Su(UC San Diego): 3D Deep Learning
Jeannette Bohg(Stanford University): perception for autonomous robotic manipulation and grasping
Jianping Shi(SenseTime): Computer Vision
Juxi Leitner(Australian Centre of Excellence for Robotic Vision (ACRV)): Robotic grasping
Lerrel Pinto(UC Berkeley): Robotics
Lorenzo Jamone(Queen Mary University of London (QMUL)): Cognitive Robotics
Lorenzo Natale(Italian Institute of Technology): Humanoid robotic sensing and perception
Kaiming He(Facebook AI Research (FAIR)): Deep Learning
Kai Xu(NUDT): Graphics, Geometry
Ken Goldberg(UC Berkeley): Robotics
Marc Pollefeys(Microsoft & ETH): Computer Vision
Markus Vincze(Technical University Wien (TUW)): Robotic Vision
Oliver Brock(TU Berlin): Robotic manipulation
Pascal Fua(EPFL): Computer Vision
Peter K. Allen(Columbia University): Robotic Grasping, 3-D vision, Modeling, Medical robotics
Peter Corke(Queensland University of Technology): Robotic vision
Pieter Abbeel(UC Berkeley): Artificial Intelligence, Advanced Robotics
Raquel Urtasun(Uber ATG & University of Toronto): AI for self-driving cars, Computer Vision, Robotics
Robert Platt(Northeastern University): Robotic manipulation
Ruigang Yang(Baidu): Computer Vision, Robotics
Sergey Levine(UC Berkeley): Reinforcement Learning
Shuran Song(Columbia University): 3D Deep Learning, Robotics
Silvio Savarese(Stanford University): Computer Vision
Song-Chun Zhu(UCLA): Computer Vision
Tamim Asfour(Karlsruhe Institute of Technology (KIT)): Humanoid Robotics
Thomas Funkhouser(Princeton University): Geometry, Graphics, Shape
Valerio Ortenzi(University of Birmingham): Robotic vision
Vincent Lepetit(University of Bordeaux): Machine Learning, 3D Vision
Xiaogang Wang(Chinese University of Hong Kong): Deep Learning, Computer Vision
Xiaozhi Chen(DJI): Deep learning
Yan Xinchen(Uber ATG): Deep Representation Learning, Generative Modeling
Yu Xiang(Nvidia): Robotics, Computer Vision
Yue Wang(MIT): 3D Deep Learning