| Video-to-Video Synthesis | NIPS | code | 4749 |
| Deep Image Prior | CVPR | code | 3451 |
| StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation | CVPR | code | 3104 |
| Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network | ECCV | code | 2109 |
| Learning to See in the Dark | CVPR | code | 2033 |
| Glow: Generative Flow with Invertible 1x1 Convolutions | NIPS | code | 1862 |
| Squeeze-and-Excitation Networks | CVPR | code | 1263 |
| Efficient Neural Architecture Search via Parameters Sharing | ICML | code | 1189 |
| Multimodal Unsupervised Image-to-image Translation | ECCV | code | 1183 |
| Non-Local Neural Networks | CVPR | code | 859 |
| Image Generation From Scene Graphs | CVPR | code | 772 |
| Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? | CVPR | code | 690 |
| Single-Shot Refinement Neural Network for Object Detection | CVPR | code | 668 |
| GANimation: Anatomically-aware Facial Animation from a Single Image | ECCV | code | 628 |
| Detect-and-Track: Efficient Pose Estimation in Videos | CVPR | code | 549 |
| Relation Networks for Object Detection | CVPR | code | 532 |
| PointCNN | NIPS | code | 506 |
| Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | ICML | code | 491 |
| Simple Baselines for Human Pose Estimation and Tracking | ECCV | code | 488 |
| Taskonomy: Disentangling Task Transfer Learning | CVPR | code | 453 |
| Which Training Methods for GANs do actually Converge? | ICML | code | 453 |
| Cascaded Pyramid Network for Multi-Person Pose Estimation | CVPR | code | 447 |
| Pelee: A Real-Time Object Detection System on Mobile Devices | NIPS | code | 441 |
| Generative Image Inpainting With Contextual Attention | CVPR | code | 441 |
| Neural 3D Mesh Renderer | CVPR | code | 436 |
| Look at Boundary: A Boundary-Aware Face Alignment Algorithm | CVPR | code | 416 |
| Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs | CVPR | code | 412 |
| End-to-End Recovery of Human Shape and Pose | CVPR | code | 388 |
| In-Place Activated BatchNorm for Memory-Optimized Training of DNNs | CVPR | code | 388 |
| ICNet for Real-Time Semantic Segmentation on High-Resolution Images | ECCV | code | 372 |
| The Unreasonable Effectiveness of Deep Features as a Perceptual Metric | CVPR | code | 360 |
| Distractor-aware Siamese Networks for Visual Object Tracking | ECCV | code | 350 |
| Frustum PointNets for 3D Object Detection From RGB-D Data | CVPR | code | 346 |
| Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++ | CVPR | code | 339 |
| Gibson Env: Real-World Perception for Embodied Agents | CVPR | code | 332 |
| Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning | CVPR | code | 309 |
| Soccer on Your Tabletop | CVPR | code | 308 |
| Noise2Noise: Learning Image Restoration without Clean Data | ICML | code | 304 |
| GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose | CVPR | code | 301 |
| GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation | CVPR | code | 301 |
| Neural Baby Talk | CVPR | code | 292 |
| Acquisition of Localization Confidence for Accurate Object Detection | ECCV | code | 285 |
| The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks | CVPR | code | 283 |
| PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume | CVPR | code | 283 |
| Fast End-to-End Trainable Guided Filter | CVPR | code | 274 |
| Adversarially Regularized Autoencoders | ICML | code | 261 |
| License Plate Detection and Recognition in Unconstrained Scenarios | ECCV | code | 258 |
| Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors | CVPR | code | 257 |
| Supervising Unsupervised Learning | NIPS | code | 255 |
| Pyramid Stereo Matching Network | CVPR | code | 250 |
| Convolutional Neural Networks With Alternately Updated Clique | CVPR | code | 250 |
| Deep Photo Enhancer: Unpaired Learning for Image Enhancement From Photographs With GANs | CVPR | code | 241 |
| Neural Relational Inference for Interacting Systems | ICML | code | 240 |
| Learning to Adapt Structured Output Space for Semantic Segmentation | CVPR | code | 239 |
| An intriguing failing of convolutional neural networks and the CoordConv solution | NIPS | code | 230 |
| Learning to Segment Every Thing | CVPR | code | 227 |
| LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation | CVPR | code | 223 |
| End-to-End Learning of Motion Representation for Video Understanding | CVPR | code | 222 |
| Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images | ECCV | code | 219 |
| Bilinear Attention Networks | NIPS | code | 216 |
| Iterative Visual Reasoning Beyond Convolutions | CVPR | code | 213 |
| Semi-Parametric Image Synthesis | CVPR | code | 213 |
| A Style-Aware Content Loss for Real-time HD Style Transfer | ECCV | code | 201 |
| Style Aggregated Network for Facial Landmark Detection | CVPR | code | 192 |
| Pose-Robust Face Recognition via Deep Residual Equivariant Mapping | CVPR | code | 189 |
| GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models | ICML | code | 186 |
| Referring Relationships | CVPR | code | 185 |
| MoCoGAN: Decomposing Motion and Content for Video Generation | CVPR | code | 184 |
| Compressed Video Action Recognition | CVPR | code | 180 |
| LayoutNet: Reconstructing the 3D Room Layout From a Single RGB Image | CVPR | code | 178 |
| ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation | ECCV | code | 176 |
| Latent Alignment and Variational Attention | NIPS | code | 172 |
| Multi-Content GAN for Few-Shot Font Style Transfer | CVPR | code | 170 |
| SPLATNet: Sparse Lattice Networks for Point Cloud Processing | CVPR | code | 166 |
| Attentive Generative Adversarial Network for Raindrop Removal From a Single Image | CVPR | code | 158 |
| Single View Stereo Matching | CVPR | code | 158 |
| Unsupervised Feature Learning via Non-Parametric Instance Discrimination | CVPR | code | 156 |
| An End-to-End TextSpotter With Explicit Alignment and Attention | CVPR | code | 156 |
| Social GAN: Socially Acceptable Trajectories With Generative Adversarial Networks | CVPR | code | 154 |
| ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing | CVPR | code | 153 |
| Evolved Policy Gradients | NIPS | code | 151 |
| Optimizing Video Object Detection via a Scale-Time Lattice | CVPR | code | 150 |
| Large-Scale Point Cloud Semantic Segmentation With Superpoint Graphs | CVPR | code | 150 |
| Learning Category-Specific Mesh Reconstruction from Image Collections | ECCV | code | 146 |
| Group Normalization | ECCV | code | 145 |
| DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks | CVPR | code | 142 |
| MegaDepth: Learning Single-View Depth Prediction From Internet Photos | CVPR | code | 142 |
| ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | CVPR | code | 142 |
| Deep Clustering for Unsupervised Learning of Visual Features | ECCV | code | 139 |
| BSN: Boundary Sensitive Network for Temporal Action Proposal Generation | ECCV | code | 139 |
| Learning a Single Convolutional Super-Resolution Network for Multiple Degradations | CVPR | code | 139 |
| Facelet-Bank for Fast Portrait Manipulation | CVPR | code | 138 |
| Image Super-Resolution Using Very Deep Residual Channel Attention Networks | ECCV | code | 137 |
| ECO: Efficient Convolutional Network for Online Video Understanding | ECCV | code | 137 |
| PlaneNet: Piece-Wise Planar Reconstruction From a Single RGB Image | CVPR | code | 137 |
| Self-Imitation Learning | ICML | code | 136 |
| Residual Dense Network for Image Super-Resolution | CVPR | code | 134 |
| Embodied Question Answering | CVPR | code | 132 |
| Unsupervised Cross-Dataset Person Re-Identification by Transfer Learning of Spatial-Temporal Patterns | CVPR | code | 131 |
| Two-Stream Convolutional Networks for Dynamic Texture Synthesis | CVPR | code | 131 |
| Densely Connected Pyramid Dehazing Network | CVPR | code | 130 |
| Camera Style Adaptation for Person Re-Identification | CVPR | code | 128 |
| Neural Motifs: Scene Graph Parsing With Global Context | CVPR | code | 127 |
| Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer | CVPR | code | 125 |
| Relational recurrent neural networks | NIPS | code | 124 |
| LSTM Pose Machines | CVPR | code | 124 |
| SO-Net: Self-Organizing Network for Point Cloud Analysis | CVPR | code | 123 |
| Image-Image Domain Adaptation With Preserved Self-Similarity and Domain-Dissimilarity for Person Re-Identification | CVPR | code | 121 |
| Context Embedding Networks | CVPR | code | 120 |
| Fast and Accurate Online Video Object Segmentation via Tracking Parts | CVPR | code | 119 |
| Cross-Domain Weakly-Supervised Object Detection Through Progressive Domain Adaptation | CVPR | code | 119 |
| Learning to Compare: Relation Network for Few-Shot Learning | CVPR | code | 118 |
| Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining | ECCV | code | 116 |
| Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | CVPR | code | 116 |
| MVSNet: Depth Inference for Unstructured Multi-view Stereo | ECCV | code | 116 |
| Weakly Supervised Instance Segmentation Using Class Peak Response | CVPR | code | 116 |
| L4: Practical loss-based stepsize adaptation for deep learning | NIPS | code | 116 |
| A Closer Look at Spatiotemporal Convolutions for Action Recognition | CVPR | code | 115 |
| Unsupervised Learning of Monocular Depth Estimation and Visual Odometry With Deep Feature Reconstruction | CVPR | code | 114 |
| Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling | CVPR | code | 114 |
| MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network | ECCV | code | 113 |
| Gated Path Planning Networks | ICML | code | 113 |
| PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning | CVPR | code | 110 |
| Decoupled Networks | CVPR | code | 109 |
| Video Based Reconstruction of 3D People Models | CVPR | code | 109 |
| CosFace: Large Margin Cosine Loss for Deep Face Recognition | CVPR | code | 109 |
| DeepMVS: Learning Multi-View Stereopsis | CVPR | code | 108 |
| Hierarchical Imitation and Reinforcement Learning | ICML | code | 107 |
| Real-Time Seamless Single Shot 6D Object Pose Prediction | CVPR | code | 107 |
| Adaptive Affinity Fields for Semantic Segmentation | ECCV | code | 107 |
| Long-term Tracking in the Wild: a Benchmark | ECCV | code | 106 |
| Realistic Evaluation of Deep Semi-Supervised Learning Algorithms | NIPS | code | 106 |
| Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics | CVPR | code | 104 |
| Deep Back-Projection Networks for Super-Resolution | CVPR | code | 104 |
| 3D-CODED: 3D Correspondences by Deep Deformation | ECCV | code | 102 |
| Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform | CVPR | code | 102 |
| Scale-Recurrent Network for Deep Image Deblurring | CVPR | code | 101 |
| PU-Net: Point Cloud Upsampling Network | CVPR | code | 101 |
| Noisy Natural Gradient as Variational Inference | ICML | code | 100 |
| Domain Adaptive Faster R-CNN for Object Detection in the Wild | CVPR | code | 99 |
| Rethinking Feature Distribution for Loss Functions in Image Classification | CVPR | code | 97 |
| DenseASPP for Semantic Segmentation in Street Scenes | CVPR | code | 97 |
| Quantized Densely Connected U-Nets for Efficient Landmark Localization | ECCV | code | 97 |
| Graph R-CNN for Scene Graph Generation | ECCV | code | 96 |
| Factoring Shape, Pose, and Layout From the 2D Image of a 3D Scene | CVPR | code | 94 |
| Density-Aware Single Image De-Raining Using a Multi-Stream Dense Network | CVPR | code | 93 |
| Deep Depth Completion of a Single RGB-D Image | CVPR | code | 93 |
| MAttNet: Modular Attention Network for Referring Expression Comprehension | CVPR | code | 92 |
| Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | ICML | code | 91 |
| ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes | ECCV | code | 89 |
| Neural Arithmetic Logic Units | NIPS | code | 87 |
| Perturbative Neural Networks | CVPR | code | 86 |
| Knowledge Aided Consistency for Weakly Supervised Phrase Grounding | CVPR | code | 86 |
| Repulsion Loss: Detecting Pedestrians in a Crowd | CVPR | code | 86 |
| End-to-End Weakly-Supervised Semantic Alignment | CVPR | code | 86 |
| Learning Blind Video Temporal Consistency | ECCV | code | 84 |
| PSANet: Point-wise Spatial Attention Network for Scene Parsing | ECCV | code | 84 |
| Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights | ECCV | code | 83 |
| Nonlinear 3D Face Morphable Model | CVPR | code | 81 |
| Deep Mutual Learning | CVPR | code | 80 |
| Image Inpainting for Irregular Holes Using Partial Convolutions | ECCV | code | 79 |
| BodyNet: Volumetric Inference of 3D Human Body Shapes | ECCV | code | 78 |
| Integral Human Pose Regression | ECCV | code | 77 |
| FSRNet: End-to-End Learning Face Super-Resolution With Facial Priors | CVPR | code | 77 |
| Attention-based Deep Multiple Instance Learning | ICML | code | 77 |
| LiDAR-Video Driving Dataset: Learning Driving Policies Effectively | CVPR | code | 77 |
| Multi-View Consistency as Supervisory Signal for Learning Shape and Pose Prediction | CVPR | code | 76 |
| Macro-Micro Adversarial Network for Human Parsing | ECCV | code | 76 |
| Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence | ECCV | code | 75 |
| LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks | ECCV | code | 75 |
| Neural Kinematic Networks for Unsupervised Motion Retargetting | CVPR | code | 75 |
| Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking | CVPR | code | 75 |
| Synthesizing Images of Humans in Unseen Poses | CVPR | code | 74 |
| A PID Controller Approach for Stochastic Optimization of Deep Networks | CVPR | code | 74 |
| Tell Me Where to Look: Guided Attention Inference Network | CVPR | code | 74 |
| Multi-Scale Location-Aware Kernel Representation for Object Detection | CVPR | code | 73 |
| Recurrent Relational Networks | NIPS | code | 73 |
| VITON: An Image-Based Virtual Try-On Network | CVPR | code | 73 |
| VITAL: VIsual Tracking via Adversarial Learning | CVPR | code | 73 |
| Future Frame Prediction for Anomaly Detection – A New Baseline | CVPR | code | 72 |
| Recurrent Pixel Embedding for Instance Grouping | CVPR | code | 71 |
| Learning Human-Object Interactions by Graph Parsing Neural Networks | ECCV | code | 69 |
| Repeatability Is Not Enough: Learning Affine Regions via Discriminability | ECCV | code | 67 |
| Visual Feature Attribution Using Wasserstein GANs | CVPR | code | 67 |
| Avatar-Net: Multi-Scale Zero-Shot Style Transfer by Feature Decoration | CVPR | code | 66 |
| Learning SO(3) Equivariant Representations with Spherical CNNs | ECCV | code | 64 |
| Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation | ECCV | code | 64 |
| SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation | CVPR | code | 64 |
| ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans | CVPR | code | 64 |
| One-Shot Unsupervised Cross Domain Translation | NIPS | code | 62 |
| Pairwise Confusion for Fine-Grained Visual Classification | ECCV | code | 62 |
| Multi-Shot Pedestrian Re-Identification via Sequential Decision Making | CVPR | code | 62 |
| Generalizing A Person Retrieval Model Hetero- and Homogeneously | ECCV | code | 61 |
| Learning Depth From Monocular Videos Using Direct Methods | CVPR | code | 61 |
| Optimizing the Latent Space of Generative Networks | ICML | code | 60 |
| CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes | CVPR | code | 59 |
| “Zero-Shot” Super-Resolution Using Deep Internal Learning | CVPR | code | 59 |
| Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking | CVPR | code | 59 |
| PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition | CVPR | code | 58 |
| Progressive Neural Architecture Search | ECCV | code | 58 |
| Generative Neural Machine Translation | NIPS | code | 58 |
| Learning to Reweight Examples for Robust Deep Learning | ICML | code | 58 |
| Object Level Visual Reasoning in Videos | ECCV | code | 57 |
| Generate to Adapt: Aligning Domains Using Generative Adversarial Networks | CVPR | code | 57 |
| Improving Generalization via Scalable Neighborhood Component Analysis | ECCV | code | 57 |
| Geometry-Aware Learning of Maps for Camera Localization | CVPR | code | 57 |
| Path-Level Network Transformation for Efficient Architecture Search | ICML | code | 57 |
| Decorrelated Batch Normalization | CVPR | code | 57 |
| Ordinal Depth Supervision for 3D Human Pose Estimation | CVPR | code | 57 |
| Disentangled Person Image Generation | CVPR | code | 57 |
| Regularizing RNNs for Caption Generation by Reconstructing the Past With the Present | CVPR | code | 57 |
| Diverse Image-to-Image Translation via Disentangled Representations | ECCV | code | 56 |
| Pointwise Convolutional Neural Networks | CVPR | code | 56 |
| Neural Program Synthesis from Diverse Demonstration Videos | ICML | code | 56 |
| Learning Less Is More - 6D Camera Localization via 3D Surface Regression | CVPR | code | 55 |
| Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency | ECCV | code | 55 |
| Learning Latent Super-Events to Detect Multiple Activities in Videos | CVPR | code | 55 |
| Depth-aware CNN for RGB-D Segmentation | ECCV | code | 55 |
| Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning | CVPR | code | 54 |
| Unsupervised Discovery of Object Landmarks as Structural Representations | CVPR | code | 54 |
| [ |
|
|
|