2017_CVPR_Papers

Good deep-learning papers from the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Theory, Experiments & Research

✅ [Feedback Networks]

✅ [Comparative Evaluation of Hand-Crafted and Learned Local Features]

✅ [Understanding deep learning requires rethinking generalization]

Model Compression & Acceleration

✅ [Local Binary Convolutional Neural Networks]

✅ [Deep Roots: Improving CNN Efficiency With Hierarchical Filter Groups]

Visual Semantics & Image Understanding

✅ [Graph-Structured Representations for Visual Question Answering]

✅ [Unsupervised Video Summarization With Adversarial LSTM Networks]

✅ [A Hierarchical Approach for Generating Descriptive Image Paragraphs]

✅ [Efficient Multiple Instance Metric Learning Using Weakly Supervised Data]

✅ [Neural Scene De-rendering]

✅ [Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection]

✅ [Attend to You: Personalized Image Captioning with Context Sequence Memory Networks]

✅ [Modeling Relationships in Referential Expressions with Compositional Modular Networks]

✅ [The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions]

✅ [ViP-CNN: Visual Phrase Guided Convolutional Neural Network]

✅ [SCC: Semantic Context Cascade for Efficient Action Detection]

✅ [Hierarchical Boundary-Aware Neural Encoder for Video Captioning]

✅ [Emotion Recognition in Context]

✅ [Automatic Understanding of Image and Video Advertisements]

✅ [Person Search with Natural Language Description]

✅ [Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos]

✅ [Dense Captioning With Joint Inference and Visual Context]

✅ [Instance-Aware Image and Sentence Matching With Selective Multimodal LSTM]

3D Reconstruction

✅ [Face Normals "In-The-Wild" Using Fully Convolutional Networks]

✅ [3D Face Morphable Models "In-The-Wild"]

✅ [Generating Holistic 3D Scene Abstractions for Text-Based Image Retrieval]

✅ [Unsupervised Monocular Depth Estimation With Left-Right Consistency]

✅ [Exploiting 2D Floorplan for Building-Scale Panorama RGBD Alignment]

✅ [A Point Set Generation Network for 3D Object Reconstruction From a Single Image]

✅ [Recurrent 3D Pose Sequence Machines]

✅ [Learning Detailed Face Reconstruction From a Single Image]

✅ [NID-SLAM: Robust Monocular SLAM using Normalised Information Distance]

✅ [Synthesizing 3D Shapes via Modeling Multi-View Depth Maps and Silhouettes With Deep Generative Networks]

✅ [End-To-End Training of Hybrid CNN-CRF Models for Stereo]

Virtual Reality

✅ [Position Tracking for Virtual Reality Using Commodity WiFi]

Weakly Supervised Deep Learning

✅ [Learning by Association -- A Versatile Semi-Supervised Training Method for Neural Networks]

✅ [Weakly Supervised Cascaded Convolutional Networks]

✅ [WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation]

✅ [Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling]

✅ [Simple Does It: Weakly Supervised Instance and Semantic Segmentation]

✅ [Few-Shot Object Recognition from Machine-Labeled Web Images]

✅ [A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning]

✅ [Deep Self-Taught Learning for Weakly Supervised Object Localization]

✅ [From Zero-Shot Learning to Conventional Supervised Classification: Unseen Visual Data Synthesis]

✅ [Unsupervised Learning of Depth and Ego-Motion From Video]

✅ [Attend in Groups: A Weakly-Supervised Deep Learning Framework for Learning From Web Data]

✅ [Weakly Supervised Dense Video Captioning]

✅ [Learning a Deep Embedding Model for Zero-Shot Learning]

✅ [Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos]

✅ [Unsupervised Learning of Long-Term Motion Dynamics for Videos]

Training Improvement Techniques

✅ [Learning From Synthetic Humans]

✅ [Learning From Noisy Large-Scale Datasets With Minimal Supervision]

✅ [Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach]

Image Saliency & Attention Mechanisms

✅ [Learning to Detect Salient Objects With Image-Level Supervision]

✅ [Dual Attention Networks for Multimodal Reasoning and Matching]

✅ [Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning]

✅ [Supervising Neural Attention Models for Video Captioning by Human Gaze Data]

✅ [Deep Level Sets for Salient Object Detection]

Segmentation, Detection & Recognition, Tracking & Prediction

✅ [Temporal Convolutional Networks for Action Segmentation and Detection]

✅ [One-Shot Video Object Segmentation]

✅ [Polyhedral Conic Classifiers for Visual Object Detection and Classification]

✅ [Mining Object Parts From CNNs via Active Question-Answering]

✅ [Learning Deep Context-aware Features over Body and Latent Parts for Person Re-identification]

✅ [Beyond triplet loss: a deep quadruplet network for person re-identification]

✅ [Surveillance Video Parsing with Single Frame Supervision]

✅ [Semantically Coherent Co-Segmentation and Reconstruction of Dynamic Scenes]

✅ [Pixelwise Instance Segmentation With a Dynamically Instantiated Network]

✅ [Video Propagation Networks]

✅ [Modeling Temporal Dynamics and Spatial Configurations of Actions Using Two-Stream Recurrent Neural Networks]

✅ [Self-Learning Scene-Specific Pedestrian Detectors Using a Progressive Latent Model]

✅ [IRINA: Iris Recognition (Even) in Inaccurately Segmented Data]

✅ [Forecasting Human Dynamics from Static Images]

✅ [Scene Flow to Action Map: A New Representation for RGB-D Based Action Recognition With Convolutional Neural Networks]

✅ [WILDCAT: Weakly Supervised Learning of Deep ConvNets for Image Classification, Pointwise Localization and Segmentation]

✅ [PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation]

✅ [Real-Time 3D Model Tracking in Color and Depth on a Single CPU Core]

✅ [Object Detection in Videos With Tubelet Proposal Networks]

✅ [Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling]

✅ [Forecasting Interactive Dynamics of Pedestrians with Fictitious Play]

✅ [Convolutional Random Walk Networks for Semantic Image Segmentation]

✅ [Simple Does It: Weakly Supervised Instance and Semantic Segmentation]

✅ [Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing]

✅ [Finding Tiny Faces]

✅ [Visual-Inertial-Semantic Scene Representation for 3D Object Detection]

✅ [Predictive-Corrective Networks for Action Detection]

✅ [FastMask: Segment Multi-Scale Object Candidates in One Shot]

✅ [ActionVLAD: Learning spatio-temporal aggregation for action classification]

✅ [Interpretable Structure-Evolving LSTM]

✅ [Budget-Aware Deep Semantic Video Segmentation]

✅ [Spindle Net: Person Re-Identification With Human Body Region Guided Feature Decomposition and Fusion]

✅ [Hand Keypoint Detection in Single Images using Multiview Bootstrapping]

✅ [Few-Shot Object Recognition from Machine-Labeled Web Images]

✅ [Perceptual Generative Adversarial Networks for Small Object Detection]

✅ [Weakly Supervised Actor-Action Segmentation via Robust Multi-Task Ranking]

✅ [Sequential Person Recognition in Photo Albums With a Recurrent Network]

✅ [Person Re-Identification in the Wild]

✅ [Deep Self-Taught Learning for Weakly Supervised Object Localization]

✅ [Semantic Amodal Segmentation]

✅ [Deep Sequential Context Networks for Action Prediction]

✅ [Predicting Behaviors of Basketball Players From First Person Videos]

✅ [Spatiotemporal Pyramid Network for Video Action Recognition]

✅ [Object Region Mining With Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach]

✅ [MIML-FCN+: Multi-Instance Multi-Label Learning via Fully Convolutional Networks With Privileged Information]

✅ [Global Context-Aware Attention LSTM Networks for 3D Action Recognition]

✅ [Semantic Scene Completion from a Single Depth Image]

✅ [Multi-Context Attention for Human Pose Estimation]

✅ [Action Unit Detection with Region Adaptation, Multi-labeling Learning and Optimal Temporal Fusing]

✅ [RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation]

✅ [Deep Matching Prior Network: Toward Tighter Multi-Oriented Text Detection]

✅ [Deep MANTA: A Coarse-to-fine Many-Task Network for joint 2D and 3D vehicle analysis from monocular image]

✅ [Unsupervised Learning of Long-Term Motion Dynamics for Videos]

✅ [SyncSpecCNN: Synchronized Spectral CNN for 3D Shape Segmentation]

✅ [Fully Convolutional Instance-Aware Semantic Segmentation]

Adversarial Neural Networks

✅ [DeLiGAN : Generative Adversarial Networks for Diverse and Limited Data]

✅ [Crossing Nets: Combining GANs and VAEs With a Shared Latent Space for Hand Pose Estimation]

✅ [Generating the Future with Adversarial Transformers]

✅ [Image-to-Image Translation with Conditional Adversarial Networks]

✅ [Perceptual Generative Adversarial Networks for Small Object Detection]

✅ [Disentangled Representation Learning GAN for Pose-Invariant Face Recognition]

✅ [3D Convolutional Neural Networks for Efficient and Robust Hand Pose Estimation from Single Depth Images]

✅ [Learning From Simulated and Unsupervised Images Through Adversarial Training]

✅ [Expecting the Unexpected: Training Detectors for Unusual Pedestrians With Adversarial Imposters]

Deep Reinforcement Learning

✅ [Deep Reinforcement Learning-Based Image Captioning With Embedding Reward]

✅ [Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection]

✅ [Collaborative Deep Reinforcement Learning for Joint Object Search]

Transfer Learning

✅ [Borrowing Treasures From the Wealthy: Deep Transfer Learning Through Selective Joint Fine-Tuning]

✅ [Learning a Deep Embedding Model for Zero-Shot Learning]

Datasets

✅ [Visual Dialog]

✅ [Scene Parsing Through ADE20K Dataset]

✅ [Analyzing Computer Vision Data - The Good, the Bad and the Ugly]

Neural Network Architectures

✅ [Multi-Way Multi-Level Kernel Modeling for Neuroimaging Classification]

✅ [Dilated Residual Networks]

✅ [Oriented Response Networks]

✅ [PolyNet: A Pursuit of Structural Diversity in Very Deep Networks]

✅ [Spatially Adaptive Computation Time for Residual Networks]

✅ [Xception: Deep Learning With Depthwise Separable Convolutions]

✅ [Aggregated Residual Transformations for Deep Neural Networks]

✅ [Loss Max-Pooling for Semantic Image Segmentation]

✅ [Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation]

✅ [Instance-Aware Image and Sentence Matching With Selective Multimodal LSTM]

✅ [Deep Temporal Linear Encoding Networks]

✅ [Deep Feature Flow for Video Recognition]

Image Translation

✅ [Turning an Urban Scene Video Into a Cinemagraph]

✅ [Real-Time Neural Style Transfer for Videos]

✅ [Predicting Ground-Level Scene Layout from Aerial Imagery]

✅ [Image-to-Image Translation with Conditional Adversarial Networks]

✅ [Deep Joint Rain Detection and Removal from a Single Image]

✅ [From Red Wine to Red Tomato: Composition with Context]

✅ [StyleBank: An Explicit Representation for Neural Image Style Transfer]

✅ [Deep View Morphing]

Image Correction

✅ [CLKN: Cascaded Lucas-Kanade Networks for Image Alignment]

✅ [Unrolling the Shutter: CNN to Correct Motion Distortions]

Machine Learning

✅ [Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition and Detection]

✅ [Efficient Multiple Instance Metric Learning Using Weakly Supervised Data]

✅ [Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction]

✅ [Unified Embedding and Metric Learning for Zero-Exemplar Event Detection]

✅ [Joint Discriminative Bayesian Dictionary and Classifier Learning]

✅ [Superpixel-based Tracking-by-Segmentation using Markov Chains]

✅ [iCaRL: Incremental Classifier and Representation Learning]

✅ [ShapeOdds: Variational Bayesian Learning of Generative Shape Models]

✅ [Outlier-Robust Tensor PCA]

Super-Resolution & Deblurring

✅ [Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution]

✅ [Attention-Aware Face Hallucination via Deep Reinforcement Learning]

✅ [Deep Video Deblurring for Hand-held Cameras]

✅ [DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks]

✅ [From Motion Blur to Motion Flow: A Deep Learning Solution for Removing Heterogeneous Motion Blur]

Image Quality Assessment

✅ [Deep Learning of Human Visual Sensitivity in Image Quality Assessment Framework]

Video Interpolation

✅ [Video Frame Interpolation via Adaptive Convolution]

Autonomous Driving

✅ [Multi-View 3D Object Detection Network for Autonomous Driving]

✅ [End-To-End Learning of Driving Models From Large-Scale Video Datasets]

Image Similarity

✅ [Conditional Similarity Networks]

✅ [Memory-Augmented Attribute Manipulation Networks for Interactive Fashion Search]

2D & 3D Matching

✅ [3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions]

✅ [Quad-networks: unsupervised learning to rank for interest point detection]

Defocus Estimation

✅ [A Unified Approach of Multi-Scale Deep and Hand-Crafted Features for Defocus Estimation]

Other

✅ [SRN: Side-output Residual Network for Object Symmetry Detection in the Wild]

✅ [Learning Deep Binary Descriptor With Multi-Quantization]

✅ [Learning Non-Lambertian Object Intrinsics Across ShapeNet Categories]

✅ [Efficient Diffusion on Region Manifolds: Recovering Small Objects With Compact CNN Representations]

✅ [Learned Contextual Feature Reweighting for Image Geo-Localization]

✅ [ER3: A Unified Framework for Event Retrieval, Recognition and Recounting]

Domain-Specific (Non-Machine-Learning)

Color Constancy

✅ [Fast Fourier Color Constancy]

Automatic Camera Calibration

✅ [A Practical Method for Fully Automatic Intrinsic Camera Calibration Using Directionally Encoded Light]

Illuminant Power Distribution Estimation

✅ [Designing illuminant spectral power distributions for surface classification]

Accurate Optical Flow

✅ [Accurate Optical Flow via Direct Cost Volume Processing]

Sketch Matching

✅ [Asymmetric Feature Maps With Application to Sketch Based Retrieval]

3D Modeling

✅ [KillingFusion: Non-rigid 3D Reconstruction without Correspondences]

✅ [Are Large-Scale 3D Models Really Necessary for Accurate Visual Localization?]

Visible Watermarks

✅ [On the Effectiveness of Visible Watermarks]

VCL-WHU/2017_CVPR_Papers