ICML-2019

A summary of research work presented at the Thirty-sixth International Conference on Machine Learning (ICML), Long Beach, 2019

Tutorials

Never Ending Learning [Web]
Active learning from theory to practice [Web]
Active Hypothesis Testing: An Information Theoretic (re)View [Web]
Recent Advances in Population-Based Search for Deep Neural Networks: Quality Diversity, Indirect Encodings, and Open-Ended Algorithms [Web]
Meta-Learning: from Few-Shot Learning to Rapid Reinforcement Learning [Web]
A Tutorial on Attention in Deep Learning [Web]
Causal Inference and Stable Learning [Web]
A Primer on PAC-Bayesian Learning [Web]
Neural Approaches to Conversational AI [Web]

Multi Track Research

Day 1

Opening Remarks [Web]
Machine learning for robots to think fast [Web]
Best Paper [Web]
Adversarial Attacks on Node Embeddings via Graph Poisoning [Web] [Slides]
SelectiveNet: A Deep Neural Network with an Integrated Reject Option [Web]
ELF OpenGo: an analysis and open reimplementation of AlphaZero [Web]
A Contrastive Divergence for Combining Variational Inference and MCMC [Web] [Slides]
Regret Circuits: Composability of Regret Minimizers [Web] [Slides]
Refined Complexity of PCA with Outliers [Web]
PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization [Web] [Slides]
Validating Causal Inference Models via Influence Functions [Web]
Data Shapley: Equitable Valuation of Data for Machine Learning [Web]
First-Order Adversarial Vulnerability of Neural Networks and Input Dimension [Web] [Slides]
Manifold Mixup: Better Representations by Interpolating Hidden States [Web] [Slides]
Making Deep Q-learning methods robust to time discretization [Web] [Slides]
Calibrated Approximate Bayesian Inference [Web] [Slides]
Game Theoretic Optimization via Gradient-based Nikaido-Isoda Function [Web] [Slides]
On Efficient Optimal Transport: An Analysis of Greedy and Accelerated Mirror Descent Algorithms [Web] [Slides]
Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization [Web] [Slides]
Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks [Web] [Slides]
Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data [Web] [Slides]
On Certifying Non-Uniform Bounds against Adversarial Attacks [Web] [Slides]
Processing Megapixel Images with Deep Attention-Sampling Models [Web] [Slides]
Nonlinear Distributional Gradient Temporal-Difference Learning [Web] [Slides]
Moment-Based Variational Inference for Markov Jump Processes [Web] [Slides]
Stable-Predictive Optimistic Counterfactual Regret Minimization [Web] [Slides]
Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models [Web] [Slides]
Faster Stochastic Alternating Direction Method of Multipliers for Nonconvex Optimization [Web] [Slides]
Learning to Groove with Inverse Sequence Transformations [Web] [Slides]
Metric-Optimized Example Weights [Web] [Slides]
Improving Adversarial Robustness via Promoting Ensemble Diversity [Web] [Slides]
TapNet: Neural Network Augmented with Task-Adaptive Projection for Few-Shot Learning [Web] [Slides]
Composing Entropic Policies using Divergence Correction [Web] [Slides]
Understanding MCMC Dynamics as Flows on the Wasserstein Space [Web] [Slides]
When Samples Are Strategically Selected [Web] [Slides]
Teaching a black-box learner [Web] [Slides]
Lower Bounds for Smooth Nonconvex Finite-Sum Optimization [Web] [Slides]
Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI [Web] [Slides]
Improving Model Selection by Employing the Test Data [Web] [Slides]
Adversarial camera stickers: A physical camera-based attack on deep learning systems [Web] [Slides]
Online Meta-Learning [Web] [Slides]
TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning [Web] [Slides]
LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations [Web] [Slides]
Statistical Foundations of Virtual Democracy [Web] [Slides]
PAC Learnability of Node Functions in Networked Dynamical Systems [Web] [Slides]
Nonconvex Variance Reduced Optimization with Arbitrary Sampling [Web] [Slides]
HOList: An Environment for Machine Learning of Higher Order Logic Theorem Proving [Web] [Slides]
Topological Data Analysis of Decision Boundaries with Application to Model Selection [Web] [Slides]
Adversarial examples from computational constraints [Web]
Training Neural Networks with Local Error Signals [Web] [Slides]
Multi-Agent Adversarial Inverse Reinforcement Learning [Web] [Slides]
Amortized Monte Carlo Integration [Web] [Slides]
Optimal Auctions through Deep Learning [Web] [Slides]
Online learning with kernel losses [Web] [Slides]
Error Feedback Fixes SignSGD and other Gradient Compression Schemes [Web]
Molecular Hypergraph Grammar with Its Application to Molecular Optimization [Web] [Slides]
Contextual Memory Trees [Web]
POPQORN: Quantifying Robustness of Recurrent Neural Networks [Web] [Slides]
GMNN: Graph Markov Neural Networks [Web] [Slides]
Policy Consolidation for Continual Reinforcement Learning [Web] [Slides]
Stein Point Markov Chain Monte Carlo [Web] [Slides]
Learning to Clear the Market [Web] [Slides]
Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates [Web] [Slides]
A Composite Randomized Incremental Gradient Method [Web] [Slides]
Graph Neural Network for Music Score Data and Modeling Expressive Piano Performance [Web] [Slides]
Sparse Extreme Multi-label Learning with Oracle Property [Web] [Slides]
Using Pre-Training Can Improve Model Robustness and Uncertainty [Web] [Slides]
Self-Attention Graph Pooling [Web] [Slides]
Off-Policy Deep Reinforcement Learning without Exploration [Web] [Slides]
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations [Web] [Slides]
Learning to bid in revenue-maximizing auctions [Web] [Slides]
Fast Rates for a kNN Classifier Robust to Unknown Asymmetric Label Noise [Web] [Slides]
Optimal Continuous DR-Submodular Maximization and Applications to Provable Mean Field Inference [Web] [Slides]
Learning to Prove Theorems via Interacting with Proof Assistants [Web] [Slides]
Shape Constraints for Set Functions [Web] [Slides]
[Web]
Combating Label Noise in Deep Learning using Abstention [Web] [Slides]
Random Expert Distillation: Imitation Learning via Expert Policy Support Estimation [Web] [Slides]
Particle Flow Bayes' Rule [Web] [Slides]
Open-ended learning in symmetric zero-sum games [Web] [Slides]
Uniform Convergence Rate of the Kernel Density Estimator Adaptive to Intrinsic Volume Dimension [Web] [Slides]
Multiplicative Weights Updates as a distributed constrained optimization algorithm: Convergence to second-order stationary points almost always [Web] [Slides]
Circuit-GNN: Graph Neural Networks for Distributed Circuit Design [Web] [Slides]
On The Power of Curriculum Learning in Training Deep Networks [Web] [Slides]
PROVEN: Verifying Robustness of Neural Networks with a Probabilistic Approach [Web] [Slides]
LGM-Net: Learning to Generate Matching Networks for Few-Shot Learning [Web] [Slides]
Revisiting the Softmax Bellman Operator: New Benefits and New Perspective [Web] [Slides]
Correlated Variational Auto-Encoders [Web] [Slides]
Deep Counterfactual Regret Minimization [Web] [Slides]
Maximum Likelihood Estimation for Learning Populations of Parameters [Web] [Slides]
Katalyst: Boosting Convex Katayusha for Non-Convex Problems with a Large Condition Number [Web] [Slides]
Learning to Optimize Multigrid PDE Solvers [Web] [Slides]
Voronoi Boundary Classification: A High-Dimensional Geometric Approach via Weighted Monte Carlo Integration [Web] [Slides]
On Learning Invariant Representations for Domain Adaptation [Web]
Self-Attention Generative Adversarial Networks [Web]
An Investigation of Model-Free Planning [Web]
Towards a Unified Analysis of Random Fourier Features [Web]
Generalized Approximate Survey Propagation for High-Dimensional Estimation [Web] [Slides]
Projection onto Minkowski Sums with Application to Constrained Learning [Web] [Slides]
Safe Policy Improvement with Baseline Bootstrapping [Web] [Slides]
A Block Coordinate Descent Proximal Method for Simultaneous Filtering and Parameter Estimation [Web]
Robust Decision Trees Against Adversarial Examples [Web] [Slides]
Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models [Web] [Slides]
Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching [Web] [Slides]
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning [Web] [Slides]
Learning deep kernels for exponential family densities [Web] [Slides]
Boosted Density Estimation Remastered [Web] [Slides]
Blended Conditional Gradients [Web] [Slides]
Distributional Reinforcement Learning for Efficient Exploration [Web] [Slides]
Learning Hawkes Processes Under Synchronization Noise [Web] [Slides]
Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth [Web] [Slides]
Adversarial Generation of Time-Frequency Features with application in audio synthesis [Web] [Slides]
High-Fidelity Image Generation With Fewer Labels [Web] [Slides]
Task-Agnostic Dynamics Priors for Deep Reinforcement Learning [Web] [Slides]
Bayesian Deconditional Kernel Mean Embeddings [Web] [Slides]
Inference and Sampling of $K_{33}$-free Ising Models [Web] [Slides]
Acceleration of SVRG and Katyusha X by Inexact Preconditioning [Web] [Slides]
Optimistic Policy Optimization via Multiple Importance Sampling [Web] [Slides]
Generative Adversarial User Model for Reinforcement Learning Based Recommendation System [Web] [Slides]
Look Ma, No Latent Variables: Accurate Cutset Networks via Compilation [Web] [Slides]
On the Universality of Invariant Networks [Web] [Slides]
Revisiting precision recall definition for generative modeling [Web] [Slides]
Diagnosing Bottlenecks in Deep Q-learning Algorithms [Web] [Slides]
A Kernel Perspective for Regularizing Deep Neural Networks [Web] [Slides]
Random Matrix Improved Covariance Estimation for a Large Class of Metrics [Web] [Slides]
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD [Web] [Slides]
Neural Logic Reinforcement Learning [Web] [Slides]
A Statistical Investigation of Long Memory in Language and Music [Web] [Slides]
Optimal Transport for structured data with application on graphs [Web] [Slides]
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks [Web] [Slides]
Wasserstein of Wasserstein Loss for Learning Generative Models [Web] [Slides]
Collaborative Evolutionary Reinforcement Learning [Web] [Slides]
A Persistent Weisfeiler-Lehman Procedure for Graph Classification [Web] [Slides]
Dual Entangled Polynomial Code: Three-Dimensional Coding for Distributed Matrix Multiplication [Web] [Slides]
A Conditional-Gradient-Based Augmented Lagrangian Framework [Web] [Slides]
Learning to Collaborate in Markov Decision Processes [Web] [Slides]
Deep Factors for Forecasting [Web] [Slides]
Learning Optimal Linear Regularizers [Web] [Slides]
Gauge Equivariant Convolutional Networks and the Icosahedral CNN [Web]
Flat Metric Minimization with Applications in Generative Modeling [Web] [Slides]
EMI: Exploration with Mutual Information [Web]
Rehashing Kernel Evaluation in High Dimensions [Web] [Slides]
Neural Joint Source-Channel Coding [Web] [Slides]
SGD: General Analysis and Improved Rates [Web]
Predictor-Corrector Policy Optimization [Web] [Slides]
Weakly-Supervised Temporal Localization via Occurrence Count Learning [Web] [Slides]
On Symmetric Losses for Learning from Corrupted Labels [Web] [Slides]
Feature-Critic Networks for Heterogeneous Domain Generalization [Web] [Slides]
Entropic GANs meet VAEs: A Statistical Approach to Compute Sample Likelihoods in GANs [Web] [Slides]
Imitation Learning from Imperfect Demonstration [Web] [Slides]
Large-Scale Sparse Kernel Canonical Correlation Analysis [Web] [Slides]
Doubly-Competitive Distribution Estimation [Web] [Slides]
Curvature-Exploiting Acceleration of Elastic Net Computations [Web] [Slides]
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning [Web] [Slides]
Switching Linear Dynamics for Variational Bayes Filtering [Web] [Slides]
AUCµ: A Performance Metric for Multi-Class Machine Learning Models [Web] [Slides]
Learning to Convolve: A Generalized Weight-Tying Approach [Web] [Slides]
Non-Parametric Priors For Generative Adversarial Networks [Web] [Slides]
Curiosity-Bottleneck: Exploration By Distilling Task-Specific Novelty [Web] [Slides]
A Kernel Theory of Modern Data Augmentation [Web] [Slides]
Homomorphic Sensing [Web] [Slides]
Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication [Web] [Slides]
DeepMDP: Learning Continuous Latent Space Models for Representation Learning [Web] [Slides]
Imputing Missing Events in Continuous-Time Event Streams [Web] [Slides]
Regularization in directable environments with application to Tetris [Web] [Slides]
On Dropout and Nuclear Norm Regularization [Web] [Slides]
Lipschitz Generative Adversarial Nets [Web] [Slides]
Dynamic Weights in Multi-Objective Deep Reinforcement Learning [Web] [Slides]
kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection [Web] [Slides]
Phaseless PCA: Low-Rank Matrix Recovery from Column-wise Phaseless Measurements [Web] [Slides]
Safe Grid Search with Optimal Complexity [Web] [Slides]
Importance Sampling Policy Evaluation with an Estimated Behavior Policy [Web] [Slides]
Understanding and Controlling Memory in Recurrent Neural Networks [Web] [Slides]
Improved Dynamic Graph Learning through Fault-Tolerant Sparsification [Web] [Slides]
Gradient Descent Finds Global Minima of Deep Neural Networks [Web] [Slides]
HexaGAN: Generative Adversarial Nets for Real World Classification [Web] [Slides]
Fingerprint Policy Optimisation for Robust Reinforcement Learning [Web] [Slides]
Scalable Learning in Reproducing Kernel Krein Spaces [Web] [Slides]
Rate Distortion For Model Compression: From Theory To Practice [Web] [Slides]
SAGA with Arbitrary Sampling [Web] [Slides]
Learning from a Learner [Web] [Slides]
Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces [Web] [Slides]
Heterogeneous Model Reuse via Optimizing Multiparty Multiclass Margin [Web] [Slides]
Composable Core-sets for Determinant Maximization: A Simple Near-Optimal Algorithm [Web] [Slides]
Graph Matching Networks for Learning the Similarity of Graph Structured Objects [Web] [Slides]
An Investigation into Neural Net Optimization via Hessian Eigenvalue Density [Web] [Slides]
Dirichlet Simplex Nest and Geometric Inference [Web] [Slides]
Formal Privacy for Functional Data with Gaussian Perturbations [Web] [Slides]
Natural Analysts in Adaptive Data Analysis [Web] [Slides]
Separable value functions across time-scales [Web]
Subspace Robust Wasserstein Distances [Web]
Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff [Web]
Sublinear Time Nearest Neighbor Search over Generalized Weighted Space [Web] [Slides]
BayesNAS: A Bayesian Approach for Neural Architecture Search [Web] [Slides]
Differentiable Linearized ADMM [Web] [Slides]
Bayesian leave-one-out cross-validation for large data [Web] [Slides]
Graphical-model based estimation and inference for differential privacy [Web] [Slides]
CapsAndRuns: An Improved Method for Approximately Optimal Algorithm Configuration [Web] [Slides]
Learning Action Representations for Reinforcement Learning [Web] [Slides]
Decomposing feature-level variation with Covariate Gaussian Process Latent Variable Models [Web] [Slides]
Collaborative Channel Pruning for Deep Networks [Web] [Slides]
Compressing Gradient Optimizers via Count-Sketches [Web] [Slides]
Set Transformer: A Framework for Attention-based Permutation-Invariant Neural Networks [Web] [Slides]
Adaptive Stochastic Natural Gradient Method for One-Shot Neural Architecture Search [Web] [Slides]
Rao-Blackwellized Stochastic Gradients for Discrete Distributions [Web] [Slides]
White-box vs Black-box: Bayes Optimal Strategies for Membership Inference [Web] [Slides]
Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction [Web] [Slides]
Bayesian Counterfactual Risk Minimization [Web] [Slides]
Active Manifolds: A non-linear analogue to Active Subspaces [Web] [Slides]
Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization [Web] [Slides]
Scalable Fair Clustering [Web] [Slides]
Shallow-Deep Networks: Understanding and Mitigating Network Overthinking [Web] [Slides]
A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent [Web] [Slides]
Neurally-Guided Structure Inference [Web] [Slides]
An Optimal Private Stochastic-MAB Algorithm based on Optimal Private Stopping Rule [Web] [Slides]
Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints [Web] [Slides]
Per-Decision Option Discounting [Web] [Slides]
Optimal Minimal Margin Maximization with Boosting [Web] [Slides]
GDPP: Learning Diverse Generations using Determinantal Point Processes [Web] [Slides]
Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator [Web] [Slides]
Graph U-Nets [Web] [Slides]
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study [Web] [Slides]
Bayesian Joint Spike-and-Slab Graphical Lasso [Web] [Slides]
Sublinear Space Private Algorithms Under the Sliding Window Model [Web] [Slides]
Optimality Implies Kernel Sum Classifiers are Statistically Efficient [Web] [Slides]
Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds [Web] [Slides]
Generalized Linear Rule Models [Web] [Slides]
Co-Representation Network for Generalized Zero-Shot Learning [Web] [Slides]
Fault Tolerance in Iterative-Convergent Machine Learning [Web]
SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver [Web] [Slides]
AdaGrad stepsizes: sharp convergence over nonconvex landscapes [Web] [Slides]
Rotation Invariant Householder Parameterization for Bayesian PCA [Web] [Slides]
Locally Private Bayesian Inference for Count Models [Web]
The Implicit Fairness Criterion of Unconstrained Learning [Web]
A Theory of Regularized Markov Decision Processes [Web] [Slides]
Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications [Web] [Slides]
GEOMetrics: Exploiting Geometric Structure for Graph-Encoded Objects [Web] [Slides]
Static Automatic Batching In TensorFlow [Web] [Slides]
Area Attention [Web] [Slides]
Beyond Backprop: Online Alternating Minimization with Auxiliary Variables [Web] [Slides]
A Framework for Bayesian Optimization in Embedded Subspaces [Web] [Slides]
Low Latency Privacy Preserving Inference [Web] [Slides]
Weak Detection of Signal in the Spiked Wigner Model [Web] [Slides]
Discovering Options for Exploration by Minimizing Cover Time [Web] [Slides]
Variational Inference for sparse network reconstruction from count data [Web] [Slides]
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks [Web] [Slides]
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting [Web] [Slides]
The Evolved Transformer [Web] [Slides]
SWALP: Stochastic Weight Averaging in Low Precision Training [Web] [Slides]
Convolutional Poisson Gamma Belief Network [Web] [Slides]
Communication Complexity in Locally Private Distribution Estimation and Heavy Hitters [Web] [Slides]
Rademacher Complexity for Adversarially Robust Generalization [Web] [Slides]
Policy Certificates: Towards Accountable Reinforcement Learning [Web] [Slides]
Simplifying Graph Convolutional Networks [Web] [Slides]
Geometry Aware Convolutional Filters for Omnidirectional Images Representation [Web] [Slides]
Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Applications [Web] [Slides]
Jumpout: Improved Dropout for Deep Neural Networks with ReLUs [Web] [Slides]
Efficient optimization of loops and limits with randomized telescoping sums [Web] [Slides]
Automatic Posterior Transformation for Likelihood-Free Inference [Web] [Slides]
Poisson Subsampled Rényi Differential Privacy [Web] [Slides]
Provably efficient RL with Rich Observations via Latent State Decoding [Web] [Slides]
Action Robust Reinforcement Learning and Applications in Continuous Control [Web] [Slides]
Robust Influence Maximization for Hyperparametric Models [Web] [Slides]
A Personalized Affective Memory Model for Improving Emotion Recognition [Web] [Slides]
DL2: Training and Querying Neural Networks with Logic [Web] [Slides]
Stochastic Deep Networks [Web] [Slides]
Self-similar Epochs: Value in arrangement [Web] [Slides]
Active Learning for Decision-Making from Imbalanced Observational Data [Web] [Slides]
Benefits and Pitfalls of the Exponential Mechanism with Applications to Hilbert Spaces and Functional PCA [Web] [Slides]
Information-Theoretic Considerations in Batch Reinforcement Learning [Web] [Slides]
The Value Function Polytope in Reinforcement Learning [Web] [Slides]
HyperGAN: A Generative Model for Diverse, Performant Neural Networks [Web] [Slides]
Temporal Gaussian Mixture Layer for Videos [Web] [Slides]
Posters Tue [Web]

Day 2

The U.S. Census Bureau Tries to be a Good Data Steward in the 21st Century [Web]
Test of Time Award [Web]
Theoretically Principled Trade-off between Robustness and Accuracy [Web]
Sum-of-Squares Polynomial Flow [Web]
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning [Web]
Distribution calibration for regression [Web]
On the Convergence and Robustness of Adversarial Training [Web] [Slides]
Distributed Learning with Sublinear Communication [Web]
Complexity of Linear Regions in Deep Networks [Web]
Exploiting Worker Correlation for Label Aggregation in Crowdsourcing [Web]
Optimal Algorithms for Lipschitz Bandits with Heavy-tailed Rewards [Web]
The Odds are Odd: A Statistical Test for Detecting Adversarial Examples [Web] [Slides]
FloWaveNet: A Generative Flow for Raw Audio [Web] [Slides]
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning [Web] [Slides]
Graph Convolutional Gaussian Processes [Web] [Slides]
Learning with Bad Training Data via Iterative Trimmed Loss Minimization [Web] [Slides]
On the Linear Speedup Analysis of Communication Efficient Momentum SGD for Distributed Non-Convex Optimization [Web] [Slides]
On Connected Sublevel Sets in Deep Learning [Web] [Slides]
Efficient Amortised Bayesian Inference for Hierarchical and Nonlinear Dynamical Systems [Web] [Slides]
Target Tracking for Contextual Bandits: Application to Demand Side Management [Web] [Slides]
ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation [Web] [Slides]
Are Generative Classifiers More Robust to Adversarial Attacks? [Web] [Slides]
Imitating Latent Policies from Observation [Web] [Slides]
Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation [Web] [Slides]
On discriminative learning of prediction uncertainty [Web] [Slides]
Stochastic Gradient Push for Distributed Deep Learning [Web] [Slides]
Adversarial Examples Are a Natural Consequence of Test Error in Noise [Web] [Slides]
A Multitask Multiple Kernel Learning Algorithm for Survival Analysis with Application to Cancer Biology [Web] [Slides]
Correlated bandits or: How to minimize mean-squared error online [Web] [Slides]
Certified Adversarial Robustness via Randomized Smoothing [Web] [Slides]
A Gradual, Semi-Discrete Approach to Generative Network Training via Explicit Wasserstein Minimization [Web] [Slides]
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning [Web] [Slides]
GOODE: A Gaussian Off-The-Shelf Ordinary Differential Equation Solver [Web] [Slides]
Understanding and Utilizing Deep Neural Networks Trained with Noisy Labels [Web] [Slides]
Collective Model Fusion for Multiple Black-Box Experts [Web] [Slides]
Greedy Layerwise Learning Can Scale To ImageNet [Web] [Slides]
Fast and Flexible Inference of Joint Distributions from their Marginals [Web] [Slides]
Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging [Web] [Slides]
Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition [Web] [Slides]
Disentangling Disentanglement in Variational Autoencoders [Web] [Slides]
Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning [Web] [Slides]
Overcoming Mean-Field Approximations in Recurrent Gaussian Process Models [Web] [Slides]
Does Data Augmentation Lead to Positive Margin? [Web] [Slides]
Trading Redundancy for Communication: Speeding up Distributed SGD for Non-convex Optimization [Web] [Slides]
On the Impact of the Activation function on Deep Neural Networks Training [Web] [Slides]
Cognitive model priors for predicting human decisions [Web] [Slides]
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits [Web] [Slides]
Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization [Web]
EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE [Web]
Structured agents for physical construction [Web]
AReS and MaRS - Adversarial and MMD-Minimizing Regression for SDEs [Web]
Robust Learning from Untrusted Sources [Web] [Slides]
Trimming the $\ell_1$ Regularizer: Statistical Analysis, Optimization, and Applications to Deep Learning [Web] [Slides]
Estimating Information Flow in Deep Neural Networks [Web] [Slides]
Conditioning by adaptive sampling for robust design [Web] [Slides]
Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously [Web] [Slides]
Wasserstein Adversarial Examples via Projected Sinkhorn Iterations [Web] [Slides]
A Wrapped Normal Distribution on Hyperbolic Space for Gradient-Based Learning [Web] [Slides]
Learning Novel Policies For Tasks [Web] [Slides]
End-to-End Probabilistic Inference for Nonstationary Audio Analysis [Web] [Slides]
SELFIE: Refurbishing Unclean Samples for Robust Deep Learning [Web] [Slides]
Compressed Factorization: Fast and Accurate Low-Rank Factorization of Compressively-Sensed Data [Web] [Slides]
The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects [Web] [Slides]
Direct Uncertainty Prediction for Medical Second Opinions [Web] [Slides]
Bilinear Bandits with Low-rank Structure [Web] [Slides]
Transferable Clean-Label Poisoning Attacks on Deep Neural Nets [Web] [Slides]
Emerging Convolutions for Generative Normalizing Flows [Web] [Slides]
Taming MAML: Efficient unbiased meta-reinforcement learning [Web] [Slides]
Deep Gaussian Processes with Importance-Weighted Variational Inference [Web] [Slides]
Zeno: Distributed Stochastic Gradient Descent with Suspicion-based Fault-tolerance [Web] [Slides]
Noisy Dual Principal Component Pursuit [Web] [Slides]
Characterizing Well-Behaved vs. Pathological Deep Neural Networks [Web] [Slides]
Dynamic Measurement Scheduling for Event Forecasting using Deep RL [Web] [Slides]
Online Learning to Rank with Features [Web] [Slides]
NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks [Web] [Slides]
A Large-Scale Study on Regularization and Normalization in GANs [Web] [Slides]
Self-Supervised Exploration via Disagreement [Web] [Slides]
Automated Model Selection with Bayesian Quadrature [Web] [Slides]
Concentration Inequalities for Conditional Value at Risk [Web] [Slides]
Learning a Compressed Sensing Measurement Matrix via Gradient Unrolling [Web] [Slides]
Understanding Geometry of Encoder-Decoder CNNs [Web] [Slides]
Parameter efficient training of deep convolutional neural networks by dynamic sparse reparameterization [Web] [Slides]
On the Design of Estimators for Bandit Off-Policy Evaluation [Web] [Slides]
Simple Black-box Adversarial Attacks [Web] [Slides]
Variational Annealing of GANs: A Langevin Perspective [Web] [Slides]
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables [Web] [Slides]
[Web]
Data Poisoning Attacks in Multi-Party Learning [Web] [Slides]
Screening rules for Lasso with non-convex Sparse Regularizers [Web] [Slides]
Traditional and Heavy Tailed Self Regularization in Neural Network Models [Web] [Slides]
DeepNose: Using artificial neural networks to represent the space of odorants [Web] [Slides]
Dynamic Learning with Frequent New Product Launches: A Sequential Multinomial Logit Bandit Problem [Web] [Slides]
Causal Identification under Markov Equivalence: Completeness Results [Web]
Invertible Residual Networks [Web]
The Natural Language of Actions [Web] [Slides]
Beyond the Chinese Restaurant and Pitman-Yor processes: Statistical Models with double power-law behavior [Web]
Distributed Weighted Matching via Randomized Composable Coresets [Web] [Slides]
Monge blunts Bayes: Hardness Results for Adversarial Training [Web] [Slides]
Almost surely constrained convex optimization [Web] [Slides]
Domain Agnostic Learning with Disentangled Representations [Web]
Context-Aware Zero-Shot Learning for Object Recognition [Web]
Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models [Web] [Slides]
NAS-Bench-101: Towards Reproducible Neural Architecture Search [Web] [Slides]
Control Regularization for Reduced Variance Reinforcement Learning [Web] [Slides]
DP-GP-LVM: A Bayesian Non-Parametric Model for Learning Multivariate Dependency Structures [Web] [Slides]
Multivariate Submodular Optimization [Web] [Slides]
Better generalization with less data using robust gradient descent [Web] [Slides]
Generalized Majorization-Minimization [Web] [Slides]
Composing Value Functions in Reinforcement Learning [Web] [Slides]
Band-limited Training and Inference for Convolutional Neural Networks [Web] [Slides]
Causal Discovery and Forecasting in Nonstationary Environments with State-Space Models [Web] [Slides]
Approximated Oracle Filter Pruning for Destructive CNN Width Optimization [Web] [Slides]
On the Generalization Gap in Reparameterizable Reinforcement Learning [Web] [Slides]
Random Function Priors for Correlation Modeling [Web] [Slides]
Beyond Adaptive Submodularity: Approximation Guarantees of Greedy Policy with Adaptive Submodularity Ratio [Web] [Slides]
Near optimal finite time identification of arbitrary linear dynamical systems [Web] [Slides]
On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization [Web] [Slides]
Fast Context Adaptation via Meta-Learning [Web] [Slides]
Learning Classifiers for Target Domain with Limited or No Labels [Web] [Slides]
Classifying Treatment Responders Under Causal Effect Monotonicity [Web] [Slides]
LegoNet: Efficient Convolutional Neural Networks with Lego Filters [Web] [Slides]
Trajectory-Based Off-Policy Deep Reinforcement Learning [Web] [Slides]
Variational Russian Roulette for Deep Bayesian Nonparametrics [Web] [Slides]
Approximating Orthogonal Matrices with Effective Givens Factorization [Web] [Slides]
Lossless or Quantized Boosting with Integer Arithmetic [Web] [Slides]
Simple Stochastic Gradient Methods for Non-Smooth Non-Convex Regularized Optimization [Web] [Slides]
Provable Guarantees for Gradient-Based Meta-Learning [Web] [Slides]
Population Based Augmentation: Efficient Learning of Augmentation Policy Schedules [Web] [Slides]
Learning Models from Data with Measurement Error: Tackling Underreporting [Web] [Slides]
Sorting Out Lipschitz Function Approximation [Web] [Slides]
A Deep Reinforcement Learning Perspective on Internet Congestion Control [Web] [Slides]
Incorporating Grouping Information into Bayesian Decision Tree Ensembles [Web] [Slides]
New results on information theoretic clustering [Web] [Slides]
Orthogonal Random Forest for Causal Inference [Web] [Slides]
Surrogate Losses for Online Learning of Stepsizes in Stochastic Non-Convex Optimization [Web] [Slides]
Towards Understanding Knowledge Distillation [Web] [Slides]
Anomaly Detection With Multiple-Hypotheses Predictions [Web] [Slides]
Adjustment Criteria for Generalizing Experimental Findings [Web] [Slides]
Graph Element Networks: adaptive, structured computation and memory [Web] [Slides]
Model-Based Active Exploration [Web]
Variational Implicit Processes [Web]
Improved Parallel Algorithms for Density-Based Network Clustering [Web] [Slides]
MONK -- Outlier-Robust Mean Embedding Estimation by Median-of-Means [Web]
Efficient Dictionary Learning with Gradient Descent [Web]
Transferable Adversarial Training: A General Approach to Adapting Deep Classifiers [Web] [Slides]
Kernel Mean Matching for Content Addressability of GANs [Web]
Conditional Independence in Testing Bayesian Networks [Web] [Slides]
Training CNNs with Selective Allocation of Channels [Web] [Slides]
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations [Web] [Slides]
Discovering Latent Covariance Structures for Multiple Time Series [Web] [Slides]
Submodular Observation Selection and Information Gathering for Quadratic Models [Web] [Slides]
The advantages of multiple classes for reducing overfitting from test set reuse [Web] [Slides]
Plug-and-Play Methods Provably Converge with Properly Trained Denoisers [Web] [Slides]
Transferability vs. Discriminability: Batch Spectral Penalization for Adversarial Domain Adaptation [Web] [Slides]
Neural Inverse Knitting: From Images to Manufacturing Instructions [Web] [Slides]
Sensitivity Analysis of Linear Structural Causal Models [Web] [Slides]
Equivariant Transformer Networks [Web] [Slides]
Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN [Web] [Slides]
Scalable Training of Inference Networks for Gaussian-Process Models [Web] [Slides]
Submodular Cost Submodular Cover with an Approximate Oracle [Web] [Slides]
On the statistical rate of nonlinear recovery in generative models with heavy-tailed data [Web] [Slides]
Riemannian adaptive stochastic gradient algorithms on matrix manifolds [Web] [Slides]
Learning-to-Learn Stochastic Gradient Descent with Biased Regularization [Web] [Slides]
Making Convolutional Networks Shift-Invariant Again [Web] [Slides]
More Efficient Off-Policy Evaluation through Regularized Targeted Learning [Web] [Slides]
Overcoming Multi-model Forgetting [Web] [Slides]
A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs [Web] [Slides]
Bayesian Optimization Meets Bayesian Optimal Stopping [Web] [Slides]
Submodular Streaming in All Its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity [Web] [Slides]
Phase transition in PCA with missing data: Reduced signal-to-noise ratio, not sample size! [Web] [Slides]
Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence [Web] [Slides]
BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning [Web] [Slides]
Generative Modeling of Infinite Occluded Objects for Compositional Scene Representation [Web] [Slides]
Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding [Web] [Slides]
Bayesian Nonparametric Federated Learning of Neural Networks [Web] [Slides]
Remember and Forget for Experience Replay [Web] [Slides]
Learning interpretable continuous-time models of latent stochastic dynamical systems [Web] [Slides]
Hiring Under Uncertainty [Web] [Slides]
On Medians of (Randomized) Pairwise Means [Web] [Slides]
Alternating Minimizations Converge to Second-Order Optimal Solutions [Web] [Slides]
Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation [Web] [Slides]
IMEXnet - A Forward Stable Deep Neural Network [Web] [Slides]
Adversarially Learned Representations for Information Obfuscation and Inference [Web] [Slides]
How does Disagreement Help Generalization against Label Corruption? [Web] [Slides]
Tensor Variable Elimination for Plated Factor Graphs [Web] [Slides]
A Tree-Based Method for Fast Repeated Sampling of Determinantal Point Processes [Web]
Position-aware Graph Neural Networks [Web]
Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances [Web]
Provably Efficient Imitation Learning from Observation Alone [Web]
Active Embedding Search via Noisy Paired Comparisons [Web] [Slides]
Do ImageNet Classifiers Generalize to ImageNet? [Web]
Adaptive Neural Trees [Web] [Slides]
EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis [Web] [Slides]
Predicate Exchange: Inference with Declarative Knowledge [Web] [Slides]
Nonlinear Stein Variational Gradient Descent for Learning Diversified Mixture Models [Web] [Slides]
Detecting Overlapping and Correlated Communities without Pure Nodes: Identifiability and Algorithm [Web] [Slides]
SGD without Replacement: Sharper Rates for General Smooth Convex Functions [Web] [Slides]
Dead-ends and Secure Exploration in Reinforcement Learning [Web] [Slides]
Fast Direct Search in an Optimally Compressed Continuous Target Space for Efficient Multi-Label Active Learning [Web] [Slides]
Exploring the Landscape of Spatial Robustness [Web] [Slides]
Connectivity-Optimized Representation Learning via Persistent Homology [Web] [Slides]
Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment [Web] [Slides]
Discriminative Regularization for Latent Variable Models with Applications to Electrocardiography [Web] [Slides]
Understanding and Accelerating Particle-Based Variational Inference [Web] [Slides]
Learning Generative Models across Incomparable Spaces [Web] [Slides]
On the Complexity of Approximating Wasserstein Barycenters [Web] [Slides]
Statistics and Samples in Distributional Reinforcement Learning [Web] [Slides]
Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments [Web] [Slides]
Sever: A Robust Meta-Algorithm for Stochastic Optimization [Web] [Slides]
Minimal Achievable Sufficient Statistic Learning [Web] [Slides]
Deep Compressed Sensing [Web] [Slides]
Hierarchical Decompositional Mixtures of Variational Autoencoders [Web] [Slides]
Efficient learning of smooth probability functions from Bernoulli tests with guarantees [Web] [Slides]
Relational Pooling for Graph Representations [Web] [Slides]
Estimate Sequences for Variance-Reduced Stochastic Composite Optimization [Web] [Slides]
Hessian Aided Policy Gradient [Web] [Slides]
Bayesian Generative Active Deep Learning [Web] [Slides]
Analyzing Federated Learning through an Adversarial Lens [Web] [Slides]
Learning to Route in Similarity Graphs [Web] [Slides]
Differentiable Dynamic Normalization for Learning Deep Representation [Web] [Slides]
Finding Mixed Nash Equilibria of Generative Adversarial Networks [Web] [Slides]
The Variational Predictive Natural Gradient [Web] [Slides]
Disentangled Graph Convolutional Networks [Web] [Slides]
A Dynamical Systems Perspective on Nesterov Acceleration [Web] [Slides]
Provably Efficient Maximum Entropy Exploration [Web] [Slides]
Active Learning for Probabilistic Structured Prediction of Cuts and Matchings [Web] [Slides]
Fairwashing: the risk of rationalization [Web] [Slides]
Invariant-Equivariant Representation Learning for Multi-Class Data [Web] [Slides]
Toward Understanding the Importance of Noise in Training Neural Networks [Web] [Slides]
CompILE: Compositional Imitation Learning and Execution [Web]
Scalable Nonparametric Sampling from Multimodal Posteriors with the Posterior Bootstrap [Web] [Slides]
Open Vocabulary Learning on Source Code with a Graph-Structured Cache [Web] [Slides]
Random Shuffling Beats SGD after Finite Epochs [Web] [Slides]
Combining parametric and nonparametric models for off-policy evaluation [Web] [Slides]
Active Learning with Disagreement Graphs [Web] [Slides]
Understanding the Origins of Bias in Word Embeddings [Web]
Infinite Mixture Prototypes for Few-shot Learning [Web] [Slides]
Cheap Orthogonal Constraints in Neural Networks: A Simple Parametrization of the Orthogonal and Unitary Group [Web] [Slides]
Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data [Web] [Slides]
An Instability in Variational Inference for Topic Models [Web] [Slides]
Learning Discrete Structures for Graph Neural Networks [Web] [Slides]
First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems [Web] [Slides]
Sample-Optimal Parametric Q-Learning Using Linearly Additive Features [Web] [Slides]
Multi-Frequency Vector Diffusion Maps [Web] [Slides]
Bias Also Matters: Bias Attribution for Deep Neural Network Explanation [Web] [Slides]
MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing [Web] [Slides]
Breaking Inter-Layer Co-Adaptation by Classifier Anonymization [Web] [Slides]
Deep Generative Learning via Variational Gradient Flow [Web] [Slides]
Bayesian Optimization of Composite Functions [Web] [Slides]
Compositional Fairness Constraints for Graph Embeddings [Web] [Slides]
Improved Convergence for $\ell_1$ and $\ell_\infty$ Regression via Iteratively Reweighted Least Squares [Web] [Slides]
Transfer of Samples in Policy Search via Multiple Importance Sampling [Web] [Slides]
Co-manifold learning with missing data [Web] [Slides]
Interpreting Adversarially Trained Convolutional Neural Networks [Web] [Slides]
Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting [Web] [Slides]
Understanding the Impact of Entropy on Policy Optimization [Web] [Slides]
Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design [Web] [Slides]
The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions [Web] [Slides]
A Recurrent Neural Cascade-based Model for Continuous-Time Diffusion [Web] [Slides]
Optimal Mini-Batch and Step Sizes for SAGA [Web] [Slides]
Exploration Conscious Reinforcement Learning Revisited [Web] [Slides]
[Web]
Counterfactual Visual Explanations [Web] [Slides]
[Web]
Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning [Web] [Slides]
Learning Neurosymbolic Generative Models via Program Synthesis [Web] [Slides]
Quantile Stein Variational Gradient Descent for Batch Bayesian Optimization [Web] [Slides]
Stochastic Blockmodels meet Graph Neural Networks [Web] [Slides]
Differential Inclusions for Modeling Nonsmooth ADMM Variants: A Continuous Limit Theory [Web] [Slides]
Kernel-Based Reinforcement Learning in Robust Markov Decision Processes [Web] [Slides]
[Web]
Data Poisoning Attacks on Stochastic Bandits [Web] [Slides]
Posters Wed [Web]

Day 3

Neural Network Attributions: A Causal Perspective [Web]
State-Reification Networks: Improving Generalization by Modeling the Distribution of Hidden Representations [Web] [Slides]
Batch Policy Learning under Constraints [Web] [Slides]
Sliced-Wasserstein Flows: Nonparametric Generative Modeling via Optimal Transport and Diffusions [Web]
Matrix-Free Preconditioning in Online Learning [Web] [Slides]
Geometric Losses for Distributional Learning [Web] [Slides]
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning [Web]
Doubly Robust Joint Learning for Recommendation on Data Missing Not at Random [Web] [Slides]
On Sparse Linear Regression in the Local Differential Privacy Model [Web]
Towards a Deep and Unified Understanding of Deep Neural Models in NLP [Web] [Slides]
Variational Laplace Autoencoders [Web] [Slides]
Quantifying Generalization in Reinforcement Learning [Web] [Slides]
Non-Asymptotic Analysis of Fractional Langevin Monte Carlo for Non-Convex Optimization [Web] [Slides]
Online Convex Optimization in Adversarial Markov Decision Processes [Web] [Slides]
Classification from Positive, Unlabeled and Biased Negative Data [Web] [Slides]
Stochastic Iterative Hard Thresholding for Graph-structured Sparsity Optimization [Web] [Slides]
Linear-Complexity Data-Parallel Earth Mover's Distance Approximations [Web] [Slides]
Differentially Private Empirical Risk Minimization with Non-convex Loss Functions [Web] [Slides]
Explaining Deep Neural Networks with a Polynomial Time Algorithm for Shapley Value Approximation [Web] [Slides]
Latent Normalizing Flows for Discrete Sequences [Web] [Slides]
Learning Latent Dynamics for Planning from Pixels [Web] [Slides]
Unifying Orthogonal Monte Carlo Methods [Web] [Slides]
Competing Against Nash Equilibria in Adversarially Changing Zero-Sum Games [Web] [Slides]
Complementary-Label Learning for Arbitrary Losses and Models [Web] [Slides]
Neuron birth-death dynamics accelerates gradient descent and converges asymptotically [Web] [Slides]
Model Comparison for Semantic Grouping [Web] [Slides]
Bounding User Contributions: A Bias-Variance Trade-off in Differential Privacy [Web] [Slides]
Functional Transparency for Structured Data: a Game-Theoretic Approach [Web] [Slides]
Multi-objective training of Generative Adversarial Networks with multiple discriminators [Web] [Slides]
Projections for Approximate Policy Iteration Algorithms [Web] [Slides]
Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits [Web] [Slides]
Online Learning with Sleeping Experts and Feedback Graphs [Web] [Slides]
Learning to Infer Program Sketches [Web] [Slides]
Width Provably Matters in Optimization for Deep Linear Neural Networks [Web] [Slides]
RaFM: Rank-Aware Factorization Machines [Web] [Slides]
Differentially Private Learning of Geometric Concepts [Web] [Slides]
Exploring interpretable LSTM neural networks over multi-variable data [Web] [Slides]
Learning Discrete and Continuous Factors of Data via Alternating Disentanglement [Web] [Slides]
Learning Structured Decision Problems with Unawareness [Web] [Slides]
Metropolis-Hastings Generative Adversarial Networks [Web] [Slides]
Incremental Randomized Sketching for Online Kernel Learning [Web] [Slides]
Hierarchically Structured Meta-learning [Web] [Slides]
Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path? [Web] [Slides]
CAB: Continuous Adaptive Blending for Policy Evaluation and Learning [Web] [Slides]
Toward Controlling Discrimination in Online Ad Auctions [Web] [Slides]
TensorFuzz: Debugging Neural Networks with Coverage-Guided Fuzzing [Web]
Bit-Swap: Recursive Bits-Back Coding for Lossless Compression with Hierarchical Latent Variables [Web]
Calibrated Model-Based Deep Reinforcement Learning [Web] [Slides]
Scalable Metropolis-Hastings for Exact Bayesian Inference with Large Datasets [Web] [Slides]
Adaptive Scale-Invariant Online Algorithms for Learning Linear Models [Web]
Bridging Theory and Algorithm for Domain Adaptation [Web] [Slides]
Power k-Means Clustering [Web] [Slides]
MetricGAN: Generative Adversarial Networks based Black-box Metric Scores Optimization for Speech Enhancement [Web]
Learning Optimal Fair Policies [Web]
Gaining Free or Low-Cost Interpretability with Interpretable Partial Substitute [Web] [Slides]
Graphite: Iterative Generative Modeling of Graphs [Web] [Slides]
Reinforcement Learning in Configurable Continuous Environments [Web] [Slides]
Replica Conditional Sequential Monte Carlo [Web] [Slides]
Online Control with Adversarial Disturbances [Web] [Slides]
Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation [Web] [Slides]
Distributed Learning over Unreliable Networks [Web] [Slides]
Neural Separation of Observed and Unobserved Distributions [Web] [Slides]
Fairness-Aware Learning for Continuous Attributes and Treatments [Web] [Slides]
State-Regularized Recurrent Neural Networks [Web] [Slides]
Hybrid Models with Deep and Invertible Features [Web] [Slides]
Target-Based Temporal-Difference Learning [Web] [Slides]
A Polynomial Time MCMC Method for Sampling from Continuous Determinantal Point Processes [Web] [Slides]
Adversarial Online Learning with noise [Web] [Slides]
Learning What and Where to Transfer [Web] [Slides]
Escaping Saddle Points with Adaptive Gradient Methods [Web] [Slides]
Almost Unsupervised Text to Speech and Automatic Speech Recognition [Web] [Slides]
Fairness risk measures [Web] [Slides]
Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation [Web] [Slides]
MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets [Web] [Slides]
Iterative Linearized Control: Stable Algorithms and Complexity Guarantees [Web] [Slides]
Adaptive Antithetic Sampling for Variance Reduction [Web] [Slides]
Online Variance Reduction with Mixtures [Web] [Slides]
[Web]
$\texttt{DoubleSqueeze}$: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression [Web] [Slides]
AutoVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss [Web] [Slides]
[Web]
On the Connection Between Adversarial Robustness and Saliency Map Interpretability [Web] [Slides]
On Scalable and Efficient Computation of Large Scale Optimal Transport [Web] [Slides]
Finding Options that Minimize Planning Time [Web] [Slides]
Accelerated Flow for Probability Distributions [Web] [Slides]
Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case [Web] [Slides]
[Web]
Model Function Based Conditional Gradient Method with Armijo-like Line Search [Web] [Slides]
A fully differentiable beam search decoder [Web] [Slides]
[Web]
Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem [Web]
Understanding and correcting pathologies in the training of learned optimizers [Web]
Stochastic Beams and Where To Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement [Web] [Slides]
Tight Kernel Query Complexity of Kernel Ridge Regression and Kernel $k$-means Clustering [Web] [Slides]
Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret [Web] [Slides]
DBSCAN++: Towards fast and scalable density clustering [Web]
Analogies Explained: Towards Understanding Word Embeddings [Web] [Slides]
Scaling Up Ordinal Embedding: A Landmark Approach [Web] [Slides]
Proportionally Fair Clustering [Web] [Slides]
On the Spectral Bias of Neural Networks [Web] [Slides]
Demystifying Dropout [Web] [Slides]
Learning to Exploit Long-term Relational Dependencies in Knowledge Graphs [Web] [Slides]
Dimensionality Reduction for Tukey Regression [Web] [Slides]
Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems [Web] [Slides]
Concrete Autoencoders: Differentiable Feature Selection and Reconstruction [Web] [Slides]
Parameter-Efficient Transfer Learning for NLP [Web] [Slides]
Learning to select for a predefined ranking [Web] [Slides]
Stable and Fair Classification [Web] [Slides]
Recursive Sketches for Modular Deep Learning [Web] [Slides]
Ladder Capsule Network [Web] [Slides]
Meta-Learning Neural Bloom Filters [Web] [Slides]
Efficient Full-Matrix Adaptive Regularization [Web] [Slides]
Adaptive Regret of Convex and Smooth Functions [Web] [Slides]
Gromov-Wasserstein Learning for Graph Matching and Node Embedding [Web] [Slides]
Efficient On-Device Models using Neural Projections [Web] [Slides]
Mallows ranking models: maximum likelihood estimate and regeneration [Web] [Slides]
Flexibly Fair Representation Learning by Disentanglement [Web] [Slides]
Zero-Shot Knowledge Distillation in Deep Networks [Web] [Slides]
Unreproducible Research is Reproducible [Web] [Slides]
CoT: Cooperative Training for Generative Modeling of Discrete Data [Web] [Slides]
Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms [Web] [Slides]
Online Adaptive Principal Component Analysis and Its extensions [Web] [Slides]
Spectral Clustering of Signed Graphs via Matrix Power Means [Web] [Slides]
Deep Residual Output Layers for Neural Language Generation [Web] [Slides]
Fast and Stable Maximum Likelihood Estimation for Incomplete Multinomial Models [Web] [Slides]
Fair Regression: Quantitative Definitions and Reduction-Based Algorithms [Web] [Slides]
A Convergence Theory for Deep Learning via Over-Parameterization [Web] [Slides]
Geometric Scattering for Graph Data Analysis [Web] [Slides]
Non-Monotonic Sequential Text Generation [Web] [Slides]
Efficient Nonconvex Regularized Tensor Completion with Structure-aware Proximal Iterations [Web] [Slides]
POLITEX: Regret Bounds for Policy Iteration using Expert Prediction [Web] [Slides]
Coresets for Ordered Weighted Clustering [Web] [Slides]
Improving Neural Language Modeling via Adversarial Training [Web] [Slides]
Fast Algorithm for Generalized Multinomial Models with Ranking Data [Web] [Slides]
Fairness without Harm: Decoupled Classifiers with Preference Guarantees [Web] [Slides]
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks [Web]
Robust Inference via Generative Classifiers for Handling Noisy Labels [Web] [Slides]
Insertion Transformer: Flexible Sequence Generation via Insertion Operations [Web]
Robust Estimation of Tree Structured Gaussian Graphical Models [Web] [Slides]
Anytime Online-to-Batch, Optimism and Acceleration [Web] [Slides]
Fair k-Center Clustering for Data Summarization [Web]
Mixture Models for Diverse Machine Translation: Tricks of the Trade [Web]
Graph Resistance and Learning from Pairwise Comparisons [Web] [Slides]
Differentially Private Fair Learning [Web]
Approximation and non-parametric estimation of ResNet-type convolutional neural networks [Web] [Slides]
LIT: Learned Intermediate Representation Training for Model Compression [Web] [Slides]
Empirical Analysis of Beam Search Performance Degradation in Neural Sequence Models [Web] [Slides]
Spectral Approximate Inference [Web] [Slides]
Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints [Web] [Slides]
A Better k-means++ Algorithm via Local Search [Web] [Slides]
MASS: Masked Sequence to Sequence Pre-training for Language Generation [Web] [Slides]
Learning Context-dependent Label Permutations for Multi-label Classification [Web] [Slides]
Obtaining Fairness using Optimal Transport Theory [Web] [Slides]
Global Convergence of Block Coordinate Descent in Deep Learning [Web] [Slides]
Analyzing and Improving Representations with the Soft Nearest Neighbor Loss [Web] [Slides]
Trainable Decoding of Sets of Sequences for Neural Sequence Models [Web] [Slides]
Partially Linear Additive Gaussian Graphical Models [Web] [Slides]
Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning [Web] [Slides]
Kernel Normalized Cut: a Theoretical Revisit [Web] [Slides]
Humor in Word Embeddings: Cockamamie Gobbledegook for Nincompoops [Web] [Slides]
Discovering Context Effects from Raw Choice Data [Web] [Slides]
Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions [Web] [Slides]
Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians [Web] [Slides]
What is the Effect of Importance Weighting in Deep Learning? [Web] [Slides]
Learning to Generalize from Sparse and Underspecified Rewards [Web] [Slides]
DAG-GNN: DAG Structure Learning with Graph Neural Networks [Web] [Slides]
Adaptive Sensor Placement for Continuous Spaces [Web] [Slides]
Guarantees for Spectral Clustering with Fairness Constraints [Web] [Slides]
MeanSum: A Neural Model for Unsupervised Multi-Document Abstractive Summarization [Web] [Slides]
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference [Web] [Slides]
On the Long-term Impact of Algorithmic Decision Policies: Effort Unfairness and Feature Segregation through Social Learning [Web] [Slides]
On the Limitations of Representing Functions on Sets [Web] [Slides]
Similarity of Neural Network Representations Revisited [Web] [Slides]
Efficient Training of BERT by Progressively Stacking [Web] [Slides]
Random Walks on Hypergraphs with Edge-Dependent Vertex Weights [Web] [Slides]
Scale-free adaptive planning for deterministic dynamics & discounted rewards [Web] [Slides]
Supervised Hierarchical Clustering with Exponential Linkage [Web] [Slides]
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network [Web] [Slides]
Learning Distance for Sequences by Learning a Ground Metric [Web] [Slides]
Making Decisions that Reduce Discriminatory Impacts [Web] [Slides]
What 4 year olds can do and AI can’t (yet) [Web]
Best Paper [Web]
Probabilistic Neural Symbolic Models for Interpretable Visual Question Answering [Web]
Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations [Web]
Decentralized Exploration in Multi-Armed Bandits [Web] [Slides]
Communication-Constrained Inference and the Role of Shared Randomness [Web] [Slides]
COMIC: Multi-view Clustering Without Parameter Selection [Web]
Submodular Maximization beyond Non-negativity: Guarantees, Fast Algorithms, and Applications [Web]
Nonparametric Bayesian Deep Networks with Local Competition [Web] [Slides]
Distributed, Egocentric Representations of Graphs for Detecting Critical Structures [Web] [Slides]
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback [Web] [Slides]
Learning and Data Selection in Big Datasets [Web] [Slides]
The Wasserstein Transform [Web] [Slides]
Online Algorithms for Rent-Or-Buy with Expert Advice [Web] [Slides]
Good Initializations of Variational Bayes for Deep Models [Web] [Slides]
Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities [Web] [Slides]
Exploiting structure of uncertainty for efficient matroid semi-bandits [Web] [Slides]
Sublinear quantum algorithms for training linear and kernel-based classifiers [Web] [Slides]
Sequential Facility Location: Approximate Submodularity and Greedy Algorithm [Web] [Slides]
Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity [Web] [Slides]
Dropout as a Structured Shrinkage Prior [Web] [Slides]
Multi-Object Representation Learning with Iterative Variational Inference [Web] [Slides]
PAC Identification of Many Good Arms in Stochastic Multi-Armed Bandits [Web] [Slides]
Agnostic Federated Learning [Web] [Slides]
Neural Collaborative Subspace Clustering [Web] [Slides]
Categorical Feature Compression via Submodular Optimization [Web] [Slides]
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables [Web] [Slides]
Cross-Domain 3D Equivariant Image Embeddings [Web] [Slides]
Contextual Multi-armed Bandit Algorithm for Semiparametric Reward Model [Web] [Slides]
Discovering Conditionally Salient Features with Statistical Guarantees [Web] [Slides]
Unsupervised Deep Learning by Neighbourhood Discovery [Web] [Slides]
Multi-Frequency Phase Synchronization [Web] [Slides]
On Variational Bounds of Mutual Information [Web]
Loss Landscapes of Regularized Linear Autoencoders [Web] [Slides]
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning [Web]
A Theoretical Analysis of Contrastive Unsupervised Representation Learning [Web]
Autoregressive Energy Machines [Web]
Faster Algorithms for Binary Matrix Factorization [Web]
Partially Exchangeable Networks and Architectures for Learning Summary Statistics in Approximate Bayesian Computation [Web] [Slides]
Hyperbolic Disk Embeddings for Directed Acyclic Graphs [Web] [Slides]
TarMAC: Targeted Multi-Agent Communication [Web] [Slides]
The information-theoretic value of unlabeled data in semi-supervised learning [Web] [Slides]
Greedy Orthogonal Pivoting Algorithm for Non-Negative Matrix Factorization [Web] [Slides]
Tractable n-Metrics for Multiple Graphs [Web] [Slides]
Hierarchical Importance Weighted Autoencoders [Web] [Slides]
LatentGNN: Learning Efficient Non-local Relations for Visual Recognition [Web] [Slides]
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning [Web] [Slides]
Unsupervised Label Noise Modeling and Loss Correction [Web] [Slides]
Noise2Self: Blind Denoising by Self-Supervision [Web] [Slides]
Guided evolutionary strategies: augmenting random search with surrogate gradients [Web] [Slides]
Faster Attend-Infer-Repeat with Tractable Probabilistic Models [Web] [Slides]
Robustly Disentangled Causal Mechanisms: Validating Deep Representations for Interventional Robustness [Web] [Slides]
Actor-Attention-Critic for Multi-Agent Reinforcement Learning [Web] [Slides]
Domain Adaptation with Asymmetrically-Relaxed Distribution Alignment [Web] [Slides]
Learning Dependency Structures for Weak Supervision Models [Web] [Slides]
Adaptive and Safe Bayesian Optimization in High Dimensions via One-Dimensional Subspaces [Web] [Slides]
Understanding Priors in Bayesian Neural Networks at the Unit Level [Web] [Slides]
Lorentzian Distance Learning for Hyperbolic Representations [Web] [Slides]
Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation on Multi-Agent Reinforcement Learning [Web] [Slides]
Pareto Optimal Streaming Unsupervised Classification [Web] [Slides]
Geometry and Symmetry in Short-and-Sparse Deconvolution [Web] [Slides]
Semi-Cyclic Stochastic Gradient Descent [Web] [Slides]
Posters Thu [Web]
