The Surreal 70 Samples (S3)

Introduction

This repository is a collection of 70 short summaries of Deep Learning papers. The summaries are based on papers that explore novel, groundbreaking ideas or are rich in theoretical concepts. A new summary was typically added each week, exploring the essential aspects of the work, its technical innovations, and the new questions and ideas it raises. Each summary is one page long and follows a fixed set of guidelines, which are given below.

Note: At the time of writing each sample, I had no relation to the authors or their respective organizations. Opinions and claims expressed in the text are solely my own.

Available Paper Summaries

| Summary Number | Paper Title | Author List | Summary Link |
| --- | --- | --- | --- |
| 1 | Action and Perception as Divergence Minimization | Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess | link |
| 2 | Momentum Contrast for Unsupervised Visual Representation Learning | Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross Girshick | link |
| 3 | When to use parametric models in reinforcement learning? | Hado van Hasselt, Matteo Hessel, John Aslanides | link |
| 4 | Data-Efficient Image Recognition With Contrastive Predictive Coding | Olivier J. Henaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord | link |
| 5 | A Learning Algorithm for Boltzmann Machines | David H. Ackley, Geoffrey E. Hinton, Terrence J. Sejnowski | link |
| 6 | Dependence Measures Bounding the Exploration Bias for General Measurements | Jiantao Jiao, Yanjun Han, Tsachy Weissman | link |
| 7 | On Variational Bounds of Mutual Information | Ben Poole, Sherjil Ozair, Aaron van den Oord, Alexander A. Alemi, George Tucker | link |
| 8 | Hindsight Credit Assignment | Anna Harutyunyan, Will Dabney, Thomas Mesnard, Nicolas Heess, Mohammad G. Azar, Bilal Piot, Hado van Hasselt, Satinder Singh, Greg Wayne, Doina Precup, Rémi Munos | link |
| 9 | Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion | Xingye Da, Zhaoming Xie, David Hoeller, Byron Boots, Animashree Anandkumar, Yuke Zhu, Buck Babich, Animesh Garg | link |
| 10 | Counterfactual Data Augmentation using Locally Factored Dynamics | Silviu Pitis, Elliot Creager, Animesh Garg | link |
| 11 | LEAF: Latent Exploration Along the Frontier | Homanga Bharadhwaj, Animesh Garg, Florian Shkurti | link |
| 12 | Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings | Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman | link |
| 13 | Model-Based Reinforcement Learning with Value-Targeted Regression | Zeyu Jia, Lin F. Yang, Csaba Szepesvari, Mengdi Wang | link |
| 14 | Skill Transfer Via Partially Amortized Hierarchical Planning | Kevin Xie, Homanga Bharadhwaj, Danijar Hafner, Animesh Garg, Florian Shkurti | link |
| 15 | Evaluating Agents Without Rewards | Brendon Matusch, Jimmy Ba, Danijar Hafner | link |
| 16 | Conservative Safety Critics For Exploration | Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg | link |
| 17 | Mastering Atari with Discrete World Models | Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba | link |
| 18 | Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning | Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson | link |
| 19 | Continual Model-Based Reinforcement Learning with Hypernetworks | Yizhou Huang, Kevin Xie, Homanga Bharadhwaj, Florian Shkurti | link |
| 20 | Model-Based Inverse Reinforcement Learning from Visual Demonstrations | Neha Das, Sarah Bechtle, Todor Davchev, Dinesh Jayaraman, Akshara Rai, Franziska Meier | link |
| 21 | RODE: Learning Roles To Decompose Multi-Agent Tasks | Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang | link |
| 22 | The Act Of Remembering: A Study In Partially Observable Reinforcement Learning | Rodrigo Toro Icarte, Richard Valenzano, Toryn Q. Klassen, Phillip Christoffersen, Amir-massoud Farahmand, Sheila A. McIlraith | link |
| 23 | An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality | Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba | link |
| 24 | MAVEN: Multi-Agent Variational Exploration | Anuj Mahajan, Tabish Rashid, Mikayel Samvelyan, Shimon Whiteson | link |
| 25 | Visual Imitation Made Easy | Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto | link |
| 26 | “Other-Play” for Zero-Shot Coordination | Hengyuan Hu, Adam Lerer, Alex Peysakhovich, Jakob Foerster | link |
| 27 | Using Fast Weights to Attend to the Recent Past | Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu | link |
| 28 | γ-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction | Michael Janner, Igor Mordatch, Sergey Levine | link |
| 29 | Learning to Communicate with Deep Multi-Agent Reinforcement Learning | Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson | link |
| 30 | Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning | Jakob N. Foerster, H. Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew M. Botvinick, Michael Bowling | link |
| 31 | Improved Variational Inference with Inverse Autoregressive Flow | Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling | link |
| 32 | Deep Reinforcement Learning from Self-Play in Imperfect-Information Games | Johannes Heinrich, David Silver | link |
| 33 | Variational Policy Gradient Method for Reinforcement Learning with General Utilities | Junyu Zhang, Alec Koppel, Amrit Singh Bedi, Csaba Szepesvari, Mengdi Wang | link |
| 34 | The Value-Improvement Path: Towards Better Representations for Reinforcement Learning | Will Dabney, Andre Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver | link |
| 35 | Fictitious Self-Play in Extensive-Form Games | Johannes Heinrich, Marc Lanctot, David Silver | link |
| 36 | Expected Eligibility Traces | Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa | link |
| 37 | Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement | Andre Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel Mankowitz, Augustin Žídek, Remi Munos | link |
| 38 | A Theoretical and Empirical Analysis of Expected Sarsa | Harm van Seijen, Hado van Hasselt, Shimon Whiteson, Marco Wiering | link |
| 39 | Learning to Play No-Press Diplomacy with Best Response Policy Iteration | Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, Yoram Bachrach | link |
| 40 | Neural Dynamic Policies for End-to-End Sensorimotor Learning | Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak | link |
| 41 | Learning By Cheating | Dian Chen, Brady Zhou, Vladlen Koltun, Philipp Krähenbühl | link |
| 42 | Your Classifier Is Secretly an Energy-Based Model and You Should Treat It Like One | Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Kevin Swersky, Mohammad Norouzi | link |
| 43 | No MCMC for me: Amortized sampling for fast and stable training of energy-based models | Will Grathwohl, Jacob Kelly, Milad Hashemi, David Duvenaud, Mohammad Norouzi, Kevin Swersky | link |
| 44 | Implicit Autoencoders | Alireza Makhzani | link |
| 45 | Neural Ordinary Differential Equations | Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud | link |
| 46 | Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations | Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein | link |
| 47 | PixelGAN Autoencoders | Alireza Makhzani, Brendan Frey | link |
| 48 | Density Estimation using Real NVP | Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio | link |
| 49 | NICE: Non-linear Independent Components Estimation | Laurent Dinh, David Krueger, Yoshua Bengio | link |
| 50 | Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting | Yuan Yin, Vincent Le Guen, Jérémie Dona, Emmanuel de Bezenac, Ibrahim Ayed, Nicolas Thome, Patrick Gallinari | link |
| 51 | Conservative Q-Learning for Offline Reinforcement Learning | Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine | link |
| 52 | Perceiver: General Perception with Iterative Attention | Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, Joao Carreira | link |
| 53 | On the mapping between Hopfield networks and Restricted Boltzmann Machines | Matthew Smart, Anton Zilman | link |
| 54 | Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients | Brenden K. Petersen, Mikel Landajuela Larma, Terrell N. Mundhenk, Claudio Prata Santiago, Soo Kyung Kim, Joanne Taery Kim | link |
| 55 | Training Products of Experts by Minimizing Contrastive Divergence | Geoffrey E. Hinton | link |
| 56 | Linear Transformers Are Secretly Fast Weight Memory Systems | Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber | link |
| 57 | Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images | Rewon Child | link |
| 58 | EigenGame: PCA as a Nash Equilibrium | Ian Gemp, Brian McWilliams, Claire Vernade, Thore Graepel | link |
| 59 | Disentangled Recurrent Wasserstein Autoencoder | Jun Han, Martin Renqiang Min, Ligong Han, Li Erran Li, Xuan Zhang | link |
| 60 | VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models | Zhisheng Xiao, Karsten Kreis, Jan Kautz, Arash Vahdat | link |
| 61 | Learning Energy-Based Models by Diffusion Recovery Likelihood | Ruiqi Gao, Yang Song, Ben Poole, Ying Nian Wu, Diederik P. Kingma | link |
| 62 | Improved Contrastive Divergence Training of Energy-Based Models | Yilun Du, Shuang Li, Joshua Tenenbaum, Igor Mordatch | link |
| 63 | Conjugate Energy-Based Models | Hao Wu, Babak Esmaeili, Michael L. Wick, Jean-Baptiste Tristan, Jan-Willem van de Meent | link |
| 64 | Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders | Mangal Prakash, Alexander Krull, Florian Jug | link |
| 65 | Bayesian Quadrature on Riemannian Data Manifolds | Christian Fröhlich, Alexandra Gessner, Philipp Hennig, Bernhard Schölkopf, Georgios Arvanitidis | link |
| 66 | High-Dimensional Bayesian Optimization via Nested Riemannian Manifolds | Noémie Jaquier, Leonel Rozo | link |
| 67 | Risk-Averse Offline Reinforcement Learning | Núria Armengol Urpí, Sebastian Curi, Andreas Krause | link |
| 68 | Provably Good Batch Reinforcement Learning Without Great Exploration | Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill | link |
| 69 | LILA: Language-Informed Latent Actions | Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh | link |
| 70 | Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning | Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Animashree Anandkumar | link |

Summary Guidelines

This section outlines the guidelines used to write summaries for this repository. These guidelines must be strictly followed to provide high-quality summaries to the reader.

Paper Selection

Since Deep Learning is a fast-moving field, papers can span a broad variety of topics. To narrow the range of literature, papers must be selected using the following rules:

  • Review papers, surveys, and long articles should not be considered, since these are themselves reviews of previous works, and reviewing them would defeat the purpose of the exercise.
  • Short papers, journal papers, and theoretical works are suitable, as these present a single idea which may be of interest to the reader.
  • Papers that balance experimental results with theory are encouraged, as these validate the practical applications of the proposed methods.
  • There is no constraint on the publication date of a paper. In fact, papers dating back to the 'AI Winter' are highly encouraged, since these provide insights that went unexamined for a very long time.
  • Papers that simply apply pre-existing methods to new regimes are not preferred, since they provide no new insight into the algorithm itself and deal only with its applicability.
  • Workshop papers and incomplete works are also welcome, as researchers can always build on these ideas.
  • While there is no specific restriction on the content of papers, the writer should consider fields in AI that are growing and require a new outlook, e.g. Reinforcement Learning, gradient-free methods, Meta-Learning, Natural Language at Scale, Explainability, etc.

Introduction

This section provides points on writing a good introduction for the summary. Note that these points are not strict and only serve as guiding principles for drafting a good introduction.

  • The introduction should convey the high-level idea of the paper and what the work deals with.
  • The writer must refrain from going into any technical detail and should instead highlight the broad idea of the work and its scope.
  • No mathematical terms, definitions, technical explanations or algorithmic details should be presented to the reader.
  • The main focus should be on the problem statement and how the method aims to solve it.
  • Key takeaway: Detail is your enemy!

Methodology

This section deals with the proposed method and its essential aspects in solving the problem highlighted in the introduction. The following points serve as a guide to writing this section:

  • The content must open with a brief overview of the method. This informs the reader about what they are getting into.
  • Once an overview has been provided, the draft can start diving into the details, which should be highlighted intuitively.
  • Mathematical details must be accompanied by plain words, and complicated terminology must be explained intuitively using examples or instances from the work.
  • The reasons behind technical details and their usage must be provided to the reader. The whole point of reading a summary is to crisply go over the details of the paper without having to read the entire text.
  • While the draft should highlight the method and its details, it should also provide the reader with insights from the writer's point of view. These could include specific reasons for selecting a set of parameter values, usage of a specific technique existing in the literature, novel contributions and the reasons behind them, and any improvements/changes from previous works.
  • Key takeaway: Intuition is king!

Critical Analysis

The critical analysis section deals with the writer's analysis and understanding of the method. Specifically, this section should highlight what the writer thinks about the proposed approach: its strengths, weaknesses, and potential areas of improvement and applicability. This is the most important section of the summary, since the writer needs to critically evaluate and comment on the technical aspects of the work. The following points should be kept in mind while writing a good critical analysis:

  • One should evaluate all aspects of the work before diving into this section. A comprehensive analysis of the work is essential, as it builds on the writer's understanding of the method.
  • Once the complete details have been established, a proper structure for the evaluation should be constructed. Typically, this structure should consist of comments on motivation, strengths, weaknesses, novel contributions, applicability of the method, and extensions and improvements of the method's components and their shortcomings.
  • The draft should be written in a critical yet formal manner. For instance, the writer must highlight the intriguing aspects of the algorithm and, at the same time, shed light on its weaker parts in a decorous manner.
  • The writer's comments should be their own and should not focus on explaining the work. The main idea behind this section is to present your understanding of the text to the reader, not the text itself.
  • Key takeaway: Your own contributions matter!

New Ideas/Questions

This section follows and builds upon the critical analysis. The main idea behind this section is to highlight and examine the novel contributions of the proposed method and how they improve upon prior techniques. Another reason to study the new ideas proposed by a paper is to identify potential directions of research and answer open questions from different perspectives. The following points may come in handy while writing this section:

  • One should concisely summarize the novel contributions of the work (in no more than 1-2 sentences).
  • This summary should be followed by a critical analysis or a set of possible questions which may be asked about the novel aspects of the method.
  • Since the focus is on novelty, the writer may also be interested in asking questions about other potential techniques and their applicability in place of the novel method.
  • The section should conclude with brief comments on substitutes for, or extensions to, the novel components and any questions left unanswered in the work.
  • Key takeaway: Questions, Questions, Questions!

Conclusions

Lastly, the conclusions section should clearly and concisely sum up the summary of the work. It should consist of a crisp recap of all the previous sections, highlighting only their most important aspects. The content of this section is directed at someone who does not have enough time to read the full summary and only wishes to grab the essential points. A good conclusion can be written using the following points:

  • The section should start with a high-level introductory note on the method, its usage, and its contributions (no more than 1-2 sentences).
  • It should explore the novelty of the method, its implications, and the resulting outcome from the writer's point of view.
  • It should walk over the key components of the writer's evaluation and their comments on the method.
  • It should conclude with a brief note on the new ideas/questions introduced by the work and potential directions for future work.
  • Key takeaway: Just be done already!
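
For concreteness, the guidelines above map naturally onto a one-page document skeleton. Below is a minimal LaTeX sketch of such a template, assuming (as the repository's use of TeX suggests) that summaries are typeset in LaTeX; the document class, geometry settings, and placeholder names are illustrative assumptions rather than part of the guidelines.

```latex
\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry} % tight margins help keep the summary to one page

\title{Summary: (Paper Title)}
\author{(Writer)}
\date{}

\begin{document}
\maketitle

\section*{Introduction}
% High-level idea of the paper and its scope; no technical detail here.

\section*{Methodology}
% Brief overview first, then intuitive detail; explain the math in words.

\section*{Critical Analysis}
% The writer's own take: strengths, weaknesses, applicability, improvements.

\section*{New Ideas/Questions}
% 1--2 sentence recap of the novel contributions, followed by open questions.

\section*{Conclusions}
% Crisp recap of the sections above for the reader in a hurry.

\end{document}
```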