"Are We Learning Yet" Paper Summaries

This repository contains summaries of the papers surveyed in "Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning". The summaries and some associated metadata are available in machine-readable form in papers.yaml. The file summaries.pdf contains a human-readable copy of the summaries.
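
As a rough illustration of working with the machine-readable file, the sketch below loads papers.yaml with the third-party PyYAML package and reports only the shape of the top-level structure. The schema of papers.yaml is not described here, so the code deliberately assumes no field names; the helper names `load_summaries` and `describe` are hypothetical and not part of this repository.

```python
from pathlib import Path

import yaml  # third-party: pip install pyyaml


def load_summaries(path="papers.yaml"):
    """Parse the YAML file and return whatever top-level object it holds."""
    with Path(path).open(encoding="utf-8") as f:
        # safe_load builds only plain Python types (dict, list, str, ...).
        return yaml.safe_load(f)


def describe(data):
    """Return a short description of the top-level structure.

    The real layout of papers.yaml is an unknown here, so this only
    distinguishes a mapping, a list, or some other scalar type.
    """
    if isinstance(data, dict):
        return f"mapping with {len(data)} keys"
    if isinstance(data, list):
        return f"list with {len(data)} entries"
    return type(data).__name__
```

Calling `describe(load_summaries())` from the repository root is a reasonable first step before writing any code that depends on specific keys.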

Abstract

Many subfields of machine learning share a common stumbling block: evaluation. Advances in machine learning often evaporate under closer scrutiny or turn out to be less widely applicable than originally hoped. We conduct a meta-review of 107 survey papers from natural language processing, recommender systems, computer vision, reinforcement learning, computational biology, graph learning, and more, organizing the wide range of surprisingly consistent critique into a concrete taxonomy of observed failure modes. Inspired by measurement and evaluation theory, we divide failure modes into two categories: internal and external validity. Internal validity issues pertain to evaluation on a learning problem in isolation, such as improper comparisons to baselines or overfitting from test set re-use. External validity relies on relationships between different learning problems, for instance, whether progress on a learning problem translates to progress on seemingly related tasks.

Citation

You can use this BibTeX entry to cite the paper associated with this repository:

@article{liao2021are,
  title={Are We Learning Yet? A Meta Review of Evaluation Failures Across Machine Learning},
  author={Thomas Liao and Rohan Taori and Deborah Raji and Ludwig Schmidt},
  year={2021},
}
