/maddness-old

Code for ICML2020 submission

This page describes how to reproduce the experimental results reported in our paper.

Note that this page and the clean, easy-to-use version of our code are still under construction; see https://smarturl.it/Maddness for the latest version.

Install Dependencies

To run the experiments, you will first need to obtain the following tools / libraries and datasets.

C++ Code

  • Xcode, to run the C++ timing benchmarks (this is the "official" build path: it is much better tested and is the one that actually works at the moment)
  • Bazel, Google's open-source build system (support coming soon...)

Python Code

  • Joblib - for caching function output
  • Scikit-learn - for k-means
  • Kmc2 - for k-means seeding
  • Pandas - for storing results and reading in data
  • Seaborn - for plotting, if you want to reproduce our figures
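
Before running the experiments, it can help to confirm that the Python packages listed above are importable. The snippet below is a minimal sketch, not part of the repository; the import names (joblib, sklearn, kmc2, pandas, seaborn) are the usual ones for these packages and are assumptions here.

```python
import importlib.util

# Display name -> import name. The import names are assumptions based on
# how these packages are usually imported; adjust if your install differs.
REQUIRED = {
    "Joblib": "joblib",
    "Scikit-learn": "sklearn",
    "Kmc2": "kmc2",
    "Pandas": "pandas",
    "Seaborn": "seaborn",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All Python dependencies found.")
```

Seaborn is only needed if you want to reproduce the figures, so a missing Seaborn does not block the accuracy experiments.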

Datasets

The activations and weights from the CIFAR-10 and CIFAR-100 datasets are already included under python/assets.

View Existing Results

All results are in python/results/amm. The timing results are in the subdirectory timing.

Reproduce Timing / Throughput results

The C++ benchmarks use the Catch test framework and are run via Xcode. You can just open Bolt.xcodeproj (Maddness was built as a fork of Bolt) and run it with the appropriate arguments. For different experiments, the arguments are:

  • f() speed for various methods: [scan][amm]~[old]
  • g() speed for various methods: [encode][amm]~[old]
  • h() speed for various methods (not reported in the paper, but interesting): [lut][amm]~[old]
  • Overall AMM speed: [matmul][amm]~[old]
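
Each argument above is a Catch tag filter: adjacent tags such as [scan][amm] must all be present on a test, and a tag prefixed with ~ excludes any test that carries it. The Python sketch below models only that subset of the filter syntax to show which tests a filter selects. It is an illustration, not Catch's implementation, and the tag sets in the usage lines are hypothetical.

```python
import re


def matches(filter_expr, test_tags):
    """Simplified model of a Catch tag filter.

    Handles only adjacent tags (all required) and ~[tag] exclusions;
    commas, wildcards, and test-name patterns are not modeled.
    """
    required, excluded = set(), set()
    for negated, tag in re.findall(r"(~?)\[([^\]]+)\]", filter_expr):
        if negated:
            excluded.add(tag)
        else:
            required.add(tag)
    return required <= set(test_tags) and not (excluded & set(test_tags))


# Hypothetical tag sets, for illustration only.
print(matches("[scan][amm]~[old]", {"scan", "amm"}))         # selected
print(matches("[scan][amm]~[old]", {"scan", "amm", "old"}))  # excluded by ~[old]
print(matches("[scan][amm]~[old]", {"encode", "amm"}))       # lacks the scan tag
```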

We highly recommend running these benchmarks when the machine is otherwise idle. Also note that the C++ code does not yet write its results to the appropriate files automatically, so you'll have to manually paste the output into the corresponding file in python/results/amm/timing.

Coming soon: Working Bazel build for all the code and wrapper shell scripts to run and store the output of each experiment.

Reproduce Accuracy Results

From the python directory, run python -m python.amm_main. This will run all the methods we showed in the body of the paper (and some others that run quickly) on CIFAR-10, CIFAR-100, Caltech 101 (using both the Sobel and Gaussian filters), and the datasets from the UCR Time Series Archive.

Reproduce Plots

From the python directory, run python -m python.amm_figs2. You can uncomment different lines in main to only produce subsets of the plots.

Other notes

Our method is called Mithral in the source code, not Maddness.