This repository provides code to replicate the experiments in the paper Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions by Aaron Mishkin, Arda Sahiner, and Mert Pilanci.
The code requires Python 3.8 or newer.
Clone the repository using
git clone https://github.com/pilanci_lab/scnn_experiments.git
We provide a script for easy setup on Unix systems. Run the setup.sh file with
./setup.sh
This will:
- Create a virtual environment in .venv and install the project dependencies.
- Install scaffold in development mode. This library contains infrastructure for running our experiments.
- Create the data, figures, tables, and results directories.
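If setup.sh cannot be run (for example on a non-Unix system), the last step can be reproduced by hand. The following is a minimal sketch, assuming the four directories belong at the repository root; the helper name make_output_dirs is hypothetical and not part of the repository. It does not create the virtual environment or install scaffold.

```python
from pathlib import Path

# Directory names taken from the setup description above.
OUTPUT_DIRS = ("data", "figures", "tables", "results")


def make_output_dirs(root="."):
    """Create the output directories under `root` (assumed: the repository root)."""
    created = []
    for name in OUTPUT_DIRS:
        path = Path(root) / name
        # exist_ok=True makes the call safe to repeat on an existing checkout.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_output_dirs() from the repository root creates any directory that is missing and leaves existing ones untouched.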
After running setup.sh, you need to activate the virtualenv using
source .venv/bin/activate
The experiments are run via a command-line interface. Most experiments can be replicated with a single command, but some require a sequence of experiments to be executed in the correct order.
First, make sure that the virtual environment is active.
Running which python in bash will show you where the active Python binary is; this will point to a file in code/.venv/bin if the virtual environment is active.
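The same check can be made from inside Python, which also works in shells other than bash. This is a small sketch, assuming the environment was created with the standard venv module or a recent virtualenv; the function name in_virtualenv is illustrative.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter runs inside a venv-style environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line reports False, activate the environment and try again.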
Experiments are executed by calling scripts/run_experiment.py with the -E flag to specify an experiment name, as in the following:
python scripts/run_experiment.py -E "test"
There are several command-line arguments that can be passed to run_experiment.py, such as -V for verbose execution and -F to force re-runs.
Try
python scripts/run_experiment.py --help
to see all the available options.
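When several experiments have to be launched, it can help to assemble the command line programmatically. The sketch below uses only the -E, -V, and -F flags described above; the helper build_command is hypothetical and not part of the repository, and it assumes the command is run from the repository root with the virtual environment active.

```python
import shlex


def build_command(experiment, verbose=False, force=False):
    """Build the argument list for a single experiment run."""
    cmd = ["python", "scripts/run_experiment.py", "-E", experiment]
    if verbose:
        cmd.append("-V")  # verbose execution
    if force:
        cmd.append("-F")  # force re-runs
    return cmd


# Print the command instead of running it, so it can be inspected first.
print(shlex.join(build_command("test", verbose=True)))
```

The returned list can be passed to subprocess.run(cmd, check=True) to execute the experiment.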
Experiment configurations are located in scripts/exp_configs.
The configuration scripts/exp_configs/test.py is provided so you can familiarize yourself with the execution system.
Pre-defined sbatch configurations for slurm are provided in sbatch_scripts for transparency and convenience.
These are the exact configurations used to run the original experiments on the Sherlock Cluster.
A flag specifying which sbatch script to use can be passed to run_experiment.py.
That script will be used to submit slurm jobs if the -N parameter (which specifies the number of nodes) is also passed.
All experiments are named to correspond to the figure/table that they generate in the paper.
For example, Figure 1 can be generated by running the figure_1 experiment defined in scripts/exp_configs/figure_1.py and then using
python scripts/make_figure_1_6.py
to generate Figures 1 and 6. Check each experiment file to see the exact experiment names. As noted above, some experiments must be executed in a specific order:
- Table 2:
  - Run the experiments in table_2_gs.py first.
  - Run extract_table_2_best_params.py.
  - Run the experiments in table_2_final.py.
- Table 3:
  - Run the experiments in table_3_gs.py first. Note that table_3_nc_relu_gs must be run after table_3_relu_gs, since the latter experiment is used to determine the widths of the non-convex networks. You must update a file path used to load experiment results before running table_3_nc_relu_gs.
  - Run extract_table_3_best_params.py.
  - Run the experiments in table_3_final.py. Again, there are several file paths to update, and table_3_nc_relu_final must be run after table_3_relu_support.
- Figure 4:
  - Run the experiments in table_4.py. Again, you must run figure_4_relu before figure_4_nc_relu, and it is necessary to update a file path before running the latter.
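The ordering requirements above can be written down as data and checked before a batch of runs is launched. The following is an illustrative sketch, not part of the repository: the experiment names come from the notes above, while ORDER_CONSTRAINTS and order_violations are hypothetical helpers. It checks run order only and does not perform the manual file-path updates.

```python
# (earlier, later) pairs taken from the ordering notes above.
ORDER_CONSTRAINTS = [
    ("table_3_relu_gs", "table_3_nc_relu_gs"),
    ("table_3_relu_support", "table_3_nc_relu_final"),
    ("figure_4_relu", "figure_4_nc_relu"),
]


def order_violations(plan):
    """Return the (earlier, later) constraints that a planned run order breaks."""
    position = {name: i for i, name in enumerate(plan)}
    violations = []
    for earlier, later in ORDER_CONSTRAINTS:
        if later not in position:
            continue  # the dependent experiment is not planned; nothing to check
        # A prerequisite missing from the plan is also reported, because this
        # check cannot know whether it was already run in an earlier session.
        if earlier not in position or position[earlier] > position[later]:
            violations.append((earlier, later))
    return violations


plan = ["figure_4_nc_relu", "figure_4_relu"]
for earlier, later in order_violations(plan):
    print(f"run {earlier} before {later}")
```

An empty result means the plan respects every listed constraint.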
The remaining experiments are straightforward to replicate.
Please open an issue if you experience any bugs or have trouble replicating the experiments.