The quickest way to setup the Python environment is to use pipenv install
and then pipenv shell
.
Reproducing the benchmark results from the paper entails the following steps:
-
Download and unzip the EMNIST data sets into
./data/emnist-dataset
folder -
Train CNN on different data sets with
shuffled
,sorted
andblock-random
orderings, at different batch sizes:$ python classification-tests.py -d fashion $ python classification-tests.py -d digits $ python classification-tests.py -d letters $ python classification-tests.py -d byclass $ python classification-tests.py -d bymerge $ python classification-tests.py -d balanced $ python classification-tests.py -d mnist"
The outputs are stored in the
batch_size_study
directory. -
Plot the results:
$ python plot.py
Reproducing the channel flow results from the paper entails the following steps:
-
Generate the filtered data from the DNS:
$ python scaling.py
. This createsscaled.npy
in the data directory which has the filtered velocities, gradients and$\tau_ij$ terms. -
Generate the training and test data for the various runs:
$ python gen_data.py -o shuffled-1m -p shuffled -b 16 -n 1000000 $ python gen_data.py -o shuffled-16 -p shuffled -b 16 $ python gen_data.py -o block-16 -p block -b 16 $ python gen_data.py -o sorted-16 -p sorted -b 16
-
Perform hyperparameter sweeps using the shell script:
$ sh parameter_sweeps.sh 1 $ sh parameter_sweeps.sh 2 $ sh parameter_sweeps.sh 3 $ sh parameter_sweeps.sh 4
-
Plot comparisons of results:
$ python compare_runs.py -r runs
-
Train the models using different types of algorithms:
$ sh model_runs.sh
-
Plot a given model result:
$ python plot_run.py -r runs/${directory_name}
@article{Mohan19,
author = {P. Mohan, M. T. Henry de Frahan, R. King, and R. W. Grout},
title = {A block-random algorithm for learning on distributed, heterogeneous data},
journal = {arXiv:1903.00091},
year = {2019}
}