Data locality on TensorFlow.

daloflow:
DAta LOcality on tensorFLOW


Getting daloflow and initial setup:

  1. Clone from GitHub and initialize for CPU (a GPU option is also available):
 git clone https://github.com/saulam/daloflow.git
 cd daloflow
 chmod +x ./daloflow.sh
 ./daloflow.sh init cpu
  2. If Docker and docker-compose are not installed, install the prerequisites:
 ./daloflow.sh prerequisites
  3. Build the Docker image:
 ./daloflow.sh build

Typical daloflow work session:

A new single-node work session:

./daloflow.sh start 4
./daloflow.sh mpirun 2 "python3 ./do_task.py"
: ...
./daloflow.sh stop

For example, "./daloflow.sh start 4" spins up four containers on one node, the current one (NC=4). Then do_task.py is executed with two processes (NP=2), so only two of the containers are used.
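
The relation between NC and NP can be sketched in a few lines of Python. This is only an illustration, not part of daloflow: the function names are hypothetical, and it assumes one process per container, assigned to the lowest container ids first (ids run from 1 up to NC). What happens when NP is larger than NC is not described here, so the sketch simply refuses that case.

```python
from typing import List


def used_containers(nc: int, np_: int) -> List[int]:
    """Container ids (1..NC) that receive a process.

    Assumption: one process per container, lowest ids first.
    """
    if nc < 1 or np_ < 1:
        raise ValueError("NC and NP must be at least 1")
    if np_ > nc:
        # Behavior for NP > NC is not documented, so this sketch does not guess.
        raise ValueError("NP > NC is not covered by this sketch")
    return list(range(1, np_ + 1))


def idle_containers(nc: int, np_: int) -> List[int]:
    """Container ids that are started but receive no process."""
    busy = set(used_containers(nc, np_))
    return [c for c in range(1, nc + 1) if c not in busy]


print(used_containers(4, 2))  # [1, 2]
print(idle_containers(4, 2))  # [3, 4]
```

With NC=4 and NP=2, two containers run do_task.py and the other two stay idle until "./daloflow.sh stop".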

A new work session using several nodes:

./daloflow.sh swarm-start 4
./daloflow.sh mpirun 2 "python3 ./do_task.py"
: ...
./daloflow.sh stop

For example, "./daloflow.sh swarm-start 4" spins up containers on four nodes (NC=4, one container per node). Then do_task.py is executed with two processes (NP=2) on the first two nodes.
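
Both kinds of session follow the same start / mpirun / stop pattern, so they can be scripted. The following is a minimal sketch, not part of daloflow: session_commands and run_session are hypothetical helper names, and only the daloflow.sh subcommands shown above are used. The try/finally makes sure "stop" is issued even if the task fails.

```python
import shlex
import subprocess
from typing import Callable, List, Optional, Sequence

Runner = Callable[[Sequence[str]], object]


def session_commands(nc: int, np_: int, task: str, swarm: bool = False) -> List[List[str]]:
    """Build the start, mpirun and stop commands for one work session."""
    if nc < 1 or np_ < 1:
        raise ValueError("NC and NP must be at least 1")
    start = "swarm-start" if swarm else "start"
    return [
        ["./daloflow.sh", start, str(nc)],
        ["./daloflow.sh", "mpirun", str(np_), task],
        ["./daloflow.sh", "stop"],
    ]


def run_session(nc: int, np_: int, task: str, swarm: bool = False,
                runner: Optional[Runner] = None) -> None:
    """Run a session; the containers are stopped even if the task fails."""
    if runner is None:
        # Default: really execute the commands from the daloflow directory.
        runner = lambda argv: subprocess.run(argv, check=True)
    start, work, stop = session_commands(nc, np_, task, swarm)
    runner(start)
    try:
        runner(work)
    finally:
        runner(stop)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in session_commands(4, 2, "python3 ./do_task.py", swarm=True):
        print(shlex.join(argv))
```

Calling run_session(4, 2, "python3 ./do_task.py") from the daloflow directory would execute the single-node session; passing swarm=True switches the first command to swarm-start.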

Some additional options for debugging:

  1. Build a random dataset of 1,000,000 training images (plus 1,000 test images) of 32x32 pixels:
python3 mk_dataset.py --height 32 --width 32 --ntrain 1000000 --ntest 1000
  2. To get the status of the running containers:
 ./daloflow.sh status
  3. To open a bash shell in a container:
 ./daloflow.sh bash <container id: from 1 up to NC>
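
Before building a dataset of that size, it can help to estimate how much raw pixel data it holds. The sketch below is plain arithmetic and does not reflect how mk_dataset.py actually stores its output: the function name is hypothetical, and the defaults of one channel and one byte per value are assumptions that should be adjusted to the real format.

```python
def raw_dataset_bytes(n_images: int, height: int, width: int,
                      channels: int = 1, bytes_per_value: int = 1) -> int:
    """Uncompressed pixel payload in bytes (no file or container overhead)."""
    # channels and bytes_per_value are assumed defaults, not mk_dataset.py facts.
    return n_images * height * width * channels * bytes_per_value


train = raw_dataset_bytes(1_000_000, 32, 32)
test = raw_dataset_bytes(1_000, 32, 32)
print(f"train: {train / 1e9:.3f} GB, test: {test / 1e6:.3f} MB")
# train: 1.024 GB, test: 1.024 MB
```

With three channels, or with 4-byte floating point values, the estimate grows by the corresponding factor.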

Authors

  • 🧑‍💻 Saúl Alonso Monsalve
  • 🧑‍💻 Félix García-Carballeira
  • 🧑‍💻 José Rivadeneira López-Bravo
  • 🧑‍💻 Alejandro Calderón Mateos