tfaip is a Python-based research framework for developing, organizing, and deploying Deep Learning models powered by Tensorflow. It enables the implementation of both simple and complex scenarios that are structured and highly configurable through parameters, which can be modified directly from the command line (read the docs). For example, the tutorial.full scenario for learning MNIST allows modifying the graph used for training as well as other hyper-parameters such as the optimizer:
export PYTHONPATH=$PWD # set the PYTHONPATH so that the examples dir is found
# Change the graph
tfaip-train examples.tutorial.full --model.graph MLP --model.graph.nodes 200 100 50 --model.graph.activation relu
tfaip-train examples.tutorial.full --model.graph MLP --model.graph.nodes 200 100 50 --model.graph.activation tanh
tfaip-train examples.tutorial.full --model.graph CNN --model.graph.filters 40 20 --model.graph.dense 100
# Change the optimizer
tfaip-train examples.tutorial.full --trainer.optimizer RMSprop --trainer.optimizer.beta1 0.01 --trainer.optimizer.clip_global_norm 1
# ...
A trained model can then easily be integrated into a workflow to predict provided data:
predictor = TutorialScenario.create_predictor("PATH_TO_TRAINED_MODEL", PredictorParams())
for sample in predictor.predict(data):
    print(sample.outputs)
In practice, tfaip follows the rules of object orientation, i.e., the code for a scenario (e.g., image-classification (MNIST), text recognition, NLP, etc.) is organized by implementing classes.
By default, each Scenario must implement Model and Data.
See here for the complete code to run the MNIST example above, and see here for the minimal setup.
To set up tfaip, create a virtual Python environment (Python 3.7 or later) and install the tfaip pip package:
virtualenv -p python3 venv
source venv/bin/activate
pip install tfaip
pip install tfaip[devel] # to install additional development/test requirements
Have a look at the wiki for further setup instructions.
After the setup has succeeded, launch a training run of the tutorial, which is an implementation of the common MNIST scenario:
export PYTHONPATH=$PWD # set the PYTHONPATH so that the examples dir is found
tfaip-train examples.tutorial.full
# If you have a GPU, select it by specifying its ID
tfaip-train examples.tutorial.full --device.gpus 0
Start by reading the Minimum Tutorial; optionally, have a look at the Full Tutorial to see more features.
The docs provide a full description of tfaip.
To set up a new custom scenario, copy the general template and implement the abstract methods.
Consider renaming the classes!
Launch the training by providing the path or package name of the new scenario, which must be located in the PYTHONPATH!
tfaip provides different features which allow designing generic scenarios with maximum flexibility and high performance.
- Fully Object-Oriented: Implement classes and abstract functions or overwrite any function to extend, adapt, or modify its default functionality.
- Typing support: tfaip is fully typed, which simplifies working with an IDE (e.g., use PyCharm!).
- Parameter handling: Python's dataclasses module is used to set up parameters, which are automatically converted to command-line parameters by our paiargparse package.
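The mapping from dataclasses to command-line parameters can be illustrated with the standard library alone. The following is only a minimal sketch of the general idea and not how paiargparse is implemented: the OptimizerParams dataclass, its fields and defaults, and the parse_params helper are hypothetical names chosen for illustration.

```python
import argparse
from dataclasses import dataclass, fields


@dataclass
class OptimizerParams:
    # Hypothetical parameter set; names and defaults are illustrative only.
    name: str = "Adam"
    learning_rate: float = 0.001
    clip_global_norm: float = 0.0


def parse_params(argv, prefix, params_cls):
    """Expose every dataclass field as a --<prefix>.<field> option and parse argv."""
    parser = argparse.ArgumentParser()
    for f in fields(params_cls):
        # The annotated type doubles as the converter for the command-line string.
        parser.add_argument(f"--{prefix}.{f.name}", dest=f.name, type=f.type, default=f.default)
    namespace = parser.parse_args(argv)
    return params_cls(**vars(namespace))


# Fields that are not given on the command line keep their dataclass defaults.
params = parse_params(
    ["--optimizer.name", "RMSprop", "--optimizer.clip_global_norm", "1"],
    "optimizer",
    OptimizerParams,
)
print(params)
```

The benefit of this pattern is that a parameter is declared once, with its type and default, and is then available both in code and on the command line.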
Every scenario requires the setup of a data pipeline to read and transform data.
tfaip makes it easy to implement and modify even complex pipelines by defining multiple DataProcessors, each of which usually implements a small operation that maps an input sample to an output sample.
For example, one DataProcessor loads the data (input=filename, output=image), another one applies normalization rules, yet another one applies data augmentation, etc.
The great advantage of this setup is that the data processors run in Python and can automatically be parallelized by tfaip for a speed-up by setting run_parallel=True.
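The processor chain described above can be sketched in plain Python. This is an illustration of the concept under stated assumptions, not tfaip's DataProcessor API: the Processor base class, the Load and Normalize classes, and the process_all helper are hypothetical, and the loader fakes its pixel values instead of reading a file.

```python
from concurrent.futures import ProcessPoolExecutor


class Processor:
    """Hypothetical base class: maps one input sample to one output sample."""

    def apply(self, sample):
        raise NotImplementedError


class Load(Processor):
    def apply(self, sample):
        # Stand-in for reading an image: derive fake pixel values from the file name.
        return [ord(c) for c in sample]


class Normalize(Processor):
    def apply(self, sample):
        # Scale all values so that the largest one becomes 1.0.
        peak = max(sample)
        return [v / peak for v in sample]


PIPELINE = [Load(), Normalize()]


def run_pipeline(sample):
    """Pass one sample through every processor in order."""
    for processor in PIPELINE:
        sample = processor.apply(sample)
    return sample


def process_all(samples, run_parallel=False, workers=2):
    """Apply the pipeline to all samples, optionally in separate worker processes."""
    if not run_parallel:
        return [run_pipeline(s) for s in samples]
    # Worker processes each have their own interpreter and therefore their own GIL.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_pipeline, samples))
```

Calling process_all(files, run_parallel=True) from within an `if __name__ == "__main__":` guard distributes the samples across worker processes and returns the same results, in the same order, as the sequential call.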
Since tfaip is based on Tensorflow, the full API is available for designing models, graphs, and even data pipelines. Furthermore, tfaip supports additional common techniques for improving the performance of a Deep Learning model out of the box:
- Warm-starting (i.e., loading a pretrained model)
- EMA-weights
- Early-Stopping
- Weight-Decay
- various optimizers and learning-rate schedules
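Two of these techniques, EMA weights and early stopping, are simple enough to sketch without Tensorflow. The following plain-Python illustration shows the general ideas only and is not tfaip's implementation; the function names, the decay value, and the patience value are assumptions made for the example.

```python
def ema_update(shadow, weights, decay=0.99):
    """Return the exponential moving average of the weights after one training step."""
    return [decay * s + (1.0 - decay) * w for s, w in zip(shadow, weights)]


def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs. If that never happens, the index of
    the last epoch is returned.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1


# One EMA step with decay 0.5 moves the shadow weights halfway to the current ones.
shadow = ema_update([0.0, 1.0], [1.0, 1.0], decay=0.5)

# The loss stops improving after epoch 1, so training halts two epochs later.
stop = early_stopping_epoch([0.9, 0.7, 0.8, 0.75, 0.6], patience=2)
print(shadow, stop)
```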
We highly encourage users to contribute their own scenarios and improvements to tfaip. Please read the contribution guidelines.
All timings were obtained on an Intel Core i7, 10th Gen CPU.
The following table compares the MNIST Tutorial of Keras to the Minimum Tutorial. The Keras code was adapted to use the same network architecture and hyperparameter settings (batch size of 16, 10 epochs of training).
Code | Time Per Epoch | Train Acc | Val Acc | Best Val Acc |
---|---|---|---|---|
Keras | 16 s | 99.65% | 98.24% | 98.60% |
tfaip | 18 s | 99.76% | 98.66% | 98.66% |
tfaip and Keras reach comparable accuracies, as expected, since the actual code for training the graph is fundamentally identical. tfaip is, however, slightly slower (18 s versus 16 s per epoch) due to some overhead in the input pipeline and additional functionality (e.g., benchmarks or automatic tracking of the best model). This overhead is negligible for almost any real-world scenario: with a considerably larger network architecture, the computation times for inference and backpropagation become the bottleneck.
Integrating pure-Python operations (e.g., numpy) into a tf.data.Dataset to apply high-level preprocessing is slow by default, since tf.data.Dataset.map in combination with tf.py_function does not run in parallel because it is blocked by Python's GIL.
tfaip circumvents this issue by providing an (optional) parallelizable input pipeline.
The following table shows the time in seconds for two different tasks:
- PYTHON: applying some pure python functions on the data
- NUMPY: applying several numpy operations on the data
Mode | Task | Threads 1 | Threads 2 | Threads 4 | Threads 6 |
---|---|---|---|---|---|
tf.py_function | PYTHON | 23.47 | 22.78 | 24.38 | 25.76 |
tfaip | PYTHON | 26.68 | 14.48 | 8.11 | 8.13 |
tf.py_function | NUMPY | 104.10 | 82.78 | 76.33 | 77.56 |
tfaip | NUMPY | 97.07 | 56.93 | 43.78 | 42.73 |
The PYTHON task clearly shows that tf.data.Dataset.map is not able to utilize multiple threads.
The speed-up in the NUMPY task is possibly due to parallelization in the C code behind the numpy API.
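To make the scaling easier to read, the timings can be converted into speed-up factors relative to the single-thread run of the same mode. The short script below uses only the numbers from the table above; the speedups helper is a name introduced here for illustration.

```python
# Timings in seconds, copied from the table above (threads: 1, 2, 4, 6).
timings = {
    ("tf.py_function", "PYTHON"): [23.47, 22.78, 24.38, 25.76],
    ("tfaip", "PYTHON"): [26.68, 14.48, 8.11, 8.13],
    ("tf.py_function", "NUMPY"): [104.10, 82.78, 76.33, 77.56],
    ("tfaip", "NUMPY"): [97.07, 56.93, 43.78, 42.73],
}


def speedups(times):
    """Speed-up of each run relative to the single-thread run, rounded to two decimals."""
    return [round(times[0] / t, 2) for t in times]


for (mode, task), times in timings.items():
    print(f"{mode:15s} {task:7s} {speedups(times)}")
```

For the PYTHON task, tfaip reaches a speed-up of roughly 3.3 with four threads, whereas tf.py_function stays at about 1.0 regardless of the thread count. Neither mode gains noticeably when going from four to six threads.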