Supplementary code for our paper "End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training" (TMLR 2024).
This repository provides implementations of various layer-wise training methods and HSIC-based analyses.
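As background for the HSIC-based analyses, here is a minimal sketch of a normalized HSIC (nHSIC) estimator between two sets of features. The Gaussian kernel, bandwidth, and function names are illustrative assumptions, not necessarily what this repository uses.

```python
import torch

def gram_gaussian(x: torch.Tensor, sigma: float) -> torch.Tensor:
    # Gaussian (RBF) Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)),
    # for a feature matrix x of shape (n_samples, n_features).
    sq_dists = torch.cdist(x, x) ** 2
    return torch.exp(-sq_dists / (2 * sigma ** 2))

def nhsic(x: torch.Tensor, y: torch.Tensor, sigma: float = 5.0, eps: float = 1e-5) -> torch.Tensor:
    # Normalized HSIC: HSIC(X, Y) / sqrt(HSIC(X, X) * HSIC(Y, Y)),
    # estimated from centered Gram matrices (constant factors cancel).
    n = x.shape[0]
    h = torch.eye(n, dtype=x.dtype, device=x.device) - 1.0 / n  # centering matrix
    kx = h @ gram_gaussian(x, sigma) @ h
    ky = h @ gram_gaussian(y, sigma) @ h
    hsic_xy = (kx * ky).sum()
    hsic_xx = (kx * kx).sum()
    hsic_yy = (ky * ky).sum()
    return hsic_xy / (hsic_xx.sqrt() * hsic_yy.sqrt() + eps)
```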
Requirements:

- Python 3.10 or later
- PyTorch

You may get errors with Python 3.9 or earlier because the code relies on newer type-hinting syntax.
To install the requirements:

```bash
python -m venv venv
source venv/bin/activate

# Install requirements
pip install -r requirements.txt
```
To train a model in the same setting as the paper, run

```bash
python main_lw.py
```

You can specify the model and other hyperparameters by adding command-line overrides, e.g.,

```bash
python main_lw.py model.name=vgg11
```

or by modifying the configuration files under the conf directory.
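The `key.subkey=value` override syntax suggests a Hydra-style entry point. As a minimal sketch, assuming Hydra and OmegaConf are used (the actual entry point and config schema in main_lw.py may differ), the configuration would be consumed like this:

```python
# Sketch of a Hydra-style entry point; config_name and field names are
# illustrative assumptions, not necessarily this repository's schema.
import hydra
from omegaconf import DictConfig

@hydra.main(config_path="conf", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # A command-line override such as `model.name=vgg11` replaces the
    # corresponding value loaded from the YAML files under conf/.
    print(cfg.model.name)

if __name__ == "__main__":
    main()
```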
To add a new local loss, edit `src/models/layer_wise_loss.py`.
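A rough sketch of what such a local loss might look like is below; the class name and signature are hypothetical, so adapt them to the interface of the existing losses in `src/models/layer_wise_loss.py`.

```python
import torch
import torch.nn as nn

# Hypothetical local loss for layer-wise training: an auxiliary linear head
# maps a layer's features to class logits and applies cross-entropy.
class MyLocalLoss(nn.Module):
    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        # feature_dim is the flattened dimension of the layer's output.
        self.head = nn.Linear(feature_dim, num_classes)
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, features: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Flatten any spatial dimensions before the auxiliary head.
        logits = self.head(features.flatten(start_dim=1))
        return self.criterion(logits, targets)
```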
To perform layer-wise training in a sequential manner, run

```bash
python main_lw_seq.py
```

Note that this takes more time, as each layer is trained to completion before the next one starts. For details, please refer to Section B.1 of our paper.
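Conceptually, sequential layer-wise training looks like the following sketch. The names (`blocks`, `local_losses`, `loader`) are placeholders for illustration, not the repository's actual API.

```python
import torch

# Illustrative loop for sequential layer-wise training: each block is trained
# with its own local loss while all earlier blocks stay frozen.
def train_sequentially(blocks, local_losses, loader, epochs_per_block=10):
    for i, (block, loss_fn) in enumerate(zip(blocks, local_losses)):
        optimizer = torch.optim.SGD(block.parameters(), lr=0.1)
        for _ in range(epochs_per_block):
            for x, y in loader:
                with torch.no_grad():          # earlier blocks are frozen
                    for prev in blocks[:i]:
                        x = prev(x)
                features = block(x)            # only this block gets gradients
                loss = loss_fn(features, y)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
```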
To train a model with the signal propagation algorithm (Kohan et al., 2022), which is one of the local training methods, run

```bash
python main_sp.py
```

We provide an implementation separate from main_lw.py because signal propagation propagates label information in addition to the input image, and the model architecture is slightly different.
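As a rough illustration of the idea (a simplified sketch with hypothetical names, not the actual implementation in main_sp.py): the labels are embedded into the input space and propagated through the same layers as the images, and each layer is trained by a local loss that matches input features to the features of their labels.

```python
import torch
import torch.nn.functional as F

# Conceptual signal-propagation training step, assuming MLP-style layers on
# flattened inputs. `layers` and `label_embed` are placeholders; the optimizer
# should cover all layers and the label embedder.
def sp_step(layers, label_embed, optimizer, x, y, num_classes):
    # Embed the one-hot labels into the same space as the inputs, then
    # propagate both signals forward through the same layers.
    t = label_embed(F.one_hot(y, num_classes).float())
    total_loss = 0.0
    for layer in layers:
        x, t = layer(x), layer(t)
        # Simplified local loss: pull each input's features toward its label's
        # features (the published method uses a more elaborate local objective).
        total_loss = total_loss + F.mse_loss(x, t)
        x, t = x.detach(), t.detach()  # no gradients flow between layers
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```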
To train models with the Forward-Forward algorithm (Hinton, 2022), run

```bash
python main_ff.py
```

Currently, toy MLP and CNN models are supported. You can try new models by adding new classes under `src/models/forward_forward_model.py`.
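For reference, the per-layer objective of Forward-Forward training looks roughly like the following simplified sketch (after Hinton, 2022; not necessarily this repository's exact code):

```python
import torch
import torch.nn.functional as F

# Simplified Forward-Forward layer objective: each layer is trained so that
# its "goodness" (sum of squared activations) is high for positive samples
# and low for negative samples, with no backpropagation across layers.
def ff_layer_loss(layer, x_pos, x_neg, threshold: float = 2.0):
    g_pos = layer(x_pos).pow(2).sum(dim=1)  # goodness of positive samples
    g_neg = layer(x_neg).pow(2).sum(dim=1)  # goodness of negative samples
    # Logistic loss pushing g_pos above and g_neg below the threshold.
    return F.softplus(torch.cat([threshold - g_pos, g_neg - threshold])).mean()
```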
Embedding label information into the inputs is one of the characteristics of the Forward-Forward algorithm, and several embedding schemes are supported in addition to the one from the original paper. For more details, please refer to the descriptions in the LabelEmbedder class in `src/models/forward_forward_block.py`.

For example, setting method=top-left embeds the class information into the top-left pixels of the input image, as in the original paper. Class information can also be provided by subtracting the class prototypes from the inputs.
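As a rough illustration of these two schemes (hypothetical helper functions; the real options are implemented in the LabelEmbedder class):

```python
import torch
import torch.nn.functional as F

# Illustrative versions of two label-embedding schemes for Forward-Forward
# training; see LabelEmbedder in src/models/forward_forward_block.py for
# the actual implementations.
def embed_top_left(x: torch.Tensor, y: torch.Tensor, num_classes: int) -> torch.Tensor:
    # Overwrite the first `num_classes` pixels with a one-hot label, as in
    # the original paper (for flattened MNIST-like inputs of shape [B, D]).
    x = x.clone()
    x[:, :num_classes] = F.one_hot(y, num_classes).float()
    return x

def embed_prototype(x: torch.Tensor, y: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    # Subtract each sample's class prototype (e.g., a per-class mean input,
    # shape [num_classes, ...]) so the label is carried by the deviation.
    return x - prototypes[y]
```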
If you find our code useful for your research, please cite it using this BibTeX entry:

```bibtex
@article{
  sakamoto2024endtoend,
  title={End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training},
  author={Keitaro Sakamoto and Issei Sato},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2024},
  url={https://openreview.net/forum?id=O3wmRh2SfT},
}
```
TODO:

- Add documentation for running the main files
- Test on a GPU machine
- Add code to reproduce the nHSIC dynamics
Supported training methods:

- Layer-wise training
- Sequential layer-wise training
- Signal propagation
- Forward-Forward algorithm

Supported models:

- ResNet
- VGG
- Vision Transformer