Supplementary code for our paper "End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training" (TMLR 2024).
This repository provides implementations of various layer-wise training methods and HSIC-based analyses.
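As background for the HSIC-based analyses, here is a minimal sketch of a normalized HSIC (nHSIC) estimator between two sets of features. The Gaussian kernel, bandwidth, and function names are illustrative assumptions, not necessarily what this repository uses.

```python
import torch

def gram_gaussian(x: torch.Tensor, sigma: float) -> torch.Tensor:
    # Gaussian (RBF) Gram matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)),
    # for a feature matrix x of shape (n_samples, n_features).
    sq_dists = torch.cdist(x, x) ** 2
    return torch.exp(-sq_dists / (2 * sigma ** 2))

def nhsic(x: torch.Tensor, y: torch.Tensor, sigma: float = 5.0, eps: float = 1e-5) -> torch.Tensor:
    # Normalized HSIC: HSIC(X, Y) / sqrt(HSIC(X, X) * HSIC(Y, Y)),
    # estimated from centered Gram matrices (constant factors cancel).
    n = x.shape[0]
    h = torch.eye(n, dtype=x.dtype, device=x.device) - 1.0 / n  # centering matrix
    kx = h @ gram_gaussian(x, sigma) @ h
    ky = h @ gram_gaussian(y, sigma) @ h
    hsic_xy = (kx * ky).sum()
    hsic_xx = (kx * kx).sum()
    hsic_yy = (ky * ky).sum()
    return hsic_xy / (hsic_xx.sqrt() * hsic_yy.sqrt() + eps)
```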
Requirements:

- Python 3.10 or later
- PyTorch

You may get errors with Python 3.9 or earlier because the code relies on newer type-hinting syntax.
To install the requirements:

```bash
python -m venv venv
source venv/bin/activate

# Install requirements
pip install -r requirements.txt
```
To train a model in the same setting as the paper, run

```bash
python main_lw.py
```

You can specify the model and other hyperparameters by adding command-line overrides, e.g.,

```bash
python main_lw.py model.name=vgg11
```

or by modifying the configuration files under the conf directory.
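The `key.subkey=value` override syntax suggests a Hydra-style entry point. As a minimal sketch, assuming Hydra and OmegaConf are used (the actual entry point and config schema in main_lw.py may differ), the configuration would be consumed like this:

```python
# Sketch of a Hydra-style entry point; config_name and field names are
# illustrative assumptions, not necessarily this repository's schema.
import hydra
from omegaconf import DictConfig

@hydra.main(config_path="conf", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # A command-line override such as `model.name=vgg11` replaces the
    # corresponding value loaded from the YAML files under conf/.
    print(cfg.model.name)

if __name__ == "__main__":
    main()
```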
To add a new local loss, edit `src/models/layer_wise_loss.py`.
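A rough sketch of what such a local loss might look like is below; the class name and signature are hypothetical, so adapt them to the interface of the existing losses in `src/models/layer_wise_loss.py`.

```python
import torch
import torch.nn as nn

# Hypothetical local loss for layer-wise training: an auxiliary linear head
# maps a layer's features to class logits and applies cross-entropy.
class MyLocalLoss(nn.Module):
    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        # feature_dim is the flattened dimension of the layer's output.
        self.head = nn.Linear(feature_dim, num_classes)
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, features: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # Flatten any spatial dimensions before the auxiliary head.
        logits = self.head(features.flatten(start_dim=1))
        return self.criterion(logits, targets)
```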
To perform layer-wise training in a sequential manner, run

```bash
python main_lw_seq.py
```

Note that this takes more time, as each layer is trained to completion before the next one starts. For details, please refer to Section B.1 of our paper.
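Conceptually, sequential layer-wise training looks like the following sketch. The names (`blocks`, `local_losses`, `loader`) are placeholders for illustration, not the repository's actual API.

```python
import torch

# Illustrative loop for sequential layer-wise training: each block is trained
# with its own local loss while all earlier blocks stay frozen.
def train_sequentially(blocks, local_losses, loader, epochs_per_block=10):
    for i, (block, loss_fn) in enumerate(zip(blocks, local_losses)):
        optimizer = torch.optim.SGD(block.parameters(), lr=0.1)
        for _ in range(epochs_per_block):
            for x, y in loader:
                with torch.no_grad():          # earlier blocks are frozen
                    for prev in blocks[:i]:
                        x = prev(x)
                features = block(x)            # only this block gets gradients
                loss = loss_fn(features, y)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
```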
To train a model with the signal propagation algorithm (Kohan et al., 2022), which is one of the local training methods, run

```bash
python main_sp.py
```

We provide an implementation separate from main_lw.py because signal propagation propagates label information in addition to the input image, and the model architecture is slightly different.
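As a rough illustration of the idea (a simplified sketch with hypothetical names, not the actual implementation in main_sp.py): the labels are embedded into the input space and propagated through the same layers as the images, and each layer is trained by a local loss that matches input features to the features of their labels.

```python
import torch
import torch.nn.functional as F

# Conceptual signal-propagation training step, assuming MLP-style layers on
# flattened inputs. `layers` and `label_embed` are placeholders; the optimizer
# should cover all layers and the label embedder.
def sp_step(layers, label_embed, optimizer, x, y, num_classes):
    # Embed the one-hot labels into the same space as the inputs, then
    # propagate both signals forward through the same layers.
    t = label_embed(F.one_hot(y, num_classes).float())
    total_loss = 0.0
    for layer in layers:
        x, t = layer(x), layer(t)
        # Simplified local loss: pull each input's features toward its label's
        # features (the published method uses a more elaborate local objective).
        total_loss = total_loss + F.mse_loss(x, t)
        x, t = x.detach(), t.detach()  # no gradients flow between layers
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```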
To train models with the Forward-Forward algorithm (Hinton, 2022), run

```bash
python main_ff.py
```

Currently, toy MLP and CNN models are supported. You can try new models by adding new classes under `src/models/forward_forward_model.py`.
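For reference, the per-layer objective of Forward-Forward training looks roughly like the following simplified sketch (after Hinton, 2022; not necessarily this repository's exact code):

```python
import torch
import torch.nn.functional as F

# Simplified Forward-Forward layer objective: each layer is trained so that
# its "goodness" (sum of squared activations) is high for positive samples
# and low for negative samples, with no backpropagation across layers.
def ff_layer_loss(layer, x_pos, x_neg, threshold: float = 2.0):
    g_pos = layer(x_pos).pow(2).sum(dim=1)  # goodness of positive samples
    g_neg = layer(x_neg).pow(2).sum(dim=1)  # goodness of negative samples
    # Logistic loss pushing g_pos above and g_neg below the threshold.
    return F.softplus(torch.cat([threshold - g_pos, g_neg - threshold])).mean()
```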
Embedding label information into the inputs is one of the characteristics of the Forward-Forward algorithm, and several embedding schemes are supported in addition to the one from the original paper. For more details, please refer to the descriptions in the LabelEmbedder class in `src/models/forward_forward_block.py`.

For example, setting method=top-left embeds the class information into the top-left pixels of the input image, as in the original paper. Class information can also be provided by subtracting the class prototypes from the inputs.
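As a rough illustration of these two schemes (hypothetical helper functions; the real options are implemented in the LabelEmbedder class):

```python
import torch
import torch.nn.functional as F

# Illustrative versions of two label-embedding schemes for Forward-Forward
# training; see LabelEmbedder in src/models/forward_forward_block.py for
# the actual implementations.
def embed_top_left(x: torch.Tensor, y: torch.Tensor, num_classes: int) -> torch.Tensor:
    # Overwrite the first `num_classes` pixels with a one-hot label, as in
    # the original paper (for flattened MNIST-like inputs of shape [B, D]).
    x = x.clone()
    x[:, :num_classes] = F.one_hot(y, num_classes).float()
    return x

def embed_prototype(x: torch.Tensor, y: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    # Subtract each sample's class prototype (e.g., a per-class mean input,
    # shape [num_classes, ...]) so the label is carried by the deviation.
    return x - prototypes[y]
```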
If you find our code useful for your research, please cite it using this BibTeX entry:

```bibtex
@article{
  sakamoto2024endtoend,
  title={End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training},
  author={Keitaro Sakamoto and Issei Sato},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2024},
  url={https://openreview.net/forum?id=O3wmRh2SfT},
}
```
TODO:

- Add documentation for running the main files
- Test on a GPU machine
- Add code to reproduce the nHSIC dynamics
Supported training methods:

- Layer-wise training
- Sequential layer-wise training
- Signal propagation
- Forward-Forward algorithm

Supported models:

- ResNet
- VGG
- Vision Transformer