Neural Attention Memory

This repository contains the code for the experiments in the Neural Attention Memory paper. Clone it with --recurse-submodules to also fetch the SCAN dataset.

Requirements

  • Python 3.8
  • CUDA-capable GPU (tested on an RTX 4090 24GB; reduce --batch_size if GPU memory is limited)
  • PyTorch >= 1.7
  • CUDA >= 10 (installed together with PyTorch)
  • Python libraries listed in requirements.txt
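
As an optional sanity check of the requirements above, the following sketch reports the Python, PyTorch, and CUDA versions visible in the current environment. It is not part of the repository; the `torch_status` variable is a hypothetical name introduced only for this example, and the check assumes that `python` on the PATH is the interpreter used for the experiments.

```shell
# Optional sanity check (not part of the repository): report the Python,
# PyTorch, and CUDA versions that the current environment provides.
python --version

if python -c "import torch" 2>/dev/null; then
  torch_status="ok"
  # torch.version.cuda is the CUDA version PyTorch was built with
  # (None for CPU-only builds).
  python -c "import torch; print('PyTorch', torch.__version__, '| CUDA', torch.version.cuda, '| GPU available:', torch.cuda.is_available())"
else
  torch_status="missing"
  echo "PyTorch is not importable; install it, then the packages in requirements.txt."
fi
```

If the last field reports that no GPU is available, the CUDA-capable GPU requirement is not met in this environment.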

Running Experiments


AutoEncode.py is the entry point for all experiments. For example, the following command trains NAM-TM on the addition task with up to 10 training digits; --log writes a log file for the run.

python AutoEncode.py --net namtm --seq_type add --digits 10 --log

For the 4-DYCK task, first run python DYCK.py to generate the data points.
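
The two-step 4-DYCK workflow can be sketched as follows. This is an illustration, not a script shipped with the repository: the `run` helper and the `DRY_RUN` variable are hypothetical additions that print each command and execute it only when `DRY_RUN=0`. The flags themselves (--net, --seq_type dyck, --log) are the ones documented in this README.

```shell
# Hypothetical helper (not part of the repository): print a command,
# and execute it only when DRY_RUN is set to 0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Step 1: generate the 4-DYCK data points.
# Step 2: train NAM-TM on the 4-DYCK task and write a log file.
# The && ensures training starts only if data generation succeeded.
run python DYCK.py &&
  run python AutoEncode.py --net namtm --seq_type dyck --log
```

By default the block only prints the two commands; set DRY_RUN=0 in the environment to execute them.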

Options


The program accepts several command-line options. The table below lists the main ones, which can be appended to the python AutoEncode.py command.

Option      Default  Description
--net       namtm    Model to run:
                       tf: Transformer
                       ut: Universal Transformer
                       dnc: Differentiable Neural Computer
                       lstm: LSTM with attention
                       stm: SAM Two-memory Model
                       namtm: NAM-TM
                       stack: Stack-RNN
--seq_type  add      Task for prediction:
                       add: addition task (NSP)
                       reverse: reverse task (NSP)
                       reduce: sequence reduction task
                       dyck: 4-DYCK task
--digits    10       Maximum number of training digits
--log       false    Log training/validation results
--exp       0        Log file identifier, used when --log is set

See Options.py or run python AutoEncode.py --help for the full list of options.
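
As an example of combining these options, the sketch below runs every model accepted by --net on a single task. It is an illustration under stated assumptions rather than a script from the repository: the `run` helper, `DRY_RUN`, `SEQ_TYPE`, `DIGITS`, and `count` are hypothetical names, and giving each run its own --exp value assumes that --exp takes an integer and that distinct identifiers keep the log files apart.

```shell
# Hypothetical helper (not part of the repository): print a command,
# and execute it only when DRY_RUN is set to 0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

SEQ_TYPE="reverse"   # one of: add, reverse, reduce, dyck
DIGITS=10            # documented default for --digits
count=0              # doubles as the --exp identifier (assumed to be an integer)

# Iterate over every model name listed for --net.
for net in tf ut dnc lstm stm namtm stack; do
  run python AutoEncode.py --net "$net" --seq_type "$SEQ_TYPE" \
    --digits "$DIGITS" --log --exp "$count"
  count=$((count + 1))
done

echo "Issued $count runs"
```

With the default dry-run setting the loop only prints the seven commands, which makes it easy to review them before launching the runs with DRY_RUN=0.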

Copyright Notice

Some parts of this repository are from the following open-source projects.
This repository follows the open-source policies of all of them.