# archibrain

Synthesize bio-plausible neural networks for cognitive tasks, mimicking brain architecture.

We will develop biologically plausible neural network models based on brain architecture that solve cognitive tasks performed in the laboratory.

Inspired by brain architecture, the machine learning community has recently developed various memory-augmented neural networks that enable symbol- and data-manipulation tasks difficult for standard neural network approaches; see especially those from Google DeepMind (NTM, DNC, one-shot learner).

At the same time, models based closely on brain architecture that perform experimentally studied tasks have existed for a while in the neuroscience community, notably from the labs of Eliasmith (SPAUN), O'Reilly (PBWM / LEABRA / Emergent), Alexander & Brown (HER), Roelfsema (AuGMEnT), Hawkins (HTM), and others (Heeger et al. 2017, ...). How these models compare to each other on standard tasks is unclear. Further, the biological plausibility of these models is quite variable.

From the neuroscience perspective, we want to figure out how the brain performs cognitive tasks by synthesizing current models and tasks, constrained by known architecture and learning rules. From the machine learning perspective, we will explore whether brain-inspired architectures can improve artificial intelligence (cf. copying bird flight didn't help build airplanes, but copying neurons helped machine learning).

As part of this project, we introduced an extension of AuGMEnT, called hybrid AuGMEnT, that incorporates multiple timescales of memory dynamics, enabling it to solve tasks like 12AX that the original AuGMEnT could not. See our article:

Marco Martinolli, Wulfram Gerstner, Aditya Gilra,
"Multi-timescale memory dynamics extend task repertoire in a reinforcement learning network with attention-gated memory",
Front. Comput. Neurosci. 2018, doi: 10.3389/fncom.2018.00050
(preprint: arXiv:1712.10062 [q-bio.NC]).
Code for this article is available at https://github.com/martin592/hybrid_AuGMEnT.
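
The core idea can be sketched in a few lines: alongside memory units that integrate input without decay, hybrid AuGMEnT adds leaky units with faster timescales. Here is a minimal illustrative sketch (the decay values and names are ours, not the paper's exact implementation):

```python
import numpy as np

# A bank of memory traces, each decaying with its own time constant:
# lambda = 1 gives the non-decaying integrator units of the original
# AuGMEnT; lambda < 1 adds the faster, leaky timescales of hybrid AuGMEnT.
decays = np.array([1.0, 0.9, 0.5])   # hypothetical decay factors

def update_memory(memory, transient_input):
    """One step of multi-timescale dynamics: m_i <- lambda_i * m_i + s."""
    return decays * memory + transient_input

memory = np.zeros(3)
for s in [1.0, 0.0, 0.0, 1.0]:       # a toy sequence of transient inputs
    memory = update_memory(memory, s)
print(memory)                         # slower-decaying traces retain more of the past
```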

We use a modular architecture to:

  1. Specify the model such that we can 'plug and play' different modules -- controller and differentiable memories (multiple memories can be used at the same time). We should be able to interface both abstract 'neurons' (LSTM, GRU, McCulloch-Pitts, ReLU, ...) and more biological spiking neurons; see the sketch after this list.
  2. Specify reinforcement learning or other tasks -- 1-2AX, Raven's progressive matrices, bAbI tasks, ...
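
As a rough illustration of the intended plug-and-play interface (all class and method names below are placeholders, not the actual archibrain API):

```python
# Sketch of a modular model/task interface; names are illustrative only.

class Task:
    """Presents stimuli and scores responses for one cognitive task."""
    def reset(self):
        """Start a new trial and return the first observation."""
        raise NotImplementedError
    def step(self, action):
        """Return (observation, reward, done) after the agent acts."""
        raise NotImplementedError

class Controller:
    """Wraps any 'neuron' substrate (LSTM, GRU, McCulloch-Pitts, ReLU,
    spiking, ...) plus zero or more differentiable memory modules."""
    def __init__(self, memories=()):
        self.memories = memories   # e.g. NTM- or DNC-style memories
    def act(self, observation):
        raise NotImplementedError

def run_episode(controller, task):
    """Generic loop: any Controller can be plugged into any Task."""
    observation, done, total_reward = task.reset(), False, 0.0
    while not done:
        action = controller.act(observation)
        observation, reward, done = task.step(action)
        total_reward += reward
    return total_reward
```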

Currently, we have HER, AuGMEnT, and LSTM implementations that can be run, via the script interface.py, on these tasks:

  '0': task 1-2
  '1': task AX_CPT
  '2': task 12-AX_S
  '3': task 12-AX
  '4': saccade/anti-saccade task
  '5': sequence prediction task
  '6': copy task
  '7': repeat copy task

We also cloned and modified the official DNC implementation (see the README in the DNC_analysis folder), and replicated the one-shot NTM on the Omniglot task (also in the DNC_analysis folder).
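
For illustration, the numeric menu above lends itself to a simple dictionary dispatch; a minimal sketch (the builder function and task object are placeholders, not the actual interface.py code):

```python
# Sketch of index-based task dispatch mirroring interface.py's menu;
# the builder below is a placeholder, not the actual archibrain code.

def build_copy_task():
    """Hypothetical constructor for the copy task."""
    return {'name': 'copy task', 'sequence_length': 10}

TASK_BUILDERS = {
    '6': build_copy_task,
    # ... one builder per menu index '0' through '7'
}

task = TASK_BUILDERS['6']()   # select the copy task by its menu index
print(task['name'])
```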

We will also explore different memory interfacing schemes: content- or list-based addressing as in the DNC, Plate's/Eliasmith's Holographic Reduced Representations / Semantic Pointer Architecture, address-value augmentation, etc.
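
As one concrete example, Holographic Reduced Representations bind a role vector to a filler by circular convolution and approximately unbind by circular correlation. A minimal sketch via FFTs (the variable names and the check at the end are ours):

```python
import numpy as np

# HRR binding/unbinding (Plate), computed via FFTs in O(n log n).

def bind(a, b):
    """Circular convolution: binds role vector a to filler vector b."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=len(a))

def unbind(c, a):
    """Circular correlation: approximately recovers b from c = bind(a, b)."""
    return np.fft.irfft(np.fft.rfft(c) * np.conj(np.fft.rfft(a)), n=len(c))

n = 512
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0 / np.sqrt(n), n)   # random vectors, as in HRR
b = rng.normal(0.0, 1.0 / np.sqrt(n), n)
b_hat = unbind(bind(a, b), a)
cos = np.dot(b_hat, b) / (np.linalg.norm(b_hat) * np.linalg.norm(b))
print(f"cosine(b_hat, b) = {cos:.2f}")     # well above 0: recovery is approximate
```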

A larger goal is to see whether the synthesized 'network' can build models of the 'world' that generalize across tasks.

Currently, we have three contributors: Marco Martinolli, Vineet Jain and Aditya Gilra. We are looking for more contributors!

Aditya initiated and supervises the project, reviewing ideas and architectures and providing pointers on how to synthesize them.

Marco implemented the Hierarchical Error Representation (HER) model of Alexander and Brown (2015, 2016), which incorporates hierarchical predictive coding and gated working-memory structures, and the AuGMEnT model of Rombouts, Bohte and Roelfsema (2015), as well as the relevant tasks: saccade/anti-saccade, 12AX, and sequence prediction. See also hybrid AuGMEnT, the extension of AuGMEnT that he developed as part of this project.

Vineet developed a common API for models and tasks, and implemented some of the tasks. He also tested various parts of the memory architectures of the DNC and NTM, whose code has been incorporated from their official repositories. See his one-shot learning implementation, an offshoot of this project.

See also:

  * Overview of architectures (work in progress)
  * [Brief survey of toolkits](https://github.com/adityagilra/archibrain/wiki/Machine-Learning-library-comparisons-for-testing-brain-architectures)

We have chosen Keras as our primary toolkit. We will follow an agile software development process and frequently refactor code (we might even change the framework).