EE 127: Information Theory - Final Project
The Capacity of Hopfield Neural Networks
By: Alexander Christenson, Holden Fried, Kenny Yau

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Brief Primer on Hopfield Networks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In 1982 J.J. Hopfield wrote his seminal paper "Neural networks and physical systems with emergent collective computational abilities." The paper was considered highly important at the time of publication; it helped spur an increase in artificial intelligence research and has received nearly 20,000 citations since. The original paper draws comparisons between the computational properties of biological organisms and the construction of computers, primarily by comparing the function of a neuron in a brain with the function of a transistor in a computer. It presents a mathematical model grounded in neurobiology, but one that could be applied just as easily to integrated circuits.

The main goal of the system is to provide content-addressable memory described by an appropriate phase-space flow of the state of the system. A content-addressable memory means that, given incomplete information about an item, the full memory item can be retrieved, provided the incomplete information is sufficient. For the system to work it needs to be able to deal with errors and retrieve the desired memory item from a small section of the input. At the time, hardware-based content-addressable memory systems were relatively simple because error checking was difficult to integrate, so the majority of error checking was done in software. Here, error correction is done by mapping the time evolution of the system to a set of coordinates representing the instantaneous condition of the system (this applies to both continuous and discrete models). From this mapping, the equations of motion of the system describe the flow of the state space.

Processing in the system is done in a distributed manner. The paper refers to the processing devices, or nodes, of the network as neurons. Each neuron i has a binary 0 or 1 state V_i, corresponding to the neuron either not firing (off) or firing at maximum strength (on). In our implementation, used to analyze the network's capacity, we maintain this quantization without rounding. The strength of the connection between neurons i and j is defined by the weight T_{ij} (T_{ij} = 0 when there is no connection). The state of each neuron is determined by the following algorithm: each neuron i is assigned a fixed threshold U_i and readjusts its state randomly in time, with a mean attempt rate W, setting

    V_i -> 1 if sum over j != i of T_{ij} V_j > U_i
    V_i -> 0 if sum over j != i of T_{ij} V_j < U_i

Each neuron therefore fires randomly and asynchronously, reevaluating its state against the threshold regardless of its current value. This algorithm differs from previously used ones because it enables the response of one neuron to affect itself indirectly, which the paper calls the backcoupling of the system; for a network of neurons A, B, C this is modeled as A <-> B <-> C <-> A.
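As a concrete illustration of this update rule, a single asynchronous pass might look like the following sketch in Python (the helper names are our own; V, T, and U are assumed to be NumPy arrays, and leaving the state unchanged on exact equality is our convention, since the rule above only specifies the strict inequalities):

    import numpy as np

    def update_neuron(V, T, U, i):
        """Apply the threshold rule to neuron i, updating V in place.

        V : length-N binary (0/1) state vector
        T : N x N weight matrix with T[i][i] = 0
        U : length-N vector of fixed thresholds U_i
        """
        # Weighted input from every other neuron; the j != i restriction
        # is automatic because T[i][i] = 0.
        activation = T[i] @ V
        if activation > U[i]:
            V[i] = 1
        elif activation < U[i]:
            V[i] = 0
        # On exact equality the state is left unchanged.

    def random_async_sweep(V, T, U, rng=None):
        # Visit every neuron once in random order, standing in for the
        # random, asynchronous firing schedule (the mean attempt rate W
        # is abstracted into a uniformly random visiting order).
        rng = np.random.default_rng() if rng is None else rng
        for i in rng.permutation(len(V)):
            update_neuron(V, T, U, i)
        return V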
The information in the system is stored in the weight matrix defining the connections between the neurons. If we want to remember a set of states V^s, s = 1, ..., n, the connections between the neurons are defined as

    T_{ij} = sum over s of (2 V^s_i - 1)(2 V^s_j - 1),    with T_{ii} = 0

From this definition, the input to neuron i when the network holds state V^{s'} is

    H^{s'}_i = sum over j of T_{ij} V^{s'}_j
             = sum over s of (2 V^s_i - 1) [ sum over j of V^{s'}_j (2 V^s_j - 1) ]

The mean of the summation in the square brackets is zero for s != s', and otherwise is N/2. From this:

    <H^{s'}_i> ~ (2 V^{s'}_i - 1) N/2

Since 2 V^{s'}_i - 1 is +1 when the stored bit is on and -1 when it is off, the mean input to each neuron has exactly the sign the threshold rule needs to hold the stored pattern stable.

This model contrasts with the previously used models in that it uses its strong nonlinearity to make choices, produce categories, and regenerate information. Where previous models would respond to a linear combination of two stimuli with known outputs by producing a nonsensical combination of those outputs, this model chooses the more probable output. The biological firing behavior between neurons had previously been linearized in such networks; here a step function about a threshold is used instead. The network of cells described in the original paper can perform abstract calculations, but the inputs must be appropriately encoded, meaning the data needs to be preprocessed before the Hopfield memory network can store it.

The associative memory works by assigning an energy to each network state, so that a stimulus can be mapped to the closest, and probably correct, output. The energy function is defined as

    E = -(1/2) sum over i and j != i of T_{ij} V_i V_j

and the change in energy due to a change in the state of neuron i is

    Delta E = -(Delta V_i) sum over j != i of T_{ij} V_j

The update algorithm forces the energy function to be monotonically decreasing, and state changes continue until the energy reaches a locally stable minimum. The system can have many locally stable states, and the algorithm does not always converge to the correct one: the state-space flow can be shaped by the nominally assigned memories, so the network can create spurious combinations of the stored memories that form their own stable equilibria in the phase flow. The system responds to an ambiguous stimulus by making a statistical choice between memory states.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Our objectives for the program
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The first thing we should build is the basic framework for a Hopfield network, in the Hopfield.py file. A good reference from someone who has built a network specifically for image reconstruction is here:
https://github.com/yosukekatada/Hopfield_network/blob/master/hopfield.py

A blog post that gives a good explanation of the math involved, and which helped in writing the basic network, is here:
http://neupy.com/2015/09/20/discrete_hopfield_network.html
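To make that plan concrete, here is a minimal sketch of what such a framework might look like (the class and method names HopfieldNetwork, train, and recall are our own rather than taken from the references above, and the thresholds are fixed at U_i = 0 for simplicity):

    import numpy as np

    class HopfieldNetwork:
        """Minimal discrete Hopfield network over binary (0/1) state vectors."""

        def __init__(self, n_neurons):
            self.n = n_neurons
            self.T = np.zeros((n_neurons, n_neurons))

        def train(self, patterns):
            # Store patterns via T_ij = sum_s (2 V^s_i - 1)(2 V^s_j - 1),
            # with the diagonal zeroed (T_ii = 0).
            for V in patterns:
                bipolar = 2 * np.asarray(V) - 1      # map {0, 1} -> {-1, +1}
                self.T += np.outer(bipolar, bipolar)
            np.fill_diagonal(self.T, 0)

        def energy(self, V):
            # E = -(1/2) sum over i and j != i of T_ij V_i V_j
            return -0.5 * V @ self.T @ V

        def recall(self, probe, max_sweeps=100, rng=None):
            # Update neurons asynchronously until a full sweep changes
            # nothing, i.e. until the monotonically decreasing energy
            # settles into a locally stable minimum.
            rng = np.random.default_rng() if rng is None else rng
            V = np.array(probe, dtype=int)
            for _ in range(max_sweeps):
                changed = False
                for i in rng.permutation(self.n):
                    h = self.T[i] @ V                # input to neuron i
                    if h > 0 and V[i] == 0:
                        V[i], changed = 1, True
                    elif h < 0 and V[i] == 1:
                        V[i], changed = 0, True
                if not changed:
                    break
            return V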
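Exercising the framework for the capacity experiments could then look like this, where pattern_a, pattern_b, and noisy_probe are hypothetical 0/1 NumPy arrays of length 100:

    net = HopfieldNetwork(100)
    net.train([pattern_a, pattern_b])
    restored = net.recall(noisy_probe)   # settles into a nearby energy minimum

Because convergence is only to a local minimum, restored may be one of the stored patterns or one of the spurious mixtures discussed above; measuring how often recall succeeds as more patterns are stored is the capacity question this project studies.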