===========================================================================================
HTMCLA
===========================================================================================

A C++ implementation of Numenta's Hierarchical Temporal Memory Cortical Learning Algorithm
By Michael Ferrier, 2013.

===========================================================================================

This C++ implementation is based on Numenta's CLA white paper:
https://www.groksolutions.com/htm-overview/education/HTM_CorticalLearningAlgorithms.pdf

It builds on the OpenHTM C# implementation:
https://sourceforge.net/p/openhtm/

A couple of demonstration videos are online here:
http://www.youtube.com/watch?v=IXg_XIm5kqk
http://www.youtube.com/watch?v=YeBC9eew3Lg

===========================================================================================
Build Instructions
===========================================================================================

HTMCLA depends on the Qt user interface library. The latest version can be downloaded here: http://qt-project.org/downloads

To get Qt working and building with Visual Studio 2012 on Windows 7 64-bit, I had to follow these steps:

1) Install Qt from http://qt-project.org/downloads. Use "Qt 5.0.2 for Windows 64-bit (VS 2012, 500 MB)".
2) Install Qt's Visual Studio add-in, from http://qt-project.org/downloads. Use "Visual Studio Add-in 1.2.1 for Qt5".
3) Open VS, open QT5 menu, go to QT Options. Add the msvc2012_64 directory under the Qt installation (the directory with 'bin' in it).
4) In VS, under Build >> Configuration Manager, make sure Platform is set to x64.
5) Under Project >> Properties >> Configuration Properties >> Linker >> Advanced, make sure Target Machine is also set to X64.
6) Under QT5 >> QT Project Settings, set Version to msvc2012_64.
7) Create and attempt to build project.
   You may find that you need to rename the lib files being used (in the Qt install) to have a '5' before the dot at the end of their filenames.

I can't speak for how Qt 5 should be configured on other setups, but I find there's a lot of help on the web.

Once you have Qt 5 integrated into your environment, you should be able to create a project in an environment of your choice that incorporates Qt 5 and each of this project's .cpp and .h files. If you're using Visual Studio 2012, you should be able to use the htm.sln file.

===========================================================================================
Using HTMCLA
===========================================================================================

File Menu
=========

Load Network: Load a .xml file specifying the architecture of a CLA network. This file would be created in a text editor. The HTMCLA package comes with several test network files in the "data" subdirectory.

Load Data: You can load segment and synapse data into a network, restoring the state of that network after training. This is especially useful for complex learning tasks where training takes a long time. You can only load a data file into the same network from which it was saved (although the network parameters may be different from what they were when the data was saved). Data is saved in a .clad file (CLA data).

Save Data: After training a network, you can save out its segment and synapse data so that you can load it directly into the same network on a later run, without having to redo the training. It is saved out as a .clad (CLA data) file.

Exit: Exit the program.

View Menu
=========

Update while running: If this is checked, the UI will be updated after each time step when running the network. This slows down execution significantly, but may be useful to watch progress in some cases.

Mouse Menu
==========

Toggles mouse function between "Select" and "Drag".
In select mode, whatever is under the mouse is selected when the left button is clicked. When in drag mode, the view can be repositioned by being dragged. This function remains buggy.

Network Panel
=============

Displays the filename of the loaded network. The || (Pause), > (Single Step), and >> (Continuous Run) buttons are used to control execution of the network. The current time step is displayed, as well as the "Stop at:" field. If a number is entered here, continuous execution will cease at that given time step.

Selected Panel
==============

By left clicking on a cell or column in one of the views, that cell or column is selected. Information about the selected cell, column and region is displayed in this panel. If a cell is selected and if it has distal segments, a list of those segments is displayed at the bottom.

If the views are displaying connections (synapses), then all of the selected column's proximal synapses will be shown, and all of the selected cell's distal synapses will be shown. To display just the distal synapses on one segment, select that segment by clicking on its line in the list. To deselect the segment and return to displaying all synapses on all distal segments, click the "Display all segments" button.

The table of segments is currently quite slow to update, so displaying the segments on a cell with many distal segments can cause the program to pause for a few seconds.

The Views
=========

On the left side of the UI are two views of the network. Each one can display one region or input space. If you'd like, they can both display the same region or input space. To select what it is that a view displays, select the name of the region or input space from the view's "Show:" menu.

A view's "Options" menu contains several options for what information is displayed in the view:

View Activity: Displays in the view what cells are active (orange), and what cells are predicted (yellow for one-step prediction, pink for >1 step prediction).
View Top-Down Reconstruction: Displays a reconstruction of what is active in the region or input space, based on what is active in the region(s) that it outputs to. Synapses are traced backwards from active cells in the output region(s), to determine what cells in this region (or input space) would lead to those activations.

View Prediction: Displays a reconstruction of what is predicted to be active in the region or input space during the next time step, based on what cells are 1-step predictive in the region(s) that it outputs to. Synapses are traced backwards from 1-step predictive cells in the output region(s), to determine what cells in this region (or input space) would lead to those activations.

View Boost: When this is on, each column is colored to show its relative boost value. The brightest red columns have the highest boost values, while grey columns are not boosted at all.

View Connections In: When this is on, synapses from cells that input to the selected cell and/or column are displayed in the view. Red synapses are not connected; green synapses are connected. The brighter the color, the further the synapse's value from ConnectedPerm (in either direction). Zoom in on the area to view the permanence values of the synapses. A selected column's proximal synapses are shown in its region's input region(s) or input space(s). A cell's distal synapses are shown in the cell's own region.

View Connections Out: This feature is not yet implemented. Eventually it will display the synapses that output from the selected cell to other cells.

View Marked Cells: Cells can be marked so as to, for example, compare what cells are active at one time step with what cells are active at a later time step. If this option is selected, then a black dot will be shown over each marked cell.

Mark Active and Predicted Cells: All cells that are currently active or in predictive state in the view are marked. Note that the marks will only show if "View Marked Cells" is selected.
Mark Predicted Cells: All cells that are in predictive state are marked.

Mark Learning Cells: All cells that are learning cells (a subset of active cells) are marked.

The magnifying glass icons and the slider between them are used to zoom in and out of the view. At closer zoom levels, more information is displayed, such as the coordinates of columns, the permanence values of synapses, and an "L" over learning cells.

The scrollbars may be used to reposition what part of the view is visible.

The divider between the two views can be slid from side to side to change the sizes of the views. The divider can also be slid all the way to the outer edge, to hide a view. It can then be grabbed back from the outer edge to re-show that view.

===========================================================================================
Network Files
===========================================================================================

Network architecture is specified in a .xml file that can be created in a text editor. HTMCLA comes with several example network files, located in the "data" subdirectory. The network file specifies all parameters for each region and input space in the network, as well as any input patterns that an input space will use.

The network file format is based loosely on the OpenHTM network file format, with a number of changes. Most of the features of a network file are self-explanatory from the examples. Here are a few notes:

The synapse parameters can be specified separately for proximal and distal synapses, using the <ProximalSynapseParams> and <DistalSynapseParams> blocks. When used outside of any Region block (i.e. when not nested), these parameters are used as the default for all proximal or distal synapses throughout all Regions.
In addition to specifying network-wide default synapse parameters, you can specify synapse parameters specific to a Region by using the <ProximalSynapseParams> and <DistalSynapseParams> blocks nested within the <Region> block.

A network is composed of <InputSpace> and <Region> blocks. InputSpaces provide input from the "outside" of the network, while Regions are where the CLA learning takes place.

Each Region must have an <Inputs> block, which specifies one or more inputs to that Region, as well as the input radius for that input. A Region may have any number of inputs, and each one may be either an InputSpace or another Region.

The <Boost> tag for a Region is used to specify the boost rate as well as the max allowed boost value.

The BoostingPeriod, SpatialLearningPeriod and TemporalLearningPeriod tags are used to specify in what range of time steps each of these types of learning is allowed to take place within for the Region. Typically boosting and spatial learning are ended before temporal learning begins for a Region, so that the temporal pooler will be working with stable spatial representations. However this is unnecessary if the learning rates are slow enough that the temporal pooler has a chance to adapt to changes in the spatial representations. The value -1 can be used for any of these, to mean "start from the beginning" or "never stop".

MinOverlapToReuseSegment has a "min" and a "max" value. A value within this range is chosen randomly for each column within the region. This allows columns to have some variation in how predisposed they are to re-using the same segment (and so the same learning cell) for a somewhat different context, versus choosing a different segment (and so a different learning cell). This way, some columns will be more context sensitive while others will tend to generalize over different contexts, providing a mix of the strengths of both approaches.
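
As a rough illustration of the per-column choice described above, here is a minimal sketch that draws one threshold per column from the given "min"/"max" range. The function name, its parameters, the uniform distribution and the explicit seed are all assumptions made for this example; HTMCLA's own code may pick the value differently.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical helper (not taken from the HTMCLA sources): draws one
// segment-reuse threshold per column, uniformly from [minValue, maxValue].
std::vector<int> AssignReuseThresholds(int numColumns, int minValue, int maxValue,
                                       unsigned seed)
{
    assert(numColumns >= 0 && minValue <= maxValue);

    std::mt19937 rng(seed);                                       // seeded so runs are repeatable
    std::uniform_int_distribution<int> dist(minValue, maxValue);  // both bounds inclusive

    std::vector<int> thresholds;
    thresholds.reserve(static_cast<std::size_t>(numColumns));
    for (int i = 0; i < numColumns; ++i)
        thresholds.push_back(dist(rng));
    return thresholds;
}
```

Columns that draw a low value will tend to reuse an existing segment for a similar context, while columns that draw a high value will tend to start a new one. Setting min equal to max gives every column the same threshold.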
===========================================================================================
Notes on Parameters
===========================================================================================

Through experimentation I discovered a few properties of the different parameters and how they relate to one another:

NewNumberSynapses (aka newSynapsesCount in the white paper): This number should always be lower than the number of active cells that a cell usually has in its LocalityRadius (i.e., that are available to be incorporated into distal segments). That way a distal segment will be adding synapses to a sample from the local activity, not adding *all* of the local activity. This is how its function is described in the Numenta docs. I found that if newSynapsesCount is greater than the number of active cells in localityRadius, then that leaves blank "slots" to be filled in later with part of a different pattern sharing some of the same columns, which causes lots of problems (as I described under problem #3 in the temporal context forking thread on the OpenHTM forum).

MinOverlapToReuseSegment (aka minThreshold in the white paper): The lower this number, the more likely that a similar pattern will be added to an existing segment rather than having a new segment created for it. So long as NewNumberSynapses is less than the typical number of locally active cells (as described above), this parameter doesn't need to be very high. MinSynapsesPerSegmentThreshold should always be <= SegmentActivationThreshold (to avoid problems where a cell isn't predicted or activated, but is chosen to be a learning cell). I've been using values between 1 and 5. As described above, I've added an option to provide a range for MinSynapsesPerSegmentThreshold, so that each column has a random value within that range chosen for its own MinSynapsesPerSegmentThreshold.
This results in some columns being predisposed to choose the same cell to represent similar contexts, while others are biased to pick a different cell. This way the closer two contexts are to one another, the more cells the representations of those contexts will have in common, which should help with the learning of generalization. Providing a range like that is not necessary to fix the temporal forking problem, however; using a higher value such as 5 results (in my tests) in different cells being chosen for different contexts, and all of the temporal forking tests working correctly.

Further note: This parameter is very important for the temporal context forking problem. Too low and the same cells will be chosen for a given pattern in different contexts. Too high and different cells will be chosen all the time as new context is learned further and further back, resulting in reuse of some of the same cells as it loops around (a short looping sequence such as ABBCBBA).

SegmentActivationThreshold (aka activationThreshold in the white paper): This number should not be lower than MinSynapsesPerSegmentThreshold, as implied in the white paper description for getBestMatchingSegment(): "The number of active synapses is allowed to be below activationThreshold, but must be above minThreshold." In my tests I have it set to 5.

PermanenceIncrease and PermanenceDecrease: If setting your parameters to allow the same segment to be used sometimes in similar contexts (by using a lower value for MinSynapsesPerSegmentThreshold), then PermanenceInc and PermanenceDec (for distal synapses) must be set so as to allow multiple patterns to be supported by a single segment. The higher the ratio of PermanenceInc/PermanenceDec, the more patterns can be supported in a single segment. I'm using PermanenceInc 0.01 and PermanenceDec 0.005.
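
To make the role of the PermanenceInc/PermanenceDec ratio more concrete, here is a minimal sketch of a white-paper-style reinforcement step for one segment. The Synapse struct and the ReinforceSegment name are hypothetical and simplified for this example; they are not HTMCLA's actual classes, and the real update logic contains more cases.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified synapse: only what this sketch needs.
struct Synapse {
    double permanence;   // kept within [0, 1]
    bool sourceActive;   // was the presynaptic cell active for this pattern?
};

// Positively reinforce one segment: synapses whose source cell was active
// gain permanenceInc, all other synapses lose permanenceDec.
void ReinforceSegment(std::vector<Synapse>& segment,
                      double permanenceInc, double permanenceDec)
{
    assert(permanenceInc >= 0.0 && permanenceDec >= 0.0);
    for (Synapse& s : segment) {
        const double delta = s.sourceActive ? permanenceInc : -permanenceDec;
        // Clamp so the permanence never leaves the [0, 1] range.
        s.permanence = std::min(1.0, std::max(0.0, s.permanence + delta));
    }
}
```

With PermanenceInc 0.01 and PermanenceDec 0.005, a synapse that belongs to only one of two patterns alternating on the same segment gains 0.01 and loses 0.005 per cycle, so it still drifts upward. With equal increase and decrease it would not, which is one way to read the ratio remark above.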
===========================================================================================
Changes from the White Paper and the OpenHTM implementation
===========================================================================================

This C++ implementation of the CLA sticks pretty close to the Numenta white paper and to the OpenHTM implementation. However there are a few major differences that should be reviewed.

Hierarchy
=========

In this implementation, an InputSpace provides input to the network and a Region performs learning. The Region and InputSpace classes both inherit a common interface called DataSpace, which a Region uses to query its inputs. So, a Region may have any number of inputs, each of which can be either an InputSpace or another Region. Regions are processed in the order that they are defined in the network file.

Hypercolumns
============

This implementation incorporates the idea of a cortical hypercolumn, which is a group of columns (in the brain, about 100 columns) which all receive input from the same area of their inputs, and which mutually inhibit among themselves. In area V1, one hypercolumn contains columns that represent lines of all different orientations. The columns in a hypercolumn mutually inhibit among themselves, which helps to cause the different columns to specialize in learning to represent different features (mostly lines of different orientations) and which also causes only one column to be active per hypercolumn at a time. There is a different hypercolumn for each small patch of the visual field.

In this implementation, hypercolumns can be used by setting a region's HypercolumnDiameter. This defaults to 1, in which case the region acts like an OpenHTM region -- each column is its own hypercolumn. InhibitionRadius and PredictionRadius are specified in hypercolumns, so if HypercolumnDiameter is > 1, it will affect how these values are used.

A Region has an <Inhibition> tag, with a "type" attribute.
If type is set to "automatic", inhibition will work the way it's described in the white paper and implemented in OpenHTM, where an inhibition radius is determined based on the average receptive field size of columns in the Region. If inhibition type is set to "radius", however, then a radius is specified, which is measured in hypercolumns. If the given radius is 0, then only columns within the same hypercolumn will mutually inhibit, as in V1.

Overlap
=======

In the CLA white paper and in the OpenHTM implementation, a column's Overlap value is simply the count of its active, connected proximal synapses, multiplied by the column's Boost value (a number >= 1). This is also the case in my implementation, but an additional factor is multiplied in. That factor is the ratio of the number of connected active synapses, over the sum of itself plus the number of strongly connected inactive synapses. A strongly connected synapse is defined as a synapse with permanence > InitialPerm.

The thinking behind this is that a column which exactly represents a particular pattern of active inputs should be a better match for that pattern than another column that represents the same active inputs plus additional active inputs. For example, say a number of inputs are active such that they form the image of a 45 degree line. One column may have only enough connected synapses to exactly represent that pattern, while another column may have all of its synapses connected -- all of the ones that represent that line, plus every other synapse representing every other input. In this case, without the factor that I brought in, both columns would have the same overlap (given that they have the same boost value). That makes the columns that have most or all of their synapses connected "greedy", in the sense that they can represent many smaller patterns that they don't fit very well.
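
The adjusted overlap described above can be sketched as follows. This is only an illustration of the stated formula: the function and parameter names are hypothetical, and returning 0 when no connected synapse is active is an assumption made here to avoid dividing zero by zero.

```cpp
#include <cassert>

// Hypothetical helper; the names are illustrative, not HTMCLA's.
//   activeConnected: connected proximal synapses whose input is active
//   strongInactive:  synapses with permanence > InitialPerm whose input is inactive
//   boost:           the column's boost value (>= 1)
double AdjustedOverlap(int activeConnected, int strongInactive, double boost)
{
    assert(activeConnected >= 0 && strongInactive >= 0 && boost >= 1.0);
    if (activeConnected == 0)
        return 0.0;  // assumption: no active input means no overlap

    // Fraction of the column's strong connections that the current input
    // actually uses; 1.0 means the column fits the pattern exactly.
    const double fit = static_cast<double>(activeConnected) /
                       static_cast<double>(activeConnected + strongInactive);
    return activeConnected * boost * fit;
}
```

With a boost of 1, a column with 10 active connected synapses and no strongly connected inactive ones scores 10, while a "greedy" column with the same 10 active synapses plus 30 strongly connected inactive ones scores 2.5.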
This greediness causes problems when trying to apply CLA to a vision application, and may cause problems in non-visual applications as well. My solution isn't perfect, but to fully address this problem would mean making major changes to how the spatial pooler works.

Boosting
========

Boosting is very important for its use in "separating" different patterns so that they share a minimum of active columns. This is necessary in order for the temporal pooler to be able to differentiate between different spatial patterns, and therefore to be able to make clear predictions.

I was unable to get useful results out of the boosting algorithm given in the CLA white paper or in the OpenHTM implementation. Boosting is completely implemented in Column's PerformBoosting() method, so a look at the code and comments in that method will help explain the changes I made. In short:

- If a column's ActiveDutyCycle is less than a small fraction of its MaxDutyCycle (the maximum ActiveDutyCycle of all the columns in its inhibition radius), then a value (the boost rate) is added to the column's Boost value. This is as described in the CLA white paper.

- If a column's Boost value is increased for the first non-consecutive time (i.e. a new period of boosting has begun for it), then each of its connected synapses has its permanence decreased to ConnectedPerm. This makes it easier for the column to come to represent a modified version of the pattern that it currently represents.

- A column's boost value is limited to the range from its MinBoost to MaxBoost. MinBoost is a tiny random amount above 1.0, and MaxBoost is a tiny random amount below the Region's given MaxBoost parameter value. The reason for each column having slightly different MinBoost and MaxBoost values is so that multiple columns that have the same overlap and have reached either MinBoost or MaxBoost will end up having slightly different (post-boosted) overlap values.
  This is to avoid ties between many columns that have the same overlap value, which could cause many more columns to be active at one time than the specified PercentageLocalActivity should allow.

- If a column would be boosted higher, but has already reached MaxBoost, then its un-connected synapse permanence values are incremented by the boost rate, but they are incremented no higher than ConnectedPerm. If a column has reached MaxBoost and still needs boosting because it hasn't been active, it has no hope of becoming active and so it needs to change to represent a different pattern. Increasing the synapse permanence values gradually gives it an increasing opportunity to represent a different pattern.

Segment Updates
===============

UpdateSegmentActiveSynapses() has been changed to fix a few problems with the original CLA white paper version.

- When short repeating loop patterns, such as AAAXAAAX..., are given, the CLA quickly learns the full pattern and every cell that makes up part of every representation is in predictive state (1-step or higher) at all times. Because of this, no cell that's part of any representation ever becomes truly inactive, and so no segment updates can ever be negatively reinforced. To solve this, I made it so that if a cell goes from 1-step predictive state to >1-step predictive state, then it negatively reinforces any segment updates that are proven "incorrect" by this change in predictive state.

- If a cell is put in predictive state at the same time that it is activated, then in the white paper version and the OpenHTM version, the segment that put it into predictive state would be positively reinforced by its simultaneous activation. This should not be the case, however; a prediction implies that the cell will activate in the future, not at the same time (patterns of simultaneous activation are the spatial pooler's job).
  UpdateSegmentActiveSynapses() has been modified so that it will not process such a segment update in this case; the segment update will stay in the queue until a later time step, when it can be determined if the prediction (of *future* activation) came true or not.

Network File Format
===================

As described in the section "Network Files", several new features have been added and made available in the network file format, such as synapse parameters that can be set separately for distal and proximal synapses, and optionally on a per-region basis, and the ability to specify periods for boosting, spatial learning and temporal learning on a per-region basis. InputSpaces can specify a variety of different types of Patterns, such as text or bitmaps.

===========================================================================================
Getting Started
===========================================================================================

After getting HTMCLA built and running, I recommend testing the various example network files that can be found in the "data" subdirectory. Open each one in a text editor, and you will find comments at the top of each file with a brief description and instructions. Most of them are specialized toward a particular example. net_test.xml includes many different examples, which can be tested by uncommenting just the pattern that you want to test.

===========================================================================================
Legal Information
===========================================================================================

For license information for Numenta's HTM CLA technology, and for Digia's Qt library, see these files in this directory:

License_Numenta.txt
License_Digia.txt