numenta/htm.java

AnomalyLikelihood encounters ArrayIndexOutOfBoundsException when copying samples

skidder opened this issue · 1 comment

I consistently encounter an ArrayIndexOutOfBoundsException after adding enough points to an AnomalyLikelihood instance that the points array must be copied.

I wrote the following test function in the AnomalyLikelihoodTest class to demonstrate this defect:

    @Test
    public void testLearning() {
        Map<String, Object> params = new HashMap<>();
        params.put(KEY_MODE, Mode.LIKELIHOOD);
        params.put(AnomalyLikelihood.KEY_LEARNING_PERIOD, 300);
        params.put(AnomalyLikelihood.KEY_ESTIMATION_SAMPLES, 300);
        an = (AnomalyLikelihood)Anomaly.create(params);

        for (int i = 0; i < 2000; i++) {
            an.anomalyProbability(0.07, .5, null);
        }
    }

The following exception is thrown:

java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at gnu.trove.list.array.TDoubleArrayList.toArray(TDoubleArrayList.java:715)
    at gnu.trove.list.array.TDoubleArrayList.toArray(TDoubleArrayList.java:690)
    at org.numenta.nupic.algorithms.AnomalyLikelihood.estimateAnomalyLikelihoods(AnomalyLikelihood.java:217)
    at org.numenta.nupic.algorithms.AnomalyLikelihood.anomalyProbability(AnomalyLikelihood.java:179)
    at org.numenta.nupic.algorithms.AnomalyLikelihoodTest.testLearning(AnomalyLikelihoodTest.java:252)

This is due to incorrect usage of the samples.toArray(...) function from the [Trove4J library](http://trove4j.sourceforge.net/javadocs/gnu/trove/list/TDoubleList.html#toArray%28int, int%29). The toArray function takes two parameters:

offset - the offset at which to start copying
len - the number of values to copy.

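For the len parameter, the AnomalyLikelihood.estimateAnomalyLikelihoods function passes samples.size(), the total size of the samples list, rather than the number of values to copy starting at the offset (samples.size() - skipRecords). The requested range therefore extends past the end of the data, triggering an ArrayIndexOutOfBoundsException.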
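To illustrate the offset/len contract outside of htm.java, here is a minimal sketch using a plain double[] and System.arraycopy, which the stack trace shows is what TDoubleArrayList.toArray ultimately calls. The class name ToArrayRangeDemo and the helper copyRange are hypothetical stand-ins, not part of Trove or htm.java, and the sketch assumes the backing array is exactly as long as the data (Trove's backing array may have spare capacity, so the real failure only appears once the requested range exceeds that capacity).

```java
import java.util.Arrays;

class ToArrayRangeDemo {

    // Hypothetical stand-in for toArray(offset, len):
    // copies `len` values from `data`, starting at index `offset`.
    static double[] copyRange(double[] data, int offset, int len) {
        double[] out = new double[len];
        System.arraycopy(data, offset, out, 0, len);
        return out;
    }

    public static void main(String[] args) {
        double[] samples = {0.1, 0.2, 0.3, 0.4, 0.5};
        int skipRecords = 2;

        // Correct: len is the number of values remaining after the offset.
        double[] tail = copyRange(samples, skipRecords, samples.length - skipRecords);
        System.out.println(Arrays.toString(tail)); // prints [0.3, 0.4, 0.5]

        // Incorrect: len is the full size, so offset + len runs past the end.
        try {
            copyRange(samples, skipRecords, samples.length);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of bounds: offset + len exceeds array length");
        }
    }
}
```

The second call mirrors the pattern in the original code: with a non-zero offset, passing the full size as len always requests more values than remain.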

The following patch corrects this issue:

diff --git a/src/main/java/org/numenta/nupic/algorithms/AnomalyLikelihood.java b/src/main/java/org/numenta/nupic/algorithms/AnomalyLikelihood.java
index 275f1ba..202c9bd 100644
--- a/src/main/java/org/numenta/nupic/algorithms/AnomalyLikelihood.java
+++ b/src/main/java/org/numenta/nupic/algorithms/AnomalyLikelihood.java
@@ -214,7 +214,8 @@ public class AnomalyLikelihood extends Anomaly {
             distribution = nullDistribution();
         }else{
             TDoubleList samples = records.getMetrics();
-            distribution = estimateNormal(samples.toArray(skipRecords, samples.size()), true);
+            final int numRecordsToCopy = samples.size() - skipRecords;
+            distribution = estimateNormal(samples.toArray(skipRecords, numRecordsToCopy), true);

             /*  Taken from the Python Documentation

@@ -226,7 +227,7 @@ public class AnomalyLikelihood extends Anomaly {

              */
             samples = records.getSamples();
-            Statistic metricDistribution = estimateNormal(samples.toArray(skipRecords, samples.size()), false);
+            Statistic metricDistribution = estimateNormal(samples.toArray(skipRecords, numRecordsToCopy), false);

             if(metricDistribution.variance < 1.5e-5) {
                 distribution = nullDistribution();

@skidder Hi, thanks for finding this! Would you like to submit a PR? If so, please do it from a separate branch in your own fork (not master), and we'll merge it after review.