union_vector_sum throws java.lang.IndexOutOfBoundsException
oconnelc opened this issue · 0 comments
The VectorUnionSumUDAF is consistently throwing an IndexOutOfBoundsException. The stack trace is:
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:657)
    at java.util.ArrayList.get(ArrayList.java:433)
    at brickhouse.udf.timeseries.VectorUnionSumUDAF$VectorArraySumUDAFEvaluator.addVector(VectorUnionSumUDAF.java:146)
    at brickhouse.udf.timeseries.VectorUnionSumUDAF$VectorArraySumUDAFEvaluator.iterate(VectorUnionSumUDAF.java:114)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:192)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:638)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:813)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:719)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:787)
This happens because the following code segment tries to resize myagg.sumArray by calling ensureCapacity:
    private void addVector(Object listObj, VectorArrayAggBuffer myagg, ListObjectInspector inputOI) {
        int listLen = inputOI.getListLength(listObj);
        if (listLen > myagg.sumArray.size())
            myagg.sumArray.ensureCapacity(listLen);
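However, ensureCapacity does not actually change the size of the list. It only reserves room in the backing array, so sumArray.size() is still 0 and the subsequent get(0) throws the exception above. According to this Stack Overflow answer: https://stackoverflow.com/questions/7688151/java-arraylist-ensurecapacity-not-working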
ensuring capacity changes the capacity, which is the size the list can reach before it next needs to copy values
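
To illustrate the difference, here is a minimal standalone sketch (not the Brickhouse code itself). It assumes sumArray is an ArrayList<Double>, which I have not verified against the source, and padWithZeros is a hypothetical helper name I made up to show one possible way to grow the list before indexing into it.

```java
import java.util.ArrayList;
import java.util.List;

public class EnsureCapacityDemo {

    // Hypothetical helper, not part of Brickhouse: appends zeros until the
    // list holds at least targetSize elements, so get(i) and set(i, ...) are
    // valid for every i < targetSize.
    static void padWithZeros(List<Double> sums, int targetSize) {
        while (sums.size() < targetSize) {
            sums.add(0.0);
        }
    }

    public static void main(String[] args) {
        // Assumption: the aggregation buffer holds an ArrayList<Double>.
        ArrayList<Double> sumArray = new ArrayList<>();

        // ensureCapacity only reserves space in the backing array;
        // the list still contains no elements.
        sumArray.ensureCapacity(3);
        System.out.println("size after ensureCapacity: " + sumArray.size());

        try {
            sumArray.get(0); // same failure mode as in the stack trace
        } catch (IndexOutOfBoundsException e) {
            System.out.println("get(0) failed: " + e.getMessage());
        }

        // Actually growing the list makes indexed access safe.
        padWithZeros(sumArray, 3);
        sumArray.set(0, sumArray.get(0) + 1.5);
        System.out.println("size after padding: " + sumArray.size());
    }
}
```

If the assumption about the element type holds, replacing the ensureCapacity call in addVector with a loop like the one in padWithZeros might be enough to avoid the exception, but I have not tested that against the UDAF.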