yahoo/sherlock

Error Generating Reports: java.lang.ArrayIndexOutOfBoundsException: 1

Closed this issue · 3 comments

I've been using Sherlock for a few days, and it seems to be working fine for some time.
I'm generating hourly reports, and it works for a few hours. However, after a day or two, the reports start to fail, throwing a java.lang.ArrayIndexOutOfBoundsException: 1 error.

I've looked at the logs and this is the error log:

32893 [INFO ] 2018-09-30 12:30:01,105 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - Getting anomaly reports for job [4] with frequency [hour]
32894 [ERROR] 2018-09-30 12:30:01,108 com.yahoo.sherlock.Routes - Error while viewing job report!
32895 java.lang.ArrayIndexOutOfBoundsException: 1
32896     at com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor.decodeAndSetTimestamp(LettuceAnomalyReportAccessor.java:336)
32897     at com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor.getAnomalyReports(LettuceAnomalyReportAccessor.java:311)
32898     at com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor.getAnomalyReportsForJob(LettuceAnomalyReportAccessor.java:119)
32899     at com.yahoo.sherlock.Routes.viewJobReport(Routes.java:626)
32900     at spark.TemplateViewRouteImpl$1.handle(TemplateViewRouteImpl.java:66)
32901     at spark.http.matching.Routes.execute(Routes.java:61)
32902     at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:126)
32903     at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50)
32904     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:189)
32905     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
32906     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
32907     at org.eclipse.jetty.server.Server.handle(Server.java:517)
32908     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
32909     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
32910     at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:261)
32911     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
32912     at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
32913     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:213)
32914     at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
32915     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
32916     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
32917     at java.lang.Thread.run(Thread.java:748)

I've looked at the redis logs and they are not showing any errors. I'm not sure how to fix it.
Any ideas what going on?

I can not conclude exactly what is wrong here without seeing the redis reports data. Is this for all reports (all hourly reports) ? you can debug the issue by logging specific report keys that are being fetched and check their values, by turning on the debug routes, in redis json dump at /DatabaseJson.

Okay, so I added few more log messages to actually print the report info and recompiled to actually see whats going on. I think I've figured out what the problem is, this error seems to occur when there are multiple entries in a report.

For eg, below is the sample info:

[INFO ] 2018-10-03 09:31:38,833 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - report: AnomalyReport(uniqueId=437f467f-7269-43f2-b516-c62cf7ef58e6, metricName=sum_cost, groupByFilters=, anomalyTimestamps=null, queryURL=some-url, reportQueryEndTime=25642486, jobId=6, jobFrequency=minute, status=warning, modelName=KSigmaModel, modelParam=3.0, deviationString=-100, testName=report Test - Minute, hasAnomaly=false)
[INFO ] 2018-10-03 09:31:38,836 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - --->startVals.size: 1
[INFO ] 2018-10-03 09:31:38,836 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - --->val.score: 0.0

all reports having 0 or 1 value for deviationString worked, but as soon as the values increase, the code breaks.

[INFO ] 2018-10-03 09:23:52,721 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - report: AnomalyReport(uniqueId=277b909b-aafd-4093-baa8-755017fe74c9, metricName=sum_cost, groupByFilters=, anomalyTimestamps=null, queryURL=some-url, reportQueryEndTime=25636845, jobId=4, jobFrequency=hour, status=warning, modelName=KSigmaModel, modelParam=3.0, deviationString=-100,-100, testName=report test, hasAnomaly=false)
[INFO ] 2018-10-03 09:23:52,721 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - --->startVals.size: 1
[INFO ] 2018-10-03 09:23:52,721 com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor - --->val.score: 1.0
[ERROR] 2018-10-03 09:23:52,722 com.yahoo.sherlock.Routes - Error while viewing job report!
java.lang.ArrayIndexOutOfBoundsException: 1
	at com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor.decodeAndSetTimestamp(LettuceAnomalyReportAccessor.java:341)
	at com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor.getAnomalyReports(LettuceAnomalyReportAccessor.java:312)
	at com.yahoo.sherlock.store.redis.LettuceAnomalyReportAccessor.getAnomalyReportsForJob(LettuceAnomalyReportAccessor.java:120)
	at com.yahoo.sherlock.Routes.viewJobReport(Routes.java:626)
	at spark.TemplateViewRouteImpl$1.handle(TemplateViewRouteImpl.java:66)
	at spark.http.matching.Routes.execute(Routes.java:61)
	at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:126)
	at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:189)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:119)
	at org.eclipse.jetty.server.Server.handle(Server.java:517)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:242)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:261)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:75)
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:213)
	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
	at java.lang.Thread.run(Thread.java:748)

It also broke in one of the jobs where a report had deviationString=-110,75,75.

I am able to see the reports for a job by adding a try/catch block in decodeAndSetTimestamp(), which makes it work, but I'm not sure if it is a proper bug fix for this.