datalust/seq-app-htmlemail

Index out of bounds with high frequency of events

Closed this issue · 5 comments

We use Serilog and Seq server for sentralized logging of a number of applications. From Seq, we forward errors (via a signal) to a channel in Slack. We want to build on this by using the Threshold app to further filter what we send to our Slack channel. However, when I tested this app in our test environment, I keep getting IndexOutOfRangeException with the following stack trace:

System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at Seq.App.Thresholds.ThresholdReactor.On(Event`1 evt) in c:\projects\seq-apps\src\Seq.App.Thresholds\ThresholdReactor.cs:line 79
   at Seq.Server.Features.Runner.AppHost.SendLogEventData(ISubscribeTo`1 reactor, DateTime utcTimestamp, Int64 arrivalOrder, UInt32 hash, String document)
   at Seq.Server.Features.Runner.AppHost.Send(StorageEvent storageEvent)

The error in Seq is: The event event-d4a1341f6d0108d3d2d8a90500000000 could not be sent to Test_Threshold

The signal I am testing it with spams our logging server with a lot of messages per minute, so I wonder if this may be an issue related to the capacity of the app. Does the Threshold app have a limit regarding how many events it can process per minute?

Thanks for the report! No, there's no limit; I think this one is a logical error.

https://github.com/continuousit/seq-apps/blob/master/src/Seq.App.Thresholds/ThresholdReactor.cs#L123

  // The C# modulus (%) is a remainder and thus will be negative in this case; adding
  // the bucket length gets us back into the positive index range (though note the bucket
  // will still be in the 'past'.
  eventBucket = _buckets.Length + ((_currentBucket + distance) % _buckets.Length);

The comment points out the invalid assumption; if _currentBucket is 10 and distance is -5 then the modulus will in fact still be positive (and the result out-of-bounds).

I'd guess the issue here is triggered by events coming in out-of-order (which should be fine).

I'll try to repro/test and get a fix pushed up here shortly.

Sorry, that's not entirely correct; it is the broken line of code, but the issue is when _currentBucket is abs(distance), in which case the result is _buckets.Length.

Found a few ways to break it; added a test making 100,000 jittery clock movements inching forwards over time, all stable now and re-published to NuGet.

Thanks for the fix! Everything seems to be working fine now :) Tested with the same signal as before and got no exceptions.

Great, thanks for looping back! :-)