Loop block monitor stops detecting blockages after a while

Question

Closed this issue 10 months ago · 4 comments

Loop block monitor has a field called printBlockTimeNS which (I think) is there to prevent too much logging for a single blockage. It appears to allow the next log only at 1.4x the duration of the last logged one, up to a cap of 20x the configured monitor interval.
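To make the back-off concrete, here is a minimal sketch of how I understand it. This is a hypothetical model, not the Chronicle-Threads source: the class name, shouldLog() and threshold() are made up for illustration, and only the 1.4x growth and the 20x cap come from my reading of the behaviour.

```java
// Hypothetical model of the back-off described above; NOT the Chronicle-Threads source.
final class BlockReportBackoffSketch {
    private final long intervalNS;   // configured monitor interval
    private long printBlockTimeNS;   // current reporting threshold

    BlockReportBackoffSketch(long intervalNS) {
        this.intervalNS = intervalNS;
        this.printBlockTimeNS = intervalNS;
    }

    // Returns true if a blockage of this length would be logged, then raises the threshold.
    boolean shouldLog(long blockedNS) {
        if (blockedNS < printBlockTimeNS)
            return false;
        // Grow by 1.4x (integer arithmetic), capped at 20x the configured interval.
        printBlockTimeNS = Math.min(printBlockTimeNS * 14 / 10, 20 * intervalNS);
        return true;
    }

    // What I assume should happen whenever an iteration completes.
    void resetTimers() {
        printBlockTimeNS = intervalNS;
    }

    long threshold() {
        return printBlockTimeNS;
    }
}
```

In this model, an interval of 100 gives thresholds of 100, 140, 196 and so on until the value sticks at 2000. If resetTimers() is never called, shorter blockages are never reported again.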

I assume this is supposed to get reset (so blockages will again be reported at the configured interval) each time an iteration completes, but that doesn't appear to be happening. If you put a log message in resetTimers() it doesn't appear to ever get called.

I believe the issue is that the logic for calling net.openhft.chronicle.threads.ThreadHolder#resetTimers is:

        if (startedNS == 0 || startedNS == Long.MAX_VALUE) {
            thread.resetTimers();
            return false;
        }

For a MediumEventLoop, this condition is only true if the check happens to run during the pauser.pause() that occurs at the end of a non-busy iteration. If the event loop is always busy, the condition never evaluates to true, so resetTimers() is never called.
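A rough way to see this is to feed the quoted condition the startedNS values a monitor might sample. This is a sketch only: the class and method names are hypothetical, and it assumes startedNS holds one of the two sentinel values (0 or Long.MAX_VALUE) only while the loop is paused, and a real timestamp otherwise.

```java
// Hypothetical illustration of the reset condition; NOT the Chronicle-Threads source.
final class ResetConditionSketch {
    // Same condition as quoted above from ThreadHolder.
    static boolean wouldReset(long startedNS) {
        return startedNS == 0 || startedNS == Long.MAX_VALUE;
    }

    // Counts how many monitor samples would end up calling resetTimers().
    static int countResets(long[] sampledStartedNS) {
        int resets = 0;
        for (long startedNS : sampledStartedNS) {
            if (wouldReset(startedNS))
                resets++;
        }
        return resets;
    }
}
```

Samples from an always-busy loop (all real timestamps) give a count of zero, whereas a loop that is sometimes caught mid-pause gives a non-zero count. That matches what I see when I put a log message in resetTimers().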

As part of this fix we should beef up testing for LBM as it's an important piece of infrastructure.

Answer 1 · 2024-02-22T11:40:31.000Z

Thanks Nick - I'll start looking at this.

Answer 2 · 2024-03-05T02:19:41.000Z

Released in Chronicle-Threads-2.23.34, BOM-2.23.214

Answer 3 · 2024-03-05T02:29:58.000Z

Released in Chronicle-Threads-2.24.20, BOM-2.24.108

Answer 4 · 2024-03-05T09:50:31.000Z

Released in Chronicle-Threads-2.25ea5, BOM-2.25ea30