OpenHFT/Chronicle-Queue

Memory exhaustion due to heap growth caused by CleaningThreadLocal

dharlanh opened this issue · 3 comments

We have encountered memory issues in our product due to heavy usage of Chronicle Queue. Upon investigation, we identified that the CleaningThreadLocal class is causing the problem. This class maintains a static list (cleaningThreadLocals) that holds references to all instances of CleaningThreadLocal created. Over time, this list grows indefinitely, leading to memory problems in the heap.
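To make the reported pattern concrete, here is a minimal, self-contained sketch of how a static registry of instances can grow without bound. It is a simplified model, not the Chronicle source: `RegistryLeakSketch`, `TrackedThreadLocal`, `REGISTRY` and `createAndDiscard` are hypothetical names chosen for illustration, and the assumption that nothing removes entries from the registry belongs to this model only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class RegistryLeakSketch {

    /** Hypothetical stand-in for a thread-local that registers itself globally. */
    static class TrackedThreadLocal<T> extends ThreadLocal<T> {
        // Static registry: holds a strong reference to every instance ever created.
        static final Set<TrackedThreadLocal<?>> REGISTRY =
                Collections.synchronizedSet(new HashSet<>());

        TrackedThreadLocal() {
            // Assumption of this model: nothing ever removes the entry again.
            REGISTRY.add(this);
        }
    }

    /** Creates n short-lived instances and returns the registry size afterwards. */
    static int createAndDiscard(int n) {
        for (int i = 0; i < n; i++) {
            // The caller keeps no reference, but the registry still does,
            // so the garbage collector cannot reclaim the instance.
            new TrackedThreadLocal<String>();
        }
        return TrackedThreadLocal.REGISTRY.size();
    }

    public static void main(String[] args) {
        int before = TrackedThreadLocal.REGISTRY.size();
        int after = createAndDiscard(1_000);
        System.out.println("registry grew by " + (after - before));
    }
}
```

In this model, every code path that constructs a new instance adds one permanent entry, so heap usage scales with the number of instances created rather than with the number of live threads.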

Our product involves multiple threads utilizing the writer functionality. Each thread invokes writeMessage when necessary, creating an appender and writing to the queue. As for the reader, we have a single thread triggered every 3 seconds to read these messages and publish them elsewhere.

Writer code:

queue = SingleChronicleQueueBuilder
        .single(projectPath + "/" + queueDir)
        .rollCycle(SparseRollCycles.valueOf(rollCycle))
        .build();

@Override
public void writeMessage(TelemetryData message) {
    try (ExcerptAppender queueAppender = ThreadLocalAppender.acquireThreadLocalAppender(queue)) {
        String jsonData = JsonFormat.printer().print(message);
        queueAppender.writeText(jsonData);
    } catch (InvalidProtocolBufferException e) {
        log.error("Exception caught while writing message to chronicle queue", e);
    }
}

Reader:

queue = SingleChronicleQueueBuilder
        .single(queueDir)
        .rollCycle(SparseRollCycles.valueOf(rollCycle))
        .build();
queueTailer = queue.createTailer(PRIMARY_TAILER);

@Override
public boolean isNewMessageAvailable() {
    return queueTailer.index() <= queue.lastIndex();
}

@Override
public TelemetryData readMessage() {
    try {
        String jsonData = queueTailer.readText();
        if (jsonData == null) {
            return null;
        }
        return parseJson(jsonData);
    } catch (UnsupportedOperationException e) {
        log.error("Fail to read message from chronicle queue ({})", queueTailer.queue().fileAbsolutePath());
        return null;
    }
}

During our investigation, we discovered a static method in the CleaningThreadLocal class called cleanupNonCleaningThreads, intended to clean this static list. However, even after invoking this method, the static list continues to grow indefinitely.

[Image: heapdump]

I took a heap dump and observed a significant number of instances of CleaningThreadLocal, ThreadLocalMap$Entry, and Collections$SynchronizedSet, all indicative of the same underlying issue.

Despite consulting the documentation, we couldn't identify any mistakes in our usage of Chronicle Queue. Therefore, we require assistance in resolving this memory leak issue.

tgd commented

Hi @dharlanh - many thanks for raising this with us. Please can you confirm which Chronicle Queue version you are using? If you are using the chronicle-bom please also provide that version.

Also, please can you confirm what Java version you are running this on?

Can you share with us a runnable piece of code / unit test that can reproduce this issue? That will greatly speed up any resolution.

We do address open source issues but these requests are queued behind customers with commercial support plans in place. For commercial support options please contact us here: https://chronicle.software/contact-us/

dharlanh commented

Hi @tgd, we changed the code that deals with Chronicle Queue and the problem disappeared. The intention wasn't to solve this problem, but it was solved as a consequence. I am not sure how this change solved it, but it did. So I think we can close this ticket; at least for us, this is not a problem anymore. Thanks anyway.

tgd commented

Thanks for letting us know @dharlanh