georgestarcher/Splunk-Class-httpevent

flushQueue has no limit in size and can eat up the whole memory

Closed this issue · 0 comments

As you can see from the code, the flushQueue is initialized with maxsize=0:

self.flushQueue = Queue.Queue(0)

Because of this, the flushQueue can potentially eat up all available system memory if the event-producing code batches new events faster than the threadCount worker threads can send them to the HEC endpoint.

In most cases, eating up all memory is not the expected behaviour. Instead, the flushQueue should have a reasonably sized limit that ensures event bursts get queued up just fine while the threads send the batched events to the destination HEC endpoint.
When flushQueue and batchEvents are full, a call to batchEvent should block until a slot becomes available. This mimics the behaviour of the whole Splunk data processing pipeline, which starts blocking from the sink (indexer) up to the producer (e.g. a UF monitor input) when Splunk's internal queues fill up, so this behaviour is likely what Splunk users expect.
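As a minimal sketch of that back-pressure (illustrative only, not the library's actual code; only flushQueue and the Queue module come from the snippet above, the rest is hypothetical):

    import Queue  # Python 2 stdlib; named "queue" on Python 3

    # Bounded instead of Queue.Queue(0): put() blocks once maxsize batches
    # are queued, pausing the producer until a worker frees a slot.
    flushQueue = Queue.Queue(maxsize=1000)

    def post_to_hec(batch):
        """Placeholder for the HEC POST the worker threads would perform."""
        pass

    def batch_event(batch):
        # Blocks when the queue is full -> back-pressure toward the producer,
        # similar to Splunk's internal queue blocking behaviour.
        flushQueue.put(batch, block=True)

    def worker():
        while True:
            batch = flushQueue.get()   # blocks until a batch is available
            post_to_hec(batch)
            flushQueue.task_done()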

I suggest a default of 100 * threadCount for the maximum queue size.
That will roughly allow the queue to grow up to ~100 MB in size.
Math:

    maxByteLength = 100000
    threadCount = 10
    maxQueueSize = 100 * threadCount

100,000 bytes * 100 * 10 = 100,000,000 bytes / 1024 / 1024 ≈ 95.4 MB, plus overhead for the data structure itself

That should be a reasonably sized buffer for most situations while not causing memory issues on any modern system.
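Concretely, the change could look like the following sketch (the class name and constructor signature are illustrative, not the library's actual ones; threadCount, maxQueueSize, and flushQueue follow the naming above):

    import Queue  # "queue" on Python 3

    class http_event_collector(object):
        def __init__(self, threadCount=10):
            self.threadCount = threadCount
            # Proposed default: bound the queue at 100 batches per sender
            # thread (~95 MB of payload at maxByteLength = 100000).
            self.maxQueueSize = 100 * threadCount
            self.flushQueue = Queue.Queue(self.maxQueueSize)  # was Queue.Queue(0)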