timescale/outflux

InfluxDB crashes because of high memory consumption


This is a problem with InfluxDB. I imagine outflux executes something like SELECT * FROM measurement. Queries of this kind are documented to exhaust memory on InfluxDB hosts (influxdata/influxdb#9313).
I worked around this issue by provisioning 40GB of RAM instead of the usual 8GB on my Google VM. Memory consumption during the export of a measurement with around 35M metrics reached 25GB.
As documented in the InfluxDB issue, using limits and offsets doesn't help. What I've seen make a difference is restricting the queries to date ranges 🤷‍♂️.
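To illustrate the date-range workaround, here is a minimal sketch of splitting one big export into time-bounded InfluxQL queries. It only builds the query strings. The function name `windowed_queries`, the measurement name `cpu`, and the 6-hour window are made up for the example; this is not how outflux builds its queries internally.

```python
from datetime import datetime, timedelta, timezone


def windowed_queries(measurement, start, end, step):
    """Yield one InfluxQL SELECT per [lo, hi) time window covering [start, end).

    Hypothetical helper for illustration; start and end are assumed to be UTC.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC3339 timestamp, as accepted by InfluxQL
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # the last window may be shorter than step
        yield (
            f'SELECT * FROM "{measurement}" '
            f"WHERE time >= '{lo.strftime(fmt)}' AND time < '{hi.strftime(fmt)}'"
        )
        lo = hi


# Example: one day of data split into 6-hour windows.
queries = list(
    windowed_queries(
        "cpu",
        datetime(2019, 1, 1, tzinfo=timezone.utc),
        datetime(2019, 1, 2, tzinfo=timezone.utc),
        timedelta(hours=6),
    )
)
for q in queries:
    print(q)
```

Each query then only touches a bounded slice of the measurement, so the window size can be tuned down until InfluxDB's memory usage stays within what the host has.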

I understand this is not an outflux issue, but solving it would help facilitate the transition from InfluxDB to Timescale.

@blagojts Correct me if I'm wrong, but I believe there is a setting that controls the range over which Outflux executes its queries, which could avoid this OOM issue? Have you tried adjusting the --chunk-size and --batch-size flags?

I've tried using --chunk-size to lower the InfluxDB chunk size to 1000, but the problem persists. It seems that InfluxDB has serious problems handling "big" queries.