jeffpierce/cassabon

large batches, or too many for gocql?


Hey @jeffpierce

We've started seeing an issue where we get a lot of these in our cassandra system.log:

 WARN [Native-Transport-Requests:19140753] 2016-07-20 00:59:47,922 BatchStatement.java (line 226) Batch of prepared statements for [cassabon.rollup_000021600] is of size 113984, exceeding specified threshold of 5120 by 108864.

I followed through all the batching code and it seems straightforward enough, so I set batchsize to something much smaller in cassabon.yaml (it was set at 15000). However, at both 150 and 1500, while the cassandra too-large-batch warning subsided, we started getting these in the cassabon system.log:

2016/07/20 00:46:11.768404 [system] [WARN] MetricManager::writer retrying write: gocql: no response received from cassandra within timeout period

My hunch is that gocql is DoS'ing itself: with smaller batches there are so many more of them that gocql is choking trying to send them all out?

This led me to pyr/cyanite#80 (which you commented on), where cyanite removed batches altogether, and to doing too much reading (this post, this follow-up, and datastax).

However, I don't think the batches cassabon is doing are particularly bad. When statements get appended to the batch (https://github.com/jeffpierce/cassabon/blob/master/datastore/metricstore.go#L144), path should be the same many times in a row (I think?), so the batch should mostly be writing to a single partition key. It could probably be changed to an unlogged batch (https://github.com/jeffpierce/cassabon/blob/master/datastore/batchwriter.go#L45) with no ill effects and a performance boost, though.
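For reference, here's roughly the change I'm picturing -- just a sketch of the gocql calls involved, not cassabon's actual batchwriter code, and the Metric fields / column names are made up for illustration:

```go
package datastore

import "github.com/gocql/gocql"

// Metric is a stand-in for whatever cassabon actually appends to the batch;
// the fields and the column names below are made up for illustration.
type Metric struct {
	Path string
	Time int64
	Stat float64
}

// writeRollups shows the one-constant change: create the batch as
// gocql.UnloggedBatch instead of gocql.LoggedBatch. Since the appended
// statements mostly share a partition key (path), skipping the batchlog
// shouldn't lose anything and avoids the extra coordinator round trips.
func writeRollups(session *gocql.Session, metrics []Metric) error {
	batch := session.NewBatch(gocql.UnloggedBatch) // was gocql.LoggedBatch

	for _, m := range metrics {
		// Hypothetical insert; the real rollup statement will differ.
		batch.Query(
			"INSERT INTO rollup_000021600 (path, time, stat) VALUES (?, ?, ?)",
			m.Path, m.Time, m.Stat,
		)
	}
	return session.ExecuteBatch(batch)
}
```

As far as I can tell, the only semantic difference is that Cassandra skips the batchlog, which should be fine when everything in the batch targets the same partition.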

My thought for fixing this was maybe to drop batches altogether (following cyanite's lead) and have some async channel that handles writing to cassandra. An issue there is making sure the throughput is limited (or limitable) so you don't start overloading cassandra with write requests. But also, given that the batches seem to be about as optimal as they can be (once they're changed to unlogged), I'm also interested in why turning down the batch size causes gocql to start timing out, and whether there's something the cassabon config could pass to gocql to deal with this.
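To make the channel idea concrete, this is roughly what I'm picturing -- purely a sketch with made-up names (write, asyncWriter, workers, maxPerSec), not a proposal for cassabon's actual structure:

```go
package datastore

import (
	"time"

	"github.com/gocql/gocql"
)

// write is a hypothetical unit of work for a single Cassandra statement.
type write struct {
	stmt string
	args []interface{}
}

// asyncWriter sketches the "no batches" approach: statements arrive on a
// channel and a small pool of workers executes them as individual queries,
// with a shared ticker capping the overall write rate so cassabon can't
// flood Cassandra. workers and maxPerSec would be the knobs cassabon.yaml
// exposes to make the throughput limitable.
func asyncWriter(session *gocql.Session, in <-chan write, workers, maxPerSec int, errs chan<- error) {
	// One tick per allowed write, shared across all workers.
	limiter := time.NewTicker(time.Second / time.Duration(maxPerSec))

	for i := 0; i < workers; i++ {
		go func() {
			for w := range in {
				<-limiter.C // wait for the next write slot
				if err := session.Query(w.stmt, w.args...).Exec(); err != nil {
					errs <- err
				}
			}
		}()
	}
}
```

On the gocql side, the only knobs I can see that cassabon's config could plausibly pass through are Timeout and NumConns on gocql.ClusterConfig, but raising the timeout feels like treating the symptom rather than the cause.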

Any thoughts or suggestions?

CC @mredivo if you have feedback as well.

Large batches have thus far been a necessity to prevent exactly what you're describing -- a cassabon node (or 6) effectively launching a denial-of-service attack on a cassandra node due to the huge number of writes. Any backup there causes a queued pileup that eventually breaks the whole thing.

It doesn't hurt to give unlogged batches a try to see if there's better performance associated with them, though.

I'll try unlogged batches first (I think we build our rpm from change/cassabon) and report back. My hope is that performance improves enough that I can decrease the batch size a bit and the problem goes away, without having to do something way more complicated like buffering async writes to cassandra.

change/cassabon is quite a bit behind -- it's also missing the Elasticsearch changes that fixed the issues with starting up under load (and just make cassabon quite a bit nicer on ES overall).

Shoot me a PR if the unlogged batches work. Otherwise...yeah, it'll probably take something a bit more complicated like a work queue. If it ends up requiring a work queue, there's an implementation in the index manager.