grafana/metrictank

MT-Whisper-Importer-Writer can get stuck on invalid requests

replay opened this issue · 1 comment

I have seen a case where a user tried to import data using an incorrect storage-schemas.conf. Because of the incorrect storage schemas, the importer-writer tried to write this data into a Bigtable (BT) column family that did not exist, and the writes failed:

2020-12-24 13:01:46.169 [ERROR] btStore: failed to write 1835 of 1835 rows. first error: rpc error: code = NotFound desc = Error while mutating the row '<Metric ID>' (<BT table>) : Requested column family not found.
2020-12-24 13:02:01.098 [ERROR] btStore: failed to write 6 of 6 rows. first error: rpc error: code = NotFound desc = Error while mutating the row '<Metric ID>' (<BT table>) : Requested column family not found.
2020-12-24 13:02:09.133 [ERROR] btStore: failed to write 5000 of 5000 rows. first error: rpc error: code = NotFound desc = Error while mutating the row '<Metric ID>' (<BT table>) : Requested column family not found.

The importer-writer uses the Metrictank store implementation to write data to BT/Cassandra, and that implementation retries forever if an insert fails. Because of this unbounded retry behavior, the whole write queue got stuck and nothing else could be imported. I had to restart the importer-writer to unblock it.
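
To illustrate the kind of behavior that would avoid this, here is a minimal sketch of a bounded retry that gives up immediately on errors that can never succeed. It is not the Metrictank store code: `writeWithRetry`, `errPermanent`, `errGaveUp` and `errSimulatedTimeout` are hypothetical names made up for this example, and the step of classifying a real error as permanent (for instance the `NotFound` code in the log above) is assumed to happen inside the `write` callback.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errPermanent marks failures that retrying cannot fix, such as a
// missing column family. Hypothetical sentinel, not a Metrictank type.
var errPermanent = errors.New("permanent write error")

// errGaveUp is returned once the attempt budget is exhausted.
var errGaveUp = errors.New("retry budget exhausted")

// errSimulatedTimeout stands in for a transient failure in the demo.
var errSimulatedTimeout = errors.New("simulated timeout")

// writeWithRetry calls write up to maxAttempts times, sleeping for
// backoff between attempts. It returns the number of attempts made.
// It stops early on success or on an error that wraps errPermanent,
// so a single bad batch cannot block the caller forever.
func writeWithRetry(write func() error, maxAttempts int, backoff time.Duration) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = write()
		if err == nil {
			return attempt, nil
		}
		if errors.Is(err, errPermanent) {
			// Retrying cannot help; hand the error back right away.
			return attempt, err
		}
		if attempt < maxAttempts {
			time.Sleep(backoff)
		}
	}
	return maxAttempts, fmt.Errorf("%w after %d attempts: %v", errGaveUp, maxAttempts, err)
}

func main() {
	// Simulates the failure from this issue: the column family is missing.
	missingCF := func() error {
		return fmt.Errorf("column family not found: %w", errPermanent)
	}
	attempts, err := writeWithRetry(missingCF, 5, 10*time.Millisecond)
	fmt.Println("permanent:", attempts, err)

	// Simulates a transient failure that clears on the third attempt.
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return errSimulatedTimeout
		}
		return nil
	}
	attempts, err = writeWithRetry(flaky, 5, 10*time.Millisecond)
	fmt.Println("transient:", attempts, err)
}
```

With something along these lines, the caller could log and drop (or dead-letter) the failed batch and move on to the next item in the write queue instead of requiring a restart. Whether dropping is acceptable for the importer is a separate design question.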

stale commented

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.