riemann/riemann

Store the events for riemann in an external database

Opened this issue · 3 comments

Is your feature request related to a problem? Please describe.
Currently there is no way to scale Riemann. I wanted to know if we can store the events in a separate DB/cache and have multiple Riemann instances use it, making Riemann scalable.

Describe the solution you'd like
Use a separate DB to store events, TTLs, etc. before they are sent to their destination (e.g. InfluxDB).

Describe alternatives you've considered
I haven't figured out any alternatives yet. Suggestions are welcome.

jarpy commented

You might want to try the Riemann Users mailing list to have a conversation about architectural patterns that could help you achieve your goals.

My team, for example, uses Logstash as a routing and queuing layer in front of Riemann. It is configured to send most events to Elasticsearch for storage, and also sends some of them to Riemann. If we needed to, we could use the routing layer to route subsets of events to multiple Riemann instances. Riemann itself is inherently not a distributed application, doing everything in memory. That makes it really fast, but leaves distributed architecture decisions in the hands of the operator.
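To make the routing-layer idea concrete, a Logstash pipeline along these lines could send everything to Elasticsearch and only a subset of events on to Riemann. This is an illustrative sketch, not our actual config: the hostnames, the `beats` input, and the `[type]` condition are placeholders, and it assumes the logstash-output-riemann plugin is installed.

```text
# Logstash pipeline sketch -- hostnames and the routing condition are placeholders.
# Requires the logstash-output-riemann output plugin.
input {
  beats { port => 5044 }
}

output {
  # Everything goes to Elasticsearch for storage.
  elasticsearch { hosts => ["es.internal:9200"] }

  # Only a subset is routed on to Riemann for stream processing.
  if [type] == "metric" {
    riemann {
      host => "riemann.internal"
    }
  }
}
```

The same conditional pattern could route different subsets to multiple Riemann instances.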

sanel commented

AFAIK you can't scale Riemann this way, because there are two things to store:

  1. the index, which you might or might not use. This is just a hashmap of internal events, kept until they expire. It could be offloaded relatively easily to external storage, like Redis.
  2. core state held inside stream function calls (closures). I don't think this can easily be put somewhere else.

I think the only "proper" way to scale Riemann is federation, something like Prometheus does [1] and @jarpy mentioned: have one Riemann that accepts all metrics and passes them down to other Riemann instances that do the specific logic, calculations, or storage in a database. Diagram:

                                +--------> riemann #2
            +------------+      |
  metric -> | riemann #1 | -----+
            +------------+      |
                                +--------> riemann #3

code:

(streams
  ; match on the service name, not the numeric metric field
  (where (service #"^cpu")
    (forward riemann-2))

  (where (service #"^disk")
    (forward riemann-3)))
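The snippet above assumes riemann-2 and riemann-3 are already defined as clients. A fuller riemann.config sketch for riemann #1 might look like this (hostnames and queue sizes are made-up placeholders; async-queue! is there so a slow or dead downstream doesn't block the whole stream):

```text
; riemann.config for riemann #1 -- a sketch, not a battle-tested config.
(require '[riemann.client :as client])

; TCP clients pointing at the downstream Riemann instances (placeholder hosts)
(def riemann-2 (client/tcp-client {:host "riemann-2.internal" :port 5555}))
(def riemann-3 (client/tcp-client {:host "riemann-3.internal" :port 5555}))

(streams
  ; forward cpu-related events to riemann #2, buffered on a thread pool
  (where (service #"^cpu")
    (async-queue! :to-riemann-2 {:queue-size 10000}
      (forward riemann-2)))

  ; forward disk-related events to riemann #3
  (where (service #"^disk")
    (async-queue! :to-riemann-3 {:queue-size 10000}
      (forward riemann-3))))
```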

Now, you could scale riemann #1 this way by adding multiple nodes behind e.g. HAProxy, as long as they just forward events around. Also, if you happen to lose riemann #2, you might not get "cpu" events, but you'll still get "disk" events. Not ideal, but better than a single instance.
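To sketch the HAProxy part: since Riemann speaks plain TCP (port 5555 by default), a TCP listener balancing across several riemann #1 nodes could look like the fragment below. The addresses are placeholders, not a recommendation for a specific topology.

```text
# haproxy.cfg fragment -- illustrative sketch, addresses are placeholders
listen riemann
    bind *:5555
    mode tcp
    balance roundrobin
    server riemann-1a 10.0.0.11:5555 check
    server riemann-1b 10.0.0.12:5555 check
```

Note this only works for the stateless forwarding tier; the downstream instances holding stream state can't be load-balanced this way.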

[1] https://prometheus.io/docs/prometheus/latest/federation/

Currently, our approach is the one mentioned by @sanel, used as a pseudo multi-AZ setup:
we have multiple instances of #1 behind an LB.