InfluxGraph/influxgraph

Support for multiple databases as prefix

infernix opened this issue · 3 comments

We're looking to use many different databases with influx due to privilege separation and creating a graphite-api instance for each seems infeasible. I'd like to ask if it is possible to use multiple databases and select them as the prefix as opposed to hardcoding the database in the config file.

So by way of example, say we have db1, db2 and db3 in influx, each with a measurement called temperature that has a field celcius. I'd like to use

db1.temperature.celcius
db2.temperature.celcius
db3.temperature.celcius

Are there any reasons why this would not be possible? I've quickly scanned the code but didn't see a quick and dirty way to do this.

If I understand correctly, the intention is to use db1.temperature.celcius as the metric path in order to select the temperature.celcius metric from db1 database?

It is technically possible, however, there are a few issues that will come up:

  • InfluxDB's template configuration, which InfluxGraph has ported over, does not support database names in templates.
  • InfluxDB does support multiple graphite service configurations in order to write to multiple databases, but each of these has to run on its own port and with its own templates. Incoming writes need to know in advance which service port writes to which database.
  • With writes in native influxdb line protocol, the database is configured per query. For reads, the DB also has to be specified per query.
  • Adding database configuration support to templates is doable but is more complicated code-wise than running multiple influxgraph instances. Consider that for X databases there would have to be X influxdb clients, X indices to build, X template sections to configure, and so on. The templating code is pretty complicated as it is, but it 100% matches the influxdb implementation. I'm not particularly in favour of diverging from the influxdb implementation.

OTOH, running multiple instances is pretty easy with docker and that is what I would recommend for this use case. The client will need to target a different service port depending on the DB to query, as in the InfluxDB multiple graphite services case. This could be used in conjunction with a re-writing proxy in between the client and influxgraph to route incoming requests to the right backend, or even to attempt all of them until it gets a result.
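To make the re-writing proxy idea a bit more concrete, here is a minimal sketch of just the routing step in Python. The backend URLs, the `BACKENDS` mapping and the `route_target` helper are all made up for illustration; none of this is part of InfluxGraph. It also only handles a bare metric path, not a target wrapped in Graphite functions.

```python
from typing import Dict, Tuple

# Hypothetical mapping of database prefix -> InfluxGraph instance URL.
# One instance per database, each listening on its own port.
BACKENDS: Dict[str, str] = {
    "db1": "http://localhost:8001",
    "db2": "http://localhost:8002",
    "db3": "http://localhost:8003",
}


def route_target(target: str, backends: Dict[str, str] = BACKENDS) -> Tuple[str, str]:
    """Split 'db1.temperature.celcius' into (backend URL, 'temperature.celcius')."""
    prefix, sep, rest = target.partition(".")
    if not sep or not rest:
        raise ValueError(f"target has no database prefix: {target!r}")
    if prefix not in backends:
        raise KeyError(f"no backend configured for database {prefix!r}")
    # The proxy would forward the request to this backend with the
    # prefix stripped from the metric path.
    return backends[prefix], rest
```

A real proxy would also have to rewrite the metric names in the responses so the client sees the prefixed paths again.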

It would be useful to be able to do this, just not sure how feasible it is on a single instance, given the template and indexing requirements. Very open to suggestions on a better way though.

If the DB configuration does not need to be included in the metric path itself but can be computed from a regex on the metric path, like the existing aggregation function configuration, then that would be much simpler. That does mean that the same schema cannot be used for all DBs however.
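As a rough sketch of what computing the DB from a regex on the metric path could look like (the rule list and the `db_for_path` function are hypothetical, not an existing InfluxGraph option):

```python
import re
from typing import List, Optional, Pattern, Tuple

# Hypothetical rules, checked in order; the first matching pattern wins.
DB_RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"^temperature\."), "db1"),
    (re.compile(r"^humidity\."), "db2"),
]


def db_for_path(
    path: str,
    rules: List[Tuple[Pattern[str], str]] = DB_RULES,
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the database for a metric path, or `default` if no rule matches."""
    for pattern, db in rules:
        if pattern.search(path):
            return db
    return default
```

This also shows the limitation: temperature.celcius can only ever resolve to one database, so identical measurements in several DBs cannot be told apart this way.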

Another option would be including the DB as part of the Graphite API query as with native influx queries but that obviously means diverging from the standard Graphite API - Grafana would not know to include it.

That first paragraph does indeed illustrate the idea. In my case, I only care about reads; writes all go straight to influxdb.

The thought was to have just a single, arbitrary graphite instance in grafana. Any graphite metric is then prefixed with the database name, i.e. the value that is currently configured as db: in graphite-api.yml (influxdb: db: foo). A global flag for influxgraph would be set (prefix_as_db: true), and graphite-api treats the prefixed path as any other metric. All templates stay exactly as-is (which works for us since the measurements in each database are identical); it's only influxgraph that strips the prefix off queries received through the graphite API and uses it as a variable in queries (influxql: FROM <database_name>..<measurement_name>). But as I understand now, that may not be possible without modifying graphite-api too.
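Roughly, the prefix stripping I have in mind would look like the sketch below. It assumes the simplest possible layout, db.measurement.field, and ignores templates and whatever index influxgraph builds; `build_query` is a made-up name, not existing code.

```python
def build_query(metric_path: str) -> str:
    """Turn 'db1.temperature.celcius' into an InfluxQL SELECT on db1.

    Assumes the path is exactly <db>.<measurement>.<field>; anything after
    the second dot is treated as part of the field name.
    """
    # Raises ValueError if the path has fewer than three components.
    db, measurement, field = metric_path.split(".", 2)
    # The empty middle component ("..") selects the default retention policy.
    return f'SELECT "{field}" FROM "{db}".."{measurement}"'
```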

I guess I will run multiple instances then and run each on a separate port. Should be doable up to a few dozen, and if not I suppose scale-out with VMs will have to do.

It's not as simple as that. Influx queries are not compatible with Graphite API queries. InfluxGraph builds an index to translate queries such as SELECT field1 FROM db1.<measurement> to and from metric paths such as <measurement>.field1. This index is used both when querying metric paths in InfluxGraph (browsing metrics in Grafana) and when querying data, to translate to influx queries.

If the DB name is made part of the metric path, that index build needs to be repeated per DB. InfluxGraph needs to know the DB to build the index. It cannot know the DB beforehand if the DB is part of the metric path, and it cannot generate the metric path without knowing the DB name.

In other words, there is no way to do this without modifying templates to tell InfluxGraph which DB to use for this template so it can then query the DB to generate metric paths based on that template.
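To illustrate why the DB list has to be known up front, here is a simplified sketch of a per-DB index build. This is not the actual InfluxGraph index, which is template-driven; the `schemas` input stands in for whatever a schema query against each database would return, and `build_prefixed_index` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def build_prefixed_index(
    schemas: Dict[str, Dict[str, List[str]]],
) -> Dict[str, Tuple[str, str, str]]:
    """Map '<db>.<measurement>.<field>' paths to (db, measurement, field).

    `schemas` is {db: {measurement: [field, ...]}}, so every database has to
    be known and queried before any metric path can be generated.
    """
    index: Dict[str, Tuple[str, str, str]] = {}
    for db, measurements in schemas.items():
        for measurement, fields in measurements.items():
            for field in fields:
                index[f"{db}.{measurement}.{field}"] = (db, measurement, field)
    return index
```

The outer loop is the part that multiplies the work: one schema query and one index build per database.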

This is far more complex than building separate docker images for the DBs, but if you want to take a stab at it, you can start with the template parsing code: add support for the prefix_as_db flag and extract the DB name from the first field in the template.

Btw, for handling multiple docker services, see the docker-compose sample configuration.