InfluxCommunity/influxdb-ruby

Roadmap to 1.0

Opened this issue · 7 comments

dmke commented

With InfluxDB 1.0 on the horizon, I'd like to push this forward, too.

There are a few open issues and ideas I'd like to sort out beforehand:

Ruby 1.9 support?

MRI 1.9.3 has been officially end-of-life since Feb 23, 2015. Commit 12dbb05 (and only that commit) noted that the codebase would drop support for Ruby < 2.0. However, the gem remained functional under 1.9.3 up to v0.3.2.

When I pushed v0.3.3 yesterday, I accidentally (on purpose?) broke Ruby 1.9 builds (see #150, #151).

I'd suggest restoring Ruby 1.9 support in the 0.3.x line and clearly communicating this fact in the README.

Keyword arguments

Going forward, with v1.0, I'd refactor the API to use keyword arguments wherever it currently takes an options hash (optional={}) or optional positional arguments (precision=nil).

For backward compatibility, this requires deprecation warnings in the v0.4.x line and/or a compatibility API layer. v0.5 should then remove the old API.
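A minimal sketch of what such a compatibility layer could look like, assuming a hypothetical `SketchClient#write_point` (the class, the parameter names and the warning text are made up for illustration and are not the gem's actual API). The old positional `precision` argument is still accepted, but triggers a deprecation warning and is mapped onto the new keyword:

```ruby
# Hypothetical client, for illustration only; not the gem's actual API.
class SketchClient
  attr_reader :written

  def initialize
    @written = []
  end

  # New style: optional settings are keyword arguments.
  # Old style: precision as third positional argument (deprecated).
  def write_point(series, data, legacy_precision = nil, precision: nil, retention_policy: nil)
    unless legacy_precision.nil?
      warn "[DEPRECATION] positional precision is deprecated, use `precision:` instead"
      precision ||= legacy_precision
    end

    # Stand-in for the real write; just records what would be sent.
    @written << {
      series: series, data: data,
      precision: precision, retention_policy: retention_policy
    }
    true
  end
end

client = SketchClient.new
client.write_point("cpu", { "value" => 0.5 }, precision: "s") # new style
client.write_point("cpu", { "value" => 0.7 }, "ms")           # old style, warns
```

Once the deprecation period is over, the `legacy_precision` parameter and the `unless` block can be deleted without touching callers that already use keywords.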

Non-raising methods

This is described in #41. v1.0 should definitely implement this behaviour change, and maybe we introduce it in v0.6.

Better integration tests

See #139. This one is on me, I'm late :-) Maybe #145 wouldn't have happened (or would have been detected earlier).

Thoughts?

/ping @toddboom

JSON has been dropped as a dependency from rails/activesupport, sdoc, rdoc, httparty, rack, carrierwave, etc. I'd support just dropping it for 1.0.

Problems with it include the lack of Ruby 2.4 support in json 1.x, and the fact that it's redundant with the JSON support built into Ruby since 1.9.

dmke commented

@connorshea Thanks for the heads-up. I've started a 0.4-dev branch and removed the json dependency there.

@dmke Thanks for putting this together. Sorry for the delay in getting you feedback!

I'm on board with fixing the Ruby 1.9.x dependency in the 0.3 branch and deprecating it for 0.4 and beyond. The roadmap for keyword arguments and non-raising methods also sounds good to me. What are you thinking for timeframe on moving through those versions to 1.0?

Integration tests are always nice to have, but probably don't need to be too tightly coupled to a particular release. They also tend to be a bit of a rat's nest, but I'm in favor of them as long as they're not creating a huge drain on development time due to technical/implementation overhead.

dmke commented

> What are you thinking for timeframe on moving through those versions to 1.0?

Well... initially I planned to get an alpha out back in July. But (as always happens), summer came along and I was quite busy at my day job... That situation has started to normalize a bit, so I think I can get an alpha release ready in the next few weeks.

> Integration tests are always nice to have

The infrastructure code (smoke/provision.sh, rake smoke task) is in place, and Travis already performs some checks against 0.10.3-1.0.2 (+nightly), for example: https://travis-ci.org/influxdata/influxdb-ruby/jobs/167185093

These tests are still quite shallow (one queries "/ping" to get a version number, and another plays with the NOAA sample data), brittle (the NOAA test sometimes delivers unexpected results) and hacky (I would like to migrate to an RSpec suite, but I couldn't figure out how to reliably stop users from accidentally interacting with their local server installation when they run the tests with rspec instead of rake).

But: the first steps are done, and they provide low-ish hanging fruit for the next contributor (I should open a few tickets for this) :-)

dmke commented

FYI, the upcoming 0.4.0 version will drop support for Ruby < 2.2.0, and kwargs are (mostly) in place.

For async tasks, what do you think about using concurrent-ruby to manage the thread pool or job queue?

dmke commented

For high-concurrency applications, concurrent-ruby is a great choice. For this gem, I feel it's a bit overkill: the async writer spawns a single thread and consumes a single queue, and in each iteration writes as many data points as possible.
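To illustrate the single-thread, single-queue pattern described here, a rough sketch using only Ruby's built-in `Queue` and `Thread`. This is not the gem's actual writer: `SketchAsyncWriter`, `MAX_BATCH` and the `sink` block (standing in for the HTTP write) are all assumptions made for the example.

```ruby
require "thread"

# Hypothetical single-thread, single-queue writer; not the gem's code.
class SketchAsyncWriter
  MAX_BATCH = 1_000     # assumed upper bound of points per write
  STOP      = Object.new # sentinel that tells the worker to shut down

  # The block receives an Array of points; it stands in for the network write.
  def initialize(&sink)
    @queue  = Queue.new
    @sink   = sink
    @thread = Thread.new { run }
  end

  def push(point)
    @queue << point
  end

  # Flushes whatever is still queued, then waits for the worker to exit.
  def stop
    @queue << STOP
    @thread.join
  end

  private

  def run
    loop do
      batch = [@queue.pop] # blocks until at least one item is available
      # Drain whatever else is already queued, without blocking.
      # Safe with a single consumer: nobody else can empty the queue.
      batch << @queue.pop(true) while batch.size < MAX_BATCH && !@queue.empty?

      stopping = batch.any? { |item| item.equal?(STOP) }
      batch.reject! { |item| item.equal?(STOP) }

      @sink.call(batch) unless batch.empty?
      break if stopping
    end
  end
end

batches = []
writer = SketchAsyncWriter.new { |points| batches << points }
5.times { |i| writer.push({ series: "cpu", values: { value: i } }) }
writer.stop
```

How many batches the five points end up in depends on thread scheduling; the order of the points is preserved because there is only one consumer.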

Writing to the database itself is also not very time consuming. A data point is basically a hash that needs to get serialized. Most of the overhead comes from network IO. Querying data is much more involved, but that raises no concurrency issues.
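As a rough idea of how cheap that serialization step is, here is a simplified sketch that turns a point hash into an InfluxDB line protocol string. The helper name `sketch_line` and the hash keys (`:series`, `:tags`, `:values`, `:timestamp`) are assumptions for this example, and escaping of special characters is deliberately left out, so this is not a substitute for the gem's own serializer.

```ruby
# Hypothetical, simplified serializer: no escaping of spaces, commas or quotes.
def sketch_line(point)
  # Tags are sorted by key and appended to the measurement name.
  tags = (point[:tags] || {}).sort.map { |key, value| "#{key}=#{value}" }

  fields = point.fetch(:values).map do |key, value|
    formatted = case value
                when Integer then "#{value}i"     # integers carry an "i" suffix
                when String  then "\"#{value}\""  # strings are double-quoted
                else value.to_s                   # floats and booleans as-is
                end
    "#{key}=#{formatted}"
  end

  head = ([point.fetch(:series)] + tags).join(",")
  # The timestamp is optional; compact drops it when it is nil.
  [head, fields.join(","), point[:timestamp]].compact.join(" ")
end

line = sketch_line({
  series:    "cpu",
  tags:      { host: "a" },
  values:    { value: 0.5, count: 3 },
  timestamp: 1_465_839_830
})
# line == "cpu,host=a value=0.5,count=3i 1465839830"
```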

Do you have a concrete performance issue when writing data?