influxdata/influxdb

Consider using seconds as the default precision


InfluxDB currently defaults to nanosecond precision for writes and queries. Most other tools and languages use second precision. Given that

  • a common beginner mistake is using timestamps with second precision (resulting in a bunch of 1970 points that trip people up), and
  • significant compression gains can be realized when using seconds as the precision with the TSM engine,

should we consider changing the default precision for timestamps to seconds?

For reference, the default precision is set here:
https://github.com/influxdata/influxdb/blob/master/models/points.go#L1241

I'm a big fan of this. Currently our approach has been to default to nanoseconds on the database, but recommend that people use seconds. I think it makes sense to default to the time precision that we recommend people use (and what is probably the most commonly needed timestamp precision, too).

Also, +1 to most other tools using seconds by default; this would be a boon to new users.

We could add some auto-guessing code for when no precision has been set anywhere. We're talking about three orders of magnitude between adjacent precision settings (s, ms, us, ns). I think JavaScript defaults to milliseconds since the epoch. If a ms timestamp gets interpreted as seconds, it'll be over 46,000 years in the future, which can't even be represented with an int64 of "nanoseconds since epoch", which only has a span of ±292 years. Going the other direction (interpreting seconds as milliseconds) still lands you about 46 years in the past.

We could simply pick the precision that is closest to the current time with an order of magnitude check.
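
A minimal sketch, in Go, of what such a check could look like; `guessTime` and `absDur` are hypothetical helpers for illustration, not InfluxDB's actual implementation:

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// guessTime interprets a raw timestamp of unknown precision at each
// supported scale and keeps the interpretation closest to "now".
// Division (rather than multiplication) keeps large ns values from
// overflowing int64.
func guessTime(raw int64, now time.Time) time.Time {
	candidates := []time.Time{
		time.Unix(raw, 0), // raw is seconds
		time.Unix(raw/1000, (raw%1000)*int64(time.Millisecond)),             // raw is ms
		time.Unix(raw/1000000, (raw%1000000)*int64(time.Microsecond)),       // raw is us
		time.Unix(raw/1000000000, raw%1000000000),                           // raw is ns
	}

	best := candidates[0]
	for _, c := range candidates[1:] {
		if absDur(now.Sub(c)) < absDur(now.Sub(best)) {
			best = c
		}
	}
	return best
}

// absDur returns |d|, saturating at the maximum Duration: time.Time.Sub
// clamps out-of-range results to the minimum Duration, which would
// overflow if negated naively.
func absDur(d time.Duration) time.Duration {
	if d == math.MinInt64 {
		return math.MaxInt64
	}
	if d < 0 {
		return -d
	}
	return d
}

func main() {
	now := time.Now()
	fmt.Println(guessTime(1458000000, now))          // sent as seconds
	fmt.Println(guessTime(1458000000000, now))       // sent as milliseconds
	fmt.Println(guessTime(1458000000000000000, now)) // sent as nanoseconds
}
```

All three calls above resolve to the same instant in March 2016, regardless of which precision the client happened to use.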

We would, of course, still highly recommend setting a precision, but this change would remove some of the shock factor for new users.

@joelegasse An order-of-magnitude check sounds like a straightforward way to handle timestamps that lack a user-supplied precision. We could even generate a log message if a batch contains a timestamp that is close to 1970 even at second precision.

didn't think of that, sounds like a good idea 👍

I'd much prefer the order-of-magnitude check to changing the default.

This is really a problem around the client libraries. Client libraries should make it clear which precision you're using, and then this problem wouldn't come up.

I think the order of magnitude check would mean there is no longer a "default" precision, right? If we go that route, I think we should also write a warning to the logs for each batch of points that doesn't specify a precision. Something that will tell them, "We're guessing what you meant, but you've probably done something bad, and you should feel bad..." 😛

I'd be worried about the logs getting spammed with that message if it was printed on every write without a precision. We already generate a ton of logs.

@joelegasse, @gunnaraasen yeah, logging that on every write would be way too loud

Some of the awesome is lost if users don't experience nanosecond support initially, I think. It's very powerful to SHOW that the database is so modern and powerful that it handles nanoseconds with aplomb. If we default to seconds, that's a power-user feature that almost never gets noticed except by the people explicitly looking for it. Not really a strong argument, I know, but I do think it's important to consider the perceptual impact of this change.

Seconds precision is also the default for devops tools, but what are the defaults in the IoT world? What do data historians typically use? In APM, milliseconds is the default. The default we pick expresses an opinion as to the primary use case. Why not leave it at nanoseconds, which is allegiant to none and forward-looking?

@beckettsean The order-of-magnitude check would replace the concept of a "default precision", and would instead pick the scale that would have the timestamp closest to the current time. Points without a timestamp would still be tagged with the nanosecond-precision time of when they were received by the server.

This check would mitigate some of the confusion/frustration that comes from simply assuming an unlabeled timestamp is in "nanoseconds since epoch". It certainly would not remove support for nanoseconds, but it would mean that users aren't left wondering why their data was "lost" when it's really just stored a couple of minutes into January 1970.
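
To make that failure mode concrete, a quick illustration (the timestamp value is just an example):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ts := int64(1458000000) // a seconds-precision timestamp from March 2016

	// Read as nanoseconds since epoch, it lands ~1.5 seconds after 1970:
	fmt.Println(time.Unix(0, ts).UTC()) // 1970-01-01 00:00:01.458 +0000 UTC

	tsMs := ts * 1000 // the same instant, expressed in milliseconds
	// Read as nanoseconds, even that only reaches 24 minutes past the epoch:
	fmt.Println(time.Unix(0, tsMs).UTC()) // 1970-01-01 00:24:18 +0000 UTC
}
```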

@joelegasse I like the order-of-magnitude check, provided that query responses continue to use nanosecond precision unless otherwise explicitly requested.

It would be nice if the line protocol supported a timestamp with a unit [h,m,s,ms,us,ns]:
https://docs.influxdata.com/influxdb/v0.11/write_protocols/write_syntax/#line-protocol
The default should likely stay ns.

```
disk_free,tag=t  value=1  timestamp[s,ms,us,ns]
# if a blank timestamp is given, the precision could still be included
disk_free,tag=t  value=1  [s,ms,us,ns]
```
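
Parsing the suffixes above could be as simple as a lookup table of nanosecond multipliers; this is purely illustrative, since the suffix syntax is only a proposal:

```go
package main

import (
	"fmt"
	"time"
)

// suffixScale maps the proposed unit suffixes to nanosecond multipliers.
var suffixScale = map[string]int64{
	"h":  int64(time.Hour),
	"m":  int64(time.Minute),
	"s":  int64(time.Second),
	"ms": int64(time.Millisecond),
	"us": int64(time.Microsecond),
	"ns": 1,
}

// toNanos converts a value carrying one of the proposed suffixes into
// nanoseconds since epoch; the bool is false for an unrecognized suffix.
func toNanos(raw int64, suffix string) (int64, bool) {
	scale, ok := suffixScale[suffix]
	if !ok {
		return 0, false
	}
	return raw * scale, true
}

func main() {
	ns, _ := toNanos(1458000000, "s")
	fmt.Println(ns) // 1458000000000000000
}
```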

I never knew I could get significant compression gains when using seconds as the precision; I wish that were mentioned in the help page for the line protocol.

@steverweber Specifying the unit on the timestamp will likely be a feature of the next iteration of the line protocol. See the discussion at #6037 for more details. I've added a comment there about allowing a per-point precision without timestamps ([s,ms,us,ns]), since it hadn't been suggested before.

I've also opened influxdata/docs.influxdata.com-ARCHIVE#372 to get the improved compression benefits documented in more places.

Is there a reason you'd prefer the default to remain ns when no precision is provided, versus the order-of-magnitude check suggested above?

@gunnaraasen Thanks for managing the suggestions.

Is there a reason you'd prefer the default to remain ns when no precision is provided, versus the order-of-magnitude check suggested above?

As a beginner I assumed the timestamp was in seconds, and failed. If/when #6037 is resolved, I expect incorrect timestamp usage with respect to the line protocol to be greatly reduced.

Changing the default from ns seems to require some fun code changes to maintain compatibility. Is the added code complexity worth the gains? I don't know.

Also... what happens when someone really does want to use some strange timestamp in the distant past? And then there's the fun of reading documentation with a paragraph describing this time-check nuance.

@steverweber thanks for the feedback!

As a beginner I assumed the timestamp was in seconds, and failed.

This is an initial pitfall that would be greatly mitigated by auto-setting a precision based on the order of magnitude of the timestamp. The order-of-magnitude check will only occur when no precision parameter is set.

The change would add some documentation and code complexity. However, we frequently see issues opened by new users who write seconds-precision timestamps without specifying a precision and are confused when their data shows up at Jan 1, 1970. Doing the right thing in the majority of cases feels like it trumps sticking with an overly precise default that actively causes confusion among new users.

In terms of maintainability, only clients that don't already set a precision and that write points within specific time ranges (1969-1971 and >2400) will need to be updated, and it'll probably be a one-line code change for most clients.
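
For most clients, that one-line change is just adding the precision query parameter to the write request. A minimal sketch against the HTTP write endpoint, assuming a local server and a database named mydb:

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// precision=s tells InfluxDB the trailing timestamp is in seconds,
	// so no default (or guess) is ever applied.
	url := "http://localhost:8086/write?db=mydb&precision=s"
	body := strings.NewReader("disk_free,tag=t value=1 1458000000")

	resp, err := http.Post(url, "text/plain", body)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status) // "204 No Content" on success
}
```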

For the timeline, I think we'd like to get the new version of the line protocol into the 0.13 or 0.14 release and both line protocol versions would be supported for a couple releases to allow a smooth transition.

I was playing devil's advocate. Looks like a good migration strategy. I like how the influxdata devs are not afraid to nip these things in the bud /early/.

We talked about this as a group last week and decided to go ahead and roll forward with this for v0.13.0. As a reminder, this is only applicable when a precision isn't specified. In other words, the specified precision will always be used, but in the absence of that, we'll try to intelligently guess the precision based on timestamp magnitude.

If a precision is not set during the write, will it truncate a timestamp to 10 digits? I'm seeing our Node.js client send ms, but we lose 3 digits in InfluxDB. (We aren't specifying a precision.)

Yet a query like select * from request where time > now() - 30m shows no results, whereas the padded query select * from request where time > now() - 2455 weeks (47 years, ~1970 😭) starts to show results.

@shaunwarman if you aren't specifying precision then InfluxDB thinks you are sending nanoseconds, which is why your metrics are close to the epoch.

thanks @sparrc added the precision flag!
