jqlang/jq

strptime/1 ignores ISO-8601 TimeZone (format string "%z")

vintnes opened this issue · 5 comments

~ uname -a | awk '$2 = "REDACTED"'
Linux REDACTED 4.9.0-13-amd64 #1 SMP Debian 4.9.228-1 (2020-07-05) x86_64 GNU/Linux

~ jq/jq --version
jq-1.6-128-ga17dd32-dirty

~ jq/jq -nc \
' "2020-10-15T17:30:00-0400"
, "2020-10-15T17:30:00+0000"
| strptime("%FT%T%z")
'
[2020,9,15,17,30,0,4,288]
[2020,9,15,17,30,0,4,288]
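
For comparison, the two inputs name instants four hours apart, so an offset-aware parse should not produce identical results. Here is a minimal cross-check in Python, used only as an independent reference and not as a statement about jq internals. Python's `datetime.strptime` honors `%z`. The helper name `epoch_seconds` is made up for this sketch.

```python
from datetime import datetime

# Same fields as "%FT%T%z", spelled out because %F and %T are not portable.
FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def epoch_seconds(stamp: str) -> int:
    """Parse a timestamp with a numeric UTC offset and return Unix epoch seconds."""
    # strptime returns an offset-aware datetime, so timestamp() does not
    # depend on the local time zone of the machine running this.
    return int(datetime.strptime(stamp, FORMAT).timestamp())


if __name__ == "__main__":
    for stamp in ("2020-10-15T17:30:00-0400", "2020-10-15T17:30:00+0000"):
        print(stamp, epoch_seconds(stamp))
```

The first input should come out 14400 seconds later than the second (1602797400 vs. 1602783000), which is the difference that the jq output above loses.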

Here is my absolutely horrific regex workaround with optional fractional seconds and colon offset.

def fromdateiso8601offset
  : capture
    ( "^"
    + "(?<datetime>[-:0-9T]{19})"
    + "(?<subseconds>\\.[0-9]+)?"  # optional
    + "(?<offset_sign>[-+])"
    + "(?<offset_hours>[0-9]{2})"
    + ":?"
    + "(?<offset_minutes>[0-9]{2})"
    + "$"
    )

  | .datetime += "Z"
  | .subseconds //= 0
  | .offset_sign += "1"  # string math ftw

  | (.subseconds, .offset_sign, .offset_hours, .offset_minutes) |= tonumber

  | (.datetime | fromdateiso8601)
  + ( ( .offset_hours * 3600
      + .offset_minutes * 60
      )
    * .offset_sign * -1  # the Earth rotates eastward
    )
  + .subseconds
  ;

๐Ÿ‘ +1

echo '{"time":"2021-02-16T15:36:29+0000"}{"time":"2021-02-16T15:36:29+0100"}' | 
jq '{input: .time, hour: (.time | strptime("%Y-%m-%dT%H:%M:%S%z") | strftime("%H") ), epoch: (.time | strptime("%Y-%m-%dT%H:%M:%S %z") | mktime)}'

โŒ CentOS7:

$ jq --version
jq-1.6
---
{
  "input": "2021-02-16T15:36:29 +0000",
  "hour": "15",
  "epoch": 1613489789
}
{
  "input": "2021-02-16T15:36:29 +0100",
  "hour": "15",
  "epoch": 1613489789
}

✅ macOS (Big Sur):

$ jq --version
jq-1.6
---
{
  "input": "2021-02-16T15:36:29+0000",
  "hour": "15",
  "epoch": 1613489789
}
{
  "input": "2021-02-16T15:36:29+0100",
  "hour": "14",
  "epoch": 1613486189
}

@vintnes thanks for the example above! 🏅

We found that line 23 of the function should be:

* (.offset_sign * -1)

else the offset is applied backwards.
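
To make the sign convention concrete: a positive offset means the wall clock is ahead of UTC, so the offset has to be subtracted to get the UTC instant. Below is a small sketch in Python (again only as an independent reference, not jq code) that builds the same wall-clock time at three offsets. The function name `epoch_at_offset` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def epoch_at_offset(offset_hours: int) -> int:
    """Epoch seconds for wall-clock 1970-01-01T02:00:00 at the given UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    # For an aware datetime, timestamp() is (wall clock - offset) since the epoch.
    return int(datetime(1970, 1, 1, 2, 0, 0, tzinfo=tz).timestamp())


if __name__ == "__main__":
    for hours in (-1, 0, 1):
        print(f"{hours:+03d}00 -> {epoch_at_offset(hours)}")
```

With 02:00:00 as the wall-clock time, `-0100` should give 10800, `+0000` should give 7200, and `+0100` should give 3600. Applying the offset with the opposite sign would swap the first and last values.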

Working example:

echo '{"time":"1970-01-01T02:00:00-0100"}{"time":"1970-01-01T02:00:00+0000"}{"time":"1970-01-01T02:00:00+0100"}' | 
jq -r 'def fromdateiso8601offset
  : capture
    ( "^"
    + "(?<datetime>[-:0-9T]{19})"
    + "(?<subseconds>\\.[0-9]+)?" # optional
    + "(?<offset_sign>[-+])"
    + "(?<offset_hours>[0-9]{2})"
    + ":?"
    + "(?<offset_minutes>[0-9]{2})"
    + "$"
    )

  | .datetime += "Z"
  | .subseconds //= 0
  | .offset_sign += "1" # string math ftw

  | (.subseconds, .offset_sign, .offset_hours, .offset_minutes) |= tonumber

  | (.datetime | fromdateiso8601)
  + ( ( .offset_hours * 3600
      + .offset_minutes * 60
      )
    * (.offset_sign * -1)
    )
  + .subseconds
  ; {time: .time, epoch: (.time | fromdateiso8601offset), utc: (.time | fromdateiso8601offset | strftime("%Y-%m-%dT%H:%M:%S %z"))}';
{
"time": "1970-01-01T02:00:00-0100",
"epoch": 10800,
"utc": "1970-01-01T03:00:00 +0000"
}
{
"time": "1970-01-01T02:00:00+0000",
"epoch": 7200,
"utc": "1970-01-01T02:00:00 +0000"
}
{
"time": "1970-01-01T02:00:00+0100",
"epoch": 3600,
"utc": "1970-01-01T01:00:00 +0000"
}

Thanks @gpearce for reminding me about the rotation of the Earth.

I modified the regex provided by @vintnes to bypass strptime entirely and created a fork. I'm not sure what the performance implications are of using a regex as opposed to strptime (or sscanf on systems that don't have it); I also don't know what the performance goal is for jq. That said, my gut instinct is that (a) fromdateiso8601 isn't so widely used as to be a performance bottleneck, and (b) the regex is simple enough that it can be executed quickly (there's no backtracking, for example).
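
To illustrate the idea of parsing without strptime, here is a rough Python sketch. It is not the code in the fork, just the same regex approach ported for illustration, and the names `ISO_OFFSET` and `from_iso8601_offset` are made up. It assumes the same input shape as the jq workaround: a 19-character date-time, optional fractional seconds, and a numeric offset with an optional colon.

```python
import calendar
import re

# Fixed-width fields and one optional group: no nested quantifiers, so the
# pattern leaves the regex engine nothing to backtrack over.
ISO_OFFSET = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})"
    r"(\.[0-9]+)?"                    # optional fractional seconds
    r"([-+])([0-9]{2}):?([0-9]{2})"   # offset sign, hours, optional colon, minutes
)


def from_iso8601_offset(stamp: str) -> float:
    """Return Unix epoch seconds for an ISO 8601 timestamp with a numeric offset."""
    match = ISO_OFFSET.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not an ISO 8601 timestamp with offset: {stamp!r}")
    *fields, fraction, sign, off_hours, off_minutes = match.groups()
    # Treat the date-time fields as if they were UTC, then correct by the offset.
    naive = calendar.timegm(tuple(int(field) for field in fields))
    offset = int(off_hours) * 3600 + int(off_minutes) * 60
    if sign == "-":
        offset = -offset
    # A positive offset means the wall clock is ahead of UTC, so subtract it.
    return naive - offset + float(fraction or 0)


if __name__ == "__main__":
    for stamp in ("1970-01-01T02:00:00-0100", "1970-01-01T02:00:00.5+01:00"):
        print(stamp, from_iso8601_offset(stamp))
```

Run against the inputs from the working example above, this should reproduce the same epochs (10800, 7200, 3600). It says nothing about how a C implementation inside jq would perform.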