janfri/mini_exiftool

#create_date applies local time zone to UTC timestamps

rlue opened this issue · 6 comments

rlue commented

exiftool provides timestamps with time zone data:

$ curl -O https://private.ryanlue.com/2021-03-26_223527.dng
$ exiftool 2021-03-26_223527.dng | grep Date
File Modification Date/Time     : 2021:05:04 10:36:20-07:00 # PDT
File Access Date/Time           : 2021:05:04 10:36:08-07:00 # PDT
File Inode Change Date/Time     : 2021:05:04 10:36:20-07:00 # PDT
Modify Date                     : 2021:03:26 22:35:27       # UTC
Date/Time Original              : 2021:03:26 22:35:27       # UTC
Create Date                     : 2021:03:26 22:35:27       # UTC

mini_exiftool takes UTC timestamps and erroneously applies the local time zone to them:

$ ruby -rmini_exiftool -e "puts MiniExiftool.new('2021-03-26_223527.dng').create_date"
2021-03-26 22:35:27 -0700 # should be UTC, but is PDT

This behavior appears to occur with all filetypes (tested with .jpg, .dng, and .mov files)

There is a difference between the file attributes (File Modification Date/Time, File Access Date/Time, File Inode Change Date/Time) and the tags saved in the file's EXIF metadata (Modify Date, Date/Time Original, Create Date):

  • File attributes are part of the file system and are interpreted by the operating system, so it's clear which time zone to use.

  • Tags stored in (EXIF) metadata must be interpreted by ExifTool. Timestamps in the EXIF metadata are stored as strings (see here). In most cases they don't include a time zone (as in your example). So the question is how they are stored: as local time or as UTC? Many users (including me) set their cameras to the local time zone, so the stored strings have to be interpreted as local time. That is why this is the default interpretation of such timestamps. If you want to change this, you can use the timestamps option to switch to DateTime instances, which are interpreted as UTC if no time zone offset is given:

$ ruby -Ilib -rmini_exiftool -e "puts MiniExiftool.new('2021-03-26_223527.dng', timestamps: DateTime).create_date"
2021-03-26T22:35:27+00:00
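The two interpretations can be reproduced with the Ruby standard library alone. This is only a sketch of the general idea, not mini_exiftool's actual parsing code; the variable names are made up for illustration.

```ruby
require 'time'
require 'date'

# A "naked" EXIF-style timestamp string, as in the exiftool output above.
raw = '2021:03:26 22:35:27'
exif_format = '%Y:%m:%d %H:%M:%S'

# Time.strptime falls back to the machine's local zone when the string has no offset.
as_local = Time.strptime(raw, exif_format)

# DateTime.strptime falls back to a zero offset (UTC) instead.
as_utc = DateTime.strptime(raw, exif_format)

puts as_local          # wall clock 22:35:27 in whatever zone this machine uses
puts as_utc.iso8601    # 2021-03-26T22:35:27+00:00
```

Both objects show the same wall-clock time; they differ only in which zone is assumed, so they refer to different instants unless the machine itself runs in UTC.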

I hope this helps. It's a little bit complicated. :) Feel free to ask further questions if anything is unclear or if I misunderstood something.

There is also a complication with UTC offsets not always being the same.

For example, the timezone I’m in (America/Halifax): sometimes local time is -0300 and sometimes it’s -0400. This makes interpreting “naked” times difficult. My Zoom recorder for example has no idea where it is in the world so it stores timestamps as 2021:05:04 10:36:20 and it’s up to me to know what timezone offset it is for each file.

Is that what we’re talking about here?
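A small sketch of why the offset matters for a naked time, using only the standard library. The two offsets are the ones mentioned above for America/Halifax; which one applies to a given file is exactly the information the recorder doesn't store.

```ruby
require 'time'

# Naked timestamp as the recorder stores it: no zone, no offset.
naked = '2021:05:04 10:36:20'
format_with_offset = '%Y:%m:%d %H:%M:%S %z'

# The same wall-clock reading under the two possible local offsets.
with_minus_3 = Time.strptime("#{naked} -0300", format_with_offset)
with_minus_4 = Time.strptime("#{naked} -0400", format_with_offset)

puts with_minus_3.getutc            # 13:36:20 UTC
puts with_minus_4.getutc            # 14:36:20 UTC
puts with_minus_4 - with_minus_3    # 3600.0 (seconds)
```

The string is identical in both cases, but the resulting instants are an hour apart.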

rlue commented

@janfri thanks for your very prompt response.

Photos

Timestamps in the EXIF-metadata are stored as Strings (see here). They don't include time zones in most cases (as in your example).

TIL! I looked into it further and have learned that...

Videos

But wait! exiftool isn't just for reading EXIF metadata—EXIF is for .jpg, .tiff, .png, and .wav, but exiftool can also read video files.

Video timestamps are sometimes in UTC, sometimes in a "naked" local time zone—depending on whether they end with a "Z". ffprobe reports this information, but exiftool does not 🤦‍♂️:

$ ffprobe <(curl -s https://private.ryanlue.com/R0022792.MOV) 2>&1 | grep creation_time
    creation_time   : 2021-05-03T15:48:43.000000Z
      creation_time   : 2021-05-03T15:48:43.000000Z
      creation_time   : 2021-05-03T15:48:43.000000Z
$ exiftool <(curl -s https://private.ryanlue.com/R0022792.MOV) | grep Date
Track Create Date               : 2021:05:03 15:48:43
Track Modify Date               : 2021:05:03 15:48:43
Media Create Date               : 2021:05:03 15:48:43
Media Modify Date               : 2021:05:03 15:48:43
Modify Date                     : 2021:05:03 15:48:42
Date/Time Original              : 2021:05:03 15:48:42 # missing "Z"
Create Date                     : 2021:05:03 15:48:42 # missing "Z"

This is a bug in exiftool and not mini_exiftool, but it affects mini_exiftool all the same.
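To illustrate why the missing "Z" matters: Ruby's own ISO 8601 parser treats a trailing "Z" as UTC and falls back to the local zone when it is absent. This is a sketch using the standard library only; the helper name `utc_marked?` is made up for illustration.

```ruby
require 'time'

# Hypothetical helper: does the timestamp carry an explicit UTC marker?
def utc_marked?(timestamp)
  timestamp.end_with?('Z')
end

marked = '2021-05-03T15:48:43.000000Z'  # as reported by ffprobe above
naked  = '2021-05-03T15:48:43'          # same reading with the "Z" dropped

puts utc_marked?(marked)            # true
puts utc_marked?(naked)             # false
puts Time.iso8601(marked).utc?      # true
puts Time.iso8601(naked)            # parsed in this machine's local zone
```

Once the "Z" is gone, nothing in the string distinguishes a UTC timestamp from a local one.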

FWIW, even with this information, video timestamps/zones may still be wrong just because devices don't handle time zones well: the sample .mov file above was recorded on a point-and-shoot camera (Ricoh GR). Its clock was set to local time, and the video was taken at 15:48 local time, but the camera records it as 15:48 UTC because the camera doesn't have a "local time zone" setting.

Solutions?

I would like to suggest that the current implementation is inadequate. For instance:

Many users (including me) using the local time zone on their cameras so the stored strings has to be interpreted in local time.

This works fine as long as you stay in the same place forever, but if you move to another country, now, mini_exiftool reports the wrong timestamps for all photos in your library prior to your move. (Perhaps more relatably, the same problem arises for photos taken on vacations to places in other time zones.)

There's no silver bullet, but I think:

  1. Video timestamps should be treated as UTC by default.
  2. When reporting EXIF timestamps, mini_exiftool should check for the presence of a corresponding OffsetTime* tag and apply it, if found.
  3. If no OffsetTime* tags are found, mini_exiftool should check for the presence of a GPSTimeStamp tag and compute the offset from there. Two problems with this:
    • GPSTimeStamp may not be accurate if the device had no GPS signal at the time the photo was taken
    • GPS time is not identical to UTC time. However, the difference is currently only 18 seconds, which is still a UTC offset of 0.
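Points 2 and 3 could be sketched roughly as follows. This is not mini_exiftool code: `resolve_time` is a hypothetical helper, the tags are assumed to arrive as a plain Hash of strings, and GPSTimeStamp is assumed to have whole seconds. Rounding to the nearest 15 minutes is one way to absorb the 18-second GPS/UTC difference.

```ruby
require 'time'

EXIF_FORMAT = '%Y:%m:%d %H:%M:%S'

# Hypothetical helper: returns a Time with a resolved UTC offset, or nil
# when neither an OffsetTime* tag nor GPS date/time tags are present.
def resolve_time(tags)
  naked = tags.fetch('DateTimeOriginal')

  # Point 2: an explicit OffsetTimeOriginal tag (e.g. "-07:00") wins.
  if (offset = tags['OffsetTimeOriginal'])
    return Time.strptime("#{naked} #{offset}", "#{EXIF_FORMAT} %z")
  end

  # Point 3: derive the offset from the GPS clock, which is (nearly) UTC.
  if tags['GPSDateStamp'] && tags['GPSTimeStamp']
    gps_utc = Time.strptime("#{tags['GPSDateStamp']} #{tags['GPSTimeStamp']} +0000", "#{EXIF_FORMAT} %z")
    wall    = Time.strptime("#{naked} +0000", "#{EXIF_FORMAT} %z")
    # Round to the nearest 15 minutes to absorb small clock differences.
    offset_seconds = ((wall - gps_utc) / 900.0).round * 900
    return (wall - offset_seconds).localtime(offset_seconds)
  end

  nil
end

# Made-up tag values for illustration.
puts resolve_time('DateTimeOriginal' => '2021:03:26 15:35:27',
                  'GPSDateStamp' => '2021:03:26',
                  'GPSTimeStamp' => '22:35:09')
```

If the camera clock itself is wrong by more than a few minutes, the derived offset will be wrong too, which is the first problem noted above.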

I'm willing to work on a PR for this if you think it's a good idea.

I would like to suggest that the current implementation is inadequate.

Correct. No objections. 😃

There's no silver bullet, but I think:

1. Video timestamps should be treated as UTC by default.

2. When reporting EXIF timestamps, mini_exiftool should check for the presence of a corresponding `OffsetTime*` tag and apply it, if found.

3. If no `OffsetTime*` tags are found, mini_exiftool should check for the presence of a `GPSTimeStamp` tag and compute the offset from there. Two problems with this:
   
   * `GPSTimeStamp` may not be accurate if the device had no GPS signal at the time the photo was taken
   * [GPS time is not identical to UTC time](https://en.wikipedia.org/wiki/Global_Positioning_System#Timekeeping). However, the difference is currently only 18 seconds, which is still a UTC offset of 0.

Yes, there is no silver bullet.

But I don't want to add more "magic" to mini_exiftool when interpreting timestamp strings. It would confuse users of the library, and I would get more issues about timestamp handling.

My idea is to do two things:

  1. Implementing the possibility to pass the time zone as an option to MiniExiftool.new. This is straightforward and totally clear for any user. Yes, it has a problem with daylight saving time when the user doesn't know the dates of the timestamps, or which zone is correct for a given date. Furthermore, one file can contain different dates in different zones (for example, a vacation photo has a Creation Date in one time zone and a Modify Date in another if the file is changed later at home in a different time zone).
  2. Implementing a way for users to adapt the conversion of values to their own needs, for example your points 1 to 3, but explicitly in the user's code, so it's transparent to the user. I have already implemented something like that in multi_exiftool, my other library for ExifTool. I am thinking of something similar to this.
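As a rough idea of what an explicit, user-side conversion could look like (this is not the multi_exiftool API, just a plain-Ruby sketch with made-up names and values):

```ruby
require 'time'

# Made-up raw tag values, as strings straight from exiftool.
raw_values = {
  'CreateDate' => '2021:03:26 22:35:27',
  'Model'      => 'Example Camera'
}

# The user's own rule, stated explicitly: treat every *Date tag as UTC
# and leave all other values untouched.
converter = lambda do |tag, value|
  next value unless tag.end_with?('Date')
  Time.strptime("#{value} +0000", '%Y:%m:%d %H:%M:%S %z').utc
end

converted = raw_values.map { |tag, value| [tag, converter.call(tag, value)] }.to_h

puts converted['CreateDate']   # 2021-03-26 22:35:27 UTC
puts converted['Model']        # Example Camera
```

The rule lives in the caller's code, so nothing about the zone handling is hidden inside the library.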

What do you think?

Please give me time to think a little bit more about the topic. Maybe there are further ideas.

Just some additional thoughts as a user:

I also do not like ‘magic’ in libraries. I’ve chased down many bugs that ended up being caused by some hidden interpretation.

require 'active_support/core_ext/time' allows any script to support named time zones where the offset is calculated automatically: Time.at(unixtime).in_time_zone('America/Halifax'), for example.

rlue commented

These are all good points. Looking over this discussion with fresh eyes, I think I got over-enthusiastic and took this issue beyond its original scope.

I take back what I said about the current implementation being inadequate 😅. Too much magic is definitely an invitation for confusion and trouble. Trying to determine correct time zones on EXIF metadata is not a reasonable goal; in many cases, the data just isn't there. It could be good to give the user a way to set their own custom time zone (or write a custom #convert method to transform it, as in multi_exiftool), but now that I think about it, that's not even what this issue was originally about.

This issue is about the fact that exiftool can read both EXIF and generic metadata, and these two types of metadata have different conventions for timestamps. In my limited observation and research, generic metadata (for various filetypes like .mov, .mp4, .m4a, etc.) usually stores timestamps in UTC. But here's the big problem—the exiftool CLI doesn't provide this information. This is a limitation of exiftool, not the mini_exiftool Ruby gem.

For me, the most straightforward solution to this problem is just to print a warning to stderr for audio/video files—something like this:

##{timestamp_attr}: Time zone may be incorrect. Audio/video timestamps are typically in UTC, but exiftool does not output this data.
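A minimal sketch of that warning, assuming the file type is guessed from the extension. `AV_EXTENSIONS` and `timezone_warning` are hypothetical names and not part of mini_exiftool; the extension list only covers the types mentioned above.

```ruby
# Hypothetical list of audio/video extensions to warn about.
AV_EXTENSIONS = %w[.mov .mp4 .m4a].freeze

# Returns the warning text for audio/video files, or nil for anything else.
def timezone_warning(path, timestamp_attr)
  return nil unless AV_EXTENSIONS.include?(File.extname(path).downcase)

  "##{timestamp_attr}: Time zone may be incorrect. " \
    'Audio/video timestamps are typically in UTC, but exiftool does not output this data.'
end

message = timezone_warning('R0022792.MOV', 'create_date')
warn message if message   # Kernel#warn writes to stderr
```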

That being said, I'd also be okay with closing this issue without further action. Now that I understand the problem better, I think it might not be worth anyone's time to try to work around the limitations of 1) exiftool and 2) everyone else's inconsistent timestamping patterns.