Correct way to specify last modified date?
Hello,
I'm using zip-archive version 0.4.3.
I have this pseudocode:
import "zip-archive" Codec.Archive.Zip qualified as ZArch
import Data.Time.Clock.System (getSystemTime, systemSeconds)
systime <- liftBase getSystemTime
let lastModified = fromIntegral $ systemSeconds systime
let e = ZArch.toEntry fpath lastModified bscContents
ZArch.fromArchive (ZArch.addEntryToArchive e ZArch.emptyArchive)
So basically this should create a zip file with an entry whose last modified time is the current local time. However, from what I see in my archive viewer, the last modified time shows the UTC value instead.
To be more precise:
st <- getSystemTime
systemToUTCTime st
currently returns 2024-04-04 10:05:34 UTC. The local time is 12:05:34, so the st value is correct. However, the zip file shows 10:05:34 as LOCAL time.
I see that the C library differentiates between zip_file_set_mtime and zip_file_set_dostime (https://libzip.org/documentation/zip_file_set_mtime.html):
Following historical practice, the zip_file_set_mtime() function translates the time from the zip archive into the local time zone. If you want to avoid this, use the zip_file_set_dostime() function instead.
Since zip-archive is a native Haskell library, I'm guessing that there are 2 different fields for this? Maybe we set the wrong one?
Here's what the Haddocks say:
, eLastModified :: !Integer -- ^ Modification time (seconds since unix epoch)
We just have this one field. toEntry also takes "seconds since unix epoch" as a parameter.
readEntry uses
modEpochTime <- (floor . utcTimeToPOSIXSeconds) <$> getModificationTime path
Can you see a problem here?
PS: getModificationTime returns a UTC time, according to its documentation.
I guess I'd assumed that "seconds since the unix epoch" means "relative to UTC time." Could it be that it is interpreted relative to local time?
I might have just used UTC because this library exports pure functions (no IO) -- hence no way to get the locale's time zone. We could, however, make it a parameter on one of the exported functions.
https://en.wikipedia.org/wiki/Unix_time
Unix time is currently defined as the number of non-leap seconds which have passed since 00:00:00 UTC on Thursday, 1 January 1970, which is referred to as the Unix epoch.
https://www.epochconverter.com/clock seems to agree with that definition and shows exactly the time returned by getSystemTime.
If we stick with this definition, then a number such as 1712319469 should be interpreted as seconds since 1970-01-01 00:00:00 UTC. I guess that putting the seconds directly into eLastModified, as done above, would be correct according to the "unix epoch" definition. Also, if eLastModified weren't relative to UTC, we couldn't reliably send the zip file to another time zone, because eLastModified doesn't encode the local timezone.
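To make the two readings of that number concrete, here is a minimal sketch using only the time package. The names epochToUTC, epochToLocal and exampleZone are made up for illustration and are not part of zip-archive; the UTC+2 zone is an assumption matching the offset described in the report.

```haskell
import Data.Time.Clock (UTCTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)
import Data.Time.LocalTime (LocalTime, TimeZone, hoursToTimeZone, utcToLocalTime)

-- | Interpret epoch seconds under the Unix definition (relative to UTC).
epochToUTC :: Integer -> UTCTime
epochToUTC = posixSecondsToUTCTime . fromInteger

-- | The wall-clock time that the same instant has in a given zone.
epochToLocal :: TimeZone -> Integer -> LocalTime
epochToLocal tz = utcToLocalTime tz . epochToUTC

-- | Hypothetical zone for illustration only: UTC+2.
exampleZone :: TimeZone
exampleZone = hoursToTimeZone 2
```

With these definitions, epochToUTC 1712319469 is 2024-04-05 12:17:49 UTC, while epochToLocal exampleZone 1712319469 is 14:17:49 on the same day. A viewer that displays the first value as if it were wall-clock time would appear to be two hours behind.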
So at least in theory it all seems to be done correctly. But I don't understand why ark or other programs show this as 2 hours behind the current time...
https://www.ghisler.ch/board/viewtopic.php?t=80315
ZIP files store file times as local time, while the NTFS file system stores them as UTC (universal time). So when switching from/to daylight saving time, either the first or the second will change by one hour.
ZIP stores the time as local time only (no timezone).
If you take a photo of sunrise at 6.00 UTC+6 in the morning, pack it as zip and you unpack it at a city at timezone UTC-6 you will still see 6.00 in the morning.
You see:
- The sunrise photo is taken in the morning of the origin city.
- You have to calculate your own, what time it was at your own city.
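The reason no time zone survives is visible in the classic ZIP header layout, which holds an MS-DOS date and time: two 16-bit words with calendar fields only. The sketch below packs a wall-clock time into that pair. toDosDateTime and example are illustrative names, and this is not a claim about how zip-archive performs the conversion internally.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import Data.Time.Calendar (fromGregorian, toGregorian)
import Data.Time.LocalTime (LocalTime (..), TimeOfDay (..))
import Data.Word (Word16)

-- | Pack a wall-clock time into the MS-DOS (date, time) pair.
--   date: bits 15-9 year since 1980, bits 8-5 month, bits 4-0 day
--   time: bits 15-11 hour, bits 10-5 minute, bits 4-0 seconds divided by 2
-- There is no field for a UTC offset, and seconds have 2-second resolution.
toDosDateTime :: LocalTime -> (Word16, Word16)
toDosDateTime (LocalTime day (TimeOfDay h m s)) = (dosDate, dosTime)
  where
    (y, mo, d) = toGregorian day
    dosDate = (fromIntegral (y - 1980) `shiftL` 9) .|. (fromIntegral mo `shiftL` 5) .|. fromIntegral d
    dosTime = (fromIntegral h `shiftL` 11) .|. (fromIntegral m `shiftL` 5) .|. (floor s `shiftR` 1)

-- | The local time from the report above: 2024-04-04 12:05:34.
example :: LocalTime
example = LocalTime (fromGregorian 2024 4 4) (TimeOfDay 12 5 34)
```

Whatever hour is packed into the time word is what a reader will display, which is why the writer has to decide up front whether to store UTC or local wall-clock fields.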
So, apparently eLastModified accepts unix epoch time adjusted for the local time zone?
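If that reading is right, the adjustment is a plain offset. A minimal pure sketch, assuming the caller already knows the zone; shiftToLocal and cest are hypothetical names, not zip-archive functions:

```haskell
import Data.Time.LocalTime (TimeZone, minutesToTimeZone, timeZoneMinutes)

-- | Shift true epoch seconds by the zone's UTC offset, so that a reader
-- which renders the number as if it were UTC shows local wall-clock time.
shiftToLocal :: TimeZone -> Integer -> Integer
shiftToLocal tz secs = secs + fromIntegral (timeZoneMinutes tz) * 60

-- | Hypothetical zone for illustration: UTC+2, as in the report above.
cest :: TimeZone
cest = minutesToTimeZone 120
```

For the timestamp in the report, 1712225134 (10:05:34 UTC) becomes 1712232334, which renders as 12:05:34. Note that the shifted number is no longer a true Unix timestamp; it is only meaningful as input to something that expects local wall-clock seconds.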
Anyway, the fact that zipping with zip-archive is a pure function is the reason I used this library in the first place :)
Actually, the line I quoted above comes from readEntry, which is in IO.
So here we could get the locale's time zone.
OK, I think I've fixed this. Can you test as well?
The fix only affects readEntry, which is the only part of this that actually looks at the modification time of a file.
If it looks okay, we can do a new release.
Yes, this might be a solution. However, I still use the pure toEntry because I don't read files but generate them programmatically. But adding getTimeZone * 60 (the zone offset in minutes, converted to seconds) seems to be the way to solve this.
If you use the pure toEntry, then you are specifying the modification time yourself. So you just need to be aware that it expects unix epoch seconds already shifted by the local time zone offset. This is not a problem with the library itself, unless I'm missing something.
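For callers of the pure toEntry, one way to produce such a value is to do the time zone lookup in IO first and pass the result in. The sketch below uses getTimeZone from the time package, which returns the offset in effect at the given instant, so daylight saving time is taken from the timestamp rather than from "now". localEpochSeconds and currentLocalEpochSeconds are illustrative names and are not part of zip-archive.

```haskell
import Data.Time.Clock (UTCTime, getCurrentTime)
import Data.Time.Clock.POSIX (utcTimeToPOSIXSeconds)
import Data.Time.LocalTime (getTimeZone, timeZoneMinutes)

-- | Epoch seconds for the given instant, shifted by the local UTC offset
-- that applies at that instant.
localEpochSeconds :: UTCTime -> IO Integer
localEpochSeconds t = do
  tz <- getTimeZone t
  let utcSecs = floor (utcTimeToPOSIXSeconds t) :: Integer
  pure (utcSecs + fromIntegral (timeZoneMinutes tz) * 60)

-- | Convenience wrapper for the current time.
currentLocalEpochSeconds :: IO Integer
currentLocalEpochSeconds = getCurrentTime >>= localEpochSeconds
```

The resulting Integer could then be supplied as the second argument of toEntry, in place of the raw systemSeconds value from the original snippet, while the archive construction itself stays pure.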