chronotope/chrono

The call to `localtime_r` may be unsound

quininer opened this issue Β· 42 comments

I found that getenv and setenv in libc are not thread-safe [1], and most impl of localtime_r in libc directly call getenv [2]. This means that localtime_r may have data race with setenv.

In order to ensure soundness of setenv, libstd add a lock to it [1], but this means that using getenv without libstd will be unsound.

This problem is not easy to reproduce on glibc, because glibc's localtime_r caches timezone. but using musl can easily reproduce it.

  1. rust-lang/rust#24741
  2. https://github.com/aosp-mirror/platform_bionic/blob/master/libc/tzcode/localtime.c#L1321 and https://git.musl-libc.org/cgit/musl/tree/src/time/__tz.c#n127

POC: https://gist.github.com/quininer/2063c31b0bc1753989122e782b182bea

Rust libstd has a similar, long open issue, for its own calls to getaddrinfo: rust-lang/rust#27970

Thank you for this!

From reading the associated issues, it seems like touching the environment is the thing that glibc actually considers to be non-thread-safe, is that right? In particular rust-lang/rust#27970 (comment) in particular, it seems like libc wants modifying the environment to be illegal in a multithreaded environment:

functions that modify the environment are all marked with const:env and regarded as unsafe. Being unsafe, the latter are not to be called when multiple threads are running or asynchronous signals are enabled, and so the environment can be considered effectively constant in these contexts, which makes the former safe.

I agree that the solution in the rust stdlib getaddrinfo issue (being able to take the env lock) would help.

I'm trying to think of a good solution here, and it's hard because none of the underlying libraries (libc or stdlib) provide us with the necessary tools. In particular, even if the stdlib provided us with access to the env lock, any other code that interacts with libc directly could inadvertently call setenv and leave us in the same situation. That would qualify as going outside of the safety properties rust is supposed to provide, though.

For anyone landing here from CVE-2020-26235 that states:

In Rust time crate from version 0.2.7 and before version 0.2.23, unix-like operating systems may segfault due to dereferencing a dangling pointer in specific circumstances. [...] The issue was introduced in version 0.2.7 and fixed in version 0.2.23.

... the CVE text is misleading and the issue affects time 0.1 as used by chrono too. See time-rs/time#293 (comment) for details.

Edit: But also, note that the time 0.1 dependency doesn't actually matter because the call of locatime_r happens directly by chrono, not through time. See #602 (comment)

(This is a copy of the Reddit comment I've posted for the record.)

Disclaimer: I have been the initial developer and maintainer of Chrono. I have left the project since then for personal reasons, but I do have a good understanding of Chrono's design and known issues.

It took quite a bit to figure out what was going on, and if my understanding is correct this vulnerability is unfixable under the following constraints:

  1. Chrono has to support the system-local time zone (Local::now() etc).
  2. That system-local time zone has to match what C localtime* etc. will return.

It can look obvious that the solution is to bump the time-rs dependency to 0.3, but the very reason that time-rs 0.1 is required in the first place was to determine the local time zone. Since I've left the local time zone calculation has apparently moved to the chrono::sys internal module, and it has to call localtime_r in order to satisfy 2. Time-rs 0.3 in comparison simply disabled local time zones (i.e. ditch 1) in lieu of the vulnerability, which would be a very huge breaking (and almost unacceptable) change for Chrono.

Back in time I was still in change of Chrono I considered to weaken 2 for the entirely different reason: performance. Someone has already noticed that glibc does cache the time zone and so is less affected by this issue, but glibc still calls stat(2) every time localtime_r gets called in order to track /etc/localtime! Honestly speaking the entire design of POSIX time zone is stupid, /etc/localtime should be tracked with more efficient way (say, inotify) and the userland should be able to subscribe to those changes. And that time zone change event can be easily made to be thread-safe. But we have to support POSIX and its problematic design anyway, so in order to truly fix Chrono it would be necessary to consider whether 2 can be weakened or not first.

For everyone landing here from the chrono security advisory RUSTSEC-2020-0159, note that the issue with chrono is not because of its dependency on time 0.1. It's because chrono itself calls localtime_r in a way that can have the issues described in the advisory. While time 0.1 also has an advisory (RUSTSEC-2020-0071) for calling localtime_r, and while chrono does default to enabling a dependency on time 0.1, chrono never calls the function in time 0.1 that would then call localtime_r. See #602 (comment)

That means that hoping for chrono to update to time 0.3 won't fix the issue. It also means that disabling the oldtime feature of chrono will not fix the issue.

To give some idea about how vulnerable your chrono-using program is, note that the issue happens specifically when two things happen simultaneously:

  1. Something in your program changes the process environment. This could be via std::env::set_var, or directly with libc, or something that isn't even a Rust thread.

  2. Something in your program program calls chrono::Local::now()

If your program does not have both those things happen simultaneously, it will not trigger the issue. (You will of course need to determine this by looking at not just your program's code but also all your deps' code.) Notably, chrono::Local::now() is the only code path in which chrono calls localtime_r. The rest of chrono's API does not call it.

Zooce commented

I'm just thinking out loud here - maybe it will trigger a thought in someone who knows more about this crate -- Since it sounds like a real fix is non-trivial and could take quite some time to address, would grouping the different offsets into features help to mitigate this in the meantime?

I'm only using UTC in my project, so the issue doesn't affect me, but we do use cargo-audit and we get flagged simply for using chrono at all. Not sure cargo-audit takes into account the features actually used in a Cargo.toml, so this may be useless in that case as well, but it's just thought.

It seems like it would have to be a separate crate since Cargo.lock doesn't contain activated-feature information and that's what the analysis is based on. I would use such a crate though, I also don't care the system's local time in all the projects I'm using chrono in.

Zooce commented

Okay, I see. So it might work if chrono was broken up into smaller crates like chrono-format, chrono-utc, chrono-local, etc. Could be interesting, but probably unlikely to be done.

There is also no point to separating getting the localtime out to a separate crate. Either you accept that calling localtime requires environment variables to be static (or for your program to be single-threaded) or you don't ever get the local time or figure out how far your computer is from GMT.

We could add more docs around this, but fundamentally AFAIK this is how getting the local time works and has worked in basically every mainstream programming language for ... decades? I mean, Java uses localtime_r for this purpose. IMO this rustsec advisory is misleading, and for many people it's unactionable.

It is technically broken in multithreaded environments that change env vars after spawning multiple threads (which IIRC is undefined for libc in general anyway) and it would be nice to fix, but (AFAIK) fixing it would rely on fixing or reimplementing libc.

Zooce commented

Yep - I get it for sure. I was only putting it out there as a possible way to mitigate such issues, especially for users who only need a very small subset of the crate's functionality. I do appreciate all of the discussion around this -- it has been very enlightening and helpful.

@quodlibetor I'm not here to argue for splitting it out to a separate crate (I'm not doing that for time), but saying "this is how it's always been done" isn't really an argument against a security advisory.

Yeah, sorry the fact that splitting the separate crate wouldn't really help is mostly unrelated.

IMO the security advisory would be more accurate if it was for the standard library which attempts to allow setting env vars after fork and is the thing that is actually undefined wrt libc. Accessing localtime is not undefined except in that context. Completely eliminating the functionality around local times and offsets is also unacceptible for Chrono, so there's nothing really that can be done here. Unless I'm missing something.

There is also no point to separating getting the localtime out to a separate crate. Either you accept that calling localtime requires environment variables to be static (or for your program to be single-threaded) or you don't ever get the local time or figure out how far your computer is from GMT.

I'm not saying the crate should be split up either. But it's worth noting that it has the advantage that a program that only deals with DateTime<Utc> can be statically (as in via its Cargo.lock) sure that it doesn't have any call path to Local::now(). Tooling that only looks at the lockfile, like cargo-audit, would benefit in that case. Software that only deals solely with DateTime<Utc> is quite common.

We could add more docs around this

Yes. Documenting the chrono function as "might call into libc API that is not safe to call if the environment is being changed simultaneously" is probably the best bet, just like std::env::set_var does.

IMO this rustsec advisory is misleading, and for many people it's unactionable.

The advisory is unactionable in that there is no version of chrono that a user can update to. But it is actionable in that a user can check their code to be sure that the conditions for the issue (environ changing at the same time as Local::now() being called) do not occur.

I agree that it's misleading. It's vague, doesn't mention the conditions for the issue to occur, and links to an issue in the time crate's repo. Even now people are being confused and thinking the problem can be resolved by disabling the oldtime feature or updating the time dep to v0.3 ...

It is technically broken in multithreaded environments that change env vars after spawning multiple threads (which IIRC is undefined for libc in general anyway) and it would be nice to fix, but (AFAIK) fixing it would rely on fixing or reimplementing libc.

IMO the security advisory would be more accurate if it was for the standard library which attempts to allow setting env vars after fork and is the thing that is actually undefined wrt libc.

Multi-threaded C programs calling setenv is never going to change. It's up to the programmer to ensure that those calls do not race with anything that would be affected. There's already Rust code that both relies on changing the environ in a multi-threaded program (unit tests, C library interop, etc) and already makes sure to avoid this problem, so penalizing it by disabling std::env::set_var in the presence of threads won't really help. (The rust-lang/rust issue about getaddrinfo already discussed this idea.)

Completely eliminating the functionality around local times and offsets is also unacceptible for Chrono, so there's nothing really that can be done here. Unless I'm missing something.

I agree. For someone that does want to use DateTime<Local>, they should not be prevented from calling Local::now() if they've taken the precautions to ensure it's okay regardless of how many threads their program has and how many times it's changing its own environ.

Isn't exactly what unsafe is for and the solution to the concern here ?

  • By flagging (and documenting) the related APIs unsafe, the safety responsibility is transferred to the caller.
  • chrono users can ensure their code is safe by "denying unsafe" or not use such APIs.
  • chrono can be flagged as fixed in regard to the CVE.

I personally do want simple UTC offset APIs and am happy that chrono does it in a portable way for me. I am ready to use unsafe code because I have a full understanding of the potential issues in this case.

I agree that it's misleading. It's vague, doesn't mention the conditions for the issue to occur, and links to an issue in the time crate's repo. Even now people are being confused and thinking the problem can be resolved by disabling the oldtime feature or updating the time dep to v0.3 ...

The advisory was filed in haste. If you would like to see it improved, the best way to do that is submitting PRs to the advisory.

https://github.com/rustsec/advisory-db/blob/main/crates/chrono/RUSTSEC-2020-0159.md

Isn't exactly what unsafe is for and the solution to the concern here ?

I do not see a way that localtime_r can be invoked soundly, because that would involve acquiring an OS-specific environment lock in order to be synchronized with get_env/set_env calls coming from other threads:

This lock is not part of std's public API, so I don't see how it's possible for FFI code to synchronize with it correctly.

The safest solution here would be to reimplement localtime_r in Rust, allowing it to go through std's synchronization.

Just to be clear, this only affects *nix-like systems, right? Windows is entirely unaffected by this issue, and so Windows specific software can ignore this CVE?

this only affects *nix-like systems, right? Windows is entirely unaffected by this issue

This depends on the implementation details of the libc you happen to be running with, afaik. For example, the initial bug report stated that it was "difficult to reproduce" on glibc (it's unclear to me if that means "unreproducible" or "rare") but trivial on musl. I don't know if anyone has tested on windows with whatever libc Microsoft ships or the various cross-compatible libc's that are available for it.

This issue is only possible to reproduce in a multi-threaded program that simultaneously (as in, from separate threads) edits the environment and tries to get the current localtime with a libc that is not thread safe.

I do not see a way that localtime_r can be invoked soundly, because that would involve acquiring an OS-specific environment lock in order to be synchronized with get_env/set_env calls coming from other threads:

I agree, and only top level crate can warranty that there are no concurrent get_env/set_env by:

  1. Ensure the top level crate code does not do it.
  2. Ensure none of the dependencies do it, or trust dependencies developers to correctly propagate unsafe if they make use of unsafe chrono APIs.

This is, in my understanding, what unsafe is for and I believe the only solution for this issue is either:

  1. Flag the affected APIs unsafe.
  2. Remove the affected APIs.

Both are breaking change unfortunately.

The solution chosen by time crate, returning an error on unix platform for local time related APIs is not satisfying to me. I understand that library developers are reluctant to introduce unsafe APIs but sometimes you have no choice, and even the std does it.

Am i missing something ? Any blocker to go in that direction (introduce unsafe APIs) ? What alternative do we have ?

CryZe commented

What alternative do we have ?

Porting the localtime_r code to Rust and leaving the env var check out.

I understand that library developers are reluctant to introduce unsafe APIs but sometimes you have no choice, and even the std does it.

Am i missing something ?

When the unsafe keyword is used correctly in Rust programs, it's with the intent that there is some sound usage/abstraction possible so long as certain safety invariants are maintained by the programmer.

The only way to make them sound is to acquire std::sys::unix::os::ENV_LOCK in cfg(unix) environments before making FFI calls into any API that might call getenv/setenv. Unfortunately, ENV_LOCK is not part of the public API, so this isn't possible.

Those APIs will be unsound and cause memory corruption, because it is unsound to concurrently access those APIs from multiple threads.

This really leaves only two solutions:

  1. Request access to std::sys::unix::os::ENV_LOCK from e.g. the libs team, hoping some way of synchronizing access is added to future versions of std
  2. Rewrite localtime_r and other functions that vicariously call getenv/setenv in pure Rust, using the high-level synchronized std::env APIs to query and modify the environment

Request access to std::sys::unix::os::ENV_LOCK from e.g. the libs team, hoping some way of synchronizing access is added to future versions of std

How public ENV_LOCK can help? Crate can call libc::set|get env directly. Also C/C++ library can be linked with Rust crates into final executable, and ENV_LOCK will not be used by C/C++ libraries.

How public ENV_LOCK can help? Crate can call libc::set|get env directly.

The problem is crates are already making FFI calls to these APIs directly, or APIs that invoke them vicariously, without holding ENV_LOCK.

If they were holding ENV_LOCK, the access would be synchronized.

Edit: I opened a rust-internals thread about this: https://internals.rust-lang.org/t/synchronized-ffi-access-to-posix-environment-variable-functions/15475

When the unsafe keyword is used correctly in Rust programs, it's with the intent that there is some sound usage/abstraction possible so long as certain safety invariants are maintained by the programmer.

The only way to make them sound is to acquire std::sys::unix::os::ENV_LOCK in cfg(unix) environments before making FFI calls into any API that might call getenv/setenv. Unfortunately, ENV_LOCK is not part of the public API, so this isn't possible.

I believe there are sound ways to do it outside of your proposal: at the beginning of a program execution when threads have not been spawned yet, or using some thread syncing mechanisms (Channels, Barriers ...). This is of course up to the developer to ensure soundness, this require discipline, but this is possible. That is why I believe having unsafe APIs would help unblock the current situation until we have a RUST version of localtime_r.

You can argue that e.g. set_var should've been marked unsafe in 20/20 hindsight, but the ship already sailed on that in Rust 1.0.0. Making it unsafe now would be a breaking change.

Perhaps it would be best to deprecate set_var in its current form, at the very least.

You can argue that e.g. set_var should've been marked unsafe in 20/20 hindsight, but the ship already sailed on that in Rust 1.0.0. Making it unsafe now would be a breaking change.

Perhaps it would be best to deprecate set_var in its current form, at the very least.

I am just suggesting to mark Chrono API's making direct or indirect use of libc localtime_r unsafe. Local::now() and others.
@quodlibetor : what is your opinion about that ?

It does seem like the only option available to Chrono is to mark everything that goes through localtime_r as unsafe, with documentation about the only way to ensure safety being whole-program analysis to ensure that env vars are not set after threads are spawned.

I think this is a vulnerability in Rust rather than vulnerability in Chrono (see rust-lang/rust#90308).

It's... complicated. That issue is a good summary of some options, but the larger issue is calls to getenv(3) are unsound in the presence of any calls to setenv(3) within a multithreaded program.

std attempted to solve this by placing an RwLock over getenv(3)/setenv(3) when accessed via std:env::var and std::env::set_var respectively, however this lock is not exposed in a manner where a library like chrono could acquire it when making calls to localtime_r.

I found that getenv and setenv in libc are not thread-safe [1], and most impl of localtime_r in libc directly call getenv [2]. This means that localtime_r may have data race with setenv.

I strongly recommend reading the discussion in rust-lang/rust#27970. We seem to have converged on a consensus that reading the environment is safe because writing to the environment is unsafe and pretty much always wrong to do. IMO this was/is a non-issue and the real problem is in anything that calls env::set_var, and that env::set_var isn't marked unsafe. Again, that's explained in the 27970 issue.

I found that getenv and setenv in libc are not thread-safe [1], and most impl of localtime_r in libc directly call getenv [2]. This means that localtime_r may have data race with setenv.

I strongly recommend reading the discussion in rust-lang/rust#27970. We seem to have converged on a consensus that reading the environment is safe because writing to the environment is unsafe and pretty much always wrong to do. IMO this was/is a non-issue and the real problem is in anything that calls env::set_var, and that env::set_var isn't marked unsafe. Again, that's explained in the 27970 issue.

And why it segfaults then?
It shouldn't matter if there is something set or not set in the env in right order or time, there shouldn't be a SEGV!

Semi-related: on certain OSes, chrono calls set_var:

chrono/src/sys/unix.rs

Lines 55 to 78 in 3467172

#[cfg(any(target_os = "nacl", target_os = "solaris", target_os = "illumos"))]
unsafe fn timegm(tm: *mut libc::tm) -> time_t {
use std::env::{remove_var, set_var, var_os};
extern "C" {
fn tzset();
}
let ret;
let current_tz = var_os("TZ");
set_var("TZ", "UTC");
tzset();
ret = libc::mktime(tm);
if let Some(tz) = current_tz {
set_var("TZ", tz);
} else {
remove_var("TZ");
}
tzset();
ret
}

That could probably use its own separate tracking issue. That usage pattern is not thread safe or memory safe.

Semi-related: on certain OSes, chrono calls set_var:

chrono/src/sys/unix.rs

Lines 55 to 78 in 3467172

#[cfg(any(target_os = "nacl", target_os = "solaris", target_os = "illumos"))]
unsafe fn timegm(tm: *mut libc::tm) -> time_t {
use std::env::{remove_var, set_var, var_os};
extern "C" {
fn tzset();
}
let ret;
let current_tz = var_os("TZ");
set_var("TZ", "UTC");
tzset();
ret = libc::mktime(tm);
if let Some(tz) = current_tz {
set_var("TZ", tz);
} else {
remove_var("TZ");
}
tzset();
ret
}

That could probably use its own separate tracking issue. That usage pattern is not thread safe or memory safe.

Yeah, this races with itself even. The same code is in time 0.1 crate too.

I came up with a potential solution - just forked the system library, patched to use Rust for getting env vars and wrote a safe bridge. This was less work than rewriting it in Rust.

I'd appreciate people looking at it and figuring out if it's actually sound and a reasonable solution. I'm willing to hand over the library to the chrono developers or other interested reputable maintainers.

Disclaimer: I don't have much time to work on it further. I just wanted to see if this could actually work.

@Kixunil FWIW there has been quite a significant bit of discussion about various potential solutions on this thread:

https://internals.rust-lang.org/t/synchronized-ffi-access-to-posix-environment-variable-functions/15475

You might want to post your solution there for visibility, although I believe it might have already been discussed/attempted.

LOL, yes, I already did. :D

I just filed #648, about the use of env::set_var, because I think that issue much more clearly needs to be addressed than this one.

For this issue (#499), we have to make a decision: Is it worth trying to make things "more often memory safe" in the case where the environment was being changed by Rust code, even though the general issue isn't fully addressed? Or are we going to say that changing the environment (in another thread) is just plain wrong and thus we're not going to do anything to make it work more often?

I would be fine with doing something like @Kixunil suggested, and I would also be OK with just closing this issue (#499) as "not a bug, working as intended, the bug is elsewhere."

The ideal fix to this particular issue would be a pure Rust reimplementation of localtime_r, and ideally one which can be shared with (or provided by) the time crate.

That would ensure any environment mutation which occurs elsewhere via env::set_var is synchronized.

That said, it seems there is rough consensus that env::set_var should be deprecated in its current form and replaced with an unsafe alternative, which if it were to happen should hopefully encourage people not to use env::set_var capriciously. If that were to happen, it would help obviate the concern here.

The ideal fix to this particular issue would be a pure Rust reimplementation of localtime_r

Yes and I think the next best is a reimplementation that happens to have some bits in C. It behaves the same and the only difference is we don't have a type proof of soundness.

I'm not convinced set_var should be unsafe/deprecated. If all software was written in Rust it'd be non-issue which indicates it's not a Rust design problem but a C problem. Or unsafe code problem if you look at it differently. All the rules around unsafe are meant to identify which piece of code is responsible for a memory problem. Safe code can never be responsible, thus the problem is with the code behind unsafe - one deciding to call into an unsound C library.

Deprecating set_var undermines the whole idea of safe-unsafe contracts. Or maybe the fact we have a discussion about it at all undermines it already.

If all software was written in Rust it'd be non-issue which indicates it's not a Rust design problem but a C problem.

While we're positing hypotheticals, I think that if all software were written in Rust, I don't think we'd have had set_var in the first place. So I don't think that holds much water as to the state of set_var's lack of an unsafe label.

one deciding to call into an unsound C library.

And if the environment becomes Rust-only, Rust loses some of its power as a systems language because it cannot influence any process which is launched from non-Rust code anymore. Bad as the environment (and signals for that matter) are in POSIX land, they're a fact of life that one just has to live with today.

Here’s a pure Rust reimplementation of localtime: https://github.com/x-hgg-x/tz-rs

djc commented

I submitted #677 to address this issue. The next release will be planned in #674.