janestreet/core

Segmentation fault in Time.format

copy opened this issue · 18 comments

copy commented

The following segfaults:

open Core
let _ = Time.format (Time.now ()) "%F %T %Z" ~zone:Time.Zone.utc
% corebuild test.native && ./test.native
Finished, 4 targets (4 cached) in 00:00:00.
[1]    19457 segmentation fault (core dumped)  ./test.native

Here's a backtrace:

(gdb) bt
#0  0x00007fbf2f437196 in __strftime_internal () from /usr/lib/libc.so.6
#1  0x00007fbf2f4389e6 in strftime_l () from /usr/lib/libc.so.6
#2  0x0000558cd7f37fdf in core_kernel_time_ns_format_tm (tm=tm@entry=0x7fff13240bd0, v_fmt=94063409845200) at time_ns_stubs.c:66
#3  0x0000558cd7f36d9d in core_time_ns_strftime (v_tm=<optimized out>, v_fmt=<optimized out>) at unix_time_stubs.c:83
#4  0x0000558cd7c9f774 in camlCore__Core_time__format_2861 () at src/core_unix.ml:1825
#5  0x0000558cd7c3fe80 in camlTest__entry () at test.ml:2
#6  0x0000558cd7c3b109 in caml_program ()
#7  0x0000558cd7f5e260 in caml_start_program ()
#8  0x0000558cd7f43045 in caml_startup_common (argv=0x7fff13240df8, pooling=<optimized out>, pooling@entry=0) at startup.c:156
#9  0x0000558cd7f430ab in caml_startup_exn (argv=<optimized out>) at startup.c:161
#10 caml_startup (argv=<optimized out>) at startup.c:166
#11 0x0000558cd7c39b8c in main (argc=<optimized out>, argv=<optimized out>) at main.c:44

This is on 4.06.0+flambda using Core v0.10.0 and glibc 2.26-10.

I originally found the crash happening in async_unix: https://github.com/janestreet/async_unix/blob/05bdffe3f5f620206a28d4ab25a8963f465fd115/src/raw_scheduler.ml#L367

Hmm. Can you verify whether Time.format still segfaults:

  • without %Z in the format string?
  • with the format string "%Z"?
  • with a constant format string (no % at all)?

I expect that Unix.strftime will also segfault. Can you check? Does, say, Unix.gmtime also segfault?

copy commented

without %Z in the format string?

No.

with the format string "%Z"?

Yes.

with a constant format string (no % at all)?

No.

I expect that Unix.strftime will also segfault. Can you check? Does, say, Unix.gmtime also segfault?

This doesn't segfault, it prints "GMT":

open Core
let _ = print_endline @@ Unix.strftime (Unix.gmtime (Unix.time ())) "%Z"

I'm pretty perplexed. I think your Time.format call should turn into essentially exactly that strftime call. I plan to leave this to people more familiar with our open-source packaging process for now, sorry.

Actually, my current guess is that this line of unix_time_stubs.c is implicated:

tm.tm_zone = NULL; /* GNU extension, may not be visible everywhere */

...although that doesn't explain why Unix.strftime does work, since it's the same code. Shrug.

copy commented

I reran my tests a few more times to make sure it wasn't a glitch in the matrix, but I can reproduce the crashes and non-crashes stably.

I also went ahead and set a breakpoint on core_kernel_time_ns_format_tm and printed out the tm argument:

In the crashing variant:

(gdb) print *tm
$2 = {tm_sec = 56, tm_min = 20, tm_hour = 4, tm_mday = 15, tm_mon = 0, tm_year = 118, tm_wday = 1, tm_yday = 14, tm_isdst = 0, tm_gmtoff = 93825002766562, 
  tm_zone = 0x5a5c2c28 <error: Cannot access memory at address 0x5a5c2c28>}

In the non-crashing variant:

(gdb) print *tm
$1 = {tm_sec = 10, tm_min = 22, tm_hour = 4, tm_mday = 15, tm_mon = 0, tm_year = 118, tm_wday = 1, tm_yday = 14, tm_isdst = 0, tm_gmtoff = 0, tm_zone = 0x7ffff7226fd1 "GMT"}

It looks like tm_zone is not properly initialised, and it just coincidentally works when called directly. Note that on my systen __USE_BSD is not set, but is guarded here:

core/src/unix_time_stubs.c

Lines 108 to 111 in d7dac2e

#ifdef __USE_BSD
tm.tm_gmtoff = 0; /* GNU extension, may not be visible everywhere */
tm.tm_zone = NULL; /* GNU extension, may not be visible everywhere */
#endif

Looks like garbage left on the stack from some previous use. Do you think you could set a breakpoint at the top of core_time_ns_strftime and do a memset(&tm, 0, sizeof(struct tm)) and see if that makes the segfault go away in the segfaulting cases?

copy commented

Looks like garbage left on the stack from some previous use. Do you think you could set a breakpoint at the top of core_time_ns_strftime and do a memset(&tm, 0, sizeof(struct tm)) and see if that makes the segfault go away in the segfaulting cases?

Yes, that fixes the segfault.

Yeah so t seems that __USE_BSD is the wrong thing to check, or at least doesn't cover all versions of glibc. On my ubuntu vm those fields are conditionally defined if __USE_MISC is set.

In addition the functions in that file are really inconsistent in their treatment of those last two fields. Might be a good time to do some early spring cleaning in there...

I suspect just switching to __USE_MISC is also wrong and we should find another way to get this right without looking at weird internal macros.

Maybe one could static-assert that sizeof(struct tm) is the one we expect. Relying on getting a macro name right to avoid segfault seems quite brittle.

@hhugo says that sizeof(struct tm) might not work because there are some hidden fields in it (their names starting with underscores). Also arranging this check might be tricky.

In any case I don't understand why we need any conditional compilation: core_timegm doesn't have any and it's fine?

If we do have to tolerate optional absence of these fields after all, maybe unconditional memset at the beginning would be a safer option.

I vaguely think that struct tm blah = { }; should initialize the entire struct to zero. But it's been a while since I last C'd.

hhugo commented

The following seems to work for me.

index ef83439..46e23d3 100644
--- a/src/unix_time_stubs.c
+++ b/src/unix_time_stubs.c
@@ -66,7 +66,7 @@ CAMLprim value core_timegm (value tm_val) {
 
 CAMLprim value core_time_ns_strftime(value v_tm, value v_fmt)
 {
-  struct tm tm;
+  struct tm tm = {0};
   tm.tm_sec  = Int_val(Field(v_tm, 0));
   tm.tm_min  = Int_val(Field(v_tm, 1));
   tm.tm_hour = Int_val(Field(v_tm, 2));
@@ -76,10 +76,6 @@ CAMLprim value core_time_ns_strftime(value v_tm, value v_fmt)
   tm.tm_wday = Int_val(Field(v_tm, 6));
   tm.tm_yday = Int_val(Field(v_tm, 7));
   tm.tm_isdst = Bool_val(Field(v_tm, 8));
-#ifdef __USE_BSD
-  tm.tm_gmtoff = 0;  /* GNU extension, may not be visible everywhere */
-  tm.tm_zone = NULL; /* GNU extension, may not be visible everywhere */
-#endif
   return core_kernel_time_ns_format_tm(&tm, v_fmt);
 }
hhugo commented

A fix is being reviewed internally.

hhugo commented

The fix will be part of the next release.