JuliaLang/julia

^(big(foo), bar) can cause SIGABRT

danluu opened this issue · 8 comments

Example:

julia>^(big(96608869069402268615522366320733234710),16374500563449903721)
gmp: overflow in mpz type

signal (6): Aborted
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 1638697868)
__gmpz_n_pow_ui at /home/dluu/dev/julia/usr/bin/../lib/libgmp.so (unknown line)
^ at gmp.jl:337
bigint_pow at gmp.jl:346
^ at gmp.jl:351
jlcall_^;60129 at  (unknown line)
jl_apply at /home/dluu/dev/julia/src/julia.h:988
jl_trampoline at /home/dluu/dev/julia/src/builtins.c:821
jl_apply at /home/dluu/dev/julia/src/julia.h:988
jl_apply_generic at /home/dluu/dev/julia/src/gf.c:1592
unknown function (ip: 2066563356)
unknown function (ip: 2066564620)
unknown function (ip: 2066567534)
unknown function (ip: 2066563698)
unknown function (ip: 2066658267)
jl_toplevel_eval at /home/dluu/dev/julia/usr/bin/../lib/libjulia-debug.so (unknown line)
jl_f_top_eval at /home/dluu/dev/julia/src/builtins.c:425
eval_user_input at REPL.jl:54
jlcall_eval_user_input;60109 at  (unknown line)
jl_apply at /home/dluu/dev/julia/src/julia.h:988
jl_trampoline at /home/dluu/dev/julia/src/builtins.c:821
jl_apply at /home/dluu/dev/julia/src/julia.h:988
jl_apply_generic at /home/dluu/dev/julia/src/gf.c:1592
anonymous at task.jl:96
jl_apply at /home/dluu/dev/julia/src/julia.h:988
jl_trampoline at /home/dluu/dev/julia/src/builtins.c:821
unknown function (ip: 2066609306)
unknown function (ip: 2066612155)
unknown function (ip: 2066610812)
unknown function (ip: 2066610910)
jl_handle_stack_switch at /home/dluu/dev/julia/usr/bin/../lib/libjulia-debug.so (unknown line)
julia_trampoline at /home/dluu/dev/julia/usr/bin/../lib/libjulia-debug.so (unknown line)
unknown function (ip: 4203809)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 4199369)
Aborted (core dumped)

This is probably not specific to ^, since it happens whenever gmp tries to realloc too much space:

  if (sizeof (mp_size_t) == sizeof (int))
    {
      if (UNLIKELY (new_alloc > ULONG_MAX / GMP_NUMB_BITS))
        {
          fprintf (stderr, "gmp: overflow in mpz type\n");
          abort ();
        }
    }
  else
    {
      if (UNLIKELY (new_alloc > INT_MAX))
        {
          fprintf (stderr, "gmp: overflow in mpz type\n");
          abort ();
        }
    }

It seems a bit impolite to crash julia because of this. What do people think is a good fix?

  1. Before calling gmp, try to figure out if the calculation will result in trying to realloc too much space.
  2. Catch SIGABRT and throw a julia error
  3. Patch gmp to return a null pointer instead of throwing SIGABRT
  4. ???

I don't like 1 because it requires doing something different for each function (I think?), but 2 and 3 have their own downsides.
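
For concreteness, option 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: `call_guarded`, `on_sigabrt`, `always_aborts` and `returns_normally` are hypothetical names, and `always_aborts` stands in for a library call such as the failing power function. It relies on POSIX `sigaction` and `sigsetjmp`/`siglongjmp`. One of the downsides is visible in the code: jumping out of the handler skips whatever cleanup the library had not yet done, so memory may leak and the library's internal state may be inconsistent afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical: a single process-wide recovery point (not thread-safe). */
static sigjmp_buf abort_jmp;

/* Handler: jump back to the recovery point instead of returning,
 * which prevents abort() from terminating the process. */
static void on_sigabrt(int sig)
{
    (void)sig;
    siglongjmp(abort_jmp, 1);
}

/* Hypothetical wrapper: runs fn(arg) and reports whether it finished
 * normally (true) or called abort() (false). */
static bool call_guarded(void (*fn)(void *), void *arg)
{
    struct sigaction sa, old;
    bool ok;

    sa.sa_handler = on_sigabrt;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGABRT, &sa, &old);      /* install our handler */

    if (sigsetjmp(abort_jmp, 1) == 0) { /* 1: also save the signal mask */
        fn(arg);
        ok = true;
    } else {
        ok = false;                     /* reached via siglongjmp */
    }

    sigaction(SIGABRT, &old, NULL);     /* restore the previous handler */
    return ok;
}

/* Stand-ins for a library call that aborts and one that does not. */
static void always_aborts(void *unused)    { (void)unused; abort(); }
static void returns_normally(void *unused) { (void)unused; }
```

With this, `call_guarded(always_aborts, NULL)` returns false instead of killing the process, and the caller could turn that into a language-level error.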

That's galling. Libraries should not call abort. Even more surprising, __gmpz_pow_ui returns void, so it easily could have returned an error code instead.

Have we reported this upstream? Would be nice to have a ticket number.

Running the above in 0.6.0-dev produces:

julia> ^(big(96608869069402268615522366320733234710),16374500563449903721)
ERROR: OverflowError()
Stacktrace:
 [1] bigint_pow(::BigInt, ::Int128) at .\gmp.jl:435
 [2] ^(::BigInt, ::Int128) at .\gmp.jl:442

julia>

Seems fixed. Should we test for this, or close?

No, I'm sure there are still cases where GMP can internally overflow and call abort. We have a very minimal check that happens to catch that case (on 32-bit systems).
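
To illustrate why a narrow check misses cases, here is a sketch of a more general pre-check for powers, in the spirit of option 1. It is not the check Julia uses; `pow_may_overflow` is a hypothetical helper, and the bit limit is an assumption derived from the `new_alloc > INT_MAX` branch of the GMP snippet quoted above, taking 64-bit limbs. The idea is that a base with n significant bits raised to the power e needs at most n*e bits, so comparing that bound against the limit is conservative: it can reject a result that would just fit, but should not accept one that cannot.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GMP or Julia.
 * base_bits: number of significant bits in |base|
 * exponent:  the non-negative exponent
 * max_bits:  largest result size the library accepts, in bits
 * Returns true if base^exponent might exceed max_bits.
 * Callers are assumed to handle base 0, 1 and -1 separately, since
 * those results stay tiny no matter how large the exponent is. */
static bool pow_may_overflow(uint64_t base_bits, uint64_t exponent,
                             uint64_t max_bits)
{
    if (base_bits == 0 || exponent == 0)
        return false;               /* 0^e and b^0 are always small */
    /* Equivalent to base_bits * exponent > max_bits, written as a
     * division so the product itself cannot wrap around. */
    return base_bits > max_bits / exponent;
}
```

Assuming a limit of `(uint64_t)INT_MAX * 64` bits, the original example (a 127-bit base with exponent 16374500563449903721) is flagged, and so is a 2-bit base with exponent 3^27 = 7625597484987, which is the shape of the `BigInt(3)^3^3^3` report in this thread. That exponent fits comfortably in a 64-bit unsigned integer, so a check that only looks at the size of the exponent would not catch it.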

grep0 commented

Still there in 1.0:

julia> BigInt(3)^3^3^3
gmp: overflow in mpz type

signal (6): Aborted
in expression starting at no file:0
gsignal at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
abort at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
__gmpz_realloc at /home/mvr/julia-1.0.0/bin/../lib/julia/libgmp.so (unknown line)
__gmpz_n_pow_ui at /home/mvr/julia-1.0.0/bin/../lib/julia/libgmp.so (unknown line)
pow_ui! at ./gmp.jl:146 [inlined]
pow_ui at ./gmp.jl:147

Yes, it’s an upstream issue with GMP, which should not call abort like this. If someone wants to track down all the places GMP calls abort and fix them, it would be a worthy service.

This regressed in 1.6:

❯ jl +1.5
julia> ^(big(96608869069402268615522366320733234710),16374500563449903721)
ERROR: OutOfMemoryError()
Stacktrace:
 [1] pow_ui! at ./gmp.jl:171 [inlined]
 [2] pow_ui(::BigInt, ::UInt64) at ./gmp.jl:172
 [3] ^ at ./gmp.jl:576 [inlined]
 [4] bigint_pow(::BigInt, ::Int128) at ./gmp.jl:597
 [5] ^(::BigInt, ::Int128) at ./gmp.jl:602
 [6] top-level scope at REPL[1]:1
❯ jl +1.6
julia> ^(big(96608869069402268615522366320733234710),16374500563449903721)
gmp: overflow in mpz type

signal (6): Aborted
in expression starting at REPL[1]:1
unknown function (ip: 0x7fbd3c08e83c)
raise at /usr/lib/libc.so.6 (unknown line)
abort at /usr/lib/libc.so.6 (unknown line)
__gmpz_n_pow_ui at /home/tim/Julia/depot/juliaup/julia-1.6.7+0.x64.linux.gnu/bin/../lib/julia/libgmp.so (unknown line)
pow_ui! at ./gmp.jl:171 [inlined]
pow_ui at ./gmp.jl:172
^ at ./gmp.jl:579 [inlined]
bigint_pow at ./gmp.jl:600
^ at ./gmp.jl:605

Even though we're still carrying the patch, and we do configure it:

julia> Base.GMP.ALLOC_OVERFLOW_FUNCTION[]
true

Other cases simply segfault now:

❯ jl +1.5 -e '555555555555555555555555555555555555555555555555555^55555555555555555'
ERROR: OutOfMemoryError()
Stacktrace:
 [1] pow_ui! at ./gmp.jl:171 [inlined]
 [2] pow_ui(::BigInt, ::UInt64) at ./gmp.jl:172
 [3] ^ at ./gmp.jl:576 [inlined]
 [4] bigint_pow(::BigInt, ::Int64) at ./gmp.jl:597
 [5] ^ at ./gmp.jl:602 [inlined]
 [6] macro expansion at ./none:0 [inlined]
 [7] literal_pow(::typeof(^), ::BigInt, ::Val{55555555555555555}) at ./none:0
 [8] top-level scope at none:1

❯ jl +1.6 -e '555555555555555555555555555555555555555555555555555^55555555555555555'
signal (11): Segmentation fault
in expression starting at none:1
__gmp_tmp_reentrant_alloc at /home/tim/Julia/depot/juliaup/julia-1.6.7+0.x64.linux.gnu/bin/../lib/julia/libgmp.so (unknown line)
__gmpz_n_pow_ui at /home/tim/Julia/depot/juliaup/julia-1.6.7+0.x64.linux.gnu/bin/../lib/julia/libgmp.so (unknown line)
pow_ui! at ./gmp.jl:171 [inlined]
pow_ui at ./gmp.jl:172
^ at ./gmp.jl:579 [inlined]
bigint_pow at ./gmp.jl:600
^ at ./gmp.jl:605 [inlined]
macro expansion at ./none:0 [inlined]
literal_pow at ./none:0

Bisected to #45375