[fuzzing] `math.sin` panic: integer part of floating point value out of bounds
squeek502 opened this issue · 5 comments
Test cases (the input format is a bit weird, sorry. EDIT: updated the tests with a slightly more sane input format):
test "integer part out of bounds f64" {
const float = @bitCast(f64, @as(u64, 0b1111111101000001000000001111110111111111100000000000000000000001));
_ = std.math.sin(float);
}
test "integer part out of bounds f32" {
const float = @bitCast(f32, @as(u32, 0b11100011111111110000000000000000));
_ = std.math.sin(float);
}
Result:
thread 1304409 panic: integer part of floating point value out of bounds
/home/ryan/Programming/zig/zig/lib/std/math/sin.zig:61:13: 0x209656 in sin_ (test)
var j = @floatToInt(I, y);
^
/home/ryan/Programming/zig/zig/lib/std/math/sin.zig:20:20: 0x2092bb in sin (test)
f64 => sin_(T, x),
^
/home/ryan/Programming/zig/zig/lib/std/math/sin.zig:134:12: 0x2076f4 in test "fuzzed" (test)
_ = sin(float);
^
Found via https://github.com/squeek502/zig-std-lib-fuzzing
EDIT: More inputs that trigger this bug (all of the known inputs are collected in the test sketch after this list):
0b1111111100000000000000001111111111111111111111111111000000000000
0b1110011000010111000000001111111100100011000000000111000000000000
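If it's useful, here's a sketch of a single test that runs every f64 bit pattern collected so far; the test name and structure are my own:

const std = @import("std");

test "sin inputs found by fuzzing" {
    // f64 bit patterns that currently trigger the panic.
    const inputs = [_]u64{
        0b1111111101000001000000001111110111111111100000000000000000000001,
        0b1111111100000000000000001111111111111111111111111111000000000000,
        0b1110011000010111000000001111111100100011000000000111000000000000,
    };
    for (inputs) |bits| {
        // Each of these currently panics with
        // "integer part of floating point value out of bounds".
        _ = std.math.sin(@bitCast(f64, bits));
    }
}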
EDIT#2: For reference, here's what glibc's sin function gives for an input that Zig panics on:
input: 1111111101000001000000001111110111111111100000000000000000000001
result: 0011111111011110100010000101000011100001000001011011111111000100
and musl gives the same as glibc:
result: 0011111111011110100010000101000011100001000001011011111111000100
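For anyone reproducing this comparison, here's a minimal sketch of how to print the bit patterns in the same format (this assumes a fixed sin, since this input currently panics):

const std = @import("std");

pub fn main() void {
    const input: u64 = 0b1111111101000001000000001111110111111111100000000000000000000001;
    const result = std.math.sin(@bitCast(f64, input));
    // Zero-padded 64-digit binary, matching the glibc/musl output above.
    std.debug.print("input:  {b:0>64}\nresult: {b:0>64}\n", .{ input, @bitCast(u64, result) });
}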
The issue here is that a codepath for reducing large arguments was missed during the initial port of this code.
Our sin implementation is based on Go's (which is in turn based on cephes), and the relevant missing branch can be seen here: https://github.com/golang/go/blob/master/src/math/sin.go#L142
All of the test cases provided fall into that branch, relying on the trig reduction we haven't implemented.
This issue effectively boils down to porting this: https://github.com/golang/go/blob/72c52bfbe2f72fbcacc865e18f132366bdd2effa/src/math/trig_reduce.go#L29 which I suspect would fix all of these failures.
This issue will also manifest in our tan and cos, which use similar argument reduction (and are missing this branch as well).
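To make the shape of the fix concrete, here's a rough sketch of Go's dispatch at the linked sin.go line, translated to Zig. This is not the actual sin.zig code: reduce_threshold and trigReduce refer to the port later in this thread, and the small-argument path is a simplified stand-in for the existing Cody-Waite reduction.

const std = @import("std");
const math = std.math;

// Sketch only: mirrors Go's branch at sin.go#L142.
fn reduceToOctant(x: f64) struct { j: u64, z: f64 } {
    if (x >= reduce_threshold) {
        // Large |x|: Payne-Hanek style reduction against 4/pi.
        const r = trigReduce(f64, x);
        return .{ .j = r.j, .z = r.z };
    }
    // Small |x|: floor-based reduction; the @floatToInt here is where the
    // current "integer part out of bounds" panic comes from when this path
    // is taken for a large x.
    var y = math.floor(x * (4.0 / math.pi));
    var j = @floatToInt(u64, y);
    // Map odd octants up to even ones, as sin_ does.
    if (j & 1 == 1) {
        j += 1;
        y += 1;
    }
    return .{ .j = j & 7, .z = x - y * (math.pi / 4.0) };
}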
I've somewhat sloppily ported trigReduce and it seems to be working for f64, but I'm totally unsure how to adapt it for f32, the mpi4 array especially.
const std = @import("std");
const math = std.math;

pub const reduce_threshold = 1 << 29;

const pi_divided_by_four = 0.785398163397448309615660845819875721049292349843776;

pub fn trigReduce(comptime T: type, x: T) struct { j: std.meta.Int(.unsigned, @typeInfo(T).Float.bits), z: T } {
    if (x < pi_divided_by_four) {
        return .{ .j = 0, .z = x };
    }

    const bit_count = std.meta.bitCount(T);
    const TU = std.meta.Int(.unsigned, bit_count);
    const TUShift = math.Log2Int(TU);
    const mantissa_bits = math.floatMantissaBits(T);
    const exponent_bits = math.floatExponentBits(T);
    const exponent_mask = (1 << exponent_bits) - 1;
    const exponent_bias = (1 << (exponent_bits - 1)) - 1;

    // Extract out the integer and exponent such that x = ix * 2^exponent.
    var ix = @bitCast(TU, x);
    var exponent: i32 = @truncate(u16, (ix >> mantissa_bits) & exponent_mask);
    exponent -= exponent_bias + mantissa_bits;
    ix &= ~@as(TU, exponent_mask << mantissa_bits);
    ix |= 1 << mantissa_bits;

    // Use the exponent to extract the 3 appropriate u64 digits from mpi4,
    // B ~ (z0, z1, z2), such that the product's leading digit has exponent -61.
    // Note: exponent >= -53 since x >= pi/4, and exponent < 971 for the
    // largest finite f64.
    const digit = @intCast(u32, exponent + 61) / 64;
    const bitshift = @intCast(u32, exponent + 61) % 64;
    // Go shifts right by (64 - bitshift) even when bitshift == 0; shifting by
    // the full bit width yields 0 in Go but is illegal in Zig, so use
    // math.shl/math.shr, which return 0 for oversized shift amounts.
    const z0 = math.shl(TU, mpi4[digit], bitshift) | math.shr(TU, mpi4[digit + 1], 64 - bitshift);
    const z1 = math.shl(TU, mpi4[digit + 1], bitshift) | math.shr(TU, mpi4[digit + 2], 64 - bitshift);
    const z2 = math.shl(TU, mpi4[digit + 2], bitshift) | math.shr(TU, mpi4[digit + 3], 64 - bitshift);

    // Multiply the mantissa by the digits and extract the upper two digits (hi, lo).
    const z2mul = math.mulWide(TU, z2, ix);
    const z1mul = math.mulWide(TU, z1, ix);
    const z0mul = math.mulWide(TU, z0, ix);
    const z2hi = @truncate(TU, z2mul >> bit_count);
    const z1hi = @truncate(TU, z1mul >> bit_count);
    const z1lo = @truncate(TU, z1mul);
    const z0lo = @truncate(TU, z0mul);
    // Mirrors Go's bits.Add64 chain: lo = z1lo + z2hi with carry out, then the
    // carry is folded into the high word; Go discards any carry out of the
    // high word, hence the wrapping adds.
    var carry: TU = 0;
    var lo: TU = undefined;
    if (@addWithOverflow(TU, z1lo, z2hi, &lo)) {
        carry = 1;
    }
    var hi = z0lo +% z1hi +% carry;

    // The top 3 bits are j.
    var j = hi >> (bit_count - 3);
    // Extract the fraction and find its magnitude.
    hi = hi << 3 | lo >> (bit_count - 3);
    const lz = @clz(TU, hi);
    const e: TU = exponent_bias - (@as(TU, lz) + 1);
    // Clear the implicit mantissa bit and shift into place.
    hi = (hi << @intCast(TUShift, lz + 1)) | (lo >> @intCast(TUShift, bit_count - (lz + 1)));
    hi >>= bit_count - mantissa_bits;
    // Include the exponent and convert to a float.
    hi |= e << mantissa_bits;
    var z = @bitCast(T, hi);
    // Map zeros to origin.
    if (j & 1 == 1) {
        j += 1;
        j &= 7;
        z -= 1;
    }
    return .{
        .j = j,
        // Multiply the fractional part by pi/4.
        .z = z * pi_divided_by_four,
    };
}
// mpi4 is the binary digits of 4/pi as a u64 array,
// that is, 4/pi = sum over i of mpi4[i] * 2^(-64*i).
// 19 64-bit digits and the leading one bit give 1217 bits
// of precision to handle the largest possible f64 exponent.
const mpi4 = [_]u64{
    0x0000000000000001,
    0x45f306dc9c882a53,
    0xf84eafa3ea69bb81,
    0xb6c52b3278872083,
    0xfca2c757bd778ac3,
    0x6e48dc74849ba5c0,
    0x0c925dd413a32439,
    0xfc3bd63962534e7d,
    0xd1046bea5d768909,
    0xd338e04d68befc82,
    0x7323ac7306a673e9,
    0x3908bf177bf25076,
    0x3ff12fffbc0b301f,
    0xde5e2316b414da3e,
    0xda6cfd9e4f96136e,
    0x9e8c7ecd3cbfd45a,
    0xea4f758fd7cbe2f6,
    0x7a0e73ef14a525d4,
    0xd7f6bf623f1aba10,
    0xac06608df8f6d757,
};
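And a quick sanity test of the port on the original failing input; this test is my own and only checks the range invariants on j and z, not full accuracy against glibc/musl:

test "trigReduce on a previously-panicking input" {
    const x = @bitCast(f64, @as(u64, 0b1111111101000001000000001111110111111111100000000000000000000001));
    const r = trigReduce(f64, x);
    // The octant index is always in [0, 8) and the reduced argument
    // satisfies |z| <= pi/4.
    try std.testing.expect(r.j < 8);
    try std.testing.expect(math.fabs(r.z) <= pi_divided_by_four);
}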
Is there a reason why the implementation is based on Go and not musl?
@Poetastrophe
The Go code is based on cephes, which has documented accuracy figures attached and is performance-optimized (per the IEEE accuracy numbers given in its write-up). I don't know why Google chose to simplify a complex implementation (which is error-prone).
To me, their implementation looks like an incomplete translation rather than a simplification.
Musl's main goals are correctness and maintainability, and its implementation comes from Sun (and is much simpler).
Personally, I think the Go library should be trusted less, since the accuracy of cephes is "experimentally derived".
Also, cephes was not designed to incorporate fused multiply-add (FMA), and it can be tricky to estimate error bounds (doi).
I can't remember exactly since it was a while ago, but the simplicity of the implementation was probably a factor. If I were to implement this again, I would probably stick to musl only, so that the whole set of math functions comes from a single source.