StanfordAHA/lassen

CW fp_mult not working properly

Opened this issue · 5 comments

Kuree commented

Example:

inst = asm.fp_mul()
data0 = Data(0x8001)  # smallest-magnitude negative bfloat16 denormal
data1 = Data(0x3f80)  # 1.0
res, res_p, _ = pe(inst, data0, data1)       # functional model result
if CAD_ENV:
    rtl_tester(inst, data0, data1, res=res)  # check the RTL against that result

Should get 0x8001, but hardware gives 0x8000

Kuree commented

It turns out 0x8001 is a denormal float. Turning on ieee_compliance should fix this.
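
A quick decode (a small Python sketch, assuming the usual bfloat16 layout of 1 sign bit, 8 exponent bits, and 7 fraction bits) shows why:

def decode_bf16(bits):
    # split a 16-bit pattern into sign / exponent / fraction fields (1/8/7 bits)
    return (bits >> 15) & 0x1, (bits >> 7) & 0xff, bits & 0x7f

print(decode_bf16(0x8001))  # (1, 0, 1): exponent 0, nonzero fraction -> denormal
print(decode_bf16(0x3f80))  # (0, 127, 0): 1.0
# 0x8001 * 1.0 should stay 0x8001; flushing denormals to zero gives 0x8000 (-0.0)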

@rdaly525 Can you make the necessary changes so that fp_mult also has ieee_compliance turned on?

Kuree commented

Even after turning on ieee_compliance, it still doesn't work for some input vectors, as shown below:

data0 = 0x3000
data1 = 0x8d21
hardware_out: 0x8002
expected_out: 0x8003

It seems to me that the glue logic is wrong in the float_CW library.
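
For reference, decoding these operands exactly (another sketch assuming the 1/8/7 bfloat16 layout) puts the exact product about 2.52 denormal ulps below zero, so round-to-nearest-even would give 0x8003 while truncation gives 0x8002:

from fractions import Fraction

def bf16_exact(bits):
    # decode a bfloat16 pattern (1 sign / 8 exponent / 7 fraction bits) to an exact Fraction
    sign = -1 if bits >> 15 else 1
    exp = (bits >> 7) & 0xff
    frac = bits & 0x7f
    if exp == 0:  # denormal
        return sign * Fraction(frac, 1 << 7) * Fraction(1, 1 << 126)
    return sign * (1 + Fraction(frac, 1 << 7)) * Fraction(2) ** (exp - 127)

product = bf16_exact(0x3000) * bf16_exact(0x8d21)
ulp = Fraction(1, 1 << 133)  # smallest positive bfloat16 denormal
print(product / ulp)         # -161/64 = -2.515625 denormal ulps
# round-to-nearest-even -> -3 ulps (0x8003); truncation -> -2 ulps (0x8002)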

rdaly525 commented

@cdonovick, I remember you did a very nice breakdown of exactly how the glue logic for bfloat (emulating 7 bits of mantissa with 10 bits) breaks rounding in some cases. If you recall where that is, could you link it?

@Kuree I suspect that we will not be able to handle all rounding cases with the Verilog glue logic, and therefore will not get bit-precise testing. Nikhil was the originator of this glue logic.

cdonovick commented

@rdaly525 I don't remember exactly where I broke it down, but as I recall we are currently simulating bfloat16 in software by doing:

operands = float32(*inputs) # cast to float32
fresult = op(*operands, RNE) # perform operation on operands in float32 under RNE
bresult = bfloat16(fresult, RNE) # truncate to 16 bit using RNE

This corresponds more or less to:

x = RealValue()
result =  bfloat16(float32(x, RNE), RNE)

While this is correct in most cases, it is not actually equivalent to:

x = RealValue()
result =  bfloat16(x, RNE)

The first round may round up to a point where the second round also rounds up, even though a single round of the initial number would not have rounded up.

For example, consider:

round_tenths = lambda v: round(v, 1)  # round to one decimal place (Python's round is RNE)
round_int = lambda v: round(v)        # round to the nearest integer

x = 1.49
xt = round_tenths(x) # 1.5
xi = round_int(x) # 1
assert round_int(xt) != xi # 2 != 1

As I recall the software implementation was also not handling NaN correctly.

If we wanted to correctly implement simulated bfloat16 in software, we could use the following protocol (a rough sketch follows the list):
1. Set the system rounding mode to round toward zero (https://en.cppreference.com/w/cpp/numeric/fenv/feround, https://en.cppreference.com/w/cpp/numeric/fenv/FE_round).
2. After every op, check whether the result is NaN; if so, properly construct a bfloat NaN.
3. If the result is not in a "tie" condition, round the result normally.
4. Otherwise, check whether the result was inexact (https://en.cppreference.com/w/cpp/numeric/fenv/FE_exceptions).
5. If the result was inexact, round away from zero (since the result was rounded toward zero to enter the tie condition).
6. Otherwise, round to nearest even.
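
At the bit level, steps 2-6 might look roughly like the following (a minimal Python sketch under my own assumptions: Python does not expose fenv, so the FE_INEXACT flag is passed in as a plain boolean, and the function name and example pattern are hypothetical):

def rtz_f32_to_bf16(bits32, fe_inexact):
    # bits32: bit pattern of the float32 op result, computed under round toward zero (step 1)
    # fe_inexact: whether the op raised FE_INEXACT
    if (bits32 & 0x7fffffff) > 0x7f800000:
        return (bits32 >> 16) | 0x0040            # step 2: construct a quiet bfloat NaN
    upper = bits32 >> 16                          # truncated bfloat16 candidate
    lower = bits32 & 0xffff                       # discarded bits
    if lower != 0x8000:                           # step 3: not a tie, round normally (RNE)
        return upper + (lower > 0x8000)
    if fe_inexact:                                # steps 4/5: RTZ already dropped bits,
        return upper + 1                          # so the true result is past the tie
    return upper + (upper & 1)                    # step 6: exact tie, round to nearest even

pat = 0x00028000  # hypothetical float32 result sitting exactly on a bfloat16 halfway point
print(hex(rtz_f32_to_bf16(pat, fe_inexact=False)))  # 0x2: a true tie, round to even
print(hex(rtz_f32_to_bf16(pat, fe_inexact=True)))   # 0x3: not a true tie, round up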

The reasoning for the inexact check is as follows:

x = 0.51
xt = round_tenths(x, RTZ)  # 0.5
xi = round_int(x, RNE) # 1
assert round_int(xt, RNE) != xi # 0 != 1

Another option would be to just use round toward zero, since round_int(x, RTZ) == round_int(round_tenths(x, RTZ), RTZ) for all x (with the caveat that the round functions must be NaN-preserving).
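
Continuing the decimal analogy, a quick sanity check (a Python sketch with made-up helper names) that pure truncation composes where the mixed rounding above does not:

import math

round_tenths_rtz = lambda v: math.trunc(v * 10) / 10  # truncate at one decimal place
round_int_rtz = lambda v: math.trunc(v)               # truncate to an integer

for x in (1.49, 0.51, -1.49, -0.51):
    assert round_int_rtz(round_tenths_rtz(x)) == round_int_rtz(x)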