Implementation-defined behaviour in montgomery_reduce()

Question

Implementation-defined behaviour in montgomery_reduce()

rod-chapman opened this issue 3 months ago · 1 comments

Our attempts at formal verification of the montgomery_reduce() function in ref/reduce.c reveal that this function depends on a subtle implementation-defined behaviour - namely a cast from int32_t to int16_t when the valued being converted lies outside the range of int16_t. (See C17 or C23 standards, para 6.3.1.3 for details.)

This is unfortunate from a portability point of view. It also confounds understanding of the code, since the actual semantics typically implemented by gcc and clang is rather non-obvious.

For example, line 20 of reduce.c:
t = (int16_t)a * QINV;
the cast of a to int16_t exhibits implementation-defined behaviour. There is also a second (invisible, implicit) cast there to int16_t that follows the evaluation of the multiplication, just before the assignment to t takes place.

I have developed an alternative implementation that is free from such non-portable behaviour.

Answer 1 · 2024-06-07T03:50:20.000Z

Conceptually, a * QINV is an unsigned computation mod 2^16, so I think the conversion int32_t -> int16_t should be replaced by int32_t -> uint16_t, which is well-defined. However, the result must then be lifted to a signed canonical representative, and I have not come across a C implementation not implementing uint16_t -> int16_t as exactly that (anyone else?). Still, @rod-chapman's right that this is strictly speaking implementation-defined.

I'd be inclined to merely document this limitation (in the code and at the top-level stating the C language requirements for this repository) and change the code to something like this:

int16_t montgomery_reduce(int32_t a)
{
  uint16_t u;
  int16_t t;
  // Compute a*q^{-1} mod 2^16 in unsigned representatives
  u = (uint16_t)a * QINV;
  // Lift to signed canonical representative mod 2^16.
  // PORTABILITY: This relies on uint16_t -> int16_t
  // being implemented as the inverse of int16_t -> uint16_t,
  // which is not mandated by the standard.
  t = (int16_t) u;
  // By construction, the LHS is divisible by 2^16
  t = (a - (int32_t)t*KYBER_Q) >> 16;
  return t;
}

(NB: IIUC, semantically the arithmetic (uint16_t) a * QINV would still be signed (if sizeof(int) > 2) due to integer promotion, but that doesn't matter, again since cast to an unsigned type is always well-defined)