fast_int

Question

xelatihy opened this issue 4 years ago · 1 comments

It would be nice to have a fast string-to-integer/unsigned parser with similar performance. I suspect it is just a matter of wrapping internal functions in a from_chars-like API but, without knowing the code well, it is not as easy to put together.

Answer 1 · 2021-06-22T21:00:09.000Z

It would be nice to have a fast string to integer/unsigned parser with similar performance.

It is not provided because integer parsing is a far easier problem, and I expect current system libraries to solve it well.
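
For reference, here is a minimal sketch of what relying on the system library can look like in C++17, using std::from_chars from <charconv>. The helper name parse_with_from_chars is hypothetical and only serves the illustration; it rejects empty input, out-of-range values and trailing characters.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of fast_float): parse the whole string as an
// integer of type I with the standard library.
template <typename I>
std::optional<I> parse_with_from_chars(std::string_view s) {
  I value{};
  const char *first = s.data();
  const char *last = s.data() + s.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  // Reject parse errors, out-of-range values, and trailing characters.
  if (ec != std::errc() || ptr != last) { return std::nullopt; }
  return value;
}
```

For example, parse_with_from_chars<uint64_t>("18446744073709551615") yields the largest 64-bit unsigned value, while the same call on "18446744073709551616" yields an empty optional.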

Here are examples of highly efficient code for 64-bit integers...

Parsing unsigned 64-bit integers...

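  const uint8_t *p = src;
  const uint8_t *const start_digits = p;
  uint64_t i = 0;
  // Accumulate digits; i may wrap around, which is checked below.
  while (parse_digit(*p, i)) { p++; }
  size_t digit_count = size_t(p - start_digits);
  // UINT64_MAX (18446744073709551615) has 20 digits, so longer inputs cannot fit.
  if ((digit_count == 0) || (digit_count > 20)) { return INCORRECT_TYPE; }
  if (digit_count == 20) {
    // A 20-digit value fits only if it starts with '1' and did not wrap around;
    // a wrapped result is always <= INT64_MAX, a valid one is at least 10^19.
    if (src[0] != uint8_t('1') || i <= uint64_t(INT64_MAX)) { return INCORRECT_TYPE; }
  }
  return i;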
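
The 20-digit check above relies on unsigned wraparound, which is easy to get wrong, so here is a small compile-time sketch of the arithmetic. The constant names are hypothetical. The largest 20-digit input starting with '1' is 19999999999999999999; it exceeds 2^64 by less than 2^64, so the accumulation wraps at most once and lands well below INT64_MAX.

```cpp
#include <cstdint>

// First 19 digits of 19999999999999999999; this prefix still fits in 64 bits.
constexpr uint64_t first_19_digits = 1999999999999999999ULL;
// The final step 10 * i + digit wraps modulo 2^64 (well defined for unsigned).
constexpr uint64_t wrapped = 10 * first_19_digits + 9;

// 19999999999999999999 - 2^64 = 1553255926290448383
static_assert(wrapped == 1553255926290448383ULL, "largest wrapped 20-digit result");
// Every wrapped result is therefore at most INT64_MAX...
static_assert(wrapped <= uint64_t(INT64_MAX), "wrapped results look small");
// ...while every 20-digit value that fits is at least 10^19, above INT64_MAX.
static_assert(10000000000000000000ULL > uint64_t(INT64_MAX), "valid results look large");
```

Inputs starting with '2' through '9' can wrap to arbitrary values, which is why the first digit is tested separately.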

Parsing signed 64-bit integers...

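  bool negative = (*src == '-');
  const uint8_t *p = src + negative; // skip the optional minus sign
  const uint8_t *const start_digits = p;
  uint64_t i = 0;
  while (parse_digit(*p, i)) { p++; }
  size_t digit_count = size_t(p - start_digits);
  // INT64_MAX (9223372036854775807) has 19 digits, and 19 digits cannot wrap a uint64_t.
  size_t longest_digit_count = 19;
  if ((digit_count == 0) || (digit_count > longest_digit_count)) { return INCORRECT_TYPE; }
  // The magnitude may reach INT64_MAX + 1 only for negative input (INT64_MIN).
  if (i > uint64_t(INT64_MAX) + uint64_t(negative)) { return INCORRECT_TYPE; }
  // Negate in unsigned arithmetic to avoid signed overflow.
  return negative ? (~i+1) : i;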

Both snippets rely on the following generic function...

template<typename I>
bool parse_digit(const uint8_t c, I &i) {
  // For any non-digit character, c - '0' wraps to a value above 9,
  // so a single comparison rejects it.
  const uint8_t digit = static_cast<uint8_t>(c - '0');
  if (digit > 9) {
    return false;
  }
  i = 10 * i + digit; // might overflow, we will handle the overflow later
  return true;
}
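
As a rough sketch of the from_chars-like wrapper the question has in mind, the snippets above could be packaged as follows. The names from_chars_result_sketch, from_chars_u64 and from_chars_i64 are hypothetical and not part of fast_float. Unlike the snippets, this version takes an explicit end pointer and skips leading zeros before counting digits; both are assumptions made for the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <system_error>

// Hypothetical result type modelled on std::from_chars_result.
struct from_chars_result_sketch {
  const char *ptr;  // first character that was not consumed
  std::errc ec;     // std::errc() on success
};

template <typename I>
bool parse_digit(const uint8_t c, I &i) {
  const uint8_t digit = static_cast<uint8_t>(c - '0');
  if (digit > 9) { return false; }
  i = 10 * i + digit;  // might overflow, handled by the callers
  return true;
}

// Hypothetical wrapper for unsigned 64-bit integers.
inline from_chars_result_sketch from_chars_u64(const char *first, const char *last,
                                               uint64_t &value) {
  const char *p = first;
  while (p != last && *p == '0') { p++; }  // leading zeros are not significant
  const char *const start_digits = p;
  uint64_t i = 0;
  while (p != last && parse_digit(static_cast<uint8_t>(*p), i)) { p++; }
  if (p == first) { return {first, std::errc::invalid_argument}; }
  const size_t digit_count = size_t(p - start_digits);
  // Same overflow test as in the snippet above, applied to significant digits.
  if (digit_count > 20 ||
      (digit_count == 20 && (*start_digits != '1' || i <= uint64_t(INT64_MAX)))) {
    return {p, std::errc::result_out_of_range};
  }
  value = i;
  return {p, std::errc()};
}

// Hypothetical wrapper for signed 64-bit integers.
inline from_chars_result_sketch from_chars_i64(const char *first, const char *last,
                                               int64_t &value) {
  const bool negative = (first != last && *first == '-');
  const char *const after_sign = first + (negative ? 1 : 0);
  const char *p = after_sign;
  while (p != last && *p == '0') { p++; }
  const char *const start_digits = p;
  uint64_t i = 0;
  while (p != last && parse_digit(static_cast<uint8_t>(*p), i)) { p++; }
  if (p == after_sign) { return {first, std::errc::invalid_argument}; }
  const size_t digit_count = size_t(p - start_digits);
  if (digit_count > 19 || i > uint64_t(INT64_MAX) + uint64_t(negative)) {
    return {p, std::errc::result_out_of_range};
  }
  // Negate in unsigned arithmetic; the conversion back to int64_t assumes a
  // two's-complement target (guaranteed since C++20).
  value = negative ? static_cast<int64_t>(~i + 1) : static_cast<int64_t>(i);
  return {p, std::errc()};
}
```

On success, value holds the parsed number and ptr points at the first unconsumed character, so "12abc" parses as 12 with ptr at 'a'. On failure, value is left untouched.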

You may need to accommodate various bit widths (8, 16, 32...), and you may also want to handle hexadecimal input and so forth.
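
One possible way to approach these two extensions, sketched under assumptions: parse the magnitude into a uint64_t as above, then narrow it to the target type with a range check, and use a hexadecimal counterpart of parse_digit. The names narrow_checked and parse_hex_digit are hypothetical and not part of fast_float.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: narrow an already-parsed 64-bit magnitude (plus sign)
// to an integer type I, returning false if the value does not fit.
template <typename I>
bool narrow_checked(uint64_t magnitude, bool negative, I &out) {
  static_assert(std::is_integral<I>::value, "I must be an integer type");
  const uint64_t max_value = uint64_t(std::numeric_limits<I>::max());
  if (negative) {
    if (!std::is_signed<I>::value) { return false; }
    // The most negative value of I has magnitude max + 1 (two's complement).
    if (magnitude > max_value + 1) { return false; }
    // Negate in unsigned arithmetic; the narrowing cast assumes a
    // two's-complement target (guaranteed since C++20).
    out = static_cast<I>(~magnitude + 1);
    return true;
  }
  if (magnitude > max_value) { return false; }
  out = static_cast<I>(magnitude);
  return true;
}

// Hypothetical hexadecimal counterpart of parse_digit.
template <typename I>
bool parse_hex_digit(const uint8_t c, I &i) {
  uint8_t digit;
  if (c >= '0' && c <= '9') {
    digit = static_cast<uint8_t>(c - '0');
  } else if (c >= 'a' && c <= 'f') {
    digit = static_cast<uint8_t>(c - 'a' + 10);
  } else if (c >= 'A' && c <= 'F') {
    digit = static_cast<uint8_t>(c - 'A' + 10);
  } else {
    return false;
  }
  i = 16 * i + digit;  // might overflow; the caller has to bound the digit count
  return true;
}
```

For hexadecimal input, the overflow check is simpler than in the decimal case: 16 significant hex digits fill a uint64_t exactly, so a caller only needs to reject inputs with more significant digits than that.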

Pull requests invited.