jimbaker/fl-string-pep

Tagged numbers

Can we use the same approach from #4 to support tagged numbers?

There's an existing convention in Python of using suffixes, as seen in the imaginary literals used to write complex numbers, such as the 4j in 2 - 4j. This is also discussed on python-ideas in at least this thread: https://mail.python.org/archives/list/python-ideas@python.org/message/3Z2YTIGJLSYMKKIGRSFK2DTDIXXVDGEK/

As with tagged strings and tagged expressions, the tag is a user-defined callable and it can be any valid Python name; however, it is used as a suffix. The tag is called with the string representation of a number literal, and it returns some object. Examples, some drawn from the above thread:

  • 1.1d is equivalent to Decimal("1.1"). It might be desirable to make this tag (or equivalently D) a builtin function.
  • 3f is equivalent to Fraction("3"), which immediately means that 1/3f results in Fraction(1, 3).
  • 2020_06_28date results in datetime.date(2020, 6, 28). Similar extensions can be done with time, timestamp, and durations like hour, etc. It's up to the tag to determine the specific parsing, e.g. the date tag could parse any year because the month and day will always be the least significant 4 digits.
  • Units like cm, m, km, etc, could be defined. At the very least, this could support a nice educational usage model. With a suitable __repr__ defined, this could also output nicely with the original tags used as suffixes.
  • It does seem to mix nicely with the use of _ as a digit grouping separator.

C++ defines some standard suffixes as part of its support for user-defined literals: https://en.cppreference.com/w/cpp/language/user_literal; it would be interesting to see how widely non-standard user-defined literals are used in C++ codebases.

Unlike calling Decimal("1.1") or Decimal("1.23e10") directly, a tagged number lets the parser verify during compilation, rather than at run time, that the text is a valid numeric literal. The parser does not check value ranges, though, e.g. 00-23 for the hour place. Other radixes could be used as well. To disambiguate the tag for hexadecimal numbers, the number would need to be enclosed in parens, e.g. (0xff)f would result in Fraction(255). (Maybe the tag should also get the radix as an argument?)
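As a rough illustration of the kind of check the parser would be doing, the sketch below uses the existing ast module to decide whether a piece of text is a single numeric literal, and reports its radix. The helper name numeric_literal_radix is hypothetical, and this runs at run time rather than in the compiler, so it only approximates the proposed behavior.

```python
import ast


def numeric_literal_radix(text: str) -> int:
    """Return the radix of `text` if it is exactly one Python numeric literal.

    Raises ValueError otherwise. Only syntax is checked, not value ranges.
    """
    try:
        node = ast.parse(text, mode="eval").body
    except SyntaxError as exc:
        raise ValueError(f"not a numeric literal: {text!r}") from exc

    is_number = (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float, complex))
        and not isinstance(node.value, bool)  # rules out True / False
        # The constant must span the whole text: no parens, no whitespace.
        and node.col_offset == 0
        and node.end_col_offset == len(text)
    )
    if not is_number:
        raise ValueError(f"not a numeric literal: {text!r}")

    # Assumption: the radix is inferred from the standard prefixes only.
    return {"0x": 16, "0o": 8, "0b": 2}.get(text[:2].lower(), 10)
```

For example, numeric_literal_radix("1.23e10") returns 10 and numeric_literal_radix("0xff") returns 16, while "1.2.3" is rejected. A value such as "25" is accepted even where an hour tag would want 00-23, so range checks stay with the tag.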

Presumably we can reuse the thunk mechanism so that it supports, in effect, a user-defined extension of the constant pool (depending on the specific underlying implementation). Once computed, the same value is returned (taking into account object lifetimes).
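One way to picture the "compute once, then return the same value" behavior is a small caching thunk per literal site. The class below is purely a sketch under that assumption: the name TaggedConstant and its get method are invented here, and the real thunk mechanism may work quite differently.

```python
from decimal import Decimal

_UNSET = object()  # sentinel, so that a tag may legitimately return None


class TaggedConstant:
    """Hypothetical per-site thunk: calls the tag at most once."""

    def __init__(self, tag, literal: str):
        self.tag = tag
        self.literal = literal
        self._value = _UNSET

    def get(self):
        # The first access runs the tag; later accesses reuse the result.
        if self._value is _UNSET:
            self._value = self.tag(self.literal)
        return self._value


# Emulates the site `1.1d` with Decimal standing in for the `d` tag.
site = TaggedConstant(Decimal, "1.1")
first = site.get()
second = site.get()
```

Here first and second are the same object, which is safe for immutable results like Decimal. A tag returning a mutable object would share it across every evaluation of that site, which is presumably part of what "taking into account object lifetimes" has to address.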