/alphanumerize

Convert numbers to an alphabet format

Primary LanguageJavaScriptMIT LicenseMIT

Alphanumerize

Convert numbers to an alphabetical format. Useful for creating lists numbered with letters rather than digits 0-9.

For example (default behavior):

Number Alphanumerized
0 ''
1 'a'
2 'b'
... ...
26 'z'
27 'aa'
28 'ab'

Installation

npm install alphanumerize

Usage

const alphanumerize = require('alphanumerize');
// by default, the alphabet used is the english alphabet
let a = alphanumerize(1);
let aa = alphabetical(27);
// you can assign any other alphabet using the options argument
a = alphanumerize(1, { alphabet: 'abcdefgh' });
aa = alphanumerize(9, { alphabet: 'abcdefgh' });
// you can preset an alphabet by passing the options argument first
const base8numerize = alphanumerize({ alphabet: 'abcdefgh' });
aa = base8numerize(9);
// and then overwrite the preset alphabet
aa = base8numerize(27, { alphabet: 'abcdefghijklmnopqrstuvwxyz' });

Motivation

One day I wanted to display a list of items indexed by alphabetical characters, for a softer look. Let's use a list of fruit: 1) Apple, 2) Banana, 3) Cucumber, 4) Durian. I want to change the numbers to letters using the following pattern: a) Apple, b) Banana, c) Cucumber, aa) Durian.

At first, I thought a simple change of base algorithm would work. The numbers are in positional notation, while the letters are part of a base 3 number system with numerals [a,b,c]. However, converting the list of numbers to alphabet-based positional number system converts the sequence 1,2,3,4 to a,b,c,ba. This is not what I'm looking for, it makes me feel like a part of the sequence was skipped. That's because representing decimal numbers treats each digit as a coefficient to a polynomial equation. My alphabetic number system treats digits as tallies, more like roman numerals than arabic numerals.

Take a look at the following sequences of letters, one where the alphabetic numerals are used in a positional notation system, and the second as used in a tally system.

positional tally

Do you notice the difference? A tally number system will carry along previous values with each additional numeral, where a positional number system will incorporate the previous values.

So let's describe the tally system with some math, and see if we can create an algorithm for constructing a sequence of alphabetic tallies given a number. It turns out it isn't as straightforward as I thought.

First, a few definitions. The alphanumerize function will take a number n and alphabet A with length m, and return a sequence of digits d_1...d_S of length S, the alphabetic representation of n. Each digit d_i will be a number in 1-to-1 correspondence with a character in the alphabet, where i is the digit's position in the sequence and 1 <= d_i <= m.

For example, to represent the number 9 in the alphabet [a,b,c], note that m = 3 and the map of numbers to characters is {1:a, 2:b, 3:c}. So the sequence of digits will be bc (or 21), counting up from our list of tallied numbers from the above table.

Can we turn this into an algorithm?

Working backwards from our previous example, we find the number 9 can be decomposed into 9 = 2/3 * 3^2 + 3/3 * 3^1 = 2 * 3 + 3 * 1. Here we have the first hint at how to describe the number using letters. 3 is m, or the size of the alphabet. 2 is the letter b from our correspondence, and 3 is also the letter c. Let's replace some values: 9 = 2 * 3^1 + 3 * 3^0 = A[b] * m^1 + A[c] * m^0. Notice the pattern yet? Let's put this in a more generalized form using the definitions above:

Hey look, a sum of a geometric progression! That might be useful :) I'm trying to generate the character sequence for a number n and it would be nice to know how long the sequence describing it is, so let's solve for S.

We know that the number n will have a sequence of length S. Thinking about it, that means n will be less than or equal to the maximum value that can be represented by a sequence of length S, and greater than the maximum value that can be represented by a sequence of length S - 1. The max value for a digit d_i is m, remember? Therefore in math speak, the bounds for n once we substitute the d_i above for m will look like:

Can we use this relation to find the length S of the sequence? Yes! Remember that the sum of a geometric progression is described by a simple equation (thanks Knuth!):

Therefore, if we take j = i - 1 we rewrite the inequality to

and applying the rule about sums of geometric progressions we find that

Nice! We're now at a place with no summation, so we will try to solve for S given n and m.

Our work gives us a new inequality relation for S

The difference between the left and right sides of the inequality is exactly 1. Because S is an integer, that means there is only one possible value for S in the boundary described by the inequality. Because S is greater or equal to the lower bound, that means the first integer encountered when counting up from the lower bound will be S! The function that gives you the closest integer greater or equal to the input is the ceiling function. So our final equation for S is

Yay! Now we have a simple function for finding S. Now let's figure out how to produce the digits d_i of the sequence for the number n.

Finding the digit sequence

TODO