hatgit/hatnotation

Leading Zeroes Lost

hatgit opened this issue · 1 comments

When an n-bit number contains x bits of leading zeroes and the decoder maps the Hatnotated string back to binary, those leading zeroes are missing from the final result.

Base-2 Binary Example (128 bits) with two leading zeroes: 00100110101101111110100111001011011001100011100100010001001001111011100011111010101110000101100010110000000111010010011100111110

Base-16 Hex: 0x26b7e9cb66391127b8fab858b01d273e
Base-10 Decimal Integer: 51465596081329778647712687587606734654
Base-64 Hatnotation: #*_$BPZ!H9}Z[+5Y-7IS_

All three of the above decode back to binary with the leading zeroes lost: 100110101101111110100111001011011001100011100100010001001001111011100011111010101110000101100010110000000111010010011100111110

Currently, the decoder cannot show leading zeroes because that information is not present when the encoder computes the notation from a starting hex value. Starting from a binary string might help, but the problem would persist, and I don't think there is a way to resolve it: those leading-zero values are also lost when converting binary to other popular notation systems.
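A minimal Python sketch (not Hatnotation's actual code) of why the information is unrecoverable: as soon as the bit string becomes an integer, the width is gone, and every notation derived from that integer inherits the loss.

```python
# Sketch (not Hatnotation's implementation): converting a bit string
# to an integer discards leading zeroes, so a round trip cannot
# recover the original width.
bits = "00100110"      # 8 bits with two leading zeroes
n = int(bits, 2)       # -> 38; the width information is gone
print(bin(n)[2:])      # "100110" -- only 6 bits come back
```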

Potential consideration: any software allowing access or recovery should enforce a fixed character-length limit, so that if a user pasted a value short of that limit it would be left-padded with leading zeroes.
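The fixed-length idea could look like the following Python sketch. The function name and the `width_bits` parameter are hypothetical; 128 matches the example at the top of this issue.

```python
def decode_fixed_width(n: int, width_bits: int = 128) -> str:
    """Return n as a binary string left-padded to width_bits.

    Hypothetical helper: the fixed width is a policy choice supplied
    by the caller, since the integer itself no longer carries it.
    """
    bits = bin(n)[2:]
    if len(bits) > width_bits:
        raise ValueError("value is wider than the fixed width")
    return bits.zfill(width_bits)

# The 128-bit example from this issue round-trips with its
# two leading zeroes restored:
restored = decode_fixed_width(0x26b7e9cb66391127b8fab858b01d273e)
print(restored[:8])  # "00100110"
```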

Notes I had on this matter from last year: "Even when there is no leading zero in the string, the leading zeroes of the actual base64 character are discarded, as the encoding appears to happen from right to left. See this test string, where the leading zeroes of the first character (index value 2 in the library) are discarded and the last character is N: 0xb13ae7e331ce9dfa59799e95ee8dc117, which decodes to 2.E+VZCS[T_""@$&N+ZS4N"

The easiest demonstration of this uses the Hatnotation library of 64 characters as the input:

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()+,-.}:;<=>?@[{]^_`

which decodes to:

000000000001000010000011000100000101000110000111001000001001001010001011001100001101001110001111010000010001010010010011010100010101010110010111011000011001011010011011011100011101011110011111100000100001100010100011100100100101100110100111101000101001101010101011101100101110101111110000110001110010110011110100110101110110110111111000111001111010111011111100111101111110111111
(which in hex is: 0x108310518720928b30d38f41149351559761969b71d79f8218a39259a7a29aabb2dbafc31ef3d35db7e39eb2f3dfbf)
which, when encoded back to Hatnotation, loses the leading zero (the first six zeroes of the binary string above), so it is missing from the start of the resulting encoded characters: "123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-.}:;<=>?@[{]^_`"

Adding a leading zero to the hex string doesn't solve this either: 0x0108310518720928b30d38f41149351559761969b71d79f8218a39259a7a29aabb2dbafc31ef3d35db7e39eb2f3dfbf
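This is expected in any integer-based pipeline, as this Python sketch shows: a leading "0" hex digit carries no numeric value, so both spellings parse to the same integer and encode to the same output.

```python
# Sketch: once a hex string is parsed into an integer, a leading
# zero digit contributes nothing, so it cannot survive a round trip
# through the integer.
with_zero = int("0108310518720928b", 16)
no_zero   = int("108310518720928b", 16)
print(with_zero == no_zero)  # True
```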

I am closing this issue: as noted above, there is no solution. The loss of zeroes used to pad binary numbers (since they are empty values) is the same across other notation systems, which do not retain that data.