MultiCode

Data encoding for human input

MultiCode combines Reed-Solomon forward error correction with a binary-to-text encoding designed so that common human-input errors can be detected and, where possible, corrected.

The result is a resilient code that is very likely to decode correctly even after typical input mistakes such as deleted, inserted, or transposed characters.

The prototype of this is at https://jsfiddle.net/i_e_b/x1vru8bc/ where you can play around with it.

Design

Before passing the input to the FEC stage (in this case, Reed-Solomon), we look for error patterns in the input and try to correct for them, which increases the chance that the FEC can successfully recover the original data.

Start with 32 characters: the 36 ASCII alphanumeric characters (0-9, A-Z) with the indistinct glyphs O, L, I, and U removed. Split these into an 'odd' set and an 'even' set of 16 characters each, so that each character carries 4 bits.

0 1 2 3 6 7 8 9 b G J N q X Y Z
4 5 A C D E F H K M P R s T V W

S, Q, and B are presented in lower case (s, q, b) to prevent confusion with 5, 0, and 8. Because two characters from the same set are never adjacent, the assignment of characters to the sets can be chosen to reduce the chance of accidental obscenity. The likelihood of accidentally forming words is already quite low with this character set.
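
As a rough illustration of the 4-bit grouping, the sketch below maps each nibble of the input to one character, taking characters from the two sets in turn. It assumes that the first listed set is used at odd positions and the second at even positions, and that a character's value is its index in the listed order. Neither assumption is confirmed by the text above, the function names are hypothetical, and the Reed-Solomon step is left out entirely.

```python
# Assumption: first listed set = odd positions, second = even positions,
# and a character's 4-bit value is its index in the listed order.
ODD = "01236789bGJNqXYZ"
EVEN = "45ACDEFHKMPRsTVW"

# Lookup tables keyed by upper case, so that input is read case-insensitively
# (also an assumption; the text only says s, q and b are displayed in lower case).
ODD_VALUES = {c.upper(): i for i, c in enumerate(ODD)}
EVEN_VALUES = {c.upper(): i for i, c in enumerate(EVEN)}


def encode_nibbles(data: bytes) -> str:
    """Map each 4-bit value to a character, alternating between the two sets."""
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)    # high nibble first
        nibbles.append(byte & 0x0F)  # then low nibble
    # Index 0 is position 1 (odd), index 1 is position 2 (even), and so on.
    return "".join((ODD if i % 2 == 0 else EVEN)[n] for i, n in enumerate(nibbles))


def decode_nibbles(text: str) -> bytes:
    """Inverse of encode_nibbles; raises KeyError on a character from the wrong set."""
    chars = [c for c in text.upper() if c != "-"]
    if len(chars) % 2:
        raise ValueError("expected an even number of characters")
    nibbles = [
        (ODD_VALUES if i % 2 == 0 else EVEN_VALUES)[c] for i, c in enumerate(chars)
    ]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

Under these assumptions, `encode_nibbles(b"\x12\xab")` gives `1AJR`, and `decode_nibbles("1ajr")` returns the original two bytes.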

Generated codes alternate between the two sets, so an input that does not follow this alternation is known to contain mistakes. One shortcoming is that a pair of characters deleted at the start of the input looks the same as a pair deleted at the end. To handle this, we try rotating the input during the Reed-Solomon step, up to the number of deleted characters.
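
The alternation check can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: the first listed set is assumed to occupy odd positions, input is assumed to be matched case-insensitively, and all function names are hypothetical. `padding_candidates` is one possible reading of the rotation step: it lists every placement of placeholders at the start and end that keeps the odd/even positions consistent, leaving the Reed-Solomon step (not shown) to pick the candidate that decodes.

```python
# Upper-case membership sets, so input is matched case-insensitively (assumption).
ODD = set("01236789BGJNQXYZ")   # assumed to occupy positions 1, 3, 5, ...
EVEN = set("45ACDEFHKMPRSTVW")  # assumed to occupy positions 2, 4, 6, ...


def parity_pattern(code: str) -> str:
    """Return 'o' or 'e' per character, '?' for anything outside both sets."""
    pattern = []
    for ch in code.upper():
        if ch == "-":
            continue  # separators carry no data
        pattern.append("o" if ch in ODD else "e" if ch in EVEN else "?")
    return "".join(pattern)


def alternation_breaks(code: str) -> list[int]:
    """Indices where a character comes from the same set as its predecessor."""
    p = parity_pattern(code)
    return [i for i in range(1, len(p)) if p[i] == p[i - 1]]


def padding_candidates(code: str, expected_length: int, placeholder: str = "_") -> list[str]:
    """List start/end placeholder placements that keep positions consistent."""
    chars = code.replace("-", "")
    missing = expected_length - len(chars)
    if missing < 0:
        return []
    starts_odd = parity_pattern(chars)[:1] != "e"
    candidates = []
    for lead in range(missing + 1):
        # An even number of leading placeholders keeps the first character
        # in an odd position; an odd number moves it to an even position.
        if (lead % 2 == 0) == starts_odd:
            candidates.append(placeholder * lead + chars + placeholder * (missing - lead))
    return candidates
```

For example, `padding_candidates("JR", 4)` yields both `JR__` and `__JR`, which is exactly the ambiguity described above: the alternation alone cannot say which end lost the pair.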

We could try having fixed guard codes at the start and end, but this is not implemented in this project.

Error Examples


	`oeo-eoe-oeo`	-- odd-even pattern is correct, length is correct
Real input:	`7MQ-6DJ-S01`

	`_eo-eoe-oeo`	-- pattern is inverted: first char is missing, so a placeholder is inserted for Reed-Solomon
Deleted first char:	`_MQ-6DJ-S01`

	`oeo-_oe-oeo`	-- "ee" or "oo" around deletion point
Deleted middle char:	`7MQ-_DJ-S01`

	`oeo-eoe-oe_`	-- all correct, but wrong length
Deleted end char:	`7MQ-6DJ-S0_`

	`oeo-eeo-oeo`	-- "eeoo" begins one char before the transposed pair
Transposed char:	`7MQ-6JD-S01`
	`oeo-oee-oeo`	-- "ooee" begins one char before the transposed pair
	`7MQ-D6J-S01`

	`oeo-eeo-eoo`	-- "ee" at start, "oo" at end
Double transposed:	`7MQ-6JD-0S1`
	`oeo-oeo-eeo`	-- "oo" at start, "ee" at end
	`7MQ-D6S-J01`

	`oeo-eooe-oeo`	-- too long. Error at first repeated o/e
Insertion:	`7MQ-6DDJ-S01`
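
The patterns above can be turned into a simple classifier. The sketch below is only an illustration of the heuristics listed here: the function name and return values are hypothetical, it handles one kind of error at a time, and it takes the raw pattern as typed (before any `_` placeholder is inserted).

```python
from typing import Optional, Tuple


def diagnose(pattern: str, expected_length: int) -> Tuple[str, Optional[int]]:
    """Guess the kind of error from an 'o'/'e' pattern and the expected length.

    Returns (kind, index), where index is the 0-based position (ignoring
    dashes) at which the problem is first visible, or None if unknown.
    """
    p = pattern.replace("-", "")
    # Positions where a character repeats the set of its predecessor.
    repeats = [i for i in range(1, len(p)) if p[i] == p[i - 1]]
    first = repeats[0] if repeats else None

    if len(p) > expected_length:
        # Too long: the error is at the first repeated o/e.
        return ("insertion", first)
    if len(p) < expected_length:
        if first is not None:
            # "ee" or "oo" around the deletion point.
            return ("deletion", first)
        # Alternation intact: inverted pattern means the start is missing,
        # otherwise the end is missing.
        return ("deletion", 0 if p.startswith("e") else len(p))
    if first is not None:
        # Correct length with "eeoo" or "ooee": adjacent characters swapped.
        return ("transposition", first)
    if p.startswith("e"):
        return ("inverted", 0)
    return ("ok", None)
```

With the patterns from this section, `diagnose("oeo-oe-oeo", 9)` reports a deletion at index 3 and `diagnose("oeo-eooe-oeo", 9)` reports an insertion at index 5.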

The weird implementation

The various implementations are generally not idiomatic for their language. They have been written to be portable with minimal effort: they rely only on being able to create arrays, and everything else they need is included in the source.

