Trimmed zero at the end after the digit

Question

Trimmed zero at the end after the digit

seoromka opened this issue 4 years ago · 9 comments

seoromka commented 4 years ago

Base32::decode(Base32::encode('10')) // 1
Base32::decode(Base32::encode('test130')) // test13

Answer 1 · 2021-02-24T09:18:54.000Z

Thanks for the report. That's a really odd one. I'll have a look at it.

Answer 2 · 2021-02-24T10:55:33.000Z

I think it has to do what is happening on this line

Answer 3 · 2021-02-26T02:48:32.000Z

Tried the following with the above code, 'test' loses a character also
tess => tess
test => tes
tesu => tesu

Also
8 => ''

Answer 4 · 2021-02-26T04:25:35.000Z

I had a little look at this, inspecting the '8' test case.
correct: HA======
incorrect: H=======

our input: (a single char, but zero padded)
[ 0 1 2 3 4 5 6 7 ] [ 8 9 a b c d e f ] ...

When outputting that second character, it's parsing from bits 5 - 9. Bits 5-7 all belong to character 0 of the input, but because in the second iteration bitLen is 3, it needs to read more data which increments n. Later determining the character to print, $n no longer refers to the character that we read from input - causing it to be >= len and, using '=' instead of the array lookup.

Here I'm subtracting 1 if $bitLen is greater than 8, indicating we're still processing the previous n - it seems to work.

        $encoded .= ($n - (int)($bitLen>8) > $len && 0 == $val) ? '=' : static::ALPHABET[$val >> $shift];

I'll make a PR and add a test case, but we should try look for extra test fixtures from notable projects using base32 if possible

Answer 5 · 2021-02-26T09:44:22.000Z

I'll add a few more tests to make sure the issues as reported are covered. Thank you everyone for reporting and adding info, as well as afk11 for the quick fix 🥇

Answer 6 · 2021-02-26T12:51:39.000Z

Wrote a brute forcer to test every 8bit, 16bit, and 32bit string https://gist.github.com/afk11/26be2ca69f34c7480cfbc6151477ff68 input. It did catch the issue when I undid the fix.. so it'll catch if the new + old implementation don't encode to the same thing, or cases where decoded result != input

So far so good. It'll take a while.

Answer 7 · 2021-02-28T20:41:00.000Z

That script completed and didn't turn up anything! Happy days

Answer 8 · 2021-03-01T21:54:26.000Z

Thank you for the extra effort! Good to know it passes the bruteforce test :)

Answer 9 · 2021-03-03T06:32:13.000Z

Thank you!