bdrister/AquaticPrime

UTF8 problems between PHP and CoreFoundation


I recently discovered that if you use the PHP implementation with values that are already UTF-8 encoded, the generated signature does not match the one CoreFoundation generates. This mostly affects customers whose names contain non-ASCII characters, when the name arrives via XML and is therefore already UTF-8.

I managed to fix the problem in my case with this patch:
https://gist.github.com/706902

But perhaps a better way would be to detect whether the values are already UTF-8 and only convert them if they're not?

A quick internet search suggested a solution like this:
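function FixEncoding($x) {
    // Strict mode checks only for valid UTF-8 and returns false otherwise;
    // the default, non-strict detection can misreport Latin-1 input.
    if (mb_detect_encoding($x, 'UTF-8', true) === 'UTF-8') {
        return $x;
    } else {
        return utf8_encode($x);
    }
}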

This requires the mbstring extension, however, and it would be nice to have a fallback implementation.
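
One possible fallback, as a minimal sketch: PCRE's /u modifier makes preg_match() fail on subjects that are not valid UTF-8, so it can stand in when mbstring is missing. The helper names (isValidUtf8, latin1ToUtf8, ensureUtf8) are hypothetical and not part of AquaticPrime.php. Since utf8_encode() is deprecated as of PHP 8.2, the Latin-1 conversion is written out by hand here.

```php
<?php
// Hypothetical helpers; none of these names exist in AquaticPrime.php.

// True when $s is well-formed UTF-8.
function isValidUtf8(string $s): bool
{
    if (function_exists('mb_check_encoding')) {
        return mb_check_encoding($s, 'UTF-8');
    }
    // Fallback without mbstring: with /u, preg_match() returns false
    // (not 1) when the subject is not valid UTF-8.
    return preg_match('//u', $s) === 1;
}

// Convert ISO-8859-1 bytes to UTF-8, the same mapping utf8_encode() applies.
function latin1ToUtf8(string $s): string
{
    $out = '';
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($s[$i]);
        $out .= $b < 0x80
            ? $s[$i]
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
    }
    return $out;
}

// Leave valid UTF-8 untouched; treat anything else as ISO-8859-1.
function ensureUtf8(string $s): string
{
    return isValidUtf8($s) ? $s : latin1ToUtf8($s);
}
```

One caveat with any detection approach: a Latin-1 string whose bytes happen to form valid UTF-8 would be passed through unchanged, so detection is a heuristic rather than a guarantee.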

Also, it would be nice if the OP could provide an exact example input with correct and incorrect outputs, to serve as the test case to verify this issue is addressed.

Yes, I considered the mbstring option but didn't want to require it. As an alternative, I'd suggest either a new parameter or a setting in Config.php that states whether the data being passed in is already UTF-8. For example, BMT Micro and eSellerate send XML, so their data is already UTF-8. Kagi uses POST variables, so I'm not sure about that one (I'm only testing with BMT).
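
A rough sketch of what such a setting could look like. The variable name $inputIsUtf8 and the functions latin1ToUtf8 and normalizeDict are assumptions made for illustration; nothing like them exists in Config.php or AquaticPrime.php today. The conversion is done by hand rather than with utf8_encode(), which is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical setting; the name and its place in Config.php are assumptions.
$inputIsUtf8 = true; // e.g. true for XML feeds such as BMT Micro or eSellerate

// Convert ISO-8859-1 bytes to UTF-8 (the mapping utf8_encode() applies).
function latin1ToUtf8(string $value): string
{
    return preg_replace_callback('/[\x80-\xFF]/', function (array $m): string {
        $byte = ord($m[0]);
        return chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F));
    }, $value);
}

// Return the license dictionary with every value in UTF-8.
// When the caller says the data is already UTF-8, it is passed through as-is.
function normalizeDict(array $dict, bool $inputIsUtf8): array
{
    return $inputIsUtf8 ? $dict : array_map('latin1ToUtf8', $dict);
}
```

The result of normalizeDict($dict, $inputIsUtf8) would then be handed to licenseDataForDictionary() without any further re-encoding.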

The issue is very easy to recreate. In my case it was just someone with the character 'é' in their name. I will create a quick test with a test product to demonstrate.

Here's a Gist demonstrating the problem: https://gist.github.com/707790

The source code to generate this is simply the following (with the appropriate keys provided in Config.php). [edit] Of course, you must save the PHP file as UTF-8 (no BOM).
    [...] $product, "Name" => $name, "Email" => $email, "OrderID" => $orderid);
    $license = licenseDataForDictionary($dict, $key, $privateKey);
    echo $license;
    ?>

testaquaticprime_php_utf8_fail.plist is the data generated by the unchanged AquaticPrime.php, and it does not validate. With the utf8_encode() call removed, the script produces testaquaticprime_php_utf8_ok.plist instead, which validates fine.
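
Looking at the raw bytes shows why the two files differ. This is a small sketch with an illustrative sample name, not taken from the Gist: utf8_encode() assumes ISO-8859-1 input, so each byte of an already-encoded 'é' gets encoded a second time. utf8_encode() is deprecated as of PHP 8.2, so the demo hides that notice.

```php
<?php
// utf8_encode() is deprecated as of PHP 8.2; hide the notice for this demo.
error_reporting(E_ALL & ~E_DEPRECATED);

$name  = "Ren\xC3\xA9";        // "René", already encoded as UTF-8
$twice = utf8_encode($name);   // treats each input byte as ISO-8859-1

echo bin2hex($name), "\n";     // 52656ec3a9
echo bin2hex($twice), "\n";    // 52656ec383c2a9
```

The PHP side signs the second byte sequence, while the application presumably verifies against the first, so the signatures cannot match.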

On the same subject, the licensee name in the email text is garbled by the utf8_encode($headers) call inside sendMail(). Removing that utf8_encode() call fixes it too, so both places will need altering.

Excellent, thank you. While I don't have time to mess with this immediately, I'll take care of it on my next pass through the source. Thanks again for the report and the test cases.