Midtrans/midtrans-php

Enhancement: avoid `isSanitized` feature from breaking HTTP body JSON API params, in some rare scenario of params includes UTF8 chars

Opened this issue · 0 comments

Affected Lines

Known sample, but recommended to search more comprehensively.

return substr($input, 0, $length);

$plus = substr($field, 0, 1) === '+';

https://github.com/search?q=repo%3AMidtrans%2Fmidtrans-php%20substr&type=code

Findings:

Reproducible evidence:

<?php
$anotherSampleStr = "Q-古典-吾家有喜A-半圆_ Q-古典-吾家有喜A-半圆 定制水晶绒【咨询客服】";

echo "// Sample \$str is a long utf-8 based string of chars: \n";
$str = "制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制制";
var_dump($str);

echo "// Length shorten using regular `substr(\$str,0,50)`, unexpectedly last chars is improperly truncated in the middle bytes and became invalid UTF-8 char: \n";
var_dump(substr($str,0,50));
echo "// Then JSON encoded, **UNEXPECTEDLY** produce empty JSON: \n";
var_dump(json_encode( substr($str,0,50) ));

echo "\n\n ---- Compared to: ---- \n\n";
echo "// Length shorten using specific `mb_substr(\$str,0,50)`: \n";
var_dump(mb_substr($str,0,50));
echo "- Then JSON encoded, **EXPECTED** succeed, BUT encoding changed: \n";
var_dump(json_encode( mb_substr($str,0,50) ));
echo "\n ---- \n";
echo "// Further, if JSON encoded using `JSON_UNESCAPED_UNICODE` param, expected succeed w/ perserved encoding: \n";
var_dump(json_encode( mb_substr($str,0,50),JSON_UNESCAPED_UNICODE ));

// tips: qucik run it using e.g. https://replit.com/@raizerde/Utf8EncodingSubstringIssue
?>

Issue Desc

When:

  • sanitization in which merchant's input params being sanitized, including being substr-ed,
  • then in the very rare scenario of the params includes UTF8 string of chars. (rare case because UTF8 usually used for non-latin char based country, which merchant is rarely from).

Unexpected:

  • the string's last char will be prematurely truncated, which unexpectedly result in an invalid UTF8 char.
  • then due to this problematic char will cause the string & params fail to be json_encode-ed, unexpectedly will result as false / empty JSON.
  • then it will cause empty HTTP request body to be sent to the API, & unexpected API response of e.g. Midtrans API is returning API error. HTTP status code: 500 API response: {"status_code":"500","status_message":"Sorry. Our system is recovering from unexpected issues. Please retry.","id":"c99ccd6b-3174-4b25-907a-bb8e6a1d586d"}

Further: Seems PHP does not natively support UTF-8 char encoding (Chinese chars are UTF-8, which is more bytes than regular ASCII which PHP support).

Expected:

  • substr & json_encode should success produce proper JSON, not empty.

Possible Solution:

[TBD] ... Probably use mb_substr to replace substr usage, but need further check for any possible downsides (e.g. apparently not all PHP server env enable MB extension, as it is an optional non-default extension).

  • Then maybe add warning about PHP limitation in using UTF-8 or other non ASCII string encoding.

Possible Workaround

  • Merchant can change config \Midtrans\Config::$isSanitized = true; to \Midtrans\Config::$isSanitized = false;
    • as the issue happens during Sanitization flow, disabling it will avoid the problematic flow. Although there will be downside.