Fix string truncation to properly handle multi-byte characters
kipkaev55 opened this issue · 0 comments
Description:
This change fixes the truncation of strings that contain multi-byte characters. Previously, truncation operated at the byte level, so a cut could land in the middle of a multi-byte UTF-8 sequence and leave an invalid, garbled character at the end of the result. The updated function counts and slices runes instead:
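import (
	// ...
	"unicode/utf8"
	// ...
)
// ...

// truncate shortens s to at most max runes. When max leaves room for it,
// the last three runes of the result are "..." to mark the cut.
func truncate(s string, max int) string {
	// Nothing to do if s already fits; a negative max disables truncation.
	if utf8.RuneCountInString(s) <= max || max < 0 {
		return s
	}
	// Slice runes, not bytes, so a multi-byte character is never split.
	runes := []rune(s)
	if max > 3 {
		return string(runes[:max-3]) + "..."
	}
	// Too short for an ellipsis: return the first max runes as-is.
	return string(runes[:max])
}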
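To make the edge cases concrete, here is a small usage sketch. The truncate function is copied from the snippet above so that the program compiles on its own; the sample strings and limits are illustrative assumptions, not cases taken from the project's tests.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncate is copied from the snippet above so this example is self-contained.
func truncate(s string, max int) string {
	if utf8.RuneCountInString(s) <= max || max < 0 {
		return s
	}
	runes := []rune(s)
	if max > 3 {
		return string(runes[:max-3]) + "..."
	}
	return string(runes[:max])
}

func main() {
	// Illustrative inputs only (assumed for this sketch).
	fmt.Println(truncate("Привет, мир", 8)) // 11 runes, cut to 5 runes plus "..."
	fmt.Println(truncate("日本語", 2))         // max <= 3, so no ellipsis is added
	fmt.Println(truncate("héllo", 10))      // already fits, returned unchanged
	fmt.Println(truncate("héllo", -1))      // negative max, returned unchanged
}
```

Note that max counts runes rather than bytes, so the result can be longer than max bytes when the input contains multi-byte characters.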
Changes:
- The truncation logic now accounts for multi-byte characters.
- The unicode/utf8 package is used to count the runes (Unicode code points) in the string rather than its bytes.
- Truncation now happens at rune boundaries, so a multi-byte character is never split in half.
Why It Matters:
Truncating a multi-byte string at the byte level can split a character and produce invalid UTF-8. This leads to garbled output and unpredictable errors, especially in applications that handle multilingual data or text containing special characters.