substring does not support UTF-16
maxlapides opened this issue ยท 12 comments
"๐".substring(0, 1);
> "๏ฟฝ"
This issue came up because a user inputted an emoji into a data field. Attempting to render Text("๐".substring(0, 1)) in Flutter results in:
flutter: ══╡ EXCEPTION CAUGHT BY RENDERING LIBRARY ╞═════════════════════════════════════════════════════════
flutter: The following ArgumentError was thrown during performLayout():
flutter: Invalid argument(s): string is not well-formed UTF-16
flutter:
flutter: When the exception was thrown, this was the stack:
flutter: #0 ParagraphBuilder.addText (dart:ui/text.dart:1157:7)
flutter: #1 TextSpan.build
package:flutter/…/painting/text_span.dart:172
flutter: #2 TextPainter.layout
package:flutter/…/painting/text_painter.dart:352
flutter: #3 RenderParagraph._layoutText
...
❯ flutter doctor -v
[✓] Flutter (Channel beta, v1.0.0, on Mac OS X 10.14.2 18C54, locale en-US)
    • Flutter version 1.0.0 at /Users/maxlapides/flutter
    • Framework revision 5391447fae (9 weeks ago), 2018-11-29 19:41:26 -0800
    • Engine revision 7375a0f414
    • Dart version 2.1.0 (build 2.1.0-dev.9.4 f9ebf21297)
[✓] Android toolchain - develop for Android devices (Android SDK 28.0.3)
    • Android SDK at /Users/maxlapides/Library/Android/sdk
    • Android NDK location not configured (optional; useful for native profiling support)
    • Platform android-28, build-tools 28.0.3
    • ANDROID_HOME = /Users/maxlapides/Library/Android/sdk
    • Java binary at: /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home/bin/java
    • Java version OpenJDK Runtime Environment (build 1.8.0_152-release-1248-b01)
    • All Android licenses accepted.
[✓] iOS toolchain - develop for iOS devices (Xcode 10.1)
    • Xcode at /Applications/Xcode.app/Contents/Developer
    • Xcode 10.1, Build version 10B61
    • ios-deploy 1.9.4
    • CocoaPods version 1.6.0.beta.1
[✓] Android Studio (version 3.3)
    • Android Studio at /Applications/Android Studio.app/Contents
    • Flutter plugin version 31.3.3
    • Dart plugin version 182.5124
    • Java version OpenJDK Runtime Environment (build 1.8.0_152-release-1248-b01)
[✓] VS Code (version 1.30.2)
    • VS Code at /Applications/Visual Studio Code.app/Contents
    • Flutter extension version 2.22.1
[✓] Connected device (2 available)
    • SAMSUNG SM G920A • 03157df3c5558209 • android-arm64 • Android 5.1.1 (API 22)
    • iPhone XS • 9CFC4B23-BE03-4DC9-B1CD-5E1226F5A183 • ios • iOS 12.1 (simulator)
• No issues found!
Here's my workaround for now:
String.fromCharCode(str.runes.first)
A substring method that works on code points rather than UTF-16 code units might be inefficient. E.g., taking a substring near the end of a very long string would be slow, because it would have to iterate through most of the string counting code points.
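To make the cost concern concrete, here is a minimal sketch of what a code-point-based substring might have to do. The name codePointSubstring is hypothetical and not part of the SDK; the point is only that the function has to walk the string from the beginning to find where a given code point index falls.

```dart
/// Hypothetical helper (not an SDK API): substring by code point indices.
/// It scans from the start of the string, so the cost grows with `start`
/// and `end` rather than being a constant-time offset lookup.
String codePointSubstring(String input, int start, [int? end]) {
  var codePoints = 0; // code points seen so far
  var unitStart = input.length; // code unit offset where `start` falls
  var unitEnd = input.length; // code unit offset where `end` falls
  var i = 0;
  while (i < input.length) {
    if (codePoints == start) unitStart = i;
    if (end != null && codePoints == end) {
      unitEnd = i;
      break;
    }
    final unit = input.codeUnitAt(i);
    final isLead = unit >= 0xD800 && unit <= 0xDBFF;
    final hasTrail = i + 1 < input.length &&
        (input.codeUnitAt(i + 1) & 0xFC00) == 0xDC00;
    // A well-formed surrogate pair is one code point but two code units.
    i += (isLead && hasTrail) ? 2 : 1;
    codePoints++;
  }
  return input.substring(unitStart, unitEnd);
}

void main() {
  const text = 'ab\u{1F600}cd'; // 5 code points, 6 UTF-16 code units
  print(codePointSubstring(text, 2, 3)); // the emoji on its own
  print(codePointSubstring(text, 3)); // cd
}
```

In this sketch, indices past the end of the string are silently treated as the end, which a real API would probably reject instead.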
The substring documentation doesn't make it clear that the method operates on code units. Can we update it to explain this, similarly to the good explanation given on the [] operator?
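For anyone landing here, a small sketch of the code unit versus code point distinction may help. It uses only String members from dart:core; hexCodeUnits is a throwaway helper written for this example.

```dart
/// Throwaway helper for this example: the UTF-16 code units of [s] in hex.
List<String> hexCodeUnits(String s) =>
    s.codeUnits.map((u) => u.toRadixString(16)).toList();

void main() {
  const s = '\u{1F600}'; // a single emoji outside the Basic Multilingual Plane

  print(s.length); // 2, because length counts UTF-16 code units
  print(s.runes.length); // 1, because runes are Unicode code points
  print(hexCodeUnits(s)); // [d83d, de00], a surrogate pair

  // substring(0, 1) cuts the pair in half and keeps only the lead surrogate,
  // which is why the resulting string is not well-formed UTF-16.
  final half = s.substring(0, 1);
  print(hexCodeUnits(half)); // [d83d]

  // Taking both code units keeps the emoji intact.
  print(s.substring(0, 2) == s); // true
}
```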
I am having the exact same problem. In my case, I want to programmatically delete (backspace) the last character of a text input that may contain emojis. So far, my workaround is as follows:
var s = "abc😀";
var sRunes = s.runes;
print(String.fromCharCodes(sRunes, 0, sRunes.length - 1));
And make sure users do not enter emojis that have length 4.
FYI, my other workaround, https://stackoverflow.com/a/56135774/348719, is currently broken.
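The "length 4" caveat possibly refers to emojis that are built from more than one code point, such as flags. The sketch below is an assumption about what was meant, not something confirmed in this thread. It shows that dropping the last rune only removes half of such an emoji; dropLastRune is a hypothetical name for the workaround above.

```dart
/// Hypothetical wrapper around the rune-based backspace workaround.
String dropLastRune(String s) {
  final runes = s.runes.toList();
  if (runes.isEmpty) return s; // nothing to delete
  return String.fromCharCodes(runes, 0, runes.length - 1);
}

void main() {
  // A flag is two "regional indicator" code points shown as one symbol.
  const flag = '\u{1F1FA}\u{1F1F8}';
  print(flag.length); // 4 UTF-16 code units
  print(flag.runes.length); // 2 code points

  // Works for an emoji that is a single code point.
  print(dropLastRune('abc\u{1F600}') == 'abc'); // true

  // For the flag, only the second regional indicator is removed,
  // so a stray half of the flag is left behind.
  print(dropLastRune('abc$flag') == 'abc\u{1F1FA}'); // true
}
```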
I have found another workaround. It is more resource-expensive, but so far it works with all emojis.
In my particular case I wanted to take a substring (e.g. from index 0 to 16) and count each emoji as an individual character; however, I was only getting half of the text because of the emojis in it.
My workaround is this one.
Create a function called runeSubstring() like this one:
String runeSubstring({required String input, int start = 0, int? end}) {
  String finalString = ''; // accumulates the result
  List<int> individualRunes = input.runes.toList(); // convert the string to a list of runes
  individualRunes.sublist(start, end).forEach((rune) { // "substring" the list
    String character = String.fromCharCode(rune); // convert each rune back to a string
    finalString = finalString + character;
  });
  return finalString; // return the substring
}
and just use it like this:
String example = 'Example \ud83d\ude13 Example \ud83d\ude13';
String result = runeSubstring(input: example, start: 0, end: 10);
String resultEnd = runeSubstring(input: example, start: 11); // from 11 to the end
If the text is really big, you should call input.runes.toList() once outside the function; otherwise the text is converted to runes and then to a list every time the function is called.
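A rough sketch of that last suggestion: convert the text to a rune list once and pass the list around. substringFromRunes is a hypothetical name; it relies on the optional start and end arguments of String.fromCharCodes.

```dart
/// Hypothetical variant that takes an already-decoded rune list,
/// so the string is not re-decoded on every call.
String substringFromRunes(List<int> runes, int start, [int? end]) =>
    String.fromCharCodes(runes, start, end);

void main() {
  const text = 'Example \u{1F613} Example \u{1F613}';

  // Decode once, outside the function.
  final runes = text.runes.toList();

  // Reuse the same list for as many substrings as needed.
  final result = substringFromRunes(runes, 0, 10);
  final resultEnd = substringFromRunes(runes, 11); // from 11 to the end

  print(result);
  print(resultEnd);
}
```

This trades memory for time, since the list holds one int per code point for the whole text.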
See #28404 and dart-lang/language#34, long discussions about making correct String manipulation easier. We will probably close this issue as being one example of the bigger issue.
@LiteCatDev String.fromCharCodes takes an Iterable of rune values, so you can simplify your code to:
String runeSubstring({required String input, int start = 0, int? end}) {
  return String.fromCharCodes(input.runes.toList().sublist(start, end));
}
This SO question might be related: https://stackoverflow.com/questions/54483177/handling-grapheme-clusters-in-dart
dart-lang/language#685 is our current attempt at supporting this via a package; closing the present issue in favor of that
I need to solve a similar issue and have posted the problem here.
Please share your input. Thanks.
Use package:characters: https://pub.dev/documentation/characters/latest/
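For reference, a minimal sketch of how the cases from this thread might look with package:characters. It assumes the characters package has been added as a dependency in pubspec.yaml. The three function names are made up for this example; the .characters extension, first, skipLast, and getRange come from the package.

```dart
import 'package:characters/characters.dart';

/// First user-perceived character, or '' for an empty string.
String firstCharacter(String s) =>
    s.characters.isEmpty ? '' : s.characters.first;

/// "Backspace": drop the last user-perceived character.
String backspace(String s) => s.characters.skipLast(1).toString();

/// Substring counted in user-perceived characters (grapheme clusters).
String characterSubstring(String s, int start, [int? end]) =>
    s.characters.getRange(start, end).toString();

void main() {
  const flag = '\u{1F1FA}\u{1F1F8}'; // two code points, one visible symbol
  const text = 'abc$flag';

  print(firstCharacter('\u{1F600}xyz') == '\u{1F600}'); // true
  print(backspace(text) == 'abc'); // true, the whole flag is removed
  print(characterSubstring(text, 3) == flag); // true
}
```

Because the package works on grapheme clusters rather than runes, the flag case that breaks the rune-based workarounds is handled as a single character here.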