Polyglot Notebook: [NetTestingE2E][GB18030] Some GB18030 strings display incorrectly in Output, shown as Unicode escape sequences (e.g. \uD86D\uDCE7...).
Describe the bug:
Some GB18030 strings (e.g. 𫓧𬬮U1U2) are displayed in Output as Unicode escape sequences instead of the characters themselves: \uD86D\uDCE7\uD872\uDF2E\uE05E\uE05F\uE070\uE081U1\uE2C1\uE2C2\uE2C3\uE2D4\uE2E5U2\uE546\uE547\uE548\uE559\uE55A
Testing Data: Level2 GB18030-2022 Testing Data for medium large amount cases-GB18030
Group3:舰剑饯渐溅建僵齄鿀龬ɑπ㈢Q𫓧𬬮U1U2U3()ao㩹㩺㩻㩼㩽䀃E9;cz囌囍囎囐囓囑囒㏄㏑⿲⿳⿻〇cz珸珹䲟珺珻珼陫
Note:
- Repro VM: 172.16.194.187
- Test on Win 11 24H2 ZH-CN (Chinese (Simplified) Loc OS)
Pre-steps:
1. On a Chinese OS, install VSCode and the dotnet-interactive-vscode-1.0.6323011.vsix extension.
2. Install the Chinese (Simplified) language pack in VSCode, then change the VSCode display language to Chinese (Simplified).
Steps:
- Ctrl+Shift+P => "Polyglot Notebook: Create new blank notebook"
- Select "Create as .dib" -> Select "C#"
- Set the cell contents as follows and execute the cell:
var value = new { Name = "Developer舰剑饯渐溅建僵齄鿀龬ɑπ㈢Q𫓧𬬮U1U2U3()ao㩹㩺㩻㩼㩽䀃E9;cz囌囍囎囐囓囑囒㏄㏑⿲⿳⿻〇cz珸珹䲟珺珻珼陫", Salary = 42 };
value.Display("text/html", "application/json");
Actual Results:
Some GB18030 strings (e.g. 𫓧𬬮U1U2) are displayed in Output as Unicode escape sequences instead of the characters themselves: \uD86D\uDCE7\uD872\uDF2E\uE05E\uE05F\uE070\uE081U1\uE2C1\uE2C2\uE2C3\uE2D4\uE2E5U2\uE546\uE547\uE548\uE559\uE55A
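For triage, a minimal sketch of how to check whether the escaped output still carries the right data. It assumes only the standard System.Text.Json and System.Char APIs and takes the first two surrogate pairs from the output above. It does not try to identify where the escaping happens inside .NET Interactive; it only shows that the escapes decode back to the expected supplementary-plane code points, which suggests that the data is intact and only its presentation is affected.

```csharp
using System;
using System.Text.Json;

// First two escaped surrogate pairs from the Actual Results,
// wrapped in quotes so they form a JSON string literal.
string escapedJson = "\"\\uD86D\\uDCE7\\uD872\\uDF2E\"";

// A JSON parser joins each high/low surrogate escape into one character.
string decoded = JsonSerializer.Deserialize<string>(escapedJson) ?? string.Empty;

// Walk the string by code point rather than by UTF-16 code unit.
for (int i = 0; i < decoded.Length; i += char.IsSurrogatePair(decoded, i) ? 2 : 1)
{
    int codePoint = char.ConvertToUtf32(decoded, i);
    Console.WriteLine($"U+{codePoint:X}");
}
// Prints:
// U+2B4E7
// U+2CB2E
```

Both code points lie outside the Basic Multilingual Plane, while the remaining escapes in the output (\uE05E, \uE2C1, \uE546, ...) fall in the private use area U+E000 to U+F8FF. The ordinary CJK characters in the same string are not reported as escaped.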

Expected Results:
All strings should display correctly in VSCode UI.
Please complete the following:
Which version of .NET Interactive are you using? (In a notebook, run the #!about magic command. ):
- OS
  - [√] Windows 11
  - Windows 10
  - macOS
  - Linux (Please specify distro)
  - iOS
  - Android
- Browser
  - Chrome
  - Edge
  - Firefox
  - Safari
- Frontend
  - Jupyter Notebook
  - Jupyter Lab
  - nteract
  - [√] Visual Studio Code
  - Visual Studio Code Insiders
  - Visual Studio
  - Other (please specify)
