rannn505/child-shell

Non-Latin Characters Garbled


It seems that characters such as ä or ü are incorrectly encoded in the resulting JavaScript string.

PowerShell Script foo.ps1

echo "bär"

Node-PowerShell

const PS = require('node-powershell');
const ps = new PS({
  executionPolicy: 'Bypass',
  noProfile: true
});
// ps.addCommand('echo bär'); // This works fine
ps.addCommand('./foo.ps1');
ps.invoke()
  .then(output => {
    console.log(output);
  })
  .catch(err => {
    // Log the failure and shut down the PowerShell process
    console.error(err);
    ps.dispose();
  });

Result:

bÃ¤r

As you can see, the buffer is mangled to bÃ¤r:

<Buffer 62 c3 83 c2 a4 72>
b: 62 (OK)
ä: c3 83 c2 a4 (garbled)
r: 72 (OK)
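
The byte pattern above is what double encoding typically produces: the UTF-8 bytes of ä (c3 a4) are decoded with a single-byte encoding such as Latin-1, and the result is encoded as UTF-8 again. Windows PowerShell 5.1 reads script files without a BOM using the system ANSI code page, which would give this pattern, though nothing in this thread confirms that as the cause. A minimal sketch in plain Node.js that reproduces the same buffer; the helper name `doubleEncode` is made up for illustration and is not part of the library:

```javascript
// Hypothetical helper: simulate UTF-8 bytes being misread as Latin-1
// and then re-encoded as UTF-8 (double encoding).
function doubleEncode(text) {
  const utf8Bytes = Buffer.from(text, 'utf8');       // 'bär' -> 62 c3 a4 72
  const misreadText = utf8Bytes.toString('latin1');  // each byte becomes one character
  return Buffer.from(misreadText, 'utf8');           // re-encode the misread string
}

const mangled = doubleEncode('bär');
console.log(mangled);                  // <Buffer 62 c3 83 c2 a4 72>
console.log(mangled.toString('utf8')); // bÃ¤r
```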

At the utils.js level we see the UTF-8 replacement character:

239  191 189
EF   BF  BD => UTF-8 encoding of U+FFFD (replacement character)
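
For reference, here is a small sketch of how those three bytes can arise in Node.js: decoding a byte sequence that is not valid UTF-8 substitutes U+FFFD, which encodes as EF BF BD. The input bytes below (ä as the single Latin-1 byte e4) are an assumption chosen for illustration; the thread does not say which bytes utils.js actually received.

```javascript
// A lone 0xE4 byte is 'ä' in Latin-1 but is not valid UTF-8 on its own.
const latin1Bytes = Buffer.from([0x62, 0xe4, 0x72]);

// Decoding as UTF-8 replaces the invalid byte with U+FFFD.
const decoded = latin1Bytes.toString('utf8');

// Re-encoding shows the replacement character as 239 191 189 (EF BF BD).
const reencoded = Buffer.from(decoded, 'utf8');
console.log(Array.from(reencoded)); // [ 98, 239, 191, 189, 114 ]
```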

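Apparently PowerShell is receiving ä as the 5 bytes 226 148 156 195 177! These are the UTF-8 encodings of ├ (226 148 156) and ñ (195 177), which is what the UTF-8 bytes of ä (195 164) look like when they are read with an OEM code page such as CP437.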

ps.addCommand('$str="bär";echo $str; echo $([System.Text.Encoding]::UTF8.GetBytes($str))');
...
bär
98  
226 
148 
156 
195
177
114
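
The byte list above can be reproduced outside PowerShell. This sketch assumes the UTF-8 bytes sent by Node were decoded with OEM code page 437; since Node has no built-in CP437 decoder, it uses a hand-written table covering only the two bytes involved. The names `CP437_SUBSET` and `decodeAsCp437` are hypothetical.

```javascript
// Partial CP437 table: only the two bytes that make up UTF-8 'ä' (c3 a4).
const CP437_SUBSET = { 0xc3: '├', 0xa4: 'ñ' };

// Hypothetical helper: decode bytes as CP437 (ASCII passes through unchanged,
// bytes missing from the partial table become '?').
function decodeAsCp437(bytes) {
  return Array.from(bytes, (b) =>
    b < 0x80 ? String.fromCharCode(b) : CP437_SUBSET[b] ?? '?'
  ).join('');
}

const seenByPowerShell = decodeAsCp437(Buffer.from('bär', 'utf8')); // 'b├ñr'
console.log(Array.from(Buffer.from(seenByPowerShell, 'utf8')));
// [ 98, 226, 148, 156, 195, 177, 114 ]
```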

@cawoodm Hi, do you have a solution or workaround for this problem yet? In the latest version, the problem is still present.

Nope, I gave up on this library 🤷‍♂️ and went for node-powershell.

https://github.com/cawoodm/powowshell/blob/master/ide/package.json

@cawoodm I am using node-powershell v5.0.1, and I am not familiar with PowerShell profiles. However, I was able to solve my problem by removing noProfile from the PS config, though I'm not sure that's a good idea.

Anyway, "Bär" is now displayed as "Bär" and no longer as "B├ñr". :-)
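
If loading the profile is undesirable, another thing to try is setting the console encodings explicitly at the start of the session. The sketch below only builds the command string; `withUtf8Console` is a made-up helper, and whether this resolves the garbling in a given setup is an assumption that this thread does not confirm. `[Console]::InputEncoding` and `[Console]::OutputEncoding` are standard .NET properties.

```javascript
// Hypothetical helper: prefix a command so the PowerShell session switches
// its console input and output encodings to UTF-8 before running it.
function withUtf8Console(command) {
  const setup = [
    '[Console]::InputEncoding = [System.Text.Encoding]::UTF8',
    '[Console]::OutputEncoding = [System.Text.Encoding]::UTF8',
  ].join('; ');
  return `${setup}; ${command}`;
}

console.log(withUtf8Console('./foo.ps1'));
// Could then be passed to the shell, e.g. ps.addCommand(withUtf8Console('./foo.ps1'));
```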