emoji file name crashes `ncdu`
Closed this issue · 4 comments
Repro:
$ touch 🧡
$ ./find.sh . -maxdepth 1 | ./find2flat.py - | ./unflatten.py - | ncdu -f - -0
thread 3733331 panic: attempt to unwrap error
Unwind information for `:0x1066156` was not available, trace may be incomplete
Aborted (core dumped)
When you diff this against what `ncdu -o` exports for the same directory, the JSON from `ncdu -o` contains the emoji as a raw Unicode codepoint, while both `find2flat` and `unflatten` convert it to `\ud83e\udde1`. This is indeed what makes `ncdu` crash: after manually editing the ncdu-export-generated JSON to change it back to a raw Unicode codepoint, `ncdu` no longer crashes.
Interestingly, other Unicode codepoints such as ä (`\u00e4`) do work.
This might actually be a bug in ncdu, though ncdu wouldn't trigger it on its own, of course, since it writes raw codepoints into the JSON.
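A small sketch of one visible difference between the two cases: ä lies in the Basic Multilingual Plane and fits in a single `\uXXXX` escape, while 🧡 (U+1F9E1) lies outside it and has to be written as a UTF-16 surrogate pair. Whether surrogate-pair handling is what actually trips up ncdu is an assumption here, not something confirmed in this thread. The helper name `utf16_units` is made up for illustration.

```python
def utf16_units(ch: str) -> list[int]:
    """Return the UTF-16 code units that a JSON \\uXXXX escape would use."""
    data = ch.encode("utf-16-be")
    # Each UTF-16 code unit is two bytes, big-endian here.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


for ch in ("ä", "🧡"):
    escapes = "".join(f"\\u{unit:04x}" for unit in utf16_units(ch))
    print(f"U+{ord(ch):04X} -> {escapes}")
```

A parser that decodes each `\uXXXX` escape independently handles the first case but has to combine the two halves in the second.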
I figured out that `json.dumps()` makes this happen by default and you need to turn it off using the `ensure_ascii=False` parameter. I could not figure out how to apply this to `unflatten.py` yet.
Edit: I did figure it out, I just messed up in the implementation and Python didn't scream at me... O.o
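For reference, a minimal sketch of the `json.dumps()` behaviour described above. It uses only the standard library and is not the actual code from `find2flat.py` or `unflatten.py`; the variable names are illustrative.

```python
import json

name = "🧡"  # U+1F9E1, the file name from the repro

# Default (ensure_ascii=True): non-ASCII characters are \u-escaped.
escaped = json.dumps(name)
# ensure_ascii=False: the character is written out as-is.
raw = json.dumps(name, ensure_ascii=False)

print(escaped)
print(raw)

# Both forms are valid JSON and decode to the same string.
assert json.loads(escaped) == json.loads(raw) == name
```

Note that with `ensure_ascii=False` the output contains non-ASCII text, so whatever stream or file it is written to needs an encoding such as UTF-8 that can represent it.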
Hi, thanks for the report and sorry for the late response. I missed the notification among the flood of notifications from issues I'm subscribed to. I've updated a mail-filtering rule, so I should react faster now.
I closed PR #6 because I made some other changes to the code. Please check whether the code (v0.8.0) works for you.
I've also created a ticket on ncdu's Forgejo.
Yup, works for me :)
Thanks!
BTW, yorhel already committed changes to ncdu. I built it, and now ASCII-escaped input not only works but is also displayed the same way as UTF-8 input.