Invalid_argument("1E5965 is not an Unicode scalar value")
reynir opened this issue · 6 comments
Since utop 2.10 I am no longer able to use ppx_blob with binary files:
─( 13:00:36 )─< command 0 >────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # "\x1E\x59\x65";;
- : string = "\030Ye"
─( 13:00:37 )─< command 1 >────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # #require "ppx_blob";;
─( 13:00:49 )─< command 2 >────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # let s = [%blob "/home/reynir/bin/bob.com"];;
Fatal error: exception Invalid_argument("1E5965 is not an Unicode scalar value")
The file bob.com is fetched from here: https://github.com/dinosaure/bob/actions/runs/3250831340
57839aa3033139ec4a66c23b3f6e4ee14f64dfe270dc1554524013ed1d599ba2 /home/reynir/bin/bob.com
I'm not sure how to make the test case smaller.
With utop-full I could get a more useful backtrace:
Fatal error: exception Invalid_argument("1E5965 is not an Unicode scalar value")
Raised at Stdlib.invalid_arg in file "stdlib.ml", line 30, characters 20-45
Called from Zed_utf8.unsafe_extract_prev in file "src/zed_utf8.ml", line 229, characters 33-189
Called from Zed_string.Zed_string0.prev_ofs in file "src/zed_string.ml", line 139, characters 21-50
Called from Zed_string.Zed_string0.extract_prev in file "src/zed_string.ml", line 210, characters 14-30
Called from Zed_string.Zed_string0.unsafe_explode.aux in file "src/zed_string.ml", line 294, characters 23-43
Called from LTerm_text_impl.Make.of_string in file "src/lTerm_text_impl.ml", line 23, characters 65-96
Called from UTop_main.render_out_phrase in file "src/lib/uTop_main.ml", line 348, characters 17-42
Called from UTop_main.loop in file "src/lib/uTop_main.ml", line 865, characters 30-61
Re-raised at Location.report_exception.loop in file "parsing/location.ml", line 938, characters 14-25
Called from UTop.get_message in file "src/lib/uTop.ml", line 129, characters 2-11
Called from UTop_main.loop in file "src/lib/uTop_main.ml", line 871, characters 21-61
Called from UTop_main.main_aux in file "src/lib/uTop_main.ml", line 1630, characters 8-17
Called from UTop_main.main_internal in file "src/lib/uTop_main.ml", line 1646, characters 4-25
I managed to minimize it to this short snippet:
─( 13:32:56 )─< command 2 >────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # "\247\165\165\165";;
Fatal error: exception Invalid_argument("1E5965 is not an Unicode scalar value")
Thanks for the short repro. I'd say it's an issue with zed: next_error returns no error, but it only checks that the bytes that encode the length are "valid", not that the resulting value fits in Uchar. I'll try to repro that there.
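For reference, the number in the error message can be reproduced by hand from the minimized input. The sketch below is not zed's code; decode4 is a hypothetical helper that assumes a plain 4-byte UTF-8 decode which masks off the marker bits and never range-checks the result. Only Char.code and Uchar.is_valid from the standard library are used.

```ocaml
(* Hypothetical helper, not taken from zed: decode a 4-byte UTF-8-shaped
   sequence by stripping the lead-byte marker (11110xxx) and the
   continuation markers (10xxxxxx), without checking the resulting range. *)
let decode4 (s : string) : int =
  let b i = Char.code s.[i] in
  ((b 0 land 0x07) lsl 18)
  lor ((b 1 land 0x3F) lsl 12)
  lor ((b 2 land 0x3F) lsl 6)
  lor (b 3 land 0x3F)

let () =
  (* "\247\165\165\165" is the byte sequence F7 A5 A5 A5. *)
  let cp = decode4 "\247\165\165\165" in
  (* Prints: 1E5965 valid=false *)
  Printf.printf "%X valid=%b\n" cp (Uchar.is_valid cp)
```

The bytes have the shape of a well-formed 4-byte sequence, but the decoded value 0x1E5965 is above U+10FFFF, so it cannot be converted to a Uchar.t. A validity check that only looks at the lead and continuation bits would accept it.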
Fixed in ocaml-community/zed#50
I guess we can't print anything that makes sense, but at least utop does not crash. Can you try the PR with the full [%blob] output? Thanks
utop # "\247\165\165\165";;
- : string = "÷¥¥¥"
Thanks, that fixed it for me
─( 15:30:43 )─< command 2 >────────────────────────────────────────────────────────────────{ counter: 0 }─
utop # let s = [%blob "/home/reynir/bin/bob.com"];;
val s : string =
"MZqFpD='\n\000\000\016\000ø\000\000\000\000\000\000\000\001\000\b@\000\000\000\000\000\000\000\000\000\000\000JT\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\011\000\000²@ë\000ë\020ì\b1ҽ\000\000ë\005éh#\000\000ü\015\031>à¿\000p1ɎÁúÌû\014\031è\000\000^îr\000¸\000\002PP\0071ÿ¹\000\002ó¤\015\031Òÿê \000\000ٹ\000\027¸P\000À1À1ÿóªú@t\019è\021\000\007°\0011É0ö¿p\003èg\000Ouúêì&\000\000SR´\022Í\019s\0251ÀÍ\019rE¸\001\002¹\001\000¶\000»\000\002Ã1ÛÍ\019r2´\bÍ\019r,πç?áÀÐÁÐÁÍ\030\006\0311öƾ\016\021÷¥¥¥¥¥¤\031«««Xª[ÃZò1ÀÍ\019r÷ë¢PQÍÐÉÐÉ\bÁ1۰\001´\002Í"... (* string length 9936896; truncated *)
I will reopen until ocaml-community/zed#50 is merged.