c-cube/ocaml-containers

Integer overflow errors when compiling to JS via jsoo

darrenldl opened this issue ยท 23 comments

We need to compile containers to JS for our doodlinator project via js_of_ocaml.

During compilation, we'd receive the following integer overflow errors (which I believe comes from jsoo compiling containers):

$ make
dune build bookmarklet/doodlinator.bc.js web/webpage.bc.js
 js_of_ocaml .js/containers/containers.cma.js
Warning: integer overflow: integer 0x5555555555555555 (-3074457345618258603) truncated to 0x55555555 (1431655765); the generated code might be incorrect.
Warning: integer overflow: integer 0x3333333333333333 (3689348814741910323) truncated to 0x33333333 (858993459); the generated code might be incorrect.
Warning: integer overflow: integer 0x3333333333333333 (3689348814741910323) truncated to 0x33333333 (858993459); the generated code might be incorrect.
Warning: integer overflow: integer 0xf0f0f0f0f0f0f0f (1085102592571150095) truncated to 0xf0f0f0f (252645135); the generated code might be incorrect.

The last integer matches Sys.max_array_length, which is used in core/CCVector (and data/CCPersistentArray, data/CCPersistentHashtbl, but we're not using containers-data).

Version info

ocaml: 4.08.1
containers: 3.1
jsoo: 3.8.0

System info

Linux 64 bit

This is likely caused by the module CCShimsInt_.ml, produced by src/core/mkshims.ml. It's produced because the compiler is in 64 bits mode.

I think we need to use dune configurator to obtain ocamlc information for the target architecture, not the current architecture.

Can you please try the branch fix-346 ?

Still gives the same error after opam pin add containers https://github.com/c-cube/ocaml-containers.git#fix-346 - did i get the command wrong?

Seems like the right command, did it reinstall everything?

Yep, and I reinstalled everything again as well.

In _build/log, is there any mention of int_size? or "target word size" ?

Snippet of _build/log

#  ; ocaml_config =
#      { version = "4.08.1"
...
#      ; architecture = "amd64"
#      ; model = "default"
#      ; int_size = 63
#      ; word_size = 64
...
#      ; host = "x86_64-pc-linux-gnu"
#      ; target = "x86_64-pc-linux-gnu"
...
#      }

that's very unfortunate if that's what dune gives us for a jsoo target. I'll ask how to get the actual int size.

  • Can OCaml even handle an int size of 53 bits, which is what JS uses?
  • Why does JS truncate to 32 bits? Is it because the JS binary functions are limited to 32 bits?
hhugo commented
  • Can OCaml even handle an int size of 53 bits, which is what JS uses?
  • Why does JS truncate to 32 bits? Is it because the JS binary functions are limited to 32 bits?

There is no such thing as 53 bits ints. There are 64bit floats that can accurately represent ints up to 53bits.

Javascript knows about 32bit integers (via bits operations, xor, or, and, ...) and that's what jsoo uses to represent ocaml integer (preserving the overflow semantic).

@darrenldl does the latest commit fix your problem? it checks int_size at runtime.

hhugo commented

I looked at the latest commit, I think it fixes the semantic but will still trigger the warning message.
A way to fix the warning would be to compute the constants instead of using literals (e.g. int_of_string "0x55555")

It'd fix the warning, but it would produce worse code on x86_64, would it not? The compiler would have no idea what the constants look like, and the whole point of this is to have a reasonably efficient popcount.

Is there an annotation one can put to disable this warning?

This really needs conditional compilation to be fixed 'properly'. @c-cube, how does this compile on 32-bits?

@bluddy there already is conditional compilation. Alas, it seems that one can compile on a 64 bits architecture, produce bytecode, and then run this bytecode on a 32 bits architecture (or pass it to jsoo, which is also 32 bits). This is what thwarted the already existing conditional generation of shims based on size_int.

Then maybe the conditional compilation check should instead also check if we're producing bytecode? Performance isn't such a primary concern then.

Sadly this is also not provided in the config. The shims are generated once, and used for both native and bytecode targets.

@darrenldl does the latest commit fix your problem? it checks int_size at runtime.

I still receive the exact same warning

Wrong close, but I was planning to "wontfix" anyway, I don't see how we can compile for 64 bits and expect things to work in 32 bits.

copy commented

Note that the compiler contains similar code (int_of_string "0x1_0000_0000"): https://github.com/ocaml/ocaml/blob/5661bbfe1f5e5384b09125759e1125d519ae69f3/stdlib/int32.ml#L66 It compiles to a move from a global memory location, so it is only slightly less inefficient. Alternatively, large constants may be written using shifts, producing the same code as the current variant: https://gcc.godbolt.org/z/x5bEGe

It think it would be nice to fix this, since the warning shows up in user code, and it also allows remove the shims for it it (since it's selected using Sys.int_size already at runtime).

copy commented

For reference:

let popcount_64_ (b:int) : int =
  let b = b - ((b lsr 1) land 0x5555555555555555) in
  let b = (b land 0x3333333333333333) + ((b lsr 2) land 0x3333333333333333) in
  let b = (b + (b lsr 4)) land 0x0f0f0f0f0f0f0f0f in
  let b = b + (b lsr 8) in
  let b = b + (b lsr 16) in
  let b = b + (b lsr 32) in
  b land 0x7f

let popcount_64_shifts (b:int) : int =
  let fives = 0x55555555 lor 0x55555555 lsl 32 in
  let threes = 0x33333333 lor 0x33333333 lsl 32 in
  let fs = 0x0f0f0f0f lor 0x0f0f0f0f lsl 32 in
  let b = b - ((b lsr 1) land fives) in
  let b = (b land threes) + ((b lsr 2) land threes) in
  let b = (b + (b lsr 4)) land fs in
  let b = b + (b lsr 8) in
  let b = b + (b lsr 16) in
  let b = b + (b lsr 32) in
  b land 0x7f

let () =
  for _ = 0 to 100_000_000 do
    let b = Int64.to_int (Random.int64 Int64.max_int) in
    assert (popcount_64_ b = popcount_64_shifts b)
  done

@copy real nice. Would you mind doing a PR with that code?