other.test_gen_struct_info is flaky
Occasionally fails with:
gen_struct_info: Calling generated program... /tmp/tmp6t3jvaci.js
code.c: no_exit=1 assertions=1 flush=0 keepalive=0 filesystem=0
Traceback (most recent call last):
File "/home/clb/buildbot/h12dsi-linux-mint22/emscripten_linux_x64/build/emscripten/main/tools/gen_struct_info.py", line 411, in <module>
sys.exit(main(sys.argv[1:]))
~~~~^^^^^^^^^^^^^^
File "/home/clb/buildbot/h12dsi-linux-mint22/emscripten_linux_x64/build/emscripten/main/tools/gen_struct_info.py", line 394, in main
info_fragment = inspect_code(header_files, use_cflags)
File "/home/clb/buildbot/h12dsi-linux-mint22/emscripten_linux_x64/build/emscripten/main/tools/gen_struct_info.py", line 290, in inspect_code
info = inspect_headers(headers, cflags)
File "/home/clb/buildbot/h12dsi-linux-mint22/emscripten_linux_x64/build/emscripten/main/tools/gen_struct_info.py", line 273, in inspect_headers
return json.loads(info)
~~~~~~~~~~^^^^^^
File "/home/clb/.pyenv/versions/3.13.3/lib/python3.13/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/home/clb/.pyenv/versions/3.13.3/lib/python3.13/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/home/clb/.pyenv/versions/3.13.3/lib/python3.13/json/decoder.py", line 361, in raw_decode
obj, end = self.scan_once(s, idx)
~~~~~~~~~~~~~~^^^^^^^^
json.decoder.JSONDecodeError: Illegal trailing comma before end of object: line 692 column 13 (char 10583)
None
None
[8%] test_gen_struct_info (test_other.other.test_gen_struct_info) ... FAIL
I don't think I've ever seen this one before! Any idea how this could possibly flake? It seems like pretty straightforward, non-threaded code.
My guess is that it is caused by interaction with other tests in the parallel run. I haven't been able to reproduce it on the bot where it fails, at least not by running the test by itself on repeat.
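Since the failure does not reproduce in isolation, one option is to make the next flake more informative by printing the generated output around the position that `json.loads` rejects. The following is only a sketch: `loads_with_context` and its `radius` parameter are hypothetical names, not part of `gen_struct_info.py`, and the assumption is that the text passed to `json.loads` in `inspect_headers` could be routed through a wrapper like this.

```python
import json


def loads_with_context(text, radius=2):
    """Parse JSON; on failure, re-raise with the lines around the error.

    Hypothetical helper for debugging, not taken from gen_struct_info.py.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        lines = text.splitlines()
        # e.lineno is 1-based; show `radius` lines on each side of it.
        start = max(e.lineno - 1 - radius, 0)
        end = min(e.lineno + radius, len(lines))
        snippet = "\n".join(
            f"{n + 1:>5}: {lines[n]}" for n in range(start, end)
        )
        raise ValueError(
            f"{e}\n--- output near error ---\n{snippet}"
        ) from e
```

With something like this in place, a log such as the one above would also show the lines around line 692 of the generated output, which might reveal whether the output was truncated or interleaved with output from another process.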
http://clbri.com:8010/#/builders/10 has multiple runs where it fails.
Some of the failures have different, fairly nondescript failure logs: http://clbri.com:8010/api/v2/logs/54169/raw_inline
I wonder why we haven't seen this on our CI at all. Are you doing anything other than running test_other.py with the normal/default level of parallelism?
Nothing special. You can see what's being run in the log.
E.g. it is running
source ./emsdk_env.sh; cd emscripten/main; python3 test/runner.py --failing-and-slow-first --failfast other \
skip:other.test_dlmalloc \
skip:other.test_bullet_cmake \
skip:other.test_dylink_zlib_reversed \
skip:other.test_dylink_zlib \
skip:other.test_zlib_configure \
skip:other.test_legacy_exported_runtime_numbers \
skip:other.test_sse2 skip:other.test_modularize_closure_pre \
skip:other.test_sse2_nontrapping \
skip:other.test_sse4_1 \
skip:other.test_iostream_and_determinism \
skip:other.test_openjpeg \
skip:other.test_zlib_cmake \
skip:other.test_avx_nontrapping \
skip:other.test_avx skip:other.test_freetype \
skip:other.test_bullet_autoconf \
skip:other.test_avx2 \
skip:other.test_avx2_nontrapping \
skip:other.test_printf_wasmfs \
skip:other.test_printf \
skip:other.test_poppler \
skip:other.test_cmake_compile_features \
skip:other.test_cmake_compile_features_noforce \
skip:other.test_safe_stack \
skip:other.test_wasm_sourcemap_relative_paths \
skip:other.test_codesize_cxx_mangle \
skip:other.test_codesize_hello_dylink_all \
skip:other.test_minimal_runtime_code_size_hello_webgl2_wasm \
skip:other.test_minimal_runtime_code_size_hello_webgl2_wasm2js \
skip:other.test_minimal_runtime_code_size_hello_webgl_wasm \
skip:other.test_minimal_runtime_code_size_hello_webgl_wasm2js
with the environment
environment:
EMCC_SKIP_SANITY_CHECK=1
EMTEST_BENCHMARKERS=clang,size,node,node-64
EMTEST_BROWSER=/Applications/Firefox.app/Contents/MacOS/firefox
EMTEST_RETRY_FLAKY=5
EMTEST_SKIP_CCACHE=1
EMTEST_SKIP_EH=1
EMTEST_SKIP_JSPI=1
EMTEST_SKIP_NINJA=1
EMTEST_SKIP_NODE_CANARY=1
EMTEST_SKIP_NODE_DEV_PACKAGES=1
EMTEST_SKIP_RUST=1
EMTEST_SKIP_SCONS=1
EMTEST_SKIP_V8=1
EMTEST_SKIP_WASM64=1
EMTEST_SKIP_WASM_ENGINE=1
(The skips exclude all the slow tests, for faster iteration.)
Python is 3.13.3 and Node.js is 22.16.0, as produced by the emsdk install step: http://clbri.com:8010/api/v2/logs/54178/raw_inline
[b'./emsdk', b'install', b'sdk-main-64bit', b'node-nightly-64bit', b'ninja-git-release-64bit']