browsermt/bergamot-translator

Bergamot raises SIGSEGV for trivial CLI run

dbezhetskov opened this issue · 4 comments

Hello,

I built bergamot-translator as described in #299 and then tried to run it:

./build/app/bergamot --model-config-paths <Absolute/path/>/bergamot-translator-tests/models/deen/ende.student.tiny.for.regression.tests/config.intgemm8bitalpha.yml.bergamot.yml --cpu-threads 4 <<< "Hello World"

and got:

tcmalloc: large alloc 2147483648 bytes == 0x55ca0371e000 @ 
tcmalloc: large alloc 2147483648 bytes == 0x55ca8fe4e000 @ 
tcmalloc: large alloc 2147483648 bytes == 0x55cb15c8c000 @ 
tcmalloc: large alloc 2147483648 bytes == 0x55cb9a70c000 @ 
[1]    366044 abort (core dumped)  ./build/app/bergamot ...

I tried removing the --cpu-threads 4 parameter and using both relative and absolute paths, but the result is the same: it crashes.
The stack trace from gdb is:

#2  0x0000555555603c99 in marian::cpu::ProdBatchedOld(IntrusivePtr<marian::TensorBase>, std::shared_ptr<marian::Allocator>, IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, bool, bool, float, float) [clone .cold] ()
#3  0x0000555555bc1051 in marian::cpu::ProdBatched(IntrusivePtr<marian::TensorBase>, std::shared_ptr<marian::Allocator>, IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, bool, bool, float, float) ()
#4  0x0000555555935e68 in marian::ProdBatched(IntrusivePtr<marian::TensorBase>, std::shared_ptr<marian::Allocator>, IntrusivePtr<marian::TensorBase>, IntrusivePtr<marian::TensorBase>, bool, bool, float, float) [clone .isra.0] ()
#5  0x000055555599a66e in marian::DotBatchedNodeOp::forwardOps()::{lambda()#1}::operator()() const ()
#6  0x00005555557f1f37 in marian::Node::forward() ()
#7  0x00005555557bb7d6 in marian::ExpressionGraph::forward(std::__cxx11::list<IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > >, std::allocator<IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase> > > > >&, bool) ()
#8  0x0000555555830aaf in marian::BeamSearch::search(std::shared_ptr<marian::ExpressionGraph>, std::shared_ptr<marian::data::CorpusBatch>) ()
#9  0x000055555566eb0e in marian::bergamot::TranslationModel::translateBatch(unsigned long, marian::bergamot::Batch&) ()
#10 0x00005555556a2f20 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<marian::bergamot::AsyncService::AsyncService(marian::bergamot::AsyncService::Config const&)::{lambda()#1}> > >::_M_run() ()
#11 0x00007ffff7e8cde4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#12 0x00007ffff7c32609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x00007ffff7b59293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I also checked that the input is not the cause.
I replaced std::string input = readFromStdin(); with std::string input = "Hello, world"; and the crash is the same.

Any ideas why this is not working for me? Any help is appreciated.

Could you tell us which platform (operating system, architecture) you're running this on? Please also share the output of the following:

./build/app/bergamot --build-info

Tagging @XapaJIaMnu and @graemenail, who are perhaps better suited to answer given the debug stack trace.

Sure, it is x86_64 Ubuntu 20.04:

AVX2_FOUND=true
AVX512_FOUND=false
AVX_FOUND=true
BLAS_Accelerate_LIBRARY=BLAS_Accelerate_LIBRARY-NOTFOUND
BLAS_acml_LIBRARY=BLAS_acml_LIBRARY-NOTFOUND
BLAS_acml_mp_LIBRARY=BLAS_acml_mp_LIBRARY-NOTFOUND
BLAS_blas_LIBRARY=BLAS_blas_LIBRARY-NOTFOUND
BLAS_blis_LIBRARY=BLAS_blis_LIBRARY-NOTFOUND
BLAS_complib.sgimath_LIBRARY=BLAS_complib.sgimath_LIBRARY-NOTFOUND
BLAS_cxml_LIBRARY=BLAS_cxml_LIBRARY-NOTFOUND
BLAS_dxml_LIBRARY=BLAS_dxml_LIBRARY-NOTFOUND
BLAS_essl_LIBRARY=BLAS_essl_LIBRARY-NOTFOUND
BLAS_f77blas_LIBRARY=BLAS_f77blas_LIBRARY-NOTFOUND
BLAS_goto2_LIBRARY=BLAS_goto2_LIBRARY-NOTFOUND
BLAS_mkl_LIBRARY=BLAS_mkl_LIBRARY-NOTFOUND
BLAS_mkl_em64t_LIBRARY=BLAS_mkl_em64t_LIBRARY-NOTFOUND
BLAS_mkl_ia32_LIBRARY=BLAS_mkl_ia32_LIBRARY-NOTFOUND
BLAS_mkl_intel_LIBRARY=BLAS_mkl_intel_LIBRARY-NOTFOUND
BLAS_mkl_intel_lp64_LIBRARY=BLAS_mkl_intel_lp64_LIBRARY-NOTFOUND
BLAS_openblas_LIBRARY=BLAS_openblas_LIBRARY-NOTFOUND
BLAS_scsl_LIBRARY=BLAS_scsl_LIBRARY-NOTFOUND
BLAS_sgemm_LIBRARY=BLAS_sgemm_LIBRARY-NOTFOUND
BLAS_sunperf_LIBRARY=BLAS_sunperf_LIBRARY-NOTFOUND
BLAS_vecLib_LIBRARY=BLAS_vecLib_LIBRARY-NOTFOUND
BUILD_ARCH=native
CMAKE_ADDR2LINE=/usr/bin/addr2line
CMAKE_AR=/usr/bin/ar
CMAKE_BUILD_TYPE=Release
CMAKE_COLOR_MAKEFILE=ON
CMAKE_CXX_COMPILER=/usr/bin/c++
CMAKE_CXX_COMPILER_AR=/usr/bin/gcc-ar-10
CMAKE_CXX_COMPILER_RANLIB=/usr/bin/gcc-ranlib-10
CMAKE_CXX_FLAGS=-std=c++11 -pthread -Wl,--no-as-needed -fPIC -Wno-unused-result   -march=native  -msse2 -msse3 -msse4.1 -msse4.2 -mavx -mavx2 -m64 -DUSE_SENTENCEPIECE -D_USE_INTERNAL_STRING_VIEW
CMAKE_CXX_FLAGS_DEBUG=-O0 -g -rdynamic
CMAKE_CXX_FLAGS_MINSIZEREL=-Os -DNDEBUG
CMAKE_CXX_FLAGS_RELEASE=-O3 -m64 -funroll-loops
CMAKE_CXX_FLAGS_RELWITHDEBINFO=-O3 -m64 -funroll-loops -g -rdynamic
CMAKE_C_COMPILER=/usr/bin/cc
CMAKE_C_COMPILER_AR=/usr/bin/gcc-ar-10
CMAKE_C_COMPILER_RANLIB=/usr/bin/gcc-ranlib-10
CMAKE_C_FLAGS=-pthread -Wl,--no-as-needed -fPIC -Wno-unused-result   -march=native  -msse2 -msse3 -msse4.1 -msse4.2 -mavx -mavx2
CMAKE_C_FLAGS_DEBUG=-O0 -g -rdynamic
CMAKE_C_FLAGS_MINSIZEREL=-Os -DNDEBUG
CMAKE_C_FLAGS_RELEASE=-O3 -m64 -funroll-loops
CMAKE_C_FLAGS_RELWITHDEBINFO=-O3 -m64 -funroll-loops -g -rdynamic
CMAKE_DLLTOOL=CMAKE_DLLTOOL-NOTFOUND
CMAKE_EXPORT_COMPILE_COMMANDS=OFF
CMAKE_INSTALL_PREFIX=/usr/local
CMAKE_LINKER=/usr/bin/ld
CMAKE_MAKE_PROGRAM=/usr/bin/make
CMAKE_NM=/usr/bin/nm
CMAKE_OBJCOPY=/usr/bin/objcopy
CMAKE_OBJDUMP=/usr/bin/objdump
CMAKE_RANLIB=/usr/bin/ranlib
CMAKE_READELF=/usr/bin/readelf
CMAKE_SKIP_INSTALL_RPATH=NO
CMAKE_SKIP_RPATH=NO
CMAKE_STRIP=/usr/bin/strip
CMAKE_VERBOSE_MAKEFILE=FALSE
COMPILE_CPU=ON
COMPILE_CUDA=OFF
COMPILE_EXAMPLES=OFF
COMPILE_SERVER=OFF
COMPILE_TESTS=OFF
COMPILE_WASM=OFF
GENERATE_MARIAN_INSTALL_TARGETS=OFF
GIT_EXECUTABLE=/usr/bin/git
GIT_SUBMODULE=ON
INTEL_ROOT=/opt/intel
M32_BINARIES=OFF
MKL_INCLUDE_DIR=MKL_INCLUDE_DIR-NOTFOUND
MKL_ROOT=MKL_ROOT-NOTFOUND
SSE2_FOUND=true
SSE3_FOUND=true
SSE4_1_FOUND=true
SSE4_2_FOUND=true
SSPLIT_COMPILE_LIBRARY_ONLY=ON
SSSE3_FOUND=true
Tcmalloc_INCLUDE_DIR=/usr/include
Tcmalloc_LIBRARY=/usr/lib/x86_64-linux-gnu/libtcmalloc_minimal.a
USE_APPLE_ACCELERATE=OFF
USE_CCACHE=OFF
USE_CUDNN=OFF
USE_DOXYGEN=ON
USE_FBGEMM=OFF
USE_MKL=ON
USE_MPI=OFF
USE_NCCL=ON
USE_SENTENCEPIECE=ON
USE_STATIC_LIBS=ON
USE_WASM_COMPATIBLE_SOURCE=OFF

It looks like MKL was not found during the build. Can you check that you installed the dependencies listed at https://browser.mt/docs/main/marian-integration.html#dependencies?

Ahh, it works! MKL was missing, thanks @jerinphilip!
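
For anyone hitting the same crash, the check made in this thread (reading the --build-info dump and noticing that MKL was requested but never located) can be sketched as a small script. This is only an illustration, not part of bergamot-translator: the function names parse_build_info and missing_blas_report are hypothetical, and the rule "a value ending in NOTFOUND means the library is missing" is an assumption based solely on the KEY=VALUE output shown above.

```python
# Hypothetical helper (not part of bergamot-translator): scan the KEY=VALUE
# text printed by `./build/app/bergamot --build-info` for a missing BLAS/MKL.


def parse_build_info(text):
    """Parse KEY=VALUE lines into a dict; lines without '=' are skipped."""
    info = {}
    for line in text.splitlines():
        # Split on the first '=' only, since values such as CMAKE_CXX_FLAGS
        # contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def missing_blas_report(info):
    """Return a list of warnings about MKL/BLAS entries that were not found."""
    warnings = []
    # Assumption: CMake marks libraries it could not locate with a value
    # ending in "NOTFOUND", as in the dump posted in this thread.
    mkl_requested = info.get("USE_MKL", "").upper() == "ON"
    mkl_found = not info.get("MKL_ROOT", "NOTFOUND").endswith("NOTFOUND")
    if mkl_requested and not mkl_found:
        warnings.append("USE_MKL=ON but MKL_ROOT was not found")
    other_blas = [
        key
        for key, value in info.items()
        if key.startswith("BLAS_")
        and key.endswith("_LIBRARY")
        and not value.endswith("NOTFOUND")
    ]
    if not mkl_found and not other_blas:
        warnings.append("no BLAS library was found")
    return warnings


# A few lines copied from the build info posted above.
sample = """USE_MKL=ON
MKL_ROOT=MKL_ROOT-NOTFOUND
MKL_INCLUDE_DIR=MKL_INCLUDE_DIR-NOTFOUND
BLAS_openblas_LIBRARY=BLAS_openblas_LIBRARY-NOTFOUND
"""

for warning in missing_blas_report(parse_build_info(sample)):
    print(warning)
```

On the lines quoted from this thread the sketch reports both warnings, which matches the diagnosis here. An empty list only means that nothing in the dump looks missing; it does not prove the build is healthy.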