cornell-zhang/heterocl

`shmget` failed causing SegFault


This bug appears on some machines when Vivado HLS is used as the backend. Any program that sets `target.config(compile="vivado_hls")` may fail to compile.

The trace from gdb is shown below.

```
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
__memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:455
455     ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: No such file or directory.
(gdb) backtrace
#0  __memmove_avx_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:455
#1  0x00007fff78394a2b in TVM::runtime::GenSharedMem(TVM::runtime::TVMArgs&, std::vector<int, std::allocator<int> >&, std::vector<unsigned long, std::allocator<unsigned long> >&) () from /home/chz/heterocl/tvm/lib/libhcl.so
#2  0x00007fff78375b85 in TVM::runtime::SimModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<TVM::runtime::ModuleNode> const&)::{lambda(TVM::runtime::TVMArgs, TVM::runtime::TVMRetValue*)#1}::operator()(TVM::runtime::TVMArgs, TVM::runtime::TVMRetValue*) const () from /home/chz/heterocl/tvm/lib/libhcl.so
#3  0x00007fff78376764 in std::_Function_handler<void (TVM::runtime::TVMArgs, TVM::runtime::TVMRetValue*), TVM::runtime::SimModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<TVM::runtime::ModuleNode> const&)::{lambda(TVM::runtime::TVMArgs, TVM::runtime::TVMRetValue*)#1}>::_M_invoke(std::_Any_data const&, TVM::runtime::TVMArgs&&, TVM::runtime::TVMRetValue*&&) ()
   from /home/chz/heterocl/tvm/lib/libhcl.so
#4  0x00007fff785ea202 in TVMFuncCall () from /home/chz/heterocl/tvm/lib/libhcl.so
```

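This bug is caused by copying data to unallocated memory. As shown below, `shmget` fails to allocate shared memory for some reason and returns -1, which HeteroCL does not check. The invalid id is then passed to `shmat`, which also fails and returns `(void*)-1`, and the following `memcpy` to that address triggers the segfault.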

```cpp
// TODO: maybe get the current path??
key_t key = ftok("/", i+1);
int shmid = shmget(key, arg_sizes[i], 0666|IPC_CREAT);
shmids.push_back(shmid);
// copy mem from TVM args to the shared memory
void* mem = shmat(shmid, nullptr, 0);
memcpy(mem, arr->data, arg_sizes[i]);
```
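For reference, a minimal sketch of the same sequence with the return values checked, so that a failed allocation is reported instead of crashing inside `memcpy`. The helper name `CopyToSharedMem` and the choice to throw `std::runtime_error` are assumptions for illustration and are not part of HeteroCL.

```cpp
#include <sys/ipc.h>
#include <sys/shm.h>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not HeteroCL code): create or open a System V shared
// memory segment for `key`, copy `size` bytes from `src` into it, and return
// the segment id. Every failure is turned into an exception that carries the
// errno text, so memcpy never sees an invalid address.
int CopyToSharedMem(key_t key, const void* src, std::size_t size) {
  if (key == static_cast<key_t>(-1)) {
    throw std::runtime_error("invalid IPC key (did ftok fail?)");
  }
  int shmid = shmget(key, size, 0666 | IPC_CREAT);
  if (shmid == -1) {
    int err = errno;  // save errno before any other call can change it
    throw std::runtime_error(std::string("shmget failed: ") + std::strerror(err));
  }
  void* mem = shmat(shmid, nullptr, 0);
  if (mem == reinterpret_cast<void*>(-1)) {
    int err = errno;
    throw std::runtime_error(std::string("shmat failed: ") + std::strerror(err));
  }
  std::memcpy(mem, src, size);
  // Detach this mapping; the segment itself persists until it is removed
  // with shmctl(shmid, IPC_RMID, nullptr).
  shmdt(mem);
  return shmid;
}
```

The `errno` text would also indicate why `shmget` failed. For example, `shmget` reports `EINVAL` when a segment with that key already exists but is smaller than the requested size, and `EACCES` when the caller lacks permission for an existing segment. Since `ftok("/", i+1)` yields the same key for every user on the machine, either could in principle occur, but this issue does not establish which one applies.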

However, after I changed the path passed to `ftok` to a different folder, the program ran without a segfault. I do not know the exact reason for this.

I replaced "/" with the current path in the `ftok` call, and this issue is fixed in #253.
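A minimal sketch of deriving the key from the current path, assuming `getcwd` is used to obtain it; the actual change in #253 may differ. The helper name `KeyFromCwd` and the range check on `index` are assumptions added for illustration.

```cpp
#include <sys/ipc.h>
#include <unistd.h>
#include <cerrno>
#include <climits>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not HeteroCL code): build a System V IPC key from the
// current working directory and an argument index.
key_t KeyFromCwd(int index) {
  // ftok uses only the low 8 bits of the project id, and they must be
  // nonzero, so index + 1 has to stay within 1..255.
  if (index < 0 || index > 254) {
    throw std::out_of_range("index must be in [0, 254]");
  }
  char cwd[PATH_MAX];
  if (getcwd(cwd, sizeof(cwd)) == nullptr) {
    int err = errno;  // save errno before any other call can change it
    throw std::runtime_error(std::string("getcwd failed: ") + std::strerror(err));
  }
  key_t key = ftok(cwd, index + 1);
  if (key == static_cast<key_t>(-1)) {
    int err = errno;
    throw std::runtime_error(std::string("ftok failed: ") + std::strerror(err));
  }
  return key;
}
```

Keys derived this way depend on the directory the program is started from, so two runs from different directories are less likely to reuse each other's segments than runs that all key off "/".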