dds-bridge/dds

Segmentation fault when libdds.so is unloaded

Closed this issue · 4 comments

I compiled libdds.so on Linux with the default parameters and I'm trying to use it from Python via ctypes. When the library is unloaded at the end of a Python script, I get a segmentation fault.

Here is a toy Python script that reproduces it:

from ctypes import *
dll = cdll.LoadLibrary("libdds.so")

When I run it:

$ python libdds.py
Segmentation fault (core dumped)
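
To check for this crash without reading the shell message, the load can be run in a child interpreter and its exit status inspected. This is only a sketch: `run_and_report` is a hypothetical helper name, and it relies on the POSIX behaviour of `subprocess`, where a negative return code means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def run_and_report(code):
    """Run `code` in a fresh Python interpreter and describe how it ended."""
    proc = subprocess.run([sys.executable, "-c", code])
    if proc.returncode < 0:
        # POSIX only: -N means the child was terminated by signal N.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with status %d" % proc.returncode


if __name__ == "__main__":
    # Same load as the toy script; the path "libdds.so" is taken from it.
    print(run_and_report('from ctypes import cdll; cdll.LoadLibrary("libdds.so")'))
```

With the build described in this report, the call would be expected to report SIGSEGV; with a fixed build it should report a normal exit.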

According to the debugger, the crash comes from the FreeMemory function, which is called while the process is exiting:

$ gdb
(gdb) file python
Reading symbols from python...done.
(gdb) run libdds.py
Starting program: /home/pierre/anaconda3/bin/python libdds.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5e59454 in Memory::ReturnThread(unsigned int) ()
   from /home/pierre/Documents/python/bridge/python-dds-master_copy/libdds.so
(gdb) bt
#0  0x00007ffff5e59454 in Memory::ReturnThread(unsigned int) ()
   from /home/pierre/Documents/python/bridge/python-dds-master_copy/libdds.so
#1  0x00007ffff5e5cb31 in FreeMemory () from /home/pierre/Documents/python/bridge/python-dds-master_copy/libdds.so
#2  0x00007ffff7de7de7 in _dl_fini () at dl-fini.c:235
#3  0x00007ffff6a0cff8 in __run_exit_handlers (status=0, listp=0x7ffff6d975f8 <__exit_funcs>, 
    run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#4  0x00007ffff6a0d045 in __GI_exit (status=<optimized out>) at exit.c:104
#5  0x00007ffff69f3837 in __libc_start_main (main=0x400ab0 <main>, argc=2, argv=0x7fffffffde68, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffde58) at ../csu/libc-start.c:325
#6  0x00000000004009e9 in _start ()
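
The backtrace places the crash inside the exit handlers (`exit` → `_dl_fini` → `FreeMemory`). One possible stopgap on the Python side, assuming it is acceptable to skip all exit-time cleanup, is to end the process with `os._exit`, which does not run exit handlers. This is a workaround sketch, not a fix for the library; `exit_without_handlers` and `load_and_exit` are hypothetical names, and it has not been verified against libdds.so here.

```python
import os
import sys
from ctypes import cdll


def exit_without_handlers(status=0):
    """Flush Python's standard streams, then leave via os._exit.

    os._exit ends the process immediately: atexit callbacks and the
    shared-library finalizers run by exit() are skipped.
    """
    sys.stdout.flush()
    sys.stderr.flush()
    os._exit(status)


def load_and_exit(path):
    """Load the library, use it, then exit without running finalizers."""
    dll = cdll.LoadLibrary(path)
    # ... call into dll here ...
    exit_without_handlers(0)
```

Calling `load_and_exit("libdds.so")` as the last statement of a script would bypass the library's teardown code entirely. The cost is that nothing else gets cleaned up either, so open files and other resources have to be closed by hand first.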

Any idea how to fix this?

The same script works for me on Ubuntu 18.04.

Are you sure you are running master from my fork of the repository? My fork contains all my fixes and new features. I am not a contributor here, and this project has been silent for a few months now.

The stack dump suggests FreeMemory is being called from the DLL destructor, which I have removed in my fork.

The backtrace above was generated with the code from the official repository, not yours. Since you had proposed some fixes, I also compiled libdds.so from your master branch (cf. my answer to your pull request). I still got an error, but from a different part of the code:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, Memory::GetPtr (this=0x7ffff67c0520 <memory>, thrId=0) at Memory.cpp:113
113	    cout << "Memory::GetPtr: " << thrId << " vs. " << nThreads << " vs. " << memory.size() << endl;
(gdb) bt
#0  Memory::GetPtr (this=0x7ffff67c0520 <memory>, thrId=0) at Memory.cpp:113
#1  0x00007ffff654e2ce in CloseDebugFiles () at Init.cpp:398
#2  0x00007ffff6557985 in Memory::~Memory (this=0x7ffff67c0520 <memory>, __in_chrg=<optimized out>) at Memory.cpp:23
#3  0x00007ffff7829ff8 in __run_exit_handlers (status=0, listp=0x7ffff7bb45f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#4  0x00007ffff782a045 in __GI_exit (status=<optimized out>) at exit.c:104
#5  0x00007ffff7810837 in __libc_start_main (main=0x555555636f40 <main>, argc=2, argv=0x7fffffffddf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdde8)
    at ../csu/libc-start.c:325
#6  0x0000555555717e0e in _start () at ../sysdeps/x86_64/elf/start.S:103
(gdb) n
Memory::GetPtr: 0 vs. 0 vs. 0
114	    exit(1);

This time, the error is in the destructor of the Memory class.

I made some changes to work around the problem. You can find them in my fork of your work. In my opinion the result is not clean, but it works: Python can release the library without any error. My guess is that the memory is being cleaned up twice (but, to be honest, I don't know this project's code, so that is only a supposition).
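
To illustrate the kind of guard that supposition points to, here is a toy model in Python. It is not the DDS code: `ThreadMemory` and `close_debug_files` are hypothetical stand-ins, and the only thing taken from the trace above is that the teardown path asked for thread 0 when the pool already held zero entries. The sketch makes release idempotent and has the teardown walk only the slots that still exist.

```python
class ThreadMemory:
    """Toy per-thread pool; a stand-in, not the DDS Memory class."""

    def __init__(self, n_threads):
        self.slots = [{"debug_open": True} for _ in range(n_threads)]
        self.released = False

    def get(self, thr_id):
        # Mirrors the bounds check that tripped in the trace.
        if thr_id >= len(self.slots):
            raise IndexError("thread %d of %d" % (thr_id, len(self.slots)))
        return self.slots[thr_id]

    def release(self):
        """Free the pool once; later calls are no-ops and return False."""
        if self.released:
            return False
        self.slots.clear()
        self.released = True
        return True


def close_debug_files(mem):
    """Close per-thread debug state, touching only slots that still exist."""
    closed = 0
    for thr_id in range(len(mem.slots)):
        mem.get(thr_id)["debug_open"] = False
        closed += 1
    return closed
```

In this model, calling `release()` twice or running `close_debug_files` after release is harmless, whereas an unguarded `mem.get(0)` after release raises, which is roughly the situation the second backtrace shows.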

Thanks! I'll add rimmington's fixes to the next release.

I experienced the same problem, although with a different setup:
From within a Java-based application I called the external libdds.so library via JNA. The behaviour was similar: a SIGSEGV was raised when the application was closed.

Following the suggestions from rimmington above, I changed two files, Init.cpp and Memory.cpp, and recompiled libdds.so. Rerunning the same setup no longer produced any error.
Please find attached the two changed files (Init.cpp and Memory.cpp from version 2.9 of the library).

changedFiles.zip

Greetings
Klaus