sewenew/redis-plus-plus

[BUG] crash is triggered when I call ping

romanholidaypancakes opened this issue · 9 comments

Describe the bug

 	ManagedHost.exe!mtx_do_lock(_Mtx_internal_imp_t * mtx=0x0000028a034f5ad0, const xtime * target=0x0000000000000000) Line 100	C++	Symbols loaded.
 	ManagedHost.exe!std::_Mutex_base::lock() Line 48	C++	Symbols loaded.
 	ManagedHost.exe!std::lock_guard<std::mutex>::lock_guard<std::mutex>(std::mutex & _Mtx={...}) Line 449	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::ConnectionPool::release(class sw::redis::Connection)	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::SafeConnection::~SafeConnection() Line 134	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::Redis::command<void >(void )	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::Redis::ping(void)	C++	Non-user code. Symbols loaded.


To Reproduce
I tried many times and couldn't reproduce it

Expected behavior

Environment:

  • OS: win10
  • Compiler: vs2022 lastest
  • hiredis version: 1.0.1-dev
  • redis-plus-plus version: 1.3.1

Additional context

Sorry, but I cannot reproduce the problem either. If you cannot give code that can reproduce the problem, there isn't much I can do.

Also you can try to update redis-plus-plus to the latest release to see if the problem still exists, since the one you used is too old.

Regards

I did some investigations based on the stack, it seems that the multi-threaded crash caused by mtx->_get_cs()->lock();,

static int mtx_do_lock(_Mtx_t mtx, const xtime* target) { // lock mutex
    if ((mtx->type & ~_Mtx_recursive) == _Mtx_plain) { // set the lock
        if (mtx->thread_id != static_cast<long>(GetCurrentThreadId())) { // not current thread, do lock
            mtx->_get_cs()->lock();
            mtx->thread_id = static_cast<long>(GetCurrentThreadId());
        }
        ++mtx->count;

        return _Thrd_success;
    } else { // handle timed or recursive mutex
        int res = WAIT_TIMEOUT;
        if (target == nullptr) { // no target --> plain wait (i.e. infinite timeout)
            if (mtx->thread_id != static_cast<long>(GetCurrentThreadId())) {
                mtx->_get_cs()->lock();  // <<<<<< crash
            }

I found that in the destructor SafeConnection::~SafeConnection call to std::lock_guard<std::mutex>, the variable std::mutex must be valid at this point? Does it get released early, I don't know the underlying design of the library, maybe I can give you a hint!

the variable std::mutex must be valid at this point?

Yes, it's valid. The mutex is part of the the connection pool, and the connection pool is a part of Redis object. When you send ping command with Redis object, the Redis object must exist, so the connection pool and the mutex are valid.

Regards

I triggered the crash again and it seems to be triggering the crash within the exception

ManagedHost.exe!mtx_do_lock(_Mtx_internal_imp_t * mtx, const xtime * target) Line 100 (d:\a\_work\1\s\src\vctools\crt\github\stl\src\mutex.cpp:100)
ManagedHost.exe!std::_Mutex_base::lock() Line 48 (c:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.36.32532\include\mutex:48)
ManagedHost.exe!std::lock_guard<std::mutex>::lock_guard<std::mutex>(std::mutex & _Mtx) Line 449 (c:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.36.32532\include\mutex:449)
ManagedHost.exe!sw::redis::ConnectionPool::release(class sw::redis::Connection) (Unknown Source:0)
ManagedHost.exe!sw::redis::SafeConnection::~SafeConnection() Line 134 (d:\redis-plus-plus\src\sw\redis++\connection_pool.h:134)
ManagedHost.exe!_guard_xfg_dispatch_icall_nop�() (Unknown Source:0)
ManagedHost.exe!_CallSettingFrame() Line 50 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\amd64\handlers.asm:50)
ManagedHost.exe!__FrameHandler4::FrameUnwindToState(unsigned __int64 * pRN, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int targetState) Line 1138 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:1138)
ManagedHost.exe!__FrameHandler4::FrameUnwindToEmptyState(unsigned __int64 * pRN, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo) Line 234 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\risctrnsctrl.cpp:234)
ManagedHost.exe!__InternalCxxFrameHandler<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char recursive) Line 351 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:351)
ManagedHost.exe!__InternalCxxFrameHandlerWrapper<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char recursive) Line 234 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:234)
ManagedHost.exe!__CxxFrameHandler4(EHExceptionRecord * pExcept, unsigned __int64 RN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC) Line 306 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\risctrnsctrl.cpp:306)
ManagedHost.exe!__GSHandlerCheck_EH4(_EXCEPTION_RECORD * ExceptionRecord, void * EstablisherFrame, _CONTEXT * ContextRecord, _DISPATCHER_CONTEXT * DispatcherContext) Line 86 (d:\a\_work\1\s\src\vctools\crt\vcstartup\src\gs\amd64\gshandlereh4.cpp:86)
ntdll.dll!00007ffcbc0f242f() (Unknown Source:0)
ntdll.dll!00007ffcbc080939() (Unknown Source:0)
ManagedHost.exe!__FrameHandler4::UnwindNestedFrames(unsigned __int64 * pFrame, EHExceptionRecord * pExcept, _CONTEXT * pContext, unsigned __int64 * pEstablisher, void * Handler, FH4::FuncInfo4 * __formal, int TargetUnwindState, int CatchState, FH4::HandlerType4 * pCatch, _xDISPATCHER_CONTEXT * pDC, unsigned char recursive) Line 685 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\risctrnsctrl.cpp:685)
ManagedHost.exe!CatchIt<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, FH4::HandlerType4 * pCatch, const _s_CatchableType * pConv, FH4::TryBlockMapEntry4 * pEntry, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char IsRethrow, unsigned char recursive) Line 1370 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:1370)
ManagedHost.exe!FindHandler<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, unsigned char recursive, int CatchDepth, unsigned __int64 * pMarkerRN) Line 625 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:625)
ManagedHost.exe!__InternalCxxFrameHandler<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char recursive) Line 399 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:399)
ManagedHost.exe!__InternalCxxFrameHandlerWrapper<__FrameHandler4>(EHExceptionRecord * pExcept, unsigned __int64 * pRN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC, FH4::FuncInfo4 * pFuncInfo, int CatchDepth, unsigned __int64 * pMarkerRN, unsigned char recursive) Line 234 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\frame.cpp:234)
ManagedHost.exe!__CxxFrameHandler4(EHExceptionRecord * pExcept, unsigned __int64 RN, _CONTEXT * pContext, _xDISPATCHER_CONTEXT * pDC) Line 306 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\risctrnsctrl.cpp:306)
ntdll.dll!00007ffcbc0f23af() (Unknown Source:0)
ntdll.dll!00007ffcbc0a14b4() (Unknown Source:0)
ntdll.dll!00007ffcbc0f0ebe() (Unknown Source:0)
KernelBase.dll!00007ffcb9d6cf19() (Unknown Source:0)
ManagedHost.exe!_CxxThrowException(void * pExceptionObject, const _s__ThrowInfo * pThrowInfo) Line 75 (d:\a\_work\1\s\src\vctools\crt\vcruntime\src\eh\throw.cpp:75)
ManagedHost.exe!sw::redis::throw_error(struct redisContext const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &) (Unknown Source:0)
ManagedHost.exe!sw::redis::Connection::recv(bool) (Unknown Source:0)
ManagedHost.exe!sw::redis::Redis::_command<void >(class sw::redis::Connection &,void ) (Unknown Source:0)
ManagedHost.exe!sw::redis::Redis::command<void >(void ) (Unknown Source:0)
ManagedHost.exe!sw::redis::Redis::ping(void) (Unknown Source:0)

Since timeout exceptions are occasionally triggered, I try to catch exceptions on the ping and try to recreate the client connection, here is a simple example:

        try {
            if (redis->ping() != "PONG") {
                redis = createRedis();
            }
        } catch (...) {
            redis = createRedis();
        }

Based on the stack, it's clear that after the ping triggers an exception it goes into an exception catch and then triggers a mutex crash when trying to call the destructor sw::redis::SafeConnection::~SafeConnection.

Even if it throws timeout error, it won't be a problem.

However, from the stack info you given, it seems that you installed two version of STL:

  • d:\a_work\1\s\src\vctools\crt\github\stl\src\mutex.cpp
  • c:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.36.32532\include\mutex

That might be the reason why your code crash in mutex related stuff, i.e. version 1 and version 2 are messed up. You can uninstall one, and keep only one version of STL.

Regards

github\stl\src\mutex.cpp does not exist, it is just a symbol index.

new crash stack:

 	ManagedHost.exe!std::_String_val<std::_Simple_types<char>>::_Large_string_engaged() Line 2282	C++	Symbols loaded.
 	ManagedHost.exe!std::string::_Tidy_deallocate() Line 4868	C++	Symbols loaded.
 	ManagedHost.exe!std::string::~basic_string<char,std::char_traits<char>,std::allocator<char>>() Line 3164	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::ConnectionOptions::~ConnectionOptions() Line 54	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::Connection::~Connection() Line 121	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::Connection::`scalar deleting destructor'(unsigned int)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::destroy_at<sw::redis::Connection>(sw::redis::Connection * const _Location=0x0000000000000000) Line 313	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::_Default_allocator_traits<std::allocator<sw::redis::Connection>>::destroy<sw::redis::Connection>(std::allocator<sw::redis::Connection> & __formal={...}, sw::redis::Connection * const _Ptr=0x0000000000000000) Line 688	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::deque<sw::redis::Connection,std::allocator<sw::redis::Connection>>::pop_back() Line 1206	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::deque<sw::redis::Connection,std::allocator<sw::redis::Connection>>::_Tidy() Line 1588	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::deque<sw::redis::Connection,std::allocator<sw::redis::Connection>>::~deque<sw::redis::Connection,std::allocator<sw::redis::Connection>>() Line 904	C++	Non-user code. Symbols loaded.
>	ManagedHost.exe!sw::redis::ConnectionPool::~ConnectionPool() Line 64	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::ConnectionPool::`scalar deleting destructor'(unsigned int)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::_Destroy_in_place<sw::redis::ConnectionPool>(sw::redis::ConnectionPool & _Obj={...}) Line 300	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::_Ref_count_obj2<sw::redis::ConnectionPool>::_Destroy() Line 2111	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::_Ref_count_base::_Decref() Line 1172	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::_Ptr_base<std::random_device>::_Decref() Line 1398	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::shared_ptr<VoidConnector>::~shared_ptr<VoidConnector>() Line 1681	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::Redis::~Redis()	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::Redis::`scalar deleting destructor'(unsigned int)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::default_delete<sw::redis::Redis>::operator()(sw::redis::Redis * _Ptr=0x0000019a7560d860) Line 3167	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::unique_ptr<sw::redis::Redis,std::default_delete<sw::redis::Redis>>::reset(sw::redis::Redis * _Ptr=0x0000019a7560d740) Line 3315	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::unique_ptr<sw::redis::Redis,std::default_delete<sw::redis::Redis>>::operator=<std::default_delete<sw::redis::Redis>,0>(std::unique_ptr<sw::redis::Redis,std::default_delete<sw::redis::Redis>> && _Right={...}) Line 3266	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!`RedisHelper::isConnecting'::`1'::catch$3() Line 43	C++	Symbols loaded.
 	ManagedHost.exe!_CallSettingFrame_LookupContinuationIndex() Line 98	Unknown	Non-user code. Symbols loaded.
 	ManagedHost.exe!__FrameHandler4::CxxCallCatchBlock(_EXCEPTION_RECORD * pExcept=0x00000082a19fd4a0) Line 1435	C++	Non-user code. Symbols loaded.
 	ntdll.dll!RcConsolidateFrames�()	Unknown	Non-user code. Symbols loaded without source information.
 	ManagedHost.exe!RedisHelper::isConnecting() Line 39	C++	Symbols loaded.
 	ManagedHost.exe!RedisHelper::tryConnecting() Line 50	C++	Symbols loaded.


Connection is nullptr
image

Is the redis client thread-safe? I share the same client with multiple threads. I read that it is thread-safe by default? Am I getting this wrong? Below is my short code to create redis client

return  std::make_unique<sw::redis::Redis>(connectionOptions);

And I found deadlock when a large number of threads call ping, they are all stacks like this

 	ntdll.dll!00007ffcbc0f0a14()	Unknown	Non-user code. Symbol loading disabled by Include/Exclude setting.
 	ntdll.dll!00007ffcbc0b4141()	Unknown	Non-user code. Symbol loading disabled by Include/Exclude setting.
 	KernelBase.dll!00007ffcb9db3499()	Unknown	Non-user code. Symbol loading disabled by Include/Exclude setting.
 	[Inline Frame] ManagedHost.exe!Concurrency::details::stl_condition_variable_win7::wait_for(Concurrency::details::stl_critical_section_interface *) Line 87	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!Concurrency::details::stl_condition_variable_win7::wait(Concurrency::details::stl_critical_section_interface * lock) Line 81	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!_Cnd_wait(_Cnd_internal_imp_t * cond=0x000002626dee2730, _Mtx_internal_imp_t * mtx=0x000002626dee26e0) Line 59	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::condition_variable::wait(class std::unique_lock<class std::mutex> &)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!std::lock<class std::mutex,class std::mutex>(class std::mutex &,class std::mutex &)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::ConnectionPool::_wait_for_connection(class std::unique_lock<class std::mutex> &)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::ConnectionPool::fetch(void)	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::SafeConnection::SafeConnection(sw::redis::ConnectionPool & pool={...}) Line 122	C++	Symbols loaded.
 	ManagedHost.exe!sw::redis::Redis::command<void >(void )	C++	Non-user code. Symbols loaded.
 	ManagedHost.exe!sw::redis::Redis::ping(void)	C++	Non-user code. Symbols loaded.

these threads use the same redis client

The lib is thread safe, unless otherwise stated. You can call methods on Redis in multiple threads. However, Subscriber, Pipline or Transaction are not thread safe.

I still doubt that the two version of STL caused the problem. Can you try your code on a clean env, i.e. env with only 1 version of STL installed (remove the one installed in d:\a_work\1\s\src\vctools\crt\github\stl).

Regards

I use single thread crashes will no longer occur