transcribestreaming SIGSEGV of library in CRTHttpClient::MakeRequest -> ostream::write
sem32 opened this issue · 4 comments
Describe the bug
We are using the C++ SDK to transcribe a stream in real time, and the SDK library crashes in some cases. The crash is 100% reproducible when the AWS_SECRET_ACCESS_KEY environment variable has a wrong value.
Why are we using the CRT HTTP client?
We are using it because we had a performance issue with libcurl.
- With CURL 7.87 the transcription quality was good, but CPU usage was too high (a spike to 100% every 3-5 sec). For one transcribing process that is more or less OK, but for 30 it is not.
- With CURL 7.88 we faced an issue with the transcription quality (it looks like the CURL library does some optimization), but we had no performance issue.
- With the CRT HTTP client we have no issue with either quality or performance.
GDB output: gdb_dump.txt
I tried to use libsanitizer to catch the issue, and here is the result: libsanitizer_res.txt
Expected Behavior
There are no crashes in the library
Current Behavior
The library crashes.
Reproduction Steps
The issue reproduces in some rare cases with no changes at all, but it reproduces 100% of the time if we put a wrong symbol into the value of the AWS_SECRET_ACCESS_KEY environment variable.
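For anyone trying to reproduce this, the "wrong symbol" step can be sketched as below. This is only an illustration under assumptions: `CorruptSecretKeyForRepro` is a hypothetical helper name (it is not part of the AWS SDK), it relies on POSIX `setenv`, and it would have to run before the SDK reads the credentials from the environment.

```cpp
#include <cassert>
#include <stdlib.h>  // getenv, setenv (POSIX)
#include <string>

// Hypothetical helper, not part of the AWS SDK: appends one wrong symbol to
// AWS_SECRET_ACCESS_KEY for the current process only.
// Returns the corrupted value, or an empty string if the variable is unset
// or could not be updated.
std::string CorruptSecretKeyForRepro(char wrongSymbol = '!') {
    assert(wrongSymbol != '\0');  // a NUL byte would not change the value
    const char* current = getenv("AWS_SECRET_ACCESS_KEY");
    if (current == nullptr) {
        return {};
    }
    const std::string corrupted = std::string(current) + wrongSymbol;
    // The third argument (overwrite = 1) replaces the existing value.
    if (setenv("AWS_SECRET_ACCESS_KEY", corrupted.c_str(), 1) != 0) {
        return {};
    }
    return corrupted;
}
```

Exporting a modified value in the shell before starting the process should have the same effect; the helper only makes the step repeatable from a test harness.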
Possible Solution
No response
Additional Information/Context
No response
AWS CPP SDK version used
1.11.184 (latest master)
Compiler and Version used
gcc (Debian 10.2.1-6) 10.2.1 20210110
Operating System and version
Debian 11
I'm working on trying to reproduce the same error you are getting and I had a few questions:
- Are the logs/sanitizer/dump from when you reproduce the error without any changes? (i.e. with the normal AWS_SECRET_ACCESS_KEY)
- Are you getting CRC Mismatch in both error cases?
- Can you confirm you are using the unmodified sample found here: https://github.com/awsdocs/aws-doc-sdk-examples/tree/main/cpp/example_code/transcribe
- Have you tried to reproduce this on any other OSes?
I just want to make sure we are both trying to solve the same problem. This similar looking issue was caused by a permission access error in my AWS credential, and I want to make sure we're not debugging an error added artificially by changing the AWS_SECRET_ACCESS_KEY.
Are the logs/sanitizer/dump from when you reproduce the error without any changes? (i.e. with the normal AWS_SECRET_ACCESS_KEY)
I've changed only the default requestTimeoutMs, because it is too small in the SDK.
diff --git a/src/aws-cpp-sdk-core/source/client/ClientConfiguration.cpp b/src/aws-cpp-sdk-core/source/client/ClientConfiguration.cpp
index 30e4fbabc0..ba73b788b1 100644
--- a/src/aws-cpp-sdk-core/source/client/ClientConfiguration.cpp
+++ b/src/aws-cpp-sdk-core/source/client/ClientConfiguration.cpp
@@ -122,7 +122,7 @@ void setLegacyClientConfigurationParameters(ClientConfiguration& clientConfig)
clientConfig.useFIPS = false;
clientConfig.maxConnections = 25;
clientConfig.httpRequestTimeoutMs = 0;
- clientConfig.requestTimeoutMs = 3000;
+ clientConfig.requestTimeoutMs = 30000;
clientConfig.connectTimeoutMs = 1000;
clientConfig.enableTcpKeepAlive = true;
clientConfig.tcpKeepAliveIntervalMs = 30000;
Are you getting CRC Mismatch in both error cases?
Yes, I get the same CRC Mismatch error even with the correct AWS_SECRET_ACCESS_KEY. With one transcribing session it's okay, but when I start 10-20 transcribing sessions, after some time (20-30 sec) I get the same error (CRC Mismatch) and the crash.
So, changing AWS_SECRET_ACCESS_KEY is the simplest way to reproduce the issue, but it's not a production case. In production, I have the same error (and crash) with a small load.
here are the logs/dumps:
crash2.zip
Also, under a load of ~30 transcribing sessions I've faced other crashes:
crash3.txt
and one more:
crash4.txt
crash5.zip
Can you confirm you are using the unmodified sample found here: https://github.com/awsdocs/aws-doc-sdk-examples/tree/main/cpp/example_code/transcribe
Yes, correct. I tried to reproduce the issue with the wrong AWS_SECRET_ACCESS_KEY and it looks like the crash is the same.
Have you tried to reproduce this on any other OSes?
no, we are using Debian 11
I'm developing a multithreaded application for real-time transcription of VoIP calls, so when I load my module, I call Aws::InitAPI(options)
and for each SIP call that I need to transcribe I start a separate thread where I call
m_client = Aws::MakeUnique<TranscribeStreamingServiceClient>("TAG", config);
StartStreamTranscriptionRequest m_request;
set all callbacks and call
m_client->StartStreamTranscriptionAsync(_request, OnStreamReady, OnResponseCallback, nullptr);
When I have 5 calls to transcribe, it works with no issues, but with 10-20 calls I start to see crashes in the SDK library.
I compile the SDK with:
cmake ../aws-sdk-cpp -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=/usr/local/ -DCMAKE_INSTALL_PREFIX=/usr/local/ -DBUILD_ONLY="transcribestreaming" -DUSE_CRT_HTTP_CLIENT=1
@SergeyRyabinin
The fix is working. Thank you!