decreasing grab framerate to 30 fps -> unstable multi GMSL cameras start
bmegli opened this issue · 3 comments
Description
Related to GMSL cameras (ZED-X, ZED-XM)
Decreasing grab_frame_rate to 30 fps makes starting multiple GMSL cameras at the same time unstable.
Steps to Reproduce
See the "Anything else?" section for now.
Expected Result
Changing grab_frame_rate should not affect camera startup stability.
Actual Result
Decreasing grab_frame_rate to 30 fps makes starting multiple GMSL cameras at the same time unstable.
Starting only 1 camera works as expected.
Waiting for the first camera to finish init before starting the second helps a bit, but is still hit or miss.
Once both cameras have started they work reliably; only the start is affected.
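The staggered start mentioned above (let one camera finish init before starting the next, and retry on failure) could be sketched roughly as follows. This is only an illustration: `open_camera` is a hypothetical callable standing in for whatever actually opens a camera in a given setup, and the retry count and delay are assumed values, not tuned ones.

```python
import time
from typing import Callable, Iterable, List


def start_cameras_sequentially(
    camera_ids: Iterable[int],
    open_camera: Callable[[int], bool],  # hypothetical: True once the camera finished init
    retries: int = 3,                    # assumed value
    delay_s: float = 2.0,                # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Open cameras one at a time and return the ids that never opened."""
    failed = []
    for cam_id in camera_ids:
        for attempt in range(1, retries + 1):
            if open_camera(cam_id):
                break  # this camera is up, move on to the next one
            if attempt < retries:
                sleep(delay_s)  # pause before the next attempt
        else:
            failed.append(cam_id)  # every attempt failed
    return failed
```

As noted above, staggering only helped a bit on the affected setup, so this is a mitigation sketch rather than a fix.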
Warnings
Camera 1 (eventually succeeds)
[ZED-Argus][Timeout] CAM 0 is frozen
[ZED-Argus][Timeout] CAM 0 is frozen
(Argus) Error FileOperationFailed: Failed socket read: Connection reset by peer (in src/rpc/socket/common/SocketUtils.cpp, function readSocket(), line 79)
(Argus) Error FileOperationFailed: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error FileOperationFailed: Receive worker failure, notifying 1 waiting threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 350)
(Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 366)
(Argus) Error FileOperationFailed: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error FileOperationFailed: Client thread received an error from socket (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 145)
(Argus) Error FileOperationFailed: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
Camera 2 (eventually fails)
(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
ZED Camera model
ZED-X, ZED-XM
Note: it was impossible to select ZED-X/ZED-XM in the form, so this was edited afterwards.
Environment
Both are running with Neural depth
Jetson AGX Orin
- JP 5.1
- L4T 35.2.1
- ZED_SDK_Tegra_L4T35.2_v4.0.2
- GMSL driver
apt-cache policy stereolabs-nvidia-l4t-kernel-35.2-dtbs
stereolabs-nvidia-l4t-kernel-35.2-dtbs:
Installed: 5.10.104-tegra-35.2.1-20230124153320
Candidate: 5.10.104-tegra-35.2.1-20230124153320
Resulting from
sudo apt install /usr/local/zed/drivers/L4T_35.2/stereolabs-zedx-L4T35.2-v0.4.7_max96712.deb
Anything else?
Workaround
Keep grab_frame_rate at 60 fps
Other notes
- I am using the nodelet workflow
- I am pulling data from the cameras immediately within the launch files (traffic on the GMSL link from the start)
- My guess is that there is some timer and timeout tuned for a 60 fps grab frame rate
ZED_Depth_Viewer
It is possible to trigger a similar condition by running 2x ZED_Depth_Viewer, each pointed at a different GMSL camera, with neural depth enabled, and then changing the framerate (which restarts the cameras).
So the real problem is below the ROS layer.
First ZED_Depth_Viewer
[ZED-Argus][Timeout] CAM 1 is frozen
[ZED-Argus][Timeout] CAM 1 is frozen
(Argus) Error FileOperationFailed: Failed socket read: Connection reset by peer (in src/rpc/socket/common/SocketUtils.cpp, function readSocket(), line 79)
(Argus) Error FileOperationFailed: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error FileOperationFailed: Receive worker failure, notifying 1 waiting threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 350)
(Argus) Error InvalidState: Argus client is exiting with 1 outstanding client threads (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 366)
(Argus) Error FileOperationFailed: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error FileOperationFailed: Client thread received an error from socket (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 145)
(Argus) Error FileOperationFailed: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
[ZED-Argus][Timeout] CAM 1 is frozen
[ZED-Argus][Timeout] CAM 1 is frozen
Second ZED_Depth_Viewer
(Argus) Error EndOfFile: Unexpected error in reading socket (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadCore(), line 277)
(Argus) Error EndOfFile: Receiving thread terminated with error (in src/rpc/socket/client/ClientSocketManager.cpp, function recvThreadWrapper(), line 379)
(Argus) Error InvalidState: Receive thread is not running cannot send. (in src/rpc/socket/client/ClientSocketManager.cpp, function send(), line 96)
(Argus) Error InvalidState: (propagating from src/rpc/socket/client/SocketClientDispatch.cpp, function dispatch(), line 91)
Stack trace (most recent call last):
#26 Object "ZED_Depth_Viewer", at 0x41ed6f, in
#25 Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffffa021ae0f, in __libc_start_main
#24 Object "ZED_Depth_Viewer", at 0x41e2fb, in
#23 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa0893a5b, in QCoreApplication::exec()
#22 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa088b3b7, in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>)
#21 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08e81cb, in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>)
#20 Object "/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0", at 0xffff9ed36c53, in g_main_context_iteration
#19 Object "/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0", at 0xffff9ed36bb3, in
#18 Object "/usr/lib/aarch64-linux-gnu/libglib-2.0.so.0", at 0xffff9ed36943, in g_main_context_dispatch
#17 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08e7e37, in
#16 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08e7507, in QTimerInfoList::activateTimers()
#15 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa088cc0b, in QCoreApplication::notifyInternal2(QObject*, QEvent*)
#14 Object "/usr/lib/aarch64-linux-gnu/libQt5Widgets.so.5", at 0xffffa1245ad7, in QApplication::notify(QObject*, QEvent*)
#13 Object "/usr/lib/aarch64-linux-gnu/libQt5Widgets.so.5", at 0xffffa123c4ab, in QApplicationPrivate::notify_helper(QObject*, QEvent*)
#12 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08ba5b7, in QObject::event(QEvent*)
#11 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08c7557, in QTimer::timeout(QTimer::QPrivateSignal)
#10 Object "/usr/lib/aarch64-linux-gnu/libQt5Core.so.5", at 0xffffa08b9bff, in QMetaObject::activate(QObject*, int, int, void**)
#9 Object "ZED_Depth_Viewer", at 0x41f35b, in
#8 Object "ZED_Depth_Viewer", at 0x43fe83, in
#7 Object "ZED_Depth_Viewer", at 0x43fafb, in
#6 Object "ZED_Depth_Viewer", at 0x437a03, in
#5 Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa355f857, in sl::Camera::open(sl::InitParameters)
#4 Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa35cc56b, in
#3 Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa2a47a53, in sl::GMSLInput::close(bool)
#2 Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa2a3d573, in ArgusCamera::close()
#1 Object "/usr/local/zed/lib/libsl_zed.so", at 0xffffa2a454bf, in
#0 Object "/usr/lib/aarch64-linux-gnu/tegra/libnvargus_socketclient.so", at 0xffff9fad0810, in
Segmentation fault (Address not mapped to object [(nil)])
Segmentation fault (core dumped)
After installing ZED SDK 4.0.3 I can no longer reproduce this problem from ROS side.
I am not sure whether it was the SDK, the GMSL grabber driver, or something else that fixed the problem.
If I don't see it again soon I will close the issue.
When one of the cameras is not reachable, it is possible that the Argus service is frozen for some reason.
You can recover the cameras by restarting the service:
sudo service nvargus-daemon restart
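If this recovery needs to be automated, one possible shape is a wrapper that restarts the daemon after a failed open and then tries once more. This is a minimal sketch under assumptions: `open_camera` is a hypothetical callable, only the restart command itself comes from this thread, and the command runner is injectable so the logic can be exercised without touching the real service.

```python
import subprocess
from typing import Callable, Sequence

# The restart command quoted above; everything else here is illustrative.
RESTART_CMD = ("sudo", "service", "nvargus-daemon", "restart")


def restart_argus(run: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Restart nvargus-daemon; return True if the command exited with 0."""
    return run(list(RESTART_CMD)) == 0


def open_with_recovery(
    open_camera: Callable[[], bool],  # hypothetical: True when the camera opened
    restart: Callable[[], bool] = restart_argus,
) -> bool:
    """Try to open; on failure restart the daemon and try exactly once more."""
    if open_camera():
        return True
    if not restart():
        return False  # the restart itself failed, do not retry
    return open_camera()
```

Restarting the daemon will most likely interrupt any other camera that is currently streaming through Argus, so a wrapper like this probably only makes sense at startup, before any camera is in use.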
Thanks.
I can no longer reproduce the problem with ZED_Depth_Viewer either.
So somehow installing ZED SDK 4.0.3 or the GMSL driver fixed it.