Buffer Overflows in Native Serial Implementation with Long Device Names
Closed this issue ยท 7 comments
I am currently troubeshooting an issue that occurs in nrjavaserial-5.2.1
when connecting to a certain serial port on an openHAB system. For reference, there is also a thread in the openHAB community about this, but so far no one had an idea about how it could be fixed. So I hoped maybe you guys could help me identify and fix the issue.
Platform information: Ubuntu 20.04.4 LTS (GNU/Linux 5.4.0-122-generic x86_64)
Background: I use two different serial devices connected via USB. My system sometimes changes the assignments of the USB ports to tty
devices, i.e. sometimes a certain device will be mounted at /dev/ttyUSB0
and sometimes at /dev/ttyUSB1
. The assignments can change upon every reboot and the behavior seems to be non-deterministic. Luckily there are some symlinks with stable names located at /dev/serial/by-id
:
I am trying to use those symlinks for my openHAB devices. The issue is that one of those symlinks can be used without any problems, while the other crashes the complete JVM when connected to.
This one works without any issues:
/dev/serial/by-id/usb-SHK_NANO_CUL_433-if00-port0
while this one crashes the JVM:
/dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0
I debugged the issue remotely and found out that the crash happens when calling the native open()
method in class gnu.io.RXTXPort
. The call is at line 173:
fd = open( name );
the native method called is:
private native synchronized int open( String name ) throws PortInUseException;
First I suspected a threading issue, but even when deactivating the NanoCUL device the issue still occurs, so concurrency does not seem to be the problem.
Do you guys have any other idea why this could happen? The only reason I can think of right now is the length of the device names. The NanoCUL device name has 50 characters, while the other one (it is a Zigbee USB stick) has 66 characters. Are you aware of any limitations regarding the device name length? Or do you have any other ideas why the crash might occur?
Here are the two stack traces of the calls:
From the openHAB Serial Binding with device name /dev/serial/by-id/usb-SHK_NANO_CUL_433-if00-port0
(this call works):
Thread [OH-safeCall-1] (Suspended (breakpoint at line 173 in RXTXPort))
RXTXPort.<init>(String) line: 173
RXTXCommDriver.getCommPort(String, int) line: 983
CommPortIdentifier.open(String, int) line: 467
SerialPortIdentifierImpl.open(String, int) line: 53
SerialBridgeHandler.initialize() line: 138
GeneratedMethodAccessor91.invoke(Object, Object[]) line: not available
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43
Method.invoke(Object, Object...) line: 566
InvocationHandlerSync<T>(AbstractInvocationHandler<T>).invokeDirect(Invocation) line: 154
Invocation.call() line: 52
FutureTask<V>.run() line: 264
QueueingThreadPoolExecutor(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1128
ThreadPoolExecutor$Worker.run() line: 628
Thread.run() line: 829
From the Zigbee binding with device name /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0
(this one crashes the JVM):
Daemon Thread [OH-thingHandler-5] (Suspended (breakpoint at line 173 in RXTXPort))
owns: EmberHandler (id=705)
RXTXPort.<init>(String) line: 173
RXTXCommDriver.getCommPort(String, int) line: 983
CommPortIdentifier.open(String, int) line: 467
SerialPortIdentifierImpl.open(String, int) line: 53
ZigBeeSerialPort.open(int, ZigBeePort$FlowControl) line: 157
ZigBeeSerialPort.open() line: 123
ZigBeeDongleEzsp.initialiseEzspProtocol() line: 1289
ZigBeeDongleEzsp.initialize() line: 432
ZigBeeNetworkManager.initialize() line: 418
EmberHandler(ZigBeeCoordinatorHandler).initialiseZigBee() line: 431
EmberHandler(ZigBeeCoordinatorHandler).lambda$2() line: 557
354614459.run() line: not available
Executors$RunnableAdapter<T>.call() line: 515
ScheduledThreadPoolExecutor$ScheduledFutureTask<V>(FutureTask<V>).run() line: 264
ScheduledThreadPoolExecutor$ScheduledFutureTask<V>.run() line: 304
WrappedScheduledExecutorService(ThreadPoolExecutor).runWorker(ThreadPoolExecutor$Worker) line: 1128
ThreadPoolExecutor$Worker.run() line: 628
Thread.run() line: 829
I could not find any JVM crash logs or core dumps. Please let me know if I can provide any further information.
I would be happy to contribute a fix if the issue turns out to be in the Java code (as opposed to the native code).
There are indeed several unbounded sprintf
writes to a logging message buffer of 80 bytes containing the device name: https://github.com/NeuronRobotics/nrjavaserial/blob/master/src/main/c/src/SerialImp.c#L669 and following.
This could be the issue indeed. While looking through the C code I found the following issues:
RXTXPort.Initialize
: buffermessage[80]
declared in line 265 seems never to be written anywhere but is passed toreport()
in line 311RXTXPort.open
: buffermessage[80]
declared in line 669 will overflow for long file names like in my case, effective messages belowRXTXPort.open
: buffer is filled and never used in line 710 (this most likely causes my crash)RXTXCommDriver.testRead
(test code, not sure if relevant): buffermessage[80]
in line 4378 will overflow for long file names (example below)lib_lock_dev_lock
: buffermessage[80]
declared in line 5391 will overflow for long file names (example below)uucp_lock
: empty message buffer is reported in line 5551uucp_lock
: not sure whetherfilename
was intended instead ofname
in line 5566. Anyways if my device name would be inserted the buffer would overflow (example below)uucp_lock
: not sure if the lock file name also includes the original device name, but if so there might be more overflows in line 5579, 5588 and 5596.uucp_unlock
: potential overflows if the lock file name contains the original device file name
Effective error messages that cause buffer overflows when using my device name as example:
open: locking has failed for /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0\n (95 characters)
open: locking worked for /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0\n (91 characters)
open: exclusive access denied for /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0\n (100 characters)
testRead: /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0 is good!\n (85 characters)
RXTX fhs_lock() Error: creating lock file for: /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0: %s\n (117+ characters)
uucp_lock: device was /dev/serial/by-id/usb-Silicon_Labs_BV_2010_10_01382705-if00-port0\n (88 characters)
I could try to provide a pull request, but I have to declare that while I'm an experienced Java programmer, I don't know much about memory management in C. So I don't know whether just raising the buffer sizes would solve the problem, plus I don't know how to test this properly. Would anyone else who is more familiar with the code and knows how this can be tested be willing to fix this?
Thanks for the research. Most points should be very easy to fix by replacing sprintf
by snprintf
and adding sizeof(message)
as the additional second argument. The risk of breaking anything by these changes should be low.
Printing an uninitialized buffer should simply be removed I guess.
Created PR #230, it would be great if you could review and test it as I don't have the required toolchains installed.
Thanks for fixing these bugs @david-pace!
I'd be happy to integrate a new nrjavaserial version in openHAB when it's released. ๐
It would also allow us to use an official release again, since we've been using our own "5.2.1.OH1" build to be able to use the fix for the annoying #111 bug ๐ ๐จ
So all changes up to 7aa21d1 are well tested since they are part of OH 3.3.0 which was released ~6 months ago.
Would it possible to create a new stable release of nrjavaserial
@madhephaestus @crea-doo @kaikreuzer @MrDOS, and can we do anything to help?