MarvellEmbeddedProcessors/A3700-utils-marvell

Sometimes ttyUSB is blocking on open?

Closed this issue · 19 comments

I don't know why, might be a kernel thing, but strace reports open("/dev/ttyUSB0".... This means the call doesn't return, so unfortunately something in the program needs to change prior to calling open.

I know the addition of O_NONBLOCK changes the ABI slightly, so you should skim the rest of the code looking for places were u might want to block/flush. I think there is a read() somewhere in a loop after sending wtp\CR, for example.

diff --git a/wtptp/src/Wtpdownloader_Linux/src/UARTPortLinux.cpp b/wtptp/src/Wtpdownloader_Linux/src/UARTPortLinux.cpp
index fa1cd9e..bae0c33 100644
--- a/wtptp/src/Wtpdownloader_Linux/src/UARTPortLinux.cpp
+++ b/wtptp/src/Wtpdownloader_Linux/src/UARTPortLinux.cpp
@@ -127,7 +127,7 @@ void CUARTPortLinux::OpenPort() throw (CWtpException)
        UARTDeviceName ="/dev/ttyUSB" ;
        UARTDeviceName.append(portNumber.str());
        cout << endl << UARTDeviceName << endl;
-       uartLinuxFileDesc = open(UARTDeviceName.c_str(),O_RDWR|O_NOCTTY);
+       uartLinuxFileDesc = open(UARTDeviceName.c_str(),O_RDWR|O_NOCTTY|O_NONBLOCK);
        if(uartLinuxFileDesc < 0)
        {
                cout << endl << " ERROR: Open UART driver failed:" << endl;
pali commented

@cheako: I'm seeing same issue that lot of times wtpdownloader gets stuck in open syscall. I guess that it is because of internal USB <--> UART state. Are you able to fix also following read/write calls so code would behave correctly?

@pali I'm working on something.

@pali I've discovered that after first connecting the USB device(usually I unplug the 12v power) you get exactly one try until this bug presents itself. Also the patch above should work just fine, a few lines down the application turns blocking back on, somewhat accidentally.

Still with this in hand I'm unable to recover my device. I hope U have better luck.

pali commented

Not sure if this is related, but in Marvell U-Boot documentation https://github.com/MarvellEmbeddedProcessors/u-boot-marvell/blob/u-boot-2018.03-armada-18.12/doc/mvebu/uart_boot.txt is written that prior using WtpDownload_linux it is needed to call stty -F /dev/ttyUSB0 clocal.

pali commented

@cheako: Now WtpDownload_linux stucked again. And when I called above clocal then next WtpDownload_linux did not stucked again. Can you check if clocal helps for you?

I see how clocal could help and would be the correct setting. Even if this is a better fix than setting non-blocking... Why isn't this application doing that?

https://man7.org/linux/man-pages/man3/termios.3.html

Seems simple enough. If setting clocal requires root... non-blocking mode doesn't.

I no-longer need to use this application, but I will still champion for proper documentation and code changes. Like explaining that there is a 2 second window where the autoboot can be interrupted after minutes of waiting on data upload. However, this is not a case of documentation that's nearly impossible to find... It's that minor changes to the code would remove the need for the user to do anything WRT this issue.

pali commented

Seems that modem control lines should be really ignored for UART as there are no control lines. So it makes sense that clocal is needed.

Why isn't this application doing that?

Why WtpDownload_linux does not set clocal? I do not know. This is question for Marvell.

Maybe @kostapr would know? At least mentioned U-Boot documentation where is clocal documented is git-authorized by him.

@pali @cheako
actually this step is documented in u-boot procedure doc/mvebu/uart_boot.txt, #4

The following shows the detail steps of UART boot. The target board is
connected to /dev/ttyUSB0 in this example, replace the port number
according to your set-up.

1. The normal ATF build procedure for A3700 target creates a folder
   with all the images required for UART boot under ATF build output
   folder (A.K.A., build/a3700/release or build/a3700/debug). Make
   sure u-boot was built before ATF, see detail in build.txt.

2. Make sure the target board is in UART mode and ">" prompt is
   presented on the connected terminal. Pressing "Enter" in this mode
   will show "E" and then ">" again.
   There is a way to stop the target board booting from other sources,
   SPI flash for example, and switch to UART boot mode, by using the
   WtpDownload application::

	(power down the board or hold the RESET button)
	sudo stty -F /dev/ttyUSB0 clocal
	(power up the board or release the RESET button)
	./WtpDownload_linux -P UART -C 0 -R 115200 -Y

3. Stop the terminal emulator program on the UART port attached to the
   target board (minicom) for releasing the serial port and allowing
   an application to use it for image download.

4. Disable the TTY modem control signals::

	sudo stty -F /dev/ttyUSB0 clocal

5. Start the image download application in the ATF build folder using
   the following parameters.

   WTP downloader usage example - non-secure boot::

	./WtpDownload_linux -P UART -C 0 -R 115200 \
		-B ./uart-images/TIM_ATF.bin \
		-I ./uart-images/boot-image_h.bin \
		-I ./uart-images/wtmi_h.bin -E

   WTP downloader usage example - secure boot::

	./WtpDownload_linux -P UART -C 0 -R 115200 \
		-B ./uart-images/TIM_ATF_TRUSTED.bin \
		-B ./uart-images/TIMN_ATF_TRUSTED.bin \
		-I ./uart-images/boot-image_h.bin \
		-I ./uart-images/wtmi_h.bin -E

   Note that "-E" parameter can be dropped from the above commands.
   If you do this, check that the BootROM UART download mode is
   selected in paragraph 2, type in terminal "wtp" without quotes and
   press Enter. Then follow the procedure described in paragraph 3.
   This command will switch the BootROM to WTP download mode. The same is
   archieved by "-E" parameter in latest downloader SW.

6. After the process is finished, start the terminal emulator
   program (minicom) on that the UART port attached to the target
   board and ensure the U-Boot prompt is reached.

Note that steps 2-5 can be simplified with a single command as below::

	sudo stty -F /dev/ttyUSB0 clocal
	./WtpDownload_linux -P UART -C 0 -R 115200 \
		-B ./uart-images/TIM_ATF.bin \
		-I ./uart-images/boot-image_h.bin \
		-I ./uart-images/wtmi_h.bin -E -O

pali commented

I think I see what is happening here. See tty_ioctl(4) manpage about clocal: https://man7.org/linux/man-pages/man4/tty_ioctl.4.html

If the CLOCAL flag for a line is off, the hardware carrier detect (DCD) signal is significant, and an open(2) of the corresponding terminal will block until DCD is asserted, unless the O_NONBLOCK flag is given. If CLOCAL is set, the line behaves as if DCD is always asserted. The software carrier flag is usually turned on for local devices, and is off for lines with modems.

So if clocal is not set and you open /dev/ttyUSB device without O_NONBLOCK then it open syscall hangs. And if you open /dev/ttyUSB with O_NONBLOCK then Wtpdownloader would not work as it expects blocking IO operation.

Therefore setting clocal prior using Wtpdownloader is needed.

Other option is to patch Wtpdownloader to first open with O_NONBLOCK, sets clocal and either re-open or unset O_NONBLOCK.

pali commented

mox-imager (alternative tool for building and sending A3720 firmware over UART) has now a fix for this issue:
https://gitlab.nic.cz/turris/mox-imager/-/commit/2ab64ffb9aefeb91c04b6e896f87a345a0a920e2

pali commented

@cheako: Maybe you could be interested in alternative flashing tool for A3720 boards:
https://lists.denx.de/pipermail/u-boot/2021-June/452099.html

And if you open /dev/ttyUSB with O_NONBLOCK then Wtpdownloader would not work as it expects blocking IO operation.

I've recompiled, adding O_NONBLOCK, and had no ill-effects. Later, an audit of the code suggests Wtpdownloader was coded(even if accidentally) with non-blocking IO in mind. If anything, O_NONBLOCK can be used during the initial open and then dropped with fcntl/F_SETFL. Thought, I've not tested using fcntl(I.E. it could block for the same reason as open), and I see no need.

pali commented

Later, an audit of the code suggests Wtpdownloader was coded(even if accidentally) with non-blocking IO in mind.

That is questionable... I see that e.g. CUARTPortLinux::ReadRaw probably really expects non-blocking IO mode.

That function specifically doesn't care one way or the other. If anything, it's proving my point when it specially handles the case of 0 bytes read appropriately.

pali commented

I did some tests with O_NONBLOCK and seems it is working... So I think, @cheako go ahead and create a pull request to use O_NONBLOCK. Also it probably make sense to add CLOCAL into c_cflag at:

pali commented

@cheako: Look at this code properly:

uartLinuxFileDesc = open(UARTDeviceName.c_str(),O_RDWR|O_NOCTTY);
if(uartLinuxFileDesc < 0)
{
cout << endl << " ERROR: Open UART driver failed:" << endl;
throw CWtpException(CWtpException::OPENUARTPORT,0,UARTDeviceName);
}
fcntl(uartLinuxFileDesc,F_SETFL,0);

After opening /dev/ttyUSB* device it calls fcntl(uartLinuxFileDesc,F_SETFL,0); which clears all file status flags, including O_NONBLOCK. So if you add your patch which opens tty device with O_NONBLOCK then this status flag is cleared.

So with that simple your patch all our tests were always done with blocking mode.

pali commented

@cheako So I think that your patch in the first post should at least fix an issue when open syscall hangs. And because fcntl is called immediately after open blocking mode is still used.

So go ahead and open a pull request with your patch. It does not harm and can help with hangs.

pali commented

@cheako: Could you create a pull request with your fix?

pali commented

@cheako: PING

I have opened a pull request #26 which should fix this issue.