baryluk/fnirsi-usb-power-data-logger

Sporadic lock of USB interface

baryluk opened this issue · 8 comments

Killing the logger, might leave the meter in a state, where it does not allow to restart the logger.

Sometimes, second or third run, also makes UI (screen and controls) on the meter freeze.

Below is in FNB58, firmware 0.40

$ ./fnirsi_logger.py 
timestamp sample_in_packet voltage_V current_A dp_V dn_V temp_C_ema energy_Ws capacity_As
1668863696.036 0 0.05757 0.00009 1.661 1.655 24.700 0.000000 0.000001
1668863696.046 1 0.05757 0.00009 1.661 1.655 24.700 0.000000 0.000002
1668863696.056 2 0.05757 0.00009 1.661 1.655 24.700 0.000000 0.000003
1668863696.066 3 0.05757 0.00009 1.661 1.655 24.700 0.000000 0.000004
...
1668863697.396 0 0.05761 0.00009 1.660 1.655 24.700 0.000006 0.000111
1668863697.406 1 0.05761 0.00009 1.660 1.655 24.700 0.000006 0.000112
1668863697.416 2 0.05761 0.00006 1.661 1.655 24.700 0.000006 0.000113
1668863697.426 3 0.05761 0.00006 1.661 1.655 24.700 0.000007 0.000113
^C
Traceback (most recent call last):
  File "fnirsi_logger.py", line 230, in <module>
    main()
  File "fnirsi_logger.py", line 220, in main
    data = ep_in.read(size_or_buffer=64, timeout=1000)
  File "/usr/lib/python3/dist-packages/usb/core.py", line 423, in read
    return self.device.read(self, size_or_buffer, timeout)
  File "/usr/lib/python3/dist-packages/usb/core.py", line 1029, in read
    ret = fn(
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 864, in intr_read
    return self.__read(self.lib.libusb_interrupt_transfer,
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 946, in __read
    retval = fn(dev_handle.handle,
KeyboardInterrupt

$ ./fnirsi_logger.py 
timestamp sample_in_packet voltage_V current_A dp_V dn_V temp_C_ema energy_Ws capacity_As
Traceback (most recent call last):
  File "fnirsi_logger.py", line 230, in <module>
    main()
  File "fnirsi_logger.py", line 220, in main
    data = ep_in.read(size_or_buffer=64, timeout=1000)
  File "/usr/lib/python3/dist-packages/usb/core.py", line 423, in read
    return self.device.read(self, size_or_buffer, timeout)
  File "/usr/lib/python3/dist-packages/usb/core.py", line 1029, in read
    ret = fn(
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 864, in intr_read
    return self.__read(self.lib.libusb_interrupt_transfer,
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 954, in __read
    _check(retval)
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 602, in _check
    raise USBTimeoutError(_strerror(ret), ret, _libusb_errno[ret])
usb.core.USBTimeoutError: [Errno 110] Operation timed out
$ ./fnirsi_logger.py 
Traceback (most recent call last):
  File "fnirsi_logger.py", line 230, in <module>
    main()
  File "fnirsi_logger.py", line 63, in main
    dev.reset()
  File "/usr/lib/python3/dist-packages/usb/core.py", line 961, in reset
    self._ctx.backend.reset_device(self._ctx.handle)
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 911, in reset_device
    _check(self.lib.libusb_reset_device(dev_handle.handle))
  File "/usr/lib/python3/dist-packages/usb/backend/libusb1.py", line 604, in _check
    raise USBError(_strerror(ret), ret, _libusb_errno[ret])
usb.core.USBError: [Errno 2] Entity not found
$ 

which multimeter it has?

@baryluk I am witnessing the same behaviour with a FNIRSI FNB58.
Could you explain how the data that is written to the device during setup and between readings works?
E.g., ep_out.write(b"\xaa\x81" + b"\x00" * 61 + b"\x8e")

Perhaps the device needs a specific sequence to exit cleanly?

@luismeruje Yes, I believe that one needs to send some sequence (at possibly a specific time), before stopping connection. My guess is that device otherwise will fill up some UART FIFOs and possibly enter infinite loop trying to send more data, but FIFO is full, and will not be smaller.

When I looked at USB packet captures few months ago, I was not able to figure it out quickly (I spent maybe 20 minutes on this), so just ignored. But it would be nice to fix it.

I have created a pull request which does some impromevements to code structure and also to the stability of the tool.
I've added exception handling when a Keyboard Interrupt occurs, so that readings are exhausted before exiting. While this does not solve the problem entirely it seems to help a lot.

This is now mostly fixed. Big thanks to Luis.

It still sporadically locks, despite draining the descriptors and waiting for the timeout.

It appears that it now locks on initialization sometimes.

That is draining buffers out on exit look good, but something else is happening.

So keeping this issue open.

So it appears, that if I remove these two lines in the code:

dev.reset()

dev.set_configuration()

together with thew new draining code on Ctrl-C, it is quite reliable. Was not able to hang it yet.

All this reset and set_configuration was a guess work when I initially wrote this code, probably following some tutorials, so maybe it is not needed.

Also reset() on FNB58 is quite annoying, because it causes some issues with the storage device that the power meter also exposes. So maybe there is actually no need to do it.

Once I am happy with this, I think what I will make it not do reset() by default, and only do it if --reset command line argument is passed.

Having the dev.reset() and dev.set_configuration() lines removed greatly improves the reliability, but it still not fully consistent. After a few runs (with ^C in between them), on the 5th or 6th run, it freezes my FNB58 and I get the timeout error again. Nothing resolves this except a reboot of the FNB58.