lexus2k/tinyproto

Link management re-sync timeout

TPopiel opened this issue · 13 comments

Hello,

I'm using the library working in FD mode at both sides: device and PC application connected over serial interface. Everything works fine when the application on PC is up and running and HDLC link between node is alive, but if I close the link (e.g.: by closing the application) I need to wait 5 seconds till the device is able to establish new HDLC link with the PC application. I wonder if it is possible with the tinyproto following:

  1. Get current HDLC link status.
  2. Close HDLC link (the node in such case should send U-Frame Disconnect DISC command).
  3. Reset HDLC link (the node in such case should send U-Frame Reset RSET command).

Best regards,
Tomek

Hi Tomas,

Regarding 5 seconds that's the feature with keep alive timeout. By default keep alive timeout is hard coded to 5 seconds protocol->ka_timeout = 5000;. But that can be changed via tiny_fd_set_ka_timeout() to any desired value. Remember that the retry timeout must be less than keep alive timeout in order to avoid link breakage during normal communication process.

It is possible to implement the API you ask:

  1. Get current HDLC link status

Not a problem, FD implementation always knows status internally, and I think it is easy to add such API.

  1. Close HDLC link (the node in such case should send U-Frame Disconnect DISC command).

Such command can be implemented.

  1. Reset HDLC link (the node in such case should send U-Frame Reset RSET command).

Yes, RSET also can be implemented.

The features 2 and 3 require additional testing and writing more unit tests. Also, they do not protect the protocol in the cases when a hardware becomes suddenly broken (i.e. disconnected cable). That's why keep alive timeout was implemented.

Best regards
Alexey

Hi Alexey,

I believe that combination of public getter that returns current state of HDLC link and implicit sending Disconnect DISC command to other node before tiny_fd_close should solve all my issues. Do you consider changing the tinyproto by above 2 items?

I looked into the code but I'm not sure where to trigger sending the DISC command,
It can be put into following function:

void tiny_fd_close(tiny_fd_handle_t handle)
{
// here to call DISC command if HDLC state is alive
hdlc_ll_close(handle->_hdlc);
tiny_events_destroy(&handle->frames.events);
tiny_mutex_destroy(&handle->frames.mutex);
}

but how to ensure that it is sent properly and reaches the other node?

Regards, Tomek

Well, I have to mention that tiny_fd_close() is not good place to implement disconnect command. And the things get a little bit more complicated than just to put DISC command to the queue for FD protocol. Actually you need to wait until the command is sent.
It will be not so easy to implement those things in tiny_fd_close since you need to take into account multithread and single thread cases, and somehow to pass callbacks to read/write functions for the communication channel.

So, I would recommend to implement disconnect command as separate API function. DISC command has U-format, so it has no any specific confirmation response, except UA command. But UA frame is standard asnwer to many other commands like SNRM, SARM, SABM, SNRME, SARME, SABME, RSET, SIM. So, you cannot be completely sure that this is response to DISC.

I wonder if decreasing keep alive timeout makes things a little better in your case?

Just added 2 new APIs: tiny_fd_get_status() and tiny_fd_disconnect().
Let me know what you think

Hi Alexey,

Sorry for late response, but I was loaded with other stuff. I have played a bit with new changes.

I have 3 observations:

  1. MSVC build ends with error due to changes introduced with alignment optimizations.
  2. In my code I'm using tinyproto::FdD type. It is high level FD API and there is missing getter for the handle - it is necessary in order to use new introduced APIs. Could you add following into IFd class:
    tiny_fd_handle_t getTinyHandle() const
    {
    return m_handle;
    }
  3. When I updated version of tinyproto I'm receiving TINY_ERR_TIMEOUT, before it communication was working fine - I'm looking into details what is root cause.

Regards, Tomek

Hi Tomek,

MSVC build ends with error due to changes introduced with alignment optimizations.

Fixed by latest commits

Would you add following into IFd class.

Why not. New method is added getHandle()

When I updated version of tinyproto I'm receiving TINY_ERR_TIMEOUT, before it communication was working fine - I'm looking into details what is root cause.

I have several questions to you:
a) can you find the exact commit, which breaks the communication in your case?
b) can you describe your setup and provide simple code example that doesn't work?

Since DISC support commit, no new changes were introduced to the protocol itself except alignment. And alignment doesn't change the logic of the protocol.

Best regards

Once again, I went through all commits since DISC command support, and there are only changes in initialization procedures due to alignment.

I checked various commits and the problem was introduced with: 0095565 there you added support for DISC command and checking link management status. Tomorrow I will try to enable log file and gather tinyproto logs. I'm able to reproduce the problem with my unit tests - so need to connect to real device.

Regards, Tomek

That commit adds 1 state, and now there are 4: DISCONNECTING, CONNECTED, DISCONNECTED, CONNECTING
Earlier there were only 3 states: CONNECTED, DISCONNECTED, CONNECTING

Old logic:

  1. Upon receiving UA packet the protocol switches to CONNECTED state always.
  2. S-frames and I-frames are being processed only in CONNECTED state
  3. If there are no frames for specified ka timeout, the protocol switches to DISCONNECTED state.
  4. If number of retries is exhausted, the protocol switches to DISCONNECTED state.

New logic

  1. Upon receiving UA packet the protocol switches to CONNECTED only from CONNECTING state
  2. Upon receiving UA packet the protocol switches to DISCONNECTED only from DISCONNECTING state
  3. Upon receiving DISC packet the protocol switches to DISCONNECTED immediately
  4. S-frames and I-frames are being processed in CONNECTED and DISCONNECTING states
  5. If there are no frames for specified ka timeout, the protocol switches to DISCONNECTED state.
  6. If number of retries is exhausted, the protocol switches to DISCONNECTED state.

At least, I cannot find anything criminal. Each commit to the protocol goes through unit tests, which check 73% of code. And each commit automatically runs virtual UART communication tests on ARM/x86 platform using socat.

The logs will be very helpful.

Hi Alexey,

Please see the log:

49624213 ms: [0x55a166e79f70] PUT frame
49624213 ms: [0x55a166e79f70] ABM connection is not established
49624216 ms: [0x55a166e79f70] Sending U-Frame type=2C
49624216 ms: [0x55a166e79f70] Receiving U-Frame type=60
49624216 ms: [0x55a166e79f70] Receiving U-Frame type=60
49624216 ms: [0x55a166e79f70] ABM connection is not established
49624217 ms: [0x55a166e79f70] Sending U-Frame type=2C
49624416 ms: [0x55a166e79f70] ABM connection is not established
49624416 ms: [0x55a166e79f70] Sending U-Frame type=2C
49624616 ms: [0x55a166e79f70] ABM connection is not established
49624617 ms: [0x55a166e79f70] Sending U-Frame type=2C
49624715 ms: [0x55a166e79f70] PUT frame timeout

It seems, I know what's wrong in fd protocol. The bug in lines 790 793: no switching to connecting state.
Good catch.
Let me know if the fix below works

Hi, I found 1 more issue, all fixes are on the master branch

Hi Alexey,

Works perfectly, thanks for introducing this feature into the tinyproto!

Regards, Tomek