earlephilhower/bearssl-esp8266

Use of yield

Adam5Wu opened this issue · 7 comments

Thanks a lot for your continued effort to improve this port!

The recent commit introduced yield() to prevent WDT resets. However, this prevents the library from being used in contexts other than g_cont, such as in an lwIP callback (AsyncTCP).

I think there are two alternatives that are more compatible with diverse execution contexts. Could you please consider them:

  1. system_soft_wdt_feed
    This one feeds the dog so it won't bark.
  2. optimistic_yield
    This one will only yield when running in g_cont, and won't cause a panic in other contexts.

If I use optimistic_yield instead of yield, will that cause WDT trouble for AsyncTCP? The EC key exchange, even at 160MHz, can take over (WDT_timeout)ms in BearSSL library code.

The latest push has optimistic_yield() in it. Seems fine in main Arduino code, let me know if you need some other logic to support AsyncTCP.

Thanks a lot! While using optimistic_yield prevents the immediate panic in callback context, I found that, as you described, an overly heavy computation sometimes triggers a watchdog reset.

The amount of computation seems to depend on the server. tls.mbed.org seems to cause the heaviest computation. I haven't observed the reset problem with other sites (but I haven't tried many, either).

Adding system_soft_wdt_feed(); before the "Run formulas" loop seems to alleviate the reset problem. So it seems both optimistic_yield and system_soft_wdt_feed are needed. Technically, it is an either-or -- if yielding is possible, do that; otherwise, feed the dog.

So maybe the following block can be used instead?

if (cont_can_yield(&g_cont)) {
  yield();
} else {
  system_soft_wdt_feed();
}

However, even after that, I cannot establish a connection. tcpdump reveals that the server uses a very short TCP retry and timeout during the handshake: around 2 seconds between resends, and 3 resends before it closes the connection.

Due to the nature of AsyncTCP, all data is handled in the LWIP callback context, which means that when computation hits, the TCP layer cannot ack fast enough...

I will try to implement a workaround, basically queueing all data during the handshake for processing in another (timer callback) context, and see if it helps.
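The deferral idea above can be sketched roughly as follows. This is a host-side illustration in plain C, not AsyncTCP code: `rx_queue`, `on_recv`, `rx_pop`, and the fixed sizes are hypothetical names chosen for the example, and a real port would still have to arrange for a timer to drain the queue and feed the data to the TLS engine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_SLOTS 8   /* hypothetical capacity */
#define SLOT_BYTES  64  /* hypothetical maximum chunk size */

typedef struct {
    unsigned char data[QUEUE_SLOTS][SLOT_BYTES];
    size_t        len[QUEUE_SLOTS];
    size_t        head;   /* next slot to read */
    size_t        count;  /* slots currently in use */
} rx_queue;

/* Meant to be called from the receive callback: copy the chunk and return
   at once, so the TCP layer is not held up by crypto work.
   Returns 0 on success, -1 if the chunk is too big or the queue is full. */
static int on_recv(rx_queue *q, const void *buf, size_t n)
{
    assert(q != NULL && buf != NULL);
    if (n > SLOT_BYTES || q->count == QUEUE_SLOTS) return -1;
    size_t tail = (q->head + q->count) % QUEUE_SLOTS;
    memcpy(q->data[tail], buf, n);
    q->len[tail] = n;
    q->count++;
    return 0;
}

/* Meant to be called later from a timer (non-callback) context: copy the
   oldest queued chunk into out (at least SLOT_BYTES long) and store its
   length in *n. Returns 1 if a chunk was popped, 0 if the queue is empty. */
static int rx_pop(rx_queue *q, unsigned char *out, size_t *n)
{
    assert(q != NULL && out != NULL && n != NULL);
    if (q->count == 0) return 0;
    memcpy(out, q->data[q->head], q->len[q->head]);
    *n = q->len[q->head];
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return 1;
}
```

The receive path only copies bytes, so the heavy handshake work happens wherever `rx_pop` is called from.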

BTW, I think you can move the yield block outside of the "Run formulas" loop.
My tracing shows that it is not the loop itself that runs too long. Each execution of run_code() finishes fairly fast, but it is run ~100 times and the total is too long.
So you can safely reduce the check frequency and reduce some overhead. 😄
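A minimal sketch of that suggestion, assuming the structure described above (a short inner loop that is itself invoked many times). All names here are hypothetical stand-ins that compile on a host: `service_watchdog` takes the place of the yield-or-feed block and only counts calls, and `run_formulas` takes the place of one run_code() invocation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the yield-or-feed block. On the ESP8266 this
   would call yield() or system_soft_wdt_feed(); here it only counts. */
static unsigned wdt_service_calls = 0;
static void service_watchdog(void) { wdt_service_calls++; }

/* Hypothetical stand-in for one short step inside the formula loop. */
static unsigned work_done = 0;
static void run_step(void) { work_done++; }

/* Inner loop: several short steps, with no watchdog servicing inside. */
static void run_formulas(size_t steps)
{
    for (size_t i = 0; i < steps; i++) run_step();
}

/* Outer driver: the watchdog is serviced once per call of run_formulas()
   instead of once per step. */
static void multiply(size_t rounds, size_t steps_per_round)
{
    assert(rounds > 0 && steps_per_round > 0); /* sketch assumes real work */
    for (size_t r = 0; r < rounds; r++) {
        service_watchdog();
        run_formulas(steps_per_round);
    }
}
```

With 100 rounds of 5 steps, the check runs 100 times rather than 500, which is the overhead reduction being suggested.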

LOL it turned out my problem is CPU power, after all.

I have successfully implemented an offloaded handshake in AsyncTCP, and I confirmed that packets during the handshake are acknowledged as fast as possible, but the connection was still cut (from the server side) before the handshake could complete.

Then I tried to run at 160MHz, and the connection went right through. 😄

So apparently some servers are just impossible to connect to with the default 80MHz clock.

There are some hardware-accelerated solutions, such as the ATECC508A / ATECC608A.
The BearSSL code base seems well written and very modular, so I guess it won't be hard to make use of those.
I put it on my backlog and will look into it when I get time, someday... :P

You found the same thing I did while testing. At 80MHz tls.mbed.org times out from the server at around 5 seconds.

The workaround, if you really want to run at 80MHz, is to drop the EC key exchange algorithms from the enabled suites in the br_ssl_engine_set_suites call, or move the RSA ones above all the EC ones so they get called preferentially.
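Both variants of that workaround can be sketched as a small list transformation. This is an illustration only: the suite values are the IANA code points for four common suites (BearSSL names them with a `BR_` prefix in bearssl_ssl.h, e.g. `BR_TLS_RSA_WITH_AES_128_GCM_SHA256`), redefined locally so the sketch compiles without the library. `uses_ec_key_exchange`, `prefer_rsa`, and `drop_ec` are hypothetical helpers, and the classifier only knows the four suites listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IANA TLS cipher suite code points, redefined locally for the sketch. */
#define SUITE_RSA_AES128_GCM         0x009C /* TLS_RSA_WITH_AES_128_GCM_SHA256 */
#define SUITE_RSA_AES256_GCM         0x009D /* TLS_RSA_WITH_AES_256_GCM_SHA384 */
#define SUITE_ECDHE_ECDSA_AES128_GCM 0xC02B /* TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 */
#define SUITE_ECDHE_RSA_AES128_GCM   0xC02F /* TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 */

/* Hypothetical classifier: recognizes only the two ECDHE suites above. */
static int uses_ec_key_exchange(uint16_t suite)
{
    return suite == SUITE_ECDHE_ECDSA_AES128_GCM
        || suite == SUITE_ECDHE_RSA_AES128_GCM;
}

/* Variant 1: keep every suite, but move the non-EC ones to the front.
   Relative order within each group is preserved. Returns the count (n). */
static size_t prefer_rsa(const uint16_t *in, size_t n, uint16_t *out)
{
    size_t k = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        if (!uses_ec_key_exchange(in[i])) out[k++] = in[i];
    for (size_t i = 0; i < n; i++)
        if (uses_ec_key_exchange(in[i])) out[k++] = in[i];
    return k;
}

/* Variant 2: drop the EC suites entirely. Returns the number kept. */
static size_t drop_ec(const uint16_t *in, size_t n, uint16_t *out)
{
    size_t k = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        if (!uses_ec_key_exchange(in[i])) out[k++] = in[i];
    return k;
}
```

The resulting array and count would then be passed to br_ssl_engine_set_suites along with the engine context. Whether the server honors the client's ordering is up to the server, so dropping the EC suites is the more predictable of the two.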

I think this is good now. I'll open something about the memory allocation to support AsyncTCP to track that.