A bad connection causes an inifinite loop (at least on an ESP32 but I think also on an Arduino)
Jeroen88 opened this issue · 2 comments
The problem of #56 is even bigger: ::ClientRead() and ::ClientWrite() are called in bearssl ssl_io.c. If the socket client connection (Client *c) is lost or reset by the peer, both c->read(buf, len) and c->write(buf, len) will return a 0 (at least on the ESP32 but I think this is standard behavior also on an Arduino) thus causing an infinite loop in static int run_until(br_sslio_context *ctx, unsigned target) in ssl_io.c.
In my situation this problem occurs on a bad WiFi connection, but I think it may also occur on a bad tcp/ip connection or if the peer closes the connection.
One solution would be to add a timeout in the run_until() function, but I think it is better not to touch the original library. So I made adaptations to ::ClientRead() and ::ClientWrite() that involve static variables, which I think is a bad programming habit, but this is the only way to record a state between calls to these functions.
I did not introduce a new timeout value, but use the value of the socket client c->getTimeout().
int BearSSLClient::clientRead(void *ctx, unsigned char *buf, size_t len)
{
static bool notAvailableFlag = false;
static uint32_t lastNotAvailableMillis;
Client* c = (Client*)ctx;
if (!c->connected() && !c->available()) {
return -1; // connection lost or closed by peer (in ssl_io.c low_read, which points to this function, fails on a -1. which it should if a connection is lost)
}
int result = c->read(buf, len);
if (result == 0) {
if(notAvailableFlag) {
if(millis() - lastNotAvailableMillis > c->getTimeout()) {
notAvailableFlag = false; // for the next round
return -1; // timout read (in ssl_io.c low_read, which points to this function, fails on a -1. which it should if a connection is lost or closed by peer)
}
} else {
notAvailableFlag = true; // First time no data available
lastNotAvailableMillis = millis();
}
delay(10); // Needed?
} else {
notAvailableFlag = false; // This flag was set but new data is available, so start again
}
#ifdef DEBUGSERIAL
DEBUGSERIAL.print("BearSSLClient::clientRead - ");
DEBUGSERIAL.print(result);
DEBUGSERIAL.print(" - ");
for (size_t i = 0; i < result; i++) {
byte b = buf[i];
if (b < 16) {
DEBUGSERIAL.print("0");
}
DEBUGSERIAL.print(b, HEX);
}
DEBUGSERIAL.println();
#endif
return result;
}
int BearSSLClient::clientWrite(void *ctx, const unsigned char *buf, size_t len)
{
static bool notAvailableForWriteFlag = false;
static uint32_t lastNotAvailableForWriteMillis;
Client* c = (Client*)ctx;
if (!c->connected()) {
return -1; // connection lost or closed by peer (in ssl_io.c low_write, which points to this function, fails on a -1. which it should if a connection is lost)
}
int result = c->write(buf, len);
if (result == 0) {
if(notAvailableForWriteFlag) {
if(millis() - lastNotAvailableForWriteMillis > c->getTimeout()) {
notAvailableForWriteFlag = false; // for the next round
return -1; // timout write (in ssl_io.c low_write, which points to this function, fails on a -1. which it should if a connection is lost or closed by peer)
}
} else {
notAvailableForWriteFlag = true; // First time impossible to write data to peer
lastNotAvailableForWriteMillis = millis();
}
delay(10); // Needed?
} else {
notAvailableForWriteFlag = false; // This flag was set but new data was written, so start again
}
#ifdef DEBUGSERIAL
DEBUGSERIAL.print("BearSSLClient::clientWrite - ");
DEBUGSERIAL.print(len);
DEBUGSERIAL.print(" - ");
for (size_t i = 0; i < len; i++) {
byte b = buf[i];
if (b < 16) {
DEBUGSERIAL.print("0");
}
DEBUGSERIAL.print(b, HEX);
}
DEBUGSERIAL.println();
#endif
return result;
}
Here also should be added: if(errorCode() != BR_ERR_OK) return 0; This will check for (among others) the connection being dropped during SSL negotiation.
This issue is causing poor performance of a program I'm running on a MKR 1500 to communicate via MQTT: about half the time when connecting to MQTT the program hangs indefinitely. I tried implementing @Jeroen88's clientRead() and clientWrite() but then it invariably won't connect at all. Any ideas why this is?