Bs0Dd/Coverett

fimexu randomly fails on large files

fimexu -i randomly fails on large files with a "No such file or directory" exception:
[screenshot of the error, 2022-09-22 09:28:55]

Bs0Dd commented

Does this problem occur when using the standard utilities? It looks like the program receives a JSON response from the controller like {"type":"error","data":"No such file or directory"}. I'm not sure whether the problem is in coverett/fimexu; I'll try to reproduce it myself.
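
For reference, an error reply in that shape can be detected right after parsing, before anything is written. A minimal sketch, assuming cJSON is used for the parsing; the isErrorResponse helper is purely illustrative, only the "type"/"data" field names come from the response above:

```c
#include <stdio.h>
#include <string.h>
#include "cJSON.h"

/* Hypothetical helper (not fimexu code): returns 1 and prints the message if
 * the controller replied with {"type":"error","data":"..."}.  Only the field
 * names come from the response quoted above. */
static int isErrorResponse(const char *raw)
{
    cJSON *resp = cJSON_Parse(raw);
    if (resp == NULL) return 0;

    int isErr = 0;
    cJSON *type = cJSON_GetObjectItemCaseSensitive(resp, "type");
    if (cJSON_IsString(type) && strcmp(type->valuestring, "error") == 0) {
        cJSON *msg = cJSON_GetObjectItemCaseSensitive(resp, "data");
        fprintf(stderr, "controller error: %s\n",
                cJSON_IsString(msg) ? msg->valuestring : "(no message)");
        isErr = 1;
    }
    cJSON_Delete(resp);
    return isErr;
}

int main(void)
{
    isErrorResponse("{\"type\":\"error\",\"data\":\"No such file or directory\"}");
    return 0;
}
```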

I rarely use the stock import.lua, because it corrupts data as often as, if not more often than, fimexu, so I mostly use the MicroPython implementation, and I've never seen this happen anywhere besides fimexu.

pufit commented

I faced the same problem and made a small investigation into it.
Basically, this commit TeaGuild@6726285 fixed the problem for me. How does it work? I have no idea whatsoever...

The messages No such file or directory or Not Implemented appeared on this line: https://github.com/Bs0Dd/Coverett/blob/master/fimexu.c#L131. I logged wrt and data.retNumber, and the values were
wrt 512, data.retNumber 0.00. However, one line before writing the data to the file, the value of data.retNumber was 512.00. That doesn't make any sense to me, but I changed the type of data.retNumber from double to size_t, and suddenly all the errors disappeared. I tested it for a few days on various files with import/export and hash-sum checks, and it didn't fail even once.
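
To make that concrete, the failing pattern and the one-line change look roughly like this. It's a rough reconstruction, not the literal code at fimexu.c#L131; only the wrt / data.retNumber names and the double-to-size_t change come from what I described above:

```c
#include <stdio.h>

/* Rough stand-in for the structure fimexu fills from the controller's reply;
 * only the retNumber field and its old "double" type come from the thread. */
struct reply {
    double retNumber;   /* before the fix; the fix changes this to size_t */
};

int main(void)
{
    unsigned char buf[512] = {0};
    struct reply data = { .retNumber = 512.0 };

    FILE *out = fopen("chunk.bin", "wb");
    if (out == NULL) { perror("fopen"); return 1; }

    /* fwrite() takes a size_t, so the double is converted at every call.
     * Keeping the count as size_t end-to-end (the change in the commit above)
     * removes that conversion and made the failures disappear for me. */
    size_t wrt = fwrite(buf, 1, (size_t)data.retNumber, out);
    printf("wrt %zu, data.retNumber %.2f\n", wrt, data.retNumber);

    fclose(out);
    return 0;
}
```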

So my current theory is that there is a bug in sedna or even in the musl toolchain which somehow leads to UB when you try to write a buffer with a double-typed variable as the size. I also don't know why data.retNumber was a double in the first place, so I'm not rushing to create a PR. But @Bs0Dd can probably help me at least with this mystery.
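
One more note on the conversion itself, independent of sedna/musl: casting a double to size_t is only defined behaviour when the truncated value actually fits in a size_t, so a negative value or NaN sneaking through the JSON layer would already be UB at the cast. A small defensive sketch, not code from fimexu, just the C rule:

```c
#include <stdio.h>
#include <stdint.h>

/* Convert a JSON number to a byte count, rejecting values that would make the
 * double-to-size_t conversion undefined (negative, NaN, or too large). */
static int toByteCount(double n, size_t *out)
{
    if (!(n >= 0.0) || n >= (double)SIZE_MAX)   /* !(n >= 0.0) also catches NaN */
        return 0;
    *out = (size_t)n;
    return 1;
}

int main(void)
{
    size_t count = 0;
    printf("512.0 -> %s\n", toByteCount(512.0, &count) ? "ok" : "rejected");
    printf("-1.0  -> %s\n", toByteCount(-1.0, &count) ? "ok" : "rejected");
    return 0;
}
```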

Bs0Dd commented

> The messages No such file or directory or Not Implemented appeared on this line https://github.com/Bs0Dd/Coverett/blob/master/fimexu.c#L131.

Ah, that's it. I put perror there by mistake, and later thought that the message was generated on line 124. The "double" type is used because cJSON represents the "Number" value type as a floating-point number. Maybe I did something wrong, but I sometimes had problems reading a number from "valueint" in a cJSON object.
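
For reference, after the change the count is read through valuedouble with a single explicit cast. A minimal sketch; the JSON field name here is just an example, only the valuedouble/valueint behaviour is cJSON's own:

```c
#include <stdio.h>
#include "cJSON.h"

int main(void)
{
    /* cJSON keeps every JSON Number in valuedouble; valueint is only an int
     * copy of it, which is why reading large counts via valueint can go wrong. */
    cJSON *reply = cJSON_Parse("{\"type\":\"data\",\"data\":512}");
    if (reply == NULL) return 1;

    cJSON *num = cJSON_GetObjectItemCaseSensitive(reply, "data");
    if (cJSON_IsNumber(num)) {
        size_t bytes = (size_t)num->valuedouble;   /* the single explicit cast */
        printf("bytes = %zu (valueint = %d)\n", bytes, num->valueint);
    }

    cJSON_Delete(reply);
    return 0;
}
```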

I made the variable cast to the "size_t" type. It seems there are no more problems with importing, and the program seems to have stopped corrupting the file on receipt (at least after a couple of days with hash-sum verification).

However, the problem with exporting files hasn't gone away: they are still corrupted with zero bytes (I checked with a ~440 KB JPG file). I changed the JSON "packet" size and added a delay before sending, but it doesn't help. Sometimes, by some miracle, the file is exported normally and the hash sums match, but most often it turns out to be corrupted. Maybe the problem is in the cJSON library itself; I don't have any more ideas on that yet...