joltwallet/esp_littlefs

Fails to write when appending to a file larger than 28KB

marklangster opened this issue · 11 comments

Whenever I try to append to a file once it grows above 28 KB, the write fails and triggers a core panic. I am not making any changes to the default configuration and the partition is on the order of megabytes. I'm really unsure why this is happening. I can keep creating new files, but each one fails at around the same point.

Here's some code to reproduce it. Each data write is about 160 bytes.

#include <stdio.h>
#include <esp_littlefs.h>
#include <esp_log.h>
#include <sys/stat.h>

#define BASE_PATH           "/store"
#define STRING_SIZE         30
#define PARTITION           "data_store"
#define TAG                 "DATA"



char path[STRING_SIZE];

void initDataStorage(){
    esp_vfs_littlefs_conf_t conf = {
            .base_path = BASE_PATH,
            .partition_label = PARTITION,
            .format_if_mount_failed = true,
            .dont_mount = false
    };
    esp_err_t err = esp_vfs_littlefs_register(&conf);
    if (err != ESP_OK)
    {
        if (err == ESP_FAIL)
        {
            ESP_LOGE(TAG, "Failed to mount or format filesystem");
        }
        else if (err == ESP_ERR_NOT_FOUND)
        {
            ESP_LOGE(TAG, "Failed to find LittleFS partition");
        }
        else
        {
            ESP_LOGE(TAG, "Failed to initialize LittleFS (%s)", esp_err_to_name(err));
        }
        return;
    }
    size_t total = 0, used = 0;
    err = esp_littlefs_info(conf.partition_label, &total, &used);
    if (err != ESP_OK)
    {
        ESP_LOGE(TAG, "Failed to get LittleFS partition information (%s)", esp_err_to_name(err));
    }
    else
    {
        ESP_LOGI(TAG, "Partition size: total: %d, used: %d", total, used);
    }
}

void createNewFile(){
    uint8_t count;
    for (count = 0; count < 255; count++){
        struct stat st;
        snprintf(path, STRING_SIZE, "%s/s_%d.json", BASE_PATH, count);
        if (stat(path, &st) != 0){
           break;
        }
    }

    FILE *file = fopen(path, "w");
    ESP_LOGI(TAG, "File open");


    if (file == NULL)
    {
        ESP_LOGE(TAG, "Failed to open file for writing");
        return;
    }
    fclose(file);

}

void store(char* data){
    FILE *file = fopen(path, "a");
    if (file == NULL)
    {
        ESP_LOGE(TAG, "Failed to open file for appending");
        return;
    }
    struct stat st;
    if (stat(path, &st) == 0){
        ESP_LOGI(TAG, "File stats\n size: %ld\nblocks: %ld\nhard links: %d", st.st_size, st.st_blocks, st.st_nlink);
    }

    fprintf(file, ",%s", data); // Breaking line

    fclose(file);
    ESP_LOGI(TAG, "File at %s appended", path);

}

Failure stacktrace

Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x400014fd  PS      : 0x00060d30  A0      : 0x801b0678  A1      : 0x3ffd0f50  
A2      : 0x00000000  A3      : 0xfffffffc  A4      : 0x000000ff  A5      : 0x0000ff00  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x00000000  A9      : 0x3ffd0f10  
A10     : 0x00000000  A11     : 0x3f4043e8  A12     : 0x3ffd1194  A13     : 0x3ffb6d24  
A14     : 0x00000000  A15     : 0x00000001  SAR     : 0x00000004  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  


Backtrace: 0x400014fa:0x3ffd0f50 0x401b0675:0x3ffd0f60 0x401b1ce5:0x3ffd1270 0x401c748d:0x3ffd12a0 0x40098849:0x3ffd12d0 0x400d94a9:0x3ffd1320 0x400d95f3:0x3ffd13b0 0x400d9a15:0x3ffd1440 0x40096241:0x3ffd1460
0x401b0675: _vfprintf_r at /Users/brnomac003/.gitlab-runner/builds/qR2TxTby/0/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:1528

0x401b1ce5: vprintf at /Users/brnomac003/.gitlab-runner/builds/qR2TxTby/0/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vprintf.c:34 (discriminator 5)

I ran your code with an app_main of:

void app_main(){
        initDataStorage();
        createNewFile();
        while(true)
                store("asdklfjlasdkjfklasdjfklasdjfkljasdlkfjasdlkfjasdlkjfasdkljflsdakjfalskdjf");
}

I also changed some logging around; the output looks like:

...
I (85845) DATA: File stats size: 66748 blocks: 0 hard links: 0
I (85895) DATA: File stats size: 66822 blocks: 0 hard links: 0
I (85965) DATA: File stats size: 66896 blocks: 0 hard links: 0
I (86035) DATA: File stats size: 66970 blocks: 0 hard links: 0
...

I have yet to see a crash, so it might be an interaction with some other portion of your code base? If you can produce a complete minimal project (possibly based off of esp_littlefs/example, with the littlefs_create_partition_image call in CMakeLists.txt removed) that can reproduce the issue, I can certainly look deeper.

The example does work fine, and it even works when I replace the example code with my own, but it still fails in the main project.

I finally got an error:

E (207588) esp_littlefs: Unable to allocate FD

It happened right before the stack trace.

That's helpful! So, that error comes from here. Essentially, a calloc failed, suggesting that your system is out of memory.

That's what I was seeing as well.

Two questions:

  1. Is there a way to understand how the cache and file descriptor memory grow? Is it linear with file size or with the number of writes?

  2. Is there a way to free some of that allocated memory? I am not concerned about file search times.

Basically, every time you allocate a new file descriptor, it will:

  1. Check if there's a pre-allocated FD slot available. If so, use it.

  2. If not, increase the FD cache by a factor; currently it's set to 2 (i.e. doubles).

The FD cache never shrinks; however, if you re-enable this code, it will attempt to shrink the FD cache. That code hasn't been touched in a long time, so I'm not sure if it still works. If you want to play around with it, we could make it enableable via Kconfig.
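
For illustration, here is a rough sketch of the growth pattern described above (hypothetical names and structures, not the actual esp_littlefs code):

/* Rough sketch of the FD-slot growth described above.
 * Names and structures are hypothetical, NOT the real esp_littlefs implementation. */
#include <stdlib.h>

typedef struct {
    void **slots;     /* one entry per potential open file */
    size_t capacity;  /* current number of slots */
} fd_cache_t;

/* Returns a free slot index, growing the slot array by 2x when full; -1 on allocation failure. */
static int fd_cache_acquire(fd_cache_t *cache)
{
    /* 1. Reuse a pre-allocated free slot if one exists. */
    for (size_t i = 0; i < cache->capacity; i++) {
        if (cache->slots[i] == NULL) return (int)i;
    }
    /* 2. Otherwise grow the slot array by a factor of 2. */
    size_t old_cap = cache->capacity;
    size_t new_cap = old_cap ? old_cap * 2 : 4;
    void **grown = realloc(cache->slots, new_cap * sizeof(*grown));
    if (grown == NULL) return -1;  /* heap allocation failed */
    for (size_t i = old_cap; i < new_cap; i++) grown[i] = NULL;
    cache->slots = grown;
    cache->capacity = new_cap;
    return (int)old_cap;           /* first newly created slot */
}

The key point is that the slot array only grows when every existing slot is in use, and each growth step doubles its capacity.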

So, upon closer inspection (I didn't actually write this portion of the code, and it's been a few years since I last looked at it), that doubling only applies to the FD list structure, which should be relatively small. The file structure itself is dynamically allocated, and that is the allocation that is failing for you. It grows linearly with the number of open files (and shrinks linearly as they are closed). The number of writes shouldn't matter.

Based on your description, I am even more confused...

As in the code I sent you, I am only creating one file per power cycle and appending to it. I have tried a number of approaches, including adding back the FD cache shrink, and it hasn't changed the behavior. That said, changes to the code do change the file size/number of writes at which it breaks. It seems to fail when running fopen(path, "a").

I get the following console log output:

E (179358) esp_littlefs: Unable to allocate FD
E (179358) DATA: Failed to open file for appending

Meaning that my FILE * pointer is NULL after running fopen(path, "a"). Weirdly, the next time I try to append, I do not get a NULL pointer, and when I then try to write I end up with the stack trace I sent originally:

Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x400014fd  PS      : 0x00060f30  A0      : 0x801b0320  A1      : 0x3ffd99b0  
A2      : 0x00000000  A3      : 0xfffffffc  A4      : 0x000000ff  A5      : 0x0000ff00  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x00000000  A9      : 0x3ffd9970  
A10     : 0x00000000  A11     : 0x3f407138  A12     : 0x3ffd9bf4  A13     : 0x3ffb6d24  
A14     : 0x00000000  A15     : 0x00000001  SAR     : 0x00000004  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  


Backtrace: 0x400014fa:0x3ffd99b0 0x401b031d:0x3ffd99c0 0x401b198d:0x3ffd9cd0 0x401c7135:0x3ffd9d00 0x40098849:0x3ffd9d30 0x400d9237:0x3ffd9d80 0x400d94b3:0x3ffd9e10 0x400d98d5:0x3ffd9ea0 0x40096241:0x3ffd9ec0
0x401b031d: _vfprintf_r at /Users/brnomac003/.gitlab-runner/builds/qR2TxTby/0/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:1528

0x401b198d: vprintf at /Users/brnomac003/.gitlab-runner/builds/qR2TxTby/0/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vprintf.c:34 (discriminator 5)

I'm not sure; I am unable to reproduce this locally. If you can share a complete project with me, I can investigate further. Currently, I can only assume you are running out of memory due to other processes running on your system.

Unfortunately, that is not possible because the project is proprietary and because it runs on a custom board, meaning a lot more would break.

If this were truly a situation where my application was running out of memory, I would have expected the breaking point to be much more random, since other parts of the application could also trigger it. I allocated more memory to the task managing the storage and also added high-water-mark logging to make sure I am not running out of memory, and I am not even close (I have a couple of KB spare). I also set CONFIG_LITTLEFS_CACHE_SIZE to 1 KB (1024), and it still didn't change the break point.

I added logging to the esp_littlefs_allocate_fd function and the file definition is being updated every time the file is opened.
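
For context, the kind of logging I added was along these lines (a sketch; the names are illustrative, not the exact ones from the esp_littlefs source):

/* Hypothetical sketch: a helper like this can be dropped in right before the
 * allocation inside esp_littlefs_allocate_fd (names illustrative).
 * `requested_bytes` is whatever size the calloc is about to ask for. */
#include <stddef.h>
#include <esp_log.h>
#include <esp_system.h>

static void log_fd_alloc_attempt(size_t requested_bytes)
{
    ESP_LOGI("littlefs_dbg", "about to allocate FD struct: %u bytes, free heap: %u bytes",
             (unsigned)requested_bytes, (unsigned)esp_get_free_heap_size());
}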

Without becoming a bit of an expert in littlefs's inner workings, I don't feel I have a lot to work with here. I appreciate your help, but I will probably use a different file system until there's something more to act on.

If you have any thoughts on where I can debug/add some logging that may be helpful, let me know. Appreciate your support @BrianPugh!

This isn't an upstream littlefs issue (I don't know littlefs's inner workings particularly well, either); we know that a calloc is failing to allocate space. uxTaskGetStackHighWaterMark inspects stack space, which is not the immediate issue; calloc allocates from the heap. I would check how much memory is free in your heap, and maybe investigate how fragmented it is. Also, add logging to check how much memory the failing call is attempting to allocate (I bet it's somewhere around 50 bytes).
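
For example, something like this (a sketch using standard ESP-IDF heap APIs), called from your storage task right before each fopen, should show whether the heap or its largest contiguous block is shrinking as the file grows:

/* Sketch: dump free heap and largest contiguous block to spot exhaustion or fragmentation.
 * Uses standard ESP-IDF heap APIs. */
#include <esp_log.h>
#include <esp_heap_caps.h>

static void log_heap_state(const char *tag)
{
    size_t free_bytes    = heap_caps_get_free_size(MALLOC_CAP_8BIT);
    size_t largest_block = heap_caps_get_largest_free_block(MALLOC_CAP_8BIT);
    ESP_LOGI(tag, "heap free: %u bytes, largest free block: %u bytes",
             (unsigned)free_bytes, (unsigned)largest_block);
}

If the total free size stays healthy but the largest free block keeps shrinking, fragmentation rather than raw exhaustion is the likely culprit.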

I would suggest putting together a minimal, complete, reproducible example that you can share.

Closing this issue due to inactivity; I'm pretty sure the underlying issue is that the heap was exhausted.