Dobiasd/frugally-deep

Unexpectedly high memory usage, am I missing something?

mebeim opened this issue · 6 comments

Take the following code:

// g++ -O3 -DNDEBUG -I... example.cpp
#include <fdeep/fdeep.hpp>

int main(int argc, char **argv) {
        const auto model = fdeep::load_model(argv[1], false);
        return 0;
}

The JSON model I am loading through argv[1] is around 600MB and was generated with convert_model.py (FYI: this size is after removing unneeded indentation, as the script was producing a JSON double the size due to json.dump(..., indent=2), but that's a different story). It represents a sequential text processing model with (in order): one embedding layer (embedding matrix of shape (113236, 300)), one bidirectional layer with 150 GRU units, one dense layer with 240 neurons, one dense layer with 50 neurons and a last one with only one neuron.

I have snapshotted memory usage at different execution points through a simple helper header (similar to this one), and the results are as follows:

  • In main, before fdeep::load_model(): max mem 1'032 bytes, cur mem 1'032 bytes
  • In fdeep::read_model() at include/fdeep/model.hpp:296, before loading the JSON: max 9'800 bytes, cur 9'744 bytes.
  • In fdeep::read_model() at include/fdeep/model.hpp:298, after loading the JSON: max 1'130'077'936 bytes, cur 1'130'073'792 bytes (+1.13GB)
  • In fdeep::read_model() at include/fdeep/model.hpp:313, before full_model is created: max 1'130'077'936 bytes, cur 1'130'073'792 bytes (same as previous)
  • In fdeep::read_model() at include/fdeep/model.hpp:323, after full_model is created: max 1'594'362'176 bytes, cur 1'267'924'496 bytes (+137MB)
  • Back in main, after fdeep::load_model(): max 1'594'362'176 bytes, cur 1'378'517'36 bytes (+111MB)

Do the above values look reasonable? I'm asking because I don't know much about fancy C++ memory semantics (especially when dealing with lambdas that borrow objects and similar stuff).

The JSON alone, right after being loaded, adds 1.13GB. After loading the model from it, memory usage increases. I would expect it to decrease at this point, since the loaded JSON object can be discarded and only the model needs to be kept in memory. Furthermore, I would expect data weighing around 568MB in text form (JSON) to be smaller when loaded into memory in binary form, not double the original (text form) size. For comparison, if I save only the weights of the model through Keras, I get a file which is around 138MB.
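As a rough sanity check on that expectation, here is a minimal sketch of the arithmetic. It assumes the weights are held as 4-byte floats, and the helper name matrix_bytes is made up for illustration. Under that assumption the embedding matrix alone accounts for most of the 138MB weights file:

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed for a dense rows x cols matrix,
// assuming 4 bytes per value (32-bit floats) unless stated otherwise.
constexpr std::size_t matrix_bytes(std::size_t rows, std::size_t cols,
                                   std::size_t bytes_per_value = 4) {
    return rows * cols * bytes_per_value;
}

// Embedding matrix of shape (113236, 300):
// 113236 * 300 * 4 = 135'883'200 bytes (about 136MB).
constexpr std::size_t embedding_bytes = matrix_bytes(113236, 300);
```

The GRU and dense layers are tiny by comparison, so anything far above this figure is overhead rather than model data.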

I also saw this comment in the source code of fdeep::read_model():

json_data = {}; // free RAM

but in reality that statement does nothing. I checked by running with verify = true and snapshotting memory usage right before and after the statement: memory usage actually increases after that line of code.

Am I missing something?

Very interesting! Thanks a lot for this investigation! 🚀

Indeed, the measured memory usage (with -O3 -DNDEBUG) seems quite high. I vaguely remember having done some memory optimizations a long time ago. Maybe the // free RAM line did something meaningful back then, but no longer does.

I'll check if I can reproduce the observation, get back to you, and see if I can improve the situation.

So, we have different points here:

  • wasteful whitespace in JSON file
  • maybe an inefficient way to save weights in JSON file
  • json_data unnecessarily hanging around in memory forever

I've tested with a VGG19 model, and with this, the json file (776055763 bytes) is not so much larger than the h5 file (574753104 bytes).

  • Of course, saving without whitespace might be a good idea nonetheless, because the amount likely depends a lot on the model architecture. (Counting the spaces with tr -d -C ' ' <vgg19.json | wc -c gives 6011374.)
  • Another thing that can take up space in the JSON, and that does not exist in the h5, is the test cases. Passing --no-tests to convert_model.py would help here.

But what bothers me the most is the json_data issue. I've also measured with VGG19 and see the same problem you reported:

Memory usage (before fdeep::load_model): 6 MB
Memory usage (before read_model): 6 MB
Memory usage (before loading json_data): 6 MB
Memory usage (after loading json_data): 804 MB
Memory usage (before create_model_layer): 804 MB
Memory usage (after create_model_layer): 1380 MB
Memory usage (before internal::load_test_cases): 1380 MB
Memory usage (after internal::load_test_cases): 1380 MB
Memory usage (before 'json_data = {};'): 1380 MB
Memory usage (after 'json_data = {};'): 1362 MB
Memory usage (after read_model): 1362 MB
Memory usage (after fdeep::load_model): 1362 MB

I'll look deeper into what is going on there. 🧐

Same problem with the following minimal example:

#include <nlohmann/json.hpp>

#include <fstream>
#include <iostream>
#include <string>

// source: https://stackoverflow.com/a/64166/1866775
#include "stdlib.h"
#include "stdio.h"
#include "string.h"
int parseLine(char* line) {
    int i = static_cast<int>(strlen(line));
    const char* p = line;
    while (*p <'0' || *p > '9') p++;
    line[i-3] = '\0';
    i = atoi(p);
    return i;
}
int getMemoryUsageInkB() {
    FILE* file = fopen("/proc/self/status", "r");
    int result = -1;
    char line[128];
    while (fgets(line, 128, file) != NULL){
        if (strncmp(line, "VmSize:", 7) == 0){
            result = parseLine(line);
            break;
        }
    }
    fclose(file);
    return result;
}

void printMem(const std::string& name) {
    std::cout << "Memory usage (" << name << "): " << getMemoryUsageInkB() / 1024 << " MB" << std::endl;
}

int main()
{
    printMem("start");
    {
        std::ifstream in_stream("vgg19.json");
        printMem("after opening ifstream");
        {
            nlohmann::json json_data;
            in_stream >> json_data;
            printMem("after loading json");
            json_data = {};
            printMem("after assigning {} to json object");
        }
        printMem("after json destructor");
    }
    printMem("after ifstream destructor");
}

Memory usage (start): 5 MB
Memory usage (after opening ifstream): 5 MB
Memory usage (after loading json): 803 MB
Memory usage (after assigning {} to json object): 785 MB
Memory usage (after json destructor): 785 MB
Memory usage (after ifstream destructor): 785 MB

I've also tried with different versions of nlohmann/json (2.x), but always got similar results.


Edit: I've just found a section about memory release in the nlohmann/json README.md (https://github.com/nlohmann/json#memory-release). Let's see if this sheds some light on things. 🙂

OK, two things. Until now I was measuring the "Virtual Memory currently used by current process" (https://stackoverflow.com/a/64166/1866775). I've now switched to measuring the "Physical Memory currently used by current process":

// g++ -O3 -DNDEBUG main.cpp
#include <nlohmann/json.hpp>

#include <fstream>
#include <iostream>

#include <malloc.h>

// source: https://stackoverflow.com/a/64166/1866775
#include "stdlib.h"
#include "stdio.h"
#include "string.h"
int parseLine(char* line) {
    int i = static_cast<int>(strlen(line));
    const char* p = line;
    while (*p <'0' || *p > '9') p++;
    line[i-3] = '\0';
    i = atoi(p);
    return i;
}
// Note: despite the name, this returns VmRSS in kB; printMem divides by 1024 to get MB.
int physicalMemorycurrentlyusedbycurrentprocessInMB() {
    FILE* file = fopen("/proc/self/status", "r");
    int result = -1;
    char line[128];
    while (fgets(line, 128, file) != NULL){
        if (strncmp(line, "VmRSS:", 6) == 0){
            result = parseLine(line);
            break;
        }
    }
    fclose(file);
    return result;
}

void printMem(const std::string& name) {
    std::cout << "Memory usage (" << name << "): " << physicalMemorycurrentlyusedbycurrentprocessInMB() / 1024 << " MB" << std::endl;
}

int main()
{
    printMem("start");
    {
        std::ifstream in_stream("vgg19.json");
        printMem("after opening ifstream");
        {
            nlohmann::json json_data;
            in_stream >> json_data;
            printMem("after loading json");
            json_data = {};
            printMem("after assigning {} to json object");
        }
        printMem("after json destructor");
    }
    printMem("after ifstream destructor");
    malloc_trim(0);
    printMem("after malloc_trim(0)");
}
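The /proc/self/status parsing above could also be written with C++ streams. This is only a sketch, assuming the usual Linux "Key:   value kB" line layout. read_status_field_kb and current_process_field_kb are hypothetical names, not part of frugally-deep or nlohmann/json:

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns the value in kB of a status field such as "VmSize" or "VmRSS",
// or -1 if the field is missing or has no numeric value.
long read_status_field_kb(std::istream& status, const std::string& field) {
    const std::string prefix = field + ":";
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0) {
            std::istringstream rest(line.substr(prefix.size()));
            long value_kb = -1;
            rest >> value_kb;  // skips leading whitespace, stops before " kB"
            return rest ? value_kb : -1;
        }
    }
    return -1;
}

// Convenience wrapper for the current process (Linux only).
long current_process_field_kb(const std::string& field) {
    std::ifstream status("/proc/self/status");
    return status ? read_status_field_kb(status, field) : -1;
}
```

Calling current_process_field_kb("VmSize") and current_process_field_kb("VmRSS") side by side shows the virtual and the resident size at the same point in the program.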

However, the memory is still not freed after the json destructor runs. Only malloc_trim(0); actually gives it back to the OS:

Memory usage (start): 1 MB
Memory usage (after opening ifstream): 1 MB
Memory usage (after loading json): 791 MB
Memory usage (after assigning {} to json object): 783 MB
Memory usage (after json destructor): 783 MB
Memory usage (after ifstream destructor): 783 MB
Memory usage (after malloc_trim(0)): 3 MB

But since malloc_trim is Linux-specific (a glibc extension), I don't want to put it in the frugally-deep code.

I guess I'll add a remark to the FAQ.md, pointing to this and recommending calling malloc_trim(0); after fdeep::load_model. What do you think?
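A possible shape for such a FAQ snippet, as a sketch only: release_free_heap_to_os is a hypothetical wrapper, and the guard assumes that checking __GLIBC__ is an adequate way to detect glibc. malloc_trim(0) returns 1 if memory was released back to the OS and 0 otherwise:

```cpp
#include <cstdlib>  // any libc header; on glibc this defines __GLIBC__

#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim is a glibc extension
#endif

// Hypothetical helper: asks the allocator to hand free heap pages back to the OS.
// Returns 1 if memory was released, 0 if nothing could be released,
// and -1 on platforms where malloc_trim is not available.
int release_free_heap_to_os() {
#if defined(__GLIBC__)
    return malloc_trim(0);
#else
    return -1;
#endif
}
```

It would be called once, right after fdeep::load_model returns, so that the memory freed by the temporary JSON object is actually given back.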

@Dobiasd great detective work! So as it turns out, it's an "issue" with the underlying memory allocator (glibc malloc). I did not think about that at all. Mentioning malloc_trim() in the FAQ seems like the right idea for people using Linux + glibc (maybe also mention that this is the default libc on most major distros like Ubuntu, Debian, Arch, etc.).

> Of course, saving without whitespace might be a good idea nonetheless, because the amount likely depends a lot on the model architecture.

Yeah. After all, it's not really meant to be human-readable data, especially for medium/large models. I'd suggest a simple:

with open(out_path, 'w') as f:
    json.dump(json_output, f, separators=(',', ':'))

The separators=(',', ':') part also makes sure no superfluous whitespace is added after commas and colons. Using dump over dumps also avoids the creation of a very large string in memory before passing it to the write_text_file() wrapper. I can submit a PR if you wish, just let me know.

Thank you very much for the detailed responses 👍

Thanks. Your JSON suggestion sounds very good. I'd be happy about a PR. 🙂