ohmplatform/FreedomGPT

3.0.4 runs slowly on a Hyper-V virtual machine with dynamic RAM

Closed this issue · 2 comments

With the Hyper-V virtual machine set to fixed RAM, Task Manager shows 9.5 GB of RAM in use after the model file is loaded, but the "committed memory" section shows only 2.8 GB. Overall RAM usage does go up to 94%.

With dynamic RAM, Task Manager shows RAM usage stopping at 3.8 GB after system boot and never increasing, while the "committed memory" section shows 2.4 GB.

In both cases, the Processes tab in Task Manager doesn't show which process is hogging the RAM.

It seems that with Hyper-V set to dynamic RAM, the model file isn't fully loaded into RAM and partly runs from disk. In other words, the memory allocation done by "server.exe" doesn't trigger Hyper-V to add more RAM to the virtual machine. When I send something to the edge model, I can see very high disk usage, about 144 MB/s of reads.

Maybe "server.exe" just creates a file mapping backed by shared memory, so the file is allowed to be paged out of physical RAM? For comparison, when I start a Minecraft Java Edition server, Hyper-V soon increases dynamic RAM from 4096 MB to 8192 MB, and "committed memory" increases at the same time. I can also see "java.exe" taking 6000 MB of RAM in the Processes tab.

Is there an option or some hack to change the memory allocation method?

If I run "server.exe -h", the console prints some interesting options:

--mlock                   force system to keep model in RAM rather than swapping or compressing
--no-mmap                 do not memory-map model (slower load but may reduce pageouts if not using mlock)
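To try these flags by running "server.exe" directly, the argument list could be put together roughly like this. This is only a sketch: `buildServerArgs` is a hypothetical helper, the model path is a placeholder, and the `-m`, `-c` and `--port` values are simply the ones FreedomGPT 3.0.4 appears to pass. Only `--mlock` and `--no-mmap` come from the help output above.

```typescript
// Hypothetical helper (not part of FreedomGPT): builds an argument list
// for launching server.exe by hand with extra flags in front.
function buildServerArgs(modelPath: string, extraFlags: string[] = []): string[] {
  // "-c 2048" and "--port 8887" are the values FreedomGPT 3.0.4 appears to use.
  return [...extraFlags, "-m", modelPath, "-c", "2048", "--port", "8887"];
}

// Placeholder model path; substitute the real model file.
const args = buildServerArgs("C:\\models\\model.bin", ["--mlock", "--no-mmap"]);

// Print the equivalent command line for copy and paste.
console.log(["server.exe", ...args].join(" "));
```

The resulting array could also be handed to Node's `child_process.spawn` as its second argument.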

I don't know how to pass these two options through FreedomGPT.exe.

I located AppData\Local\FreedomGPT\app-3.0.4\resources\app\renderer\components\Chat\LocalServerHandler.tsx

This file contains the server launch arguments:

    const inferenceProcessConfig = [
      "-m",
      currentModelPath(selectedModel.id),
      "-c",
      "2048",
      "--port",
      "8887",
    ];

However, editing its contents has no effect; the server still launches without the additional arguments.
After many tries I found this line in AppData\Local\FreedomGPT\app-3.0.4\resources\app\main\index.js:

exports.inferenceProcess = (0, child_process_1.spawn)(CHAT_SERVER_LOCATION, config);

Hack this line by inserting new lines so that it looks like this:

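    // Copy the arguments so the original config array is left untouched.
    const hackConfig = config.slice();
    // Prepend the flags; the final order is --no-mmap, --mlock, then the original arguments.
    hackConfig.unshift('--mlock');
    hackConfig.unshift('--no-mmap');
    exports.inferenceProcess = (0, child_process_1.spawn)(CHAT_SERVER_LOCATION, hackConfig);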

By prepending the arguments to a copy of the array and passing the copy in, "server.exe" finally launches with the desired options.
Hyper-V now allocates RAM for "server.exe" normally.
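The same idea can be written as a small helper that never touches the original array and skips flags that are already present. This is just a sketch under my own assumptions: `prependMissingFlags` is a made-up name, not something in FreedomGPT, and the sample `config` only mimics the arguments from LocalServerHandler.tsx with a placeholder model path.

```typescript
// Hypothetical helper, not part of FreedomGPT: returns a new array with any
// flags that are not yet in config placed in front of the existing arguments.
function prependMissingFlags(config: readonly string[], flags: readonly string[]): string[] {
  const missing = flags.filter((flag) => config.indexOf(flag) === -1);
  return [...missing, ...config];
}

// Sample arguments shaped like the ones FreedomGPT passes (placeholder model path).
const config = ["-m", "model.bin", "-c", "2048", "--port", "8887"];
const patched = prependMissingFlags(config, ["--no-mmap", "--mlock"]);

// Applying the helper a second time adds nothing new.
const patchedAgain = prependMissingFlags(patched, ["--no-mmap", "--mlock"]);
console.log(patched.join(" "));
```

In index.js the result would be passed to `spawn` in place of `config`; the helper just makes the patch safe to apply more than once.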

Using array.push() does not work. Modifying the generated JavaScript files does not work either.
The files I tried are:

  • app-3.0.4\resources\app\renderer\.next\server\chunks\718.js
  • app-3.0.4\resources\app\renderer\.next\server\pages\index.js
  • app-3.0.4\resources\app\renderer\.next\static\chunks\pages\index-9109e887bf7b615c.js