mirage/qubes-mirage-firewall

Random qvm shutdown

kennethrrosen opened this issue · 11 comments

I have set my mirage-firewall VM as the NetVM for each of my networked VMs. After some time, the VM randomly shuts down without warning. This happens rather consistently. I have the VM set to start on boot. Once I restart the mirage-firewall VM, it generally doesn't crash again.

Hi, thanks for your feedback, I've had something like this only once over the last few weeks and I couldn't reproduce it.

I currently have the same configuration: mirage-fw is the default netvm and is run at boot time. I had the crash when I lost my WiFi connection. My guess is that mirage-fw couldn't free up enough memory with packets still coming in, so the heap grew until mirage-fw failed to get fresh memory from solo5.

To validate this, can you check whether your log contains a message like the following (grep -C1 "Aborted" /var/log/xen/console/guest-mirage-test.log):

Fatal error: out of memory
Aborted
Solo5: solo5_abort() called

The interesting part should be just above; my logs only show the ARP message about a minute before.
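The log check above can be wrapped in a small shell function. This is a minimal sketch: the name has_oom_abort is hypothetical and not part of Qubes or mirage-fw, and it assumes the console log contains the two fixed strings quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Qubes or mirage-fw): succeed if the given
# console log contains the out-of-memory abort signature quoted above.
has_oom_abort() {
    log_file="$1"
    # -q: no output, exit status only; -F: treat the pattern as a fixed string.
    grep -qF "Fatal error: out of memory" "$log_file" &&
        grep -qF "solo5_abort() called" "$log_file"
}

# Usage in dom0, with the log path taken from the grep command above:
# has_oom_abort /var/log/xen/console/guest-mirage-test.log && echo "OOM abort found"
```

The function only reports whether the signature is present; to see the lines just before the abort, the grep -C1 command above is still the tool to use.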

I can see two easy tweaks to try to fix that:

  • increase the memory given to mirage-fw (I don't like this because the low memory requirement is one of the goals of using mirage-fw)
  • change the percentage of free memory at which the GC call is forced (in . After the GC call, if we still have less than 60% free memory, we may raise a Memory_critical to drop the NAT table in
    let handle_low_memory t =
    )
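The two-step check described in the second tweak can be illustrated with integer arithmetic. This is only a sketch of the decision logic, not the mirage-fw code: the function name memory_status, the gc_trigger_pct parameter and the returned strings are assumptions made for illustration, and only the 60% post-GC figure comes from the comment above.

```shell
#!/bin/sh
# Sketch of the decision logic only, NOT the mirage-fw implementation.
# Arguments are integers in the same unit (e.g. MiB):
#   $1 free memory before GC, $2 free memory after GC, $3 total memory,
#   $4 gc_trigger_pct: hypothetical free-memory percentage below which GC is forced.
memory_status() {
    free_before="$1"; free_after_gc="$2"; total="$3"; gc_trigger_pct="$4"
    critical_pct=60   # post-GC figure mentioned in the thread
    if [ $((free_before * 100)) -gt $((total * gc_trigger_pct)) ]; then
        # Enough free memory: no GC is forced.
        echo "ok"
    elif [ $((free_after_gc * 100)) -lt $((total * critical_pct)) ]; then
        # GC ran but did not free enough: the NAT table would be dropped.
        echo "memory_critical"
    else
        # GC ran and freed enough memory.
        echo "ok_after_gc"
    fi
}

# Example: memory_status 8 12 27 40
```

Raising gc_trigger_pct makes the GC run earlier, which is the kind of change the second tweak proposes.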

If you're ok with the build system, can you try the test-memory branch on my repository?

Apologies for the late reply. With a more detailed walkthrough I could try the test-memory branch, but for now I've slightly increased the memory given to mirage-fw. There have been no further crashes.

Thanks for testing one of the tweaks! Glad it works for you.
Could you tell me how much memory you are now providing to mirage-fw?
For compiling, the easiest way is to run sudo ./build-with-docker.sh; the SHA will not match, as it has not been updated. Then you will need to copy the kernel (dist/qubes-firewall.xen) to the vmlinuz kernel file in dom0, as with the reproducible build.

I'll close this issue; thanks for the information. If you have answers to @palainp's questions, don't hesitate to comment here :)

I hit this today. I have been experiencing random aborts of the firewall VM for a few days. It looks related to moderately heavy traffic (updating templates, opening multiple websites in tabs, etc.).

Fatal error: out of memory
Aborted
Solo5: solo5_abort() called

Are you planning to release a fixed version, or is the easiest workaround to increase the memory assigned to the firewall VM?

@burghardt we are investigating where the memory is kept; the interim workaround is indeed to increase the memory assigned to the firewall VM. Out of curiosity: how much memory do you have assigned to it?

I was using 64 MB for 0.7.1, and increased it to 128 MB after upgrading to 0.8.0 and seeing the first issues. IIRC I started with 32 MB a few releases ago and it never ran OOM. 128 MB should be enough to run Fedora 35 in the firewall VM, which makes using Mirage a bit pointless.

Thanks for your numbers, @burghardt. What you can do to help us is try out @palainp's branch:
git clone -b test-memory https://github.com/palainp/qubes-mirage-firewall ; cd qubes-mirage-firewall ; sudo ./build-with-docker.sh

And use the resulting dist/qubes-firewall.xen with 64MB (or even 32MB). This branch collects garbage slightly earlier.

Looks like @palainp's branch works fine for both the 32 MB and 64 MB setups (admittedly, tested for only about one hour under load).

SHA2 of build:   5b9b16e8d5611e59d90d750ed5cee520743e937d3c9ddd6983c9ca110d11053d  ./dist/qubes-firewall.xen

I set the memory limit to 32 MB. Same firewall performance as far as I can test (iperf3, speedtest, rtorrent).

qvm-prefs mirage-firewall memory 32
qvm-prefs mirage-firewall maxmem 32

Meminfo for 32 MB:

2022-09-13 12:51:44 -00:00: INF [memory_pressure] Writing meminfo: free 19MiB / 27MiB (68.61 %)

And for 64 MB:

2022-09-13 13:06:11 -00:00: INF [memory_pressure] Writing meminfo: free 48MiB / 59MiB (81.37 %)

I will stick with 32 MB for now and report any problems.
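To compare runs, the free and total figures can be pulled out of the memory_pressure meminfo lines quoted above. This is a minimal sketch: the function name meminfo_free_total is hypothetical, and it assumes the log line format stays exactly as shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read console log lines on stdin and print "FREE TOTAL"
# (both in MiB) for every memory_pressure meminfo line; other lines are ignored.
meminfo_free_total() {
    sed -n 's/.*Writing meminfo: free \([0-9][0-9]*\)MiB \/ \([0-9][0-9]*\)MiB.*/\1 \2/p'
}

# Usage in dom0 (the log file name depends on the name of your firewall VM):
# meminfo_free_total < /var/log/xen/console/guest-mirage-firewall.log
```

The percentage printed in the log does not match a ratio recomputed from these two numbers (19 / 27 is about 70%, while the log reports 68.61 %), presumably because the MiB values are rounded for display.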

I likewise increased to 128 MB and it has worked fine.

Thanks for your feedback. We just released 0.8.1 (https://github.com/mirage/qubes-mirage-firewall/releases/tag/v0.8.1), which works fine with 64MB memory (and should also work nicely with 32MB). If you can give that a try, that'd be great.