Excessive memory usage when building GTK for Rust, a problem on entry-level devices used by newcomers (e.g. Raspberry Pi 3+)
I tried to build this Rust project, an on-screen keyboard, directly on a PinePhone:
$ git clone https://source.puri.sm/Librem5/squeekboard
$ cd squeekboard
$ mkdir _build
$ meson _build/
$ cd _build
$ ninja
(You will likely need to install some dependencies before it runs through; the build tells you which ones as you go.)
I expected to see this happen: it works
Instead, this happened: it runs out of memory once it builds GTK for Rust:
This device has 2GB of memory in total, as you can see. Now, before you point out that this may not be the typical developer device: the Raspberry Pi also commonly has memory in that range, and it is a common entry-level device for tinkerers and would-be programmers taking their first steps. Please also note that some people from less fortunate backgrounds may not be able to afford a pricier device. Also, mobile phones used on the go for quick tasks, which might involve rebuilding an app locally as I tried here, are becoming increasingly common. For all programming languages other than Rust that I've ever tried, this is not an issue: compilation may take long, maybe even hours for larger projects, but at least for regular desktop-level applications (outside of giants like browsers or entire compiler toolchains) it usually works, even if only with lots of patience. For Rust, however, this doesn't currently seem to be the case.
I reported the issue to GTK for Rust, the component that makes rustc run out of memory when it is compiled. There it was pointed out that this is likely something that needs to be addressed in the compiler: gtk-rs/gtk#1074
Since this seems like a potentially major accessibility barrier for newcomers and people stuck with less powerful devices to get into the Rust world, I suggest that this situation should be improved.
Please note I am not objecting to this memory usage in environments where it is available, if that speeds up compilation (for caching etc). However, it shouldn't be a required amount to be able to build any basic graphical GTK application at all.
Meta
rustc --version --verbose:
$ rustc --version --verbose
rustc 1.44.0
binary: rustc
commit-hash: unknown
commit-date: unknown
host: aarch64-alpine-linux-musl
release: 1.44.0
LLVM version: 10.0
(this is the compiler directly on the 2GB device, a PinePhone.)
Since you have 4 cores but only 2 GB of RAM, try building with only 1 codegen unit.
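A minimal sketch of one way to apply this, assuming a Cargo version that reads [profile] tables from its config file. The helper name write_codegen_config is made up for illustration, and the target path is supplied by the caller:

```shell
# Hypothetical helper (not part of cargo): append profile settings that
# limit rustc to a single codegen unit to the given cargo config file.
# Assumption: the file has no [profile.dev] or [profile.release] table yet;
# a duplicate table would make the TOML invalid.
write_codegen_config() {
    config_file="$1"
    mkdir -p "$(dirname "$config_file")"
    cat >> "$config_file" <<'EOF'
[profile.dev]
codegen-units = 1

[profile.release]
codegen-units = 1
EOF
}
```

Calling it as write_codegen_config .cargo/config from the directory cargo is invoked in should place the settings where cargo looks for per-project configuration.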
Putting that into .cargo/config for the dev and release profiles, as well as into the Cargo.toml.in that squeekboard uses, got me a little further, but not much:
Compiling gtk v0.7.0
Compiling serde v1.0.117
error: could not compile `gio`.
Caused by:
process didn't exit successfully: `rustc --crate-name gio /home/ellie/.cargo/registry/src/github.com-1ecc6299db9ec823/gio-0.7.0/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C codegen-units=1 -C debuginfo=2 --cfg 'feature="v2_44"' -C metadata=0564254420026841 -C extra-filename=-0564254420026841 --out-dir /home/ellie/Develop/squeekboard/_build/debug/deps -L dependency=/home/ellie/Develop/squeekboard/_build/debug/deps --extern bitflags=/home/ellie/Develop/squeekboard/_build/debug/deps/libbitflags-14f9ed560b3aed83.rmeta --extern fragile=/home/ellie/Develop/squeekboard/_build/debug/deps/libfragile-5b6284edc93a7050.rmeta --extern gio_sys=/home/ellie/Develop/squeekboard/_build/debug/deps/libgio_sys-14cdd38865daa21e.rmeta --extern glib=/home/ellie/Develop/squeekboard/_build/debug/deps/libglib-878ccd51d1057dc4.rmeta --extern glib_sys=/home/ellie/Develop/squeekboard/_build/debug/deps/libglib_sys-70f0d998180d9afb.rmeta --extern gobject_sys=/home/ellie/Develop/squeekboard/_build/debug/deps/libgobject_sys-90c9207e4cbdf53c.rmeta --extern lazy_static=/home/ellie/Develop/squeekboard/_build/debug/deps/liblazy_static-da3d7afec1ac6b6f.rmeta --extern libc=/home/ellie/Develop/squeekboard/_build/debug/deps/liblibc-ef7e34d7ca0b8bbb.rmeta --cap-lints allow` (signal: 9, SIGKILL: kill)
warning: build failed, waiting for other jobs to finish...
(device became unresponsive afterwards)
Also, would there be any way cargo could do this automatically on low memory devices? I'm not 100% sure I applied it correctly, although given codegen-units=1 shows up in the error above I assume I did. But it doesn't seem very optimal that the default first experience is still a system freeze, unless I start tweaking just to get it to work at all.
Nevertheless, if it has done something of use then it seems like it wasn't enough to save the day, sadly.
Try passing -j1 to Cargo if you aren't already. Or run ninja -j1 instead of just ninja, that might also help.
Okay, so I wanted to let it run for a while, since it didn't immediately run out of memory and drop SSH like last time.
However, the result of limiting ninja to one job plus codegen-units=1 was merely that the device hovered at 1.81/1.88GB with SSH still just about working, but without much visible progress. The CPU time of rustc, for example, went up glacially slowly, and I actually went to sleep after watching it compile the gtk package for hours. That duration in itself might still be normal on such a slow CPU. But now, after waking up, the device had fully frozen from running out of memory again, with SSH dead and the screen only responding after a delay of many minutes, as before (so it hasn't crashed, it's just the usual late kernel OOM killer).
So that leaves me with: 1. still wondering whether cargo should default to something more conservative with codegen units on any device with <3GB free memory, 2. rustc overall still seems to go way overboard with memory use when compiling GTK for Rust, in a way that really is a showstopper on a 2GB device, which seems quite unfortunate. (It might be just about workable on a 3-4GB device, I don't know. I might get a chance to test that in late November or December.)
If it's not enabled already, have you tried enabling zram (memory compression through the swap system)? Or if your storage device is fast enough, swap?
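For reference, a rough sketch of setting up zram by hand on Linux. It assumes root privileges, a kernel with the zram module, and the usual mkswap and swapon tools. The function name and the 1G default size are arbitrary choices for illustration, and many distributions ship their own service for this instead:

```shell
# Hypothetical helper: create a compressed swap device in RAM.
# Must run as root; nothing happens until the function is called.
enable_zram() {
    size="${1:-1G}"                           # uncompressed capacity of the device
    modprobe zram                             # creates /dev/zram0 by default
    echo "$size" > /sys/block/zram0/disksize  # must be set before mkswap
    mkswap /dev/zram0
    swapon -p 100 /dev/zram0                  # high priority: used before any disk swap
}
```

Running enable_zram 512M as root would then add a 512MB compressed swap device. The setup does not survive a reboot.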
@the8472 I heard that it works on the 3GB phone variant which I should own soon, but likely only if pretty much all other software is terminated. This seems unreasonable for a compiler, and I feel like we're just talking around the issue that rustc just appears to use too much memory compared to pretty much all other compilers when compiling a comparable program (a basic GTK+ app).
Swap isn't the greatest idea because on the phone you can either put it on eMMC (which kills the phone when it dies) or the SD card (which is more prone to dying early under too heavy writing compared to "proper" desktop storage). zram might work, but only goes so far. Also, it doesn't change the fact that rustc needs ~1.8GB-2GB memory to build a GTK+ app, how is that not at least concerning?
I was merely trying to offer options that are available now.
Memory footprint optimizations rarely are easy and may also involve tradeoffs such as writing more data to disk instead (which would be similar to swapping anyway), disabling some features that keep large amounts of data in memory or doing smaller units of work which may increase compilation times. Even optimizations that require no feature tradeoffs still require development time.
Anyway, another thing you can try is lowering the level of debuginfo. Currently you're building with debuginfo = 2, which can take up considerable amounts of memory (#45854).
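As a sketch, the debuginfo level can be lowered without touching the project files through cargo's profile environment variables. This assumes the dev profile is the one being built (the debug/deps paths in the log above suggest so) and that meson and ninja pass the environment through to cargo unchanged:

```shell
# Ask cargo for reduced debuginfo (level 1) in dev builds instead of the
# full debuginfo (level 2) seen in the failing rustc command line.
export CARGO_PROFILE_DEV_DEBUG=1
```

Running ninja from the same shell afterwards should then show -C debuginfo=1 in the rustc invocations, if the assumptions hold.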
> may also involve tradeoffs such as writing more data to disk instead
Just to point it out, this may still be preferable to general system swap however, since then it is only used during rustc compilation rather than potentially all the time.
> disabling some features that keep large amounts of data in memory or doing smaller units of work which may increase compilation times.
This definitely sounds like something I would be looking for. The problem is also that the many tweaks that keep being suggested are hard to apply for someone who doesn't use Rust themselves (I don't, I just want to test the dev version of that program), in a program with a complicated build tool like ninja (which I also don't use regularly). It would help a lot if rustc detected that little memory is available and then made some aggressive tweaks on its own in favor of lower memory use at the cost of higher compilation time. I'm not sure how doable that is, I'm just pointing it out as IMHO the ideal outcome.
> rather than potentially all the time.
Note that swap aggressiveness can be configured so that swap is only used when there is no other choice.
> and it would help a lot if rustc detected little memory available and then did some aggressive tweaks in favor of lower memory use and higher compilation time on its own.
That may be acceptable for performance settings such as number of compiler threads but choices such as disabling debug info might end up overriding developer intent, e.g. when the developer chose to make a debug build.
> The problem also is the many tweaks that keep being suggested are hard to apply for someone not using Rust themselves (I am not, I just want to test the dev version of that program)
Then note that you're trying to do a slow debug build. You probably want to create a release build instead.
Well I was trying to test out an experimental build to give the developer feedback on the ticket, where having debug symbols might actually be useful if it misbehaves.
> Note that swap aggressiveness can be configured so that it is only used when there is no other choice
This tends to work out poorly in practice, however, especially when combined with tools like earlyoom (which is kind of mandatory nowadays given how bad the kernel OOM killer is). The point where a process is reasonably killed early and the point where the system should start using swap can then be pretty close, and usually I actually prefer the kill since, as written above, general swap use is not such a good idea on eMMC. So I think swap can't be done "half-heartedly": either you're using it constantly on such a low-memory device, or not much at all even when needed. After all, the system cannot really guess that rustc is the process I would actually like to keep running, unlike most other runaway processes that are better stopped.
What I was hoping with this ticket is that there would be some recognition that this problem might be a little out of hand (which fwiw I'm not saying isn't there), and then some thought about some mechanisms to curb it automatically instead of leaving the user kind of stranded with difficult tweaks. Some mechanism to manually swap things to disk in rustc definitely sounds like the way to go IMHO, with that being connected to some way of actual free memory detection on the actual system at hand. Or of course algorithmic tweaks that improve this generally, although I understand all the obvious ones might already be done here.
Edit: just to support this further, look at the discussion here: gtk-rs/gtk#1074. At some point rustc users just seem to be at a loss as to how to adjust and break up their project to work around this, which is why I think that at a certain scale it would help if rustc considered it its own responsibility not to assume everything can stay in memory without swapping to disk, given how naturally memory-heavy Rust compilation appears to be. There just seems to be an eventual problem of scale with that approach, which in the end excludes users of lower-spec devices (even if they would otherwise be patient enough to endure an excessive wait time).
> Well I was trying to test out an experimental build to give the developer feedback on the ticket, where having debug symbols might actually be useful if it misbehaves.
You can still try lowering debug info from 2 to 1.
I think there's definitely recognition that rustc is a memory hog, based on discussions in the rust-lang Zulip. Some contributors, including myself, have to reduce the number of jobs when building rustc or else risk death by thrashing. I've just started looking into ways to reduce memory consumption, and I'm already seeing that there are potential reductions that I'm hopeful are realizable.
As for actively adapting to memory usage, I wonder if it would be appropriate for cargo to try to manage this in part by adjusting the number of rustc instances dynamically based on some heuristic.
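To make the idea concrete, here is a rough sketch of such a heuristic as a wrapper script, not as anything cargo does today. It assumes Linux (/proc/meminfo with a MemAvailable field) and a guessed budget of 2GB per rustc process; the function names and the budget are made up for illustration:

```shell
# Hypothetical heuristic: allow one job per "budget" of available memory,
# but never fewer than one job and never more than the number of cores.
suggest_jobs() {
    avail_kb="$1"               # available memory in kB
    cores="$2"                  # number of CPU cores
    budget_kb="${3:-2097152}"   # assumed memory need per rustc process (2GB)
    n=$((avail_kb / budget_kb))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    if [ "$n" -gt "$cores" ]; then
        n="$cores"
    fi
    echo "$n"
}

# Read MemAvailable (reported in kB) from /proc/meminfo.
available_kb() {
    awk '/^MemAvailable:/ { print $2 }' /proc/meminfo
}
```

A build could then be started with ninja -j"$(suggest_jobs "$(available_kb)" "$(nproc)")". This only picks the job count once at startup; adjusting it dynamically while the build runs, as suggested above, would need support in cargo itself.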
I would like to point out that this could more easily be worked around if cargo provided an option to force some or all crate dependencies to be linked dynamically rather than statically, ABI stability be damned. :^)
Stability between releases should be good enough for such use-cases and this should prevent most of these OOM scenarios. It just takes less memory to dynamically link to .so libraries than it does to statically link everything into one big blob.