CRaC/docs

Checkpoint after restore not working

klauswuestefeld opened this issue · 4 comments

Hi, I tried several different ways to checkpoint a restored VM. Nothing worked.

Is it supposed to? If not, that should be clearly documented.

Repeated checkpoints/restores of the same process are extremely useful for in-memory object databases. Regular serialization is orders of magnitude slower than a memory dump.

rvansa commented

I agree that it could be very useful, but you're right, it does not work ATM. The problem is actually in our fork of CRIU, which has an optimization that maps the checkpointed image into memory rather than reading it (as upstream CRIU does); this allows for order-of-magnitude faster restores. However, this means that if you try to checkpoint to the same directory, it will overwrite the original checkpoint, which the restored process still has mapped.
There are two ways to work around it: you can use a different image directory for each checkpoint, though this naturally blows up the storage needs. The other is to disable the optimization, which is currently possible through the env var CRAC_CRIU_OPTS=-no-mmap-page-image.
I tried to fix it in openjdk/crac#57, but architecturally this is not the right solution: the mess comes from CRIU, so it should be fixed in CRIU. It could leave a parasite thread that would do the protect-copy-remap cycle, or orchestrate this from an outside process (ptracing the original one)...
Regrettably, this has not been marked a priority so far. I am here to answer any questions I can, should you want to pursue it yourselves.
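
A minimal sketch of the two workarounds above, under stated assumptions. It only builds the restore command line and environment and does not launch anything. The `cr-N` directory naming, the `/tmp/images` base path and the class and method names are made up for illustration. `-XX:CRaCRestoreFrom` is the usual CRaC restore flag and is not mentioned in this thread. Whether `-XX:CRaCCheckpointTo` is honored on restore in a given build should be verified rather than assumed.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch only: builds (does not run) a restore command that reads one image
// directory and points the next checkpoint at a different one.
final class CracRestoreCommand {

    // Generation N lives in <base>/cr-N; this naming scheme is an assumption.
    static Path imageDir(Path base, int generation) {
        return base.resolve("cr-" + generation);
    }

    // Workaround 1: restore from generation N, checkpoint into generation N + 1,
    // so the mapped image of generation N is never overwritten.
    static List<String> restoreCommand(Path base, int generation) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-XX:CRaCRestoreFrom=" + imageDir(base, generation));
        cmd.add("-XX:CRaCCheckpointTo=" + imageDir(base, generation + 1));
        return cmd;
    }

    // Workaround 2: disable the mmap optimization via the env var from the thread.
    static void disableMmap(Map<String, String> env) {
        env.put("CRAC_CRIU_OPTS", "-no-mmap-page-image");
    }

    public static void main(String[] args) {
        // Hypothetical base directory; prints the command instead of starting it.
        ProcessBuilder pb = new ProcessBuilder(restoreCommand(Path.of("/tmp/images"), 1));
        System.out.println(String.join(" ", pb.command()));
    }
}
```

To try the second workaround instead, `disableMmap(pb.environment())` would be called before `pb.start()`.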

klauswuestefeld commented

Thanks @rvansa!

There are two ways to work around it: you can use a different image directory for each checkpoint, though this naturally blows up the storage needs.

  1. That's fine! I can just delete old checkpoints. How can I do that?

I tried adding the arg -XX:CRaCCheckpointTo=NEW-PATH also when restoring but that didn't work.
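
For the clean-up half of the question, a minimal sketch of pruning old image directories, assuming each checkpoint goes to its own `cr-N` directory under a common base. That layout, the class name and the `keep` parameter are hypothetical and not part of CRaC. The sketch keeps the `keep` newest generations and deletes the rest. It has no idea which image a running process was restored from, so choosing a safe `keep` is left to the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch only: removes all but the newest `keep` checkpoint directories.
final class CheckpointPruner {

    // Parses N from a directory named "cr-N"; returns -1 for any other name.
    static int generationOf(Path dir) {
        String name = dir.getFileName().toString();
        if (!name.startsWith("cr-")) {
            return -1;
        }
        try {
            return Integer.parseInt(name.substring(3));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Deletes the oldest cr-N directories under base and returns what was deleted.
    static List<Path> prune(Path base, int keep) throws IOException {
        List<Path> generations;
        try (Stream<Path> children = Files.list(base)) {
            generations = children
                    .filter(Files::isDirectory)
                    .filter(p -> generationOf(p) >= 0)
                    .sorted(Comparator.comparingInt(CheckpointPruner::generationOf))
                    .collect(Collectors.toList());
        }
        int toDelete = Math.max(0, generations.size() - Math.max(0, keep));
        List<Path> deleted = new ArrayList<>();
        for (Path dir : generations.subList(0, toDelete)) {
            deleteRecursively(dir);
            deleted.add(dir);
        }
        return deleted;
    }

    // Deletes children before parents by walking the tree in reverse path order.
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java CheckpointPruner.java <base-dir>  (keeps only the newest image)
        System.out.println("deleted: " + prune(Path.of(args[0]), 1));
    }
}
```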

The other is to disable the optimization, which is currently possible through the env var CRAC_CRIU_OPTS=-no-mmap-page-image.

  1. If I use the env var CRAC_CRIU_OPTS=-no-mmap-page-image to disable the optimization, will -XX:CRaCCheckpointTo=NEW-PATH work when doing a restore for a subsequent checkpoint?

From your CRIU fork readme:

"CRIU project is (almost) the never-ending story, because we have to always keep up with the Linux kernel supporting checkpoint and restore for all the features it provides."

Is a fork really inevitable? It makes me really worried for the project.

  1. Won't CRIU accept contribs for an order-of-magnitude faster startup mode? They themselves contrib to the Kernel even, so they know what it is to need a contrib upstream. CRIU already has a few modes like lazy restore, incremental and remote.

  2. I got subsequent checkpoints/restores to work with the latest stable CRIU release (3.17.1-3) on a simple Java project on a regular JVM. Should I expect trouble for larger projects even if I control all I/O such as files and sockets or can I just go with CRIU?

  3. Is the optimization the only reason to fork CRIU?

  4. Would you like to move this discussion somewhere else?

Cheers

rvansa commented

I tried adding the arg -XX:CRaCCheckpointTo=NEW-PATH also when restoring but that didn't work.

Hmm, what was the issue? You can enable more verbose logging on restore using -XX:CREngine=criuengine,--verbosity=4. Last time I checked, this worked...

If I use the env var CRAC_CRIU_OPTS=-no-mmap-page-image to disable the optimization, will -XX:CRaCCheckpointTo=NEW-PATH work when doing a restore for a subsequent checkpoint?

I'm not sure what the question is.

Won't CRIU accept contribs for an order-of-magnitude faster startup mode? They themselves contrib to the Kernel even, so they know what it is to need a contrib upstream. CRIU already has a few modes like lazy restore, incremental and remote.

CRIU tries to restore the process in exactly the same state as it was before, and having some pages mmapped from disk rather than being anonymous memory is a significant difference that might cause trouble here and there (such as this inability to do repeated checkpoints). We have other fixes in place, e.g. related to standard FDs: again something that doesn't make the process an exact copy, but is very practical for our use case. So having it upstream is not really likely, as it breaks the goal of the project.
On the other hand, CRaC requires a less powerful CRIU: we require the app to close most of its external resources before checkpoint, and the JVM does not use all the dark magic Linux offers. At the same time, it is not too hard to merge upstream changes back from time to time.
Eventually, we might have an alternative implementation to CRIU; there have been POCs already, and for Java processes we might not even need all the elevated capabilities.
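
To illustrate what "close most of the external resources before checkpoint" can look like on the application side, here is a minimal sketch that runs on a regular JDK. The `CheckpointHooks` interface is a hypothetical stand-in declared only so the example compiles without CRaC. In a real CRaC application the analogous callbacks come from the `Resource` interface of the `org.crac` / `jdk.crac` API, which take a context argument and are registered with the global context. `JournalFile` and its methods are made up for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the CRaC callback interface (not the real API).
interface CheckpointHooks {
    void beforeCheckpoint() throws IOException;
    void afterRestore() throws IOException;
}

// Keeps an append-only file open during normal operation, but not across a checkpoint.
final class JournalFile implements CheckpointHooks {
    private final Path path;
    private FileChannel channel;

    JournalFile(Path path) throws IOException {
        this.path = path;
        open();
    }

    private void open() throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    boolean isOpen() {
        return channel != null && channel.isOpen();
    }

    void append(String text) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }

    @Override
    public void beforeCheckpoint() throws IOException {
        channel.force(true); // flush pending writes to disk
        channel.close();     // leave no open file descriptor at checkpoint time
        channel = null;
    }

    @Override
    public void afterRestore() throws IOException {
        open(); // reacquire the descriptor in the restored process
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("journal", ".log");
        JournalFile journal = new JournalFile(file);
        journal.append("before\n");
        journal.beforeCheckpoint(); // a checkpoint would be taken here
        journal.afterRestore();     // and the process restored here
        journal.append("after\n");
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```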

I got subsequent checkpoints/restores to work with the latest stable CRIU release (3.17.1-3) on a simple Java project on a regular JVM. Should I expect trouble for larger projects even if I control all I/O such as files and sockets or can I just go with CRIU?

If you got the restore working, there's most likely no 'catch' that you'd hit later at runtime. OpenJ9 has something similar to CRaC and, AFAIK, it uses unpatched CRIU. But even aside from the optimization, we've identified use cases where upstream CRIU was less practical.

Would you like to move this discussion somewhere else?

This is fine; if you want to ask more interactively, you're welcome in the #crac channel on https://foojay.slack.com