spring-projects/spring-boot

Improve exploded structure experience for efficient deployments

snicoll opened this issue · 27 comments

The reference guide has a section on unpacking the executable jar to get an extra boost. It also has an additional hint for bypassing the bootloader, at the cost of losing classpath ordering.

Our work on AppCDS (see spring-projects/spring-framework#31497) has shown that a predictable classpath has an impact on how effective the cache is going to be. Investigating a bit more, it looks doable to provide the above as a first-class concept.

Here is a proposal that is hacking layertools with an additional extract2(sic) command: snicoll@a67215a

Ignoring the fact that this does use layertools for convenience, you can execute the following on any repackaged archive:

java -Djarmode=layertools -jar target/my-app-1.0.0-SNAPSHOT.jar extract2 --destination target/app

This creates a directory with the following structure:

target/app/
├── application
│   └── my-app-1.0.0-SNAPSHOT.jar
├── dependencies
│   ├── ...
│   ├── spring-context-6.1.0-RC2.jar
│   ├── spring-context-support-6.1.0-RC2.jar
│   ├── ...
└── run-app.jar

You can then run the app as follows (assuming you're in the same directory as the previous command):

java -jar target/app/run-app.jar

The run-app.jar has the following characteristics:

  • It starts the application class directly (no bootloader used)
  • The manifest defines a classpath with the same order as classpath.idx, with application/my-app-1.0.0-SNAPSHOT.jar being in front
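
As a rough illustration of the two characteristics above, here is a minimal sketch of how such a manifest could be assembled. It is not the code from the prototype commit. It assumes that classpath.idx lines look like `- "BOOT-INF/lib/<name>.jar"` and that every library ends up flat in dependencies/; the class name RunAppManifestSketch and the start class passed in are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Spring Boot or of the prototype.
final class RunAppManifestSketch {

    // Maps classpath.idx lines (assumed format: - "BOOT-INF/lib/a.jar")
    // to paths relative to run-app.jar, keeping the index order.
    static List<String> classPathEntries(String applicationJar, List<String> indexLines) {
        List<String> entries = new ArrayList<>();
        // The application jar goes in front of all dependencies.
        entries.add("application/" + applicationJar);
        for (String line : indexLines) {
            String trimmed = line.trim();
            if (trimmed.length() < 4 || !trimmed.startsWith("- \"") || !trimmed.endsWith("\"")) {
                continue; // skip blank or unexpected lines
            }
            String path = trimmed.substring(3, trimmed.length() - 1);
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            entries.add("dependencies/" + fileName);
        }
        return entries;
    }

    // Builds a manifest that starts the application class directly
    // and declares the classpath in the computed order.
    static Manifest manifest(String startClass, List<String> entries) {
        Manifest manifest = new Manifest();
        Attributes attributes = manifest.getMainAttributes();
        attributes.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attributes.put(Attributes.Name.MAIN_CLASS, startClass);
        // Class-Path entries are separated by spaces in a jar manifest.
        attributes.put(Attributes.Name.CLASS_PATH, String.join(" ", entries));
        return manifest;
    }
}
```

The resulting Manifest could then be written with java.util.jar.JarOutputStream to produce a jar that contains nothing but the manifest.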

my-app-1.0.0-SNAPSHOT.jar has some manifest entries of the original jar so that package.getImplementationVersion() continues to work.
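
To make the point about manifest entries concrete, here is a small sketch of carrying a few attributes over from the original manifest. Which entries the prototype actually keeps is not stated here, so the list below (the Implementation-* attributes) is an assumption, and ApplicationJarManifestSketch is a hypothetical name.

```java
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Spring Boot or of the prototype.
final class ApplicationJarManifestSketch {

    // Assumption: these are the entries worth keeping. The
    // Implementation-* attributes back Package#getImplementationVersion()
    // and its siblings.
    private static final List<Attributes.Name> RETAINED = List.of(
            Attributes.Name.IMPLEMENTATION_TITLE,
            Attributes.Name.IMPLEMENTATION_VERSION,
            Attributes.Name.IMPLEMENTATION_VENDOR);

    // Returns a new manifest holding only the retained attributes
    // that are present in the original manifest.
    static Manifest retainImplementationAttributes(Manifest original) {
        Manifest result = new Manifest();
        Attributes target = result.getMainAttributes();
        target.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        Attributes source = original.getMainAttributes();
        for (Attributes.Name name : RETAINED) {
            String value = source.getValue(name);
            if (value != null) {
                target.put(name, value);
            }
        }
        return result;
    }
}
```

Launcher-specific entries such as Main-Class are deliberately not copied in this sketch, since the application jar is no longer the entry point.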

The work on this prototype has led to a number of questions:

  • Layertools provides a Command infrastructure and several utilities related to extracting/copying that are not specific to layers. If we want to go the same route with a different jar mode, a significant number of classes would have to be copied/reimplemented. Perhaps a single jar with a public API and sub-packages for layertools and this new mode could be an option?
  • This work could potentially apply to layers themselves. Rather than having dependencies/BOOT-INF/lib and a layer for the bootloader, we could use the same exact structure. Or perhaps the structure above could become a layer configuration (where all libs go to dependencies).
  • It's hard for run-app.jar to have sensible file attributes. The command tries to respect the file attributes of the files it extracts from the repackaged archive, but run-app.jar is created on the spot.
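
On the last point, one conceivable option (purely an illustration, not something the prototype is known to do) would be to borrow the last-modified time of the repackaged archive for the generated file. The class and method names below are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical helper, not part of Spring Boot or of the prototype.
final class FileAttributesSketch {

    // Gives the generated file (e.g. run-app.jar) the same last-modified
    // time as the source archive it was derived from.
    static void copyLastModifiedTime(Path sourceArchive, Path generatedFile) throws IOException {
        FileTime time = Files.getLastModifiedTime(sourceArchive);
        Files.setLastModifiedTime(generatedFile, time);
    }
}
```

This only covers the timestamp; permissions and ownership would need a separate decision.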

We have confirmed that with the prototype, AppCDS is effective (close to 95% classes loaded from the cache).

About performance: in my project, agents exchange data via zip files. In one case I had 1k+ zips nested inside another zip, and preparing the data for an agent by unzipping all the archives took minutes.
After switching to com.google.jimfs, the preparation time dropped to hundreds of milliseconds.
So, by spending additional memory, you can achieve almost zero unpacking time without making any changes to the structure of the archive.
You can see the general concept here:
https://github.com/sergmain/metaheuristic/blob/master/apps/commons/src/test/java/ai/metaheuristic/commons/utils/ZipUtilsTest.java

pom dependency

<dependency>
    <groupId>com.google.jimfs</groupId>
    <artifactId>jimfs</artifactId>
    <version>1.2</version>
</dependency>

repo is here - https://github.com/google/jimfs

@bclozel and I had a bit of brainstorming on this one and he raised a use case that could become problematic. The solution above ignores the Spring Boot launcher on purpose to make the startup as straightforward as possible. However, it currently does not allow augmenting the classpath, which can be a problem. I don't know if this is something we need to take into consideration, but the POC could be adapted accordingly.