paketo-buildpacks/maven

[feature request] support multiple artifacts of different types.

Closed this issue ยท 8 comments

What happened?

The maven buildpack is expecting a single build artifact of type either jar or war, which can be explicitly configured using BP_MAVEN_BUILT_ARTIFACT.
This artifact will be extracted and passed on to downstream buildpacks, like the native image buildpack.

The argument BP_MAVEN_BUILD_ARGUMENTS="-Dmaven.test.skip=true -Dquarkus.package.type=native-sources package", to create Quarkus native image sources for a separate native image build, will generate multiple artifacts that are required to build a native image eventually with the native image buildpack:

target/native-sources/native-image.args
target/native-sources/lib/
target/native-sources/getting-started-1.0.0-SNAPSHOT-runner.jar

A potential solution for this usecase, and maybe others, is to allow passing a list of artifacts to BP_MAVEN_BUILT_ARTIFACT, which items are either extracted (jar or war) or copied (e.g. lib folder)

Build Configuration

  • What platform (pack, kpack, tekton buildpacks plugin, etc.) are you
    using? Please include a version.

  • What buildpacks are you using? Please include versions.
    TBS

Checklist

  • I have included log output.
  • The log output includes an error message.
  • I have included steps for reproduction.

Hi guys.

I am taking a look at this issue. here are my thoughts about it. The env BP_MAVEN_BUILT_ARTIFACT is a regular expression with the pattern to be evaluated to the potential artifacts we want to take care of it, by default the value is target/*.[ejw]ar which is suggesting to deal with *.ear, *.war or *.jar artifacts, all of them are some kind of compressed files.

I did the following test with the Quarkus getting-started guide:

pack build quarkus-getting-started -e BP_MAVEN_BUILD_ARGUMENTS="-Dquarkus.package.type=native-sources package" -e BP_MAVEN_BUILT_ARTIFACT="target/native-sources/*.*" -p .

Screen Shot 2021-11-23 at 10 26 57 AM

What I can understand is the following BP_MAVEN_BUILT_ARTIFACT can handle whatever I want, in this case, a whole folder native-sources but the ArtifactResolver is not prepare to handle a different kind of file that is not a compressed file (like a ear, war or jar).

Am I going in the right direction?

I've been thinking about a possible solution to this issue, here is one idea. Based on my previous comment BP_MAVEN_BUILT_ARTIFACT is implemented right now as a regular expression also after evaluating this regular expression the current code is expecting just 1 artifact as result to be sent to the next phase (native-image for example).

What about adding a new environment variable BP_MAVEN_BUILT_ARTIFACT_FORMAT, initially the values of this variable could be reg (Default, which means regular expression) or folder. When this value is set to folder it expects the other environment value. BP_MAVEN_BUILT_ARTIFACT to be set and also to point to some folder, for example BP_MAVEN_BUILT_ARTIFACT="target/native-sources". In this case, the Buildpack will do the following actions:

  • It would compressed the whole folder and create 1 artifact
  • It would send that artifact to the next Buildpack

With this solution we can:

  • Keep the current behavior as it is without changing the rule of expecting just 1 artifact
  • In the future extend the behavior enabling more formats to deal with

Hi guys,

I would like to share some news about this issue. I tried to take advantage of the maven assembly plugin the idea is to have distribution xml in the maven paketo buildpack and inject that one during building time (not sure how to inject it yet). The distribution xml is going to create a .zip from the output folder defined by the user.

This is an example.

pack build quarkus-getting-started-native \
          -e BP_NATIVE_IMAGE=true \
          -e BP_MAVEN_BUILD_ARGUMENTS="-Dquarkus.package.type=native-sources package" \ 
          -e BP_MAVEN_BUILT_ARTIFACT_FORMAT=zip \ 
          -e BP_MAVEN_BUILT_ARTIFACT="target/getting-started-1.0.0-SNAPSHOT-bin.zip" \ 
          -e BP_NATIVE_IMAGE_BUILD_ARGUMENTS="$(cat target/native-sources/native-image.args)" 
     -p .  (more buildpacks...)

Important to notice

  • I am adding a new environment variable BP_MAVEN_BUILT_ARTIFACT_FORMAT which I think can be use in the future to add more format based on the maven assembly plugin
  • I am using the native-image.args created by Quarkus to help on the native image creation

The output is actually the file getting-started-1.0.0-SNAPSHOT-bin.zip with all the files needed to create the native-image, this file is send to the native-image buildpack

Yes, so after the artifact resolver locates the artifact, it will copy it to a layer. This is what allows us to persist the artifact across builds and cache it. There is an assumption that the item saved is a zip file (EAR/WAR/JAR typically, but not enforced).

What then happens is that we delete all of the application source code (basically rm -rf /workspace) as none of that this needed in the final image & we unzip the artifact that was saved into /workspace. The net result is that you end up with the same /workspace as if you built locally and performed a pack build -p my-app.jar.

There are two ways that we can make this work:

  1. If the artifact resolver finds multiple items, it'll zip them all up and store them in the layer in a single zip file. The file would then get extracted into /workspace as I described above, but you'd have all of the resolved artifacts.

  2. There is some overhead with #1, in that you'd be zipping up artifacts that are possibly already zipped (like if you have a file plus a WAR/JAR). These files may also be fairly large and so the overhead of zipping could be noticed if you have a couple of hundred-megabyte WAR/JAR. We could also look at removing the assumption about what is stored in the layer, i.e. not requiring a ZIP. We could then simply copy all artifacts to the layer. This would complicate the restoration logic though. If there are multiple objects, we'd simply copy all of those back to /workspace, however, if there is only one item and it's a ZIP then we'd need to extract that to /workspace to retain the current behavior.

I'd like to go with option #2, but I need to see how much additional effort this requires.

After adding this support I think that having a single artifact will still remain the most common use case, so I worry about the case where someone accidentally has multiple artifacts. This has been something that happens with Gradle projects. If you build locally, then build with buildpacks, you can end up with multiple archives in the build folder. If we change the behavior to support multiple artifacts, we'll need to be careful how we handle this use case.

I could potentially see adding some protections here, like an env variable BP_MAVEN_EXPECTED_ARTIFACT_COUNT which needs to match the number of artifacts that are found. It would default to 1, which would retain the current behavior. How do folks feel about that? Any other suggestions on how we could potentially implement this?

Hi @dmikusa-pivotal ,

Thanks for the reply, I just created a draft PR. the PR is addressing the idea #1 but in order to do that I am using the maven assembly plugin. I haven't add unit tests yet, but I would like to get any feedback regarding the approach.

To enable the feature I added the following environment variable BP_MAVEN_SUPPORT_MULTIPLE_ARTIFACTS when this environment variable is enable the buildpack will try to inject the distribution xml file to create a zip file from an output folder, defined by BP_MAVEN_BUILT_ARTIFACT_NATIVE_SOURCE and configure to default to native-sources (for Quarkus) but it can also be configure by the end user

@jjbustamante It's awesome that you were able to get this working with the Maven assembly plugin, and thanks for wrapping that up in a PR but we can't accept that PR.

I should have explained this better in my previous post, sorry. The functionality I was walking through is not in the Maven buildpack. It's actually part of libbs, which is a library that we share across all of our build-system style buildpacks (Maven, Gradle, SBT, Clojure Tools, etc...). If we make a change to support multiple artifacts, it has to be done in libbs so that it works across all of the build systems we support.

Using a specific Maven plugin to achieve this is only going to work for Maven.

A PR for this is going to have to update libbs:

  • This is the artifact resolver. This is going to need a new method that resolves and returns multiple artifacts (probably return []string). We need a new function because we can't change the existing API in libbs without a major version bump. We can add new methods though, new functionality can be done in a minor version bump.

  • This is the code in libbs that calls the artifact resolver & copies the artifact to the layer. We'd need to update this to use the new artifact resolver. It's also going to need to handle both cases, saving a single artifact or saving multiple artifacts. See my previous note for why we need to do this.

  • This is the code in libbs that is deleting the source & restoring the contents of the application ZIP to /workspace. This is going to need to be modified to again handle restoring both single or multiple artifacts.

  • On the plus side, I don't think you'd need to actually modify any of the individual buildpacks, like maven or gradle. I believe all the logic is encapsulated in libbs. We will need to bump the go.mod file to use a new version of libbs, but our CI will handle that.

  • Testing libbs locally is kind of a pain. It's not a buildpack itself, so what you need to do is take your target buildpack, Maven in this case, and edit the go.mod file. At the bottom, add replace github.com/paketo-buildpacks/libbs v1.10.0 => /path/to/your/local/libbs/folder. Then run go mod tidy. Now your Maven buildpack should use the local copy of your libbs. You can then package up the maven buildpack and use it in tests.

It's a lot of work, definitely jumping into the deep end of the pool on this PR. If you want to give it a shot, give me a ๐Ÿ‘. Otherwise, I'll probably have time to work on this issue next week.

Hi @dmikusa-pivotal

It's ok if the PR can't be merged, I understand the reasons for me was a great way to learn the internal details of the Buildpack implementations. I can give it a shot to the changes in the libbs project ๐Ÿ‘ in fact, I was at the beginning I was doing that and I identified the ArtifactResolver but I thought the idea was to add the feature just to the Maven Buildpack.

Right now, I am working in some changes in the native-image Buildpack, those changes assume all the artifacts requiere to build the image are in the /workspace, let me create a draft PR of those changes so you can give me feedback on that, then I will take a look on experimenting on libbs

Fantastic, I'll keep an eye out for some PRs. If you have questions, don't hesitate to reach out. Happy to discuss more. Our Slack is a good place for discussions like this. If you're on Slack, we're at https://paketobuildpacks.slack.com. All are welcome to join.