git-commit-id/git-commit-id-maven-plugin

slow execution

delanym opened this issue · 8 comments

The execution is too slow. It took 8 real seconds to mvn initialize my project. Sometimes its more. It varies by as much as a second.

<plugin>
  <groupId>pl.project13.maven</groupId>
  <artifactId>git-commit-id-plugin</artifactId>
  <version>4.0.0</version>
  <executions>
    <execution>
      <goals>
        <goal>revision</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <dateFormat>yyyyMMddHHmm</dateFormat>
    <dotGitDirectory>${project.basedir}/../.git</dotGitDirectory>
    <skipPoms>false</skipPoms>
    <generateGitPropertiesFile>true</generateGitPropertiesFile>
    <generateGitPropertiesFilename>${project.basedir}/../git.properties</generateGitPropertiesFilename>
    <failOnNoGitDirectory>true</failOnNoGitDirectory>
    <abbrevLength>7</abbrevLength>
    <skip>${skip.git.commit}</skip>
    <runOnlyOnce>true</runOnlyOnce>
    <gitDescribe>
      <skip>${skip.git.commit}</skip>
      <always>true</always>
      <abbrev>7</abbrev>
      <dirty>-dirty</dirty>
      <match>*</match>
      <forceLongFormat>false</forceLongFormat>
    </gitDescribe-->
  </configuration>
</plugin>

Most of the wait time is spent after the line [DEBUG] remaining output:

[DEBUG] readpipe [git, config, --system, --edit],/usr/bin
[DEBUG] readpipe may return '/etc/gitconfig'
[DEBUG] remaining output:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.411 s
[INFO] Finished at: 2019-12-20T12:44:22+02:00
[INFO] ------------------------------------------------------------------------

Anything more than a second is too long. I have 400 projects, and when I build them with their parent I can set runOnlyOnce=true.
But sometimes I just need to rebuild 10 of them and I'll use a bash function for that. But then runOnlyOnce is useless. 10*8 seconds is a lot of wait time!

Its slow on my system, but for 5 of my colleagues building in WSL1 its intolerable.

git version 2.24.0
Apache Maven 3.6.0
openjdk version "1.8.0_232"
Ubuntu 19.04 \n \l
zsh 5.5.1 (x86_64-ubuntu-linux-gnu)

Hello,
thank you for creating an issue here.
The execution time of the plugin can vary on various factors (e.g. amount of tags, amount of commits, amount of branches, ....) and what IMHO makes the most difference is what what you are asking the plugin to generate. Saying what the actual 'delay' is therefore not really that easy.

In your case I would see if the following could make a difference:

  • enable the native git implementation (as of now you are using the java-git-bridge which is known to be slower) -- to enable this set <useNativeGit>true</useNativeGit>
  • set the plugin to offline-mode via <offline>true</offline>
  • disable the generation of properties that you are not using. In most cases users only need a specific subset of properties (e.g. the commit hash and branch), but the plugin is generating a lot more. Each property not needed will take time and doesn't add a benefit when not consumed. You have full control what properties are generated with specifying them via excludeProperties or includeOnlyProperties. Certainly when not needed the plugin will not attempt to waste time in generating them.

In your case, perhaps start with (or whatever you need)

<includeOnlyProperties>
  <includeOnlyProperty>^git.build.(time|version)$</includeOnlyProperty>
  <includeOnlyProperty>^git.commit.id.(abbrev|full)$</includeOnlyProperty>
</includeOnlyProperties>

Unfortunately I'm not to sure what causes the ~1 second fluctuation you are seeing. This can again be due to the jgit-implementation or the fact that the plugin performs a git fetch before generating the properties. This can be disabled via <offline>true</offline>. However I should also emphasis that this is the overall execution and based on your comment it is unclear how much of the time is actually spend within the plugin (e.g. use https://github.com/takari/maven-profiler to really blame the fluctuation on the plugin here)

Now to your last point:
Are those 10 projects all child-projects of the reactor build? If so, why not using the --projects-maven argument?
From the help:

-pl,--projects <arg>
    Comma-delimited list of specified reactor projects to build instead of all projects. A project can be specified by [groupId]:artifactId or by its relative path. 

If 10 separate projects apply the config mentioned above to all of them.

PS:
With version 4.0.0 I did a benchmark where you roughly see an execution of 8 seconds for 12k commits when using the jgit-implementation. Excluding ^git.local.branch.*$ which is almost the same as <offline>true</offline> cuts the execution down to 3 seconds. To note here this is the entire runtime of maven and not only of the plugin. So you can easily chop of 1s for maven to be initialized, configured, .....

Just for reference could you give me the rough number of commits that are in your repo (git rev-list --all --count)?

Thanks I'll try it and get back to you in the new year.

First mvn initialize this morning took: 9.759 s
Again after enabling git native (extra time because of fetch/user input): 13.635 s
Repeated: 3.759 s
Again after setting offline: basically no difference
Again after limiting to 2 properties: 1.300 s
Great improvement!

Thanks for the feedback and glad you could get it sorted!

I had the same issue, including only the needed information reduced the time the plugin runs from 8s in my case to much less than a second.

I think disabling the collection of most of the data by default would be a great improvement.

I think disabling the collection of most of the data by default would be a great improvement.

That would certainly break backwards compatibility of the plugin. At least I don't know a way to find out what properties might be consumed somewhere and thus this plugin can't suddenly stop generating them

Indeed. Anyway, thanks for your work. It is a great tool.

I'm facing this issue - I can see something interesting in the logs:

12:59:38,899 [INFO] Collected git.build.version with value 0.1.0-SNAPSHOT
12:59:43,911 [INFO] Collected git.build.host with value creeper.local

it does not seem normal to wait between collecting those two properties for so long, does it? And actually when I exclude git.build.host, it runs fast.