ishepard/pydriller

PyDriller 2.5 raises an error when traversing between two tags; the error is not present in PyDriller 2.4

armandossrecife opened this issue · 3 comments

1. Clone repository (ok)

git clone https://github.com/apache/calcite.git

2. Version of OS (ok)

uname -a

Linux 1192474995c9 5.15.109+ #1 SMP Fri Jun 9 10:57:30 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy

3. Install pydriller 2.5 (ok)

pip3 install pydriller

4. Version of python3

python3 --version

Python 3.10.6

5. Confirm version of PyDriller (ok)

pip3 list | grep PyDriller

PyDriller 2.5

6. Test pydriller Simple Scenario (ok)

import pydriller
for commit in pydriller.Repository("calcite").traverse_commits():
    print(commit.hash)

Result:
Shows all commits correctly.

7. List tags of cloned repository (ok)

cd calcite && git tag

Result:
['avatica-1.10.0-rc0', 'avatica-1.11.0-rc0', 'avatica-1.12.0-rc0', 'avatica-1.13.0-rc0', 'calcite-0.9.1-incubating', 'calcite-0.9.2-incubating', 'calcite-1.0.0-incubating', 'calcite-1.1.0-incubating', 'calcite-1.10.0', 'calcite-1.11.0', 'calcite-1.12.0', 'calcite-1.13.0', 'calcite-1.14.0', 'calcite-1.15.0', 'calcite-1.16.0', 'calcite-1.17.0', 'calcite-1.18.0', 'calcite-1.19.0', 'calcite-1.2.0-incubating', 'calcite-1.20.0', 'calcite-1.21.0', 'calcite-1.22.0', 'calcite-1.23.0', 'calcite-1.23.0-rc0', 'calcite-1.23.0-rc1', 'calcite-1.24.0', 'calcite-1.24.0-rc0', 'calcite-1.25.0', 'calcite-1.25.0-rc0', 'calcite-1.26.0', 'calcite-1.26.0-rc0', 'calcite-1.27.0', 'calcite-1.27.0-rc0', 'calcite-1.28.0', 'calcite-1.28.0-rc0', 'calcite-1.29.0', 'calcite-1.29.0-rc0', 'calcite-1.3.0-incubating', 'calcite-1.30.0', 'calcite-1.30.0-rc0', 'calcite-1.30.0-rc1', 'calcite-1.30.0-rc2', 'calcite-1.30.0-rc3', 'calcite-1.31.0', 'calcite-1.31.0-rc0', 'calcite-1.31.0-rc1', 'calcite-1.31.0-rc2', 'calcite-1.31.0-rc3', 'calcite-1.32.0', 'calcite-1.32.0-rc0', 'calcite-1.33.0', 'calcite-1.33.0-rc0', 'calcite-1.34.0', 'calcite-1.34.0-rc0', 'calcite-1.35.0', 'calcite-1.35.0-rc0', 'calcite-1.35.0-rc1', 'calcite-1.35.0-rc2', 'calcite-1.35.0-rc3', 'calcite-1.4.0-incubating', 'calcite-1.5.0', 'calcite-1.6.0', 'calcite-1.6.0-rc0', 'calcite-1.7.0', 'calcite-1.8.0', 'calcite-1.9.0', 'calcite-avatica-1.7.0', 'calcite-avatica-1.7.1', 'calcite-avatica-1.8.0', 'calcite-avatica-1.9.0', 'calcite-avatica-1.9.0-rc0', 'calcite-avatica-1.9.0-rc1', 'optiq-0.4.10', 'optiq-0.4.11', 'optiq-0.4.7', 'optiq-0.4.8', 'optiq-0.4.9', 'optiq-0.9.0-incubating', 'optiq-parent-0.4.12', 'optiq-parent-0.4.12.1', 'optiq-parent-0.4.12.2', 'optiq-parent-0.4.12.3', 'optiq-parent-0.4.12.4', 'optiq-parent-0.4.13', 'optiq-parent-0.4.14', 'optiq-parent-0.4.15', 'optiq-parent-0.4.16', 'optiq-parent-0.4.17', 'optiq-parent-0.4.18', 'optiq-parent-0.4.18.1', 'optiq-parent-0.5', 'optiq-parent-0.6', 'optiq-parent-0.7', 'optiq-parent-0.8', 
'rel/avatica-1.10.0', 'rel/avatica-1.11.0', 'rel/avatica-1.12.0', 'rel/avatica-1.13.0', 'rel/calcite-0.9.1-incubating', 'rel/calcite-0.9.2-incubating', 'rel/calcite-1.0.0-incubating', 'rel/calcite-1.1.0-incubating', 'rel/calcite-1.2.0-incubating', 'rel/calcite-1.3.0-incubating', 'rel/calcite-1.4.0-incubating', 'rel/calcite-1.5.0', 'rel/calcite-1.6.0', 'rel/calcite-avatica-1.7.0', 'rel/calcite-avatica-1.7.1', 'rel/calcite-avatica-1.8.0', 'rel/calcite-avatica-1.9.0', 'v0.4.6', 'v1.23.0-rc1']

8. There is an error when I try to traverse between two tags:

tag_1='calcite-1.5.0'
tag_2='calcite-1.6.0'

for commit in pydriller.Repository("calcite",from_tag=tag_1, to_tag=tag_2).traverse_commits():
    print(commit.hash)

Result:

---------------------------------------------------------------------------
GitCommandError                           Traceback (most recent call last)
<ipython-input-9-997ae32b48a6> in <cell line: 4>()
      2 tag_2='calcite-1.6.0'
      3 
----> 4 for commit in pydriller.Repository("calcite",from_tag=tag_1, to_tag=tag_2).traverse_commits():
      5     print(commit.hash)

6 frames
/usr/local/lib/python3.10/dist-packages/git/cmd.py in wait(self, stderr)
    602                 errstr = read_all_from_possibly_closed_stream(p_stderr)
    603                 log.debug("AutoInterrupt wait stderr: %r" % (errstr,))
--> 604                 raise GitCommandError(remove_password_if_present(self.args), status, errstr)
    605             return status
    606 

GitCommandError: Cmd('git') failed due to: exit code(129)
  cmdline: git rev-list --reverse --ancestry-path=ba6e43c6983ca92d8ce32a693776dbe73f19e0dc ^ba6e43c6983ca92d8ce32a693776dbe73f19e0dc^ c4d346b0a413a1a62e028dd3be40071523203a58 --
  stderr: 'usage: git rev-list [OPTION] <commit-id>... [ -- paths... ]
  limiting output:
    --max-count=<n>
    --max-age=<epoch>
    --min-age=<epoch>
    --sparse
    --no-merges
    --min-parents=<n>
    --no-min-parents
    --max-parents=<n>
    --no-max-parents
    --remove-empty
    --all
    --branches
    --tags
    --remotes
    --stdin
    --quiet
  ordering output:
    --topo-order
    --date-order
    --reverse
  formatting output:
    --parents
    --children
    --objects | --objects-edge
    --unpacked
    --header | --pretty
    --[no-]object-names
    --abbrev=<n> | --no-abbrev
    --abbrev-commit
    --left-right
    --count
  special purpose:
    --bisect
    --bisect-vars
    --bisect-all
'
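The traceback shows exit code 129 together with the `git rev-list` usage text, which is how Git typically responds when it does not accept the command line it was given. The sketch below is only an illustration of how such a failure could be told apart from other Git errors. The helper name `is_git_usage_error` is hypothetical, and the commented usage assumes GitPython's `GitCommandError` exposes `status` and `stderr`, as the traceback suggests.

```python
def is_git_usage_error(status: int, stderr: str) -> bool:
    """Return True when a failed Git call looks like a rejected command line.

    Assumption: Git exits with status 129 when it prints its usage text,
    as in the traceback above.
    """
    return status == 129 and "usage: git" in stderr


# Hypothetical use around the failing loop (needs pydriller and GitPython):
#
# from git.exc import GitCommandError
# try:
#     for commit in pydriller.Repository(
#         "calcite", from_tag=tag_1, to_tag=tag_2
#     ).traverse_commits():
#         print(commit.hash)
# except GitCommandError as exc:
#     if is_git_usage_error(exc.status, str(exc.stderr)):
#         print("Git rejected the options; the installed Git may be too old.")
#     else:
#         raise
```

A match only says that Git refused the arguments; it does not say which option was the problem.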

All these steps can be automatically reproduced with more details using the automated script we developed in https://github.com/armandossrecife/teste/blob/main/bug_pydriller_v2_5.ipynb

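Hi! Most likely you are using an outdated version of Git. You need Git 2.38.0 or newer (the current version is 2.41.0); the `--ancestry-path=<commit>` form in the failing command is not accepted by older releases.

I don't see any issues when I run it.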
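For anyone hitting the same error, a quick way to check the installed Git before using `from_tag`/`to_tag` might look like the sketch below. It uses only the standard library. The function names and the `MIN_GIT` constant are made up for illustration, and the 2.38.0 threshold is taken from the comment above rather than from PyDriller itself.

```python
import re
import subprocess
from typing import Tuple

# Assumed minimum, based on the maintainer's comment in this thread.
MIN_GIT = (2, 38, 0)


def parse_git_version(output: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from text such as 'git version 2.41.0'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def git_is_recent_enough(output: str) -> bool:
    """Compare the parsed version against MIN_GIT."""
    return parse_git_version(output) >= MIN_GIT


def installed_git_version_output() -> str:
    """Run `git --version` and return its raw output."""
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling `git_is_recent_enough(installed_git_version_output())` before the traversal gives a clearer message than the raw `GitCommandError` if the check fails.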

Hi Davide, thank you for your reply. I will update Git to version 2.41.0 and rerun this test case.

Hi Davide, you are right. Git needed to be updated to the latest version (2.41.0); with that, PyDriller 2.5 works fine.
You can reproduce this test case here

Congratulations on this excellent work.