gitpython-developers/gitdb

make gitdb tests not require being run from within gitdb git repository

yarikoptic opened this issue · 15 comments

I got confused by #7, which suggests that gitdb is "dead" and absorbed into GitPython, but it seems to be alive and kicking out new releases ;) So I will assume that.

It would be nice if gitdb tests could be run on a source distribution of gitdb, i.e. without relying on having .git. Now, even if I initialize some dummy repository

mkdir TESTGITDB
cd TESTGITDB
git init; for c in 1 2 3; do echo $c >| 1; git add 1; git commit -m "commit $c"; done

manually and point to it as directed by the messages, I get:

> GITDB_TEST_GIT_REPO_BASE=$PWD/TESTGITDB/.git nosetests -s -v gitdb/test
...
======================================================================
ERROR: test_reading (gitdb.test.db.test_git.TestGitDB)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/db/test_git.py", line 22, in test_reading
    assert 1 < len(gdb.databases()) < 4
  File "/home/yoh/deb/gits/python-gitdb/gitdb/db/base.py", line 223, in databases
    return tuple(self._dbs)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/util.py", line 237, in __getattr__
    self._set_cache_(attr)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/db/git.py", line 58, in _set_cache_
    raise InvalidDBRoot(self.root_path())
InvalidDBRoot: /home/yoh/deb/gits/python-gitdb/gitdb/test/fixtures/../../../.git/objects

======================================================================
ERROR: test_pack_random_access (gitdb.test.performance.test_pack.TestPackedDBPerformance)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/lib.py", line 45, in wrapper
    return func(self, *args, **kwargs)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/performance/test_pack.py", line 54, in test_pack_random_access
    (ns, len(pdb.entities()), elapsed, ns / elapsed), file=sys.stderr)
ZeroDivisionError: float division by zero

======================================================================
ERROR: test_pack_writing (gitdb.test.performance.test_pack_streaming.TestPackStreamingPerformance)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/lib.py", line 45, in wrapper
    return func(self, *args, **kwargs)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/performance/test_pack_streaming.py", line 58, in test_pack_writing
    PackEntity.write_pack((pdb.stream(sha) for sha in pdb.sha_iter()), ostream.write, object_count=ni)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/pack.py", line 979, in write_pack
    "Expected to write %i objects into pack, but received only %i from iterators" % (object_count, actual_count))
ValueError: Expected to write 1000 objects into pack, but received only 0 from iterators

======================================================================
ERROR: test_base (gitdb.test.test_example.TestExamples)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/test_example.py", line 42, in test_base
    ldb.store(istream)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/db/loose.py", line 184, in store
    fd, tmp_path = tempfile.mkstemp(prefix='obj', dir=self._root_path)
  File "/usr/lib/python2.7/tempfile.py", line 308, in mkstemp
    return _mkstemp_inner(dir, prefix, suffix, flags)
  File "/usr/lib/python2.7/tempfile.py", line 239, in _mkstemp_inner
    fd = _os.open(file, flags, 0600)
OSError: [Errno 2] No such file or directory: '/home/yoh/deb/gits/python-gitdb/gitdb/test/fixtures/../../../.git/objects/objZdoPUA'

======================================================================
FAIL: test_writing (gitdb.test.db.test_ref.TestReferenceDB)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/lib.py", line 60, in wrapper
    return func(self, path)
  File "/home/yoh/deb/gits/python-gitdb/gitdb/test/db/test_ref.py", line 46, in test_writing
    assert len(rdb.databases()) == 1
AssertionError

----------------------------------------------------------------------
Ran 24 tests in 13.322s

FAILED (SKIP=1, errors=4, failures=1)

I think it would be best if the test battery had a fixture that creates such a repository using the git present on the system (thus testing against whatever git is available), so that I would not have to do the above dance manually. If some tests really depend on gitdb/.git -- I guess they could be skipped when no parent .git is available.
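A minimal sketch of the "skip when there is no parent .git" idea, using only the standard `unittest` module. The helper names `parent_git_dir` and `requires_parent_git` are hypothetical and not part of gitdb; where the checkout root sits relative to the tests is an assumption the caller has to supply.

```python
import os
import unittest


def parent_git_dir(start):
    """Return the path of the .git directory directly under `start`, or None."""
    candidate = os.path.join(os.path.abspath(start), ".git")
    return candidate if os.path.isdir(candidate) else None


def requires_parent_git(start):
    """Decorator (hypothetical name): skip a test when `start` has no .git."""
    return unittest.skipUnless(
        parent_git_dir(start) is not None,
        "no .git directory found under %s" % start,
    )
```

A test that reads gitdb's own history could then be decorated with `@requires_parent_git(checkout_root)` and would be reported as skipped, rather than as an error, when run from a source tarball.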

Thanks in advance!

Maybe I am misunderstanding this, but in any case, I am sure we can figure this out!
First of all, gitdb is indeed alive and kicking. There was a plan to merge it into the 'experiment-2012' branch, but that branch is now obsolete and deprecated.

What I find odd is that the gitdb tests don't work for you out of the box, as you don't have to specify a custom repository anymore at all. If the environment variable is unset, it will just use its own 'gitdb' repository, which works fine. Travis, for example, does it that way.

If you want to use that variable, then it is expected to point to some moderately large repository that has at least one pack. I used the git source repository for this at some point.
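To check the "at least one pack" requirement before running the tests, a sketch like the following may help. It relies only on git's usual layout, where pack files live under `objects/pack` inside the git directory; the function name `count_packs` is made up for illustration.

```python
import glob
import os


def count_packs(git_dir):
    """Count the *.pack files under <git_dir>/objects/pack."""
    pattern = os.path.join(git_dir, "objects", "pack", "*.pack")
    return len(glob.glob(pattern))
```

For example, `count_packs(os.environ["GITDB_TEST_GIT_REPO_BASE"])` returning 0 would indicate that the repository only has loose objects and that `git gc` has probably not been run on it.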

For me, the system seems to be working the way it is, and I'd rather remove this environment variable than try to include some massive fixture (if that is what you suggested).
Please let me know what you think.

"it will just use its own 'gitdb' repository which works fine" -- that is what I was trying to avoid as well. E.g. for the Debian package, we have only the sources; gitdb's .git is not shipped along. But I still would like to run tests to verify that gitdb operates correctly. I hope that clears up my intentions? ;)

Alright, then I'd recommend cloning the gitdb repo as part of your build scripts, and pointing the environment variable there. This seems like the most bullet-proof way of doing it, as this is how I run my tests too.
Does that sound applicable to you?

That wouldn't work for me either, since network connectivity is not guaranteed! ;-) (that is part of Debian policy, carved in stone)

oh ... well. The point is that the repository it uses needs to be sufficiently large, and must have packs. Maybe what you could do is just keep a clone of gitdb for testing as part of your build script. It doesn't have to be the latest one, as far as I am concerned. Something like that must be possible.
Does that make sense? Is it feasible?

well... I would be lying if I said that it is not feasible... there is even a 3.0 (git) format for source packages in Debian, so the sources could be provided with .git and I could then drag the entire history of gitdb and the Debian packaging along. Maybe at some point I will actually do that, but not now -- the package is maintained under the SVN of the python-modules team in Debian, so I would need to migrate it away etc. But meanwhile (I am not even the original maintainer for gitdb/gitpython, I just picked it up to provide fresh releases for others and for my own good) I think that, if you still allow testing against some other repository, I would instantiate it from scratch and just skip (via a patch or just an argument to nose) the tests requiring access to packs or anything else specific to gitdb or a large git repo. ATM tests are not exercised at all at package build time -- so some tests will already be better than no tests at all.

Do you see any tests among the erroring/failing ones above that could be fixed to work with a lean test git repo?

Yeah, I see your point. The first one, for instance, actively tries to access its own repository. Tests should at least be fixed to use the one provided by the environment variable at all times.
Then it would just be up to you to set up a repository which is big enough.
Will work on that right away, thanks for guiding me to this realisation.
To be honest, I didn't even remember these tests anymore and just assumed they all use the environment variable.
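The intended behaviour (every test reads the repository location from one place, with the environment variable taking precedence) could be sketched as below. This is not gitdb's actual code; `resolve_repo_base` and its parameters are hypothetical names used for illustration only.

```python
import os

ENV_VAR = "GITDB_TEST_GIT_REPO_BASE"


def resolve_repo_base(default_git_dir, environ=None):
    """Return the .git directory the tests should read from.

    The environment variable wins whenever it is set and non-empty;
    otherwise the caller-supplied default (e.g. gitdb's own .git) is used.
    """
    environ = os.environ if environ is None else environ
    return environ.get(ENV_VAR) or default_git_dir
```

Routing every test through a single helper like this would avoid the situation where some tests silently fall back to a hard-coded relative path.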

Thank you very much in advance!

Alright, it should work now. Please let me know if you find anything else though.

ok -- let's see!! ;)

% cd /tmp
% git clone http://github.com/gitpython-developers/gitdb
Cloning into 'gitdb'...
remote: Counting objects: 1851, done.
remote: Compressing objects: 100% (19/19), done.
remote: Total 1851 (delta 6), reused 0 (delta 0)
Receiving objects: 100% (1851/1851), 1.18 MiB | 609.00 KiB/s, done.
Resolving deltas: 100% (1116/1116), done.
Checking connectivity... done.
% cd gitdb
% rm -rf .git
% mkdir TESTGITDB
% cd TESTGITDB
% git init; for c in 1 2 3 4 5 6 7 8; do echo $c >| $c; git add $c; git commit -m "commit $c"; done 
Initialized empty Git repository in /tmp/gitdb/TESTGITDB/.git/
[master (root-commit) d79fdce] commit 1
 1 file changed, 1 insertion(+)
 create mode 100644 1
[master 575a0c3] commit 2
 1 file changed, 1 insertion(+)
 create mode 100644 2
[master 028fa4e] commit 3
 1 file changed, 1 insertion(+)
 create mode 100644 3
[master b5dc401] commit 4
 1 file changed, 1 insertion(+)
 create mode 100644 4
[master 2ebdcdc] commit 5
 1 file changed, 1 insertion(+)
 create mode 100644 5
[master 816327d] commit 6
 1 file changed, 1 insertion(+)
 create mode 100644 6
[master d888b0b] commit 7
 1 file changed, 1 insertion(+)
 create mode 100644 7
[master 2db25a8] commit 8
 1 file changed, 1 insertion(+)
 create mode 100644 8
% cd ../
% python setup.py build_ext --inplace
/usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'zip_safe'
  warnings.warn(msg)
/usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'install_requires'
  warnings.warn(msg)
running build_ext
building 'gitdb._perf' extension
creating build
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/gitdb
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -Igitdb -I/usr/include/python2.7 -c gitdb/_fun.c -o build/temp.linux-x86_64-2.7/gitdb/_fun.o
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -Igitdb -I/usr/include/python2.7 -c gitdb/_delta_apply.c -o build/temp.linux-x86_64-2.7/gitdb/_delta_apply.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wl,-z,relro -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security build/temp.linux-x86_64-2.7/gitdb/_fun.o build/temp.linux-x86_64-2.7/gitdb/_delta_apply.o -o /tmp/gitdb/gitdb/_perf.so
% GITDB_TEST_GIT_REPO_BASE=$PWD/TESTGITDB/.git nosetests -s -v gitdb/test
test_reading (gitdb.test.db.test_git.TestGitDB) ... ERROR
test_writing (gitdb.test.db.test_git.TestGitDB) ... ok
test_basics (gitdb.test.db.test_loose.TestLooseDB) ... ok
test_writing (gitdb.test.db.test_mem.TestMemoryDB) ... ok
test_writing (gitdb.test.db.test_pack.TestPackDB) ... ok
test_writing (gitdb.test.db.test_ref.TestReferenceDB) ... Test TestReferenceDB.test_writing failed, output is at '/home/yoh/.tmp/test_writingW_FcOo'
FAIL
test_correctness (gitdb.test.performance.test_pack.TestPackedDBPerformance) ... Endurance run: verify streaming of objects (crc and sha)
PDB: verified 0 objects (crc=0) in 0.000041 s ( 0.000000 objects/s )
PDB: verified 0 objects (crc=1) in 0.000001 s ( 0.000000 objects/s )
ok
based on the pack(s) of our packed object DB, we will just copy and verify all objects in the back ... ok
test_pack_random_access (gitdb.test.performance.test_pack.TestPackedDBPerformance) ... PDB: looked up 0 shas by index in 0.000049 s ( 0.000000 shas/s )
PDB: looked up 0 sha in 0 packs in 0.000001 s ( 0.000000 shas/s )
ERROR
test_pack_writing (gitdb.test.performance.test_pack_streaming.TestPackStreamingPerformance) ... PDB Streaming: Got 1000 streams by sha in in 0.000038 s ( 26379270.440252 streams/s )
ERROR
test_stream_reading (gitdb.test.performance.test_pack_streaming.TestPackStreamingPerformance) ... <open file '<stderr>', mode 'w' at 0x7f0d9a7401e0> PDB Streaming: Got 5000 streams by sha and read all bytes totallying 0 KiB ( 0.000000 KiB / s ) in 0.000038 s ( 131896352.201258 streams/s ) <open file '<stderr>', mode 'w' at 0x7f0d9a7401e0>
ok
test_large_data_streaming (gitdb.test.performance.test_stream.TestObjDBPerformance) ... Creating  data ...
Done (in 0.781967 s)
Added 50000 KiB (filesize = 17294 KiB) of  data to loose odb in 0.853747 s ( 58565.350648 Write KiB / s)
Read 50000 KiB of  data at once from loose odb in 0.215706 s ( 231796.864724 Read KiB / s)
Read 50000 KiB of  data in 512 KiB chunks from loose odb in 0.235925 s ( 211931.794640 Read KiB / s)
Creating random  data ...
Done (in 7.747607 s)
Added 50000 KiB (filesize = 43183 KiB) of random  data to loose odb in 1.755956 s ( 28474.518646 Write KiB / s)
Read 50000 KiB of random  data at once from loose odb in 0.317991 s ( 157237.145464 Read KiB / s)
Read 50000 KiB of random  data in 512 KiB chunks from loose odb in 0.312287 s ( 160109.205345 Read KiB / s)
ok
test_streams (gitdb.test.test_base.TestBaseTypes) ... ok
test_base (gitdb.test.test_example.TestExamples) ... ok
test_pack (gitdb.test.test_pack.TestPack) ... ok
test_pack_64 (gitdb.test.test_pack.TestPack) ... SKIP
test_pack_entity (gitdb.test.test_pack.TestPack) ... ok
test_pack_index (gitdb.test.test_pack.TestPack) ... ok
test_compressed_writer (gitdb.test.test_stream.TestStream) ... ok
test_decompress_reader (gitdb.test.test_stream.TestStream) ... ok
test_decompress_reader_special_case (gitdb.test.test_stream.TestStream) ... ok
test_sha_writer (gitdb.test.test_stream.TestStream) ... ok
test_basics (gitdb.test.test_util.TestUtils) ... ok
test_lockedfd (gitdb.test.test_util.TestUtils) ... ok

======================================================================
ERROR: test_reading (gitdb.test.db.test_git.TestGitDB)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/gitdb/gitdb/test/db/test_git.py", line 26, in test_reading
    assert isinstance(gdb.info(gitdb_sha), OInfo)
  File "/tmp/gitdb/gitdb/db/base.py", line 205, in info
    return self._db_query(sha).info(sha)
  File "/tmp/gitdb/gitdb/db/base.py", line 192, in _db_query
    raise BadObject(sha)
BadObject: BadObject: 5690fd0d3304f378754b23b098bd7cb5f4aa1976

======================================================================
ERROR: test_pack_random_access (gitdb.test.performance.test_pack.TestPackedDBPerformance)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/gitdb/gitdb/test/lib.py", line 72, in wrapper
    return func(self, *args, **kwargs)
  File "/tmp/gitdb/gitdb/test/performance/test_pack.py", line 65, in test_pack_random_access
    (max_items, pdb_fun.__name__.upper(), elapsed, max_items / elapsed), file=sys.stderr)
ZeroDivisionError: float division by zero

======================================================================
ERROR: test_pack_writing (gitdb.test.performance.test_pack_streaming.TestPackStreamingPerformance)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/gitdb/gitdb/test/lib.py", line 72, in wrapper
    return func(self, *args, **kwargs)
  File "/tmp/gitdb/gitdb/test/performance/test_pack_streaming.py", line 58, in test_pack_writing
    PackEntity.write_pack((pdb.stream(sha) for sha in pdb.sha_iter()), ostream.write, object_count=ni)
  File "/tmp/gitdb/gitdb/pack.py", line 979, in write_pack
    "Expected to write %i objects into pack, but received only %i from iterators" % (object_count, actual_count))
ValueError: Expected to write 1000 objects into pack, but received only 0 from iterators

======================================================================
FAIL: test_writing (gitdb.test.db.test_ref.TestReferenceDB)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/gitdb/gitdb/test/lib.py", line 87, in wrapper
    return func(self, path)
  File "/tmp/gitdb/gitdb/test/db/test_ref.py", line 49, in test_writing
    assert rdb.has_object(gitdb_sha)
AssertionError

----------------------------------------------------------------------
Ran 24 tests in 13.247s

FAILED (SKIP=1, errors=3, failures=1)

1 less error! would that be it? ;)
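The remaining ZeroDivisionError looks like an elapsed time of exactly zero when the pack DB has nothing to look up. One possible guard, shown here as a sketch with a hypothetical helper name `rate` (not a function from gitdb), is to clamp the divisor:

```python
import sys


def rate(count, elapsed):
    """Items per second, tolerating an elapsed time of zero."""
    # Assumption: clamping to a tiny positive value is acceptable because
    # the figure is only printed for information, never asserted on.
    return count / max(elapsed, sys.float_info.epsilon)
```

With this, `rate(0, 0.0)` evaluates to `0.0` instead of raising, so an empty repository would produce a meaningless but harmless throughput line.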

Ok, I will use your particular script and recheck.
Have to leave in a few minutes, but will try to push before that happens.

Script was adjusted to

git init; for c in `seq 400`; do echo $c >| $c; git add $c; git commit -m "commit $c"; done; git gc;

Please note that the number 400 is arbitrarily large; smaller numbers might do as well, and I leave it to you to experiment (300 won't work though). I also added git gc at the end to get some packs.
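For a build script that prefers Python over shell, roughly the same steps can be sketched with `subprocess`. The function name `make_test_repo` and the throwaway identity passed via `git -c` are assumptions added so that commits succeed on a machine without a configured user; they are not part of the original script.

```python
import os
import subprocess

# Assumption: a dummy identity, so "git commit" works on an unconfigured box.
GIT = ["git", "-c", "user.name=gitdb-test",
       "-c", "user.email=gitdb-test@example.invalid"]


def make_test_repo(path, commits=400):
    """Create a repo at `path` with one file per commit, then pack it.

    `path` must not exist yet. Returns the path of the new .git directory.
    """
    os.makedirs(path)

    def run(*args):
        subprocess.check_call(GIT + list(args), cwd=path)

    run("init")
    for c in range(1, commits + 1):
        with open(os.path.join(path, str(c)), "w") as fh:
            fh.write("%d\n" % c)
        run("add", str(c))
        run("commit", "-q", "-m", "commit %d" % c)
    run("gc", "-q")  # repack loose objects so the pack tests have input
    return os.path.join(path, ".git")
```

The returned path is what `GITDB_TEST_GIT_REPO_BASE` would be set to before invoking the test runner.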

Tests work for me with that repository, using PY2.7 and PY3.4.
Good luck !

AWESOME -- confirming that it works for me on my laptop!!! Is a new release coming? ;)

Good to hear !
It will be once git-python reaches 0.3.5. Until then, gitdb might receive minor updates as well.