src-d/hercules

Panic: file x already exists

alippai opened this issue · 12 comments

I'm trying to run hercules against my repo:

$ ./hercules --burndown .
2018/09/19 15:47:54 Burndown failed on commit #2201 (3024) 8fd831a7cb4f2d6012bb984be66dc7fd9e3a8f99
panic: file app/core/modules/****.js already exists

goroutine 1 [running]:
main.glob..func3(0xd36140, 0xc00025cce0, 0x1, 0x2)
        c:/gopath/src/gopkg.in/src-d/hercules.v4/cmd/hercules/root.go:225 +0xb26
github.com/spf13/cobra.(*Command).execute(0xd36140, 0xc00008e0d0, 0x2, 0x3, 0xd36140, 0xc00008e0d0)
        c:/gopath/src/github.com/spf13/cobra/command.go:766 +0x2d3
github.com/spf13/cobra.(*Command).ExecuteC(0xd36140, 0x104ab40, 0x26, 0xc000a52089)
        c:/gopath/src/github.com/spf13/cobra/command.go:852 +0x304
github.com/spf13/cobra.(*Command).Execute(0xd36140, 0x406f37, 0xc00007c058)
        c:/gopath/src/github.com/spf13/cobra/command.go:800 +0x32
main.main()
        c:/gopath/src/gopkg.in/src-d/hercules.v4/cmd/hercules/root.go:429 +0x38

Using (v4 on Windows): https://github.com/src-d/hercules/releases/download/v4/hercules.win64.zip

Try with --first-parent, does the problem persist?

It works this way!

All right, you've got a workaround: it does not analyze all the commits, but it should be enough for a plain --burndown. I wonder how I can reproduce this error. I guess the code is under NDA, so I cannot access it, and I am closing the issue for now. If anybody else hits the same bug on an open source project, please report here and I will reopen.

You are awesome, and indeed it's under NDA. I'm not familiar with Go, so I assume it would be hard for me to debug this repository.

As requested:

If anybody else hits the same bug on an open source project, please report here and I will reopen.

I am running the following:

hercules --pb --burndown https://github.com/greenplum-db/gpdb > gpdb.pb
 39748 / 44196 [==================================================>-----] 19m35s2018/09/28 13:43:33 Burndown failed on commit #39535 (39748) 74b1d29dd186c4ea51ba1eff06aebd1faeb5dfcd
panic: file src/bin/initdb/po/ro.po already exists

goroutine 1 [running]:
main.glob..func3(0x651dec0, 0xc420a61110, 0x1, 0x3)
	/Users/pivotal/go/src/gopkg.in/src-d/hercules.v4/cmd/hercules/root.go:225 +0xae3
github.com/spf13/cobra.(*Command).execute(0x651dec0, 0xc420030190, 0x3, 0x3, 0x651dec0, 0xc420030190)
	/Users/pivotal/go/src/github.com/spf13/cobra/command.go:766 +0x2c1
github.com/spf13/cobra.(*Command).ExecuteC(0x651dec0, 0x4b27e91, 0x26, 0xc420e6c814)
	/Users/pivotal/go/src/github.com/spf13/cobra/command.go:852 +0x30a
github.com/spf13/cobra.(*Command).Execute(0x651dec0, 0xc42003e0b8, 0x0)
	/Users/pivotal/go/src/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
	/Users/pivotal/go/src/gopkg.in/src-d/hercules.v4/cmd/hercules/root.go:429 +0x31

I built from source at commit 9b8478d - go version go1.10.1 darwin/amd64

If I run

hercules  --first-parent --pb --burndown https://github.com/greenplum-db/gpdb > gpdb.pb
Enumerating objects: 26, done.
Counting objects: 100% (26/26), done.
 30639 / 31496 [======================================================>-] 00m22s2018/09/28 15:02:44 Burndown failed on commit #30638 (30639) 25a90396cd1c52d252a37986e12b63e8e037aa83
panic: file src/bin/initdb/po/ro.po already exists

goroutine 1 [running]:
main.glob..func3(0x651dec0, 0xc4201d45c0, 0x1, 0x4)
	/Users/pivotal/go/src/gopkg.in/src-d/hercules.v4/cmd/hercules/root.go:225 +0xae3
github.com/spf13/cobra.(*Command).execute(0x651dec0, 0xc4200d4100, 0x4, 0x4, 0x651dec0, 0xc4200d4100)
	/Users/pivotal/go/src/github.com/spf13/cobra/command.go:766 +0x2c1
github.com/spf13/cobra.(*Command).ExecuteC(0x651dec0, 0x4b27e91, 0x26, 0xc420e7076c)
	/Users/pivotal/go/src/github.com/spf13/cobra/command.go:852 +0x30a
github.com/spf13/cobra.(*Command).Execute(0x651dec0, 0xc4200ae058, 0x0)
	/Users/pivotal/go/src/github.com/spf13/cobra/command.go:800 +0x2b
main.main()
	/Users/pivotal/go/src/gopkg.in/src-d/hercules.v4/cmd/hercules/root.go:429 +0x31

Please let me know what I can do to help with this issue.

Interesting that this also fails on the Postgres source code:

hercules --pb --burndown https://github.com/postgres/postgres.git > postgres.pb
 31988 / 45684 [=====================================>----------------] 1h55m01s2018/09/28 16:11:16 Burndown failed on commit #31987 (31988) 74b1d29dd186c4ea51ba1eff06aebd1faeb5dfcd
panic: file src/bin/initdb/po/ro.po already exists
...
hercules --pb --burndown --first-parent https://github.com/postgres/postgres.git > postgres.pb
Counting objects: 100% (5/5), done.
 31988 / 45689 [=======================================>----------------] 08m08s2018/09/28 16:40:17 Burndown failed on commit #31987 (31988) 74b1d29dd186c4ea51ba1eff06aebd1faeb5dfcd
panic: file src/bin/initdb/po/ro.po already exists

Great, now that I am able to reproduce it, I will fix it soon. Thanks for reporting.

External contributions are welcome here!

The cause is that certain file deletions are skipped because the files become binary and go out of sight. They are treated as binary because the current algorithm in Hercules checks for invalid UTF-8, which is stricter than simply scanning for \0.
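To illustrate the difference between the two heuristics mentioned above, here is a minimal sketch in Go. It is not the Hercules implementation: the function names hasNulByte and hasInvalidUTF8 and the sample inputs are made up for illustration. It only shows that text in a legacy single-byte encoding (as an old .po translation file might plausibly be) can contain no \0 byte and still fail a UTF-8 validity check, so a UTF-8 based test would classify it as binary while a \0 scan would not.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// hasNulByte is the simple heuristic: any \0 byte marks the data as binary.
// (Hypothetical helper, not taken from Hercules.)
func hasNulByte(data []byte) bool {
	return bytes.IndexByte(data, 0) >= 0
}

// hasInvalidUTF8 is the stricter heuristic: any byte sequence that is not
// valid UTF-8 marks the data as binary. (Hypothetical helper as well.)
func hasInvalidUTF8(data []byte) bool {
	return !utf8.Valid(data)
}

func main() {
	samples := []struct {
		name string
		data []byte
	}{
		// Plain ASCII passes both checks.
		{"ascii", []byte("msgid \"hello\"\n")},
		// 0xE3 starts a multi-byte UTF-8 sequence, but the next byte is
		// plain ASCII, so the sequence is invalid. There is no \0 byte.
		{"single-byte text", []byte{'s', 0xE3, 'u', '\n'}},
		// A \0 byte is valid UTF-8, so only the \0 scan flags this one.
		{"with nul", []byte{'a', 0, 'b'}},
	}
	for _, s := range samples {
		fmt.Printf("%-18s nul=%t invalidUTF8=%t\n",
			s.name, hasNulByte(s.data), hasInvalidUTF8(s.data))
	}
}
```

The second sample is the interesting case for this issue: the two checks disagree on it, which is how a file that still looks like text to a \0 scan could be treated as binary under the UTF-8 rule.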

@doty-pivotal fwiw (postgres)

[Images: "pg", "own"]

This is awesome - I will take it for a spin on GPDB as soon as I can.

[Image: project_23_x_7985_ granularity_30 _sampling_30]

Here is GPDB. Looking forward to digging into more.