Kentzo/git-archive-all

Python's relpath mischievous behavior affecting the archive extractability

1AdAstra1 opened this issue · 7 comments

Good afternoon!

I just noticed some strange behavior in this script that leaves my tarballs impossible to extract:

Olgas-Mac-mini:somesoftware_c490e0f4_5235ff0b oreznikova$ tar -xzf test.tar.gz
./../../../../../../../../var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo/config/databases/geobaza/csv2mysql.sql: Path contains '..'
tar: Error exit delayed from previous errors.

I run the script this way (from another library's path, to archive the current directory):

/usr/bin/env /Users/oreznikova/WorkProjects/capistrano-git-copy/vendor/git-archive-all/git-archive-all --prefix='' ../test.tar.gz

I traced the call stack and found that the project uses Python's 'relpath' function, so I added a temporary print statement to the script:

# Shell command returns absolute paths to submodules.
submodule_path = path.relpath(submodule_path, self.main_repo_abspath)
print(submodule_path)

And I saw it output a really weird path:

  /private/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo/../../../../../../../../var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_490e0f4_5235ff0b/repo/config/databases/geobaza/csv2mysql.sql => ../../../../../../../../var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo/config/databases/geobaza/csv2mysql.sql

How can that happen? Is there anything a user can do to avoid this behavior? I would be very thankful for any answer.

@1AdAstra1 Could you also print submodule_path and self.main_repo_abspath?

Yes, of course:

# Shell command returns absolute paths to submodules.
submodule_path = path.relpath(submodule_path, self.main_repo_abspath)
print("relative path: " + submodule_path)
print("absolute path: " + self.main_repo_abspath)

produces

relative path: ../../../../../../../../var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo/config/databases
absolute path: /private/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo

Could you print submodule_path before it's changed?

Ah, sorry, didn't get that. Sure, here it is:

# Shell command returns absolute paths to submodules.
print("submodule original path: " + submodule_path)
submodule_path = path.relpath(submodule_path, self.main_repo_abspath)
print("relative path: " + submodule_path)
print("absolute path: " + self.main_repo_abspath)

produces

submodule original path: /var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo/config/databases
relative path: ../../../../../../../../var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo/config/databases
absolute path: /private/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/somesoftware_c490e0f4_5235ff0b/repo

I guess it's related to symlinks. /var is a symlink to /private/var/.
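
For illustration, here is a minimal sketch of why the symlink matters. As far as I know, relpath only compares the two path strings and never looks at the filesystem, so it cannot tell that /var and /private/var are the same directory. The variable names below are mine, and posixpath is used only so the result is the same on any platform; the two paths are copied from the output above.

```python
import posixpath

# Paths copied from the debug output in this thread.
repo = ("/private/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/"
        "somesoftware_c490e0f4_5235ff0b/repo")
submodule = ("/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/"
             "somesoftware_c490e0f4_5235ff0b/repo/config/databases")

# relpath is purely lexical: the first components ("private" vs "var")
# already differ, so it climbs out of every component of `repo` and then
# descends the whole of `submodule` again.
rel = posixpath.relpath(submodule, repo)
print(rel)
```

The repo path has eight components, which accounts for the eight `..` segments in the output above.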

Yep, it seems to produce the correct path now:

submodule original path: /private/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/geoproxy_staging_c490e0f4_5235ff0b/repo/config/databases
relative path: config/databases
absolute path: /private/var/folders/yw/f7p3n2ds245d7p9cly1npc1h0000gn/T/geoproxy_staging_c490e0f4_5235ff0b/repo
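
The thread does not show what was changed to get this result. One plausible approach, sketched here under that assumption, is to resolve symlinks on both paths with os.path.realpath before calling relpath. The helper name relpath_resolved and the temporary directory layout (which only imitates /var -> /private/var) are hypothetical, not taken from git-archive-all.

```python
import os
import tempfile


def relpath_resolved(path, start):
    """Hypothetical helper: resolve symlinks on both sides, then relpath."""
    return os.path.relpath(os.path.realpath(path), os.path.realpath(start))


with tempfile.TemporaryDirectory() as tmp:
    # Imitate the macOS layout: <tmp>/var is a symlink to <tmp>/private/var.
    real_root = os.path.join(tmp, "private", "var")
    repo = os.path.join(real_root, "repo")
    os.makedirs(os.path.join(repo, "config", "databases"))
    link_root = os.path.join(tmp, "var")
    os.symlink(real_root, link_root)

    # The same directory, reached through the symlink.
    via_link = os.path.join(link_root, "repo", "config", "databases")

    naive = os.path.relpath(via_link, repo)   # climbs out with ".."
    fixed = relpath_resolved(via_link, repo)  # stays inside the repo

    print("naive:", naive)
    print("fixed:", fixed)
```

With both sides resolved, the relative path no longer contains `..`, so tar has nothing to reject on extraction.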