git_xy

A handy way to manage (sub)modules across multiple repositories. Designed for lazy engineers. This project is written for educational purposes only. Use it at your own risk.

Primary language: Shell

Description

WARNING: This project is written for educational purposes only. Use it at your own risk.

git_xy helps to synchronize subdirectories between git repositories semi-automatically, and may generate pull requests on GitHub if changes are detected on the destination repository. It works like rsync, but for git repositories.

git_xy reads a list of source/destination specifications from a configuration file. For each of them, git_xy fetches changes from the source repository and synchronizes them to the destination path (thanks to rsync). Finally, it generates a commit and creates a new PR (pull request) if necessary. See more details in How it works.

Usage

Installation

git_xy is a Bash 4 script. It requires some additional tools on the system, notably git and rsync.

The main program git_xy can be installed anywhere on your search path:

$ sudo wget -O /usr/local/bin/git_xy \
    https://github.com/icy/git_xy/raw/ng/git_xy.sh

$ sudo chmod 755 /usr/local/bin/git_xy

Configuration

The configuration consists of source/destination specifications in the following format:

src_repo src_branch src_path   dst_repo dst_branch dst_path [pr_base] [rsync_options]

The fields are, in order:

  • src_repo, src_branch, src_path: The source repository, branch and path
  • dst_repo, dst_branch, dst_path: The destination repository, branch and path.
  • pr_base (optional): Specifies where you want the PR to arrive. By default, it's the upstream repository.
  • rsync_options (optional): Any options for rsync command. The first option should start with -.

See examples in git_xy.config-sample.txt, for instance:

git@github.com:icy/pacapt ng lib/ git@github.com:icy/pacapt master lib/
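To make the field order concrete, here is a minimal sketch of how one specification line could be split into its fields. It is not taken from git_xy itself: the function name parse_spec is hypothetical, and the rule that a seventh field starting with - means "no pr_base, rsync options only" is an assumption inferred from the note that the first rsync option should start with -.

```bash
# Hypothetical helper, not part of git_xy: split one specification line
# into named fields and print them as name=value lines.
# The function body runs in a subshell, so "set -f" and the changed
# positional parameters do not leak into the caller.
parse_spec() (
  set -f            # disable globbing while the line is split into words
  set -- $1         # split the line on whitespace
  if [ "$#" -lt 6 ]; then
    echo "parse_spec: at least 6 fields are required" >&2
    exit 1
  fi
  src_repo=$1; src_branch=$2; src_path=$3
  dst_repo=$4; dst_branch=$5; dst_path=$6
  shift 6

  # Assumption: a 7th field that does not start with "-" is pr_base;
  # whatever remains is passed on as rsync options.
  pr_base=""
  case "${1:-}" in
    ""|-*) ;;
    *) pr_base=$1; shift ;;
  esac

  printf '%s\n' \
    "src_repo=$src_repo" "src_branch=$src_branch" "src_path=$src_path" \
    "dst_repo=$dst_repo" "dst_branch=$dst_branch" "dst_path=$dst_path" \
    "pr_base=$pr_base" "rsync_options=$*"
)

# Usage:
#   parse_spec 'git@github.com:icy/pacapt ng lib/ git@github.com:icy/pacapt master lib/'
```

For the sample line above, pr_base and rsync_options both come out empty, since only the six mandatory fields are present.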

Invocation

Now execute the script:

GIT_XY_CONFIG="git_xy.config-sample.txt" ./git_xy.sh

The script will fetch changes in the lib directory from branch ng in the pacapt repository, and update the same repository on another branch, master. If changes are detected, a new branch will be created and/or a pull request will be generated.

If you provide GIT_XY_CONFIG="-", the script will read from STDIN.
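The choice between a file and STDIN can be pictured with the sketch below. It only illustrates the documented behaviour (the default file name git_xy.config, and - meaning STDIN). The function name read_specs is hypothetical and the real script may be organized differently.

```bash
# Hypothetical helper, not part of git_xy: print the specifications
# from the source selected by GIT_XY_CONFIG.
# The body runs in a subshell so the "config" variable stays private.
read_specs() (
  config="${GIT_XY_CONFIG:-git_xy.config}"   # documented default file name
  if [ "$config" = "-" ]; then
    cat                                      # "-" means: read from STDIN
  else
    cat -- "$config"
  fi
)

# Usage with the real script, feeding one specification through STDIN:
#   echo 'git@github.com:icy/pacapt ng lib/ git@github.com:icy/pacapt master lib/' \
#     | GIT_XY_CONFIG="-" ./git_xy.sh
```

Reading from STDIN is convenient when the specifications are generated by another command rather than kept in a file.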

Sample PRs on GitHub

These PRs were generated using the sample configuration file.

Generated by the latest version of the script:

Generated by some older versions of the script:

Environment variables

  • GIT_XY_ENV_FILE: A file where you define configuration for the script. In this file you can define/do whatever you want before the script starts. Note that you may want to use export to make your variables available to subsequent processes.
  • GIT_XY_CONFIG: Path to the configuration file. When it is empty or not specified, the value git_xy.config is used. You can also use - to read the configuration from STDIN.
  • GIT_XY_PUSH_OPTIONS: Options passed to the git push command when a new commit is generated by the script. Defaults to an empty string.
  • GIT_XY_SET_OPTIONS: Options passed to the set command. See man set for details. For example, if you want to turn on debug mode, use GIT_XY_SET_OPTIONS=-x. Please use this variable with care, and note that the options -u and +e are always enforced.
  • GIT_XY_HOOKS: A list of post-commit hooks. By default, it is gh. See Hooks for details.
  • GIT_XY_REVERSE: When the value is yes, the synchronization direction is reversed (dst becomes src, src becomes dst). This is useful when you want to fetch something from the downstream. The pr_base option is not changed when the value is yes.
  • D_GIT_SYNC: Where the script stores its clones of remote repositories. It's hard-coded to $HOME/.local/share/git_xy/, where $HOME is your home folder.
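Two of these variables change behaviour in ways that are easier to see in code. The sketch below is an illustration under stated assumptions, not the script's own implementation: effective_direction and effective_flags are hypothetical names. The first shows the src/dst swap performed when GIT_XY_REVERSE is yes, the second shows how user-supplied set options could be applied before -u and +e are enforced on top of them.

```bash
# Hypothetical helper: print the six mandatory fields in the order that
# will actually be used, swapping src and dst when GIT_XY_REVERSE=yes.
effective_direction() {
  if [ "${GIT_XY_REVERSE:-}" = "yes" ]; then
    printf '%s %s %s %s %s %s\n' "$4" "$5" "$6" "$1" "$2" "$3"
  else
    printf '%s %s %s %s %s %s\n' "$1" "$2" "$3" "$4" "$5" "$6"
  fi
}

# Hypothetical helper: apply GIT_XY_SET_OPTIONS, then enforce -u and +e,
# and print the resulting shell flags ($-). It runs in a subshell so the
# options of the calling shell are left untouched.
effective_flags() (
  if [ -n "${GIT_XY_SET_OPTIONS:-}" ]; then
    # Unquoted on purpose: the value may hold several options.
    set $GIT_XY_SET_OPTIONS
  fi
  set -u    # always enforced: unset variables are errors
  set +e    # always enforced: do not exit on the first failing command
  printf '%s\n' "$-"
)

# Usage:
#   GIT_XY_REVERSE=yes effective_direction a b c d e f   # prints: d e f a b c
#   GIT_XY_SET_OPTIONS=-x effective_flags                # flags include u and x
```

Applying the user's options first and the enforced ones last means a value such as GIT_XY_SET_OPTIONS=-e cannot override them.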

Hooks

A hook is a predefined method that is executed when a new commit is generated. A hook is expected to return a zero exit code.

  • gh: Generate a GitHub pull request when a new commit is created.
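As an illustration of the "return zero" contract, a hook runner could look like the sketch below. Everything in it is an assumption made for the example: the names run_hooks and hook_echo, the hook_<name> naming convention, and the idea that GIT_XY_HOOKS is a whitespace-separated list. Only the default value gh comes from this document.

```bash
# Hypothetical runner: call every hook listed in GIT_XY_HOOKS in order
# and stop at the first one that returns a non-zero code.
# Assumption: hooks are shell functions named hook_<name>.
run_hooks() {
  for _hook in ${GIT_XY_HOOKS:-gh}; do
    if ! "hook_${_hook}" "$@"; then
      echo "hook '${_hook}' returned a non-zero code" >&2
      return 1
    fi
  done
}

# A stand-in hook used only for demonstration; it receives the branch
# name as its first argument.
hook_echo() {
  echo "new commit on branch $1"
}

# Usage:
#   GIT_XY_HOOKS=echo run_hooks my-branch   # prints: new commit on branch my-branch
```

A hook that is listed but not defined fails with "command not found", which the runner treats like any other non-zero return.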

How it works

Nothing magic: it's a wrapper around git clone, rsync and git commit :) Let's say we have the following configuration:

src_repo src_branch src_path dst_repo dst_branch dst_path

The script will then do the following:

  • Create a clone of the src_repo in ~/.local/share/git_xy/src_repo (the actual folder name is a bit different, to avoid some special characters in the user input)
  • Check out the existing branch src_branch
  • Create a clone of the dst_repo in ~/.local/share/git_xy/dst_repo
  • Check out the existing branch dst_branch
  • Create a new branch from dst_branch (if necessary). This branch is used specifically for PR creation.
  • Use rsync to synchronize the contents of the src_path and dst_path. On the local machine where the script runs, it's a variant of the command rsync -ra SRC DST, where SRC is ~/.local/share/git_xy/src_repo/src_path and DST is ~/.local/share/git_xy/dst_repo/dst_path
  • Generate a new commit
  • If specified, execute the hook to generate a new pull request on GitHub
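The steps above can be sketched as one shell function. This is a simplified illustration, not the code of git_xy: the names clone_dir_name and sync_once, the src-/dst- folder prefixes, the way special characters are replaced, the PR branch name and the commit message are all assumptions. Pushing and running hooks are left out.

```bash
# Hypothetical helper: turn a repository URL into a safe folder name by
# replacing every character outside [A-Za-z0-9._-] with "_".
clone_dir_name() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_'
}

# Hypothetical sketch of one synchronization round.
# Arguments: src_repo src_branch src_path dst_repo dst_branch dst_path
# The body runs in a subshell, so "set -eu" and the variables stay local.
sync_once() (
  set -eu
  src_repo=$1; src_branch=$2; src_path=$3
  dst_repo=$4; dst_branch=$5; dst_path=$6
  root="$HOME/.local/share/git_xy"
  # Separate src-/dst- folders (an assumption) let a repository update itself.
  src_dir="$root/src-$(clone_dir_name "$src_repo")"
  dst_dir="$root/dst-$(clone_dir_name "$dst_repo")"
  pr_branch="git_xy-sketch-$dst_branch"      # illustrative branch name

  mkdir -p "$root"
  [ -d "$src_dir" ] || git clone "$src_repo" "$src_dir"
  [ -d "$dst_dir" ] || git clone "$dst_repo" "$dst_dir"

  # Refresh both clones and check out the requested branches.
  git -C "$src_dir" fetch origin
  git -C "$src_dir" checkout -B "$src_branch" "origin/$src_branch"
  git -C "$dst_dir" fetch origin
  git -C "$dst_dir" checkout -B "$dst_branch" "origin/$dst_branch"

  # Branch used for the pull request, created from dst_branch.
  git -C "$dst_dir" checkout -B "$pr_branch"

  # Copy the contents of src_path into dst_path.
  mkdir -p "$dst_dir/$dst_path"
  rsync -ra "$src_dir/$src_path/" "$dst_dir/$dst_path/"

  # Commit only when rsync actually changed something.
  git -C "$dst_dir" add -A -- "$dst_path"
  if ! git -C "$dst_dir" diff --cached --quiet; then
    git -C "$dst_dir" commit -m "Sync $src_path from branch $src_branch"
  fi
)

# Usage:
#   sync_once git@github.com:icy/pacapt ng lib/ git@github.com:icy/pacapt master lib/
```

The trailing slashes in the rsync call make it copy the contents of the source folder rather than the folder itself, which matches the "synchronize the contents" step.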

Well, it's so easy, right? It simply automates your handy commands.

TODO

  • Add tests and automation support for the project
  • Provide a link to the original source
  • More POSIX ;)
  • Sometimes we only need to create a PR without generating any commit
  • Re-use existing branch to generate new PR

Done

  • Read configuration file from STDIN
  • hook/gh: Return zero if a PR already exists
  • Support file synchronization...
  • Add option to reverse the synchronization (dst becomes src and vice versa)
  • Better error reporting
  • Handle --delete option
  • Create a hook script to create pull requests
  • Gather multiple sub-folder in a single PR
  • Re-use existing git_xy branch
  • Better hook to handle where PRs will be created
  • Add some information from the last commit of the source repository
  • Make sure the top/root directory is not used (we allow that)
  • Allow a repository to update itself

Why

There are many tools trying to solve the code-sharing problem:

Native support

  • git submodule: Creates pointers to commit hashes in the upstream repositories, and checks out the upstream repository as a subdirectory of the current repository upon request. It's you and your team who must watch for changes upstream and fetch them manually. The more submodules to watch, the more issues to handle.

  • git subtree: Quite similar to git submodule, but it doesn't provide a fragile pointer. Instead, it fetches all upstream commits and creates merge points in the current repository. This makes it more stable than git submodule, since you always have a copy of the upstream code in your repository. But you also get what you didn't really ask for: duplicated commits, a bigger repository, confusing/noisy commit messages, and you have to learn how to merge (really?). Good reading: https://www.atlassian.com/git/tutorials/git-subtree.

    The original experimental git-subtree project can be found at https://github.com/apenwarr/git-subtree.

It looks like git submodule requires you to understand the C programming language, while git subtree is kind of like Python, which hides pointers from your laptop :D

Meta-repository

Very constrained tools

Back-and-forth tools

Well, there are too many tools... What I really need is a simple way to pull changes from one repository into another and generate a pull request for review, so that the downstream maintainer can decide what to do next.

Moreover, this process should run automatically when the upstream repository is updated. Human intervention is not the right way when there are even just 100 or 500 repositories because of the rise of the micro-repository design (if any) :D

Resources

Author. License

The script is written by Ky-Anh Huynh. The work is released under the MIT license.