jacquesg/p5-Git-Raw

Git::Raw::Push doesn't accept all valid refspecs

sedninja opened this issue · 17 comments

I have a project I am working on that watches a working copy of a git repo and automates the process of pushing to multiple remotes whenever a change is made. When the project was started, we were running perl v5.10 on RHEL 6. However, it came to my attention that 5.10 doesn't support the unicode_strings feature implemented in 5.12 or greater, and I am now in the process of converting to a newer version of perl (5.16 through SCL).

This has required me to start refactoring older Git stuff to use Git::Raw because the original module doesn't seem to be available through SCL or CPAN. I have to say, so far, this is a much more elegant solution. That said, I've hit a snag with regard to git push that may be an issue with my code but seems more likely to be the behavior of the Git::Raw::Push module.

Git::Raw::Push requires that you pass a valid refspec string. The examples shown illustrate a single head push which is great and seems to work when targeting just a single head:

my $refspec = "+refs/heads/$head:refs/remotes/$remote/heads/$head";
$push->add_refspec($refspec);

However, using the syntax outlined in the git documentation to indicate all heads returns an error indicating that the spec is not valid. The strange thing is that if you use the Git::Raw::Refspec::string() method on a remote that doesn't have a custom refspec, the return value is exactly what I am passing, and it is rejected with the error Not a valid reference 'refs/heads/*'

my $rem = $self->remote($remote);
my $spec = $rem->refspecs->string;
$push->add_refspec($spec);

The returned refspec is +refs/heads/*:refs/remotes/origin/heads/* which SHOULD be a valid refspec according to the documentation I am reading. My low level familiarity with the internals of git is limited, so it's possible this is a problem between the computer and chair. Is there anything I'm missing? Please let me know if you need any further information from me.

I think when you're performing a push, the left side of the refspec should be a reference, and thus wouldn't allow you to push all branches. This is possibly a bug in libgit2, I'm not quite sure. Feel free to open an issue in that project to confirm (I'm going in for PRK surgery this morning and won't be able to help for a few days).

In the meantime, what should work for you would be to:

  1. Get the refspec(s) from your remote
  2. Select the branches you want to push.
  3. Add the exact refspec, i.e. +refs/heads/master:refs/remotes/origin/heads/master (if you haven't changed the default namespaces, using only +refs/heads/master: is also OK). You could also possibly use Git::Raw::Refspec->dst_transform on the refspecs from your remote to get the right-hand side (transformed) value.

Actually, what may even be better is to get the upstream branch associated with the branch you're trying to push, i.e.

my $branch = Git::Raw::Branch->lookup ($repo, 'master', 1);
my $upstream = $branch->upstream();
my $refspec = '+'.join(':', $branch->name(), $upstream->name());
$push->add_refspec($refspec);

Granted, you need to have an upstream associated for this to work (like git branch --set-upstream-to), which can also be done via Git::Raw::Branch->upstream().

Thanks @jacquesg.... It looks like I can set an upstream ref immediately when using Git::Raw::Branch to create a new branch so this will work out for me just fine.

@sedninja Did you come right with this?

Thanks for the followup. I am still trying to work some things out to be honest. I am no longer receiving the error that I'm passing an invalid refspec, but the remote is not receiving the expected content.

The project I am working on pulls a git project from remote A and sets the remote name to updates. A new remote is created for remote B which is named origin. Several other remotes get added later for different purposes, and the package I am building will be responsible for keeping the other remotes in sync with origin. The origin repo may be empty when this particular section of code I am working on gets executed, so I may be missing something that needs to happen before I push the ref.

The origin is on a self hosted Gitlab server, and when I push the ref, unpack_ok is successful, but the event that is displayed is:

Test User pushed new branch refs/remotes/origin/master at Test User / test

The link for the "new branch" returns a 404 or 500 response (can't recall at the moment). I would normally expect to see something similar to:

Test User pushed new branch master at Test User / test

I may have just fat fingered something to close this issue. I do not feel the issues I am experiencing are a bug or anything, but any help you might be able to provide will still be greatly appreciated.

The remote side should just see the branch as refs/heads/master, as your remote branch (non server-side) is tracking the local branch server-side.

Can you post the actual refspec you are using when calling add_refspec()?

@jacquesg the current refspec being added is determined by using the name of the remote and target branch

sub push {
  my ( $self, $remote, $branch ) = @_;
  my $push = Git::Raw::Push->new($self->remote($remote));
  my $spec = "+refs/heads/$branch:refs/remotes/$remote/$branch";
  $push->add_refspec($spec);
  # ..... add callbacks and exec push .......
}

The function above is part of a module and executed as $project->push("origin","master"), so the ref that is being added is +refs/heads/master:refs/remotes/origin/master. The $push->unpack_ok value is truthy, indicating a successful push, but the master branch I am trying to push doesn't seem to be packed and pushed upstream. The current origin remote doesn't have a master branch yet since it's a newly created bare repo. That said, the working copy I am working with was created using Git::Raw::Repository::clone and the origin remote url is updated to the REAL origin after a second updates remote is created. To set the url I use the Git::Raw::Remote::url method.

Can you use the status callback to see if you get a message back for the ref, something like this (the semantics are documented in the perldoc, $msg is undef if the update was successful, otherwise it may/will contain a message):

$push->callbacks({
    'status' => sub {
        my ($ref, $msg) = @_;
        if (defined($msg)) {
           print "Status: $ref: $msg\n";
        }
    }
});

Also, you need to call $push->update_tips() after the unpack was successful. You can register a callback on the $remote object to fire for each updated ref (should only be 1 ref as your refspec is limited to 1 branch), something like:

$remote->callbacks({
    'update_tips' => sub {
        my ($ref, $a, $b) = @_;
        print "Updated: $ref ($a -> $b)\n";
    }
});

@jacquesg Thank you so much for helping me get through this. I really appreciate your insight.

So... I did actually have a callback for status already, but my logic was flipped so I was getting an inaccurate report that the push was successful. My new callback is as follows:

$push->callbacks({
    'status' => sub {
        my ( $ref, $msg ) = @_;
        if ( defined $msg ) {
            $self->log("info","updated ref: $ref");
        } else {
            $self->log("error","update failed for $ref");
        }
    },
    'pack_progress' => sub {
        my ( $stage, $current, $total ) = @_;
        $self->log("debug","packed $current objects out of $total");
    }
});

The log output shows the failure now like so:

2014-09-04 21:20:41.000000 +0000 error desman project: update failed for refs/remotes/origin/master

Unfortunately there isn't much to go on here, and I can't see anything from the Gitlab UI because any attempts to view the project once I've executed a push return a 500 error. I will see if I can get any additional information from the Gitlab logs too, and I'll let you know what I find there.

The Gitlab server is not providing any kind of useful message that would indicate a failure in the git push process, although I get a small stack trace when trying to access the project when the 500 error gets thrown, which seems to be due to an "id" field not being supplied.

I should also have pointed out, that the pack_progress callback is producing some interesting log output as well.

2014-09-04 21:55:00.000000 +0000 debug desman project: packed 1 objects out of 0
2014-09-04 21:55:00.000000 +0000 debug desman project: packed 0 objects out of 2039

There is never a point where the values for $current and $total are the same, which is odd. I am building in a docker container, so it may be that the pack fails due to a system dependency that I am unaware of. The current container is built using Ubuntu 14.04 as a base image with the following packages installed via apt-get:

RUN apt-get update -qq && \
        apt-get install -qq -y make gcc openssl libgit2-dev \
        ca-certificates libssh2-1 libssh2-1-dev ssh-client \
        mysql-client libmysqlclient-dev libxml-sax-expat-perl \
        libxml2-dev curl git

I also have a CPAN installation that is scripted out where Git::Raw and other perl modules get installed that aren't available through aptitude. I do not see any issues that would indicate a bad build, but I know that some features may not be implemented if the system libs aren't available at build time.

Woot!

I changed the refspec to +refs/heads/$branch:refs/heads/$branch and the push was successful. The status callback still has an undef msg variable, and the pack_progress output is the same as previously mentioned, but the remote has the contents of the repo, and the Gitlab event viewer shows the expected Test User pushed new branch master at Test User / test event.
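Since the wildcard form was rejected earlier in this thread, one way to push several branches is to add one explicit refspec per branch, each mapping refs/heads/<branch> onto itself as in the working refspec above. A minimal sketch in plain Perl; push_refspecs_for is a hypothetical helper name, not part of Git::Raw:

```perl
use strict;
use warnings;

# Hypothetical helper (not a Git::Raw function): build one explicit push
# refspec per branch name. A true $force prepends '+' to allow
# non-fast-forward updates.
sub push_refspecs_for {
    my ($force, @branches) = @_;
    my $prefix = $force ? '+' : '';
    return map { "${prefix}refs/heads/$_:refs/heads/$_" } @branches;
}

# Print the refspecs that would be handed to $push->add_refspec().
print "$_\n" for push_refspecs_for(1, qw(master devel));
```

Each returned string could then be passed to $push->add_refspec() in a loop before calling $push->finish().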

What would be of interest here would be to print out the refspec that your remote has:

my @remotes = $repo->remotes();
foreach my $remote (@remotes) {
    foreach my $refspec ($remote->refspecs()) {
        print "Remote: ", $remote->name(), ": ", $refspec->string(), "\n";
    }
}

Let's have a look at what this reports. The most robust/correct solution is to actually use the remote's refspec, and then do a transformation.
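To illustrate what "do a transformation" means here, the sketch below applies a wildcard refspec to a single ref name in plain Perl. It is only an illustration of the mapping, assuming a single * on each side; transform_ref is a hypothetical helper and is not how Git::Raw::Refspec->dst_transform is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: map $name through $refspec and return the
# destination ref name, or nothing if the source side does not match.
sub transform_ref {
    my ($refspec, $name) = @_;

    (my $spec = $refspec) =~ s/^\+//;    # drop the optional force marker
    my ($src, $dst) = split /:/, $spec, 2;
    return unless defined $dst && length $dst;

    # No wildcard: the source must match exactly.
    if ($src !~ /\*/) {
        return $dst if $name eq $src;
        return;
    }

    # Wildcard: capture whatever '*' stands for on the source side and
    # substitute it for '*' on the destination side.
    my ($pre, $post) = $src =~ /^(.*?)\*(.*)$/;
    if ($name =~ /^\Q$pre\E(.*)\Q$post\E$/) {
        my $stem = $1;
        (my $out = $dst) =~ s/\*/$stem/;
        return $out;
    }
    return;
}

my $mapped = transform_ref('+refs/heads/*:refs/remotes/origin/*', 'refs/heads/master');
print "$mapped\n";
```

Note that this maps a local branch to its remote-tracking name, which is the fetch-side view; the refspec that worked for the push above targets refs/heads/$branch on the remote.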

In terms of optional libraries, Git::Raw uses openssl and libssh2. Normally on Linux systems, it should find both if they're available. openssl is normally used for HTTPS/SHA-1 hashes, and libssh2 is used if you require SSH support (which most people probably do).

This is how I actually do pushes:

    my $refName = $ref->name();
    my $refSpec = join ('', $force ? '+' : '', "$refName:$refName");

    my $result = $this->_Push ($refSpec);
    if (defined ($result))
    {
        # Ensure there is a remote-tracking branch associated
        $this->AssociateRemoteTracking();
    }

sub _Push
{
    my ($this, $refSpec) = @_;

        my $remote = Git::Raw::Remote->load ($git, 'origin');
        $remote->callbacks ({
            credentials => sub {
                my ($url, $user) = @_;
            },
            update_tips => sub {
                my ($ref, $a, $b) = @_;
            },
            sideband_progress => sub {
                my ($msg) = @_;
            },
        });
        my $pushCallbacks = {
            transfer_progress => sub {
                my ($current, $total, $bytes) = @_;
            },
            pack_progress => sub {
                my ($stage, $current, $total) = @_;
            },
            status => sub {
                my ($ref, $msg) = @_;
            }
        };
        my $push = Git::Raw::Push->new ($remote);
        $push->add_refspec ($refSpec);
        $push->callbacks ($pushCallbacks);
        $remote->connect ("push");
        $push->finish();
        $remote->update_tips();
        $remote->disconnect();
}

This part you may not need, or may need to adapt to your use case. I only use one remote, origin (and assume it's the first one), which is not true for your usage scenario.

sub AssociateRemoteTracking
{
    my ($this) = @_;

    my $ref = $this->{ref};

    if (!$ref->is_remote())
    {
        my $upstream = $ref->upstream();
        if (!defined ($upstream))
        {
            my $git = $ref->owner();
            my $remote = ($git->remotes())[0];

            $upstream = Git::Raw::Reference->lookup (
                join ('/', $remote->name(), $ref->shorthand()), $git);
            if (defined ($upstream))
            {
                $ref->upstream ($upstream);
            }
        }
    }
}

@jacquesg As you suggested, here is a report of all of the refs associated with the 2 current remotes. This is before $push->finish and $push->update_tips are run (immediately after creating the devel branch).

Remote: origin: +refs/heads/*:refs/remotes/origin/*
Remote: origin: +refs/heads/refs/heads/devel:refs/remotes/origin/refs/heads/devel
Remote: origin: +refs/heads/refs/heads/master:refs/remotes/origin/refs/heads/master
Remote: origin: +refs/heads/refs/remotes/origin/master:refs/remotes/origin/refs/remotes/origin/master
Remote: updates: +refs/heads/*:refs/remotes/updates/*

If I stop after the initial clone and configuration of the updates remote, I have just the refs/heads/*:refs/remotes/$remote/* refspec for each remote.


Thanks for helping get this going. It's a little strange that the status callback $msg input variable is undefined, but I'll keep playing around with it. I think this is a much better solution than using the Git module I am coming from. I believe with a bit of work I'll be able to get much more useful information for my users from this module, which is important. Keep up the great work!

I agree that it may be a little bit odd that $msg is undefined on success; however, this is due to the way libgit2 does it, see: https://libgit2.github.com/libgit2/#HEAD/group/push/git_push_status_foreach
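Given those semantics, a status handler would treat an undefined $msg as success and a defined one as the failure reason. A minimal sketch in plain Perl; describe_status is a hypothetical helper, and the log levels simply mirror the ones used earlier in this thread:

```perl
use strict;
use warnings;

# Hypothetical helper: turn the ($ref, $msg) pair passed to the status
# callback into a (level, text) pair. An undefined $msg means the ref
# was updated; a defined $msg carries the reason it was not.
sub describe_status {
    my ($ref, $msg) = @_;
    return defined $msg
        ? ('error', "update failed for $ref: $msg")
        : ('info',  "updated ref: $ref");
}

my ($level, $text) = describe_status('refs/heads/master', undef);
print "$level: $text\n";
```

Inside the callback this could be used as status => sub { my ($level, $text) = describe_status(@_); $self->log($level, $text) }, assuming a log method like the one shown above.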

Feel free to open issues if there is functionality missing which you need, I believe I've managed to bind most of them.