Crash course in Git

Creative Commons Attribution Share Alike 4.0 International (CC-BY-SA-4.0)

Welcome to the Git ½ day crash course for the University of Bergen.

This entire course is available under the Creative Commons Attribution Share Alike 4.0 International license (cc-by-4.0), created by and for Pål Grønås Drange (Pal.Drange@uib.no).

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, even commercially.

This license is acceptable for Free Cultural Works.

The licensor cannot revoke these freedoms as long as you follow the license terms.

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Notices:

You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.

No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.

About this course

There is nothing this course will teach you that you couldn't get from reading the amazing Pro Git book. In the cases where we will not be able to explain a topic any better than the book, which happens to be most topics, we will put a link to the chapter and verse in the book at the beginning of the section.

Most sections have a reference section for further reading, e.g.

References

  1. git-scm
  2. man gittutorial
  3. Learn Git branching

It is recommended that you take some time to read through the material.

Note that this course is entirely console/text based. The goal is that once you understand the foundations of Git and version control, you will also be able to use Git through graphical user interfaces such as gitk, which might already be available on your system.

A half day course

The topics for a four hour course are:

  1. First commits and browsing
  2. Linear use and undoing
  3. Branches
  4. Remotes

The least interesting topic for a fundamentals course is the topic of Remotes, so when the course is squeezed into three hours, we cut that section short.

The topics for a three hour course are:

  1. First commits and browsing
  2. Linear use and undoing
  3. Branches and remotes

Table of contents

  1. Warming up
  2. The empty repository
  3. Your first commit
  4. Browsing our history
  5. Linear use of git
  6. Undoing changes
  7. Branches in git
  8. Remotes
  9. First time setup and configuration
  10. Git commands

Warming up

Open an empty plain text file.

$ nano a.txt

(Note that if you are on Windows, you may use notepad a.txt instead.)

Let us create some content in our file a.txt:

Mr. Utterson the lawyer was
a man of a rugged countenance
that was never lighted by a smile;
cold, scanty and embarrassed in discourse;
backward in sentiment;
lean, long, dusty, dreary
and yet somehow lovable.

Now, let us create a file b.txt which contains some errors on line five:

Mr. Utterson the lawyer was
a man of a rugged countenance
that was never lighted by a smile;
cold, scanty and embarrassed in discourse;
bckwrd n sntmnt;
lean, long, dusty, dreary
and yet somehow lovable.

If we run diff a.txt b.txt in the terminal, we will get the following output (this is important):

5c5
< backward in sentiment;
---
> bckwrd n sntmnt;

There are four lines in this output, and each line is important.

  1. 5c5 consists of two line numbers 5 and 5, separated by a c (for change)
  2. The < character means content in the first file (a.txt)
  3. The --- separator means that we are done with the first file of this hunk
  4. The > character means content in the second file (b.txt)

If we go back to file b.txt and delete the fifth line (the one reading bckwrd n sntmnt;), then diff a.txt b.txt will say

5d4
< backward in sentiment;

You might have guessed it: the d in 5d4 means that a line was deleted. Were we to add a line to the second file, the output would likely be 5a6.
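To make the c, d and a notation concrete, here is a minimal Python sketch that prints output in the same shape as diff for the text above. It relies on the standard difflib module, which uses a different matching algorithm than diff, so on larger files the hunks may be grouped differently. The helper names line_range and normal_diff are made up for this illustration.

```python
from difflib import SequenceMatcher

A_LINES = [
    "Mr. Utterson the lawyer was",
    "a man of a rugged countenance",
    "that was never lighted by a smile;",
    "cold, scanty and embarrassed in discourse;",
    "backward in sentiment;",
    "lean, long, dusty, dreary",
    "and yet somehow lovable.",
]


def line_range(start, end):
    """Format 1-based inclusive line numbers the way diff does: '5' or '2,4'."""
    return str(start) if start == end else f"{start},{end}"


def normal_diff(a, b):
    """Return diff-style output lines describing how to turn a into b."""
    out = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag == "replace":
            out.append(f"{line_range(i1 + 1, i2)}c{line_range(j1 + 1, j2)}")
        elif tag == "delete":
            # after the d: the line in b where the deleted text would have been
            out.append(f"{line_range(i1 + 1, i2)}d{j1}")
        elif tag == "insert":
            # before the a: the line in a after which the new text is added
            out.append(f"{i1}a{line_range(j1 + 1, j2)}")
        out.extend("< " + line for line in a[i1:i2])
        if tag == "replace":
            out.append("---")
        out.extend("> " + line for line in b[j1:j2])
    return out


changed = A_LINES[:4] + ["bckwrd n sntmnt;"] + A_LINES[5:]
deleted = A_LINES[:4] + A_LINES[5:]
added = A_LINES[:5] + ["an extra line"] + A_LINES[5:]

for other in (changed, deleted, added):
    print("\n".join(normal_diff(A_LINES, other)))
    print()
```

The three printed hunks start with 5c5, 5d4 and 5a6 respectively, matching the cases discussed above.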

Notice how a diff (a patch) between two files can tell us how to go from one file (a.txt) to another file (b.txt), and back again. And indeed, there is a tool, called patch, that takes a diff and applies it to a file, producing the second file:

$ diff a.txt b.txt > fix.patch
$ patch a.txt fix.patch
patching file a.txt

It is also possible to go back

$ patch b.txt fix.patch
patching file b.txt
Reversed (or previously applied) patch detected!  Assume -R? [n]

(press y for reverse patching).
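The back-and-forth can also be sketched in a few lines of Python. This is not how the real patch tool is implemented; it is a simplified illustration that only understands the diff format shown above and assumes the patch matches the file exactly. The names parse_hunks and apply_patch are invented for this sketch.

```python
import re

# A hunk header such as 5c5, 5d4, 5a6 or 2,4c2,3
HEADER = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_hunks(patch_text):
    """Split the text of a diff into a list of hunks."""
    hunks = []
    for line in patch_text.splitlines():
        match = HEADER.match(line)
        if match:
            a1, a2, op, b1, b2 = match.groups()
            hunks.append({
                "op": op,
                "a": (int(a1), int(a2 or a1)),  # line range in the first file
                "b": (int(b1), int(b2 or b1)),  # line range in the second file
                "old": [],                      # lines marked with <
                "new": [],                      # lines marked with >
            })
        elif line.startswith("< "):
            hunks[-1]["old"].append(line[2:])
        elif line.startswith("> "):
            hunks[-1]["new"].append(line[2:])
    return hunks


def apply_patch(lines, patch_text, reverse=False):
    """Return a patched copy of lines; reverse=True undoes the patch."""
    result = list(lines)
    # Work from the last hunk upwards so earlier line numbers stay valid.
    for hunk in reversed(parse_hunks(patch_text)):
        op, (start, end) = hunk["op"], hunk["a"]
        old, new = hunk["old"], hunk["new"]
        if reverse:
            op = {"a": "d", "d": "a", "c": "c"}[op]
            start, end = hunk["b"]
            old, new = new, old
        if op == "a":
            result[start:start] = new  # insert after line number start
        else:
            if result[start - 1:end] != old:
                raise ValueError("patch does not match the file")
            result[start - 1:end] = new  # new is empty for a deletion
    return result


a_txt = [
    "Mr. Utterson the lawyer was",
    "a man of a rugged countenance",
    "that was never lighted by a smile;",
    "cold, scanty and embarrassed in discourse;",
    "backward in sentiment;",
    "lean, long, dusty, dreary",
    "and yet somehow lovable.",
]
fix_patch = "5c5\n< backward in sentiment;\n---\n> bckwrd n sntmnt;\n"

b_txt = apply_patch(a_txt, fix_patch)                  # a.txt -> b.txt
restored = apply_patch(b_txt, fix_patch, reverse=True)  # b.txt -> a.txt
print(b_txt[4])
print(restored == a_txt)
```

Applying the hunks from the bottom of the file upwards is a design choice that keeps the line numbers of the remaining hunks valid without any bookkeeping.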

Once you understand this back-and-forth you will be able to understand everything version control is about!

Hunks

As we saw above, the diff output consisted of one line with something like 5c5, one (or more) line(s) starting with <, a separator ---, followed by one (or more) line(s) starting with >. This is one section of a diff, called a hunk. If you have several changes in different places in the file, you will likely have several hunks. You should be able to decode their meaning now, i.e. what is the difference between a.txt and b.txt.

2c2
< a man of a rugged countenance
---
> a man of rugged countenance
6c6
< lean, long, dusty, dreary
---
> lean, long dusty, dreary

Lines are first-class citizens

Note that diff, patch, and therefore also git are primarily line oriented, meaning that changes are recorded on a line-by-line basis. This has some drawbacks for programming languages (and other content like markup and natural text) where one logical unit naturally spans several lines.
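As a small illustration of that drawback, the Python sketch below compares a one-line function call with the same call spread over four lines. The program means exactly the same thing, but a line-based comparison sees one removed line and four added lines. The snippet being compared and the helper changed_line_counts are made up for this example, and difflib is used as a stand-in for diff.

```python
from difflib import SequenceMatcher

one_line = ["total = add(first, second)"]
reflowed = [
    "total = add(",
    "    first,",
    "    second,",
    ")",
]


def changed_line_counts(a, b):
    """Count how many lines a line-based comparison reports as removed and added."""
    removed = added = 0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed += i2 - i1
            added += j2 - j1
    return removed, added


print(changed_line_counts(one_line, reflowed))
```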

Exercises

  1. Create (touch) a new text file.
  2. Write some lines.
  3. Copy that file to a different file and make a couple of changes.
  4. Run diff on the two files and save the output to a new file.
  5. Experiment with several changes and inspect the output of diff.
  6. Use patch to apply the saved output to the original file.

The empty repository

(NB: You probably want to create a new folder to experiment in with mkdir path and cd path prior to git init.)

We now understand the fundamentals behind revisions of files, so we are ready to start using git. Go into a new empty folder and simply write

git init

The output will be

Initialized empty Git repository in /Your/username/path/.git/

Congratulations, you have started using git!

If you get an error message about not having installed git, it is probably your first time using that tool. If that is the case, have a look at the section First time setup and configuration on how to configure your first git setup.

To see what we have in our folder, we run

git status

which outputs

On branch master

No commits yet

nothing to commit (create/copy files and use "git add" to track)

Git here tells us that we have no history, and thus no files. So in the next section, we will take our file a.txt and add it to git.

Observe, though, that everything in Git is stored in the .git folder, and you can always investigate its content by running tree .git (or find .git if your system does not have tree). On Windows, you can run cmd //c tree .git to use Windows' tree command.

$ tree .git
.git
├── branches
├── config
├── description
├── HEAD
├── hooks
│   ├── applypatch-msg.sample
│   ├── […]
│   └── update.sample
├── info
│   └── exclude
├── objects
│   ├── info
│   └── pack
└── refs
    ├── heads
    └── tags

Exercises

  1. Make a new folder
  2. Run git init
  3. Run git status
  4. Read How to Write a Git Commit Message.

References

  1. git init
  2. git status
  3. How to Write a Git Commit Message

Your first commit

The sooner we talk about the three stages of tracked files, the better:

  1. Tracked, unchanged
  2. Tracked, changes not staged to be committed
  3. Tracked, changes staged to be committed
  4. (... and of course the untracked files)

When we have added a file to the git tree, it is tracked. If we change that file, the file will immediately be marked as changed, but the changes are not staged. Before we can commit the changes to the git tree, we need to stage the changes, or add the changes. We do this by running git add a.txt. To then commit the staged changes, we run git commit -m "Description/message of change".

git add a.txt
git commit -m "Initial commit"

Which outputs

[master (root-commit) 1fe91d0] Initial commit
 1 file changed, 6 insertions(+)
 create mode 100644 a.txt
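The stages listed above can be read off the output of git status --short, which prints a two-character code in front of each path: the first character describes the staged state and the second the state of the working copy, with ?? for untracked files. The following Python sketch decodes such lines. The sample lines and file names are hypothetical, the function name describe is invented, and the sketch ignores renames and merge conflicts. Tracked, unchanged files do not appear in the output at all.

```python
STAGED = "tracked, changes staged to be committed"
NOT_STAGED = "tracked, changes not staged to be committed"


def describe(status_line):
    """Map one line of `git status --short` output to the stages in the text."""
    index_state, worktree_state = status_line[0], status_line[1]
    path = status_line[3:]
    if status_line.startswith("??"):
        return path, ["untracked"]
    stages = []
    if index_state != " ":
        stages.append(STAGED)
    if worktree_state != " ":
        stages.append(NOT_STAGED)
    return path, stages


# Hypothetical sample output, one line per file
sample = [
    "?? c.txt",  # never added
    " M a.txt",  # edited, but not yet staged
    "M  b.txt",  # edited and staged with git add
    "MM d.txt",  # staged, then edited again
]

for line in sample:
    print(describe(line))
```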

Exercises

  1. Create a text file with some content.
  2. Run git add on the file
  3. Run git status
  4. Run git commit -m "Initial commit"

References

  1. git add
  2. git commit

Browsing our history

A crucial part of using Git is to read the history of the project; it is after all a revision control system.

Exercises

  1. Run git log
  2. Run git log -p (use q to quit if necessary)
  3. Run git log --oneline

References

  1. git log

Linear use of git

Now we are ready to use Git.

Open a.txt and add a couple of lines.

As many times as you need:

edit a.txt
git diff
git status
git add a.txt
git commit -m "Changed a.txt"
git status
git log --oneline
git log -p

Exercises

  1. Change a line, diff, status, add and commit, log
  2. Delete a line, diff, status, add and commit, log
  3. Add a line, diff, status, add and commit, log
  4. Write a longer file with at least three paragraphs, add and commit ...
  5. ... then modify the file in the top and bottom paragraphs and run git add -p
  6. Change top and bottom paragraph, add and commit only the top changes

Undoing changes

Being humans, one thing that will never change is that we make mistakes. Like a lot. Sometimes we delete a file we didn't intend to delete, and sometimes we change things we didn't want changed. Git is super-helpful in these cases. Even if we commit the changes we didn't intend, we can still navigate back to the previous version we were happy with.

First of all, you should know that

git reset --hard

removes all your changes and everything you added since the last commit. Hence, if you commit often, you can always get back to a nearby state by running git reset --hard. But realize that this throws away all your uncommitted changes to tracked files, which might not be what you want.

Reverting all the changes

Let's say you delete a.txt and maybe overwrite b.txt with some garbage that you didn't intend to. If there is no other change you need to keep since the previous commit, you run

git reset --hard

However, that resets everything. Sometimes you want to keep some of the changes (say, the changes to b.txt), but you might still want to recover a.txt. The simplest way out is to run

git add b.txt
git commit -m "Saving changes to b"
git reset --hard

Now, since we saved the changes to b.txt, the reset --hard restores a.txt, but leaves b.txt as it is in the commit.

Undoing a staging (an add)

Suppose that you change b.txt and stage the changes with git add b.txt, but immediately regret staging them. By running

git reset

we unstage everything. We can also run git reset b.txt to reset only b.txt. In some sense, git reset <path> is the opposite action of git add <path>.

Undoing a commit

Sometimes we even commit a staged change by accident, or after review, we realize that we have committed an error. We can roll back to a previous commit; however, this should only be done to commits that exist only locally (that is, commits that have not been shared with others), or in other special circumstances.

If we have this history:

* be70d88 (HEAD -> master) Add c
* 434bbc2 Add b
* d52b990 Initial long commit
* 4468f5d Initial commit

then git reset --hard 434bbc2 results in this tree:

* 434bbc2 (HEAD -> master) Add b
* d52b990 Initial long commit
* 4468f5d Initial commit
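To see how the different resets relate, it can help to think of three places where a file lives: the last commit, the staging area, and the working folder. The Python class below is a toy model of that idea, not a description of how Git is implemented. The class ToyRepo and its methods are invented for this sketch, and commits are dropped by count rather than addressed by hash.

```python
class ToyRepo:
    """A toy model of three areas: commits, index (staging) and working tree."""

    def __init__(self):
        self.commits = []   # each commit is a snapshot: {path: content}
        self.index = {}     # what the next commit will contain
        self.worktree = {}  # the files in the folder

    def head(self):
        return self.commits[-1] if self.commits else {}

    def add(self, path):
        """Like git add <path>: copy the file (or its deletion) into the index."""
        if path in self.worktree:
            self.index[path] = self.worktree[path]
        else:
            self.index.pop(path, None)

    def commit(self):
        """Like git commit: record the index as a new snapshot."""
        self.commits.append(dict(self.index))

    def reset(self, path=None):
        """Like git reset [<path>]: unstage, leaving the working tree alone."""
        paths = [path] if path else set(self.index) | set(self.head())
        for p in paths:
            if p in self.head():
                self.index[p] = self.head()[p]
            else:
                self.index.pop(p, None)

    def reset_hard(self, back=0):
        """Like git reset --hard, optionally dropping the last `back` commits."""
        tracked = set(self.index) | set(self.head())
        if back:
            del self.commits[-back:]
        # Untracked files survive; tracked files are restored from HEAD.
        self.worktree = {p: c for p, c in self.worktree.items() if p not in tracked}
        self.worktree.update(self.head())
        self.index = dict(self.head())


repo = ToyRepo()
repo.worktree = {"a.txt": "v1", "b.txt": "v1"}
repo.add("a.txt")
repo.add("b.txt")
repo.commit()

del repo.worktree["a.txt"]          # rm a.txt
repo.worktree["b.txt"] = "garbage"  # overwrite b.txt
repo.add("b.txt")                   # staged by mistake
repo.reset("b.txt")                 # unstage; the file itself is untouched
after_reset = (repo.index["b.txt"], repo.worktree["b.txt"])

repo.reset_hard()                   # both files are back at the last commit
after_hard = dict(repo.worktree)

repo.worktree["c.txt"] = "new"
repo.add("c.txt")
repo.commit()
repo.reset_hard(back=1)             # roll back the commit that added c.txt
print(after_reset, after_hard, sorted(repo.worktree))
```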

Exercises

  1. Delete a.txt with rm a.txt and run git status. Revert the change with git reset --hard.
  2. Stage changes in several files, unstage only one of the file changes.
  3. Unstage everything, and stage with git add -p
  4. Commit and roll back to a previous commit.

Branches in git

A commit can, as we have seen, be viewed as a diff between different versions of files (internally, Git stores each commit as a snapshot of all tracked files), and commits form the base of git. A commit lives in a branch.

When we ran git init, we started with a branch called master. While we add and commit, we commit to master.

We can branch out from master to create new commits that are "separate" from the commits we have on master by using git branch:

git branch my_first_branch

The branch command adds a new branch called my_first_branch, and if we run git branch now, we will see that we have two branches, master and my_first_branch. The next step is to go to the new branch:

git switch my_first_branch

This takes us to the new branch that branched out from HEAD on master, and if we run git log, we will see that we are on my_first_branch.

When we now change a file, and add and commit it, we can see that my_first_branch has moved beyond master by running git log --oneline --graph:

* f308559 (HEAD -> my_first_branch) Add line on new branch
* 1fe91d0 (master) Initial commit

(We note that git checkout has been "replaced" with git switch and git restore, and that git switch -c new_branch has become the recommended alternative to git checkout -b new_branch.)

Merging a branch into master

When we work with a branch, we usually intend to merge the branch with the master branch (but not always).

Run git branch to ensure that we are on master. Run git switch -c new_branch and make changes to a.txt. Stage and commit the changes, and check out master.

Observe that we have two different versions of a.txt.

When we want to merge new_branch into master, we simply write

git merge new_branch

This was a very simple merge, which resulted only in a fast-forward.

Let us make a more complicated merge. Check out a new branch, called second_branch. Change a.txt at the top of the file (and stage and commit)! Go back to master and change a.txt at the bottom of the file (stage & commit). Observe that we have two different (changed) versions of a.txt in the two branches.

Go to master and again run git merge second_branch.

At this point, git will create a new commit for you, called a merge commit. You are asked to provide a commit message, but the default

Merge branch 'second_branch'

is a good message, so we keep that.

Git will now do an auto-merge.

At this point, it can be illuminating to run

$ git log --oneline --graph
*   5d7576b (HEAD -> master) Merge branch 'second_branch'
|\
| * 039842c (second_branch) Change in second to a
* | ff20c97 Change in master to a
|/
* 02c02bc Add changes to file on master
* cf6489e (new_branch) Add change in new_branch
* f8213a3 Add file c
* 434bbc2 Add b
* d52b990 Initial long commit
* 4468f5d Initial commit
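The difference between the fast-forward and the merge commit can be modelled with very little code if we think of a branch as a name that points at a commit, and of a commit as something that remembers its parents. The Python sketch below is such a toy model; the commit ids (c1, c2, ...) and the function names are made up, and file contents are left out entirely.

```python
parents = {}   # commit id -> list of parent commit ids
branches = {}  # branch name -> commit id the branch points at


def commit(branch, commit_id):
    """Add a commit on top of branch and move the branch pointer to it."""
    tip = branches.get(branch)
    parents[commit_id] = [tip] if tip else []
    branches[branch] = commit_id


def ancestors(commit_id):
    """Return commit_id together with everything reachable through parents."""
    seen, todo = set(), [commit_id]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(parents[current])
    return seen


def merge(into, other, merge_id):
    """Merge branch other into branch into; report which kind of merge it was."""
    ours, theirs = branches[into], branches[other]
    if theirs in ancestors(ours):
        return "already up to date"
    if ours in ancestors(theirs):
        branches[into] = theirs         # just move the pointer forward
        return "fast-forward"
    parents[merge_id] = [ours, theirs]  # a merge commit has two parents
    branches[into] = merge_id
    return "merge commit"


commit("master", "c1")
branches["new_branch"] = branches["master"]     # branch out from master
commit("new_branch", "c2")
first = merge("master", "new_branch", "m1")     # master has not moved

branches["second_branch"] = branches["master"]
commit("second_branch", "c3")
commit("master", "c4")                          # now both branches have moved
second = merge("master", "second_branch", "m2")
print(first, "/", second)
```

In the first merge master is an ancestor of new_branch, so nothing needs to be combined; in the second, both branches have commits the other lacks, which is why a new commit with two parents appears in the graph.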

Merge conflicts

It is at this point recommended to breathe, enter lotus position, revisit the previous exercises and embrace. We will provoke a CONFLICT in git.

As you saw in the previous section, git could easily merge the changes, because they touched completely different parts of the file. Git manages to keep both changes, and no data/change is lost.

But what if two branches change the same line?

Let us check out yet another branch, git switch -c future-confl. Edit any line in a.txt and commit. Go back to master and edit the same line in a.txt but in a different way. Now let's merge!

$ git merge future-confl
Auto-merging a.txt
CONFLICT (content): Merge conflict in a.txt
Automatic merge failed; fix conflicts and then commit the result.

Run git status to see what is going on:

On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:   a.txt

no changes added to commit (use "git add" and/or "git commit -a")

We are now in a conflict state, which we can easily get out of by running git add a.txt and git commit. But first, let's inspect a.txt.

$ git diff
  The text above the change ...
++<<<<<<< HEAD
 +this line was changed in master.
++=======
+ whereas this line was changed in future-confl.
++>>>>>>> future-confl
  this is some more text ...

What we can see here is that git informs us that we have to choose which change we want, or more precisely, how we want a.txt to look on master.

Let's go into the a.txt file and fix the conflict. Run git add a.txt and git commit. Again we are prompted with a commit message, and again the default is ok,

Merge branch 'future-confl'

Again we can run git log --oneline --graph to see our work:

*   20bb70d (HEAD -> master) Merge branch 'future-confl'
|\
| * a80f35e (future-confl) Change a in future-confl
* | 80ca334 Change a in master
|/
*   5d7576b Merge branch 'second_branch'
|\
| * 039842c (second_branch) Change in second to a
* | ff20c97 Change in master to a
|/
* 02c02bc Add changes to file on master
* cf6489e (new_branch) Add change in new_branch
* f8213a3 Add file c
* 434bbc2 Add b
* d52b990 Initial long commit
* 4468f5d Initial commit

We have successfully provoked a conflict as well as resolved it.

Take away message: A conflict is Git telling us that she doesn't know which change you want to keep, because you changed the same part in both branches you want to merge. Git then asks you to manually pick how the result should look, and when you are happy, you simply stage & commit.
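The rule behind the take away message can be written down as a tiny three-way merge. The Python sketch below compares each line in the two branches with the version they both started from. It is heavily simplified: it assumes that all three versions have the same number of lines, whereas Git works on hunks and also handles inserted and deleted lines. The function name merge_lines and the sample texts are invented for the illustration.

```python
def merge_lines(base, ours, theirs, their_name):
    """Three-way merge of files that all have the same number of lines."""
    merged, conflict = [], False
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:
            merged.append(o)  # same result in both, or only we changed it
        elif o == b:
            merged.append(t)  # only they changed it
        else:
            conflict = True   # both changed it, in different ways
            merged += ["<<<<<<< HEAD", o, "=======", t, ">>>>>>> " + their_name]
    return merged, conflict


base = ["top", "middle", "bottom"]

# Changes in different places: both are kept, no conflict
auto, auto_conflict = merge_lines(
    base,
    ["top", "middle", "bottom (master)"],
    ["top (second_branch)", "middle", "bottom"],
    "second_branch",
)

# Changes to the same line: we are asked to choose
clash, clash_conflict = merge_lines(
    base,
    ["top", "changed in master", "bottom"],
    ["top", "changed in future-confl", "bottom"],
    "future-confl",
)

print(auto, auto_conflict)
print("\n".join(clash))
```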

Exercises

  1. Create a new branch

  2. Jump between branches

  3. Change the file in one branch, add, commit, observe the file in the different branches

  4. Merge the branch into master

  5. Create a new branch, and change a.txt in the top. Go back to master and change it in the bottom parts. Merge.

  6. Create a new branch, change the same parts in both master and the new branch. Merge and resolve conflicts.

  7. Create a different branch, make a commit, go back to master and cherry-pick the commit in.

  8. Read the manual entry for git rebase.

  9. Python exercise: Create a file cc.py with the following content

    def count_chars(fname, char):
        count = 0
        with open(fname, "r") as f:
            for line in f:
                count += line.count(char)
        return count
    
    
    if __name__ == "__main__":
        from sys import argv
    
        print(count_chars(argv[1], argv[2]))

    Create a branch where you change fname to filename (both places), and on master, change count_chars to count_character (both places).

    Merge the changes into one working Python script containing both changes.

    Ensure that the script works by running

    $ python cc.py cc.py i
    10
    $ python cc.py cc.py a
    14
    

References

  1. git branch
  2. git merge
  3. git rebase
  4. git cherry-pick

Remotes

We are now actually at a level where we can use Git for everything, but we have not used it as a collaborative tool. It is actually possible to use it productively by sending the changes (commits and even full branches) via email. However, it is also possible to use a common server, which Git calls a remote, to share a Git tree. (See note.)

Suppose that you (Alice) are working with Bob on the same Git tree, and that the Git tree is stored on a Git server with URL https://example.com/git/project.git.

You can add that URL as a remote for your git tree by using

git remote add origin https://example.com/git/project.git

The name origin here is arbitrary but standard. Git can also talk to the server over ssh, in which case the URL above would look like git@example.com:git/project.git (the user name before the @, here git, depends on the server).

We could add such a remote (choosing a different name now) with

git remote add upstream git@example.com:git/project.git

Cloning an existing repo

If the repository already exists on the server, you would more likely want to clone the repository:

git clone git@example.com:git/project.git

Note on ssh vs https

When using the https protocol, we are met with a login prompt whenever we want to interact with the server, of the kind

username:
password:

However, in today's era, we very often recommend the use of two-factor authentication (2FA), and as you can perhaps imagine, this prompt rarely works well with 2FA. Hence, if you have enabled 2FA on your Git server, you should not be surprised if you run into problems using the https protocol.

We therefore recommend you use the ssh protocol. The ssh protocol is based on a private/public key pair (asymmetric encryption), where you generate two keys, one that you keep secret (your private key), and one you give to the Git server (and everyone else in the world, your public key).

There is a tutorial on GitHub.com for Generating a new SSH key and adding it to the ssh-agent

As long as you keep the private key secret, this is a very good and secure way of working with Git. You can add several keys, one for each computer you use.

It is also possible to use HTTPS without 2FA, but with a personal access token instead of the password. In this case, your username is your GitHub username, and the password to use is the token. Beware that you need to take care to not leak the token, since that is a stand-in for password+2FA.
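Which protocol is used is decided by the shape of the remote URL. The Python sketch below sorts URLs into the kinds mentioned in this section. The function name transport is made up, the URLs are the example.com placeholders from above (the user name git is an assumption), and the rule for the short ssh form, that there is a colon with no slash in front of it, is a simplification of what Git does.

```python
def transport(url):
    """Guess how Git would talk to the remote behind this URL."""
    if "://" in url:
        scheme = url.split("://", 1)[0]
        return scheme if scheme in ("https", "http", "ssh", "git") else "other"
    head, colon, _ = url.partition(":")
    if colon and "/" not in head:
        return "ssh"  # short form: user@host:path
    return "local path"


examples = [
    "https://example.com/git/project.git",
    "git@example.com:git/project.git",
    "ssh://git@example.com/git/project.git",
    "/path/to/repo",
]

for url in examples:
    print(transport(url), url)
```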

Fetching, pulling, and pushing

To push your tree to the server, you simply write git push.

To fetch the tree on the server, you run git fetch.

If you want to fetch and "merge" the changes on the server, you run git pull.

Exercises

  1. Clone any Git repository and run git log

Notes

  1. A remote is not necessarily stored on a server. Indeed, you can actually clone a repository on your local (or remote) hard drive. Simply make a new folder and write git clone /path/to/repo.

First time setup and configuration

If you have already installed and configured git, you may want to skip this section.

If this is the first time you are using git, you may run into problems due to a lack of configuration. The first time you use git, you should set your name and your email address, which will later be used to label the commits you contribute to different projects.

There is a tool called git config which lets you set different parameters of your local configuration. If you wish to set your name and your email address (you should), you can use git config in the following way:

git config --global user.name "Ola Nordmann"
git config --global user.email olanordmann@example.com

Creating a .gitignore file

When programming, a lot of files are created that you may not want to save in your version control system, such as compiled versions of files, files created by IDEs, or big data files.

To solve that problem, you can create a global ignore file, which contains a list of file names (including wildcards) that git should ignore.

To create a global .gitignore file, use the following command:

nano ~/.gitignore

You may use a different editor if you like. Inside that file, you can add the names of files or folders you would like to ignore forever. Every line in the file is one entry to ignore. If you use a star (*), you create a wildcard, which matches any number and combination of symbols.

An example .gitignore file for Java can look like this:

*.class
*.jar

That file tells git to ignore every file with the file ending .class or .jar. There are good websites on the Internet that generate .gitignore files automatically, depending on the programming languages and IDEs you are using. One example is this website.
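To get a feeling for how the star wildcard matches file names, here is a small Python sketch using the standard fnmatch module. It only covers simple patterns like the two above, compared against the file name without its folder; the real .gitignore format has more rules than this sketch handles. The function name is_ignored and the file names are made up.

```python
from fnmatch import fnmatchcase

patterns = ["*.class", "*.jar"]


def is_ignored(path, patterns):
    """True if the file name (without its folder) matches any pattern."""
    filename = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


for path in ["Main.class", "src/Main.java", "lib/tool.jar", "class.txt"]:
    print(path, is_ignored(path, patterns))
```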

After creating the file, you need to tell git where the file is saved by running the following command:

git config --global core.excludesfile ~/.gitignore

Due to the weirdness of Microsoft products, this command is different on Windows (source):

git config --global core.excludesfile %USERPROFILE%\.gitignore

You can also create local .gitignore files by creating a file with that name in one of your git-repositories.

Aliases

There are a lot of git commands, and many of them can feel quite long if you type them very often. Therefore, git allows you to alias commands, which means that you create shortcuts for them.

To do that, open your .gitconfig file with a text editor of your choice.

nano ~/.gitconfig

In there, one needs to create an [alias] section followed by any number of aliases. This can be added to the end of the file. Their format is like a variable assignment in a programming language: the left side is the new command and the right side is the old command (source).

An example can look like this:

[alias]
praise = blame

l = log --pretty=format:"%C(yellow)%h\\ %ad%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --date=short
ll = log --pretty=format:"%C(yellow)%h%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --numstat
ld = log --pretty=format:"%C(yellow)%h\\ %C(green)%ad%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --date=short --graph
ls = log --pretty=format:"%C(green)%h\\ %C(yellow)[%ad]%Cred%d\\ %Creset%s%Cblue\\ [%cn]" --decorate --date=relative

a = add
ap = add -p
cm = commit -m
d = diff
s = status

Git commands



Copyright 2020-2022 Pål Grønås Drange, (cc-by-4.0)