How to Use Git: a Tutorial

From Opentaps Wiki

This document is a work in progress.

General Concepts

Git is a free, open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency... once you learn how to use it.

Git can be (more than) a little confusing at first when coming from a Subversion background. Let's first try to compare Subversion and Git:

  • it is distributed: every checkout is actually a clone of the master repository, which means that there is no higher authority or single point of failure. Effectively every developer has their own repository with its own branches and its own commits.
  • because of that, it can also be described as disconnected: one does not need to access the master repository in order to commit, branch, merge or review the history.
  • this makes all operations that would normally need the Subversion server usable locally (checking the commit log, blame, branching and merging).
  • it is storage efficient: the Opentaps repository in Git, including 1.0.0 / 1.4 / 1.0 and all the branches and tags, takes around 900 MB, whereas an SVN checkout of 1.4 alone would typically be more than 600 MB; multiply this by each branch and by each tag ...

Probably one of the most important points is branching. Because Subversion branches are difficult to merge and expensive to create (one needs to do a whole checkout to get a branch), they are mostly used as dead branches.

In comparison, in Git branches are very easy to create and merge back. In the git way of thinking, they are "free," and because they are local they are actually used all the time. The reason why a complete Git repository of opentaps is only 900 MB instead of a few gigabytes is that the branches all share a lot of common content.

Git branches are typically used for:

  • local only configuration
  • unit of work (tickets)
  • client and sub projects

Local Configuration Branch

One of the first things that we all do is the configuration of the database in `entityengine.xml`, and in Subversion we have to lock the file or else it ends up accidentally modified.

In Git one could instead use a configuration branch.

This branch can be considered a floating branch, because we want to apply it on top of whatever branch we are currently working on.

*-----[master]----[work]----[configuration]
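
As a rough sketch of how such a floating branch could be moved, the script below rebases a configuration branch on top of a work branch in a throwaway repository. The file names (entityengine.conf as a stand-in for entityengine.xml, Feature.java) and the branch names are only assumptions for the illustration.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master    # call the first branch master whatever the local default is
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"

echo "datasource=localderby" > entityengine.conf   # hypothetical stand-in for entityengine.xml
git add entityengine.conf
git commit -q -m "Initial commit"

# The configuration branch holds only the local database setting
git checkout -q -b configuration
echo "datasource=mysql" > entityengine.conf
git commit -q -a -m "Local database configuration"

# Real work happens on another branch created from master
git checkout -q -b work master
echo "class Feature {}" > Feature.java
git add Feature.java
git commit -q -m "Some work"

# Float the configuration on top of the work branch
git rebase -q work configuration
base=$(git merge-base work configuration)
work_tip=$(git rev-parse work)
setting=$(cat entityengine.conf)
```

After the rebase, configuration sits directly on top of work, so the working directory has both the new code and the local database setting.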

Unit of work Branch

Let's say you want to work on a feature. With git, you can create a branch for your feature, work on it, and then merge it back to the main branch. A typical workflow is, from

*-----[master]

Create a working branch:

*-----[master][1234-some-ticket]

Make some commits, so your 1234-some-ticket branch advances further:

*-----[master]---[1234-some-ticket]

Then push the branch on `origin` for review, and have the reviewer merge it:

*-----o------[master]
      |                       
       \---[1234-some-ticket]

master> git merge 1234-some-ticket

*-----o-----------------------(1234 merged)[master]
      |                       |
       \---[1234-some-ticket]/

If we do not want the reviewer to do any conflict resolution and want to do it ourselves, we can merge the master into our 1234-some-ticket branch first:

*-----o------[master]
      |                       
       \---[1234-some-ticket]

1234-some-ticket> git merge master

*-----o------[master]
      |             |
       \------------(master merged)[1234-some-ticket]

Note that all of this does not mean that the branch has to be local until ready; it can be pushed little by little.

We can then push it back to the master. This only produces one merge commit in the `master` branch per ticket / feature. This makes it easier for third parties to follow.
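
The workflow above can be tried out in a throwaway repository. This is only a sketch: the file names are made up, and master is advanced by hand so that the merge is a real merge rather than a fast-forward.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"

echo base > README
git add README
git commit -q -m "Initial commit"

# Create the ticket branch and make two commits on it
git checkout -q -b 1234-some-ticket
echo one > feature.txt
git add feature.txt
git commit -q -m "1234: first step"
echo two >> feature.txt
git commit -q -a -m "1234: second step"

# Meanwhile master moves on
git checkout -q master
echo other > other.txt
git add other.txt
git commit -q -m "Unrelated work"

# The reviewer merges the ticket branch into master
git merge -q --no-edit 1234-some-ticket
merges=$(git rev-list --merges --count master)
git log --oneline --graph
```

However many commits the ticket branch has, master only gains a single merge commit for it.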

Client and Sub Projects

Typically clients implementing opentaps or developers creating a project based on it use their own subversion repository created from a checkout of opentaps. Luckily the architecture of opentaps allows them to only store a custom hot-deploy component, but sometimes this is still a bit limited and as we know they end up touching code outside of their own components. Finally they sometimes request features / bug fixes that we implement in both our and their repository.

Having the ability to clone our repository and branch it easily means that the changes they make are contained in their branch. It also makes it easier for them to receive updates from the main opentaps repository. Finally, by setting up their repository as a new remote (e.g. `client-origin`), you can have many clients in the same local repository (saving some disk space in the process). All that is needed is SSH access, which is much easier to set up than an SVN server.

*--o--o-----------------------------[origin/master]
   |  |                                 |     |
   |  |                                 |     |  
   |  \----------[client-origin1/client]-[merged]
   |                                    |
   |                                    |
    \---------[client-origin2/client]-[merged]

Using Git

Now let's actually try to use git.

Getting the Code

First, let's take a look at the basic process of getting the code from git and making some local changes. Getting the code is a lot like subversion, at least on the surface. You would use the git clone command to pull the code from the remote repository to your local computer:

$ git clone git://gitorious.org/opentaps/opentaps.git git-opentaps
Initialized empty Git repository in /Users/sichen/Documents/workspace/git-opentaps/.git/
remote: Counting objects: 78173, done.
remote: Compressing objects: 100% (14699/14699), done.
remote: Total 78173 (delta 57124), reused 77443 (delta 56698)
Receiving objects: 100% (78173/78173), 219.74 MiB | 303 KiB/s, done.
Resolving deltas: 100% (57124/57124), done.
Checking out files: 100% (15107/15107), done.

If you have an existing repository, you can also use it as a reference when you clone, like this:

$ git clone git://gitorious.org/opentaps/opentaps.git new-opentaps-git --reference /path/to/previous-opentaps-git/

Now, let's imagine that you modified a file, such as framework/entity/config/entityengine.xml, and changed it to use mysql instead of the embedded Derby database. You can use git status to get a list of the files that have been modified:

$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   framework/entity/config/entityengine.xml
#
no changes added to commit (use "git add" and/or "git commit -a")

git diff will show you the actual modifications, or differences:

$ git diff
diff --git a/framework/entity/config/entityengine.xml b/framework/entity/config/entityengine.xml
index 652c6fc..d6d341e 100644
--- a/framework/entity/config/entityengine.xml
+++ b/framework/entity/config/entityengine.xml
@@ -51,7 +51,7 @@ access. For a detailed description see the core/docs/entityconfig.html file.
     <connection-factory class="org.ofbiz.entity.connection.DBCPConnectionFactory"/>
 
     <delegator name="default" entity-model-reader="main" entity-group-reader="main" entity-eca-reader="main" distributed-cache-clear-enabled="false">
-        <group-map group-name="org.ofbiz" datasource-name="localderby"/>
+        <group-map group-name="org.ofbiz" datasource-name="localmysql"/>
         <group-map group-name="org.ofbiz.olap" datasource-name="localderbyolap"/>
         <group-map group-name="org.opentaps.analytics" datasource-name="analytics"/>
         <group-map group-name="org.opentaps.testing" datasource-name="testing"/>

etc. etc.

Updating

To bring your clone up to date with the remote main opentaps repository,

$ git fetch

This will get updates from all the branches on the remote repository and pull them to your local clone. If you have created local branches from your remote branches, then you would need to update your local branches from the remote branches that were updated (see below):

$ git rebase origin/1447 1447

If however your local branches have changes, git will complain:

cannot rebase: you have unstaged changes
M	framework/entity/config/entityengine.xml

If that's the case, you will have to "stash" your changes for the moment:

$ git stash
$ git rebase origin/1447 1447
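
Here is a sketch of the whole fetch / stash / rebase cycle, ending with git stash pop to bring the local change back. It uses two throwaway local repositories, where "upstream" is only a stand-in for the remote repository and the file names are made up.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q upstream                       # stands in for the remote repository
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo v1 > code.txt
echo localderby > config.txt
git add .
git commit -q -m "Initial commit"
cd ..

git clone -q upstream clone
cd clone
git config user.name "Demo User"
git config user.email "demo@example.com"
echo mysql > config.txt                    # local change, not committed

# Somebody else commits to the remote repository
cd ../upstream
echo "v2 from upstream" > code.txt
git commit -q -a -m "Upstream change"
cd ../clone

git fetch -q
git stash -q                               # set the local change aside
git rebase -q origin/master master         # bring local master up to date
git stash pop -q                           # put the local change back
code=$(cat code.txt)
config=$(cat config.txt)
stashes=$(git stash list)
```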

Cherry Picking your Update

If you don't want to update everything, you can cherry pick a particular commit like this:

$ git fetch
$ git cherry-pick <the-commit-hash>
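
A small sketch of cherry picking, using a local branch called other (a made-up name) in place of a fetched remote branch. Only the second of its two commits is applied to master.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

# Two commits on another branch
git checkout -q -b other
echo a > a.txt
git add a.txt
git commit -q -m "Add a"
echo b > b.txt
git add b.txt
git commit -q -m "Add b"
wanted=$(git rev-parse HEAD)               # the hash of the commit we want

# Take only that commit onto master
git checkout -q master
git cherry-pick "$wanted" > /dev/null
```

master now has b.txt but not a.txt.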

Committing Your Changes

To commit your changes, you must first add the changed files to the list of files to commit with git add:

$ git add opentaps/opentaps-common/src/common/org/opentaps/gwt/common/server/lookup/PartyLookupService.java

This will move the file into the "Changes to be committed" list of files when you do your git status.

Note that git add is recursive, so if you

$ git add opentaps/search/src/

It will add all files which are in subdirectories of opentaps/search/src/

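Once you have added your files, git commit will allow you to save your changes, or "commit" them. (The -a flag used here also picks up every other modified file that git already tracks; leave it out to commit only what you added.)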

$ git commit -a
[master 4516382] Change configuration to use mysql

Here's the big difference: After you have committed your change, it is committed to your local clone, not to the master git repository of opentaps. This means that it is still available to you, and to anybody who makes a clone of your git repository, but not to anybody who is cloning the main opentaps repository.

After you commit, you can get a history of your changes with git log:

$ git log framework/entity/config/entityengine.xml 
commit 4516382fb3d17da5a2bdccd1942b01636ea08e4d
Author: si chen <sichen@si-chens-imac.gateway.2wire.net>
Date:   Tue Apr 13 11:32:56 2010 -0700

    Change configuration to use mysql

commit b5bf829383d261f7eb2e4c733568dbcce65868f7
Author: sparksun <sparksun@d3523486-f5fe-0310-a41a-a15a4e76f3c7>
Date:   Thu Jan 7 13:51:57 2010 +0000

    #1315 add useOldAliasMetadataBehavior parameter in mysql jdbc url for avoid hibernate cannot find column view entity
    
    git-svn-id: svn://svn.opentaps.org/opentaps_all/versions/1.4/trunk@14503 d3523486-f5fe-0310-a41a-a15a4e76f3c7

commit 6678504157a4234d051292a336bc623d83107ff2
Author: jwickers <jwickers@d3523486-f5fe-0310-a41a-a15a4e76f3c7>
Date:   Fri Nov 27 04:12:00 2009 +0000

    Reset default entity config to use Derby
    
    git-svn-id: svn://svn.opentaps.org/opentaps_all/versions/1.4/trunk@14071 d3523486-f5fe-0310-a41a-a15a4e76f3c7

Note that unlike subversion, you can see a log of all the changes of the file even before your branch was created.

Un-Committing Your Changes

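Uh oh, I didn't mean to commit that.

If you have only added a file with git add and not yet committed it, you can take it back out of the "Changes to be committed" list with

$ git reset HEAD <file>

or take out everything you have added with

$ git reset HEAD

If you have already committed, you can un-commit the last commit, while keeping its changes in your working directory, with

$ git reset HEAD^

You can do this before pushing them up (see below.)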
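
To see what the different forms of git reset do, here is a sketch in a throwaway repository (file.txt is a made-up name). Resetting a path takes it out of the list of changes to be committed, and resetting to HEAD^ removes the last commit; in both cases the edit itself stays in the working directory.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo v1 > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Stage a change, then take it back out of the index
echo "v2 with a change" > file.txt
git add file.txt
git reset -q HEAD file.txt
staged=$(git diff --cached --name-only)    # nothing is staged any more
modified=$(git diff --name-only)           # but the edit is still there

# Commit by mistake, then remove the commit while keeping the edit
git commit -q -a -m "Committed by mistake"
git reset -q HEAD^
commits=$(git rev-list --count HEAD)
content=$(cat file.txt)
```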

Reverting a Changed File

What if you want to revert your changes to a file and go back to the way it was on your local repository? In git, you would use checkout, like this:

$ git checkout -- <file>

Thanks to http://norbauer.com/notebooks/code/notes/git-revert-reset-a-single-file for this tip!

Your Stash

To see the changes you have stashed away,

$ git stash list

To see which files have been stashed away, use

$ git stash show

This shows a list of files in the stash. You can see the changes actually stashed with

$ git stash show -p

To get it back out of the stash and apply the stashed away changes to your current working branch,

$ git stash pop

(Note it's better to git stash pop than git stash apply: apply leaves the stash in place, so your temporary changes will stay around and accumulate until it's confusing. If that happens, you need to get rid of them (see below) and start over again.)

Sometimes every time you git stash pop, you'll get a conflict. This may have happened because your stash actually contains unresolved conflicts. The only way to fix it is to get rid of that stash.

$ git stash drop

will get rid of the last stash. You can drop a few until the problem goes away.

To get rid of a particular stash, name it as it appears in git stash list (for example stash@{1}):

$ git stash drop <stash>

If you don't need any of your stash, you can clear it all with

$ git stash clear

WARNING: If you stash unmerged conflicts, every time you pop from your stash, git will complain about unmerged changes.
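
The difference between git stash apply and git stash pop can be checked with a short sketch like this one, in a throwaway repository with a made-up file name. After apply the stash is still in the list; after pop it is gone.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo v1 > file.txt
git add file.txt
git commit -q -m "Initial commit"

echo "v2 temporary change" > file.txt
git stash -q                               # working directory is back to v1

git stash apply -q                         # change is back, stash entry stays
after_apply=$(git stash list | wc -l | tr -d ' ')

git checkout -q -- file.txt                # throw the applied change away again
git stash pop -q                           # change is back, stash entry is removed
after_pop=$(git stash list | wc -l | tr -d ' ')
content=$(cat file.txt)
```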

Pushing Your Changes Up

To send your changes back up to the main opentaps git repository, you would need to do a

$ git push git@gitorious.org:opentaps/opentaps.git

To push a branch other than the one you are currently on (see below), use

$ git push git@gitorious.org:opentaps/opentaps.git name-of-branch

To push a branch which is not linked to a remote branch,

$ git push origin local-branch:remote-branch
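
The local-branch:remote-branch form can be tried without a server. In this sketch a bare repository on the local disk (origin.git, a made-up name) plays the part of the remote.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q --bare origin.git              # local stand-in for the remote repository
git clone -q origin.git clone 2>/dev/null  # cloning an empty repository prints a warning
cd clone
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo hi > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b local-branch
echo more >> file.txt
git commit -q -a -m "Work on local branch"

# Push local-branch to a branch with a different name on the remote
git push -q origin local-branch:remote-branch
remote_head=$(git ls-remote origin refs/heads/remote-branch | cut -f1)
local_head=$(git rev-parse local-branch)
```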

If you get this error message:

! [rejected]        master -> master (non-fast-forward)

It means that your version is out of date: you need to do a git fetch and then rebase or merge (see Updating above) to bring your version up to date before pushing up your changes.

If instead you get:

Permission denied (publickey).

you need to upload your public key, usually ~/.ssh/id_rsa.pub, to your profile on gitorious.

Branching

First let's take a look at what branches are available:

$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master

The current branch you're on is the one marked with a "*". The branches which start with "remotes/" are from remote locations, such as gitorious, where you cloned opentaps. You can't really work with those directly; you make local branches from them instead.

To create your own branch,

$ git branch dataimport
$ git branch -a
  dataimport
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master

Now, to start working with your new branch, you need to switch to it. The git checkout command is actually for switching between branches, like svn switch:

$ git checkout dataimport
Switched to branch 'dataimport'
$ git branch -a
* dataimport
  master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master

Here's something else that will seem strange at first about branches in git: If you make changes but do not commit them, and then switch to a different branch, git will give you a diff against the other branch. So, for example:

$ git checkout dataimport
$ vi my-file
# make some changes
$ git diff
# will show the changes you just made against your current branch, or dataimport
$ git checkout master
$ git diff
# will show the same changes, but now against the master branch you just switched to

You would need to commit the changes to one branch or another. Then, when you switch between the branches with git checkout or see a log of your committed changes with git log, you will see the difference if you are on one branch versus another.
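
A sketch of this behaviour in a throwaway repository: the same uncommitted edit shows up in git diff on both branches, because it has not been committed to either of them yet.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo original > my-file
git add my-file
git commit -q -m "Initial commit"

git branch dataimport
git checkout -q dataimport
echo "some changes" >> my-file             # edit but do not commit
diff_on_dataimport=$(git diff)

git checkout -q master                     # the uncommitted edit comes along
diff_on_master=$(git diff)
```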

Deleting a Branch

If you need to get rid of a branch,

$ git branch useless-branch
$ git branch -d useless-branch

Copying Remote Branches

If you have multiple remote branches, so that

$ git branch -a
* pro-master
  remotes/origin/1447
  remotes/origin/1474
  remotes/origin/1487
  remotes/origin/HEAD -> origin/pro-master
  remotes/origin/master
  remotes/origin/pro-master

You should make a local branch which copies a remote branch first before working with it:

$ git checkout -b 1447 origin/1447

This will copy the origin/1447 branch to 1447:

$ git branch -a
* 1447
  pro-master
  remotes/origin/1447
  remotes/origin/1474
  remotes/origin/1487
  remotes/origin/HEAD -> origin/pro-master
  remotes/origin/master
  remotes/origin/pro-master

Otherwise, if you try to checkout a remote branch directly, you will end up on no branch at all (a "detached HEAD"):

$ git checkout origin/1447
$ git branch -a
* (no branch)
  pro-master
  remotes/origin/1447
  ...
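
Both cases can be compared in a sketch like the following, where a local repository called upstream (an assumption for the example) stands in for the remote.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q upstream                       # stands in for the remote repository
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"
git branch 1447
cd ..
git clone -q upstream clone
cd clone

# Checking out the remote branch directly leaves you on no branch
git checkout -q origin/1447
if git symbolic-ref -q HEAD > /dev/null; then detached=no; else detached=yes; fi

# Making a local copy first gives a real branch linked to origin/1447
git checkout -q -b 1447 origin/1447
current=$(git rev-parse --abbrev-ref HEAD)
tracking=$(git rev-parse --abbrev-ref '1447@{upstream}')
```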

Resetting to a Remote Branch

Sometimes the remote branch may actually "fall behind" your local branch, so that when you use git fetch and git rebase, git could still not correctly mirror it. This might happen if you went back on the remote branch, for example to remove some bad commits, and then moved forward again. For example, this e-mail from git tells you that:

This update added new revisions after undoing existing revisions.  That is
to say, the old revision is not a strict subset of the new revision.  This
situation occurs when you --force push a change and generate a repository
containing something like this:

 * -- * -- B -- O -- O -- O (e678fc2bb071e8d57d428a7c8cb25bc2cfbe28e5)
           \
            N -- N -- N (007ba5038c62421d458a02fb66889d271292a48f)

At other times, you may just want to get rid of all your local commits and go back to wherever the remote branch is right now, but git rebase would just reapply your commits on top of the remote branch, and not get rid of them.

In either case, you can solve the problem with a git reset, for example:

$ git reset --hard origin/1447

Before doing this though, you could save your local branch in another branch first:

$ git branch clone-my-1447-branch
$ git reset --hard origin/1447

and then later, if you ever need it back, you can always use git rebase

$ git rebase 1447 clone-my-1447-branch

Reversing Pushed Changes

Sometimes you or somebody else might have pushed changes accidentally to the remote repository. To get rid of them, first get a log or history of the pushed commits:

$ git log

Then, use git reset to go back to a particular commit, identified by its SHA1 hash from the log. For example:

$ git reset --hard 6bb3dc30bc0c8fc36421474cf9376d658ee643aa

Sometimes just the first few letters and numbers of the sequence, such as 6bb3dc would do.

After you've done the reset, you need to push it back to the server. However, if you just push your branch, you will get an error message:

$ git push git@gitorious.org:opentaps/opentaps.git master
To git@gitorious.org:opentaps/opentaps.git
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'git@gitorious.org:opentaps/opentaps.git'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes before pushing again.  See the 'Note about
fast-forwards' section of 'git push --help' for details.

To really push it, you would need to add a + before your branch name:

$ git push git@gitorious.org:opentaps/opentaps.git +master
Total 0 (delta 0), reused 0 (delta 0)
=> Syncing Gitorious... [OK]
To git@gitorious.org:opentaps/opentaps.git
 + 6398f5f...6bb3dc3 master -> master (forced update)
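
The whole sequence can be rehearsed safely against a bare repository on the local disk before trying it on a shared one. In this sketch origin.git is a made-up stand-in for the gitorious repository.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q --bare origin.git              # local stand-in for the remote repository
git clone -q origin.git clone 2>/dev/null  # cloning an empty repository prints a warning
cd clone
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo good > file.txt
git add file.txt
git commit -q -m "Good commit"
echo bad >> file.txt
git commit -q -a -m "Bad commit"
git push -q origin master                  # the bad commit is now on the remote

# Go back one commit locally
git reset -q --hard HEAD^

# A plain push is refused because it is not a fast-forward
if git push -q origin master 2>/dev/null; then plain=accepted; else plain=rejected; fi

# The + in front of the branch name forces the update
git push -q origin +master
remote_head=$(git ls-remote origin refs/heads/master | cut -f1)
local_head=$(git rev-parse HEAD)
```

Remember that a forced push rewrites history for everybody else who has already fetched the old commits.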

Fixing Commits Which Have Not Been Pushed

If you have committed changes but have not pushed them yet, you have several ways of fixing them:

  • git reset --hard origin/xxx : you will lose all the local commits and changes in your current branch
  • git commit --amend : you can use the amend flag to "fix" the previous commit
  • git reset XXX : without --hard it only resets the branch marker, but keeps the changes, so you can recommit the changes in a different way, fix them, etc.
  • git rebase -i XXX : opens an editor with the list of commits from XXX to your current branch state, where you can remove some of those commits, reorder them, merge some together, etc.
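
As an example of the amend option, this sketch fixes both the content and the message of the last commit in a throwaway repository (file names and messages are made up).

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

echo typo > notes.txt
git add notes.txt
git commit -q -m "Add notse"               # wrong content and a misspelled message

echo fixed > notes.txt
git add notes.txt
git commit -q --amend -m "Add notes"       # replaces the previous commit
commits=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
```

The history still has two commits, not three: the amended commit takes the place of the faulty one.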

Merging

You can merge the changes from one branch into another by switching to the target branch first, then doing a git merge. If you want to get the changes from a remote branch, you should git fetch and git rebase first. For example,

$ git fetch
$ git rebase origin/master master
$ git checkout dataimport
$ git merge master

This sequence gets all the updates from the remote git server, then updates the local copy of the master branch from the remote one. Next, I switched over to the dataimport branch. Finally, I merged the changes from the master branch to the dataimport branch.

Once you've done this, you may want to send the merged version of your local branch back to the remote repository. In that case, you must push again:

$ git push origin dataimport

Note that you do not need to commit first, because the merge happens on the local repository, so there are no uncommitted changes against the local repository.

You can always back out of a rebase that has stopped part way through, for example on a conflict, with

$ git rebase --abort

Merge Request

Here's an example of something that would be a lot of work with subversion but almost trivial with git:

$ git pull git://gitorious.org/opentaps/opentaps.git refs/merge-requests/6
$ git push git@gitorious.org:opentaps/opentaps.git

I've just pulled a merge request from our gitorious repository, which was automatically merged into the current branch of my local git repository, and then pushed the result back to gitorious.

Clean Merge Requests

Normally when working on a branch one would sometimes have to merge the changes from the trunk into it in order to keep working on up-to-date code. However, when making a merge request this can result in a lot of those merges, making the request harder to review.

When possible, instead of merging the trunk into the working branch one could rebase the working branch on top of the trunk. For example, if the trunk (from gitorious) is master and the working branch is my-feature:

$ git rebase master my-feature
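
A sketch of the result, in a throwaway repository with made-up file names: after the rebase, the working branch contains the latest trunk commit but no merge commits of its own.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

git checkout -q -b my-feature
echo feature > feature.txt
git add feature.txt
git commit -q -m "Work on my feature"

git checkout -q master
echo trunk > trunk.txt
git add trunk.txt
git commit -q -m "Trunk moves on"

# Replay the feature commits on top of the current trunk
git rebase -q master my-feature
merges=$(git rev-list --merges --count master..my-feature)
ahead=$(git rev-list --count master..my-feature)
```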

Experimenting with Git

When you are first working with git, try this from any branch:

$ git branch tmp

This will copy it to a branch called tmp. You can then do whatever you want, such as

$ git reset XXX --hard

If later you decide that's not what you wanted, you can try to get it back with

$ git reset --hard tmp

Also, you can use

$ git reflog

to get a history of where your branches have pointed, which helps to find commits that a reset has left behind.
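
Here is the experiment as a small sketch in a throwaway repository. HEAD^ plays the part of XXX, and the file name is made up.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "First commit"
echo two >> file.txt
git commit -q -a -m "Second commit"
tip=$(git rev-parse HEAD)

git branch tmp                             # remember where we are
git reset -q --hard HEAD^                  # experiment: throw away the last commit
during=$(git rev-list --count HEAD)

git reset -q --hard tmp                    # change of mind: go back to the saved state
restored=$(git rev-parse HEAD)
git reflog                                 # lists the commits and both resets
```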

Extracting Diff from History

To get historical changes from git, you can use git diff like this:

$ git diff 7cc3f0ce846952196dde68c62595f9 0069e2e46ad591bf4b7cf169d493b9

This will print a diff, which you can redirect to a file and then apply to other files with the patch command.

Note that git diff takes its parameters in the same sequence as svn diff, earlier revision first, but separated by a space instead of a colon:

$ svn diff -r <earlier>:<later>

is the same as

$ git diff <earlier> <later>
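
If in doubt about the direction, it can be checked in a throwaway repository. In this sketch a line added in the later commit shows up with a + when the earlier commit is given first, and with a - when the order is swapped.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "First commit"
earlier=$(git rev-parse HEAD)
echo two >> file.txt
git commit -q -a -m "Second commit"
later=$(git rev-parse HEAD)

# Earlier commit first: the new line is shown as added
added=$(git diff "$earlier" "$later" | grep -c '^+two')
# Later commit first: the same line is shown as removed
removed=$(git diff "$later" "$earlier" | grep -c '^-two')
```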

Working with Additional Remote Repositories

The other neat thing about git is that you can add additional remote repositories. For example, you may have cloned your own repository, but then you want to merge in changes from opentaps, or merge some of your changes back against opentaps. You can do this by adding the gitorious opentaps:

$ git remote add gitorious git://gitorious.org/opentaps/opentaps.git 
$ git fetch --all
....
From git://gitorious.org/opentaps/opentaps
.... 
* [new branch]      master     -> gitorious/master

Now if you check your branches, you will see that there is a new remotes/gitorious/master branch:

$ git branch -a
....
 remotes/gitorious/master

To merge in from gitorious opentaps:

$ git merge gitorious/master

(That's it!)

To merge back to opentaps, you need to create a patch file

$ git diff -p gitorious/master origin/master

or just the specified directories:

$ git diff -p gitorious/master origin/master -- opentaps
$ git diff -p gitorious/master origin/master -- opentaps/crmsfa opentaps/financials

and use the patch file for merging.
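
A sketch of the patch route, using two local branches (master and theirs, made-up names) in place of the two remotes. The patch is applied here with git apply, which reads the same format as the patch command.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "Initial commit"

# The other side has an extra change
git checkout -q -b theirs
echo two >> file.txt
git commit -q -a -m "Their change"
git checkout -q master

# Write the difference to a patch file and apply it to the working directory
git diff -p master theirs > ../changes.patch
git apply ../changes.patch
last_line=$(tail -n 1 file.txt)
pending=$(git diff --name-only)            # the change is applied but not yet committed
```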

Other Notes

File Formats

With more frequent branching and merging, it's important that you keep file formats consistent and avoid Unix/Windows file differences. Make sure your editor is set to use Unix line terminations (use LF instead of CRLF). Before pushing to Git, make sure your commit does not change the line terminations! Check with git diff: if they changed, the diff will show the whole file as changed instead of just the relevant part.

If you are not sure about your editor, you can also try to find the DOS2UNIX.EXE utility to set the proper formatting.

Untracked Files

To get a list of files not tracked by git,

$ git status

You can remove files which are not tracked by git with

$ git clean -x -f -d

It will remove all the untracked files and directories from your current directory, including, because of the -x, the ones that git normally ignores.
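
Because git clean cannot be undone, it may be worth previewing it first. This sketch uses the -n (dry run) option in a throwaway repository with made-up file names, and only then removes the files.

```shell
set -e
cd "$(mktemp -d)"                          # throwaway directory
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"           # placeholder identity for this sketch
git config user.email "demo@example.com"
echo tracked > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"

# Some untracked leftovers
echo scratch > scratch.txt
mkdir build
echo object > build/out.o

preview=$(git clean -n -x -d)              # dry run: only lists what would be removed
echo "$preview"
git clean -q -x -f -d                      # now really remove them
```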

What if Git Hangs?

Sometimes git could get stuck while unpacking objects:

$ git fetch
remote: Counting objects: 137, done.
remote: Compressing objects: 100% (79/79), done.
Unpacking objects: 13% (12/88)

When I ctrl+c stop it and then check, there's a large list of dangling commits:

$ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (210332/210332), done.
dangling commit d70058dada2c537a03d68d79f5c71a2325ae1c8f
dangling commit 5617c81334112457e6bd8d058b252172cc131562
dangling commit 942b10e3066e96b8b2311ae3882763a1e06c8179
dangling commit af3b78d82838f237048a7f1b22032d0e02cd4bbf
dangling commit 7454907e59d3e340b41ba5bb4d7449d033f1ec53
......

This fixed it:

$ git gc --aggressive
