How to Use Git: a Tutorial
This document is a work in progress.
General Concepts
Git is a free, open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency... once you learn how to use it.
Git can be (more than) a little confusing at first when coming from a Subversion background. Let's first try to compare Subversion and Git:
- it is distributed: every checkout is actually a clone of the master repository, which means that there is no higher authority or single point of failure. Effectively, every developer has their own repository with its own branches and its own commits.
- because of that, it can also be described as disconnected: one does not need to access the master repository in order to commit, branch, merge or review the history.
- this makes all operations that would normally need the Subversion server usable without a connection (checking the commit log, blame, branching and merging).
- it is storage efficient: the opentaps repository in Git, including 1.0.0 / 1.4 / 1.0 and all the branches and tags, takes around 900 MB. Typically an SVN checkout of 1.4 alone would be more than 600 MB; multiply this by each branch and by each tag ...
Probably one of the most important points is branching. Because Subversion branches are difficult to merge and expensive to create (one needs to do a whole checkout to get a branch), they are mostly used as dead branches.
In comparison, in Git branches are very easy to create and merge back. In the Git way of thinking, they are "free," and because they are local they are actually used all the time. The reason why a complete Git repository of opentaps is only 900 MB instead of a few gigabytes is that the branches all share a lot of common content.
Git branches are typically used for:
- local only configuration
- unit of work (tickets)
- client and sub projects
Local Configuration Branch
One of the first things that we all do is configure the database in `entityengine.xml`, and in Subversion we have to lock the file or else it ends up accidentally modified.
In Git one could instead use a configuration branch.
This branch can be considered a floating branch, because we want to apply it on top of whatever branch we are currently working on.
*-----[master]----[work]----[configuration]
Unit of Work Branch
Let's say you want to work on a feature. With Git, you can create a branch for your feature, work on it, and then merge it back to the main branch. A typical workflow starts from:
*-----[master]
Create a working branch:
*-----[master][1234-some-ticket]
Make some commits, so your 1234-some-ticket branch advances further:
*-----[master]---[1234-some-ticket]
Then push the branch to `origin` for review, and have the reviewer merge it:
*-----o------[master]
      |
      \---[1234-some-ticket]

master> git merge 1234-some-ticket

*-----o-----------------------(1234 merged)[master]
      |                      |
      \---[1234-some-ticket]/
If we do not want the reviewer to do any conflict resolution and want to do it ourselves, we can merge the master into our 1234-some-ticket branch first:
*-----o------[master]
      |
      \---[1234-some-ticket]

1234-some-ticket> git merge master

*-----o------[master]
      |            |
      \------------(master merged)[1234-some-ticket]
Note that none of this means that the branch has to stay local until it is ready; it can be pushed little by little.
We can then push it back to the master. This only produces one merge commit in the `master` branch per ticket / feature, which makes the history easier for third parties to follow.
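The ticket workflow described above can be tried end to end in a throwaway repository. The following is a minimal sketch rather than the project's actual procedure: the directory, file names and commit messages are made up for illustration, and `--no-ff` is an assumption added here so that the final merge always records a merge commit, even when a fast-forward would be possible.

```shell
set -e
work=$(mktemp -d)                 # throwaway directory for the demo
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master   # name the first branch "master" whatever the local default is
git config user.name "Demo User"          # hypothetical identity, local to this repository
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Create the ticket branch and commit some work on it
git checkout -q -b 1234-some-ticket
echo "feature" >> file.txt
git commit -q -a -m "Work on ticket 1234"

# Meanwhile, master moves on with an unrelated change
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Unrelated change on master"

# Merge master into the ticket branch first, so any conflicts are resolved there
git checkout -q 1234-some-ticket
git merge -q -m "Merge master into 1234-some-ticket" master

# The reviewer then merges the ticket into master as a single merge commit
git checkout -q master
git merge -q --no-ff -m "Merge 1234-some-ticket" 1234-some-ticket

git log --oneline --graph
```

The graph printed at the end should show one merge commit on `master` whose second parent is the tip of `1234-some-ticket`.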
Client and Sub Projects
Typically, clients implementing opentaps or developers creating a project based on it use their own Subversion repository created from a checkout of opentaps. Luckily, the architecture of opentaps allows them to store only a custom hot-deploy component, but sometimes this is still a bit limited and, as we know, they end up touching code outside of their own components. Finally, they sometimes request features / bug fixes that we implement in both our repository and theirs.
Having the ability to clone our repository and branch it easily means that the changes they make are contained in their branch. It also makes it easier for them to receive updates from the main opentaps repository. Finally, by setting up their repository as a new remote (e.g. `client-origin`), you can have many clients in the same local repository (saving some disk space in the process). All that is needed is SSH access, which is much easier to set up than an SVN server.
*--o--o-----------------------------[origin/master]
   |  |                               |      |
   |  |                               |      |
   |  \----------[client-origin1/client]-[merged]
   |                                  |
   |                                  |
   \---------[client-origin2/client]-[merged]
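As a rough illustration of keeping a client repository as an extra remote, the sketch below builds everything locally. The `main` and `client1` directories are hypothetical stand-ins for the real opentaps and client repositories (which would normally be reached over SSH), and the file names are invented.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the main repository
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "core" > core.txt
git add core.txt
git commit -q -m "Initial commit"
cd "$work"

# Stand-in for a client repository: a clone with its own "client" branch
git clone -q main client1
cd client1
git config user.name "Client Dev"
git config user.email "client@example.com"
git checkout -q -b client
echo "custom" > custom.txt
git add custom.txt
git commit -q -m "Client customization"
cd "$work"

# Our working clone: origin is the main repository, the client is a second remote
git clone -q main local
cd local
git remote add client-origin1 "$work/client1"
git fetch -q client-origin1

git branch -r    # lists origin/* and client-origin1/* side by side
```

Both sets of remote branches now live in one local repository and share the objects they have in common.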
Using Git
Now let's actually try to use git.
Getting the Code
First, let's take a look at the basic process of getting the code from git and making some local changes. Getting the code is a lot like subversion, at least on the surface. You would use the git clone command to pull the code from the remote repository to your local computer:
$ git clone git://gitorious.org/opentaps/opentaps.git git-opentaps
Initialized empty Git repository in /Users/sichen/Documents/workspace/git-opentaps/.git/
remote: Counting objects: 78173, done.
remote: Compressing objects: 100% (14699/14699), done.
remote: Total 78173 (delta 57124), reused 77443 (delta 56698)
Receiving objects: 100% (78173/78173), 219.74 MiB | 303 KiB/s, done.
Resolving deltas: 100% (57124/57124), done.
Checking out files: 100% (15107/15107), done.
Now, let's imagine that you modified a file, such as framework/entity/config/entityengine.xml, and changed it to use mysql instead of the embedded Derby database. You can use git status to get a list of the files that have been modified:
$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   framework/entity/config/entityengine.xml
#
no changes added to commit (use "git add" and/or "git commit -a")
git diff will show you the actual modifications, or differences:
$ git diff
diff --git a/framework/entity/config/entityengine.xml b/framework/entity/config/entityengine.xml
index 652c6fc..d6d341e 100644
--- a/framework/entity/config/entityengine.xml
+++ b/framework/entity/config/entityengine.xml
@@ -51,7 +51,7 @@ access. For a detailed description see the core/docs/entityconfig.html file.
     <connection-factory class="org.ofbiz.entity.connection.DBCPConnectionFactory"/>
     <delegator name="default" entity-model-reader="main" entity-group-reader="main" entity-eca-reader="main" distributed-cache-clear-enabled="false">
-        <group-map group-name="org.ofbiz" datasource-name="localderby"/>
+        <group-map group-name="org.ofbiz" datasource-name="localmysql"/>
         <group-map group-name="org.ofbiz.olap" datasource-name="localderbyolap"/>
         <group-map group-name="org.opentaps.analytics" datasource-name="analytics"/>
         <group-map group-name="org.opentaps.testing" datasource-name="testing"/>
etc. etc.
Updating
To bring your clone up to date with the remote main opentaps repository,
$ git fetch
This will get updates from all the branches on the remote repository and pull them to your local clone. If you have created local branches from your remote branches, then you would need to update your local branches from the remote branches that were updated (see below):
$ git rebase origin/1447 1447
If however your local branches have changes, git will complain:
cannot rebase: you have unstaged changes
M       framework/entity/config/entityengine.xml
If that's the case, you will have to "stash" your changes for the moment:
$ git stash
$ git rebase origin/1447 1447
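Once the rebase has finished, the stashed changes still have to be brought back, which `git stash pop` does. The sketch below runs the whole stash, rebase and restore cycle against a local stand-in for the remote; the `upstream` directory and the `app.txt` / `config.txt` files are hypothetical, and only the branch name 1447 comes from the example above.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Local stand-in for the remote repository, with a branch named 1447
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/1447
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > app.txt
echo "derby" > config.txt
git add app.txt config.txt
git commit -q -m "Initial commit"
cd "$work"

# Our clone; it starts on a local 1447 branch that tracks origin/1447
git clone -q upstream local
cd local
git config user.name "Demo User"
git config user.email "demo@example.com"

# Someone else commits to the remote branch
cd "$work/upstream"
echo "v2" > app.txt
git commit -q -a -m "Upstream change"
cd "$work/local"

# We have an uncommitted local change (our database configuration)
echo "mysql" > config.txt

git fetch -q
git stash                       # set the local change aside
git rebase origin/1447 1447     # bring the local branch up to date
git stash pop                   # put the local change back on top

git status --short              # config.txt shows as modified again
```

In this sketch the stashed change and the upstream change touch different files, so `git stash pop` applies cleanly; if they touched the same lines, Git would report a conflict to resolve by hand.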
Committing Your Changes
git commit will allow you to save your changes, or "commit" them.
$ git commit -a
[master 4516382] Change configuration to use mysql
Here's the big difference: After you have committed your change, it is committed to your local clone, not to the master git repository of opentaps. This means that it is still available to you, and to anybody who makes a clone of your git repository, but not to anybody who is cloning the main opentaps repository.
After you commit, you can get a history of your changes with git log:
$ git log framework/entity/config/entityengine.xml
commit 4516382fb3d17da5a2bdccd1942b01636ea08e4d
Author: si chen <sichen@si-chens-imac.gateway.2wire.net>
Date:   Tue Apr 13 11:32:56 2010 -0700

    Change configuration to use mysql

commit b5bf829383d261f7eb2e4c733568dbcce65868f7
Author: sparksun <sparksun@d3523486-f5fe-0310-a41a-a15a4e76f3c7>
Date:   Thu Jan 7 13:51:57 2010 +0000

    #1315 add useOldAliasMetadataBehavior parameter in mysql jdbc url for avoid hibernate cannot find column view entity

    git-svn-id: svn://svn.opentaps.org/opentaps_all/versions/1.4/trunk@14503 d3523486-f5fe-0310-a41a-a15a4e76f3c7

commit 6678504157a4234d051292a336bc623d83107ff2
Author: jwickers <jwickers@d3523486-f5fe-0310-a41a-a15a4e76f3c7>
Date:   Fri Nov 27 04:12:00 2009 +0000

    Reset default entity config to use Derby

    git-svn-id: svn://svn.opentaps.org/opentaps_all/versions/1.4/trunk@14071 d3523486-f5fe-0310-a41a-a15a4e76f3c7
Note that unlike subversion, you can see a log of all the changes of the file even before your branch was created.
Pushing Your Changes Up
To send your changes back up to the main opentaps git repository, you would need to do a
$ git push git@gitorious.org:opentaps/opentaps.git
To push a branch other than the one you are currently on (see below), use
$ git push git@gitorious.org:opentaps/opentaps.git name-of-branch
If you get this error message:
! [rejected] master -> master (non-fast-forward)
It means that your version is out of date, and you need to do a git fetch and update your local branch first before pushing up your changes. If instead you get this error message:
Permission denied (publickey).
It means that you need to upload your public key, usually ~/.ssh/id_rsa.pub, to your profile on gitorious.
Branching
First let's take a look at what branches are available:
$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
The current branch you're on is the one marked with a "*". The branches which start with "remotes/" are from remote locations, such as gitorious, where you cloned opentaps. You cannot commit to those directly; you work on a local branch instead.
To create your own branch,
$ git branch dataimport
$ git branch -a
  dataimport
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
Now, to start working with your new branch, you need to switch to it. The git checkout command is actually for switching between branches, like svn switch:
$ git checkout dataimport
Switched to branch 'dataimport'
$ git branch -a
* dataimport
  master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
Here's something else that will seem strange at first about branches in git: If you make changes but do not commit them, and then switch to a different branch, git will give you a diff against the other branch. So, for example:
$ git checkout dataimport
$ vi my-file           # make some changes
$ git diff             # will show the changes you just made against your current branch, or dataimport
$ git checkout master
$ git diff             # will show the same changes, but now against the master branch you just switched to
You would need to commit the changes to one branch or another. Then, when you switch between the branches with git checkout or see a log of your committed changes with git log, you will see the difference if you are on one branch versus another.
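The behavior described above can be reproduced in a scratch repository. This is only an illustrative sketch: `my-file` and `dataimport` follow the example above, while the directory, file contents and commit messages are invented.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "original" > my-file
git add my-file
git commit -q -m "Initial commit"
git branch dataimport

# Make an uncommitted change while on dataimport
git checkout -q dataimport
echo "local edit" >> my-file
git diff --stat             # shows my-file as modified

# Switching branches carries the uncommitted change along
git checkout -q master
git diff --stat             # still shows the same modification

# Commit it on the branch where it belongs; now the branches really differ
git checkout -q dataimport
git commit -q -a -m "Edit my-file on dataimport"
git checkout -q master
git diff --stat master dataimport
```

After the commit, `master` no longer contains the edit and the working directory is clean on both branches.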
Deleting a Branch
If you need to get rid of a branch,
$ git branch useless-branch
$ git branch -d useless-branch
Copying Remote Branches
If you have multiple remote branches, so that
$ git branch -a
* pro-master
  remotes/origin/1447
  remotes/origin/1474
  remotes/origin/1487
  remotes/origin/HEAD -> origin/pro-master
  remotes/origin/master
  remotes/origin/pro-master
You should make a local branch which copies a remote branch first before working with it:
$ git checkout -b 1447 origin/1447
This will copy the origin/1447 branch to 1447:
$ git branch -a
* 1447
  pro-master
  remotes/origin/1447
  remotes/origin/1474
  remotes/origin/1487
  remotes/origin/HEAD -> origin/pro-master
  remotes/origin/master
  remotes/origin/pro-master
Otherwise, if you try to check out a remote branch directly, git will not put you on any branch at all (a "detached HEAD"), and new commits would not belong to a branch:
$ git checkout origin/1447
$ git branch -a
* (no branch)
  pro-master
  remotes/origin/1447
  ...
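To see the difference between the two kinds of checkout, the sketch below uses a local stand-in for the remote. The `upstream` directory and its contents are hypothetical; the branch names are borrowed from the listing above.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Local stand-in for the remote, with several branches
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/pro-master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch 1447
git branch 1474
cd "$work"

git clone -q upstream local
cd local

# Checking out the remote branch directly leaves HEAD detached
git checkout -q origin/1474
git symbolic-ref -q HEAD || echo "HEAD is detached: not on any branch"

# Creating a local branch from the remote one gives a real branch to work on
git checkout -q -b 1447 origin/1447
git rev-parse --abbrev-ref HEAD      # name of the current branch
git config branch.1447.remote        # remote the new branch tracks
```

Because the starting point is a remote branch, the new local branch is set up to track it with Git's default settings.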
Resetting to a Remote Branch
Sometimes the remote branch may actually "fall behind" your local branch, so that when you use git fetch and git rebase, git could still not correctly mirror it. This might happen if you went back on the remote branch, for example to remove some bad commits, and then moved forward again. For example, this e-mail from git tells you that:
This update added new revisions after undoing existing revisions. That is to say, the old revision is not a strict subset of the new revision. This situation occurs when you --force push a change and generate a repository containing something like this:

 * -- * -- B -- O -- O -- O (e678fc2bb071e8d57d428a7c8cb25bc2cfbe28e5)
            \
             N -- N -- N (007ba5038c62421d458a02fb66889d271292a48f)
At other times, you may just want to get rid of all your local commits and go back to wherever the remote branch is right now, but git rebase would just replay your local commits on top of the remote branch, and not get rid of them.
In either case, you can solve the problem with a git reset, for example:
$ git reset --hard origin/1447
Before doing this though, you could save your local branch in another branch first:
$ git branch clone-my-1447-branch
$ git reset --hard origin/1447
and then later, if you ever need it back, you can always use git rebase:
$ git rebase 1447 clone-my-1447-branch
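The save-then-reset step can be checked in a scratch setup. In this sketch the `upstream` directory stands in for the remote and the "experiment" commit stands in for local work you want out of the way; both are made up for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Local stand-in for the remote, with a branch named 1447
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/1447
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
cd "$work"

git clone -q upstream local
cd local
git config user.name "Demo User"
git config user.email "demo@example.com"

# A local commit that we later decide to drop from 1447
echo "experiment" >> app.txt
git commit -q -a -m "Local experiment"

# Keep a copy of the current state, then reset 1447 to the remote branch
git branch clone-my-1447-branch
git reset -q --hard origin/1447

# The dropped commit is still reachable from the saved branch
git log --oneline 1447..clone-my-1447-branch
```

The final log should list the "Local experiment" commit, which now exists only on `clone-my-1447-branch`.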
Reversing Pushed Changes
Sometimes you or somebody else might have pushed changes accidentally to the remote repository. To get rid of them, first get a log or history of the pushed commits:
$ git log
Then, use git reset to go back to a particular commit, identified by its SHA-1 hash from the log. For example:
$ git reset --hard 6bb3dc30bc0c8fc36421474cf9376d658ee643aa
Sometimes just the first few letters and numbers of the hash, such as 6bb3dc, would do.
After you've done the reset, you need to push it back to the server. However, if you just push your branch, you will get an error message:
$ git push git@gitorious.org:opentaps/opentaps.git master
To git@gitorious.org:opentaps/opentaps.git
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'git@gitorious.org:opentaps/opentaps.git'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes before pushing again. See the 'Note about fast-forwards' section of 'git push --help' for details.
To really push it, you would need to add a + before your branch name:
$ git push git@gitorious.org:opentaps/opentaps.git +master
Total 0 (delta 0), reused 0 (delta 0)
=> Syncing Gitorious... [OK]
To git@gitorious.org:opentaps/opentaps.git
 + 6398f5f...6bb3dc3 master -> master (forced update)
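The reset and forced push can be rehearsed safely against a local bare repository before trying it on a shared server. In the sketch below, `server.git` is a hypothetical stand-in for the gitorious repository, and the "good" and "bad" commits are invented.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Local bare repository standing in for the server
git init -q --bare server.git

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/server.git"

echo "good" > file.txt
git add file.txt
git commit -q -m "Good commit"
echo "bad" >> file.txt
git commit -q -a -m "Bad commit"
git push -q origin master          # the bad commit is now on the server

# Go back to the commit before the bad one
git reset -q --hard HEAD~1

# A plain push is refused because it would discard history on the server
if git push -q origin master 2>/dev/null; then
    echo "unexpected: plain push was accepted"
else
    echo "plain push rejected (non-fast-forward)"
fi

# The leading + forces the update of that one branch
git push -q origin +master
git ls-remote origin refs/heads/master
```

Forcing a push rewrites history for everyone who has already fetched the old commits, so it is worth agreeing on it with the team first. Git also offers `git push --force-with-lease`, which refuses to overwrite the remote branch if it has moved since your last fetch.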
Merging
You can merge the changes from one branch into another by switching to the target branch first, then doing a git merge. If you want to get the changes from a remote branch, you should git fetch and git rebase first. For example,
$ git fetch
$ git rebase origin/master master
$ git checkout dataimport
$ git merge master
This sequence gets all the updates from the remote git server, then updates the local copy of the master branch from the remote one. Next, it switches over to the dataimport branch. Finally, it merges the changes from the master branch into the dataimport branch.
Once you've done this, you may want to send the merged version of your local branch back to the remote repository. In that case, you must push again:
$ git push git@gitorious.org:opentaps/opentaps.git dataimport
Note that you do not need to commit first, because the merge happens on the local repository, so there are no uncommitted changes against the local repository.