How to Use Git: a Tutorial
This document is a work in progress.
General Concepts
Git is a free, open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency... once you learn how to use it.
Git can be (more than) a little confusing at first when coming from a Subversion background. In this tutorial, we'll write the Subversion terms in italic.
The most important things to remember about git are:
- it is distributed: every checkout is actually a clone of the master repository, which means that there is no higher authority or single point of failure. Effectively, every developer has their own repository with its own branches and its own commits.
- because of that, it can also be described as disconnected: one does not need to access the master repository in order to commit, branch, merge or review the history.
- this makes all operations that would normally require the Subversion server usable locally (checking the commit log, blame, branching and merging).
- it is storage efficient: the opentaps repository in Git, including 1.0.0 / 1.4 / 1.0 and all the branches and tags, takes around 900 MB. An SVN checkout of 1.4 alone would typically be more than 600 MB; multiply this by each branch and each tag ...
Probably one of the most important points is branching. Because Subversion branches are difficult to merge and expensive to create (one needs to do a whole checkout to get a branch), they are mostly used as dead branches.
In comparison, Git branches are very easy to create and merge back. In the git way of thinking, they are "free," and because they are local they are actually used all the time. The reason a complete Git repository of opentaps is only 900 MB instead of a few gigabytes is that the branches share a lot of common content.
Git branches are typically used for:
- local only configuration
- unit of work (tickets)
- client and sub projects
Local Configuration Branch
One of the first things that we all do is configure the database in `entityengine.xml`, and in Subversion we have to lock the file, otherwise it ends up accidentally modified.
In Git one could instead use a configuration branch.
This branch can be considered a floating branch, because we want to apply it on top of whatever branch we are currently working on.
*-----[master]----[work]----[configuration]
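One possible way to keep such a branch floating is sketched below. It assumes `git rebase` is used to move the configuration commit back on top of the work branch each time the work branch advances; the tutorial does not say which command it has in mind. The throwaway repository, the file contents and the commit messages are all made up for illustration.

```shell
set -e
# Throwaway repository standing in for a real opentaps clone (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "datasource=localderby" > entityengine.xml
git add entityengine.xml
git commit -q -m "Initial import"

# The work branch carries the changes that will eventually be shared.
git checkout -q -b work
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Start feature"

# The configuration branch sits on top of it and stays local.
git checkout -q -b configuration
echo "datasource=localmysql" > entityengine.xml
git commit -q -a -m "Local database settings"

# The work branch moves on...
git checkout -q work
echo "more" >> feature.txt
git commit -q -a -m "Continue feature"

# ...so float the configuration commit back on top of the new tip (assumed approach).
git checkout -q configuration
git rebase -q work
```

After the rebase, `configuration` is again exactly one commit ahead of `work`, so the local database settings never end up in the history of the branch that gets shared.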
Unit of work Branch
Let's say you want to work on a feature. With git, you can create a branch for your feature, work on it, and then merge it back to the main branch. A typical workflow starts from:
*-----[master]
Create a working branch:
*-----[master][1234-some-ticket]
Make some commits, so your 1234-some-ticket branch advances further:
*-----[master]---[1234-some-ticket]
Then push the branch to `origin` for review, and have the reviewer merge it:
*-----o------[master]
      |                       
       \---[1234-some-ticket]
master> git merge 1234-some-ticket
*-----o-----------------------(1234 merged)[master]
      |                       |
       \---[1234-some-ticket]/
If we do not want the reviewer to do any conflict resolution and want to do it ourselves, we can merge the master into our 1234-some-ticket branch first:
*-----o------[master]
      |                       
       \---[1234-some-ticket]
1234-some-ticket> git merge master
*-----o------[master]
      |             |
       \------------(master merged)[1234-some-ticket]
Note that none of this means the branch has to stay local until it is ready; it can be pushed little by little.
We can then push it back to the master. This produces only one merge commit in the `master` branch per ticket or feature, which makes the history easier for third parties to follow.
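The whole cycle can be tried in a scratch repository. The sketch below is only an illustration: the file names and commit messages are invented, and the `--no-ff` flag on the final merge is an assumption made here so that `master` records a merge commit even though a fast-forward would be possible once `master` has been merged into the ticket branch.

```shell
set -e
# Scratch repository; file names and messages are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# Create the working branch and commit on it.
git checkout -q -b 1234-some-ticket
echo "fix" > ticket.txt
git add ticket.txt
git commit -q -m "Work on ticket 1234"

# Meanwhile, master advances independently.
git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Unrelated change on master"

# Do the conflict resolution ourselves: merge master into the ticket branch.
git checkout -q 1234-some-ticket
git merge -q -m "Merge master into 1234-some-ticket" master

# The reviewer merges the ticket branch; --no-ff (an assumption here)
# forces a merge commit on master instead of a fast-forward.
git checkout -q master
git merge -q --no-ff -m "Merge ticket 1234" 1234-some-ticket
```

Running `git log --graph --oneline master` afterwards shows the ticket branch as a side line that joins `master` at a single merge commit.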
Client and Sub Projects
Typically, clients implementing opentaps or developers creating a project based on it use their own Subversion repository created from a checkout of opentaps. Luckily, the architecture of opentaps allows them to store only a custom hot-deploy component, but sometimes this is still a bit limiting and, as we know, they end up touching code outside of their own components. Finally, they sometimes request features or bug fixes that we implement in both our repository and theirs.
Having the ability to clone our repository and branch it easily means that the changes they make are contained in their branch. It also makes it easier for them to receive updates from the main opentaps repository. Finally, by setting up their repository as a new remote (e.g. `client-origin`), you can have many clients in the same local repository (saving some disk space in the process). All that is needed is SSH access, which is much easier to set up than an SVN server.
*--o--o-----------------------------[origin/master]
   |  |                                 |     |
   |  |                                 |     |  
   |  \----------[client-origin1/client]-[merged]
   |                                    |
   |                                    |
    \---------[client-origin2/client]-[merged]
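As a rough sketch of the remote setup, the commands below add a client repository as a second remote named `client-origin` and fetch its `client` branch. Local directories stand in for what would be SSH URLs in practice; the directory names, file names and commit messages are hypothetical.

```shell
set -e
base=$(mktemp -d)
cd "$base"

# Stand-in for the main repository.
git init -q main
cd main
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "core" > core.txt
git add core.txt
git commit -q -m "Core code"
cd ..

# Stand-in for a client's repository: a clone with its own client branch.
git clone -q main client-repo
cd client-repo
git config user.name "Client Dev"
git config user.email "client@example.com"
git checkout -q -b client
echo "custom" > custom.txt
git add custom.txt
git commit -q -m "Client customization"
cd ..

# Our local repository tracks both the main repository and the client.
git clone -q main local
cd local
git remote add client-origin "$base/client-repo"
git fetch -q client-origin

# List remote branches: origin/master and client-origin/client are both present.
git branch -r

# Check out a local branch that follows the client's branch.
git checkout -q -b client client-origin/client
```

Each additional client would get its own remote name in the same way, and all of them share the objects they have in common with `origin`.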
Using Git
Now let's actually try to use git.
Getting the Code
First, let's take a look at the basic process of getting the code from git and making some local changes. Getting the code is a lot like subversion, at least on the surface. You would use the git clone command to pull the code from the remote repository to your local computer:
$ git clone git://gitorious.org/opentaps/opentaps.git git-opentaps
Initialized empty Git repository in /Users/sichen/Documents/workspace/git-opentaps/.git/
remote: Counting objects: 78173, done.
remote: Compressing objects: 100% (14699/14699), done.
remote: Total 78173 (delta 57124), reused 77443 (delta 56698)
Receiving objects: 100% (78173/78173), 219.74 MiB | 303 KiB/s, done.
Resolving deltas: 100% (57124/57124), done.
Checking out files: 100% (15107/15107), done.
Now, let's imagine that you modified a file, such as framework/entity/config/entityengine.xml, and changed it to use mysql instead of the embedded Derby database. You can use git status to get a list of the files that have been modified:
$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   framework/entity/config/entityengine.xml
#
no changes added to commit (use "git add" and/or "git commit -a")
git diff will show you the actual modifications, or differences:
$ git diff
diff --git a/framework/entity/config/entityengine.xml b/framework/entity/config/entityengine.xml
index 652c6fc..d6d341e 100644
--- a/framework/entity/config/entityengine.xml
+++ b/framework/entity/config/entityengine.xml
@@ -51,7 +51,7 @@ access. For a detailed description see the core/docs/entityconfig.html file.
     <connection-factory class="org.ofbiz.entity.connection.DBCPConnectionFactory"/>
 
     <delegator name="default" entity-model-reader="main" entity-group-reader="main" entity-eca-reader="main" distributed-cache-clear-enabled="false">
-        <group-map group-name="org.ofbiz" datasource-name="localderby"/>
+        <group-map group-name="org.ofbiz" datasource-name="localmysql"/>
         <group-map group-name="org.ofbiz.olap" datasource-name="localderbyolap"/>
         <group-map group-name="org.opentaps.analytics" datasource-name="analytics"/>
         <group-map group-name="org.opentaps.testing" datasource-name="testing"/>
etc. etc.
Committing Your Changes
git commit will allow you to save your changes, or "commit" them.
$ git commit -a
[master 4516382] Change configuration to use mysql
Here's the big difference: After you have committed your change, it is committed to your local clone, not to the master git repository of opentaps. This means that it is still available to you, and to anybody who makes a clone of your git repository, but not to anybody who is cloning the main opentaps repository.
To send your changes back to the main opentaps git repository, you would need to run:
$ git push git@gitorious.org:opentaps/opentaps.git
To push a branch other than the one you are currently on (see below), use:
$ git push git@gitorious.org:opentaps/opentaps.git name-of-branch
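The difference between committing and pushing can be observed with a local stand-in for the main repository. In this sketch a bare repository in a temporary directory plays the role of the gitorious repository; the paths and the commit content are hypothetical, and the remote is addressed by the name `origin` that `git clone` sets up rather than by URL.

```shell
set -e
base=$(mktemp -d)
cd "$base"

# A bare repository stands in for the main opentaps repository.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Clone it (git warns that the repository is empty; that is expected here).
git clone -q origin.git work-clone
cd work-clone
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "datasource=localmysql" > entityengine.xml
git add entityengine.xml
git commit -q -m "Change configuration to use mysql"

# The commit exists only in the clone: the remote still has no branches.
refs_before_push=$(git ls-remote origin | wc -l)

# Pushing is what publishes the commit.
git push -q origin master
git ls-remote origin
```

Before the push, `git ls-remote origin` lists nothing; afterwards it lists `refs/heads/master` at the same commit id as the local `HEAD`.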
Branching
First let's take a look at what branches are available:
$ git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
The current branch you're on is the one marked with a "*". The branches which start with "remotes/" are from remote locations, such as gitorious, where you cloned opentaps. You can't work on those directly.
To create your own branch:
$ git branch dataimport
$ git branch -a
  dataimport
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
Now, to start working with your new branch, you need to switch to it. The git checkout command is actually for switching between branches, like svn switch:
$ git checkout dataimport
Switched to branch 'dataimport'
$ git branch -a
* dataimport
  master
  remotes/origin/HEAD -> origin/master
  remotes/origin/master
Here's something else that will seem strange at first about branches in git: if you make changes but do not commit them, and then switch to a different branch, the uncommitted changes come with you, and git diff shows them against the branch you switched to. So, for example:
$ git checkout dataimport
$ vi my-file          # make some changes
$ git diff            # will show the changes you just made against your current branch, or dataimport
$ git checkout master
$ git diff            # will show the same changes, but now against the master branch you just switched to
You would need to commit the changes to one branch or another. Then, when you switch between the branches with git checkout or see a log of your committed changes with git log, you will see the difference if you are on one branch versus another.
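This behaviour can be reproduced in a scratch repository. In the sketch below, `my-file` and `dataimport` follow the example above, while the file contents and commit messages are invented; `vi` is replaced by `echo` so the script can run unattended.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "one" > my-file
git add my-file
git commit -q -m "Add my-file"
git branch dataimport

# Make a change on dataimport but do not commit it.
git checkout -q dataimport
echo "two" >> my-file
diff_on_dataimport=$(git diff)

# Switch branches: the uncommitted change comes along.
git checkout -q master
diff_on_master=$(git diff)

# Commit the change on dataimport; only then does it belong to a branch.
git checkout -q dataimport
git commit -q -a -m "Add second line"

# Back on master, the file no longer contains the change.
git checkout -q master
cat my-file
```

Both diffs are identical here because the two branches still point at the same commit. If `my-file` differed between the branches, git would refuse the switch rather than overwrite the uncommitted change.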
Deleting a Branch
If you need to get rid of a branch:
$ git branch useless-branch
$ git branch -d useless-branch
Updating
$ git fetch
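As a minimal sketch of what this does: `git fetch` downloads new commits and moves the remote-tracking branch `origin/master`, but it leaves your local `master` where it was. Bringing the local branch up to date is a separate step, shown here as a merge of `origin/master`. The two repositories below are local stand-ins with hypothetical names and contents.

```shell
set -e
base=$(mktemp -d)
cd "$base"

# Stand-in for the remote repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "First version"
cd ..

# Our clone.
git clone -q upstream local

# Someone adds a commit upstream after we cloned.
cd upstream
echo "v2" > file.txt
git commit -q -a -m "Second version"
cd ..

cd local
master_before=$(git rev-parse master)

# Fetch: origin/master moves, the local master does not.
git fetch -q
master_after_fetch=$(git rev-parse master)

# Merge the fetched commits into the local branch (a fast-forward here).
git merge -q origin/master
cat file.txt
```

The gap between fetching and merging gives you a chance to look at the incoming changes first, for example with `git log master..origin/master`.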
