Report on git
Tuesday, June 19th, 2007
After seeing this video of Linus Torvalds waxing arrogant about git, the source code management system he wrote for Linux kernel development, I decided I should try it out.
The benefits of git over the CVS/Subversion model amount to this: (1) git is distributed, and (2) git is fast. (2) is partly a result of (1), partly a result of Linus being a wizard programmer.
For git, “distributed” means that every working directory is a full-fledged repository in its own right. That is, there is no single repository that all developers check out of and commit into. Commits are done locally, into your own copy of the repository. Sharing is done only explicitly, so nobody can screw up your personal copy of the code with their code, and vice versa. Distribution also gets you a bunch of speed, because commits are simply written into your local filesystem. Beyond that, Linus used a lot of fancy impressive algorithms to make things generally fast.
I used Fink to install git on my OS X machines (it’s in the unstable tree), and found that it works pretty much as advertised. The tutorial at kernel.org is quite nice; below is a quick summary of the most important things I gleaned from it.
Installation
On Mac OS X, you need to install Fink if you don’t have it. Then add unstable/main and unstable/crypto to /sw/etc/fink.conf, and then do
fink selfupdate; fink index; fink scanpackages
to activate the unstable trees.
Then do
fink install git
and type your admin password when it’s asked for.
On Linux, it’s probably just a matter of installing it from whatever package manager your distro has.
User Configuration
For the sake of the logs, you should first configure git to know your name and email address:
git config --global user.name "Ed Baskerville"
git config --global user.email notmyemailaddress@edbaskerville.com
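If you would rather not change your global settings, the same two keys can be set for a single repository by leaving out --global. The following is a minimal sketch under that assumption; the scratch directory, the name "Example Author" and the example.com address are placeholders, not anything from my own setup.

```shell
# Scratch repository so the sketch does not touch any real project.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Without --global, the setting is stored in this repository's .git/config.
# "Example Author" and the address below are placeholders.
git config user.name "Example Author"
git config user.email author@example.com

# Read the values back to confirm what git will record in the logs.
git config --get user.name
git config --get user.email
```

A per-repository value takes precedence over the global one inside that repository, which is handy if you commit to different projects under different addresses.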
Creating a Repository
To make a new repository from scratch and start working on it, you create the working directory and cd into it, create the repository inside, and then check out the empty main branch:
mkdir myproj
cd myproj
git init
git checkout
Notice I didn’t specify any arguments to init or checkout. Init assumes you want to create a repository called .git inside the current directory, and checkout assumes you want to check out code from .git into the current directory to begin working. So the standard directory structure is this:
myproj/ [the working directory]
    .git/ [the repository directory]
        ...
        [the git repository's internal state]
        ...
    ...
    [working source files]
    ...
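To see that layout for yourself, here is a small sketch that builds myproj in a throwaway location and lists what git created. The temporary directory (the `work` variable) is my own addition so nothing is left behind in a real project; the exact contents of .git vary between git versions, so I have not listed them here.

```shell
# Scratch location; "work" is a hypothetical variable name for this sketch.
work="$(mktemp -d)"
mkdir "$work/myproj"
cd "$work/myproj"

# Create the repository inside the working directory.
git init --quiet

# The working directory is empty apart from the hidden .git directory.
ls -a

# The repository's internal state lives under .git.
ls .git
```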
Cloning a Repository
If there’s an existing repository that you want a copy of, you don’t check out a working copy as with CVS/Subversion. Rather, you clone it, making a copy of the complete repository for yourself. This works from a local file, via ssh, via http, or through git-daemon, which needs to be installed on the “server” machine. Examples:
git clone /local/path
git clone ssh://server.name/remote/path
git clone http://www.kernel.org/pub/scm/git/git.git
git clone git://git.sv.gnu.org/lilypond.git
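The local-path form is the easiest one to try without a server. The sketch below, with made-up directory names (`original`, `copy`) and a placeholder author identity set through git's environment variables, creates a one-commit repository, clones it, and checks that the clone carries the same history.

```shell
# Placeholder identity so the commit works even on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Scratch area holding a source repository with a single commit.
work="$(mktemp -d)"
git init --quiet "$work/original"
echo "hello" > "$work/original/README"
git -C "$work/original" add README
git -C "$work/original" commit --quiet -m "First commit"

# Clone by local path; the copy is a complete repository, not just a checkout.
git clone --quiet "$work/original" "$work/copy"

# The clone has the full log and remembers where it came from as "origin".
git -C "$work/copy" log --oneline
git -C "$work/copy" remote -v
```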
Making Changes
Making and committing changes is pretty much just like CVS or Subversion: you edit or add files in your working directory and then commit them, except that your commits go into the local .git repository directory.
So you could create a file called myprogram.c:
echo "I bet this won't compile" > myprogram.c
git add myprogram.c
git commit
One catch is that you need to do a git add before a commit for all files that have changed, not just files that are new. You can get around this with the -a switch:
git commit -a
will first add any files that have changed to the commit (but not new files that have been created—they must be added explicitly) and then commit.
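The difference between changed files and new files is easy to check in a scratch repository. This is only a sketch: the file untracked.txt, the commit messages and the author identity are placeholders I made up for the demonstration.

```shell
# Placeholder identity so the commits work even on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# First commit: a new file has to be added explicitly.
echo "int main(void) { return 0; }" > myprogram.c
git add myprogram.c
git commit --quiet -m "Add myprogram.c"

# Change the tracked file and create a brand-new one.
echo "/* a change to a tracked file */" >> myprogram.c
echo "notes" > untracked.txt

# -a picks up the change to myprogram.c but ignores untracked.txt.
git commit --quiet -a -m "Update myprogram.c"

# Prints "?? untracked.txt": the new file is still waiting for git add.
git status --short
```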
Pulling and Pushing
Source code management isn’t much use unless you can share your work. If somebody else has changes you want to merge into your copy of the repository, use git pull:
git pull /path/to/repository
If you’re operating on a CVS/Subversion-like model with a central master repository, push can be used to put your changes into that repository:
git push ssh://server.name/path/to/repository
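The central-repository model can be rehearsed entirely on one machine. The sketch below is one way to set it up, not the only one: the developer names `alice` and `bob`, the `central.git` directory and the author identity are all hypothetical. It uses a bare repository (one with no working directory) as the shared master, since current versions of git by default refuse a push into a branch that is checked out on the receiving side.

```shell
# Placeholder identity so the commits work even on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work="$(mktemp -d)"

# Alice starts the project with one commit.
git init --quiet "$work/alice"
echo "first" > "$work/alice/notes.txt"
git -C "$work/alice" add notes.txt
git -C "$work/alice" commit --quiet -m "First commit"

# A bare clone plays the role of the central master repository.
git clone --quiet --bare "$work/alice" "$work/central.git"
git -C "$work/alice" remote add origin "$work/central.git"

# Bob gets his own complete copy from the central repository.
git clone --quiet "$work/central.git" "$work/bob"

# Alice commits again and pushes her current branch to the central repository.
echo "second" >> "$work/alice/notes.txt"
git -C "$work/alice" commit --quiet -a -m "Second commit"
git -C "$work/alice" push --quiet origin HEAD

# Bob pulls, which merges Alice's new commit into his copy.
git -C "$work/bob" pull --quiet
git -C "$work/bob" log --oneline
```

Over a network the local paths would be replaced by ssh:// URLs like the one above; the sequence of commands stays the same.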
Branches
I won’t spend time on branching, but one of the key reasons to use git is that branches are cheap and easy, so you can easily split off experimental branches to do things that might fail spectacularly. The tutorial has good information about all this.
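For a taste of how cheap branches are, here is a minimal sketch of the split-off-and-merge-back cycle in a scratch repository. The branch name `experiment`, the file name and the author identity are placeholders of mine; the tutorial covers the real details.

```shell
# Placeholder identity so the commits work even on an unconfigured machine.
export GIT_AUTHOR_NAME="Example Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo="$(mktemp -d)"
cd "$repo"
git init --quiet
echo "stable" > plan.txt
git add plan.txt
git commit --quiet -m "Starting point"

# Remember the name of the branch we started on (it varies with git's defaults).
main_branch="$(git symbolic-ref --short HEAD)"

# Split off an experimental branch and commit something risky there.
git checkout --quiet -b experiment
echo "risky idea" >> plan.txt
git commit --quiet -a -m "Try a risky idea"

# The experiment worked out: go back and merge it in, then delete the branch.
git checkout --quiet "$main_branch"
git merge --quiet experiment
git branch -d experiment
```

Had the experiment failed spectacularly, switching back and deleting the branch with `git branch -D experiment` would have discarded it without touching the starting branch.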