Gitting started

Pat McGee

Copyright 2008, 2009 James Patrick McGee. Email: JPM at XorAndOr dot com. This work is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Abstract

Here are instructions for using git in various situations I've encountered. Mostly this is as a single developer working on projects on both Macs and Windows machines.

I've tested these instructions on both Macs and Windows machines.

Work in progress

I'm still writing this because I haven't solved all the problems I need to solve. Anytime you see something marked with "+++", this signals a question that I haven't answered yet. If you know the answer, please email me at the address given above. I'll update the document. Please let me know if you want me to credit you in the revision.

Note: when explaining git, I assume that you're able to use Terminal and vim. If you aren't yet, you should be. (_Real_ programmers use a magnetized needle and a steady hand. http://xkcd.com/378) Go practice.

Why should I use git?

I like keeping track of most of the changes I make while writing programs and sometimes documents. Sometimes I delete something and later think, "Wait, I wanted that!" Sometimes I screw up and decide that it's easier to go back a version or two and start from there instead of trying to fix the muddle I've made.

Those are good reasons even for projects where I'm the only developer and the only customer. When I deal with customers, I want to keep track of every version I give them. That way, when they complain about something they don't like, I can see precisely what I gave them to help track down the problem. When I deal with other developers, I want to be able to merge changes they made with the changes I make, and to send them only the changes I've made.

Lastly, I like using cool tools. I've dealt with version tracking programs since SCCS and RCS. I hated them and never used them when I didn't have to. I used PVCS, CVS, and Subversion, hoping to find something better. Not really. Now it looks like finally git is useful and sufficiently unfrustrating that I'll actually like using it. We'll see.

Get and install the tools

To install git on a Mac, +++ Write this when I get my MacBook back from Apple and I can do an install on a clean system.+++

To install git on a Windows system, go to git-scm.com/downloads. Click on the link to the msysGit version. From that link, select the "Featured" version. When it has downloaded, execute the installer. In the Select Additional Tasks pane, select the "Add Git Bash Here" and "Add Git GUI Here" boxes. Select the "Use Git Bash only" option. Select the "Use OpenSSH" option.

Working by myself on a single Mac / Unix computer on a single project

Create a working directory. If you're on a Mac, make it a directory that Time Machine backs up.

Create a more permanent directory to hold the public repository. Put this on a disk / directory that gets copied to your offsite backups.

In general, don't put the more permanent directory in a directory that Time Machine backs up. TM operates asynchronously to git. In a race condition, the integrity of any given TM backup of a git repository is not guaranteed. This is only an issue if something fails during both a commit and a TM backup. It will probably be easier to recover from the TM backup of the working directory.

Create the public repository:

cd <public repository>/MyProject

git init

<create or copy in the first version of at least one file. It can be a placeholder, not a useful file.>

git add <filename>

git commit

Create a working directory and populate it with the initial version of the project file(s):

cd <directory above where you want the working directory>

git clone <public repository>/MyProject

Edit some files in the working directory.

Put the changes into the working repository:

git add <filename>

git commit

Another way to put the changes into the working repository:

git commit -a

Copy the changes to the public repository:

git push

One thing I don't like about this is that when you do the push, git only updates the git repository, that is, the .git subdirectory. It does not update the files in the level above that. This means that if you go to that directory (possibly after restoring from a backup), you'll see a whole bunch of obsolete files and might forget that the .git subdirectory does contain the current versions. To update those files:

cd <public repository>/MyProject

git checkout -f

+++ It looks like using a receive-hook to do the checkout may solve this issue.
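Here is a rough sketch of the whole single-computer cycle as one script, run in a scratch directory. It is an illustration, not part of the tested instructions above. The directory names, file contents, and the "Example User" identity are made up. It also assumes a newer git than the one these notes were written with: newer versions refuse a push into a checked-out branch unless the 'receive.denyCurrentBranch' setting says otherwise, and (from about version 2.3) accept the value 'updateInstead', which may be one answer to the open question above.

```shell
#!/bin/sh
# Sketch only: paths, contents, and identity below are hypothetical.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)

# Public repository with one placeholder file.
mkdir -p "$top/public/MyProject" "$top/work"
cd "$top/public/MyProject"
git init -q
echo "v1" > test1.txt
git add test1.txt
git commit -q -m "Placeholder"
# Assumption: needed on newer git so the push below is accepted.
git config receive.denyCurrentBranch ignore

# Working directory cloned from the public repository.
cd "$top/work"
git clone -q "$top/public/MyProject"
cd MyProject
echo "v2" > test1.txt
git commit -q -a -m "Second version"
git push -q

# The public .git directory has v2, but the file beside it still says v1.
cd "$top/public/MyProject"
stale=$(cat test1.txt)
git checkout -f
fresh=$(cat test1.txt)

# Possible alternative: let the push update the checked-out files itself.
git config receive.denyCurrentBranch updateInstead
cd "$top/work/MyProject"
echo "v3" > test1.txt
git commit -q -a -m "Third version"
git push -q
auto=$(cat "$top/public/MyProject/test1.txt")

echo "stale: $stale, after checkout -f: $fresh, after updateInstead push: $auto"
```

With 'updateInstead', git only updates the public files if nothing there has been edited by hand, which matches how the public directory is used in these notes.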

Working by myself on a single Windows computer on a single project

You don't start git on Windows the same way as you do on Mac/Linux/Unix. Think of git as a method name inside an object. In this analogy, a directory is the object. So, use Windows Explorer to navigate to a directory. Then right-click on the directory. "Git Bash Here" is on that menu. So, it's a verb to apply to the noun (the directory).

Create a directory for the public repository. Put it in a directory that gets backed up frequently and offsite. Inside that directory, create a subdirectory, "MyProject".

Select the "MyProject" directory, and right-click. Select "Git Bash here".

Inside the directory, create a file: test1.txt, and put something uninteresting in it: "abc" or somesuch.

In the bash window, type:

git init

git add test1.txt

git commit

(The editor git uses by default is vi. People who have little history using Unix may find this strange. Remember to type 'i' to switch from edit mode to insert mode before typing something, hit the 'ESC' key to get back from insert mode to edit mode, and 'ZZ' while in edit mode to save the changes and exit. If you completely screw things up, hit the 'ESC' key to make sure you're in edit mode, and type ':q!'. This will abort the edit, which will abort the commit. Then you can start over.)

Now create a directory for the private repository.

Open a Windows Explorer window and navigate to the directory two levels above (two levels, count them: two) where you want the project to be. Select the directory one level above where you want the project to be. Right-click and select "Git Bash Here". Inside the bash window, type:

git clone <public repository>/MyProject

Close that bash window. In the Windows Explorer window, select the MyProject directory. Right-click and select 'Git Bash Here'.

Edit the test1.txt file inside the project directory. In the bash window for the project, type the following:

git commit -a

git push

Back in the bash window for the public repository, type:

git checkout -f

+++ Figure out how to eliminate the last step with a receive-hook. +++

Working by myself on two networked Mac / Unix computers on a single project

Remote mount the directory in which the public repository lives. It's also possible to have git talk directly to a daemon on the other machine, but that's probably not worth the extra time to learn.

Here's the complete setup:

Create the public repository:

cd <public repository>/MyProject

git init

<create or copy in the first version of at least one file>

git add <filename>

git commit

Create the working directory on computer Able:

cd <directory above where you want the working directory>

git clone <public repository>/MyProject

Do the same thing on computer Baker:

cd <directory above where you want the working directory>

git clone <public repository>/MyProject

Edit the documents on computer Able.

Check in the documents on computer Able:

git commit -a

git push

On computer Baker, retrieve the changes:

git pull

As long as you don't edit one of the files without doing a push / pull first, this will work just fine. Remember to do a pull before starting work on either computer, and a push before you change computers or take a break.
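The push / pull rhythm above can be tried out on one machine before trusting it across two. The following is only a sketch: the "able" and "baker" directories stand in for the two computers, and the paths, file name, and identity are invented for the example. The 'receive.denyCurrentBranch' line is an assumption about newer git versions, which otherwise refuse a push into a repository that has files checked out.

```shell
#!/bin/sh
# Sketch only: two directories play the roles of computers Able and Baker.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)
mkdir -p "$top/public/MyProject" "$top/able" "$top/baker"

# Create the public repository with a first version of one file.
cd "$top/public/MyProject"
git init -q
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "First version"
git config receive.denyCurrentBranch ignore   # assumption: newer git needs this

# Clone it once per "computer".
(cd "$top/able" && git clone -q "$top/public/MyProject")
(cd "$top/baker" && git clone -q "$top/public/MyProject")

# Edit and check in on Able.
cd "$top/able/MyProject"
echo "edited on Able" >> notes.txt
git commit -q -a -m "Edit on Able"
git push -q

# Retrieve the changes on Baker.
cd "$top/baker/MyProject"
git pull -q
baker_last_line=$(tail -n 1 notes.txt)
echo "Baker now sees: $baker_last_line"
```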

Working by myself on two non-networked Mac / Unix computers on a single project

The basic approach is to first set up a duplicate directory containing something on each computer, and then somehow transfer (email, ftp, whatever) the patches back and forth, applying them to the receiving system.

The simple way to do this would be to create a single file containing just one line of text. Do this independently on both computers. It doesn't matter what the file name is or what the contents are, as long as they are the same on both. It's probably better to give the file a name that won't be used in the real project.

Do the following on both computers:

cat > test1.txt

abc

^D

git init

git add test1.txt

git commit

git tag last-send

Then, on the sending computer, import all the project files that exist so far and commit them.

cp <whatever> .

git add <files>

git commit

git format-patch -M -C last-send..

git tag -f last-send

This will create one or more files whose names start with four-digit numbers and also contain the patch comment as part of the file name. Somehow get all those files to the receiving computer. Then do the following:

git am <names of the patch files>

git tag -f last-send

When you make new changes, do the same thing.

The previous steps work fine as long as you only make changes to one system at a time. You can send any number of commits from one system to the other before switching computers. When you switch computers, you must remember to apply all the patches from the other system and update the tag before making any new changes.

When you type 'git tag last-send', you're essentially adding a bookmark (called a tag) at that place. The '-f' option moves a bookmark that already exists. As long as you keep the bookmarks in the two repositories synchronized, everything will work fine. The format-patch command will only create patches for those changes made since the bookmark.

You can send the files to the other computer in any way that works for you: sneakernet, email, remote mounts, whatever. There's undoubtedly a way to set up email so that sending the patches from one computer to the other can be completely automated. But that's beyond the scope of what I care about right now. If you're using email, just attach them at one end and save them at the other end.
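To see the bookmark idea at work, here is a sketch that plays both computers in one scratch directory. The "sender" and "receiver" directories, the "transfer" directory standing in for email or sneakernet, the file contents, and the identity are all made up for the example. The '-o' option just tells format-patch where to write the patch files.

```shell
#!/bin/sh
# Sketch only: "sender" and "receiver" stand in for the two computers.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)
mkdir "$top/sender" "$top/receiver" "$top/transfer"

# Same starter commit and bookmark, made independently on each side.
for machine in sender receiver; do
    cd "$top/$machine"
    git init -q
    echo "abc" > test1.txt
    git add test1.txt
    git commit -q -m "Starter file"
    git tag last-send
done

# Sender: commit real work, write it out as patches, move the bookmark.
cd "$top/sender"
echo "real project content" > project.txt
git add project.txt
git commit -q -m "Import project files"
git format-patch -q -M -C -o "$top/transfer" last-send..
git tag -f last-send

# Receiver: apply the patches, then move its own bookmark.
cd "$top/receiver"
git am -q "$top/transfer"/0*
git tag -f last-send

received=$(cat project.txt)
unsent=$(git rev-list --count last-send..HEAD)
echo "received: $received; commits newer than last-send: $unsent"
```

The last line checks the point of the bookmark: once the receiver has moved its tag, there is nothing newer than 'last-send', so a later format-patch in the other direction won't send the same changes back.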

Working by myself on two non-networked Windows computers on a single project

On each computer, do the following:

Create the "MyProject" folder. In the MyProject window, create a starter file: test1.txt, with the same uninteresting contents: "abc" or somesuch. Select that directory in Windows Explorer and right-click on it. Execute 'Git Bash Here'. In the bash window, type:

git init

git add test1.txt

git commit

git tag last-send

After you've done this on both systems, add some interesting content on one of them. Commit the changes. On this system, type:

git format-patch -M -C last-send..

git tag -f last-send

Somehow transport the numbered files to the other computer and put them somewhere. (Probably not in the MyProject directory to avoid possibly confusing any IDE you might be using.) On that computer, type:

git am <directory>/0*

git tag -f last-send

If you forget to tag the repository with 'last-send' after receiving the patches, then when you update the other way, you'll send back the patches that you received. If you do this, you'll confuse git and need to take remedial action. (The error message describes the actions needed pretty well.)

Working with someone else on two non-networked computers on a single project

Do the same as in the previous section. The trick is to maintain discipline about updating the 'last-send' tag at all the proper times. Even without that, the fact that each commit is packaged separately into a file whose name starts with a number and includes the commit comment should help a lot.

As long as both people don't change the same file, everything works just fine. When both people change the same file and one tries to apply the changes that the other sends, git will complain. See the next section on dealing with overlapping changes for how to deal with this.

Dealing with overlapping changes

When 'git am' tries to process a patch on a file that people have modified in both places, it stops with an error message.

There are several interesting cases:

The receiving user has edited but not committed a file that 'git am' would have changed. (The error message will read something like this: "error: test1.txt: does not match index.") In general, you should avoid situations where this error could occur; you should always commit your changes before running 'git am'.

The receiving user has committed a change to a file that 'git am' would have changed and git can't automatically resolve the differences. (The error message will read something like this: "error: patch failed: test1.txt:1, error: test1.txt: patch does not apply.")

In the first case, the easy thing to do is to take your local changed copy of the file and move it elsewhere. Then abort the 'git am' and run it again.

mv test1.txt test1.txt.save

git am --abort

git am <directory>/0*

Now you've got a local copy of the changed file but with a different name, and you've got the changes that the other person sent you in both the repository and the working directory. Compare the two files and figure out how to edit test1.txt to keep both sets of changes.

There are several ways to do this. From the command line, 'git diff --no-index <one file> <other file>' shows the differences between two files. For a GUI program on the Mac, use FileMerge (which is one of the tools in /Developer/Applications/Utilities). For Windows, use WinMerge (available from www.winmerge.org).

In the second case, you've got two committed versions with edits that conflict, one committed here and the other committed there. You need to edit the file to have only a single version. Then you need to commit that version here, and to send your changes to the other person so he can commit them there.

If you're really facile with text editors, you can fix this by editing the files that contain conflicts. After you get the error from running 'git am', run the following:

git am --abort

git am -3 <directory>/0*

What this does is put both sets of changes, suitably marked with extra delimiter lines, into the working copy of the file. Use your favorite text editor to edit the files; make the appropriate changes to make both sets of changes work. Remember to remove all the delimiter lines; they'll just confuse the compiler. Compile and test. When you finish, do the following:

git add <name of file(s) you edited>

git am -3 --resolved

This works, but it's a real pain if the changes are anything except dead simple. For more complex changes, you really want to use a GUI file merge program.
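The second case can be reproduced in a scratch directory to see what the delimiter lines look like before meeting them in real work. This is a sketch under stated assumptions: the "sender" and "receiver" directories, the file contents, and the identity are invented, and the line that overwrites test1.txt is a stand-in for editing the file by hand. It goes straight to 'git am -3' rather than failing first and aborting.

```shell
#!/bin/sh
# Sketch only: both "computers" live in one scratch directory.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)
mkdir "$top/sender" "$top/receiver" "$top/transfer"

for machine in sender receiver; do
    cd "$top/$machine"
    git init -q
    echo "abc" > test1.txt
    git add test1.txt
    git commit -q -m "Starter file"
    git tag last-send
done

# Both sides change the same line and commit.
cd "$top/sender"
echo "abc from sender" > test1.txt
git commit -q -a -m "Sender edit"
git format-patch -q -o "$top/transfer" last-send..

cd "$top/receiver"
echo "abc from receiver" > test1.txt
git commit -q -a -m "Receiver edit"

# The three-way apply stops and leaves delimiter lines in test1.txt.
git am -3 "$top/transfer"/0* || echo "git am stopped on a conflict"
markers=$(grep -c '^<<<<<<<' test1.txt)

# Stand-in for hand editing: write the merged line, then finish the am.
echo "abc from sender and receiver" > test1.txt
git add test1.txt
git am --resolved
subject=$(git log -1 --format=%s)
echo "conflict blocks: $markers; last commit: $subject"
```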

+++ What GUI program should I use, and how should I set things up to call it with the proper arguments? +++

+++ Stuff I've written but I'm not sure is right or I'm not sure it belongs here +++

Using multiple repositories

Git, like any version tracking program, keeps previous versions of files inside things called repositories. There is always at least one. I find it very limiting to use only one. Here are some of my reasons why:

It's often desirable to back up only the source code and not all the other compiler output. Keeping one repository inside the IDE working directory and another outside it just for the source code is a nice way to do this. (Call these the working and the source repositories.)

If you're writing code on two different computers and the local directories are not visible from each other, it's convenient to exchange code through a single repository visible from both. (Call these the private and the public repositories.)

If you're doing a lot of experiments, it's sometimes nicer to keep one repository of all experiments and another only of successful experiments. (Call these the exploratory and the distribution repositories.)

If you're delivering code to users, it usually works out better to have a repository that has only delivered code, not all the code that you discarded on the way to that working code.

Automatic backups while working

Some computers have automatic backup programs set up to make backups while you're working. For example, Time Machine on the Mac makes hourly backups.

If you store your git repository in a directory that gets backed up that way, make sure you make other backups that happen while you're not working. If you need to restore, only use those backups for the contents of the git repository.

The reason is that the two processes run asynchronously, so there's no way to guarantee that a specific backup of a git repository is internally consistent. If your hard drive fails and you restore from such a backup, maybe all your history will be present and maybe it won't.

You'd do better to restore from a backup that you know couldn't be inconsistent, and to commit the current versions of the working files from the restored working directory.

Background on git internals

Your files will live in four different places. First is the working directory, where you make your edits. Next is a cache in which git stores copies of your files as you move them between the working directory and the repository. A file stored in the cache is an independent entity; it isn't connected with any particular version or tag or directory tree. Then there's your private repository, in which a directory is linked to the entire rest of the project and the history and the tags. Finally, there's the public repository.

You cannot move a file directly from the working directory into either repository. You must stage it through the cache first. Some git commands hide this staging action from you; that doesn't mean it doesn't happen. You'll get confused if you forget it. Also, you cannot move a file from the cache into the public repository. You must stage it through your private repository.

So, here are the places between which files can move:

working <=> cache <=> private repository <=> public repository.

You can't skip a step, even though some git commands do two steps in the same command invocation.

(Don't get confused by the name: git's own documentation calls the cache the index, or the staging area. A repository keeps its committed history separately, in an object database under .git/objects. The cache and a repository are two different structures, and are definitely not interchangeable. If you haven't read enough documentation to have seen a mention of the index, ignore this note.)
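One way to watch a change move through the first three places is to ask 'git status --porcelain' after each step. The sketch below is an illustration only; the scratch directory, file name, and identity are made up, and it stops at the private repository rather than pushing anywhere. In the short status format, the first column describes the cache and the second describes the working directory.

```shell
#!/bin/sh
# Sketch only: follows one change from working directory to cache to repository.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)
cd "$top"
git init -q
echo "v1" > test1.txt
git add test1.txt
git commit -q -m "First version"

echo "v2" > test1.txt
in_working=$(git status --porcelain)   # changed in the working directory only
git add test1.txt
in_cache=$(git status --porcelain)     # the change is now staged in the cache
git commit -q -m "Second version"
in_repo=$(git status --porcelain)      # nothing left pending

printf 'working only: [%s]\ncache:        [%s]\ncommitted:    [%s]\n' \
    "$in_working" "$in_cache" "$in_repo"
```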

Figure out where to put the git public repository

In this tutorial, I'm going to configure two separate git repositories. The first is a public repository. (For the private repository, see later instructions.) Other people working on the project can access this. Or, if you have occasion to work on the project from two different computers, you can access it from both. And this makes a great place to keep your latest complete working version of the code, so you can demo it to customers or users at any time.

Put this repository in some directory that gets copied to the off-site backups when you make them.

If you don't really need this because you're working alone on a computer that has only a single hard drive and you don't have any customers, well, do it anyway. It's good practice for when you will need it. (And, I'm too lazy to write two sets of instructions.)

Pick or make a directory on some network drive that everyone has the appropriate read and write permissions on.

(Note: Git does support web-based access, but I'm not going to explain that for this simple tutorial.)

I'm going to use "/Volumes/Public/Projects" as the directory for the public repository in this tutorial. Substitute your own directory name every time I use that one.

In a terminal window, type the following:

cd /Volumes/Public

mkdir Projects

Use git to initialize the public repository

Now that Xcode has created the project directory, you need to create a git repository in it.

Execute the following commands in a Terminal window to create the public repository:

cd /Volumes/Public/Projects/

cd MyProject

git init

Next you need to store the project definition into the git repository.

git add English.lproj/ Info.plist MyProject* main.m

git commit

At this point, git will open the vim editor on a temporary file. The cursor will be on the first line, which is blank. The bottom part of the file shows the status of the various files that git sees, putting them in three different sections. First, in a section labeled "Changes to be committed", are the files that the commit will add to the repository. Second, in "Changed but not updated", are files that you've told git that you're interested in tracking and have changed, but you didn't put into the cache. Third, in "Untracked files", are files that you have not told git you are interested in tracking. If there are no files to display in a particular section, git will not display that section.

You should see something like this:


# Please enter the commit message for your changes. Lines starting

# with '#' will be ignored, and an empty message aborts the commit.

#

# Committer: Pat McGee <jpmcgee@MusicBox.local>

#

# On branch master

#

# Initial commit

#

# Changes to be committed:

# (use "git rm --cached <file>..." to unstage)

#

# new file: English.lproj/InfoPlist.strings

# new file: English.lproj/MainMenu.xib

# new file: Info.plist

# new file: MyProject.xcodeproj/TemplateIcon.icns

# new file: MyProject.xcodeproj/jpmcgee.mode1v3

# new file: MyProject.xcodeproj/jpmcgee.pbxuser

# new file: MyProject.xcodeproj/project.pbxproj

# new file: MyProject_Prefix.pch

# new file: main.m

#

According to this status, git is about to commit all the files listed as being new files. More importantly, there is not a section headed "Changed but not updated" or one headed "Untracked files". If there had been, this would mean that you mistyped some file name in the 'git add' command above.

If git tells you something you don't expect, abort the 'git commit' command and fix things before trying it again. To abort the command, don't add anything to the file and exit vim. Type 'ZZ' or ':q' without inserting anything, or type ':q!' if you already inserted something.

After you've verified that git will commit all the files you want it to and only those files, insert some comment into the top line of the file and close the file. Make the comment reasonably informative, probably something like "Initial Xcode project definition".

(Remember, vim is a mode-based editor. When it starts, you're in edit mode. In order to add something, you must change to insert mode, usually by typing 'i'. When you've changed to insert mode, the status line at the bottom of the terminal window will change to read "-- INSERT --". If it says anything else, you're in edit mode. When you've finished adding, hit the ESC key to get back into edit mode. (Did the "INSERT" on the status line go away?) Then type 'ZZ' to close the file and exit vim.)

After you exit the editor, git will display a bunch of status messages. As long as they start with "Created commit...", you've done things correctly.

Use git to create your private repository and working directory

Xcode will store its files in your working directory. git will also store your private repository inside that directory.

I'm going to use a subdirectory named "work" in my home directory. Of course, Xcode demands a subdirectory named "MyProject" inside "work".

Using the git 'clone' command will create that subdirectory, create a private repository, associate it with the public repository, and copy all the files from the public repository into both your private repository and the working directory.

cd ~/work

git clone /Volumes/Public/Projects/MyProject
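To check that the clone really did all of that, a quick sketch like the following can be run in a scratch directory. The paths mirror the ones in this tutorial but live under a temporary directory, and the placeholder file and identity are invented for the example. The name "origin" is what git calls the repository a clone came from.

```shell
#!/bin/sh
# Sketch only: a scratch copy of the public / work layout used in the text.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)
mkdir -p "$top/Public/Projects/MyProject" "$top/work"

# Stand-in for the public repository with one committed file.
cd "$top/Public/Projects/MyProject"
git init -q
echo "placeholder" > main.m
git add main.m
git commit -q -m "Initial project definition"

# Clone into the work directory; git creates the MyProject subdirectory.
cd "$top/work"
git clone -q "$top/Public/Projects/MyProject"
cd MyProject

remote_name=$(git remote)     # the association with the public repository
cloned_file=$(cat main.m)     # the file copied into the working directory
git remote -v                 # shows where pushes and pulls will go
```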

Commit your changes to the repositories

Git helps you keep track of changes in your files. You've changed some files. It's time to tell git to put those changes into both repositories. Use 'git add' to add the changes to the cache, then 'git commit' to copy them from the cache to your private repository, then 'git push' to copy them to the public repository.

You'll need to make a decision rule about when to commit changes to your private repository and another rule about when to push those changes to the public repository. Pushing to the public repository is a social act, and you'll need to negotiate with the other people using your code.

My preferred rules of thumb are to commit to the private repository every time I wrap up a work session or finish a feature, and to push to the public repository every time I finish testing a new feature and verify that it works. For me, this corresponds to a single story (in the XP sense.) That way, at any time anyone can compile the code in the public repository and demo the current code.

I also commit to the private repository every time I create a new branch or merge an old branch.

The right answer depends on your particular context, and you'll find out where it is only by experimenting and getting it wrong a bunch of times. IMHO, the more people you work with, the fewer changes you should make before doing a commit, and the right answer is always to commit about twice as often as you think you need to.

That said, never ever ever push something that causes the code in the public repository to not compile, and only very rarely commit something that doesn't completely work. If you want to experiment with something and want to keep track of your interim changes, use branches. This is a topic beyond the scope of what I can write here.

Now for the detailed steps. First put the changes into the cache:

cd ~/work/MyProject

git add AppDelegate.* MyProject.xcodeproj

(Make sure you do not add 'build/'.)

Next copy the changes from the cache into your private repository:

git commit

You should see five files listed under the "Changes to be committed" section, none in the "Changed but not updated" section, and only "build/" in the "Untracked files" section.

Add a descriptive commit message. Remember the 'i' to tell vim to change to insert mode, the ESC to change back to edit mode, and the ZZ to close the file and exit.

Copy the changed files to the public repository:

git push

Commit the changes (using a shorter procedure this time)

After saving the .xib file, use 'git commit -a' to commit the changes to both the cache and your local repository.

This command combines the actions of 'git add' and 'git commit' by first looking for modifications in each of the files that git is tracking. If there are any, it sets up to do both an add and a commit. This is just a convenience command; it doesn't do anything different than 'git add' followed by 'git commit'. It's just less to type, and git does the work of figuring out which files changed, instead of requiring you to remember them.

In a terminal window, do the following:

cd ~/work/MyProject

git commit -a

Verify that the "Changes to be committed" section includes the MainMenu.xib file. (It will probably also include some files inside the MyProject.xcodeproj/ directory. I don't yet know what those files are for, so I'm ignoring them.)

Add a commit message, and exit the editor. (You do still remember about insert and edit modes, don't you?)
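One consequence of "files that git is tracking" is easy to miss: 'git commit -a' picks up edits to files git already knows about, but not brand-new files. A small sketch, with made-up file names and identity, and '-m' used to supply the commit message so no editor opens:

```shell
#!/bin/sh
# Sketch only: shows what 'git commit -a' does and does not pick up.
set -e
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"
top=$(mktemp -d)
cd "$top"
git init -q
echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "First version"

echo "v2" > tracked.txt       # modify a file git is already tracking
echo "new" > untracked.txt    # create a file git has never been told about

git commit -q -a -m "Second version"
leftover=$(git status --porcelain)
echo "still pending after commit -a: $leftover"
```

The new file still needs an explicit 'git add' before a commit will include it.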