Relative Sanity

a journal

Thinking git through

When it comes to understanding git, I've found myself struggling to provide beginners with a good overview of how I intuitively picture what's happening when I use various git commands.

Usually, I end up scribbling a pile of dots on a bit of paper and saying something like the following:

it's really simple: each dot is a commit, and each dot is linked to at least one "parent" dot. A branch is just a label attached to one of the dots. "Committing a change to a branch" really just means "adding a dot and moving the label to that new dot". git reset really just means "move the label to a specific dot", and git merge just means "add a new dot which links back to two other dots, then move the label to that new dot".

I sense the eyes begin to glaze over around the point where I claim this is "simple".
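For anyone who prefers a terminal to a pile of dots, here's a rough sketch of the same idea in a throwaway repository. The temp directory, file contents and commit messages are all made up for illustration; the point is only to watch the master label move.

```shell
# Throwaway repository in a temp directory (every name here is illustrative).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" whatever the local default is
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false

# First dot.
echo "hello" > README
git add README
git commit -q -m "adds README"
first=$(git rev-parse master)             # the dot the "master" label points at

# Second dot: committing adds a dot and moves the label to it.
echo "more" >> README
git commit -q -am "expands README"
second=$(git rev-parse master)

# The new dot links back to its parent, the first dot.
parent=$(git rev-parse master^)

# reset just moves the label back; the second dot still exists in the repository.
git reset -q --hard "$first"
after_reset=$(git rev-parse master)
still_there=$(git cat-file -t "$second")

echo "first=$first second=$second parent=$parent after_reset=$after_reset ($still_there)"
```

The hashes will differ on every run, but parent should match first, and after_reset should match first too, with the second commit still sitting there as an object.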

The internet to the rescue!

It turns out Sam Livingston-Gray had a similar explanation, but was smart enough to realise that this looks exactly like graph theory.

Seriously, go and read his amazing Think like (a) git. I mean it, it'll take like twenty minutes.

I'll wait.

Seriously

Go. I'll wait.

Read it?

Isn't it awesome? I love it! It perfectly captures the intuition that clicked for me once I realised that the base units of git are commits, not branches or tags or files or anything like that.

git really is quite simple: nodes and edges. Easy!

Except…

I really do love almost everything about that resource, but there's one part that I think could be streamlined a little. Part way through, Sam has this great moment where he fesses up:

… before I tried something I was a little uncertain about, I would back up the entire directory.

We've all been there, right? That moment when we're not sure, so we take a backup so that we can always get back if it goes horribly wrong. No harm in that, for sure, but if there's one lesson I have learned about git, it's that it's likely already a step ahead of me.

Buckle up. This is going to get awesome.

Merge in turn

The issue Sam is highlighting here is the fear that "when I merge, there's no simple way for me to see what the result of that merge will be, in absolute terms. How can I preview the result so I can be comfortable?"

The solution he outlines, which is perfectly reasonable, is to create a new test branch from the merge target (probably master), and then to merge the changes down to the new branch. If it all looks good, we can merge the changes down to master. If not, we just blow away the test branch and figure out what went wrong.
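As I read it, that rehearsal looks roughly like the sketch below. The branch name test_merge is my own placeholder, not something taken from Sam's article, and the repository is a throwaway built just for the demonstration.

```shell
# Throwaway repository (all names are illustrative).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false

echo "readme" > README
git add README
git commit -q -m "adds README"

# Some work on a feature branch.
git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

# Rehearse the merge on a scratch branch cut from master.
git checkout -q -b test_merge master
git merge -q --no-edit awesome_feature
# ...poke around here and decide whether the result looks sane...

# Happy or not, master has not moved. Throw the scratch branch away.
git checkout -q master
git branch -q -D test_merge
```

If the rehearsal looked good, the real merge is the same command run from master.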

Thing is, this is similar to the pattern of backing up before we go forward, but unlike that pattern, it doesn't really buy us much.

Reference needed

The great thing about git is the fact that it very rarely throws anything away. Let's consider this simple workflow:

git checkout -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -m "adds feature checklist"

So this has passed peer review, and I'm ready to merge down to master. What's my usual flow? Let's assume master is up-to-date with all our remotes, and dive right in:

git checkout master
git merge awesome_feature

Wait, what?! That's terrifying!

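What did I just do? Well, I merged my branch down to master, and nothing went wrong. (Since master hadn't moved in the meantime, git would normally just fast-forward here; the merge commit you'll see in the output further down assumes fast-forwards are switched off, with --no-ff or merge.ff set to false.)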

Or, at least, I think nothing went wrong. How can I know for sure?

The simplest way I can think of, if you're dealing with a remote (on GitHub, say), is to diff with that remote. Assuming your remote is called origin:

git diff origin/master

This will show you all the changes that your merge has introduced compared with the remote version of master (as of your last fetch), and obviates the need to create a "copy" of master to check against "actual" master. Your local master already is that copy, and the original remains as origin/master, which only moves when you fetch or push.
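Here's a self-contained sketch of that comparison, assuming a bare repository in a temp directory standing in for GitHub. The paths origin.git and work are invented for the example, and the fetch is there because origin/master is only as fresh as the last time you talked to the remote.

```shell
# A bare repo plays the part of the remote (paths are illustrative).
base=$(mktemp -d) && cd "$base" || exit 1
git init -q --bare origin.git
git init -q work && cd work || exit 1
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false
git remote add origin "$base/origin.git"

echo "readme" > README
git add README
git commit -q -m "adds README"
git push -q origin master                 # origin/master now points at this commit

git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

git checkout -q master
git merge -q awesome_feature              # fast-forwards, since master has not moved

git fetch -q origin                       # refresh origin/master before comparing
ahead=$(git rev-list --count origin/master..master)   # commits the merge put on top of the remote
changed=$(git diff --name-only origin/master)         # files that differ from the remote master

git log --oneline origin/master..master
git diff --stat origin/master
echo "ahead=$ahead changed=$changed"
```

The log with the two-dot range lists only what the merge brought in, which I find easier to scan than a full diff when the branch is large.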

That's all well and good, but what if the merge is a complete clustercuss? My checked-out master is now a bombsite. How do I fix this mess?

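Well, this is where reset comes in real handy. See, reset says "take the current branch (label), and point it at a specific commit". In this case, we want to take the master "label", and have it point at what it was before we merged. One warning first: the --hard flag also makes the index and working tree match that commit, so any uncommitted changes are thrown away. With a remote repo, the reset itself is trivial: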

git reset --hard origin/master

Boom. Our checked out master branch is now pointing at the same commit as origin/master is pointing at: in other words, we're back to where we started. Clean as a whistle.
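The whole round trip can be sketched like this, again with a bare repository in a temp directory pretending to be the remote (origin.git and work are made-up paths). I've added a check for uncommitted changes before the reset, since --hard overwrites the working tree; that guard is my own habit rather than anything git requires.

```shell
# A bare repo plays the part of the remote (paths are illustrative).
base=$(mktemp -d) && cd "$base" || exit 1
git init -q --bare origin.git
git init -q work && cd work || exit 1
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false
git remote add origin "$base/origin.git"

echo "readme" > README
git add README
git commit -q -m "adds README"
git push -q origin master

git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

git checkout -q master
git merge -q awesome_feature              # pretend this went badly

# --hard rewrites the working tree, so make sure nothing uncommitted is about to vanish.
if [ -n "$(git status --porcelain)" ]; then
  echo "uncommitted changes present; commit or stash them first" >&2
  exit 1
fi

git reset -q --hard origin/master         # move the master label back to where the remote has it
```

The awesome_feature branch is untouched by all of this, so the work itself is still there to fix up and merge again.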

No remote!

Okay, smarty: what happens if it's a purely local repo? You don't have an origin/master to fall back to, so you're buggered. Hah!

Not so fast. Let's say you don't have a remote. reset will still do the right thing, but we'll need to give it the actual SHA of the commit we were on. You wrote that down before you merged, right?

No?

NEVER FEAR. git totally has your back.

You might be thinking that you could just use git log to see where we were before. Here's my git log output:

git log --oneline
473578b Merge branch 'awesome_feature'
ebe0f08 adds feature checklist
9ea423f adds README

I mean, it's not bad, but it might not be clear which commit belongs to which branch. I could use a visualiser:

git log --oneline --graph
*   473578b Merge branch 'awesome_feature'
|\
| * ebe0f08 adds feature checklist
|/
* 9ea423f adds README

but where's the fun in that? git has a far cooler way of keeping track of where you've been:

git reflog
473578b HEAD@{0}: merge awesome_feature: Merge made by the 'recursive' strategy.
9ea423f HEAD@{1}: checkout: moving from awesome_feature to master
ebe0f08 HEAD@{2}: commit: adds feature checklist
9ea423f HEAD@{3}: checkout: moving from master to awesome_feature
9ea423f HEAD@{4}: commit (initial): adds README

Man, I love the reflog.

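The reflog is a local log of every commit HEAD has pointed at, along with some narrative of how you got there. It isn't quite forever: entries are pruned after a while (90 days by default, or 30 for commits no longer reachable from the tip), and it never leaves your machine. Note the second line of the output: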

9ea423f HEAD@{1}: checkout: moving from awesome_feature to master

It tells me quite clearly that I moved from my awesome_feature branch to master, and when I was done I was looking at commit 9ea423f. After that, I merged:

473578b HEAD@{0}: merge awesome_feature: Merge made by the 'recursive' strategy.

So 9ea423f is the right commit to compare with:

git diff 9ea423f

and the right one to reset to if it's all gone horribly, horribly Pete Tong:

git reset --hard 9ea423f

and there we go: dodgy merge aborted!
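If you'd rather not copy SHAs around by hand, the same undo can be sketched with names git already keeps for you. This assumes a throwaway local repository and a merge forced with --no-ff so there is a real merge commit to back out of. HEAD@{1} is the reflog entry from just before the merge, and ORIG_HEAD is a ref that git merge sets to wherever HEAD was before it started.

```shell
# Throwaway local repository, no remote (all names are illustrative).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false

echo "readme" > README
git add README
git commit -q -m "adds README"
before=$(git rev-parse HEAD)              # where master sits before the merge

git checkout -q -b awesome_feature
echo "[ ]: implement feature" > feature.todo
git add feature.todo
git commit -q -m "adds feature checklist"

git checkout -q master
git merge -q --no-ff --no-edit awesome_feature   # force a real merge commit

# Two ways to name "where I was before the merge":
from_reflog=$(git rev-parse 'HEAD@{1}')   # previous entry in HEAD's reflog
from_orig_head=$(git rev-parse ORIG_HEAD) # set by git merge before it moved HEAD

git diff --stat "$from_reflog"            # what the merge changed
git reset -q --hard "$from_reflog"        # and back we go
```

Both names should resolve to the commit master was on before the merge. Note that reset sets ORIG_HEAD as well, so read it before resetting rather than after.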

You had me right up till "simply"

This might seem a little convoluted, but hey: git itself seems "convoluted" compared to just backing up folders before you make any changes you're not sure of. The point here is that, because all you're really doing is making commits and applying labels, regardless of what you do, there's very little that ever gets lost once it's been committed. (Uncommitted changes are another story: reset --hard will happily eat those.) This frees you up to wield that git chainsaw with a little less fear.

Take this, for example: Let's say you've been working on a branch, and decide it's time to merge down to master. You check out, and then get called into a meeting. You come back and run git merge my_cool_feature… and then realise you weren't actually on master. You've just merged your cool feature into some other branch, and it's a complete mess. No worries:

git reset --hard HEAD@{1}

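HEAD@{1} means "where HEAD was one move ago", so this is a shortcut for resetting the current branch to where it was before the last thing that moved HEAD: in this case, to where it was before your crazy merge. If you've checked out or committed anything since the merge, look at git reflog and count further back first.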
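One wrinkle worth sketching: HEAD@{1} follows HEAD, so it stops meaning "before the merge" as soon as you check out something else. Each branch keeps its own reflog too, and that one only moves when the branch does. The branch names below (some_other_branch in particular) are invented for the example.

```shell
# Throwaway local repository (all names are illustrative).
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git config commit.gpgsign false

echo "readme" > README
git add README
git commit -q -m "adds README"

git checkout -q -b my_cool_feature
echo "cool" > cool.txt
git add cool.txt
git commit -q -m "adds cool feature"

git checkout -q -b some_other_branch master
echo "other" > other.txt
git add other.txt
git commit -q -m "adds other work"
good=$(git rev-parse HEAD)                # the state we will want back

git merge -q --no-edit my_cool_feature    # oops: this was meant for master

# Wander off and come back, as you do after a meeting.
git checkout -q master
git checkout -q some_other_branch

head_prev=$(git rev-parse 'HEAD@{1}')                  # now just "where I was before the last checkout"
branch_prev=$(git rev-parse 'some_other_branch@{1}')   # the branch's own previous position

git reset -q --hard "$branch_prev"        # undo the merge on the branch it actually hit
echo "head_prev=$head_prev branch_prev=$branch_prev good=$good"
```

Here head_prev ends up pointing at master's commit, which is not what we want, while branch_prev is the pre-merge commit. When in doubt, reading git reflog before resetting costs nothing.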

git is not just an awesome source control system, it's also silently keeping a little undo log running in the background, a meta log of all the commits you've looked at and how you got to each one.

git is not only a safety net for your code, it's a safety net for your process.

Now, seriously this time, go and read Think like (a) git.


Update: Thanks to Sam for the helpful feedback on a couple of points in this article.