Rewriting Git history

One of the things I often do in Git is rewrite my local history: I correct the message of the last commit, combine consecutive commits into one, or rebase my feature branch onto master to take in its latest changes. I always thought that such a rewrite modified my Git commits. That is not the case: it merely adds new commits. In this blog post I explain what happens.

A rewrite example: combining, or squashing, commits

To show what happens when you rewrite your local Git history, I start with a repo that contains a single README. The repo has seen three commits: the first adds the README, and the second and third update it. I will combine the second and third commits into one.

The Git log starts out like this:

$> git log
commit 71afd173843d044ef5c8cc7a911b61ac0449153c
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 15:00:38 2017 +0100

    Add second message to README

commit 97bbcd0893269d5ae3bd680c4a88bb83d341fe4a
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:59:45 2017 +0100

    Add message to README

commit 56b3a080cd720d6086b634fb6b18b04d3c037aaf
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:58:36 2017 +0100

    Add README

The following ASCII art shows the sequence of commit IDs:

[master] 56b3a --> 97bbc --> 71afd

Now I squash the last two commits into a single one:

$> git reset --soft HEAD~2
$> git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   README.md

$> git commit -am "Update README" 
[master 65d0c05] Update README
 1 file changed, 2 insertions(+)

This gives me the following log:

$> git log
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 15:15:49 2017 +0100

    Update README

commit 56b3a080cd720d6086b634fb6b18b04d3c037aaf
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:58:36 2017 +0100

    Add README

As mentioned, I thought that a rewrite of my Git history modified my commits, but the original commits are still there:

$> git log 71afd173843d044ef5c8cc7a911b61ac0449153c
commit 71afd173843d044ef5c8cc7a911b61ac0449153c
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 15:00:38 2017 +0100

    Add second message to README

commit 97bbcd0893269d5ae3bd680c4a88bb83d341fe4a
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:59:45 2017 +0100

    Add message to README

commit 56b3a080cd720d6086b634fb6b18b04d3c037aaf
Author: Pieter Swinkels <swinkels.pieter@yahoo.com>
Date:   Fri Mar 17 14:58:36 2017 +0100

    Add README

This means that the history looks like this:

[master] 56b3a --> 65d0c
              \
               --> 97bbc --> 71afd
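To check that picture yourself, here is a minimal sketch that replays the squash in a throwaway repo. The author identity and file contents are placeholders I made up for the demo, so the commit IDs will differ from the ones in this post.

```shell
#!/bin/sh
# Sketch: replay the squash in a scratch repo and inspect the result.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # placeholder identity for this demo only
git config user.email "author@example.com"

echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"
echo "first message" >> README.md
git commit -q -am "Add message to README"
echo "second message" >> README.md
git commit -q -am "Add second message to README"

old_tip=$(git rev-parse HEAD)                # plays the role of 71afd

git reset -q --soft HEAD~2
git commit -q -m "Update README"
new_tip=$(git rev-parse HEAD)                # plays the role of 65d0c

# The old tip is still a valid commit object in the object store.
old_type=$(git cat-file -t "$old_tip")
# Old and new tips fork from the same commit: the very first one.
fork_point=$(git merge-base "$old_tip" "$new_tip")
first_commit=$(git rev-list --max-parents=0 HEAD)

echo "old tip:    $old_tip ($old_type)"
echo "new tip:    $new_tip"
echo "fork point: $fork_point"
```

The squash produced a commit with a new ID next to the old chain; nothing was changed in place.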

What happens when I check out the previous HEAD of the master branch? Well, running git checkout 71afd173843d044ef5c8cc7a911b61ac0449153c gives me the following:

Note: checking out '71afd173843d044ef5c8cc7a911b61ac0449153c'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 71afd17... Add second message to README

So I can still work with the old commits much as I would on an ordinary branch, and as soon as it is appropriate, I can "convert" the chain of commits into an actual branch.
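That "conversion" can be sketched as follows, again in a scratch repo. The branch name before-squash and the author identity are hypothetical; the point is only that git branch accepts a commit ID as its start point.

```shell
#!/bin/sh
# Sketch: give the pre-squash chain of commits a branch name.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # placeholder identity for this demo only
git config user.email "author@example.com"

# Build three commits, then squash the last two, as in the example above.
for msg in "Add README" "Add message to README" "Add second message to README"; do
    echo "$msg" >> README.md
    git add README.md
    git commit -q -m "$msg"
done
old_tip=$(git rev-parse HEAD)
git reset -q --soft HEAD~2
git commit -q -m "Update README"

# Point a new branch at the old tip; "before-squash" is a made-up name.
git branch before-squash "$old_tip"

branch_tip=$(git rev-parse before-squash)
commit_count=$(git rev-list --count before-squash)
echo "before-squash points at $branch_tip and holds $commit_count commits"
```

Once a branch points at the old tip, the three original commits are reachable from a ref again and are no longer candidates for garbage collection.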

Will the original commits be in my repo forever? It depends. If they are on a branch or tag, they will remain in the repo. If not, Git's garbage collector may delete them once it considers them unreachable. For now, the original commits are still reachable from the reflog, the local record of where HEAD and the branch tips have pointed, and as such they are safe from deletion. However, that is a temporary safety net: reflog entries expire after a certain time, by default after 90 days, or after 30 days for entries that are no longer reachable from the current branch tip. Once that happens, the commits can be deleted (see footnote 1).
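To see that safety net at work, the sketch below looks up the old tip in the reflog and asks git fsck whether the commit counts as unreachable, once with and once without the reflog taken into account. It runs in a scratch repo with a placeholder identity; the position HEAD@{2} holds for exactly this sequence of commands and will differ in a real repo.

```shell
#!/bin/sh
# Sketch: the reflog is what keeps the squashed-away commits reachable.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # placeholder identity for this demo only
git config user.email "author@example.com"

for msg in "Add README" "Add message to README" "Add second message to README"; do
    echo "$msg" >> README.md
    git add README.md
    git commit -q -m "$msg"
done
old_tip=$(git rev-parse HEAD)
git reset -q --soft HEAD~2
git commit -q -m "Update README"

# HEAD@{0} is the new commit, HEAD@{1} the reset, HEAD@{2} the old tip.
git reflog
from_reflog=$(git rev-parse "HEAD@{2}")

# With the reflog counted as a root, the old tip is not reported...
if git fsck --unreachable 2>/dev/null | grep -q "$old_tip"; then
    with_reflog="unreachable"
else
    with_reflog="reachable"
fi
# ...but when the reflog is ignored, it is.
if git fsck --unreachable --no-reflogs 2>/dev/null | grep -q "$old_tip"; then
    without_reflog="unreachable"
else
    without_reflog="reachable"
fi

echo "with reflog:    $with_reflog"
echo "without reflog: $without_reflog"
```

The expiry periods can be inspected or changed through the gc.reflogExpire and gc.reflogExpireUnreachable configuration settings.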

Conclusion

The example in the previous section shows that a rewrite of Git history does not modify commits; it adds new ones. The old ones are still available, albeit only temporarily. Knowing this, rewriting my Git history has become a lot less scary.

Footnotes:

1. The information came from this StackOverflow question: What's the difference between git reflog and git log?
