How to truncate git history (sample script included)
28th March 2011
Under a few assumptions (most importantly, that you do not have any unmerged branches), it is very easy to throw away git repository commits older than an arbitrarily chosen commit.
Here’s a sample script (call it e.g. git-truncate and put it into your ~/bin or any other location you have in PATH).
#!/bin/bash
git checkout --orphan temp $1
git commit -m "Truncated history"
git rebase --onto temp $1 master
git branch -D temp
# The following 2 commands are optional - they keep your git repo in good shape.
git prune --progress # delete all the objects w/o references
git gc --aggressive # aggressively collect garbage; may take a lot of time on large repos
Invocation: cd to your repository, then git-truncate refspec, where refspec is either a commit’s SHA1 hash-id, or a tag.
Expected result: a git repository starting with a “Truncated history” initial commit, and continuing to the tip of master (the rebase step hard-codes that branch name, so change it if you were on a different branch when calling the script).
If you truncate repositories often, then consider adding an optional 2nd argument (truncate-commit message) and also some safeguards against improper use – currently, even if refspec is wrong, the script will not abort after a failed checkout.
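A minimal sketch of what such safeguards could look like, written as a bash function. The function name `git_truncate`, the temporary branch name `truncate-temp` and the choice to rewrite the currently checked-out branch instead of a hard-coded master are my assumptions for illustration, not part of the script above. It has the same limitations as the original (merge commits are linearized by the rebase).

```shell
#!/bin/bash
# Sketch only: git_truncate and truncate-temp are hypothetical names.
# Usage: git_truncate <commit-or-tag> [commit message]
git_truncate() {
    local commit=$1
    local message=${2:-Truncated history}
    local tmp=truncate-temp   # hypothetical temporary branch name
    local branch

    if [ -z "$commit" ]; then
        echo "usage: git_truncate <commit-or-tag> [message]" >&2
        return 1
    fi
    # Fail early if the argument does not resolve to a commit.
    git rev-parse --verify --quiet "${commit}^{commit}" >/dev/null || {
        echo "error: '$commit' is not a valid commit" >&2
        return 1
    }
    # Assumption: rewrite the branch we are on rather than a fixed 'master'.
    branch=$(git symbolic-ref --quiet --short HEAD) || {
        echo "error: not on a branch (detached HEAD)" >&2
        return 1
    }
    # Refuse to run with uncommitted changes to tracked files.
    if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
        echo "error: working tree has uncommitted changes" >&2
        return 1
    fi
    # Refuse to clobber an existing branch with the temporary name.
    if git show-ref --verify --quiet "refs/heads/$tmp"; then
        echo "error: branch '$tmp' already exists" >&2
        return 1
    fi

    # Same four steps as the original script, but each one must succeed
    # before the next one runs.
    git checkout --quiet --orphan "$tmp" "$commit" &&
    git commit --quiet -m "$message" &&
    git rebase --quiet --onto "$tmp" "$commit" "$branch" &&
    git branch --quiet -D "$tmp"
}
```

To use it as a standalone script, add `git_truncate "$@"` as the last line; the optional cleanup commands (prune, gc) can follow as before.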
Thanks for posting any improvements you may have.
Source: Tekkub’s post on github discussions.
See also: how to remove a single file from all of git’s commits.
May 4th, 2012 at 7:30
Thanks for the script!
Git 1.7.1 doesn’t support “checkout --orphan”, so I found I had to replace
git checkout --orphan temp $1
with
git checkout $1
git symbolic-ref HEAD refs/heads/temp
which may or may not be correct, but it /seems/ to work.
July 19th, 2012 at 12:51
This doesn’t really truncate. It detaches all the commits up till refspec. But the commits remain present in the git tree. Any idea how to delete all the detached objects?
July 20th, 2012 at 17:21
If I remember it right, objects with no references to them are cleaned up during garbage collection (`git gc`); maybe there is a need to supply an extra option to gc (like `--prune`?).
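For what it’s worth, here is a sketch of the commonly documented way to drop such objects right away. It assumes that nothing else (other branches, tags, stashes, remote-tracking refs) still points at the old commits. The old commits usually stay reachable through the reflog, so that has to be expired first; this is destructive, because the reflog is your undo history. The function name `drop_unreachable_objects` is made up for the example.

```shell
#!/bin/bash
# Sketch only: drop_unreachable_objects is a hypothetical helper name.
# Run it inside the repository after the history has been truncated.
drop_unreachable_objects() {
    # Expire every reflog entry so the old commits are no longer
    # referenced from the reflog.
    git reflog expire --expire=now --all &&
    # Collect garbage and delete unreachable objects immediately instead
    # of waiting for the default grace period.
    git gc --quiet --prune=now
}
```

Objects that are still referenced from anywhere will survive this, so if the repository does not shrink, check for leftover tags or branches first.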
August 12th, 2012 at 10:37
Thank you. This helped me a lot. I needed to truncate history to get rid of secrets I had stored in the code, such as passwords and app keys, before pushing the whole repository to a public github repo.
September 7th, 2012 at 18:06
Based on notes at the bottom of “http://git-scm.com/docs/git-filter-branch”, I tried the technique in the original post above, then cloned into a new repo, then gc pruned, and then it was much smaller.
I haven’t tried the “graft then filter-branch then clone (and prune)” strategy. I presume it would give similar results?
I also haven’t tried the “destructive” reflog shrinking technique mentioned in the Git docs.
December 30th, 2013 at 15:08
It works when modified like so in the latest git version:
#!/bin/bash
git checkout --orphan temp $1
git commit -m 'Truncated history'
git rebase --onto temp $1 master
git branch -D temp
December 31st, 2013 at 12:24
LaboDJ, I don’t see any differences to my code in the post…
P.S. I’ve also edited both my post and your comment to properly show double-dashes.
March 28th, 2014 at 8:31
Two things to add to that.
git prune --progress
git gc --aggressive
The last might not be necessary: git prune will delete all the unreferenced objects.
March 28th, 2014 at 11:57
@Cy – thanks, added both.
P.S. Nice email address!
August 3rd, 2015 at 22:03
It’s a great idea and just what I want, but it fails with this error:
error: Cannot delete the branch ‘temp’ which you are currently on.
August 4th, 2015 at 17:58
Nigel,
I’ve just tested using git 2.4.6 – had no errors.
Check if this command has `master` at the end and completes successfully – looks like you are not switching to the `master` branch for some reason, which then prevents you from deleting the `temp` branch:
git rebase --onto temp $1 master
If you just ran all the code as a script – try issuing commands one by one, to see if there are any problems reported by the earlier commands.
March 29th, 2021 at 17:50
This works, but you will lose all timestamps for commits that are retained. Seems like the git team should address a real need to perform housekeeping for repos that undergo lots of change over many years.