Working with a huge git repository

Are you from the future?

It can be a real pain in the ass to git clone a huge repository, and it was for me. If you are coming from the future, then you can download git 2.19 and use:

git clone --depth=1 https://github.com/user/repo --filter=blob:limit=1m
or
git clone --depth=1 https://github.com/user/repo --filter=sparse:path=<path>

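--depth=1 makes sure only the most recent commit is downloaded instead of the whole history, and the blob:limit=1m filter skips files larger than 1 megabyte. The sparse:path variant instead takes a file of sparse-checkout patterns and skips whatever those patterns do not need.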

Actually, if you use the latest version of GitLab you can use the commands above. On GitHub [time of writing 04/12/2018], though, you will get the following:

warning: filtering not recognized by server, ignoring
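If you want to see the filter actually being honoured without depending on any hosting service, here is a minimal sketch that clones a throwaway local repository. The paths and file names are made up for the demo, and it assumes git 2.19 or newer. The serving side has to opt in through uploadpack.allowFilter, otherwise you get the same warning as above.

```shell
#!/bin/sh
# Sketch: a shallow, filtered clone of a throwaway local repository.
# All paths and file names below are made up for the demo.
set -eu
work=$(mktemp -d)

# Build a source repository with two commits, the second adding a 2 MB file.
git init -q "$work/src"
git -C "$work/src" config user.name "Demo"
git -C "$work/src" config user.email "demo@example.com"
echo "small" > "$work/src/small.txt"
git -C "$work/src" add small.txt
git -C "$work/src" commit -q -m "add a small file"
dd if=/dev/zero of="$work/src/big.bin" bs=1024 count=2048 2>/dev/null
git -C "$work/src" add big.bin
git -C "$work/src" commit -q -m "add a big file"

# The serving side has to opt in, otherwise the filter is ignored.
git -C "$work/src" config uploadpack.allowFilter true

# file:// forces the normal transport, so --depth and --filter are honoured.
# --no-checkout keeps git from fetching big.bin straight away just to fill
# the working tree.
git clone -q --no-checkout --depth=1 --filter=blob:limit=1m \
    "file://$work/src" "$work/clone"

# Count the commits that made it into the clone.
commit_count=$(git -C "$work/clone" rev-list --count HEAD)
echo "commits in clone: $commit_count"
```

Note the --no-checkout: without it, git would need big.bin for the working tree and would go and fetch it right after cloning, which defeats the purpose for files you actually have checked out.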

sparseCheckout? (not a solution, current workaround is after this)

No, the solution is not sparse-checkout, because sparse-checkout only limits which files show up in your working directory; you still have to fetch the repository itself (at least one whole commit if you combine it with --depth=1). It is easy to use, though:

git init
git remote add origin repo_link
git config core.sparsecheckout true

Then you can edit .git/info/sparse-checkout.

Let's say you only want a couple of specific files; then you could add:
iwantthisfile
iwantthisfiletoo

Basically, one rule per line.

What if you want to ignore a folder and add the rest?

/*
!ignorethisfolder/*

You get the idea. With the rules in place, run git pull origin <branch> and only the matching paths get checked out.
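To put the sparse-checkout steps together end to end, here is a rough sketch that runs them against a throwaway local repository instead of a real remote. The directory names, the branch name main and the file notthisone are assumptions made up for the demo; in real use repo_link would be your remote URL.

```shell
#!/bin/sh
# Sketch: the sparse-checkout steps, run against a throwaway local repo.
set -eu
work=$(mktemp -d)

# A stand-in for the real repository: one wanted file, one unwanted file.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main
git -C "$work/src" config user.name "Demo"
git -C "$work/src" config user.email "demo@example.com"
echo "wanted" > "$work/src/iwantthisfile"
echo "unwanted" > "$work/src/notthisone"
git -C "$work/src" add .
git -C "$work/src" commit -q -m "two files"

# Same steps as in the post; "$work/src" plays the role of repo_link.
git init -q "$work/sparse"
cd "$work/sparse"
git remote add origin "$work/src"
git config core.sparsecheckout true
mkdir -p .git/info
echo "iwantthisfile" > .git/info/sparse-checkout

# Everything is still fetched; only the matching path is checked out.
git pull -q origin main

# List what ended up in the working directory.
ls
```

The object database in the sparse directory still contains both files, which is exactly why this does not help with a huge repository.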

Current Workaround

Until GitHub decides to support filtering, the only thing that can save us is to look for scripts online, many of which make use of SVN! Yes, svn is supported on GitHub. Where is this life going? https://help.github.com/articles/support-for-subversion-clients/

And the good news is that svn export can fetch one specific file or folder. So, since I needed everything except one big folder:
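As a rough sketch of that idea (an illustration only, not the original script), one way to do it is to list the top-level entries with svn ls, drop the folder you do not want, and svn export the rest one by one. REPO_URL, SKIP and the function names are hypothetical placeholders, and the /trunk suffix assumes GitHub's Subversion bridge exposing the default branch under that path.

```shell
#!/bin/sh
# Sketch only: export every top-level entry of a repository except one folder.
# REPO_URL and SKIP are hypothetical placeholders; change them to your case.
REPO_URL="https://github.com/user/repo/trunk"
SKIP="bigfolder/"

# Read entry names on stdin (one per line, directories ending in "/", the
# way `svn ls` prints them) and print every entry except the one given as $1.
entries_to_export() {
    grep -v -x -F -- "$1"
}

# Export each remaining entry into the current directory.
export_all() {
    svn ls "$REPO_URL" | entries_to_export "$SKIP" | while IFS= read -r entry; do
        # Strip the trailing slash that marks directories.
        name=${entry%/}
        svn export "$REPO_URL/$name" "$name"
    done
}
```

Calling export_all from an empty directory would then pull down everything except the skipped folder, one svn export per top-level entry.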




I replaced the original repo with a dummy one obviously. This is an unoptimised script as you might have guessed. I would suggest that you modify it for your crazy use cases. Still, I hope that it helps!



Comments

  1. The reason GitHub does not support this yet is because they are focusing on the technology that Microsoft developed called VFSForGit.
