The scenario
Imagine I am forced to work with some of my files always stored inside .zip files. Some of the files inside the ZIP file are small text files and change often, while others are larger but luckily rather static (e.g. images).
If I want to place these ZIP files inside a Git repository, each ZIP is treated as a blob, so whenever I commit, the repository grows by the size of the ZIP file... even if only one small text file inside changed!
Why this is realistic
Microsoft Word 2007/2010 .docx and Excel .xlsx files are ZIP files...
What I want
Is there, by any chance, a way to tell Git to not treat ZIP files as files, but rather as directories and treat their contents as files?
The advantages
- much smaller repository size, i.e. quicker transfer/backup
- Displaying changes to ZIP files with Git would automagically work
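To make the second point concrete, here is a minimal sketch in Python of what a readable diff between two versions of an archive could look like. It is only an illustration, not something Git does on its own: `zip_to_text` and `make_zip` are hypothetical helper names, and the dump format is made up. Git's diff drivers have a `textconv` setting where a converter of this kind could presumably be plugged in; that wiring is not shown here.

```python
import difflib
import io
import zipfile


def zip_to_text(data: bytes) -> str:
    """Render a ZIP archive as a stable, line-oriented text dump."""
    lines = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        # Sort by name so member order inside the archive does not affect the dump.
        for info in sorted(archive.infolist(), key=lambda i: i.filename):
            lines.append(
                f"== {info.filename} ({info.file_size} bytes, crc {info.CRC:08x})"
            )
            payload = archive.read(info)
            try:
                text = payload.decode("utf-8")
            except UnicodeDecodeError:
                # Binary members (e.g. images) are summarized by the header line only.
                lines.append("<binary content omitted>")
                continue
            lines.extend(text.splitlines())
    return "\n".join(lines) + "\n"


def make_zip(members: dict) -> bytes:
    """Hypothetical helper: build an in-memory ZIP from {name: bytes}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in members.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


if __name__ == "__main__":
    image = bytes(range(256)) * 4  # stand-in for a large, static binary file
    old = make_zip({"notes.txt": b"first draft\n", "logo.png": image})
    new = make_zip({"notes.txt": b"second draft\n", "logo.png": image})
    diff = difflib.unified_diff(
        zip_to_text(old).splitlines(),
        zip_to_text(new).splitlines(),
        "old.zip",
        "new.zip",
        lineterm="",
    )
    print("\n".join(diff))
```

With a dump like this, only the changed text member shows up in the diff; the unchanged binary member contributes an identical header line in both versions.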
But it couldn't work, you say?
I realize that without extra metadata this would lead to some amount of ambiguity: on a git checkout, Git would have to decide whether to create foo.zip/bar.txt as a file in a regular directory or as an entry inside a ZIP file. However, this could be solved through configuration options, I would think.
Two ideas how it could be done (if it doesn't exist yet)
- using a library such as minizip or IO::Compress::Zip inside Git
- somehow adding a filesystem layer such that Git actually sees ZIP files as directories to start with
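Neither idea is implemented here, but the underlying round trip can be sketched outside Git with Python's standard `zipfile` module: unpack the archive into ordinary files that Git can track individually, then pack the directory again when the ZIP is needed. The function names `explode` and `rebuild` are hypothetical, and the sketch assumes it is acceptable for the rebuilt archive to differ byte-for-byte from the original (member order, timestamps and other metadata are normalized, and empty directories are dropped). Some formats may care about such details, which this sketch ignores.

```python
import zipfile
from pathlib import Path

# Fixed timestamp so that rebuilding the same tree yields identical bytes.
# 1980-01-01 is the earliest date the ZIP format can represent.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def explode(zip_path: Path, dest_dir: Path) -> None:
    """Unpack a ZIP so each member becomes an ordinary file Git can track."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)


def rebuild(src_dir: Path, zip_path: Path) -> None:
    """Pack a directory into a ZIP with sorted names and fixed timestamps."""
    files = sorted(p for p in src_dir.rglob("*") if p.is_file())
    with zipfile.ZipFile(zip_path, "w") as archive:
        for path in files:
            # ZIP member names always use forward slashes.
            name = path.relative_to(src_dir).as_posix()
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            archive.writestr(info, path.read_bytes())
```

Under these assumptions the repository would hold the exploded directory, and the ZIP itself would be a generated artifact rather than a tracked blob.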
.docx files makes sense, but in many other cases you might want to consider tracking the individual files normally with git and only building the resulting .zip using an appropriate build tool like make. – TamekiaUNX
format. It's also recursive: it contains a BLX file and a DFX file, which are both archives and which correspond to the 'business layer' and 'data foundation', respectively. I'd like to have a solution as well. – Stonechat