How to limit file size on commit?

Is there an option to limit the file size when committing?

For example: file sizes above 500K would produce a warning. File sizes above 10M would stop the commit.

I'm fully aware of this question, which technically makes mine a duplicate, but its answers only offer a solution on push, which would be too late for my requirements.

Naji answered 19/9, 2016 at 14:57 Comment(5)
Possible duplicate of Limiting file size in git repository - Lombard
@bbodenmiller, this is not a duplicate. Did you read my question to the end? I referred to the question you mention in my last sentence. - Naji
@user2476373 While I understand your desire for a solution that checks the file size before pushing, you should understand that a server-side check is the only reliable way to enforce this restriction for multiple people. A pre-commit hook is a local hook and as such is not distributed with the repository, so it will not exist on a coworker's machine. You might want to do both: a local pre-commit hook and a remote update hook. - Pontius
pre-commit.com/hooks.html check-added-large-files - Kahlil
The best answer for this is over at stackoverflow.com/a/77547007 - Daigle
F
18

This pre-commit hook will do the file size check:

.git/hooks/pre-commit

#!/bin/sh
# Limits come from git config; fall back to defaults when unset.
hard_limit=$(git config hooks.filesizehardlimit)
soft_limit=$(git config hooks.filesizesoftlimit)
: "${hard_limit:=10000000}"
: "${soft_limit:=500000}"

# Print one raw diff line per staged file that is not a deletion:
#   :<old mode> <new mode> <old hash> <new hash> <status><TAB><path>
# Using the blob hash from this line avoids having to decode quoted file names.
list_new_or_modified_files()
{
    git diff --staged --raw --no-abbrev --no-renames --diff-filter=ACMRT
}

check_file_size()
{
    n=0
    while read -r _oldmode mode _oldhash h _status name
    do
        # Submodule entries (mode 160000) point at commits, not blobs.
        [ "$mode" = 160000 ] && continue
        s=$(git cat-file -s "$h") || continue
        if [ "$s" -gt "$hard_limit" ]
        then
            printf 'ERROR: hard size limit (%s) exceeded: %s (%s)\n' "$hard_limit" "$name" "$s" >&2
            n=$((n+1))
        elif [ "$s" -gt "$soft_limit" ]
        then
            printf 'WARNING: soft size limit (%s) exceeded: %s (%s)\n' "$soft_limit" "$name" "$s" >&2
        fi
    done

    # Non-zero status (commit aborted) if any file exceeded the hard limit.
    [ "$n" -eq 0 ]
}

list_new_or_modified_files | check_file_size

The script above must be saved as .git/hooks/pre-commit with execute permission enabled (chmod +x .git/hooks/pre-commit).

The default soft (warning) and hard (error) size limits are 500,000 and 10,000,000 bytes. They can be overridden through the hooks.filesizesoftlimit and hooks.filesizehardlimit settings respectively:

$ git config hooks.filesizesoftlimit 100000
$ git config hooks.filesizehardlimit 4000000
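
The settings above take plain byte counts. If you would rather write limits the way the question does (500K, 10M), a small helper could expand the suffix before the comparison. This is only a sketch: to_bytes is a hypothetical helper, not part of the script above, and it assumes decimal multipliers (1K = 1000 bytes) to match the defaults.

```shell
#!/bin/sh
# to_bytes VALUE
# Hypothetical helper: expands an optional K, M or G suffix (decimal
# multipliers, an assumption) and prints the result as a byte count.
to_bytes() {
    tb_num=${1%[KkMmGg]}            # strip one trailing suffix letter, if any
    case $tb_num in
        ''|*[!0-9]*)
            echo "to_bytes: invalid size: $1" >&2
            return 1 ;;
    esac
    case $1 in
        *[Kk]) echo $(( tb_num * 1000 )) ;;
        *[Mm]) echo $(( tb_num * 1000000 )) ;;
        *[Gg]) echo $(( tb_num * 1000000000 )) ;;
        *)     echo "$tb_num" ;;
    esac
}
```

In the hook, the configured value could then be passed through it, for example hard_limit=$(to_bytes "$hard_limit") || exit 1.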
Foushee answered 19/9, 2016 at 16:38 Comment(4)
This is pretty nice, but you want to read without -r as git diff --name-status encodes file names. (Another option is to use -z and bash's ability to use zero termination, but that limits you to bash.) You can also just use git diff --name-only --diff-filter=d (assuming a new enough Git version to understand lowercase d, else spell it out with ACMRTUXB, though some of these can't happen anyway) and then eliminate the sed. - Funds
@Funds "read without -r as git diff --name-status encodes file names": I didn't know this. Added decoding of file names to the script. - Foushee
echo -e is not universally supported by shells, so the script can fail (it fails with the recent Mac bash version 3.2.57). In that case, either remove the flags from env echo -e and env echo -E, or, to make the script more portable, replace the calls to echo -e / -E with printf, for example: env printf "$result" and env printf 1>&2 '\n%s\n' "ERROR: hard size..." - Mendicant
The script does not work for me. I got these errors: fatal: Not a valid object name / .git/hooks/pre-commit: 27: [: Illegal number: / .git/hooks/pre-commit: 31: [: Illegal number: - Hortatory
E
6

A shorter, bash-specific version of @Leon's script, which prints the file sizes in a human-readable format. It requires a newer git for the --diff-filter=d option:

#!/bin/bash
hard_limit=$(git config hooks.filesizehardlimit)
soft_limit=$(git config hooks.filesizesoftlimit)
: ${hard_limit:=10000000}
: ${soft_limit:=1000000}

status=0

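# Convert a byte count to a human-readable string with one decimal, e.g. 117.9MB.
bytesToHuman() {
  b=${1:-0}; d=''; s=0; S=({,K,M,G,T,P,E,Z,Y}B)
  # Use >= so that exactly 1000 bytes is shown as 1.0KB rather than 1000B.
  while ((b >= 1000)); do
    d="$(printf ".%01d" $((b % 1000 * 10 / 1000)))"
    b=$((b / 1000))
    s=$((s + 1))
  done
  echo "$b$d${S[$s]}"
}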

# Iterate over the zero-delimited list of staged files.
while IFS= read -r -d '' file ; do
  hash=$(git ls-files -s "$file" | cut -d ' ' -f 2)
  size=$(git cat-file -s "$hash")

  if (( $size > $hard_limit )); then
    echo "Error: Cannot commit '$file' because it is $(bytesToHuman $size), which exceeds the hard size limit of $(bytesToHuman $hard_limit)."
    status=1
  elif (( $size > $soft_limit )); then
    echo "Warning: '$file' is $(bytesToHuman $size), which exceeds the soft size limit of $(bytesToHuman $soft_limit). Please double check that you intended to commit this file."
  fi
done < <(git diff -z --staged --name-only --diff-filter=d)
exit $status

As with the other answers, this must be saved with execute permissions as .git/hooks/pre-commit.

Example output:

Error: Cannot commit 'foo' because it is 117.9MB, which exceeds the hard size limit of 10.0MB.
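
Since the script above is bash-specific, here is a sketch of the same size formatting in plain POSIX sh, in case the hook has to run under dash or another minimal shell. The function name human_size is made up for this example. It uses the same power-of-1000 units and truncates to one decimal rather than rounding.

```shell
#!/bin/sh
# human_size BYTES
# Prints BYTES with a decimal (power-of-1000) unit and one truncated
# decimal digit. Hypothetical POSIX sh counterpart of bytesToHuman.
human_size() {
    hs_n=${1:-0}
    hs_frac=''
    # Reuse the function's positional parameters as the list of units.
    set -- B KB MB GB TB PB EB
    while [ "$hs_n" -ge 1000 ] && [ "$#" -gt 1 ]; do
        hs_frac=".$(( hs_n % 1000 / 100 ))"   # first decimal digit
        hs_n=$(( hs_n / 1000 ))
        shift                                  # move on to the next unit
    done
    printf '%s%s%s\n' "$hs_n" "$hs_frac" "$1"
}
```

For instance, human_size 117900000 prints 117.9MB, which matches the example output above.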
Extraneous answered 29/10, 2018 at 20:9 Comment(4)
What is improved about it? - Naji
Clarified in the description. It isn't inherently better, just shorter and simpler, with a more human-friendly display of file sizes. - Extraneous
This works better than the selected answer today, as the munge logic doesn't seem to work on current git output. This one uses git arguments to obviate the need to unmunge. - Herder
@Extraneous Could you suggest how to use your solution with Git on Windows? - Intonation
T
4

You need to run the script by eis that you already found, but as a pre-commit hook.

From the documentation, we learn that the pre-commit hook

takes no parameters, and is invoked before obtaining the proposed commit log message and making a commit. Exiting with a non-zero status from this script causes the git commit command to abort before creating a commit.

Basically, the hook is called to check whether the user is allowed to commit their changes.

The script originally written by eis in the other post becomes

#!/bin/bash
# File size limit is meant to be configured through 'hooks.filesizelimit' setting
filesizelimit=$(git config hooks.filesizelimit)

# If we haven't configured a file size limit, use default value of about 10M
if [ -z "$filesizelimit" ]; then
        filesizelimit=10000000
fi

# Warning limit of about 500K
filesizewarning=500000

# Find the biggest file in the index, i.e. in the tree about to be committed.
# (HEAD would only describe what has already been committed.)
# We also normalize the line for excess whitespace
biggest_checkin_normalized=$(git ls-tree --full-tree -r -l "$(git write-tree)" | sort -k 4 -n -r | head -1 | sed 's/^ *//;s/ *$//;s/[[:space:]]\{1,\}/ /g')

# Based on that, we can find what we are interested in
filesize=$(echo "$biggest_checkin_normalized" | cut -d ' ' -f4)

# Nothing to compare if the index is empty or the entry has no size
case $filesize in
        ''|*[!0-9]*) exit 0 ;;
esac

# Actual comparison
# To cancel a commit, we exit with status code 1
# It is also a good idea to print out some info about the cause of rejection
if [ "$filesize" -gt "$filesizelimit" ]; then

        # To be more user-friendly, we also look up the name of the offending file
        filename=$(echo "$biggest_checkin_normalized" | cut -d ' ' -f5-)

        echo "Error: Too large commit attempted." >&2
        echo >&2
        echo "File size limit is $filesizelimit, and you tried to commit file named $filename of size $filesize." >&2
        echo "Contact configuration team if you really need to do this." >&2
        exit 1
elif [ "$filesize" -gt "$filesizewarning" ]; then
        echo "WARNING! A file is bigger than $filesizewarning bytes." >&2
fi
exit 0
Thesda answered 19/9, 2016 at 15:34 Comment(1)
This script checks the sizes of committed files (corresponding to HEAD) and thus is not suitable as a pre-commit hook. - Foushee
U
0

There is a general pre-commit hook. You can write a script that checks file sizes and then accepts or rejects the commit. Git, however, gives the user the ability to bypass the check. From the command line, type "git help hooks" for more information. Here is the relevant information on the pre-commit hook.

pre-commit

This hook is invoked by git commit, and can be bypassed with --no-verify option. It takes no parameter, and is invoked before obtaining the proposed commit log message and making a commit. Exiting with non-zero status from this script causes the git commit to abort.
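
As a rough illustration of the accept-or-reject decision such a script has to make, here is a minimal sketch. The function name and its three arguments are hypothetical, not taken from Git. Only the convention that a non-zero exit status aborts the commit comes from the documentation quoted above.

```shell
#!/bin/sh
# classify_size SIZE SOFT_LIMIT HARD_LIMIT   (all in bytes)
# Hypothetical helper: prints "ok", "warn" or "reject" and returns 1 only
# for "reject", the status a pre-commit hook would exit with to abort.
classify_size() {
    if [ "$1" -gt "$3" ]; then
        echo reject
        return 1
    elif [ "$1" -gt "$2" ]; then
        echo warn
    else
        echo ok
    fi
}
```

A hook could call this once per staged file and exit with status 1 if any call fails. Keep in mind that git commit --no-verify skips the hook entirely, so this is a convenience check rather than enforcement.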

Ufa answered 19/9, 2016 at 15:31 Comment(1)
I can't see the answer. Is a link missing? - Tooling
T
0

Just wanted to say that the solution @Leon provided was awesome. I hit a minor snag where it aborted when an empty directory was about to be tracked, so I had to add

[ -d "$f" ] && continue

just before the ls-files command to avoid the error.

I would have preferred to post a comment as this is not an answer but I don't have the reputation points.

Note: I know git 'ignores' directories but apparently not before the pre-commit hook is run.
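
One possible explanation, and this is an assumption since the details are not given above, is that the directory was a nested Git repository. Git stages those as a single gitlink entry with mode 160000 instead of as files, and git ls-files -s prints that mode as the first field. A hook could therefore skip such entries by mode rather than by testing the working tree. A minimal sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# is_gitlink ENTRY
# ENTRY is one line of `git ls-files -s` output:
#   "<mode> <hash> <stage><TAB><path>"
# Succeeds when the entry is a gitlink (submodule or nested repository),
# which has no blob whose size could be checked.
is_gitlink() {
    case $1 in
        160000\ *) return 0 ;;
        *)         return 1 ;;
    esac
}
```

In the hook this might be used as entry=$(git ls-files -s -- "$f") followed by is_gitlink "$entry" && continue.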

Tojo answered 10/9, 2021 at 19:16 Comment(0)
D
0

I further modified the scripts by @connor-mckay and @vtwaldo21, which were throwing errors on empty files for me. This version only reports an error for files above the hard limit, which is chosen with GitHub's file size limit in mind. It removes such a file from the index by running git rm --cached on it and adds its path to .gitignore if it is not already listed. This is probably the most common use case, so I recommend it. The script also works on Windows, including from the command prompt.

#!/bin/bash
# bash is required: the script uses arrays, (( )), read -d and process substitution

# this file should be placed at .git/hooks/pre-commit

hard_limit=$(git config hooks.filesizehardlimit)
soft_limit=$(git config hooks.filesizesoftlimit)
: ${hard_limit:=52428800}  # 50 MB
: ${soft_limit:=49408000}  # ~47 MB

status=0

bytesToHuman() {
  b=${1:-0}; d=''; s=0; S=({,K,M,G,T,P,E,Z,Y}B)
  while ((b > 1000)); do
    d="$(printf ".%02d" $((b % 1000 * 100 / 1000)))"
    b=$((b / 1000))
    let s++
  done
  echo "$b$d${S[$s]}"
}

# Iterate over the zero-delimited list of staged files.
while IFS= read -r -d '' file; do
  [ -d "$file" ] && continue 
  hash=$(git ls-files -s "$file" | cut -d ' ' -f 2)
  size=$(git cat-file -s "$hash" 2>/dev/null)

  if [ -z "$size" ] || [ "$size" -eq 0 ]; then
    echo "Warning: Unable to determine size for '$file'."
    continue
  fi

  if (( size > hard_limit )); then
    echo "Error: '$file' is $(bytesToHuman $size), which exceeds the hard size limit of $(bytesToHuman $hard_limit). Removing from commit and adding to .gitignore."
    
    # Remove file from staging area
    git rm --cached "$file"
    
    # Add file path to .gitignore if not already present
    if ! grep -qxF -- "$file" .gitignore 2>/dev/null; then
      echo "$file" >> .gitignore
    fi
    
    status=1
  elif (( size > soft_limit )); then
    echo "Warning: '$file' is $(bytesToHuman $size), which exceeds the soft size limit of $(bytesToHuman $soft_limit). Please double check that you intended to commit this file."
  fi
done < <(git diff -z --staged --name-only --diff-filter=d)


# exit $status
# Because the file has been removed from the index, there is no need to stop the commit.

exit 0
Daigle answered 25/11, 2023 at 7:30 Comment(0)
