How can I prevent users from committing binaries into subversion?
S

7

13

I have a headstrong user who stubbornly insists on committing his binaries (executables, DLLs) into our subversion repositories. I'd go in and delete them, but of course nothing is ever really deleted from subversion.

While there are times when we need to commit binaries, I don't want users doing it as a matter of routine. I can set an ignore property but that doesn't prevent users from committing binaries if they are really determined. What I'd like to do is be able to control the ability to commit nominated file types, particularly .exe and .dll files, on a directory-by-directory basis.

Is there a way to do that in SVN? If it makes any difference, we are using VisualSVN Server and TortoiseSVN.

Small answered 29/1, 2010 at 22:42 Comment(6)
Well, with three answers so quickly and impossible to choose between, I'm sure that an example script would secure 'accepted' status :)Small
How about disciplining your user? Not every solution is technical you know?Guibert
@Lasse: I agree, but I actually find this useful as a means for preventing myself from accidentally chucking binaries into the SVN repository (i.e. setting up Tortoise on a new machine and forgetting to add the "bin" and "obj" exceptions)Rowe
Unfortunately, in a volunteer effort, there can be no question of disciplining a user. His contributions are too valuable to the project to risk losing him completely, so my preferred solution is just to silently ignore most binaries.Small
I agree on that though. Personally I use VisualSVN, and when I add a project to Subversion with it, it automatically adds those ignores for me, along with a few others for good measure. But do note that if you're dealing with a boneheaded user (read: stupid), he will be able to do it no matter what you do. His next step would probably be to try to camouflage the files in some way.Guibert
"nothing is really deleted"... unless you use svn obliterate.Hardwood
E
5

Tim:

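You might try this Python hook script. It is (loosely) based on the other hook script posted in this thread, but it accepts regular-expression patterns for the rejected paths and lets the committer override the check by including a line that begins with

Override:

in the log message (the match is case-insensitive). It uses the print function via from __future__ import print_function (available from Python 2.6) and auto-numbered "{}" format fields, so in practice it needs Python 2.7 or later.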

from __future__ import print_function

import sys,os
import subprocess 
import re

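# Regular expressions for paths that must not be added.
# (^|/) matches either the start of the path or a directory separator;
# the original character class [\^|/] matched a literal ^, | or / instead.
illegal_patterns = [
    r'\.exe$',
    r'\.dll$',
    r'(^|/)bin/',
    r'(^|/)obj/',
]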

# Path to svnlook command:
cmdSVNLOOK=r"{}bin\svnlook.exe".format(os.environ["VISUALSVN_SERVER"])

print(illegal_patterns, file=sys.stderr)

print("cmdSVNLook={}".format(cmdSVNLOOK), file=sys.stderr)

def runSVNLook(subCmd, transact, repoPath):
    svninfo =  subprocess.Popen([cmdSVNLOOK, subCmd, '-t', transact, repoPath], 
                          stdout = subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = svninfo.communicate()

    if len(stderr) > 0:
        print("svnlook generated stderr: " + stderr, file=sys.stderr)
        sys.exit(1)

    return [ line.strip() for line in stdout.split("\n") ]

def findIllegalPattern(fileName):
    for pattern in illegal_patterns:
        if re.search(pattern, fileName):
            # Send diagnostics to stderr like the rest of the script.
            print("pattern: {} matched filename:{}".format(pattern, fileName), file=sys.stderr)
            return pattern
    return None

def containsOverRide(logOutput):
    retVal = False
    for line in logOutput:
        print("log line: {}".format(line), file=sys.stderr)
        if re.match("^override:", line.lower()):
            retVal = True
            break
    print("containsOverRide={}".format(retVal), file=sys.stderr)
    return retVal

def findIllegalNames(changeOutput):
    illegalNames = []
    prog = re.compile('(^[ACUDRM_])[ACUDRM]*\s+(.+)')  # regex for svnlook output
    for line in changeOutput:
        print("processing:{}".format(line), file=sys.stderr)
        if (line != ""):
            match=re.search(prog, line.strip())
            if match:
                mode = match.group(1) 
                ptFilename = match.group(2)
                if mode == 'A':
                  pattern = findIllegalPattern(ptFilename)
                  if pattern:
                      illegalNames.append((pattern, ptFilename))
            else:
                print("svnlook output parsing failed!", file=sys.stderr)
                sys.exit(1)
    return illegalNames

######### main program ################
def main(args):
    repopath = args[1]
    transact = args[2]

    retVal = 0

    overRidden = containsOverRide(runSVNLook("log", transact, repopath))
    illegalFiles = findIllegalNames(runSVNLook("changed", transact, repopath))

    if len(illegalFiles):
        msg = "****************************************************************************\n"

        if len(illegalFiles) == 1:
            msg += "* This commit contains a file which matches a forbidden pattern            *\n"
        else:
            msg += "* This commit contains files which match a forbidden pattern               *\n"

        if overRidden:
            msg += "* and contains an Override line so the checkin will be allowed             *\n"
        else:
            retVal = 1

            msg += "* and is being rejected.                                                   *\n"
            msg += "*                                                                          *\n"
            msg += "* Files which match these patterns are generally created by the            *\n"
            msg += "* build process and should not be added to svn.                            *\n"
            msg += "*                                                                          *\n"
            msg += "* If you intended to add this file to the svn repository, you need to      *\n"
            msg += "* modify your commit message to include a line that looks like:            *\n"
            msg += "*                                                                          *\n"
            msg += "* OverRide: <reason for override>                                          *\n"
            msg += "*                                                                          *\n"
        msg +=  "****************************************************************************\n"

        print(msg, file=sys.stderr)

        if len(illegalFiles) == 1:
            print("The file and the pattern it matched are:", file=sys.stderr)
        else:
            print("The files and the patterns they matched are:", file=sys.stderr)

        for (pattern, fileName) in illegalFiles:
              print('\t{}\t{}'.format(fileName, str(pattern)), file=sys.stderr)

    return retVal

if __name__ == "__main__":
    ret = main(sys.argv)
    sys.exit(ret)
Excellency answered 11/1, 2012 at 2:53 Comment(1)
This is perfect. I've got IronPython on the server and this script works perfectly for my needs. I like the concept of giving the users the ability to override the hookscript. VisualSVN requires a batch file though, so I had to create a one-liner to just call the Python script.Small
I
5

Here is a small hook script that does what you want. You have to configure two things:

  • illegal_suffixes: a python list with all suffixes which should abort the commit
  • cmdSVNLOOK: the path to svnlook program

import sys
import subprocess 
import re

#this is a list of illegal suffixes:
illegal_suffixes = ['.exe','.dll']

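# Path to svnlook command:
cmdSVNLOOK = "/usr/bin/svnlook"

def isIllegalSuffix(progname):
    # Check the argument itself; the original tested the global ptFilename
    # and only worked because that global happened to be set by the caller.
    for suffix in illegal_suffixes:
        if progname.endswith(suffix):
            return True
    return False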

######### main program ################
repopath = sys.argv[1]
transact = sys.argv[2]

retVal = 0
svninfo = subprocess.Popen([cmdSVNLOOK, 'changed', '-t', transact, repopath], 
                                                        stdout = subprocess.PIPE, stderr=subprocess.PIPE)
(stdout, stderr) = svninfo.communicate();

prog = re.compile('(^[ACUDRM_])[ACUDRM]*\s+(.+)')  # regex for svnlook output
for line in stdout.split("\n"):
    if (line.strip()!=""):
        match=re.search(prog, line.strip())
        if match:
            mode = match.group(1) 
            ptFilename = match.group(2)
            if mode == 'A' and isIllegalSuffix(ptFilename): 
              retVal = 1
              sys.stderr.write("Please do not add the following ")
              sys.stderr.write("filetypes to repository:\n")
              sys.stderr.write(str(illegal_suffixes)+"\n")
              break
        else:
            sys.stderr.write("svnlook output parsing failed!\n")
            retVal = 1
            break
    else:
        # an empty line is fine!
        retVal = 0
sys.exit(retVal)
Institutive answered 30/1, 2010 at 19:28 Comment(4)
Thanks, I appreciate you taking the time to post that - unfortunately we're using VisualSVN Server, which runs on Windows. I'll need either VBScript, JScript or a DOS batch file. Nevertheless, +1 for posting the script.Small
You can use Python under Windows, and svnlook is available with VisualSVN: visualsvn.com/support/svnbook/ref/svnlookDissension
I developed this under Windows ;-) It is tested on Linux and Windows. You can surely use Python for hooks, and svnlook is part of VisualSVN; otherwise you can (and should) install the svn command-line tools.Institutive
I've accepted this answer because the poster took the time to provide sample code. This didn't actually solve my problem because we're a Windows shop running VisualSVN on Windows Server and we really are limited to VBScript (ugh!). Nevertheless, the logic is sound and I'm sure I can translate it.Small
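For anyone translating the logic, the parsing step can be pulled out into a pure function that is easy to test without a repository. This is only a sketch: the function name and the sample text are made up for illustration, it assumes the usual "svnlook changed" layout (two status columns, whitespace, then the path), and unlike the script above it compares suffixes case-insensitively so that FOO.EXE is caught too.

```python
import re

# Two status columns (e.g. "A ", "U ", "_U", "UU"), whitespace, then the path.
CHANGED_LINE = re.compile(r'^([ADU_][U ]?)\s+(.+)$')


def added_paths_with_suffix(changed_output, suffixes=('.exe', '.dll')):
    """Return added paths whose names end in one of the given suffixes.

    changed_output is the text printed by "svnlook changed"; lines that do
    not look like status lines are skipped rather than treated as errors.
    """
    hits = []
    for line in changed_output.splitlines():
        match = CHANGED_LINE.match(line)
        if not match:
            continue
        status, path = match.group(1), match.group(2)
        # Only additions are rejected; modifications and deletions pass.
        if status.startswith('A') and path.lower().endswith(tuple(suffixes)):
            hits.append(path)
    return hits


# Hypothetical sample resembling "svnlook changed" output.
sample = (
    "A   trunk/app/tool.EXE\n"
    "U   trunk/app/main.c\n"
    "D   trunk/old.dll\n"
    "A   trunk/lib/\n"
    "_U  trunk/app\n"
)
print(added_paths_with_suffix(sample))
```

A hook would exit with a non-zero status and write a message to stderr whenever the returned list is not empty.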
Q
3

Write a pre-commit hook that checks whether the added files meet your criteria.

You could use pre-commit-check.py as a starting point.

Quintin answered 29/1, 2010 at 22:48 Comment(2)
How would you suggest that I control this on a per-directory basis? Some directories I need to allow binaries to be checked in, others not. I would prefer not to have to hard-code this information into a script.Small
Your script could read the list of allowed paths from a file (you probably want to store the file on the server, not in the repository, so that the user cannot change it). If you want to store the information in the repository, you could use properties on the directory instead. This makes the information more local, and it will automatically handle new branches/tags.Quintin
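As a rough sketch of the rules-file idea: the file format below (one "directory: extensions" entry per line, with "/" as the repository root) is invented for illustration, as are the function names. The most specific directory that has a rule decides, so a subtree can be opened up for binaries without touching the script.

```python
import posixpath


def parse_rules(text):
    """Parse lines like 'trunk/src: .exe .dll' into {directory: set(exts)}.

    '/' stands for the repository root; text after '#' is a comment.
    """
    rules = {}
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()
        if not line:
            continue
        directory, _, exts = line.partition(':')
        rules[directory.strip().strip('/')] = set(e.lower() for e in exts.split())
    return rules


def blocked_extension(path, rules):
    """Return the blocked extension of path, or None if the path is allowed.

    Walks up from the file's directory; the first directory with a rule wins.
    """
    ext = posixpath.splitext(path)[1].lower()
    directory = posixpath.dirname(path.strip('/'))
    while True:
        if directory in rules:
            return ext if ext in rules[directory] else None
        if not directory:
            return None  # reached the root without finding any rule
        directory = posixpath.dirname(directory)


# Hypothetical rules file kept on the server, outside the repository.
RULES_TEXT = """
/: .exe .dll          # block these everywhere by default
trunk/thirdparty:     # ...but allow anything under this directory
"""
rules = parse_rules(RULES_TEXT)
print(blocked_extension('trunk/src/app.exe', rules))
print(blocked_extension('trunk/thirdparty/lib/foo.dll', rules))
```

In a real hook the text would be read from a file whose location is passed in or configured next to the hook script.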
D
3

You can use a pre-commit hook. You'll have to write a simple program (in any language) which returns a non-zero value if the file is binary.

See here for generic documentation about repository hooks, and here for a python example from Apache.

You could look at the file names, or use the file utility to look at their types.
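If the file utility is not available on the server, a content check can be approximated in a few lines. This is a sketch with made-up function names, not a replacement for file: it relies on the fact that Windows .exe and .dll files start with the two bytes "MZ", plus the common heuristic that a NUL byte near the start of a file means it is not text. The bytes themselves would come from something like svnlook cat -t TXN REPOS PATH.

```python
def looks_like_windows_executable(data):
    """True if the content starts with the 'MZ' signature used by .exe/.dll."""
    return data[:2] == b'MZ'


def looks_binary(data, sample_size=8000):
    """Heuristic: a NUL byte within the first sample_size bytes means binary."""
    return b'\0' in data[:sample_size]


# Small hand-written samples standing in for real file contents.
exe_header = b'MZ\x90\x00\x03\x00\x00\x00'
png_header = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR'
script = b'#!/bin/sh\necho hello\n'

for name, data in [('exe', exe_header), ('png', png_header), ('script', script)]:
    print(name, looks_like_windows_executable(data), looks_binary(data))
```

Checking the signature rather than "is it binary" keeps images and other legitimate binary assets committable while still catching renamed executables.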

Dissension answered 29/1, 2010 at 22:48 Comment(1)
This -- in general you might also want to check against .dll, .exe, etc... filenames given this user's stubborn persistence.Prowel
I
3

On TortoiseSVN you can have the user add .dll, .exe, etc. to the ignore list. That way, the user won't accidentally check them in. See here for more info:

http://tortoisesvn.net/docs/release/TortoiseSVN_en/tsvn-dug-ignore.html

On the server side, as others have stated, you can use a hook script.

Inactive answered 29/1, 2010 at 22:56 Comment(5)
Well, it's a headstrong user I'm dealing with here. He's been asked several times not to commit binaries but still does. I don't think it is a matter of remembering. That's why I need to enforce the policy.Small
To be blunt, one way of enforcing this issue is to fire him. I don't mean this is your first method of handling it, but if push comes to shove, non-team players have no place on a team.Guibert
You can also deny him commit access altogether, so he can only send his diffs as patches to his colleagues. This is how the Subversion project itself restricts write access to its repository: you have to prove that you write correct code by sending patches to the mailing list.Institutive
@Lasse - it's a volunteer effort, I can't really consider "firing" a volunteer who is doing good work. We do need his contribution to the project. I just need to stop him committing binaries. If this were not a voluntary effort and assuming I were the manager, then I'd be in a much stronger position, obviously.Small
The point is, he may not know not to check those in. He's just adding everything in the directory and committing. By having his Tortoise ignore them, the problem would likely go away.Inactive
T
1

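You can use the svnlook command. Here is a Python class that does this job (Python 2 only, since it relies on popen2; the SVNLOOK path below is an assumed location, so adjust it for your server):

    import os
    import popen2
    import magic

    SVNLOOK = '/usr/bin/svnlook'  # assumed path to the svnlook binary

    class SVNTransactionParser(object):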
        def __init__(self, repos, txn):
            self.repos = repos
            self.txn = txn
            self.ms = magic.open(magic.MAGIC_NONE)
            self.ms.load()

        def tx_files(self):
            files_to_analyze = list()
            for l in self.__svnlook('changed')[0].readlines():
                l = l.replace('\n', '');
                if not l.endswith('/') and l[0] in ['A', 'U']:
                    files_to_analyze.append(l.split(' ')[-1:][0])

            files = dict()        
            for file_to_analyze in files_to_analyze:
                files[file_to_analyze] = {
                                'size': self.__svnlook('filesize', file_to_analyze)[0].readlines()[0].replace('\n', ''),
                                'type': self.ms.buffer(self.__svnlook('cat', file_to_analyze)[0].readline(4096)),
                                'extension': os.path.splitext(file_to_analyze)[1]}

            return files

        def __svnlook(self, command, extra_args=""):
            cmd = '%s %s %s -t "%s" %s' % (SVNLOOK, command, self.repos, self.txn, extra_args)
            out = popen2.popen3(cmd)
            return (out[0], out[2])

The tx_files() method returns a map with info like this:

{ 
    '/path/to/file1.txt': {'size': 10, 'type': 'ASCII', 'extension': '.txt'}, 
    '/path/to/file2.pdf': {'size': 10134, 'type': 'PDF', 'extension': '.pdf'}, 
}

You will need the library python-magic (https://github.com/ahupp/python-magic)

Tektite answered 27/8, 2013 at 20:14 Comment(0)
L
0

You could use a pre-commit hook script that checks if the file is binary or textual.

Lederhosen answered 29/1, 2010 at 22:48 Comment(1)
Bad idea, you can't add pictures for a website for instance. Extension check is much betterKwangju
