Filter directory when using shutil.copytree?
Is there a way I can filter a directory by using the absolute path to it?

shutil.copytree(directory,
                target_dir,
                ignore = shutil.ignore_patterns("/Full/Path/To/aDir/Common")) 

This doesn't work when I try to filter out the "Common" directory located under "aDir". If I do this instead:

shutil.copytree(directory,
                target_dir,
                ignore = shutil.ignore_patterns("Common"))

It works, but every directory called Common in that tree is filtered out, which is not what I want.

Any suggestions?

Thanks.

Flour answered 20/10, 2011 at 20:47 Comment(0)
T
19

You can make your own ignore function:

shutil.copytree('/Full/Path', 'target',
              ignore=lambda directory, contents: ['Common'] if directory == '/Full/Path/To/aDir' else [])

Or, if you want to be able to call copytree with a relative path:

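import os.path
import shutil

def ignorePath(path):
  def ignoref(directory, contents):
    # Return a list rather than a generator: copytree runs a membership
    # test against the result for each entry, which would consume a generator.
    return [
        f for f in contents
        if os.path.abspath(os.path.join(directory, f)) == path]
  return ignoref

shutil.copytree('Path', 'target', ignore=ignorePath('/Full/Path/To/aDir/Common'))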

From the docs:

If ignore is given, it must be a callable that will receive as its arguments the directory being visited by copytree(), and a list of its contents, as returned by os.listdir(). Since copytree() is called recursively, the ignore callable will be called once for each directory that is copied. The callable must return a sequence of directory and file names relative to the current directory (i.e. a subset of the items in its second argument); these names will then be ignored in the copy process. ignore_patterns() can be used to create such a callable that ignores names based on glob-style patterns.
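To make the quoted contract concrete, here is a minimal sketch that records what the callable actually receives. The helper names make_recording_ignore and demo, and the temporary directory layout, are made up for illustration. It shows that the second argument holds bare names only, which is why a full path passed to ignore_patterns() never matches anything.

```python
import os
import shutil
import tempfile


def make_recording_ignore(calls):
    """Return an ignore callable that records every call and ignores nothing."""
    def _ignore(directory, contents):
        # copytree passes the directory being visited and the bare names in it
        calls.append((directory, sorted(contents)))
        return []
    return _ignore


def demo():
    """Copy a small hypothetical tree and report each call relative to the source."""
    calls = []
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "aDir", "Common"))
        os.makedirs(os.path.join(src, "other", "Common"))
        shutil.copytree(src, os.path.join(tmp, "dst"),
                        ignore=make_recording_ignore(calls))
        return [(os.path.relpath(d, src), names) for d, names in calls]


if __name__ == "__main__":
    for directory, names in demo():
        print(directory, names)
```

The callable is invoked once per directory, and the directory argument is built from the source path exactly as it was passed to copytree, so a relative source yields relative directory arguments.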

Tribunal answered 20/10, 2011 at 21:1 Comment(2)
Small correction: the ignoref function should return a list (i.e. [f for f in ...]) instead of a generator. - Puzzlement
Another correction: os.abspath should be os.path.abspath. - Ecthyma
F
4

The API for shutil.ignore_patterns() doesn't support absolute paths, but it is trivially easy to roll your own variant.

As a starting point, look at the source code for *ignore_patterns*:

def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

You can see that it returns a function that accepts a path and a list of names, and that this function returns the set of names to ignore. To support your use case, create your own similar function that takes advantage of the path argument, then pass it as the ignore parameter in the call to copytree().
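One possible shape for such a variant is sketched below. The name ignore_abs_patterns is hypothetical and not part of shutil; the sketch assumes you want glob-style patterns matched against the absolute path of each entry rather than against its bare name.

```python
import fnmatch
import os


def ignore_abs_patterns(*patterns):
    """Like shutil.ignore_patterns, but match patterns against absolute paths."""
    def _ignore(path, names):
        ignored = set()
        for name in names:
            # Build the absolute path of the entry from the directory being visited
            full = os.path.abspath(os.path.join(path, name))
            if any(fnmatch.fnmatch(full, pattern) for pattern in patterns):
                ignored.add(name)
        return ignored
    return _ignore
```

It would be used as shutil.copytree(directory, target_dir, ignore=ignore_abs_patterns("/Full/Path/To/aDir/Common")). Note that in fnmatch a * also matches path separators, so a pattern such as "/Full/Path/*/Common" can match at any depth below /Full/Path.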

Alternatively, don't use shutil as-is. The source code is short and sweet, so it isn't hard to cut, paste, and customize.

Flickinger answered 20/10, 2011 at 21:3 Comment(0)
A
3

You'll want to make your own ignore function, which checks the current directory being processed and returns a list containing 'Common' only if the dir is '/Full/Path/To/aDir'.

def ignore_full_path_common(dir, files):
    if dir == '/Full/Path/To/aDir':
        return ['Common']
    return []

shutil.copytree(directory, target_dir, ignore=ignore_full_path_common)
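
The string comparison above only matches when copytree is called with a source path spelled the same way as the hard-coded directory. A hedged variant that normalizes both sides first is sketched here; ignore_in_directory and demo are illustrative names, and the temporary tree stands in for the real layout.

```python
import os
import shutil
import tempfile


def ignore_in_directory(target_dir, names):
    """Ignore the given names only while copytree is visiting target_dir."""
    target = os.path.realpath(target_dir)

    def _ignore(directory, contents):
        # Normalize the visited directory so relative paths and symlinks compare equal
        if os.path.realpath(directory) == target:
            return [n for n in contents if n in names]
        return []
    return _ignore


def demo():
    """Return whether Common survived the copy under aDir and under bDir."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dst = os.path.join(tmp, "dst")
        os.makedirs(os.path.join(src, "aDir", "Common"))
        os.makedirs(os.path.join(src, "bDir", "Common"))
        shutil.copytree(src, dst,
                        ignore=ignore_in_directory(os.path.join(src, "aDir"), {"Common"}))
        return (os.path.isdir(os.path.join(dst, "aDir", "Common")),
                os.path.isdir(os.path.join(dst, "bDir", "Common")))
```

Only the Common directory under aDir is skipped; the one under bDir is still copied.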
Aggression answered 20/10, 2011 at 21:7 Comment(0)
E
2

Many thanks for the answer. It helped me design my own ignore_patterns() function for a slightly different requirement. I'm pasting the code here in case it helps someone.

Below is an ignore_patterns() variant that excludes multiple files/directories by their absolute paths.

myExclusionList --> list of files/directories to exclude while copying. The list can contain wildcard patterns. Paths in the list are relative to the srcpath provided. For example:

[EXCLUSION LIST]

java/app/src/main/webapp/WEB-INF/lib/test
unittests
python-buildreqs/apps/abc.tar.gz
3rd-party/jdk*

Code is pasted below

import glob
import os
from os.path import join
from shutil import copytree, Error

def copydir(srcpath, dstpath, myExclusionList, log):

    patternlist = []
    try:
        # Forming the absolute path of files/directories to be excluded
        for pattern in myExclusionList:
            tmpsrcpath = join(srcpath, pattern)
            # myExclusionList can contain wildcard patterns, hence glob is used;
            # abspath keeps the results comparable with the check in ignore_patterns_override
            patternlist.extend(os.path.abspath(p) for p in glob.glob(tmpsrcpath))
        copytree(srcpath, dstpath, ignore=ignore_patterns_override(*patternlist))
    # catch the Error from the recursive copytree so that we can
    # continue with other files (shutil.Error is an OSError subclass,
    # so it has to be handled before the more general clause)
    except Error as err:
        log.warning("Unable to copy %s to %s because %s", srcpath, dstpath, str(err))
    except (IOError, os.error) as why:
        log.warning("Unable to copy %s to %s because %s", srcpath, dstpath, str(why))


# [START: Ignore Patterns]
# Modified Function to ignore patterns while copying.
# Default Python Implementation does not exclude absolute path
# given for files/directories

def ignore_patterns_override(*patterns):
    """Function that can be used as copytree() ignore parameter.
    Patterns is a sequence of glob-style patterns
    that are used to exclude files/directories"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for f in names:
            for pattern in patterns:
                if os.path.abspath(join(path, f)) == pattern:
                    ignored_names.append(f)
        return set(ignored_names)
    return _ignore_patterns

# [END: Ignore Patterns]
Ellen answered 18/2, 2021 at 17:10 Comment(0)
W
0

Platform-independent version. paths is a list of glob patterns, for example [".gitkeep", "app/build", "*.txt"].

    import pathlib

    def callbackIgnore(paths):
        """ callback for shutil.copytree """
        def ignoref(directory, contents):
            arr = []
            for f in contents:
                # ignore the entry if its path matches any of the patterns
                if any(pathlib.PurePath(directory, f).match(p) for p in paths):
                    arr.append(f)
            return arr

        return ignoref
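
What this callback filters depends on how PurePath.match treats relative patterns: they are matched from the right-hand end of the path, so "app/build" is not anchored to the source root. The sketch below illustrates this with the example patterns; matches_any and the sample paths are made up for the demonstration, and PurePosixPath is used so the result does not depend on the platform.

```python
from pathlib import PurePosixPath


def matches_any(path, patterns):
    """Return True if the path matches at least one glob pattern."""
    return any(PurePosixPath(path).match(p) for p in patterns)


patterns = [".gitkeep", "app/build", "*.txt"]

# Hypothetical paths, checked against the example patterns
samples = [
    "proj/app/build",        # matches "app/build"
    "proj/lib/app/build",    # also matches: relative patterns match from the right
    "proj/app/builds",       # no match: the last component differs
    "proj/docs/readme.txt",  # matches "*.txt" at any depth
    "proj/.gitkeep",         # matches ".gitkeep"
]
results = {s: matches_any(s, patterns) for s in samples}
```

If only one specific app/build directory should be skipped, an absolute pattern or one of the absolute-path approaches from the other answers is the safer choice.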
Weariless answered 14/2, 2023 at 21:49 Comment(0)
