Copying specific files to a new folder, while maintaining the original subdirectory tree
Asked Answered
F

1

3

I have a large directory with many subdirectories that I am trying to sort, I am trying to copy specific file types to a new folder, but I want to maintain the original subdirectories.

def copyFile(src, dest):
try:
    shutil.copy(src,dest)
except shutil.Error as e:
    print('Error: %s' % e)
except IOError as e:
    print('Error: %s' % s.strerror)


for root, directories, files in os.walk(directory):
    for directoryname in directories:
        dirpath = os.path.join(root,directoryname)
        dir_paths.append(dirpath)
        dir_names.append(directoryname)

        if not os.listdir(dirpath): #Cheching if directory is empty
            print("Empty")
            EmptyDirs.append(directoryname) #Add directory name to empty directory list
            EmptyDirPath.append(dirpath)
        else:
            pass

    for filename in files:
        filepath = os.path.join(root,filename)
        file_paths.append(filepath)
        file_names.append(filename)

    if filename.lower().endswith(".sldasm"):
            print(filename.encode('utf8'))
            SolidModels.append(filename)
            copyFile(filepath,dest)
    elif filename.lower().endswith(".sldprt"):
            print(filename.encode('utf8'))
            SolidModels.append(filename)
            copyFile(filepath,dest)
    else:
        pass

This is the code I am using now, but it just copies the files without copying the subdirectories they were originally in, so they are completely unorganized in the new folder.

This is the new code using copytree, however now the specific files will not copy, only the subdirectories do.

def copytree(src, dst, symlinks=False, ignore=None):
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

os.makedirs(dst)
errors = []

for name in names:
    if name in ignored_names:
        continue

    srcname = os.path.join(src, name)
    dstname = os.path.join(dst, name)

    try:
        if symlinks and os.path.islink(srcname):
            linkto = os.readlink(srcname)
            os.symlink(linkto, dstname)
        elif os.path.isdir(srcname):
            copytree(srcname, dstname, symlinks, ignore)
        else:
            if src is "*.sldasm":
                copy2(srcname, dstname)
            elif src is "*.sldprt":
                copy2(srcname, dstname)

    except (IOError, os.error) as why:
            errors.append((srcname, dstname, str(why)))
Fly answered 2/2, 2016 at 13:54 Comment(3)
See shutil.copytree docs.python.org/2/library/shutil.html Specifically the copytree example code.Catch
I used the sample, thank you! I got all of the subdirectories to copy, but now I'm having trouble getting the files to copy into the subdirectory they belong in. Any suggestions? I added my updated code aboveFly
The indenting in your code is very confusing because it's a mixture of tabs and spaces. It's generally considered a best practice to only use spaces and most editors can be configured to automatically convert tabs to space when you type.Botel
B
11

You can do what you want with the built-in shutil.copytree() function by using (abusing?) its optional ignore keyword argument. The tricky part is that, if given, it must be a callable that returns what, in each directory, should not be copied, rather than what should be.

However it possible to write a factory function similar to shutil.ignore_patterns() that creates a function that does what's needed, and use that as the ignore keyword argument's value.

The function returned first determines what files to keep via the fnmatch.filter() function, then removes them from the list of everything which is in the given directory, unless they're a sub-directory name, in which case they're left in for later [recursive] processing. (This is what makes it copy the whole tree and what was probably wrong with your attempt to write your own copytree() function).

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Works in Python 2.7 & 3.x

import fnmatch
from os.path import isdir, join

def include_patterns(*patterns):
    """ Function that can be used as shutil.copytree() ignore parameter that
    determines which files *not* to ignore, the inverse of "normal" usage.

    This is a factory function that creates a function which can be used as a
    callable for copytree()'s ignore argument, *not* ignoring files that match
    any of the glob-style patterns provided.

    ‛patterns’ are a sequence of pattern strings used to identify the files to
    include when copying the directory tree.

    Example usage:

        copytree(src_directory, dst_directory,
                 ignore=include_patterns('*.sldasm', '*.sldprt'))
    """
    def _ignore_patterns(path, all_names):
        # Determine names which match one or more patterns (that shouldn't be
        # ignored).
        keep = (name for pattern in patterns
                        for name in fnmatch.filter(all_names, pattern))
        # Ignore file names which *didn't* match any of the patterns given that
        # aren't directory names.
        dir_names = (name for name in all_names if isdir(join(path, name)))
        return set(all_names) - set(keep) - set(dir_names)

    return _ignore_patterns


if __name__ == '__main__':

    from shutil import copytree, rmtree
    import os

    src = r'C:\vols\Files\PythonLib\Stack Overflow'
    dst = r'C:\vols\Temp\temp\test'

    # Make sure the destination folder does not exist.
    if os.path.exists(dst) and os.path.isdir(dst):
        print('removing existing directory "{}"'.format(dst))
        rmtree(dst, ignore_errors=False)

    copytree(src, dst, ignore=include_patterns('*.png', '*.gif'))

    print('done')
Botel answered 2/2, 2016 at 18:43 Comment(2)
Great, thanks so much! Could I also use this to include files based on their name? For instance, if I wanted to include all files with "Fork" in the name?Fly
Yes, I believe so — any glob-style file name pattern you want can be used. i.e. "*Fork*.*".Botel

© 2022 - 2024 — McMap. All rights reserved.