Python - Copying only new files into another directory
I'm trying to copy files from one directory into another ONLY if those files don't already exist.

So if I have directory A with files "1.txt, 2.txt, 3.txt" and directory B with files "1.txt, 2.txt", I want to only copy "3.txt" into directory B without rewriting the other files.

I'm a Python beginner and I've spent hours on this and have done plenty of research but I just can't seem to find the answer! Help!

Thanks

Waldack answered 31/7, 2014 at 3:33 Comment(1)
Create a for loop over the original directory and then use this #83331 to determine if the file already exists. If it doesn't, copy the file. Labbe
W
5

I'll help you out, and lead you to the answer (but I won't just outright give it to you!).

So, first things first, most basic file editing functions are in the os module, so let's import that at the top of our script:

import os

Now let's see how we check if a file exists. A little research shows the os.path module has a function called exists, which checks whether a path exists. So now we're set, but we have to figure out how to get all the files in directory A. It looks like the os module works for this too, with the listdir function. If we have a directory called "directoryone", we could get the names of all the files and directories in it (as a list) with this:

[file for file in os.listdir("directoryone")]

But we only want to get files, so we have to add an if statement to narrow down our list:

[file for file in os.listdir("directoryone") if os.path.isfile(os.path.join("directoryone", file))]

So now we have a statement which gets all the files in a directory, and we have a way to check if a file exists. The last thing we need to do is figure out how to copy files. We have to import the shutil module for this:

import shutil

And then we can use the shutil.copy function as so:

shutil.copy(srcfile, dstdir)

So we'd end up with this code:

import os, shutil

directorya = "exampledir"
directoryb = "exampledir2"
files = [file for file in os.listdir(directorya) if os.path.isfile(os.path.join(directorya, file))]
for file in files:
    if not os.path.exists(os.path.join(directoryb, file)):
        shutil.copy(os.path.join(directorya, file), directoryb)
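
The same steps can also be packaged as a small function, which makes the behavior easier to check. This is only a sketch: the names copy_new_files and demo, and the temporary directories used in the demo, are assumptions for illustration and not part of the answer above.

```python
import os
import shutil
import tempfile


def copy_new_files(src_dir, dst_dir):
    """Copy regular files from src_dir that are missing in dst_dir.

    Returns the sorted list of file names that were copied.
    """
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        dst_path = os.path.join(dst_dir, name)
        # Skip subdirectories and anything already present in dst_dir.
        if os.path.isfile(src_path) and not os.path.exists(dst_path):
            shutil.copy(src_path, dst_dir)
            copied.append(name)
    return copied


def demo():
    """Recreate the example from the question in throwaway directories."""
    dir_a = tempfile.mkdtemp()
    dir_b = tempfile.mkdtemp()
    try:
        for name in ("1.txt", "2.txt", "3.txt"):
            with open(os.path.join(dir_a, name), "w") as handle:
                handle.write("from A: " + name)
        for name in ("1.txt", "2.txt"):
            with open(os.path.join(dir_b, name), "w") as handle:
                handle.write("from B: " + name)
        copied = copy_new_files(dir_a, dir_b)
        # Read back a file that existed in B to confirm it was not overwritten.
        with open(os.path.join(dir_b, "1.txt")) as handle:
            untouched = handle.read()
        return copied, untouched, sorted(os.listdir(dir_b))
    finally:
        shutil.rmtree(dir_a)
        shutil.rmtree(dir_b)


if __name__ == "__main__":
    print(demo())
```

Returning the list of copied names is a design choice made here so that a caller can log or verify what happened; the original loop works the same way without it.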
Woollyheaded answered 31/7, 2014 at 3:47 Comment(1)
Thank you for the breakdown. Solved my problem and taught me the method. Thanks again. Waldack
L
2

Get the list of files in both directories; you can get each list using os.listdir. For example:

>>> a_files = ['1.txt', '2.txt', '3.txt']
>>> b_files = ['1.txt', '2.txt']

Take the difference of the two lists:

>>> set(a_files) - set(b_files)
set(['3.txt'])

You will get only 3.txt. Copy this file to folder B.
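
Put together, those steps might look like the sketch below. The function name copy_missing is hypothetical, and the os.path.isfile check is an addition: os.listdir also returns subdirectory names, which shutil.copy cannot copy.

```python
import os
import shutil


def copy_missing(dir_a, dir_b):
    """Copy files whose names appear in dir_a but not in dir_b."""
    a_files = set(os.listdir(dir_a))
    b_files = set(os.listdir(dir_b))
    copied = []
    # Set difference: names present in A and absent from B.
    for name in sorted(a_files - b_files):
        src_path = os.path.join(dir_a, name)
        # Only regular files are copied; subdirectories are skipped.
        if os.path.isfile(src_path):
            shutil.copy(src_path, dir_b)
            copied.append(name)
    return copied
```

Calling copy_missing("A", "B") with the directories from the question should copy only 3.txt, assuming both directories exist.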

Leanora answered 31/7, 2014 at 3:39 Comment(1)
Just difference. symmetric_difference will cause problems if there are any extra files in b_files. E.g. set(a_files).difference(b_files) Jussive
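
To illustrate the point in that comment, here is a small sketch with one extra file in B. The name extra.txt is made up for the example.

```python
a_files = {"1.txt", "2.txt", "3.txt"}
b_files = {"1.txt", "2.txt", "extra.txt"}  # "extra.txt" exists only in B

# Plain difference: names in A that are missing from B.
to_copy = a_files - b_files

# Symmetric difference: names that are in exactly one of the two sets.
either_side = a_files ^ b_files

print(sorted(to_copy))      # ['3.txt']
print(sorted(either_side))  # ['3.txt', 'extra.txt']
```

Looping over the symmetric difference would try to copy extra.txt out of A, where it does not exist, so the plain difference is the one to use here.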
N
1

If you want to copy only files with the extension *.txt you can do this:

#!/usr/bin/python

import glob
import os
import shutil
dira = 'path-to-dira'
dirb = 'path-to-dirb'

for filename in glob.glob(os.path.join(dira, '*.txt')):
    # Path the file would have in dirb
    target = os.path.join(dirb, os.path.basename(filename))
    print(target)
    # Copy only if no file with that name exists in dirb yet
    if not os.path.isfile(target):
        shutil.copy(filename, dirb)

If you want all files, replace '*.txt' with '*'.

Nawab answered 31/7, 2014 at 4:10 Comment(1)
Thanks for the "*.txt" addition, great to know! Waldack
A
0
import shutil
from os import listdir
from os.path import isfile, join

DIR_A = "<Complete path to A directory>"
DIR_B = "<Complete path to B directory>"

onlyfiles_A = [ f for f in listdir(DIR_A) if isfile(join(DIR_A,f)) ]
onlyfiles_B = [ f for f in listdir(DIR_B) if isfile(join(DIR_B,f)) ]

for f_a in onlyfiles_A:
    # Copy only the files that B does not already have
    if f_a not in onlyfiles_B:
        src = join(DIR_A, f_a)
        shutil.copy(src, DIR_B)

For more information browse through this link:

https://docs.python.org/2/library/shutil.html

Ananthous answered 31/7, 2014 at 3:47 Comment(0)
L
0

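Maybe you can do the following:

import glob
import os
import shutil

src = "/opt/"
dst = "/opt/something/"

# glob returns full paths, so compare the two directories by file name only
files_src = set(os.path.basename(p) for p in glob.glob(os.path.join(src, "*.txt")))
files_dst = set(os.path.basename(p) for p in glob.glob(os.path.join(dst, "*.txt")))

# Names present in src but missing from dst
other_files = files_src - files_dst
for _file in other_files:
    shutil.copy(os.path.join(src, _file), dst)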
Laciniate answered 31/7, 2014 at 3:47 Comment(0)
