Python: Getting files into an archive without the directory?
Asked Answered
S

1

16

I've been learning python for about 3 weeks now, and I'm currently trying to write a little script for sorting files (about 10.000) by keywords and date appearing in the filename. Files before a given date should be added to an archive. The sorting works fine, but not the archiving

It creates an archive - the name is fine - but in the archive is the complete path to the files. If i open it, it looks like: folder1 -> folder2 -> folder3 -> files.

How can I change it such that the archive only contains the files and not the whole structure?

Below is a snippet with my zip function, node is the path where the files were before sorting, folder is a subfolder with the files sorted by a keyword in the name, items are the folders with files sorted by date.

I am using Python 2.6

def ZipFolder(node, zipdate):
    xynode = node + '/xy'
    yznode = node + '/yz'
    for folder in [xynode,yznode]:
        items = os.listdir(folder)
        for item in items:
            itemdate = re.findall('(?<=_)\d\d\d\d-\d\d', item)
            print item
            if itemdate[0] <= zipdate:
                arcname = str(item) + '.zip'
                x = zipfile.ZipFile(folder + '/' + arcname, mode='w', compression = zipfile.ZIP_DEFLATED)
                files = os.listdir(folder + '/' + item)
                for f in files:
                    x.write(folder + '/' + item + '/' + f)
                    print 'writing ' + str(folder + '/' + item + '/' + f) + ' in ' + str(item)
                x.close()
                shutil.rmtree(folder + '/' + item)
    return

I am also open to any suggestions and improvements.

Sketch answered 10/8, 2011 at 8:27 Comment(0)
F
21

From help(zipfile):

 |  write(self, filename, arcname=None, compress_type=None)
 |      Put the bytes from filename into the archive under the name
 |      arcname.

So try changing your write() call with:

x.write(folder + '/' + item + '/' + f, arcname = f)

About your code, it seems to me good enough, especially for a 3 week pythonist, although a few comments would have been welcomed ;-)

Fury answered 10/8, 2011 at 9:16 Comment(14)
Not comments but docstring. Explain what the function does and what is it useful for, not how it works and the docstring is most useful for that.Marrymars
@Jan Hudec: with docstring you mean sth like this: def ZipFolder(node, zipdate): """zips files older than 'zipdate' in the subfolders of 'node'"""Sketch
@rny: Yes. It's the standard way to document functions in python.Marrymars
Get rid of that ugly joining. x.write(folder, item, f, arcname = f, sep='/')Needleful
@SpringField: That function is not a print()... it is not even a file.write(), and in the docs I don't see any reference to the argument sep.Fury
@Fury , Python Cookbook - page 78.Needleful
@SpringField: I don't know what page says, but... x.write(folder, item, arcname=f) -> TypeError: got multiple values for keyword argument 'arcname'. x.write(..., arcname=f, sep='/') -> TypeError: got unexpected keyword argument 'sep'.Fury
@Fury , a picture from my interpreter with the "sep" in action. imgbox.com/jNp6FKd6Needleful
@SpringField: That's the print() function that does have a sep function. The one in my answer is a zipfile.ZipFile.write() function, and that has nothing to do with the former. No sep, no end, no file, no multiple arguments; that can be seen in the help string I quoted in the answer.Fury
While @Springfield is technically incorrect, it does seem like it would be more correct to use: x.write(os.path.join(folder,item,f),arcname=f).Brookes
I knew this answer and yet came looking because, the documentation specifically says the opposite: Write the file named filename to the archive, giving it the archive name arcname (by default, this will be the same as filename, but without a drive letter and with leading path separators removed).Glutathione
I think they're referring only to the name of the file and not the directory structure, which will remain as it is in the filename.Glutathione
Please add an explanation as to which part exactly omits the directory path. Duplicates are redirected here without a proper explanation.Retool
@Arete: The write() function takes two parameters: the first one is the name of a local file whose data is to be added to the zip file; the second one is the name under which that data will be stored. So a call to z.write('c:\\users\\rodrigo\\Desktop\\My Document.doc', 'd.doc') will store that file from my desktop as a plain d.doc inside the root directory of the zip file. If instead of d.doc you write foo/d.doc then it will create a directory foo inside the zip file and store the d.doc archive there.Fury

© 2022 - 2024 — McMap. All rights reserved.