How do you get a directory listing sorted by creation date in python?
M

19

221

What is the best way to get a list of all files in a directory, sorted by date [created | modified], using Python on a Windows machine?

Margotmargrave answered 3/10, 2008 at 19:10 Comment(0)
P
191

Update: to sort dirpath's entries by modification date in Python 3:

import os
from pathlib import Path

paths = sorted(Path(dirpath).iterdir(), key=os.path.getmtime)

(put @Pygirl's answer here for greater visibility)

If you already have a list of filenames, files, then to sort it in place by creation time on Windows (make sure the list contains absolute paths):

files.sort(key=os.path.getctime)

The list of files you could get, for example, using glob as shown in @Jay's answer.
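Since "creation time" means different things on different platforms, here is a minimal sketch of one way to hedge against that. The helper names creation_time and files_by_creation are made up for illustration, and the fallback to st_ctime assumes you accept "last metadata change" on Unix systems that do not expose st_birthtime:

```python
import os
from pathlib import Path


def creation_time(path):
    """Return the best available creation timestamp for path (hypothetical helper)."""
    st = os.stat(path)
    # st_birthtime exists only on some platforms; otherwise fall back to
    # st_ctime, which is the creation time on Windows but the time of the
    # last metadata change on most Unix systems.
    return getattr(st, "st_birthtime", st.st_ctime)


def files_by_creation(dirpath):
    """Return the regular files in dirpath as Path objects, oldest first."""
    files = [p for p in Path(dirpath).iterdir() if p.is_file()]
    return sorted(files, key=creation_time)
```

Because iterdir() yields paths that already include dirpath, this avoids the relative-path pitfall mentioned above.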


Old answer: Here's a more verbose version of @Greg Hewgill's answer. It conforms most closely to the question's requirements. It makes a distinction between creation and modification dates (at least on Windows).

#!/usr/bin/env python
from stat import S_ISREG, ST_CTIME, ST_MODE
import os, sys, time

# path to the directory (relative or absolute)
dirpath = sys.argv[1] if len(sys.argv) == 2 else r'.'

# get all entries in the directory w/ stats
entries = (os.path.join(dirpath, fn) for fn in os.listdir(dirpath))
entries = ((os.stat(path), path) for path in entries)

# leave only regular files, insert creation date
entries = ((stat[ST_CTIME], path)
           for stat, path in entries if S_ISREG(stat[ST_MODE]))
# NOTE: on Windows `ST_CTIME` is the creation date,
#   but on Unix it is the time of the last metadata change
# NOTE: import and use `ST_MTIME` to sort by modification date

for cdate, path in sorted(entries):
    print(time.ctime(cdate), os.path.basename(path))

Example:

$ python stat_creation_date.py
Wed Feb 11 13:31:07 2009 stat_creation_date.py
Polynesia answered 11/2, 2009 at 21:58 Comment(7)
This worked perfectly. I'm trying to compare two directories cdate with each other. Is there a way to compare the seconds between the two cdates?Jobey
@malcmcmul: cdate is a float number of seconds since Epoch.Polynesia
This works but the most succinct solution is at https://mcmap.net/q/120539/-directory-listing-based-on-time-duplicateJonell
@jmoz: do you mean like this? The solution you've linked is wrong: it doesn't filter regular files. Note: my solution calls stat once per directory entry.Polynesia
Forgive me, link provided by Sabastian is even more succinct! Thank you.Jonell
paths = sorted(Path(directory).iterdir(), key=os.path.getmtime) File "/usr/lib/python2.7/genericpath.py", line 62, in getmtime return os.stat(filename).st_mtime TypeError: coercing to Unicode: need string or buffer, PosixPath foundGetaway
@LavaSangeetham notice that the answer for the pathlib solution says Python 3, not Python 2.7Polynesia
B
204

I've done this in the past for a Python script to determine the last updated files in a directory:

import glob
import os

search_dir = "/mydir/"
# remove anything from the list that is not a file (directories, symlinks)
# thanks to J.F. Sebastion for pointing out that the requirement was a list 
# of files (presumably not including directories)  
files = list(filter(os.path.isfile, glob.glob(search_dir + "*")))
files.sort(key=lambda x: os.path.getmtime(x))

That should do what you're looking for based on file mtime.

EDIT: Note that you can also use os.listdir() in place of glob.glob() if desired. I used glob in my original code because I only wanted files with a particular set of extensions, which glob() is better suited to. With listdir, it would look like this:

import os

search_dir = "/mydir/"
os.chdir(search_dir)
files = filter(os.path.isfile, os.listdir(search_dir))
files = [os.path.join(search_dir, f) for f in files] # add path to each file
files.sort(key=lambda x: os.path.getmtime(x))
Bangalore answered 3/10, 2008 at 19:12 Comment(15)
glob() is nice, but keep in mind that it skips files starting with a period. *nix systems treat such files as hidden (thus omitting them from listings), but in Windows they are normal files.Encomiast
These solutions don't exclude dirs from list.Pernickety
Your os.listdir solution is missing the os.path.join: files.sort(lambda x,y: cmp(os.path.getmtime(os.path.join(search_dir,x)), os.path.getmtime(os.path.join(search_dir,y))))Interception
files.sort(key=lambda fn: os.path.getmtime(os.path.join(search_dir, fn)))Polynesia
files = filter(os.path.isfile, os.listdir(search_dir))Polynesia
Your solution doesn't sort by creation date as OP asks. See #168909Polynesia
@J.F. - the question actually asks "date [created | modified]" so mtime is a better choice than ctime.Bangalore
@J.F. - thanks for pointing out the "key" param to sort, that was added in Python 2.4 and this code was originally on python 2.3 so I wasn't aware of it at the time. Learn something new every day!Bangalore
A mere files.sort(key=os.path.getmtime) should work (without lambda).Polynesia
Note: after os.chdir(search_dir), you don't need os.listdir(search_dir); you could use os.listdir(os.curdir) instead and therefore you don't need os.path.join(search_dir, f) either. You could replace the last 3 lines with this: files = sorted(filter(os.path.isfile, os.listdir(os.curdir)), key=os.path.getmtime)Polynesia
In case of a large folder, and if one only wants the last file, there is no more efficient way of doing this, right?Electrolyte
@Electrolyte to monitor a folder for new files, you could use the watchdog module. To find the file created last in the given directory only once, max() + os.scandir() or os.listdir() is enough. Here's code example (text in Russian)Polynesia
How do I manipulate the time it gives me? For example, I want to look at the files that are older than one week? Is there a way to convert the output from os.path.getmtime(x) to a date?Archibaldo
the os.chdir() is relevant although I have an absolute path. Welcome to Python!Odetteodeum
you could do files = [os.path.join(search_dir, f) for f in files if ".txt" in f] to get only .txt files (example) - to get only files with specific extensionDer
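On the question above about files older than one week: os.path.getmtime() returns seconds since the epoch, so it can be compared against time.time() and converted with datetime.fromtimestamp(). A rough sketch, where files_older_than and its days parameter are names invented for this example:

```python
import os
import time
from datetime import datetime, timedelta


def files_older_than(dirpath, days=7):
    """Return (modified datetime, path) pairs for files older than `days`, oldest first."""
    cutoff = time.time() - timedelta(days=days).total_seconds()
    result = []
    for name in os.listdir(dirpath):
        path = os.path.join(dirpath, name)
        if not os.path.isfile(path):
            continue  # skip directories and broken symlinks
        mtime = os.path.getmtime(path)
        if mtime < cutoff:
            # fromtimestamp() converts the float to a local datetime
            result.append((datetime.fromtimestamp(mtime), path))
    return sorted(result)
```

Joining each name with dirpath means no os.chdir() is needed here.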
S
44

There is an os.path.getmtime function that returns the modification time as the number of seconds since the epoch. It calls os.stat internally, so it is a more convenient spelling rather than a faster one.

import os 

os.chdir(directory)
sorted(filter(os.path.isfile, os.listdir('.')), key=os.path.getmtime)
Slaty answered 6/2, 2011 at 16:47 Comment(0)
E
27

Here's my version:

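import os

def getfiles(dirpath):
    # keep only regular files (drop the isfile() test to include directories)
    a = [s for s in os.listdir(dirpath)
         if os.path.isfile(os.path.join(dirpath, s))]
    # sort in place, oldest first, by modification time
    a.sort(key=lambda s: os.path.getmtime(os.path.join(dirpath, s)))
    return a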

First, we build a list of the file names. isfile() is used to skip directories; it can be omitted if directories should be included. Then, we sort the list in place, using the modification date as the key.
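If you only want the few newest files, you do not have to sort the whole list and slice it. A small sketch using heapq.nlargest from the standard library; the function name newest_files and the default n=5 are just illustrative choices:

```python
import heapq
import os


def newest_files(dirpath, n=5):
    """Return up to n file names from dirpath, newest first."""
    names = [s for s in os.listdir(dirpath)
             if os.path.isfile(os.path.join(dirpath, s))]
    # nlargest keeps only the n entries with the largest modification time
    return heapq.nlargest(
        n, names, key=lambda s: os.path.getmtime(os.path.join(dirpath, s)))
```

For small directories, sorting with reverse=True and slicing the first n items gives the same result.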

Encomiast answered 3/10, 2008 at 19:46 Comment(1)
It sorted it by oldest first to newest. When I wanted the 5 newest files I had to do the following a[-5:]Irra
B
22

Here's a one-liner:

import os
import time
from pprint import pprint

pprint([(x[0], time.ctime(x[1].st_ctime)) for x in sorted([(fn, os.stat(fn)) for fn in os.listdir(".")], key = lambda x: x[1].st_ctime)])

This calls os.listdir() to get a list of the filenames, then calls os.stat() for each one to get st_ctime (the creation time on Windows), then sorts by that value.

Note that this method only calls os.stat() once for each file, which will be more efficient than calling it for each comparison in a sort.
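The same stat-once idea can also drop directories by checking st_mode on the result that was already fetched. This is only a sketch: files_with_ctime is a made-up name, and it assumes the directory contains no broken symlinks (os.stat would raise on those):

```python
import os
import stat
import time


def files_with_ctime(dirpath="."):
    """Return (name, formatted st_ctime) pairs for regular files, sorted by st_ctime."""
    # one os.stat() call per entry; the result is reused for filtering and sorting
    stats = [(fn, os.stat(os.path.join(dirpath, fn))) for fn in os.listdir(dirpath)]
    regular = [(fn, st) for fn, st in stats if stat.S_ISREG(st.st_mode)]
    regular.sort(key=lambda item: item[1].st_ctime)
    return [(fn, time.ctime(st.st_ctime)) for fn, st in regular]
```

Note that S_ISREG takes the st_mode field, not the whole stat result.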

Brutal answered 3/10, 2008 at 19:15 Comment(3)
that's hardly pythonic, though it does solve the job (disclaimer: didn't test the code).Proclivity
This solution doesn't exclude dirs from list.Pernickety
@Constantin: that's true, but a quick [... if stat.S_ISREG(x)] would handle that.Brutal
O
21

In Python 3.5+:

from pathlib import Path
sorted(Path('.').iterdir(), key=lambda f: f.stat().st_mtime)
Ondrej answered 15/9, 2017 at 4:25 Comment(3)
for creation date, use f.stat().st_ctime instead.Illiterate
You should cast the PosixPath object into str in order to execute String methods.Voletta
Maybe I misunderstand the comment. Could you clarify which str method? We are sorting on st_mtime, not PosixPath.Ondrej
R
20

Without changing directory:

import os    

path = '/path/to/files/'
name_list = os.listdir(path)
full_list = [os.path.join(path,i) for i in name_list]
time_sorted_list = sorted(full_list, key=os.path.getmtime)

print(time_sorted_list)

# if you want just the filenames sorted, simply remove the dir from each
sorted_filename_list = [os.path.basename(i) for i in time_sorted_list]
print(sorted_filename_list)
Relation answered 21/5, 2015 at 18:28 Comment(0)
R
15
from pathlib import Path
import os

sorted(Path('./').iterdir(), key=lambda t: t.stat().st_mtime)

or

sorted(Path('./').iterdir(), key=os.path.getmtime)

or

sorted(os.scandir('./'), key=lambda t: t.stat().st_mtime)

where mtime is the modification time.

Rianon answered 8/11, 2019 at 18:38 Comment(0)
T
13

Here's my answer using glob without filter if you want to read files with a certain extension in date order (Python 3).

import glob
import os

dataset_path = '/mydir/'
files = glob.glob(dataset_path + "/morepath/*.extension")
files.sort(key=os.path.getmtime)
Tachymetry answered 13/9, 2013 at 9:59 Comment(0)
T
10
# *** the shortest and best way ***
# getmtime --> sort by modified time
# getctime --> sort by created time

import glob,os

lst_files = glob.glob("*.txt")
lst_files.sort(key=os.path.getmtime)
print("\n".join(lst_files))
Thermography answered 7/9, 2019 at 6:30 Comment(3)
please provide contextChromosome
"best" is subjective. Your answer would be better if you explained why you think it's the best way.Bernice
If you want "the best", you certainly don't use glob, as it's really slow.Amidst
J
5
import os

sorted(filter(os.path.isfile, os.listdir('.')),
    key=lambda p: os.stat(p).st_mtime)

You could use next(os.walk('.'))[-1] (os.walk('.').next()[-1] in Python 2) instead of filtering with os.path.isfile, but that leaves dead symlinks in the list, and os.stat will fail on them.

Joannejoannes answered 3/10, 2008 at 20:7 Comment(0)
S
4

For completeness, with os.scandir (about 2x faster than pathlib in my experience):

import os
sorted(os.scandir('/tmp/test'), key=lambda d: d.stat().st_mtime)
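If only the most recently modified file is needed, max() over os.scandir() avoids sorting everything. A minimal sketch, with newest_entry as a hypothetical helper name:

```python
import os


def newest_entry(dirpath):
    """Return the path of the most recently modified regular file, or None."""
    with os.scandir(dirpath) as entries:
        # collect (mtime, path) pairs while the scandir iterator is open
        files = [(e.stat().st_mtime, e.path) for e in entries if e.is_file()]
    return max(files)[1] if files else None
```

This is a single pass over the directory, so it stays cheap even for large folders.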
Sills answered 20/9, 2019 at 15:39 Comment(0)
L
1

This is a basic step-by-step version for learning:

import os, sys
import time

dirpath = sys.argv[1] if len(sys.argv) == 2 else r'.'

for name in os.listdir(dirpath):
    # build the full path instead of calling os.chdir() on every iteration
    full_path = os.path.realpath(os.path.join(dirpath, name))
    file_stat = os.stat(full_path)
    print(time.ctime(file_stat.st_ctime), full_path)
Lingual answered 18/10, 2011 at 2:29 Comment(0)
D
1

Alex Coventry's answer will raise an exception if the file is a symlink to a nonexistent file. The following code corrects that answer:

import os
import time
sorted(filter(os.path.isfile, os.listdir('.')),
    key=lambda p: os.stat(p).st_mtime if os.path.exists(p) else time.time())

When the file doesn't exist, the current time is used, and the symlink will go at the very end of the list.

Dorian answered 11/8, 2017 at 21:18 Comment(0)
M
1

This was my version:

import os

folder_path = r'D:\Movies\extra\new\dramas' # your path
os.chdir(folder_path) # make the path active
x = sorted(os.listdir(), key=os.path.getctime)  # sorted using creation time

for name in x:
    print(name) # print every entry name inside folder_path
Mauromaurois answered 3/6, 2020 at 9:26 Comment(1)
In my code the entries are sorted from oldest to newest. To get the newest files or folders first, pass reverse=True to sorted(): x = sorted(os.listdir(), key=os.path.getctime, reverse=True)Mauromaurois
T
0

Here are a couple of simple lines that filter by extension and also provide a sort option:

import os
import re

def get_sorted_files(src_dir, regex_ext='.*', sort_reverse=False):
    # regex_ext is a regular expression for the extension, e.g. 'txt|csv'
    files_to_evaluate = [os.path.join(src_dir, f) for f in os.listdir(src_dir) if re.search(r'.*\.({})$'.format(regex_ext), f)]
    files_to_evaluate.sort(key=os.path.getmtime, reverse=sort_reverse)
    return files_to_evaluate
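If a regular expression is more than you need, the same filter can be sketched with pathlib suffixes. The name sorted_by_suffix and its suffixes parameter are assumptions made for this example, not part of the answer above:

```python
from pathlib import Path


def sorted_by_suffix(src_dir, suffixes=(".txt",), reverse=False):
    """Return files in src_dir whose suffix is in `suffixes`, sorted by mtime."""
    files = [p for p in Path(src_dir).iterdir()
             if p.is_file() and p.suffix.lower() in suffixes]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=reverse)
```

Suffixes are compared in lower case and must include the leading dot.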
Tibia answered 3/4, 2019 at 16:17 Comment(0)
D
0

Put the directory in path; if you only want a specific file type, put its extension in the glob pattern. The file names then come back in chronological order. This works for me.

import glob, os
# file_location and date_file must already be defined by the caller
path = os.path.expanduser(file_location+"/"+date_file)  
os.chdir(path)    
saved_file=glob.glob('*.xlsx')
saved_file.sort(key=os.path.getmtime)

print(saved_file)
Duggan answered 11/4, 2022 at 20:33 Comment(0)
A
-1

Turns out os.listdir sorts by last modified but in reverse so you can do:

import os
last_modified=os.listdir()[::-1]
Atheist answered 2/2, 2021 at 15:39 Comment(1)
"Turns out os.listdir sorts by last modified but in reverse " - No, it doesn't. The doc clearly states: "os.listdir(path='.') Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order" (emphasis mine)Exit
Z
-5

Maybe you should use shell commands. In Unix/Linux, find piped with sort will probably be able to do what you want.

Zeitgeist answered 3/10, 2008 at 19:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.