Here is how to traverse a large directory file by file on Windows!
I searched like a maniac for a Windows DLL that would let me do what is done on Linux, but had no luck.
So I concluded that the only way was to create my own DLL that would expose those static functions to me, but then I remembered pywin32.
And, yay! It is already done there. Even better, an iterator function is already implemented! Cool!
A Windows DLL with FindFirstFile(), FindNextFile() and FindClose() may still be out there somewhere, but I didn't find it. So I used the win32file module from pywin32.
EDIT: They were hiding in plain sight in kernel32.dll. Please see ssokolow's answer, and my comment to it.
Sorry for the dependency. But I think you can extract win32file.pyd from the ...\site-packages\win32 folder, along with any dependencies it needs, and distribute it with your program independently of pywin32 if you have to.
I found this question when searching on how to do this, and some others as well.
Here:
How to copy first 100 files from a directory of thousands of files using python?
There I posted the full code, with the Linux version of listdir() from here (by Jason Orendorff) and the Windows version that I present here.
So anyone wanting a more or less cross-platform version can go there or combine the two answers themselves.
EDIT: Or better still, use the scandir module, or os.scandir() in Python 3.5 and later. It handles errors better, among other things.
from win32file import FindFilesIterator
import errno
import os
import stat

def listdir(path):
    """
    A generator to return the names of files in the directory passed in
    """
    if "*" not in path and "?" not in path:
        st = os.stat(path)  # Raise an error if dir doesn't exist or access is denied to us
        # Check if we got a dir or something else!
        if not stat.S_ISDIR(st.st_mode):
            raise OSError(errno.ENOTDIR, "Not a directory", path)
        path = path.rstrip("\\/") + "\\*"
    # Else: Decide that user knows what she/he is doing
    for entry in FindFilesIterator(path):
        name = entry[-2]  # file name field of the WIN32_FIND_DATA tuple
        # Unfortunately, only drives (eg. C:) don't include "." and ".." in the list:
        if name == "." or name == "..":
            continue
        yield name
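
For comparison, here is a minimal sketch of the os.scandir() route mentioned in the EDIT above. It assumes Python 3.6 or later, where the scandir iterator can be used in a with statement. The names iter_names and first_names are made up for this example and are not part of any library.

```python
import os
from itertools import islice

def iter_names(path):
    """Yield entry names in `path` one at a time, without building a full list."""
    # os.scandir() never yields "." or "..", so no filtering is needed here.
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name

def first_names(path, limit):
    """Return at most `limit` names, stopping the directory scan early."""
    names = iter_names(path)
    try:
        return list(islice(names, limit))
    finally:
        # Closing the generator releases the underlying directory handle.
        names.close()
```

Something like first_names(some_big_dir, 100) then covers the "first 100 files" case from the linked question without reading the whole directory, and the same code should work on both Windows and Linux.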