Recursively iterate through all subdirectories using pathlib
T

6

182

How can I use pathlib to recursively iterate over all subdirectories of a given directory?

from pathlib import Path

p = Path('docs')
for child in p.iterdir():
    print(child)  # do things with child

only seems to iterate over the immediate children of a given directory.

I know this is possible with os.walk() or glob, but I want to use pathlib because I like working with the path objects.

Two answered 6/6, 2018 at 7:22 Comment(0)
L
147

You can use the glob method of a Path object:

from pathlib import Path

p = Path('docs')
for i in p.glob('**/*'):
    print(i.name)
Lapful answered 6/6, 2018 at 7:27 Comment(3)
There's also a rglob method, which adds **/ before the pattern, so you can do p.rglob('*') instead.Firry
Yes as per @pylang's answer below. Thought it would be rude to update mine as he's had good success so far.Lapful
You have upstanding character @JacquesGaudin. Cheers.Spoilt
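
To check the claim in the comments that glob('**/*') and rglob('*') give the same results, here is a minimal sketch. The temporary directory and the file names in it are made up for illustration; they are not from the question.

```python
import tempfile
from pathlib import Path

# Build a small throwaway tree so the example does not need a real 'docs' folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    for rel in ("top.txt", "a/mid.txt", "a/b/deep.txt"):
        (root / rel).write_bytes(b"x")

    # Both calls walk the whole tree; relative_to() strips the temp prefix.
    via_glob = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*"))
    via_rglob = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

print(via_glob == via_rglob)
print(via_rglob)
```

Neither call yields the starting directory itself, only what is below it.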
S
274

Use Path.rglob, which behaves like Path.glob with "**/" added in front of the pattern, so path.rglob("*") is the same as path.glob("**/*"):

from pathlib import Path

path = Path("docs")
for p in path.rglob("*"):
    print(p.name)
Spoilt answered 4/1, 2019 at 22:12 Comment(1)
This is nice, you can also add the file extension e.g. for p in path.rglob("*.pdf"):Citified
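
One caveat on filtering by extension: as far as I know, whether rglob("*.pdf") also matches "B.PDF" follows the platform's casing rules by default (typically case-sensitive on POSIX, case-insensitive on Windows). If you want the same behaviour everywhere, a possible sketch is to compare suffixes yourself. find_by_suffix is a hypothetical helper name, and the files are made up for the demo.

```python
import tempfile
from pathlib import Path


def find_by_suffix(root: Path, suffix: str) -> list[Path]:
    """Return files under root whose extension equals suffix, ignoring case."""
    wanted = suffix.lower()
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix.lower() == wanted
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for rel in ("a.pdf", "sub/B.PDF", "sub/notes.txt"):
        (root / rel).write_bytes(b"")

    pdf_names = [p.name for p in find_by_suffix(root, ".pdf")]

print(pdf_names)
```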
L
19

To find only folders, use this glob pattern:

'**/'

So, to get the paths of all the folders under your path, do this:

p = Path('docs')
for child in p.glob('**/'):
    print(child)

If you only want the folder names without the full paths, print each folder's name instead:

p = Path('docs')
for child in p.glob('**/'):
    print(child.name)
Landlord answered 31/8, 2020 at 17:4 Comment(1)
To find just folders using rglob: p.rglob("./")Cummine
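
If you would rather not rely on how a trailing slash in the pattern is handled, another option is to filter with is_dir(). This is only a sketch on a made-up temporary tree. Note that it lists the directories below the starting point, not the starting directory itself.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    (root / "a" / "file.txt").write_bytes(b"")

    # Keep only the entries that are directories; plain files are skipped.
    subdirs = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )

print(subdirs)
```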
W
13

As of Python 3.12, you can use pathlib.Path.walk():

import pathlib

path = pathlib.Path(r"E:\folder")
for root, dirs, files in path.walk():
    print("Root: ")
    print(root)
    print("Dirs: ")
    print(dirs)
    print("Files: ")
    print(files)
    print("")
Wiliness answered 28/3, 2023 at 0:49 Comment(0)
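
If you are stuck on a Python older than 3.12, a rough equivalent is to wrap os.walk() and convert each root to a Path. walk_paths is a hypothetical helper name, not part of pathlib, and the tree below is made up for the demo.

```python
import os
import tempfile
from pathlib import Path


def walk_paths(top: Path):
    """Yield (root, dirnames, filenames) like os.walk(), with root as a Path."""
    for root, dirs, files in os.walk(top):
        yield Path(root), dirs, files


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "a").mkdir()
    (base / "top.txt").write_bytes(b"")
    (base / "a" / "inner.txt").write_bytes(b"")

    # Join each file name onto its root to get full Path objects.
    found = sorted(
        (root / name).relative_to(base).as_posix()
        for root, dirs, files in walk_paths(base)
        for name in files
    )

print(found)
```

The directory and file names are still plain strings here, so root / name is what gives you a Path for each entry.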
S
10

pathlib's Path has a glob method that takes a pattern as its argument.

For example, Path('abc').glob('**/*.txt') searches the folder abc and all of its subdirectories recursively and yields every .txt file it finds.

Sexlimited answered 6/6, 2018 at 7:38 Comment(0)
M
10

Use list comprehensions:

(1) [f.name for f in p.glob("**/*")]  # or
(2) [f.name for f in p.rglob("*")]

You can add if f.is_file() or if f.is_dir() to (1) or (2) if you want to target files only or directories only, respectively. Or replace "*" with some pattern like "*.txt" if you want to target .txt files only.

See this quick guide.

Moussaka answered 6/1, 2020 at 2:14 Comment(3)
What is the point in using list comprehension? How does that complement the existing answers?Lapful
I was looking at the other answers that were printing the results, so I was offering it as an alternative. But you're right, the original post doesn't make it explicit it's needed.Moussaka
We might as well add generator expressions as well.Expurgatory
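
Following up on the generator expression idea, here is a small sketch. glob and rglob already return lazy iterators, so a generator expression lets you aggregate or stop early without building a list first. The files and their sizes are made up for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "one.txt").write_bytes(b"abc")
    (root / "sub" / "two.txt").write_bytes(b"hello")

    # Generator expression: sizes are summed without an intermediate list.
    total_bytes = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())

    # next() stops at the first match instead of scanning the whole tree.
    first_txt = next((f.name for f in root.rglob("*.txt")), None)

print(total_bytes)
print(first_txt)
```

Which .txt file comes first is not something I would rely on, since the iteration order is not guaranteed.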
