I have a huge set of files that I want to traverse using Python. I am using os.walk(source) for this and it works, but since I have a huge set of files it is taking too much time and memory, because it is getting the complete list all at once. How can I optimize this to use fewer resources, perhaps walking through one directory at a time or in some other efficient manner, while still being able to iterate over the complete set of files? Thanks
for dir, dirnames, filenames in os.walk(START_FOLDER):
    for name in dirnames:
        #if PRIVATE_FOLDER not in name:
        for keyword in FOLDER_WITH_KEYWORDS_DELETION_EXCEPTION_LIST:
            if keyword in name.lower():
                ignoreList.append(name)
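If the goal is to hold only one directory listing in memory at a time, a minimal sketch using os.scandir (available in Python 3.5+, usable as a context manager in 3.6+) could look like the following; handle_file is a hypothetical placeholder for whatever per-file work is needed:

import os

def walk_one_dir_at_a_time(root):
    # Keep a stack of pending directory paths; scan one directory at a
    # time, so only the current listing is held in memory.
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:   # closes the handle promptly
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    yield entry.path

for path in walk_one_dir_at_a_time(START_FOLDER):
    handle_file(path)   # hypothetical per-file processing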
os.walk already returns a generator, which is lazy. Are you turning it into a list or something? Because if not, it should not cause memory issues. (Also, post your code.) – Horizon
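For illustration (not the asker's code), the difference the comment is pointing at might look like this:

import os

# Lazy: os.walk yields one (dirpath, dirnames, filenames) tuple at a time,
# so only the current directory's listing is in memory.
for dirpath, dirnames, filenames in os.walk(START_FOLDER):
    pass  # process filenames here

# Not lazy: this forces the entire tree into one big list up front.
all_entries = list(os.walk(START_FOLDER))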
How large is len(FOLDER_WITH_KEYWORDS_DELETION_EXCEPTION_LIST)? You can hoist the name.lower() call out of the innermost loop, which can help if the keywords list is very large. – Ns
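A sketch of that hoisting applied to the loop above, keeping the original names (ignoreList is assumed to be defined earlier; using any() also stops at the first matching keyword, which I assume is the intent):

for dir, dirnames, filenames in os.walk(START_FOLDER):
    for name in dirnames:
        lowered = name.lower()  # hoisted: computed once per folder name
        if any(keyword in lowered for keyword in FOLDER_WITH_KEYWORDS_DELETION_EXCEPTION_LIST):
            ignoreList.append(name)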