Fast way to enumerate all files including sub-folders
Does anyone know of a faster way to enumerate through a directory and sub-folders to gather all the files in the enumeration? This is what I have right now:

Public Shared allFiles() As String = Directory.GetFiles(<ServerLocation>, "*.*", SearchOption.AllDirectories)

Thanks! JFV

EDIT: I am enumerating these files from a server location. I don't know if that will change the perspective of this question or not. Thanks for all the input so far!

Betancourt answered 28/6, 2009 at 4:28 Comment(0)

Short answer:

If this code is functionally correct for your project and you haven't proved it to be a bottleneck with a profiler, then don't change it. Keep the functionally correct solution until you can show it is actually slow.

Long answer:

How fast or slow this particular piece of code is depends on a lot of factors, many of which depend on the specific machine you are running on (for instance, hard-drive speed). With code that involves the file system and nothing else, it's very difficult to say "x is faster than y" with any degree of certainty.

In this case, I can really only comment on one thing. The return type of this method is an array of strings (the file paths). Arrays require contiguous memory, and very large arrays can cause fragmentation issues on your heap. If you are reading extremely large directories, this could lead to heap fragmentation and, indirectly, to performance problems.

If that turns out to be a problem, you can P/Invoke FindFirstFile / FindNextFile and retrieve the entries one at a time. The result will likely be slower in raw CPU cycles but will put far less pressure on the managed heap.

But I must stress that you should prove these are problems before you fix them.
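If memory pressure rather than CPU time turns out to be the concern, a managed alternative to the P/Invoke route is the lazy enumeration API that .NET 4 added. A minimal C# sketch (the UNC path is a placeholder, not from the question):

using System;
using System.IO;

class LazyEnumerationSketch
{
    static void Main()
    {
        // EnumerateFiles yields one path at a time instead of building a
        // single large array, so memory use stays flat even for very large
        // directory trees. Replace the placeholder with your server location.
        foreach (string path in Directory.EnumerateFiles(
            @"\\server\share", "*.*", SearchOption.AllDirectories))
        {
            Console.WriteLine(path);
        }
    }
}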

Morphine answered 28/6, 2009 at 4:39 Comment(1)
Yet at the same time, anyone who has tried to list all files in a large directory with .NET will know that it IS a performance problem. Listing 1,000 folder names, just the folder names, can take almost a full second. – Spano

using System.Collections.Generic;
using System.IO;

private static List<string> GetFilesRecursive(string rootDirectory)
{
    // Results are collected here.
    List<string> result = new List<string>();

    // Use an explicit stack of directories instead of recursion.
    Stack<string> stack = new Stack<string>();

    // Seed it with the starting directory.
    stack.Push(rootDirectory);

    // Continue while there are directories left to process.
    while (stack.Count > 0)
    {
        string dir = stack.Pop();

        try
        {
            // Add all files in this directory to the result list.
            result.AddRange(Directory.GetFiles(dir, "*.*"));

            // Push each subdirectory so it gets processed in turn.
            foreach (string subDir in Directory.GetDirectories(dir))
            {
                stack.Push(subDir);
            }
        }
        catch (UnauthorizedAccessException)
        {
            // Skip directories we are not allowed to open.
        }
    }

    return result;
}

Props to the original article: http://www.codeproject.com/KB/cs/workerthread.aspx

Jansen answered 16/11, 2010 at 15:41 Comment(0)

This is a crude way of doing it.

dir /s /b

Capture the output of this command into a text file (or read it directly), then split it on \r\n.
Run the command against a specific directory to see whether it helps.

To only get the directories

dir /s /b /ad

To only get the files

dir /s /b /a-d
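As a sketch of the read-and-split step, the following launches dir through cmd.exe and splits the captured output in memory rather than going through a text file (the folder path is a placeholder):

using System;
using System.Diagnostics;

class DirListingSketch
{
    static void Main()
    {
        // NOTE: "C:\SomeFolder" is a placeholder path.
        var psi = new ProcessStartInfo("cmd.exe", "/c dir /s /b \"C:\\SomeFolder\"")
        {
            RedirectStandardOutput = true,
            UseShellExecute = false
        };

        using (Process proc = Process.Start(psi))
        {
            // Read everything dir printed, then split on CR/LF pairs.
            string output = proc.StandardOutput.ReadToEnd();
            proc.WaitForExit();

            string[] files = output.Split(
                new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
            Console.WriteLine(files.Length + " entries found.");
        }
    }
}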

EDIT: Jared is right in saying not to use other approaches unless your approach is proved slow.

Erasme answered 28/6, 2009 at 5:40 Comment(1)
Could you please clarify why you think invoking dir and reading the output would be faster than a Directory.GetFiles call? – Gean

Here's my solution. The initial start-up is a little slow; I'm working on that. The My.Computer.FileSystem object is probably the cause of the slow start-up. Still, this method will list 31,000 files in under 5 minutes over a network.

Imports System.Threading

Public Class ThreadWork

    Public Shared Sub DoWork()
        Dim i As Integer = 1
        For Each File As String In My.Computer.FileSystem.GetFiles("\\172.16.1.66\usr2\syscon\S4_650\production\base_prog", FileIO.SearchOption.SearchTopLevelOnly, "*.S4")
            Console.WriteLine(i & ". " & File)
            i += 1
        Next
    End Sub 'DoWork

End Class 'ThreadWork

Module Module1

    Sub Main()
        Dim myThreadDelegate As New ThreadStart(AddressOf ThreadWork.DoWork)
        Dim myThread As New Thread(myThreadDelegate)
        myThread.Start()

        ' "Pause" the console so the output can be read.
        Console.ReadLine()
    End Sub 'Main

End Module


Madewell answered 6/12, 2011 at 14:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.