Excel VBA efficient get file names function
Asked Answered
D

3

8

I need to get a collection of file names from a folder on a remote server using VBA in excel 2010. I have a function that works and in the majority of cases it would do the job, however the remote server frequently has terrible, terrible network performance issues. This means that looping through say 300 files to put their names into a collection can take 10 minutes, the number of files in the folder is likely to grow to thousands so this is not workable, I need a way to get all of the file names in a single network request and not looping. I believe its connecting to the remote server that is taking the time so a single request should be able to get all of the files in one pass fairly quickly.

This is the function I currently have in place:

Private Function GetFileNames(sPath As String) As Collection
'takes a path and returns a collection of the file names in the folder

Dim oFolder     As Object
Dim oFile       As Object
Dim oFSO        As Object
Dim colList     As New Collection

Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oFolder = oFSO.GetFolder(folderpath:=sPath)

For Each oFile In oFolder.Files
    colList.Add oFile.Name
Next oFile

Set GetFileNames = colList

Set oFolder = Nothing
Set oFSO = Nothing

End Function
Drowse answered 8/10, 2014 at 17:34 Comment(1)
+ 1 Good question :) You almost had me thinking!Famine
D
2

Ok, I have found a solution that works for my situation and perhaps others will find it useful too. This soution uses the windows API and gets me the filenames in 1 second or less where as the FSO method was taking several minutes. It still involves a loop so i'm not certain why it is so much faster but it is.

This takes a path like "c:\windows\" and returns a collection of all the files (and directories) in that folder. The exact parameters I have used require windows 7 or newer, see the comments in the declarations.

'for windows API call to FindFirstFileEx
Private Const INVALID_HANDLE_VALUE = -1
Private Const MAX_PATH = 260

Private Type FILETIME
    dwLowDateTime   As Long
    dwHighDateTime  As Long
End Type

Private Type WIN32_FIND_DATA
    dwFileAttributes    As Long
    ftCreationTime      As FILETIME
    ftLastAccessTime    As FILETIME
    ftLastWriteTime     As FILETIME
    nFileSizeHigh       As Long
    nFileSizeLow        As Long
    dwReserved0         As Long
    dwReserved1         As Long
    cFileName           As String * MAX_PATH
    cAlternate          As String * 14
End Type

Private Const FIND_FIRST_EX_CASE_SENSITIVE  As Long = 1
'MSDN: "This value is not supported until Windows Server 2008 R2 and Windows 7."
Private Const FIND_FIRST_EX_LARGE_FETCH     As Long = 2

Private Enum FINDEX_SEARCH_OPS
    FindExSearchNameMatch
    FindExSearchLimitToDirectories
    FindExSearchLimitToDevices
End Enum

Private Enum FINDEX_INFO_LEVELS
    FindExInfoStandard
    FindExInfoBasic 'MSDN: "This value is not supported until Windows Server 2008 R2 and Windows 7."
    FindExInfoMaxInfoLevel
End Enum

Private Declare Function FindFirstFileEx Lib "kernel32" Alias "FindFirstFileExA" ( _
ByVal lpFileName As String, ByVal fInfoLevelId As Long, lpFindFileData As WIN32_FIND_DATA, _
    ByVal fSearchOp As Long, ByVal lpSearchFilter As Long, ByVal dwAdditionalFlags As Long) As Long
Private Declare Function FindNextFile Lib "kernel32" Alias "FindNextFileA" ( _
    ByVal hFindFile As Long, lpFindFileData As WIN32_FIND_DATA) As Long
Private Declare Function FindClose Lib "kernel32" (ByVal hFindFile As Long) As Long


Private Function GetFiles(ByVal sPath As String) As Collection

    Dim fileInfo    As WIN32_FIND_DATA  'buffer for file info
    Dim hFile       As Long             'file handle
    Dim colFiles    As New Collection

    sPath = sPath & "*.*"

    hFile = FindFirstFileEx(sPath & vbNullChar, FindExInfoBasic, fileInfo, FindExSearchNameMatch, 0&, FIND_FIRST_EX_LARGE_FETCH)

    If hFile <> INVALID_HANDLE_VALUE Then
        Do While FindNextFile(hFile, fileInfo)
            colFiles.Add Left(fileInfo.cFileName, InStr(fileInfo.cFileName, vbNullChar) - 1)
        Loop

        FindClose hFile
    End If

    Set GetFiles = colFiles

End Function
Drowse answered 13/10, 2014 at 11:19 Comment(0)
G
9

This one is lightning fast:

  Sub filesTest()
    Dim x() As String
    x = Function_FileList("YOUR_PATH_AND_FOLDER_NAME")
    Debug.Print Join(x, vbCrLf)
  End Sub

Which calls this function:

 Function Function_FileList(FolderLocation As String)
    Function_FileList = Filter(Split(CreateObject("wscript.shell").exec("cmd /c Dir """ & FolderLocation & """ /b /a-d").stdout.readall, vbCrLf), ".")
 End Function
Goy answered 8/10, 2014 at 18:19 Comment(7)
+ 1 Simply beautiful!Famine
It's no faster if you have a slow network connection or lots of files. dir iterates internally, and running it through exec means it runs on your local machine and suffers the same network latency.Calotte
@KenWhite What if the above code is put in a batch file and the file is then copied to the remote folder and then run from there?Famine
@SiddharthRout: Unless you use rexec or something similar, it will still run on the local machine and retrieve the list over the network. The question was not about running a process on a remote machine and then transferring a file, however; it was about retrieving a directory list without iterating, and that simply isn't possible. (Raymond Chen wrote a series of articles about this issue as related to Explorer retrieving file lists over slow network connections, but I don't have a link to them handy.)Calotte
@KenWhite: Hmm, What if then not retrieving the list over network but outputting the list in the remote folder to a text file and then retrieving that textfile?Famine
@SiddharthRout: See my comment to your answer. :-)Calotte
@tbur: That's an interesting approach, I'd love to test it to compare the performance, however I cannot invoke CMD or any executable that isn't white-listed for my user account. I tried running the function but got a run-time error "This program is blocked by group policy". Any solution would have to be limited to the office object model, VBA, or a COM library that is already present.Drowse
D
2

Ok, I have found a solution that works for my situation and perhaps others will find it useful too. This soution uses the windows API and gets me the filenames in 1 second or less where as the FSO method was taking several minutes. It still involves a loop so i'm not certain why it is so much faster but it is.

This takes a path like "c:\windows\" and returns a collection of all the files (and directories) in that folder. The exact parameters I have used require windows 7 or newer, see the comments in the declarations.

'for windows API call to FindFirstFileEx
Private Const INVALID_HANDLE_VALUE = -1
Private Const MAX_PATH = 260

Private Type FILETIME
    dwLowDateTime   As Long
    dwHighDateTime  As Long
End Type

Private Type WIN32_FIND_DATA
    dwFileAttributes    As Long
    ftCreationTime      As FILETIME
    ftLastAccessTime    As FILETIME
    ftLastWriteTime     As FILETIME
    nFileSizeHigh       As Long
    nFileSizeLow        As Long
    dwReserved0         As Long
    dwReserved1         As Long
    cFileName           As String * MAX_PATH
    cAlternate          As String * 14
End Type

Private Const FIND_FIRST_EX_CASE_SENSITIVE  As Long = 1
'MSDN: "This value is not supported until Windows Server 2008 R2 and Windows 7."
Private Const FIND_FIRST_EX_LARGE_FETCH     As Long = 2

Private Enum FINDEX_SEARCH_OPS
    FindExSearchNameMatch
    FindExSearchLimitToDirectories
    FindExSearchLimitToDevices
End Enum

Private Enum FINDEX_INFO_LEVELS
    FindExInfoStandard
    FindExInfoBasic 'MSDN: "This value is not supported until Windows Server 2008 R2 and Windows 7."
    FindExInfoMaxInfoLevel
End Enum

Private Declare Function FindFirstFileEx Lib "kernel32" Alias "FindFirstFileExA" ( _
ByVal lpFileName As String, ByVal fInfoLevelId As Long, lpFindFileData As WIN32_FIND_DATA, _
    ByVal fSearchOp As Long, ByVal lpSearchFilter As Long, ByVal dwAdditionalFlags As Long) As Long
Private Declare Function FindNextFile Lib "kernel32" Alias "FindNextFileA" ( _
    ByVal hFindFile As Long, lpFindFileData As WIN32_FIND_DATA) As Long
Private Declare Function FindClose Lib "kernel32" (ByVal hFindFile As Long) As Long


Private Function GetFiles(ByVal sPath As String) As Collection

    Dim fileInfo    As WIN32_FIND_DATA  'buffer for file info
    Dim hFile       As Long             'file handle
    Dim colFiles    As New Collection

    sPath = sPath & "*.*"

    hFile = FindFirstFileEx(sPath & vbNullChar, FindExInfoBasic, fileInfo, FindExSearchNameMatch, 0&, FIND_FIRST_EX_LARGE_FETCH)

    If hFile <> INVALID_HANDLE_VALUE Then
        Do While FindNextFile(hFile, fileInfo)
            colFiles.Add Left(fileInfo.cFileName, InStr(fileInfo.cFileName, vbNullChar) - 1)
        Loop

        FindClose hFile
    End If

    Set GetFiles = colFiles

End Function
Drowse answered 13/10, 2014 at 11:19 Comment(0)
F
0

I thought that there would be an API which could get me the file names in a directory without looping but couldn't find it. All the code that I know of involves looping either using fso or dir.

So is it possible to get the file names without looping. I guess yes... Here is one way I can think of...

When you type the below command in DOS Prompt, the entire file structure is sent to a text file

Dir C:\Temp\*.* > C:\Temp\MyFile.Txt

Doing the above from VBA

Sub Sample()
    Dim sPath As String

    sPath = "C:\Temp\"

    '~~> DIR C:\Temp\*.* > C:\Temp\MyFile.txt
    retval = Shell("cmd.exe /c Dir " & sPath & "*.* > " & sPath & "MyFile.Txt")
End Sub

For example (This is what is stored in MyFile.Txt)

Volume in drive C is XXXXXXX
Volume Serial Number is XXXXXXXXX

Directory of C:\Temp

10/08/2014  11:28 PM    <DIR>          .
10/08/2014  11:28 PM    <DIR>          ..
10/08/2014  11:27 PM               832 aaa.txt
10/08/2014  11:28 PM                 0 bbb.txt
10/08/2014  11:26 PM                 0 New Bitmap Image.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_2.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_2_2.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_3.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_3_2.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_4.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_4_2.bmp
10/08/2014  11:26 PM                 0 New Bitmap Image_5.bmp
            10 File(s)            832 bytes
             2 Dir(s)  424,786,952,192 bytes free

So now all you need to do is copy the text file from the remote folder to your folder and simply parse it to get the file names.

Famine answered 8/10, 2014 at 18:10 Comment(7)
This still runs the dir command from the local machine and requests the file listing across the network. Running it via cmd.exe still executes it locally. You'd have to copy a batch file or script across the network, execute it remotely using rexec or something similar, and then transfer the resulting file across the network once that remote process finished (meaning you'd have to wait and poll for it to complete).Calotte
True but I guess this is the only option OP has for the time being?Famine
It's not going to be an improvement. :-) The overhead of launching the file via rexec, polling, and then transferring the text file (and then parsing the text file to get the file list) is going to have a performance impact.Calotte
I just had a thought... what if we give the full path of the cmd (of that machine) so that it doesn't run the cmd from the local machine?Famine
Won't work. cmd would still run on the local machine. In order to run it on the remote machine, you have to actually execute it on the remote machine (in that machine's own operating system), not the local one. Running it from the local machine with a remote path would simply result in the entire application (cmd.exe) being retrieved in order to execute on the local OS.Calotte
ah I see... I was not aware of that.Famine
@SiddharthRout: Thanks for this, however I cannot use CMD, see my comment to the answer above.Drowse

© 2022 - 2024 — McMap. All rights reserved.