Multiple file-extensions searchPattern for System.IO.Directory.GetFiles
R

22

170

What is the syntax for setting multiple file extensions as the searchPattern on Directory.GetFiles()? For example, returning only files with the .aspx and .ascx extensions.

// TODO: Set the string 'searchPattern' to only get files with
// the extension '.aspx' and '.ascx'.
var filteredFiles = Directory.GetFiles(path, searchPattern);

Update: LINQ is not an option, it has to be a searchPattern passed into GetFiles, as specified in the question.

Roots answered 12/8, 2011 at 11:42 Comment(5)
I don't think there is any. Either list all files and then filter manually, or perform a union on multiple searches. But I'm pretty sure I have seen this exact question on SO before.Googly
#3527703Googly
Previously asked and answered here: #163662Fenderson
Why would LINQ ever not be an option? It's a very common library in .NET and should be used when necessary.Khmer
@MarkEntingh Using LINQ requires the code to iterate over all files on disk and filter them "client-side", whereas passing a string (or similar) to something at the OS API level might let the filesystem or OS use some kind of smart cache to improve performance.Roots
S
45

I believe there is no "out of the box" solution; that's a limitation of the Directory.GetFiles method.

It's fairly easy to write your own method, though. The code could be:

/// <summary>
/// Returns the names of files in the given folder that match any of the given filters
/// </summary>
/// <param name="SourceFolder">Folder with files to retrieve</param>
/// <param name="Filter">Multiple file filters separated by the | character</param>
/// <param name="searchOption">System.IO.SearchOption,
/// either AllDirectories or TopDirectoryOnly</param>
/// <returns>Array of file names that match at least one of the given filters</returns>
public string[] getFiles(string SourceFolder, string Filter,
    System.IO.SearchOption searchOption)
{
    // ArrayList (from System.Collections) will hold all file names
    ArrayList alFiles = new ArrayList();

    // Create an array of filter strings
    string[] MultipleFilters = Filter.Split('|');

    // For each filter, find matching file names
    foreach (string FileFilter in MultipleFilters)
    {
        // Add found file names to the array list
        alFiles.AddRange(Directory.GetFiles(SourceFolder, FileFilter, searchOption));
    }

    // Return a string array of the matching file names
    return (string[])alFiles.ToArray(typeof(string));
}
Socialminded answered 12/8, 2011 at 11:47 Comment(1)
This is a very inefficient way of doing it, since you will loop over the entire directory for each filter. Instead, you should check whether each file matches a filter and then add it to the list. You can use the answer explained in this thread: #3754618Dextran
I
208
var filteredFiles = Directory
    .GetFiles(path, "*.*")
    .Where(file => file.ToLower().EndsWith("aspx") || file.ToLower().EndsWith("ascx"))
    .ToList();

Edit 2014-07-23

You can do this in .NET 4 and later for a faster enumeration:

var filteredFiles = Directory
    .EnumerateFiles(path) //<--- .NET 4 and later
    .Where(file => file.ToLower().EndsWith("aspx") || file.ToLower().EndsWith("ascx"))
    .ToList();

Directory.EnumerateFiles in MSDN

Isobel answered 12/8, 2011 at 11:47 Comment(13)
+1 beat me to it :) does the double pipe need another enclosing bracket though?Bovid
Are you sure that's correct? The GetFiles method returns an array of FileInfo objects, not strings. How can you apply the EndsWith method?Rebirth
@Mario Vernari: GetFiles returns string[].Isobel
@Jgauffin: You are right! Sorry, I mismatched with DirectoryInfo. I have corrected my post as well.Rebirth
You must remove the * from the EndsWith() argument, it doesn't do wildcard matches.Westward
If you compare the file extensions instead, you get an exact match, like '.Where(file => new FileInfo(file).Extension.Equals(".aspx") || new FileInfo(file).Extension.Equals(".ascx"))'Lubricate
@Lubricate Indeed but I think it will be heavierStigmasterol
Don't forget the new .NET4 Directory.EnumerateFiles for a performance boost... #5670117Puritan
And you can always use file.EndsWith("...", StringComparison.InvariantCultureIgnoreCase); rather than ToLowerPuritan
Last comment -- I think combining multiple searches is much faster: "*.ext1;*.ext2".Split(';').SelectMany(g => Directory.GetFiles(path, g)).ToList()Puritan
@drzaus: Why don't you put that in a new answer. I would upvote it.Isobel
@Isobel done, thank you. i <3 points. https://mcmap.net/q/87292/-multiple-file-extensions-searchpattern-for-system-io-directory-getfilesPuritan
Path.GetExtension() is better for getting the file extension; at the least, you should add a dot before the extension, like ".aspx"Caliginous
P
50

I like this method, because it is readable and avoids multiple iterations of the directory:

var allowedExtensions = new [] {".doc", ".docx", ".pdf", ".ppt", ".pptx", ".xls", ".xlsx"}; 
var files = Directory
    .GetFiles(folder)
    .Where(file => allowedExtensions.Any(file.ToLower().EndsWith))
    .ToList();
Pacific answered 6/5, 2015 at 16:23 Comment(3)
I like this a lot better because I don't have to parse my extension array and add it to regex or other manual work. Thanks!Heathenry
@Jodrell, or simply a HashSet<string>Guarded
HashSet<string> instead of an array for the extensions makes no sense here, as the number of extensions is limited and the array gets iterated for each file until EndsWith() returns true. If the method needs to be tuned for performance with a very large number of extensions, a HashSet could be used. For that to take effect, the extension of each file would need to be matched explicitly (split, then match) instead of using the EndsWith() method. This would harm readability and would be of no significant use in most, if not all, real-life use cases. I therefore rolled back the community edit.Pacific
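For readers curious what the "split, then match" variant from the comment above might look like, here is a minimal sketch. The class and method names (ExtensionFilter, HasAllowedExtension, GetFiles) are hypothetical and not taken from any answer; it assumes the allowed extensions are given with a leading dot and should be compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ExtensionFilter
{
    // Hypothetical helper: true when the path's extension is in the allowed set.
    // Path.GetExtension returns the extension including the leading dot (e.g. ".pdf"),
    // or an empty string when the file name has no extension.
    public static bool HasAllowedExtension(string file, ISet<string> allowed)
    {
        return allowed.Contains(Path.GetExtension(file));
    }

    // Enumerates the directory once and keeps only files with an allowed extension.
    public static List<string> GetFiles(string folder, IEnumerable<string> extensions)
    {
        // The comparer makes the set lookup case-insensitive, so no ToLower() is needed.
        var allowed = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);
        return Directory.EnumerateFiles(folder)
            .Where(file => HasAllowedExtension(file, allowed))
            .ToList();
    }
}
```

A call such as ExtensionFilter.GetFiles(folder, new[] { ".doc", ".pdf" }) reads the directory only once. Unlike EndsWith("pdf"), it does not match a file that is simply named "mypdf". For a handful of extensions the array version above is likely just as fast.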
G
31

GetFiles can only match a single pattern, but you can use LINQ to invoke GetFiles with multiple patterns:

// di is the DirectoryInfo for the folder to search
var di = new DirectoryInfo(path);
FileInfo[] fi = new string[]{"*.txt","*.doc"}
    .SelectMany(i => di.GetFiles(i, SearchOption.AllDirectories))
    .ToArray();

See comments section here: http://www.codeproject.com/KB/aspnet/NET_DirectoryInfo.aspx

Gobo answered 12/8, 2011 at 11:50 Comment(2)
They'll collide if the patterns overlap. E.g., new string[]{"*.txt","filename.*"}. However, the call to Distinct doesn't actually resolve this problem, since FileInfo objects compare using reference equality, not semantic equality. It could be fixed by either removing the Distinct or passing it an IEqualityComparer<FileInfo>. Edited to do the former.Carious
I would think that SelectMany will iterate over the same file structure again (and again) so it might be sub-optimal in terms of performance.Lasagne
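The duplicate problem with overlapping patterns mentioned in the first comment can be avoided by working with path strings rather than FileInfo objects, since strings compare by value. The following is only a sketch under that assumption; MultiPatternSearch and its GetFiles method are hypothetical names, and it still runs one search per pattern, so the performance concern in the second comment applies.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MultiPatternSearch
{
    // Hypothetical helper: runs one search per pattern and removes duplicate paths.
    public static List<string> GetFiles(string path, params string[] patterns)
    {
        return patterns
            // Each pattern triggers its own enumeration of the directory tree.
            .SelectMany(p => Directory.EnumerateFiles(path, p, SearchOption.AllDirectories))
            // Distinct compares the path strings by value, so a file matched by
            // two overlapping patterns appears only once in the result.
            .Distinct()
            .ToList();
    }
}
```

For example, MultiPatternSearch.GetFiles(path, "*.txt", "filename.*") returns filename.txt once, even though both patterns match it.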
P
16
var filteredFiles = Directory
    .EnumerateFiles(path, "*.*") // .NET4 better than `GetFiles`
    .Where(
        // ignorecase faster than tolower...
        file => file.ToLower().EndsWith("aspx")
        || file.EndsWith("ascx", StringComparison.OrdinalIgnoreCase))
    .ToList();

Or, it may be faster to split and merge your globs (at least it looks cleaner):

"*.ext1;*.ext2".Split(';')
    .SelectMany(g => Directory.EnumerateFiles(path, g))
    .ToList();
Puritan answered 12/11, 2013 at 15:42 Comment(1)
and reposting on "original" question with more detail -- #163662Puritan
G
15

I fear you will have to do something like this; I mutated the regex from here.

var searchPattern = new Regex(
    @"$(?<=\.(aspx|ascx))", 
    RegexOptions.IgnoreCase);
var files = Directory.EnumerateFiles(path)
    .Where(f => searchPattern.IsMatch(f))
    .ToList();
Guarded answered 12/8, 2011 at 12:7 Comment(1)
this seems to be a nice approach, the missing part is to have a tested (working) regular expressionStigmasterol
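Regarding the comment about a tested regular expression: a simpler equivalent pattern can be checked in isolation against a few sample names before it is applied to a directory. This is a sketch, not the answerer's code; the ExtensionRegex class is a hypothetical wrapper, and the pattern assumes that only a trailing ".aspx" or ".ascx" should match.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtensionRegex
{
    // The escaped dot requires a literal "." before the extension, and "$"
    // anchors the match at the end of the name. IgnoreCase covers ".ASPX" etc.
    private static readonly Regex Pattern =
        new Regex(@"\.(aspx|ascx)$", RegexOptions.IgnoreCase);

    // Hypothetical helper: true when the file name ends in .aspx or .ascx.
    public static bool IsMatch(string fileName)
    {
        return Pattern.IsMatch(fileName);
    }
}
```

With this pattern, "Default.ASPX" and "menu.ascx" match, while "page.aspx.bak" and "fooaspx" do not.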
B
12

The easy-to-remember, lazy and perhaps imperfect solution:

Directory.GetFiles(dir, "*.dll").Union(Directory.GetFiles(dir, "*.exe"))
Bankable answered 28/3, 2016 at 16:28 Comment(0)
R
4

I would use the following:

var ext = new string[] { ".ASPX", ".ASCX" };
FileInfo[] collection = (from fi in new DirectoryInfo(path).GetFiles()
                         where ext.Contains(fi.Extension.ToUpper())
                         select fi)
                         .ToArray();

EDIT: corrected due to a mismatch between Directory and DirectoryInfo

Rebirth answered 12/8, 2011 at 11:55 Comment(0)
D
3

I would try to specify something like

var searchPattern = "*.as?x";

It should work.

Dorwin answered 12/8, 2011 at 11:48 Comment(1)
Hah! I was afraid that aspx and ascx was too similar and would render a hack-solution like this. I want something general.Roots
P
3

A more efficient way of getting files with the extensions ".aspx" and ".ascx" that avoids querying the file system several times and avoids returning a lot of undesired files, is to pre-filter the files by using an approximate search pattern and to refine the result afterwards:

var filteredFiles = Directory.GetFiles(path, "*.as?x")
    .Select(f => f.ToLowerInvariant())
    .Where(f => f.EndsWith("px") || f.EndsWith("cx"))
    .ToList();
Ptolemaeus answered 9/10, 2015 at 14:40 Comment(0)
B
3

You can do it like this

new DirectoryInfo(path).GetFiles().Where(Current => Regex.IsMatch(Current.Extension, "\\.(aspx|ascx)", RegexOptions.IgnoreCase))
Berte answered 6/5, 2020 at 8:30 Comment(2)
The question says that LINQ is not an option, so this answer is not usefulDzoba
Many questions here use LINQ, even the highest voted one, so tbh this comment isn't very usefulOrchestral
S
2
    /// <summary>
    /// Returns the names of files in the specified directories that match the specified patterns, using LINQ
    /// </summary>
    /// <param name="srcDirs">The directories to search</param>
    /// <param name="searchPatterns">The list of search patterns</param>
    /// <param name="searchOption">Whether to search subdirectories (default: AllDirectories)</param>
    /// <returns>The list of files that match the specified patterns</returns>
    public static string[] GetFilesUsingLINQ(string[] srcDirs,
         string[] searchPatterns,
         SearchOption searchOption = SearchOption.AllDirectories)
    {
        var r = from dir in srcDirs
                from searchPattern in searchPatterns
                from f in Directory.GetFiles(dir, searchPattern, searchOption)
                select f;

        return r.ToArray();
    }
Skeptical answered 15/12, 2011 at 7:32 Comment(0)
I
2
    public static bool CheckFiles(string pathA, string pathB)
    {
        string[] extantionFormat = new string[] { ".war", ".pkg" };
        return CheckFiles(pathA, pathB, extantionFormat);
    }
    public static bool CheckFiles(string pathA, string pathB, string[] extantionFormat)
    {
        System.IO.DirectoryInfo dir1 = new System.IO.DirectoryInfo(pathA);
        System.IO.DirectoryInfo dir2 = new System.IO.DirectoryInfo(pathB);
        // Take a snapshot of the file system:
        // fileInfosA will contain all files under pathA, including subdirectories
        FileInfo[] fileInfosA = dir1.GetFiles("*.*", 
                              System.IO.SearchOption.AllDirectories);
        // list1 will contain only the files whose extension (compared in
        // lower case) matches an entry in extantionFormat, e.g. WAR or PKG files
        List<System.IO.FileInfo> list1 = (from extItem in extantionFormat
                                          from fileItem in fileInfosA
                                          where extItem.ToLower().Equals 
                                          (fileItem.Extension.ToLower())
                                          select fileItem).ToList();
        // Take a snapshot of the file system:
        // fileInfosB will contain all files under pathB, including subdirectories
        FileInfo[] fileInfosB = dir2.GetFiles("*.*", 
                                       System.IO.SearchOption.AllDirectories);
        // list2 will contain only the files whose extension (compared in
        // lower case) matches an entry in extantionFormat, e.g. WAR or PKG files
        List<System.IO.FileInfo> list2 = (from extItem in extantionFormat
                                          from fileItem in fileInfosB
                                          where extItem.ToLower().Equals            
                                          (fileItem.Extension.ToLower())
                                          select fileItem).ToList();
        FileCompare myFileCompare = new FileCompare();
        // This query determines whether the two folders contain 
        // identical file lists, based on the custom file comparer 
        // that is defined in the FileCompare class. 
        return list1.SequenceEqual(list2, myFileCompare);
    }
Inappreciative answered 28/3, 2013 at 7:55 Comment(0)
C
2

Instead of the EndsWith function, I would use the Path.GetExtension() method. Here is the full example:

var filteredFiles = Directory.EnumerateFiles( path )
.Where(
    file => Path.GetExtension(file).Equals( ".aspx", StringComparison.OrdinalIgnoreCase ) ||
            Path.GetExtension(file).Equals( ".ascx", StringComparison.OrdinalIgnoreCase ) );

or:

var filteredFiles = Directory.EnumerateFiles(path)
.Where(
    file => string.Equals( Path.GetExtension(file), ".aspx", StringComparison.OrdinalIgnoreCase ) ||
            string.Equals( Path.GetExtension(file), ".ascx", StringComparison.OrdinalIgnoreCase ) );

(Use StringComparison.OrdinalIgnoreCase if you care about performance: MSDN string comparisons)

Cytochemistry answered 12/10, 2015 at 12:25 Comment(0)
L
1

It could look like this demo:

void Main()
{
    foreach(var f in GetFilesToProcess("c:\\", new[] {".xml", ".txt"}))
        Debug.WriteLine(f);
}
private static IEnumerable<string> GetFilesToProcess(string path, IEnumerable<string> extensions)
{
   return Directory.GetFiles(path, "*.*")
       .Where(f => extensions.Contains(Path.GetExtension(f).ToLower()));
}
Lenora answered 8/11, 2013 at 10:10 Comment(1)
You have Path.GetExtension which you can use.Isobel
D
1

@Daniel B, thanks for the suggestion to write my own version of this function. It has the same behavior as Directory.GetFiles, but supports regex filtering.

string[] FindFiles(FolderBrowserDialog dialog, string pattern)
{
    Regex regex = new Regex(pattern);

    // Matching files are collected separately from the full directory listing
    List<string> matches = new List<string>();
    string[] files = Directory.GetFiles(dialog.SelectedPath);
    for (int i = 0; i < files.Length; i++)
    {
        if (regex.IsMatch(files[i]))
        {
            matches.Add(files[i]);
        }
    }

    return matches.ToArray();
}

I found it useful, so I thought I'd share.

Drawbar answered 2/7, 2016 at 16:22 Comment(0)
C
1

C# version of @qfactor77's answer. This is the best way without LINQ.

string[] wildcards= {"*.mp4", "*.jpg"};
ReadOnlyCollection<string> filePathCollection = FileSystem.GetFiles(dirPath, Microsoft.VisualBasic.FileIO.SearchOption.SearchAllSubDirectories, wildcards);
string[] filePath=new string[filePathCollection.Count];
filePathCollection.CopyTo(filePath,0);

Now return the filePath string array. At the top of the file you need

using Microsoft.VisualBasic.FileIO;
using System.Collections.ObjectModel;

You also need to add a reference to Microsoft.VisualBasic.

Collimate answered 19/3, 2018 at 18:53 Comment(0)
M
1

Here is a simple way to search for as many extensions as you need, with no ToLower(), RegEx, foreach...

List<String> myExtensions = new List<String>() { ".aspx", ".ascx", ".cs" }; // You can add as many extensions as you want.
DirectoryInfo myFolder = new DirectoryInfo(@"C:\FolderFoo");
SearchOption option = SearchOption.TopDirectoryOnly; // Use SearchOption.AllDirectories to search all subfolders.
List<FileInfo> myFiles = myFolder.EnumerateFiles("*.*", option)
    .Where(file => myExtensions
    .Any(e => String.Compare(file.Extension, e, CultureInfo.CurrentCulture, CompareOptions.IgnoreCase) == 0))
    .ToList();

Working on .Net Standard 2.0.

Morbidezza answered 14/2, 2019 at 16:58 Comment(0)
F
0
var filtered = Directory.GetFiles(path)
    .Where(file => file.EndsWith("aspx", StringComparison.InvariantCultureIgnoreCase) || file.EndsWith("ascx", StringComparison.InvariantCultureIgnoreCase))
    .ToList();
Foolproof answered 9/9, 2015 at 11:41 Comment(1)
Add additional explanation for the code. It might help OP understand your answer better.Bellaude
E
0

I would just like to say that if you use FileIO.FileSystem.GetFiles instead of Directory.GetFiles, it will accept an array of wildcards.

For example:

Dim wildcards As String() = {"*.html", "*.zip"}
Dim ListFiles As List(Of String) = FileIO.FileSystem.GetFiles(directoryyouneed, FileIO.SearchOption.SearchTopLevelOnly, wildcards).ToList
Engrossment answered 20/3, 2017 at 12:37 Comment(2)
Where does one acquire FileIO ?Extradite
It should already be included in your environment in Visual Studio (2015). It is part of the Microsoft.VisualBasic namespace. In my case it is Visual Basic because that's my language of choice.Engrossment
S
0

(Sorry to write this as an answer, but I don't have privileges to write comments yet.)

Note that the FileIO.FileSystem.GetFiles() method from Microsoft.VisualBasic is just a wrapper that executes a search for each provided pattern and merges the results. When checking the source from the .pdb file, you can see from this fragment that FileSystem.FindPaths is executed for each pattern in the collection:

private static void FindFilesOrDirectories(
  FileSystem.FileOrDirectory FileOrDirectory,
  string directory,
  SearchOption searchType,
  string[] wildcards,
  Collection<string> Results)
{
    // (...)
    string[] strArray = wildcards;
    int index = 0;
    while (index < strArray.Length)
    {
      string wildCard = strArray[index];
      FileSystem.AddToStringCollection(Results, FileSystem.FindPaths(FileOrDirectory, directory, wildCard));
      checked { ++index; }
    }
    // (...)
}
Schwann answered 25/5, 2021 at 11:41 Comment(0)
G
0

According to jonathan's answer (for 2 file extensions):

public static string[] GetFilesList(string dir) =>
    Directory.GetFiles(dir, "*.exe", SearchOption.AllDirectories)
    .Union(Directory.GetFiles(dir, "*.dll", SearchOption.AllDirectories)).ToArray();
    

Or for more file extensions (search in this folder and subfolders):

public static List<string> GetFilesList(string dir, params string[] fileExtensions) {
    List<string> files = new List<string>();
    foreach (string fileExtension in fileExtensions) {
        files.AddRange(Directory.GetFiles(dir, fileExtension, SearchOption.AllDirectories));
    }
    return files;
}

List<string> files = GetFilesList("C:\\", "*.exe", "*.dll");

Finding 1890 matching files among 3250 files took 0.6 seconds.

Gusto answered 18/12, 2022 at 15:50 Comment(0)
