How can I make GetFiles() exclude files with extensions that start with the search extension?
I

4

13

I am using the following line to return specific files...

foreach (FileInfo file in nodeDirInfo.GetFiles("*.sbs", option))

But there are other files in the directory with the extension .sbsar, and it is getting them, too. How can I differentiate between .sbs and .sbsar in the search pattern?

Imparisyllabic answered 27/11, 2013 at 11:13 Comment(2)
You cannot (at least with GetFiles/GetDirectories). This is a "limitation" of the search pattern. You should iterate through the results and filter manually the ones you want.Joses
also take a look at this: msdn.microsoft.com/en-us/library/wz42302f(v=vs.110).aspxSwoosh
Y
7

Try this, filtered using file extension.

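  FileInfo[] files = nodeDirInfo.GetFiles("*", SearchOption.TopDirectoryOnly)
            // Compare the extension case-insensitively, since Windows file names are case-insensitive.
            .Where(f => string.Equals(f.Extension, ".sbs", StringComparison.OrdinalIgnoreCase))
            .ToArray();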
Yeargain answered 27/11, 2013 at 11:30 Comment(0)
L
9

The issue you're experiencing is a limitation of the search pattern in the underlying Win32 API.

A searchPattern with a file extension (for example *.txt) of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern.

My solution is to manually filter the results, using LINQ:

nodeDirInfo.GetFiles("*.sbs", option).Where(f => f.Name.EndsWith(".sbs",
    StringComparison.InvariantCultureIgnoreCase));
Leda answered 27/11, 2013 at 11:18 Comment(1)
You don't account for letter case here.Paramedic
P
6

That's the behaviour of the Win32 API (FindFirstFile) that is underneath GetFiles() being reflected on to you.

You'll need to do your own filtering if you must use GetFiles(). For instance:

GetFiles("*", searchOption).Where(s => s.EndsWith(".sbs", 
    StringComparison.InvariantCultureIgnoreCase));

Or more efficiently:

EnumerateFiles("*", searchOption).Where(s => s.EndsWith(".sbs", 
    StringComparison.InvariantCultureIgnoreCase));

Note that I use StringComparison.InvariantCultureIgnoreCase to deal with the fact that Windows file names are case-insensitive.

If performance is an issue, that is, if the search has to process directories with large numbers of files, then it is more efficient to filter twice: once in the search pattern passed to GetFiles or EnumerateFiles, and once more to discard the unwanted file names. For example:

GetFiles("*.sbs", searchOption).Where(s => s.EndsWith(".sbs", 
    StringComparison.InvariantCultureIgnoreCase));
EnumerateFiles("*.sbs", searchOption).Where(s => s.EndsWith(".sbs", 
    StringComparison.InvariantCultureIgnoreCase));
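
To make the two-stage filter concrete for the DirectoryInfo case from the question, here is a minimal sketch. The class and method names (ExactExtensionFilter, HasExactExtension, GetFilesWithExactExtension) are hypothetical helpers, not part of .NET, and the sketch compares the whole extension with OrdinalIgnoreCase rather than calling EndsWith with InvariantCultureIgnoreCase; treat that as one reasonable choice, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of the .NET API.
public static class ExactExtensionFilter
{
    // True only when the file's extension equals `extension` (e.g. ".sbs"),
    // ignoring case, so "a.sbsar" does not match ".sbs".
    public static bool HasExactExtension(string fileName, string extension)
    {
        return string.Equals(Path.GetExtension(fileName), extension,
            StringComparison.OrdinalIgnoreCase);
    }

    // Stage 1: the search pattern ("*.sbs") narrows the candidates.
    // Stage 2: the Where clause drops over-matches such as ".sbsar".
    public static IEnumerable<FileInfo> GetFilesWithExactExtension(
        DirectoryInfo dir, string extension, SearchOption option)
    {
        return dir.EnumerateFiles("*" + extension, option)
                  .Where(f => HasExactExtension(f.Name, extension));
    }
}
```

With the names from the question, the loop would then read: foreach (FileInfo file in ExactExtensionFilter.GetFilesWithExactExtension(nodeDirInfo, ".sbs", option)).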
Paramedic answered 27/11, 2013 at 11:17 Comment(5)
@Joey That just feels a little dirty to me, duplicating the filter. But perhaps it would have a perf implication. If not then I'd rather have just the one filter.Paramedic
It's faster, though ;-) In my small test here (running over our complete source folder, searching for *.cpp) it's about 10–25 % faster to specify the filter in GetFiles too. EnumerateFiles is slightly slower, but probably uses much less memory, especially for large result sets.Croquet
@Croquet Yes, I think that's reasonable. I guess it comes down to a balance between perf and purity! I've covered this in the answer now.Paramedic
Oh, and I guess .EndsWith(".sbs", StringComparison.InvariantCultureIgnoreCase) would be a better option that's also resistant to culture, as the file system ignores the culture for its case-insensitivity.Croquet
@Croquet Thanks. Showing my ignorance with ToLower()!Paramedic
C
1

It's mentioned in the docs:

When using the asterisk wildcard character in a searchPattern, a searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters. When using the question mark wildcard character, this method returns only files that match the specified file extension.

Chubby answered 27/11, 2013 at 11:20 Comment(4)
It would be marvellous if this were true. Unfortunately, it is not, and here comes a new episode of the poor descriptions of searchPattern in MSDN :) I got curious and did some tests, and here are my conclusions...Joses
@varocarbas Indeed... I wonder where ? is meant to be used. The OP could use *a?.sbs, though that would require an a to be somewhere in the file name.Chubby
nodeDirInfo.GetFiles("5?.txt"); returns any file with just .txt (not .txtwhatever) containing two characters in the name, one of them being a 5. nodeDirInfo.GetFiles("?.txt"); returns any .txt file with just one character in its name (not including .txtwhatever). You can get only *.txt by using a ????.txt approach if you know the maximum length of the file names you are looking for (??.txt returns all the files with 1 or 2 characters in their names; ???.txt all the ones with 1, 2 or 3, etc.).Joses
This was the answer I was hoping would work. But '?.sbs' returned nothing and '*?.sbs' returned all files with 'sbs' in the extension. The only thing these file names have in common is the extension. I imagine that would be the case with many such searches. I agree with varocarbas, the docs are not clear.Imparisyllabic
