Why is .NET's File.Open with a UNC path making excessive SMB calls?

I have a block of code that needs to open and read a lot of small text files from a NAS server using UNC paths. This code is part of a module that was originally written in C++ but is now being converted to C#. The C# version is significantly slower. I determined that the call to open the file accounts for nearly all of the performance difference. Using Wireshark, I found that this is because the System.IO.File.Open call makes far more SMB network requests than the equivalent C++ code.

The C++ code makes this call:

FILE *f = _wfsopen(fileName, L"r", _SH_DENYWR);

This results in the following sequence of SMB requests:

NT Create AndX Request, FID: 0x0004, Path: \\a\\i\\a\\q\\~141106162638847.nmd
NT Create AndX Response, FID: 0x0004
Trans2 Request, QUERY_FILE_INFO, FID: 0x0004, Query File Basic Info
Trans2 Response, FID: 0x0004, QUERY_FILE_INFO
Read AndX Request, FID: 0x0004, 1327 bytes at offset 0
Read AndX Response, FID: 0x0004, 1327 bytes
Close Request, FID: 0x0004
Close Response, FID: 0x0004
NT Create AndX Request, FID: 0x0005, Path: \\a\\i\\a\\q\\~141106162638847.nmd
NT Create AndX Response, FID: 0x0005

The C# code makes this call:

FileStream f = File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read);

This results in the following sequence of SMB requests:

Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i\\a\\q\\~141106162638847.nmd
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a\\i\\a\\q\\~141106162638847.nmd
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i\\a\\q\\~141106162638847.nmd
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: 
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: 
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a
Trans2 Response, FIND_FIRST2, Files: a
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a\\i
Trans2 Response, FIND_FIRST2, Files: i
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a\\i
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a\\i\\a
Trans2 Response, FIND_FIRST2, Files: a
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i\\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a\\i\\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a\\i\\a\\q
Trans2 Response, FIND_FIRST2, Files: q
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i\\a\\q\\~141106162638847.nmd
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a\\i\\a\\q\\~141106162638847.nmd
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i\\a\\q\\~141106162638847.nmd
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: 
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: 
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a
Trans2 Response, FIND_FIRST2, Files: a
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a\\i
Trans2 Response, FIND_FIRST2, Files: i
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a\\i
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a\\i\\a
Trans2 Response, FIND_FIRST2, Files: a
Trans2 Request, QUERY_PATH_INFO, Query File Basic Info, Path: \\a\\i\\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, QUERY_PATH_INFO, Query File Standard Info, Path: \\a\\i\\a
Trans2 Response, QUERY_PATH_INFO
Trans2 Request, FIND_FIRST2, Pattern: \\a\\i\\a\\q
Trans2 Response, FIND_FIRST2, Files: q
Close Request, FID: 0x000f
Close Response
NT Create AndX Request, FID: 0x0018, Path: \\a\\i\\a\\q\\~141106162638847.nmd
NT Create AndX Response, FID: 0x0018
Trans2 Request, QUERY_FILE_INFO, FID: 0x0018, Query File Basic Info
Trans2 Response, FID: 0x0018, QUERY_FILE_INFO
Read AndX Request, FID: 0x0018, 1327 bytes at offset 0
Read AndX Response, FID: 0x0018, 1327 bytes
Close Request, FID: 0x0018
Close Response, FID: 0x0018
NT Create AndX Request, FID: 0x0019, Path: \\a\\i\\a\\q\\~141106162638847.nmd
NT Create AndX Response, FID: 0x0019

Why does System.IO.File.Open make all these extra SMB requests? Is there any way to change this code to avoid all these extra requests?

Proparoxytone answered 25/11, 2014 at 21:42 Comment(7)
What do your file names look like? Can you maybe resolve the UNC part of the path to get an old-fashioned drive-letter-style path, and then use that for reading the files? (If they're all on the same server and there is a share that provides a drive letter - hmmm, not likely I guess or you wouldn't be using UNC.) - Policeman
@RenniePet, it hadn't occurred to me to try a network drive letter style path. I tried that today and found that the results are the same: the .NET code still makes the same excessive set of SMB requests. - Proparoxytone
I'd guess it is done to establish a canonical name, or to enforce security policies (which might require that canonical name). - Diachronic
This isn't intended as an answer, but an off-the-wall idea, which is probably not at all relevant, not knowing your exact situation. Is it really necessary to read all these small files over the network? Could you have a small Windows service program on the server where these files reside that, when triggered or once a day, reads all of the small files and puts them in a .zip file? - Policeman
@Policeman thanks for the suggestion, but without going into implementation details, this unfortunately isn't something we can do :( - Vierno
You are almost certainly seeing the side effects of the FileIOPermission.Demand() that File.Open() makes. NAS drives live in the Intranet zone. Exactly how that's discovered is very obscure; the SSCLI20 distribution is usually a good source for CLR implementation details, but CAS has been stubbed out there. - Houser
Related to this question. - Harvester

In short, File.Open calls new FileStream(), and the FileStream constructor makes a lot of calls:

  1. Path.NormalizePath:

    String filePath = Path.NormalizePath(path, true, maxPath); // fullCheck: true

which leads to this code:

1.a. Get the full path:

    if (fullCheck) { ... 
        result = newBuffer.GetFullPathName();

GetFullPathName() calls Win32Native.GetFullPathName once or twice (depending on the length of the resulting path).

1.b. Try to expand a short path. Your path contains a ~ character, so it looks like a candidate for short-path expansion:

    if (mightBeShortFileName) {
        bool r = newBuffer.TryExpandShortFileName();

As a result, Win32Native.GetLongPathName() is called.

  2. FileIOPermission.Demand() (only for code that is not fully trusted):

    // All demands in full trust domains are no-ops, so skip 
    if (!CodeAccessSecurityEngine.QuickCheckForAllDemands()) {
        ...
        new FileIOPermission(secAccess, control, new String[] { filePath }, false, false).Demand();

  3. Open the file stream (the floppy strikes back ;)):

    // Don't pop up a dialog for reading from an emtpy floppy drive
    int oldMode = Win32Native.SetErrorMode(Win32Native.SEM_FAILCRITICALERRORS);
    try {
        ...
        _handle = Win32Native.SafeCreateFile(tempPath, fAccess, share, secAttrs, mode, flagsAndAttributes, IntPtr.Zero);

  4. Win32Native.GetFileType()

Not all of these lead to an SMB request, but some do. I tried to reproduce the chatty requests by stepping through the .NET source in a debugger (here's a manual for enabling .NET source debugging) and checking the log after each step. My results were closer to your first listing. If you're really interested in finding the real cause, you'll have to repeat this in your own environment.
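One way to repeat that step-by-step check without source debugging is to issue the underlying Win32 calls one at a time while a capture is running. The following is only a rough sketch: the SmbStepProbe class and its P/Invoke declarations are my own illustration (they are not the framework's internal Win32Native declarations), and it does not reproduce the FileIOPermission demand or the SetErrorMode call.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;
using Microsoft.Win32.SafeHandles;

// Hypothetical diagnostic harness: runs each Win32 call separately so the
// network capture can be inspected after every step.
static class SmbStepProbe
{
    const uint GENERIC_READ = 0x80000000;
    const uint FILE_SHARE_READ = 0x00000001;
    const uint OPEN_EXISTING = 3;
    const uint FILE_ATTRIBUTE_NORMAL = 0x00000080;

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern uint GetFullPathName(string lpFileName, uint nBufferLength,
        StringBuilder lpBuffer, IntPtr lpFilePart);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern uint GetLongPathName(string lpszShortPath,
        StringBuilder lpszLongPath, uint cchBuffer);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern SafeFileHandle CreateFile(string lpFileName, uint dwDesiredAccess,
        uint dwShareMode, IntPtr lpSecurityAttributes, uint dwCreationDisposition,
        uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern uint GetFileType(SafeFileHandle hFile);

    // Waits for the user so the capture can be checked between steps.
    static void Pause(string step)
    {
        Console.WriteLine("Done: " + step + ". Check the capture, then press Enter.");
        Console.ReadLine();
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: SmbStepProbe <path>");
            return;
        }
        string path = args[0];
        var buffer = new StringBuilder(1024);

        // Step 1.a: full path resolution.
        GetFullPathName(path, (uint)buffer.Capacity, buffer, IntPtr.Zero);
        Pause("GetFullPathName");

        // Step 1.b: short-name expansion.
        GetLongPathName(path, buffer, (uint)buffer.Capacity);
        Pause("GetLongPathName");

        // Steps 3 and 4: open the file, then query the handle type.
        using (SafeFileHandle handle = CreateFile(path, GENERIC_READ, FILE_SHARE_READ,
            IntPtr.Zero, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, IntPtr.Zero))
        {
            Pause("CreateFile");
            if (!handle.IsInvalid)
            {
                GetFileType(handle);
                Pause("GetFileType");
            }
        }
    }
}
```

Run it with the UNC path as the only argument and compare the packets that appear after each prompt against the listings in the question.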

UPD: Note that I checked the current (.NET 4.5.2) behavior. It has changed multiple times since 2.0 (e.g. FileIOPermission.Demand() was originally called for fully trusted code too), so it depends :)

Rockbound answered 3/12, 2014 at 6:58 Comment(2)
Is there any way, short of wrapping native code, to get new FileStream to avoid some of these unneeded steps? My code is running in the context of ASP.NET. - Proparoxytone
@BradVoy, CodesInChaos already replied below: you have to obtain the file handle (IntPtr or, preferably, SafeFileHandle) yourself and pass it to the appropriate FileStream constructor. If you want to control the handle's lifetime yourself, pass ownsHandle: false to the SafeFileHandle constructor. Here's an example. - Rockbound
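A minimal sketch of that handle-based approach, assuming the code is allowed to call unmanaged code: the NativeFile class and its OpenRead method are hypothetical names chosen for illustration, and FILE_SHARE_READ is used to mirror _SH_DENYWR from the C++ call. Opening this way bypasses the path-based FileStream constructor, so it should skip the path normalization and permission demand, but I have not captured the resulting SMB traffic.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical helper: opens a file with CreateFile and wraps the handle
// in a FileStream instead of going through File.Open.
static class NativeFile
{
    private const uint GENERIC_READ = 0x80000000;
    private const uint FILE_SHARE_READ = 0x00000001;   // others may read, not write
    private const uint OPEN_EXISTING = 3;
    private const uint FILE_ATTRIBUTE_NORMAL = 0x00000080;

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern SafeFileHandle CreateFile(
        string lpFileName,
        uint dwDesiredAccess,
        uint dwShareMode,
        IntPtr lpSecurityAttributes,
        uint dwCreationDisposition,
        uint dwFlagsAndAttributes,
        IntPtr hTemplateFile);

    public static FileStream OpenRead(string path)
    {
        SafeFileHandle handle = CreateFile(
            path, GENERIC_READ, FILE_SHARE_READ, IntPtr.Zero,
            OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, IntPtr.Zero);

        // Read the error code immediately, before any other call can overwrite it.
        int error = Marshal.GetLastWin32Error();
        if (handle.IsInvalid)
        {
            handle.Dispose();
            throw new Win32Exception(error);
        }

        // The stream takes over the handle and closes it when disposed.
        return new FileStream(handle, FileAccess.Read);
    }
}
```

Usage is the same as with File.Open: wrap `NativeFile.OpenRead(fileName)` in a using block and read from the returned stream.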

I don't really have a specific answer as to why the .NET implementation is so chatty, but this behaviour comes from the implementation of System.IO.FileStream, because all that File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.Read) does is pass its parameters to the FileStream constructor:

public static FileStream Open(string path, FileMode mode, FileAccess access, FileShare share)
{
    return new FileStream(path, mode, access, share);
}

Changing the behaviour of FileStream would mean that you basically have to re-implement the FileStream class, which would require a lot of effort.

A simpler alternative would be to create a native wrapper that calls the C++ code you gave, and then call that wrapper from your C# code.

Quell answered 25/11, 2014 at 23:35 Comment(1)
I don't think the change is that big. Use native interop (P/Invoke) to open the file, and then construct the FileStream from the handle. - Diachronic
