I've spent several days trying to speed up loading of symbols when debugging crash dumps using WinDbg, and I'm unable to get past a particular problem.
The issue is that when symbols for a module in the dump don't exist in any accessible symbol store or symbol server location (e.g. it's a third-party module without available symbols), WinDbg will spend literally hours looking for them.
I've set up my symbol path to define the search order and the cache directories:
.sympath cache*C:\SymbolCache1;\\our.corp\SymbolStore;SRV*C:\SymbolCache2*http://msdl.microsoft.com/download/symbols
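As a rough illustration of how the path above splits up, here is a minimal Python sketch. It assumes elements are separated by semicolons and that only elements beginning with cache* or srv* (case-insensitive) are treated as cache or symbol-server entries. The function name classify_sympath is hypothetical and is not part of any debugger API.

```python
def classify_sympath(sympath):
    """Split a symbol path on ';' and label each element by its prefix."""
    elements = []
    for raw in sympath.split(";"):
        item = raw.strip()
        if not item:
            continue  # skip empty elements such as a trailing ';'
        lowered = item.lower()
        if lowered.startswith("cache*"):
            kind = "cache"
        elif lowered.startswith("srv*"):
            kind = "symbol-server"
        else:
            # Assumption: anything without a recognised prefix is a plain directory.
            kind = "plain-directory"
        elements.append((kind, item))
    return elements


# The symbol path from the post, as a raw string so backslashes stay literal.
SYMPATH = r"cache*C:\SymbolCache1;\\our.corp\SymbolStore;SRV*C:\SymbolCache2*http://msdl.microsoft.com/download/symbols"

for kind, item in classify_sympath(SYMPATH):
    print(f"{kind:16} {item}")
```

Under this reading, the UNC share is the only element that carries neither prefix.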
Running with !sym noisy and .reload /f, I can see:
SYMSRV: Notifies the client application that a proxy has been detected.
SYMSRV: Connecting to the Server: http://msdl.microsoft.com/download/symbols.
SYMSRV: Successfully connected to the Server.
SYMSRV: Sending the information request to the server.
SYMSRV: Successfully sent the information request to the server.
SYMSRV: Waiting for the server to respond to a request.
SYMSRV: Successfully received a response from the server.
SYMSRV: Closing the connection to the Server.
SYMSRV: Successfully closed the connection to the Server.
SYMSRV: c:\SymbolCache1\Some3rdParty.dll\0060D200cd1000\Some3rdParty.dll not found
SYMSRV: c:\SymbolCache2\Some3rdParty.dll\0060D200cd1000\Some3rdParty.dll not found
SYMSRV: http://msdl.microsoft.com/download/symbols/Some3rdParty.dll/0060D200cd1000/Some3rdParty.dll not found
<---- !!!! hanging here with *BUSY* showing in WinDbg
By running Process Monitor at the point at which it hangs, I can see that WinDbg is searching what appears to be every directory in our giant network symbol store (\\our.corp\SymbolStore) looking for symbols, even in directories for modules that are clearly unrelated.
What's weird is that in WinDbg you can see that it's extracted the timestamp of the module (0060D200cd1000) and is using that to look in the expected location in local directories and the MS symbol server. I can't figure out why it's doing a full scan of our (massive) network symbol store. Perhaps there's something unique about how it treats UNC paths?
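The "expected location" here follows the layout visible in the noisy output: root, file name, identifier, file name. A minimal Python sketch that rebuilds those lookup paths from the values in the log; the helper names are hypothetical, and the identifier is taken verbatim from the log rather than computed from the module.

```python
import ntpath  # Windows path rules, usable on any platform


def expected_store_path(root, file_name, index):
    """Join root, file name, index and file name using Windows separators."""
    return ntpath.join(root, file_name, index, file_name)


def expected_server_url(base_url, file_name, index):
    """Same layout for an HTTP symbol server, using forward slashes."""
    return "/".join([base_url.rstrip("/"), file_name, index, file_name])


FILE_NAME = "Some3rdParty.dll"
INDEX = "0060D200cd1000"  # copied from the SYMSRV output, not derived here

print(expected_store_path(r"c:\SymbolCache1", FILE_NAME, INDEX))
print(expected_server_url("http://msdl.microsoft.com/download/symbols", FILE_NAME, INDEX))
```

A lookup of this shape needs one probe per store, which is why a scan of every directory on the share looks out of place.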
This search can take 15 minutes or more per symbol, and if the dump has many missing symbols this can cause a !analyze -v to take hours (if you're using the Visual Studio integration of WinDbg, it causes the hang to occur as soon as you load a crash dump, since for some reason that integration tries to load all symbols immediately despite .symopt settings).
This problem is also easily reproducible if you try to load symbols for a non-existent made-up module name, e.g. .reload /f bogus.dll.
Here are my WinDbg .symopt settings:
0:000> .symopt
Symbol options are 0x30337:
0x00000001 - SYMOPT_CASE_INSENSITIVE
0x00000002 - SYMOPT_UNDNAME
0x00000004 - SYMOPT_DEFERRED_LOADS
0x00000010 - SYMOPT_LOAD_LINES
0x00000020 - SYMOPT_OMAP_FIND_NEAREST
0x00000100 - SYMOPT_NO_UNQUALIFIED_LOADS
0x00000200 - SYMOPT_FAIL_CRITICAL_ERRORS
0x00010000 - SYMOPT_AUTO_PUBLICS
0x00020000 - SYMOPT_NO_IMAGE_SEARCH
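To double-check that the mask matches the listed flags, here is a small Python sketch that decodes it. The table contains only the flags shown above plus SYMOPT_NO_PUBLICS (0x00008000), which comes up in the notes further down. The function name decode_symopt is hypothetical, and the table is deliberately incomplete.

```python
# Flag values as listed by .symopt above, plus SYMOPT_NO_PUBLICS.
SYMOPT_FLAGS = {
    0x00000001: "SYMOPT_CASE_INSENSITIVE",
    0x00000002: "SYMOPT_UNDNAME",
    0x00000004: "SYMOPT_DEFERRED_LOADS",
    0x00000010: "SYMOPT_LOAD_LINES",
    0x00000020: "SYMOPT_OMAP_FIND_NEAREST",
    0x00000100: "SYMOPT_NO_UNQUALIFIED_LOADS",
    0x00000200: "SYMOPT_FAIL_CRITICAL_ERRORS",
    0x00008000: "SYMOPT_NO_PUBLICS",
    0x00010000: "SYMOPT_AUTO_PUBLICS",
    0x00020000: "SYMOPT_NO_IMAGE_SEARCH",
}


def decode_symopt(mask):
    """Return (names of known flags set in mask, bits not in the table)."""
    names = [name for bit, name in sorted(SYMOPT_FLAGS.items()) if mask & bit]
    known_bits = 0
    for bit in SYMOPT_FLAGS:
        known_bits |= bit
    return names, mask & ~known_bits


names, leftover = decode_symopt(0x30337)
for name in names:
    print(name)
print(f"unrecognised bits: {leftover:#x}")
```

For 0x30337 this yields exactly the nine flags in the listing with no leftover bits, so nothing unexpected is hiding in the mask.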
I've looked all over thinking that there must be some flag to control this, but I can't seem to find it.
A couple of things:
- This is not an issue with network speed or lack of local symbol cache. The issue only occurs with symbols that cannot be found, and only with the UNC symbol store (i.e. not with the Microsoft symbol server).
- I've already tried SYMOPT_NO_PUBLICS instead of SYMOPT_AUTO_PUBLICS
- I've verified that my symbol path is what I expect it to be, using the .sympath command. I've also tried using the _NT_SYMBOL_PATH environment variable instead.
- I know that I can exclude certain symbols via a configuration file, but this isn't a workable solution because sometimes the missing symbol name is not known in advance
- I've seen that someone else had this same issue and described it on a Microsoft forum in a post titled "Poor WinDbg 6.11.1.404 performance when it is pointed to a large symbol cache", but received no help.