How to find an I/O bottleneck within an ASP.NET app

We have a high-traffic website which generates a lot of I/O. Within 10 minutes it reads over 10 GB of data (as shown for the w3wp process in question in Task Manager). For memory problems and application hangs I have been using WinDbg with success, but I don't know how to find the object(s) or method(s) within a process that are responsible for the highest I/O.

Is this even possible?

Edit: The question is: Is there a way to profile I/O operations in a .NET assembly, say, a list of threads sorted by highest disk I/O (or something similar that would help me know where to look)?

Garling answered 25/2, 2013 at 1:8 Comment(8)
Do you have any idea where it could be happening in your code, or are you running completely blind? - Sublieutenant
I have no idea. I have walked through all the code several times (it's quite a big site with lots of functionality). It's possible to upload and post pictures, but that is around 10 GB of data per month. - Garling
Are you processing those images within the application by chance? - Sublieutenant
Nope, I'm using a separate cookieless domain for that, with a different application pool. - Garling
You have to do some calculations first. What is the average page size of your application? How many users per minute do you have? With this you can calculate the average I/O of your app and see whether it is normal for your traffic vs. page size. If it is normal, then maybe you need to scale your hardware (use an SSD or a RAID instead of a simple HDD) or change something in your code. It is hard to tell what the problem is with only this information, but I hope this helps. - Inadvisable
@Inadvisable Thanks for your input. We're using a RAID setup already and scale out to 2 servers. We have about 35k unique visitors per day and get around 400 requests/sec per server in busy times. What counts as normal depends entirely on what the website does. The original question was actually: Is there a way to analyze which thread/method is generating lots of I/O, much like you can do with memory leaks and WinDbg? - Garling
OK, I see. Which OS version are you running on your servers, and what is the manufacturer/brand of your HDDs? - Inadvisable
Win 2008 R2, 2x Dell R300 with 2x Fujitsu 15k SAS MBA3147RC in RAID 1. All the images and static files are served from a separate disk. - Garling
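
As a rough sanity check of the "do some calculations first" suggestion, the figures quoted in this thread can be turned into an average read rate and a per-request figure. This is only a sketch: it assumes "10 gb" means 10 GiB and that the 400 requests/sec peak rate held for the whole 10-minute window, which it probably did not. The function names are made up for illustration.

```python
# Back-of-the-envelope check using the figures quoted in this thread.
# Assumptions (not from measurements): 10 GB is treated as 10 GiB, and the
# peak rate of 400 requests/sec is treated as constant over the window.

BYTES_READ = 10 * 1024**3      # ~10 GB read by w3wp
WINDOW_SECONDS = 10 * 60       # observed over 10 minutes
REQUESTS_PER_SECOND = 400      # peak rate per server


def read_rate(bytes_read: int, seconds: float) -> float:
    """Average bytes read per second over the window."""
    return bytes_read / seconds


def bytes_per_request(bytes_read: int, seconds: float, rps: float) -> float:
    """Average bytes read per request, assuming a constant request rate."""
    return bytes_read / (seconds * rps)


if __name__ == "__main__":
    rate = read_rate(BYTES_READ, WINDOW_SECONDS)
    per_req = bytes_per_request(BYTES_READ, WINDOW_SECONDS, REQUESTS_PER_SECOND)
    print(f"{rate / 1024**2:.1f} MiB/s, {per_req / 1024:.1f} KiB per request")
```

Under these assumptions the result is roughly 17 MiB/s, or about 44 KiB read per request. A lower real request rate would make the per-request figure proportionally higher.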

ANTS Performance Profiler

I have used this tool with great success to find the specific instructions that were causing ~512GB of memory on a high-volume web farm to get chewed up within 5-10 minutes. It sounds like a very similar situation to yours.

Now, to be realistic - it's not going to magically solve your problem. It still requires a lot of setup, thorough analysis and detective work. But this tool definitely took the problem from "practically unsolvable" to "solvable within days".

Update:

As I mentioned in the comments (and Ben Emmett echoed), we can use ANTS to monitor memory, file system handles - pretty much any resource consumption and drill down the call stack to see the effects of specific routines.
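
The drill-down idea is not specific to ANTS and can be sketched in a few lines. The example below is a hypothetical illustration in Python rather than anything ANTS does internally: `io_delta` and `FakeCounter` are names invented here, and `read_counter` stands in for whatever cumulative resource counter you can read (for example, the process's I/O read bytes). It records how much the counter grows inside each wrapped routine, so nested routines show where the growth comes from.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


@contextmanager
def io_delta(label: str, read_counter: Callable[[], int],
             results: Dict[str, int]) -> Iterator[None]:
    """Record how much the counter grew while the wrapped block ran."""
    before = read_counter()
    try:
        yield
    finally:
        # Accumulate so repeated calls to the same routine add up.
        results[label] = results.get(label, 0) + (read_counter() - before)


class FakeCounter:
    """Stand-in for a real cumulative counter, used only for the demo."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, n: int) -> None:
        self.total += n

    def __call__(self) -> int:
        return self.total


def demo() -> Dict[str, int]:
    counter = FakeCounter()
    results: Dict[str, int] = {}
    # The outermost suspect routine wraps two deeper routines.
    with io_delta("handle_request", counter, results):
        with io_delta("load_template", counter, results):
            counter.add(4_000)       # simulated small read
        with io_delta("read_catalog", counter, results):
            counter.add(900_000)     # simulated large read
    return results


if __name__ == "__main__":
    for name, delta in sorted(demo().items(), key=lambda kv: -kv[1]):
        print(f"{name}: {delta} bytes")
```

In the demo, nearly all of the outer routine's growth is attributed to `read_catalog`, which is the routine you would drill into next.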

Nate answered 25/2, 2013 at 1:19 Comment(5)
Could you elaborate a bit on how to do that with ANTS? Are you using the memory profiler? I find it a bit limiting compared to WinDbg. - Garling
@Elger Starting at the outermost suspect routines, we can watch how much memory, FS handles, etc. accumulate during that routine, and then drill into deeper routines while seeing the resource accumulation stay relatively flat, which means we are zeroing in on the problem. - Nate
But the memory consumption is actually not so worrying (max 400 MB at busy times, around 275 MB normally, for a 64-bit app). What do you mean by 512 GB of memory in your situation? - Garling
@Elger In my case it was memory consumption, but you can monitor any kind of resource consumption. - Nate
ANTS Performance Profiler also records file system access, which sounds like it could be useful in your situation: red-gate.com/supportcenter/Content/ANTS_Performance_Profiler/… - Eviscerate

I came across this tool, AppDynamics Lite, which displays the cost and performance of your application's calls in a visual way. It might help you find out which functions are performing the most costly I/O operations.

Quoting:

Understand the health of your CLR with key metrics like response time, throughput, exception rate, and garbage collection time as well as key system resource like CPU, memory and disk I/O.

It's worth giving it a shot, as it has a free 30-day trial. Hope it helps. PS: I'm not affiliated with AppDynamics in any way.

Cecum answered 6/3, 2013 at 16:7 Comment(0)

You can use the (free) Windows Performance Toolkit from Windows 8, which also runs on Windows Vista and later. With it you can turn on system-wide profiling to see what was going on in all processes at once. No instrumentation is necessary. Only one reboot is required, to set an arcane registry key, which WPRUI.exe does automatically.

With XPerf you can enable IO Init stack walking so that a call stack is taken for every IO that is started. The only issue is that the stacks will be broken for 64-bit processes, which means that you will see only the first method of your code above the BCL methods, because there is a Windows 7 bug in the stack-walking capabilities of the OS.

A workaround is to NGen your assemblies, move to Server 2012, or switch to x86 for profiling to see deeper call stacks.
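
For reference, here is a minimal sketch of the xperf invocation for IO Init stack walking, wrapped in Python only so the command lines are easy to inspect and reuse. The kernel flag and stackwalk event names are written as I recall them from `xperf -providers K` and `xperf -help stackwalk`, so treat them as assumptions and verify them against your toolkit version. The helper names and the `io_trace.etl` file name are made up. Tracing needs an elevated prompt.

```python
import shutil
import subprocess
from typing import List

# Assumed kernel flags: process/image info plus disk and file IO events.
KERNEL_FLAGS = ["PROC_THREAD", "LOADER", "DISK_IO", "DISK_IO_INIT",
                "FILE_IO", "FILE_IO_INIT"]
# Assumed stackwalk events: capture a stack when an IO is initiated.
STACKWALK_EVENTS = ["DiskReadInit", "DiskWriteInit", "FileRead", "FileWrite"]


def start_command() -> List[str]:
    """Command line that starts kernel tracing with IO stack walking."""
    return ["xperf", "-on", "+".join(KERNEL_FLAGS),
            "-stackwalk", "+".join(STACKWALK_EVENTS)]


def stop_command(etl_path: str) -> List[str]:
    """Command line that stops tracing and writes the merged trace file."""
    return ["xperf", "-d", etl_path]


def run(cmd: List[str]) -> bool:
    """Run the command if xperf is on PATH; otherwise just print it."""
    if shutil.which(cmd[0]) is None:
        print("xperf not found; would run:", " ".join(cmd))
        return False
    subprocess.run(cmd, check=True)
    return True


if __name__ == "__main__":
    # Printing only; call run(...) yourself from an elevated prompt.
    print(" ".join(start_command()))
    print(" ".join(stop_command("io_trace.etl")))
```

Start the trace, reproduce the heavy IO for a minute or so, then stop it and open the resulting .etl file in the toolkit's viewer.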

Even without any call stacks you will see all file IO and CPU activity, and the file names along with how long the hard disc was used. That should give you good information about which part of your app is causing the disc IO. From the partial call stacks you should be able to pinpoint your issue even without full stacks.
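
If you export the file IO table to CSV, a few lines of script are enough to rank files by bytes transferred. The sketch below assumes a hypothetical export with `path` and `bytes` columns; the real column names depend on the tool and view you export from, and the sample paths are invented.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_files_by_bytes(rows: Iterable[Dict[str, str]],
                       path_col: str = "path",
                       size_col: str = "bytes",
                       limit: int = 5) -> List[Tuple[str, int]]:
    """Sum the transferred bytes per file and return the largest first."""
    totals: Counter = Counter()
    for row in rows:
        totals[row[path_col]] += int(row[size_col])
    return totals.most_common(limit)


# Invented sample data standing in for an exported file IO table.
SAMPLE = r"""path,bytes
C:\site\App_Data\catalog.xml,900000
C:\site\Views\home.cshtml,4000
C:\site\App_Data\catalog.xml,850000
C:\site\logs\trace.log,120000
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE))
    for path, total in top_files_by_bytes(reader):
        print(f"{total:>10}  {path}")
```

The file at the top of the list is the one to look up in the call stacks first.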

The tool will give you much more insight than any commercially available profiler, at the expense of having to learn how to use it. Since the call stacks do not end at your code or in user mode but in the kernel, you can also determine whether, e.g., the virus scanner is causing significant IO delays. But you need to know how your processor works. This toolset was originally aimed at kernel devs, which explains why you see so many useless columns.

In the picture below you see file IO and CPU consumption stacked. When you select your high-IO file in the disc IO graph, it will highlight in the CPU consumption graph all related call stacks that were taken while the IO was active. This way you can navigate directly from the IO to your potentially blocked threads.

[Image: file IO and CPU consumption graphs stacked]

Kelly answered 7/3, 2013 at 6:48 Comment(1)
Wow, thanks a lot for the insight. I didn't know this existed. I'll try it. - Garling
