.NET GC Hang - windbg dump analysis shows GC thread blocking itself?
Asked Answered
S

0

6

I have an app that was freezing for a long time (sometimes a minute or more, sometimes it doesn't seem to come back at all, but this time it was about a minute) and I took a few dumps for analysis. I took three dumps, each one about 15 seconds apart but I don't see much different between them so I'll describe the last one.

First thing I do is run it through debug analysis and it comes up with the following diagnosis :

GC is running in this process. The Thread that triggered the GC is 34

I run !threadpool and it confirms GC via 81% CPU utilization :

CPU utilization: 81%
Worker Thread: Total: 19 Running: 19 Idle: 0 MaxLimit: 2047 MinLimit: 16
Work Request in Queue: 2212
    AsyncTimerCallbackCompletion TimerInfo@05fc7fb8
    *snipped 58 timers*
--------------------------------------
Number of Timers: 59
--------------------------------------
Completion Port Thread:Total: 1 Free: 1 MaxFree: 32 CurrentLimit: 0 MaxLimit: 1000 MinLimit: 16

59 timers, wow... anyway, I see the CPU is 81% confirming GC. The next item in analysis says the following :

The following threads in test.dmp are waiting for .net garbage collection to finish. Thread 34 triggered the garbage collection.The gargage collector thread wont start doing its work till the time the threads which have pre-emptive GC disabled have finished executing. The following threads have pre-emptive GC disabled 34,

( 3 9 10 12 19 20 30 31 38 39 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 67 71 72 )

44.16% of threads blocked (34 threads)

Looks like we have a thread with Pre-emptive GC flag disabled which is preventing the GC thread from starting. Let's look at the threads :

!threads -live -special
ThreadCount:      79
UnstartedThread:  0
BackgroundThread: 63
PendingThread:    0
DeadThread:       15
Hosted Runtime:   no
                                   PreEmptive   GC Alloc                Lock
       ID  OSID ThreadOBJ    State GC           Context       Domain   Count APT Exception
   0    1  29a4 006f2f68      6020 Enabled  00000000:00000000 006ec210     0 STA
   3    2  1750 006fddb0      b220 Enabled  00000000:00000000 006ec210     0 MTA (Finalizer)
   6    3  14a8 05c77fc8   8009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Completion Port)
   7    4   530 0074ac08   100a220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
   8    5  2798 05dcf0e0   200b220 Enabled  00000000:00000000 006ec210     0 MTA
   9    7  330c 05d0bbb8   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  10    9  2bf4 05b3db70   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  11    c  2a9c 05b3ea58   3009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  12   11  2a08 05cf9398   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  14   16  1c48 05cfacc0   200b220 Enabled  00000000:00000000 006ec210     1 MTA
  17   27  1cd8 05eacac0   1000220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  18   25   3ac 05ead9d8   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  19   24  32e0 05eac0b0      b220 Enabled  00000000:00000000 006ec210     0 MTA
  20   23   d3c 05eabba8      b220 Enabled  00000000:00000000 006ec210     0 MTA
  21   22  2bcc 05eab6a0   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  22   1f  19d8 05eae8f0   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  23   1d  2d68 05eaf300   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  24   1c   d78 05eaf808   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  25   1b  1424 05eafd10   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  26   2a  2914 05eb1638   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  27   2d  2f80 05eb2550   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  28   2f  1bf0 05eb2f60   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  29   32  10a4 05cfe920   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  30   36  3018 05cff330      b220 Enabled  00000000:00000000 006ec210     0 MTA
  31   39  29e0 05cfc5e8   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  32   3d  11dc 140ed960   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  34   34  2928 140eed80      b220 Disabled 00000000:00000000 006ec210     2 MTA (GC)
  35   37  1808 140efc98   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  36   40  23d4 140ef790   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  37   41  32e8 140ef288   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  38   35  2f64 140ee370   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  39   4a  33f4 05dd2f20   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  40   4c  2f4c 05dd3930   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  41   4d  3034 05dd3e38   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  42   4e  217c 05dd4340   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  43   4f  1d58 05dd4848   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  44   50   c2c 05dd4d50   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  45   62  27fc 4a0f20b8   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  46   65  2e34 4a0fa8b8   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  47   68  28e8 4a0fadc0   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  48   6f  1e6c 4987e468   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  49   71  33a4 4987c970   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  51   72  172c 4a0b0900   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  50   5e  1ad4 4a0fb7d0   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  52   73  2380 4a0b0e08   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  53   6d  2958 1455fca8   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  54   69  2140 4987ce78   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  55   3e  2c10 49836c90   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  56   52  1f8c 49837198   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  57   5b  216c 145606b8   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  58   5a   16c 4a0aaa80   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  59   47   a48 14560bc0   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  60   3c  243c 4996dae0   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  61   58  3398 4a0fb2c8   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  62   51  23b4 498376a0   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  63   53  2b48 4a46d358   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  64   56  2504 4a0b1310   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  65   59  2778 497b7f80   1019220 Enabled  00000000:00000000 006ec210     0 Ukn (Threadpool Worker)
  66   60  262c 4a05a6e0   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  67   3a  1bdc 14252d08      b220 Enabled  00000000:00000000 006ec210     0 MTA
  68   54  14bc 14252800   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  69   61  2ff0 14349f28   200b220 Enabled  00000000:00000000 006ec210     0 MTA
  71   10  26c0 4a0b1818   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)
  72   6b  26f4 49950518   1009220 Enabled  00000000:00000000 006ec210     0 MTA (Threadpool Worker)

       OSID     Special thread type
    2   2f88    DbgHelper 
    3   1750    Finalizer 
    5   2f44    Gate 
    6   14a8    IOCompletion 
    7    530    Timer 
    9   330c    ThreadpoolWorker 
   10   2bf4    ThreadpoolWorker 
   11   2a9c    ThreadpoolWorker 
   12   2a08    ThreadpoolWorker 
   17   1cd8    Wait 
   31   29e0    ThreadpoolWorker 
   34   2928    SuspendEE 
   38   2f64    ThreadpoolWorker 
   39   33f4    ThreadpoolWorker 
   45   27fc    ThreadpoolWorker 
   46   2e34    ThreadpoolWorker 
   47   28e8    ThreadpoolWorker 
   48   1e6c    ThreadpoolWorker 
   49   33a4    ThreadpoolWorker 
   50   1ad4    ThreadpoolWorker 
   51   172c    ThreadpoolWorker 
   52   2380    ThreadpoolWorker 
   53   2958    ThreadpoolWorker 
   54   2140    ThreadpoolWorker 
   55   2c10    ThreadpoolWorker 
   56   1f8c    ThreadpoolWorker 
   57   216c    ThreadpoolWorker 
   58    16c    ThreadpoolWorker 
   59    a48    ThreadpoolWorker 
   60   243c    ThreadpoolWorker 
   61   3398    ThreadpoolWorker 
   62   23b4    ThreadpoolWorker 
   63   2b48    ThreadpoolWorker 
   64   2504    ThreadpoolWorker 
   65   2778    ThreadpoolWorker 
   71   26c0    ThreadpoolWorker 
   72   26f4    ThreadpoolWorker 
   75   200c    GC 

We can see that thread 34 is the GC thread, but it's also the only thread that has PreEmptive GC disabled. Is it possible that the thread triggering the GC (34) is preventing itself from performing a GC? If so, then how was the app able to return from the freeze? The app eventually became responsive after a minute and continued chugging along so I'm trying to figure out what's going on here.

Stack info is too large to post here but I can't see anything on thread 34 that might be preventing itself from GC'ing (if that's what's really happening here). One more tidbit from debug analysys is the following which may or may not be relavent :

Detected possible blocking or leaked critical section at 0x006e1498 owned by thread 34 in test.dmp

Impact of this lock 1.3% of threads blocked (Threads 5) The following functions are trying to enter this critical section

clr!UnsafeEEEnterCriticalSection+1c

The following module(s) are involved with this critical section

C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll from Microsoft Corporation

However thread 5 isn't even a managed thread; it's a threadpool gate thread.

Any ideas on how I can get to the root of my problem as to why the GC is taking so long to start and eventually complete?

Sundin answered 10/3, 2015 at 17:51 Comment(3)
Just wanted to point out that 81% CPU utilization is what the !ThreadPool command reports by default when there is a GC in progress. blogs.msdn.com/b/tess/archive/2006/03/27/561715.aspxFascist
Hi Sasha! Yes, I mentioned that earlier in my post. Btw, thank you for your awesome articles, they've been very helpful to me! :DSundin
Hi @Anthony, did you figure this one out by any chance? I am facing a similar issue.Tenorio

© 2022 - 2024 — McMap. All rights reserved.