How does Thread.Abort() work?

We usually throw an exception when invalid input is passed to a method or when an object is about to enter an invalid state. Consider the following example:

private void SomeMethod(string value)
{
    if(value == null)
        throw new ArgumentNullException("value");
    //Method logic goes here
}

In the example above I inserted a throw statement that throws ArgumentNullException. My question is how the runtime manages to throw ThreadAbortException. Obviously it cannot put a throw statement into every method, yet the runtime manages to throw ThreadAbortException inside our own methods too.

I was wondering how they do it. Curious about what happens behind the scenes, I opened Thread.Abort in Reflector and ended up with this:

[MethodImplAttribute(MethodImplOptions.InternalCall)]
private extern void AbortInternal();//Implemented in CLR

Then I googled and found the article "How does ThreadAbortException really work". It says the runtime posts an APC through the QueueUserAPC function, and that is how it does the trick. I wasn't aware of QueueUserAPC, so I tried some code to see whether this is possible. Here is my attempt:

[DllImport("kernel32.dll")]
static extern uint QueueUserAPC(ApcDelegate pfnAPC, IntPtr hThread, UIntPtr dwData);
delegate void ApcDelegate(UIntPtr dwParam);

Thread t = new Thread(Threadproc);
t.Start();
//wait for thread to start
uint result = QueueUserAPC(APC, new IntPtr(nativeId), (UIntPtr)0);//returns zero(fails)
int error = Marshal.GetLastWin32Error();// error also zero

private static void APC(UIntPtr data)
{
    Console.WriteLine("Callback invoked");
}
private static void Threadproc()
{
    //some infinite loop with a sleep
}

Forgive me if I am doing something wrong; I have no idea how to do it. Back to the question: can somebody with knowledge of this, or someone from the CLR team, explain how it works internally? And if APC is the trick the runtime uses, what am I doing wrong here?

Hodosh answered 9/8, 2013 at 19:22 Comment(10)
I don't know how it works internally, but I'm curious about why you want to know. If it's just for general knowledge, I can understand that. But I hope you're not planning to use that trick for anything. Aborting a thread is an ugly and potentially dangerous thing to do.Anatollo
@JimMischel No never, I know Abort is an evil, Am just curious to know how does it work.Hodosh
@SriramSakthivel did you try looking into mono sources? (mono-project.com)Ponder
@zespri I have gone through it for some other impl, let me give a try in this.Hodosh
@zespri unfortunately there also MethodImplOptions.InternalCallHodosh
@SriramSakthivel yes, but can't you trace down what this internal call does? you have all the sources.Ponder
@zespri I just checked it here let me download all sources and dig deeperHodosh
It's interprocessor driver comms. The CPU core running the thread to be aborted gets hardware-interrupted via an inter-core driver. Y'all need to understand how the hardware works.Latreese
@SriramSakthivel: You may find this recent question of mine interesting as well.Daveta
@SriramSakthivel: Look at my other answer that I updated and revived. I'm hoping it contains the smoking gun for how this all works.Daveta
A
11

Are you sure you read the page you were pointing to? In the end it boils down to:

The call to Thread.Abort boils down to .NET setting a flag on a thread to be aborted and then checking that flag during certain points in the thread’s lifetime, throwing the exception if the flag is set.
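
To make the "set a flag, check it at certain points" idea concrete, here is a minimal cooperative sketch. It is only an analogy: every name in it (PollingWorker, Poll, AbortRequestedException) is made up for illustration and none of it is CLR code. The worker checks a volatile flag once per loop iteration and throws when another thread has set it.

```csharp
using System;
using System.Threading;

// Hypothetical exception type; stands in for ThreadAbortException.
sealed class AbortRequestedException : Exception { }

sealed class PollingWorker
{
    private volatile bool _abortRequested; // the "flag on the thread"
    public long Iterations;                // written only by the worker thread

    // Called from another thread to request the abort.
    public void RequestAbort() => _abortRequested = true;

    // A "safe point": the only place where the flag is inspected.
    private void Poll()
    {
        if (_abortRequested) throw new AbortRequestedException();
    }

    public void Run()
    {
        while (true)
        {
            Iterations++; // the tight loop body
            Poll();       // check placed at the loop back-edge
        }
    }
}

static class PollingDemo
{
    // Returns true if the worker observed the abort request.
    public static bool RunDemo()
    {
        var worker = new PollingWorker();
        bool sawAbort = false;
        var thread = new Thread(() =>
        {
            try { worker.Run(); }
            catch (AbortRequestedException) { sawAbort = true; }
        });
        thread.Start();
        Thread.Sleep(50);      // let the loop spin for a moment
        worker.RequestAbort(); // set the flag from the outside
        thread.Join();         // returns once the exception unwinds Run()
        return sawAbort;
    }

    static void Main() => Console.WriteLine(RunDemo());
}
```

The difference from the real thing is that this loop has to call Poll() itself, whereas the question is precisely about how the runtime gets the same effect without any such call in user code.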

Afflux answered 9/8, 2013 at 19:27 Comment(11)
Yes but while the thread is executing my method how runtime manages to throw an exception in it?Hodosh
@SriramSakthivel Sadly the explanation written in that site is a little confusing, because it jumps a lot... in one step the thread is sleeping, but in the next step the Thread.Abort waits for the thread to go in another state.Afflux
@SriramSakthivel Googling the keyword "Asynchronous Exception" might get you on the right track.Vibraphone
In the end, I don't think it's so much complex... If a thread isn't running, its registers are saved somewhere. You go there and modify the Program Counter to point to your throw. The next time your thread resumes the PC is loaded and the CPU "jumps" to the throw. (the Program Counter aka Instruction Counter is the register with the address of the current executing instruction)Afflux
So Jit will modify the machine code on the fly to throw ThreadAbortException?Hodosh
@SriramSakthivel It doesn't need to modify anything. Let put it this way: when a thread finishes its time-slice, its registers are saved somewhere, and the point in the code it was is saved somewhere. This point in the code is a register too (the PC). This "somewhere" is in memory. If someone modifies this "serialized" registers, it can make the thread resume somewhere else, and this somewhere else is on a "throw new ThreadAbort();" In truth probably it doesn't do this, because the .NET won't do this trick if the thread is in a finally block, or in native code (for example a WinAPI)Afflux
Now I get a picture about how it can be done, thank you very much. but looking for how .net doesHodosh
There is no need to wait for anything or to rely on time-slicing (stupid terminology). Either a thread is not running (in which case its state can be set to 'never run again'), or it is running on a different core than the thread calling Thread.Abort, in which case the OS can hardware-interrupt that core and force the thread to enter the OS code and die.Latreese
'The call to Thread.Abort boils down to .NET setting a flag on a thread to be aborted' WHERE DID THAT QUOTE COME FROM???Latreese
@MartinJames From the document linked in the question.Afflux
@MartinJames I linked an article in my question, that was a part in that article. am not sure about that article so raised a questionHodosh
E
10

To get your APC callback to work, you need a thread handle (which is not the same as the thread ID). I've also updated the attributes on the PInvokes.

Also keep in mind that the thread needs to be in an "alertable" wait state for the APC to be called (which Thread.Sleep gives us). If the thread is busy doing other work, the callback may never run.

[DllImport("kernel32.dll", EntryPoint = "GetCurrentThread", CallingConvention = CallingConvention.StdCall)]
public static extern IntPtr GetCurrentThread();

[DllImport("kernel32.dll", EntryPoint = "QueueUserAPC", CallingConvention = CallingConvention.StdCall, SetLastError = true)]
public static extern uint QueueUserAPC(ApcDelegate pfnAPC, IntPtr hThread, UIntPtr dwData);

[UnmanagedFunctionPointerAttribute(CallingConvention.StdCall)]
public delegate void ApcDelegate(UIntPtr dwParam);

[DllImport("kernel32.dll", EntryPoint = "DuplicateHandle", CallingConvention = CallingConvention.StdCall, SetLastError = true)]
public static extern bool DuplicateHandle([In] System.IntPtr hSourceProcessHandle, [In] System.IntPtr hSourceHandle, [In] System.IntPtr hTargetProcessHandle, out System.IntPtr lpTargetHandle, uint dwDesiredAccess, [MarshalAsAttribute(UnmanagedType.Bool)] bool bInheritHandle, uint dwOptions);

[DllImport("kernel32.dll", EntryPoint = "GetCurrentProcess", CallingConvention = CallingConvention.StdCall, SetLastError = true)]
public static extern IntPtr GetCurrentProcess();


static IntPtr hThread;
public static void SomeMethod(object value)
{
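    // GetCurrentThread() returns a pseudo-handle, so duplicate it into a real one (2 = DUPLICATE_SAME_ACCESS).
    DuplicateHandle(GetCurrentProcess(), GetCurrentThread(), GetCurrentProcess(), out hThread, 0, false, 2);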

    while (true)
    {
        Console.WriteLine(".");
        Thread.Sleep(1000);
    }
}

private static void APC(UIntPtr data)
{
    Console.WriteLine("Callback invoked");
}

static void Main(string[] args)
{
    Console.WriteLine("in Main\n");

    Thread t = new Thread(Program.SomeMethod);
    t.Start();

    Thread.Sleep(1000); // wait until the thread fills out the hThread member -- don't do this at home, this isn't a good way to synchronize threads...
    uint result = QueueUserAPC(APC, hThread, (UIntPtr)0);

    Console.ReadLine();
}


Edit:
How the CLR injects the exception
Given this loop for the thread function:

while (true)
{
    i = ((i + 7) * 3 ^ 0x73234) & 0xFFFF;
}

I then called Abort on the thread and looked at the native stack trace:

...
ntdll!KiUserExceptionDispatcher
KERNELBASE!RaiseException
clr!RaiseComPlusException
clr!RedirectForThrowControl2
clr!RedirectForThrowControl_RspAligned
clr!RedirectForThrowControl_FixRsp
csTest.Program.SomeMethod(System.Object)
...

Looking at the return address of the RedirectForThrowControl_FixRsp call, it is pointing into the middle of my loop, for which there are no jumps or calls:

nop
mov     eax,dword ptr [rbp+8]
add     eax,7 // code flow would return to execute this line
lea     eax,[rax+rax*2]
xor     eax,73234h
and     eax,0FFFFh
mov     dword ptr [rbp+8],eax
nop
mov     byte ptr [rbp+18h],1
jmp     000007fe`95ba02da // loop back to the top

So apparently the CLR is actually modifying the instruction pointer of the thread in question to physically yank control from the normal flow. They obviously needed to supply several wrappers to fix up and restore all the stack registers to make it work correctly (thus the aptly named _FixRsp and _RspAligned APIs).


In a separate test, I just had Console.Write() calls within my thread loop, and there it looked like the CLR injected a test just before the physical call out to WriteFile:

KERNELBASE!RaiseException
clr!RaiseTheExceptionInternalOnly
clr! ?? ::FNODOBFM::`string'
clr!HelperMethodFrame::PushSlowHelper
clr!JIT_RareDisableHelper
mscorlib_ni!DomainNeutralILStubClass.IL_STUB_PInvoke(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte*, Int32, Int32 ByRef, IntPtr)
mscorlib_ni!System.IO.__ConsoleStream.WriteFileNative(Microsoft.Win32.SafeHandles.SafeFileHandle, Byte[], Int32, Int32, Boolean)
Eventuate answered 29/8, 2013 at 18:13 Comment(14)
Thanks, my bad. I was using the thread ID instead of the handle. Now the APC works. But this doesn't fully answer my question: how does the runtime manage to throw the exception even when my thread never enters an alertable state (for example, an infinite loop without a sleep)?Hodosh
@BrianGideon and @josh Surprisingly I get similar answers from both of you in a near same time after several days of this question. Thank you very much both of you guys. Am such a stupid missed thread handle here :(. Can anybody explain without entering alertable state runtime throws ThreadAbortException, How do they do?Hodosh
@SriramSakthivel: You posed a really good question that got me thinking. I, like you, would still like to know the exact details of how the abort gets injected. I don't think it's quite as simple as "setting an abort" flag. Also, take a look at my question here for another perspective of where knowing exactly how the aborts work would be useful information.Daveta
Yes this made me to think a lot but no clue. Waiting for a solid answer not assumptions.! Who will be in a position to answer this? How to contact @EricLippert? btw am looking at your question for last 1 hour.Hodosh
@BrianGideon and Sriram, I just debugged a couple of scenarios. The most interesting one involved a tight loop within the thread, and the CLR apparently wedged itself into the control flow. See the edit to my post above.Eventuate
Interesting man, How did you get native stack? and In your second example with Console.Write() there was no call to _fixRsp makes confusion. What's your view about this?Hodosh
Am even interested in knowing how do they manage when they are in CER where they shouldn't abort.Hodosh
@SriramSakthivel: Console.Write almost certainly goes into the alertable state at which point the CLR injects the exception inline. There is no need to modify the instruction pointer or fix the stack frame. I think that's what josh is saying.Daveta
I'm using WinDbg to debug my test app. In the second sample it appears that just before a PInvoke out to the actual native API happens it will check some state, and divert to raise the exception (although I'm still digging through the assembly behind IL_STUB_PInvoke). But yes, @BrianGideon is correct in that case they don't need to perform the brutal severing of the instruction flow.Eventuate
@SriramSakthivel It appears the inclusion of CER prevents the tight loop from being aborted through the IP hijack.Eventuate
Thanks for WinDbg tool. and It's my pleasure that someone shows interest and digging deeper to find an answer to my quesion.Hodosh
Yes CER prevents it, I just tested it out of curiosity from @BrianGideon 's recent question. but even CER fails to prevent it when we give an alertable state(Sleep(0)). Don't know why!Hodosh
Assuming they use IP hack to throw AsynchronousExceptions they still need to consider CER, CriticalFinializer code which GC executes, etc. They(CLR team) are amazing, they are doing more than what we think. I admire them for atleast making us to think this much. This was something beyond my knowledge before, now somewhat clear.Hodosh
@joshpoley: When you get some time download the SSCLI code. I've been going through it and there are some really good clues in it. First, it is definitely changing the instruction pointer. Second, I can see the implementations of some of the methods you posted in your answer here. Third, I am seeing all kinds of hooks where the abort gets injected (try-catch-finally control flow handling, GC initiations, etc.) I'm still trying to determine if a thread context switch is another injection opportunity. My leading theory is the GC thread occasionally polls and that's how most get injected.Daveta
D
1

I downloaded the SSCLI code and started poking around. The code is difficult for me to follow (mostly because I am not a C++ or ASM expert), but I do see a lot of hooks where the aborts are injected semi-synchronously.

  • try/catch/finally/fault block flow control processing
  • GC activations (allocating memory)
  • proxied through soft interrupts (like with Thread.Interrupt) when in an alertable state
  • virtual call intercepts
  • JIT tail call preparations
  • unmanaged to managed transitions

That is just to name a few. What I wanted to know was how asynchronous aborts were injected. The general idea of hijacking the instruction pointer is part of how it happens. However, it is far more complex than what I described above. It does not appear that a Suspend-Modify-Resume idiom is always used. From the SSCLI code I can see that it does suspend and resume the thread in certain scenarios to prepare for the hijack, but this is not always the case. It looks to me that the hijack can occur while the thread is running full bore as well.

The article you linked to mentions that an abort flag is set on the target thread. This is technically correct. The flag is called TS_AbortRequested and there is a lot of logic that controls how this flag is set. There are checks for determining if a constrained execution region exists and whether the thread is currently in a try-catch-finally-fault block. Some of this work involves a stack crawl which means the thread must be suspended and resumed. However, how the change of the flag is detected is where the real magic happens. The article does not explain that very well.
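
As a rough model of that deferral logic, the sketch below honors an abort request only when the thread is outside a protected region (standing in for a finally block or a CER). This is my own simplification, not code from the SSCLI: AbortableThreadState, EnterProtected, LeaveProtected and DeferredAbortException are hypothetical names, and the real runtime decides this by crawling the stack rather than by keeping a counter.

```csharp
using System;

// Hypothetical exception type; stands in for ThreadAbortException.
sealed class DeferredAbortException : Exception { }

// Hypothetical per-thread state; one instance per worker thread.
sealed class AbortableThreadState
{
    private volatile bool _abortRequested; // plays the role of TS_AbortRequested
    private int _protectedDepth;           // touched only by the owning thread

    public void RequestAbort() => _abortRequested = true;

    public void EnterProtected() => _protectedDepth++;

    public void LeaveProtected()
    {
        _protectedDepth--;
        Poll(); // a pending abort is delivered as soon as the region ends
    }

    // The abort is raised only when requested AND outside protected code.
    public void Poll()
    {
        if (_abortRequested && _protectedDepth == 0)
            throw new DeferredAbortException();
    }
}

static class DeferredAbortDemo
{
    public static string Run()
    {
        var state = new AbortableThreadState();
        var log = "";

        state.EnterProtected();
        state.RequestAbort();   // the request arrives inside the protected region
        state.Poll();           // ignored for now: depth is 1
        log += "cleanup-done;"; // the protected work completes undisturbed

        try { state.LeaveProtected(); }
        catch (DeferredAbortException) { log += "aborted"; }
        return log;
    }

    static void Main() => Console.WriteLine(Run()); // prints "cleanup-done;aborted"
}
```

The point of the model is the ordering: the flag is set immediately, but the exception only surfaces once the protected work has finished.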

I already mentioned several semi-synchronous injection points in the list above. Those should be pretty trivial to understand. But how exactly does the asynchronous injection happen? Well, it appears to me that the JIT is the wizard behind the curtain here. There is some kind of polling mechanism built into the JIT/GC that periodically determines whether a collection should occur. This also provides an opportunity to check whether any of the managed threads have changed state (like having the abort flag set). If TS_AbortRequested is set, then the hijack happens then and there.

If you are looking at the SSCLI code here are some good functions to look at.

  • HandleThreadAbort
  • CommonTripThread
  • JIT_PollGC
  • JIT_TailCallHelper
  • COMPlusCheckForAbort
  • ThrowForFlowControl
  • JIT_RareDisableHelper

There are many other clues. Keep in mind that this is the SSCLI, so the method names may not match exactly with call stacks observed in production (like what Josh Poley discovered), but there will be similarities. Also, a lot of the thread hijacking is done with assembly code, so it is hard to follow at times. I single out JIT_PollGC because I believe this is where the interesting stuff happens. This is the hook that I believe the JIT will dynamically and strategically place into the executing thread. This is basically the mechanism for how those tight loops can still receive the abort injections. The target thread really is essentially polling for the abort request, but as part of a larger strategy to invoke the GC. [1]

So clearly the JIT, GC, and thread aborts are intimately related. It is obvious when you look at the SSCLI code. As an example, the method used to determine the safe points for thread aborts is the same as the one used to determine if the GC is allowed to run.
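
One way to picture that shared check is a single "trap" word that both a GC request and an abort request set, with the running thread testing it at every safe point. The sketch below is purely illustrative and assumes nothing about the real data layout: TrapReason, SafePointModel and AbortTrapException are invented names, and the "collection" is just a counter.

```csharp
using System;
using System.Threading;

[Flags]
enum TrapReason { None = 0, GcRequested = 1, AbortRequested = 2 }

// Hypothetical exception type; stands in for ThreadAbortException.
sealed class AbortTrapException : Exception { }

sealed class SafePointModel
{
    private int _trapFlags;          // one word covers every reason to stop
    public int SimulatedCollections; // how many times the pretend GC ran

    // Atomically OR the requested bits in so concurrent requests are not lost.
    public void Request(TrapReason reason)
    {
        int seen;
        do { seen = Volatile.Read(ref _trapFlags); }
        while (Interlocked.CompareExchange(ref _trapFlags, seen | (int)reason, seen) != seen);
    }

    // Fast path is a single read; the slow path runs only if some bit is set.
    public void SafePoint()
    {
        if (Volatile.Read(ref _trapFlags) != 0) SlowPath();
    }

    private void SlowPath()
    {
        // Take and clear all pending reasons in one step.
        var reasons = (TrapReason)Interlocked.Exchange(ref _trapFlags, 0);
        if ((reasons & TrapReason.GcRequested) != 0) SimulatedCollections++;
        if ((reasons & TrapReason.AbortRequested) != 0) throw new AbortTrapException();
    }
}

static class SafePointDemo
{
    public static string Run()
    {
        var model = new SafePointModel();
        model.SafePoint();                     // nothing pending, fast path only
        model.Request(TrapReason.GcRequested);
        model.SafePoint();                     // first pretend collection
        model.Request(TrapReason.GcRequested | TrapReason.AbortRequested);
        try { model.SafePoint(); return "no abort"; }
        catch (AbortTrapException)
        {
            return "collections=" + model.SimulatedCollections + ", aborted";
        }
    }

    static void Main() => Console.WriteLine(Run()); // prints "collections=2, aborted"
}
```

Sharing one word keeps the common case cheap: a loop that is neither being collected nor aborted pays for one read per safe point and nothing more.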


[1] Shared Source CLI Essentials, David Stutz, 2003, pg. 249-250

Daveta answered 9/8, 2013 at 20:49 Comment(0)
D
1

To get the QueueUserAPC to work you have to do two things.

  1. Acquire the target thread handle. Note that this is not the same thing as the native thread id.
  2. Allow the target thread to go into an alertable state.

Here is a complete program that demonstrates this.

class Program
{
    [DllImport("kernel32.dll", EntryPoint = "DuplicateHandle", CallingConvention = CallingConvention.StdCall, SetLastError = true)]
    public static extern bool DuplicateHandle([In] System.IntPtr hSourceProcessHandle, [In] System.IntPtr hSourceHandle, [In] System.IntPtr hTargetProcessHandle, out System.IntPtr lpTargetHandle, uint dwDesiredAccess, [MarshalAsAttribute(UnmanagedType.Bool)] bool bInheritHandle, uint dwOptions);

    [DllImport("kernel32.dll", EntryPoint = "GetCurrentProcess", CallingConvention = CallingConvention.StdCall, SetLastError = true)]
    public static extern IntPtr GetCurrentProcess();

    [DllImport("kernel32.dll")]
    private static extern IntPtr GetCurrentThread();

    [DllImport("kernel32.dll")]
    private static extern uint QueueUserAPC(ApcMethod pfnAPC, IntPtr hThread, UIntPtr dwData);

    private delegate void ApcMethod(UIntPtr dwParam);

    static void Main(string[] args)
    {
        Console.WriteLine("Main: " + Thread.CurrentThread.ManagedThreadId);
        IntPtr threadHandle = IntPtr.Zero;
        var threadHandleSet = new ManualResetEvent(false);
        var apcSet = new ManualResetEvent(false);
        var thread = new Thread(
            () =>
            {
                Console.WriteLine("thread started");
                // GetCurrentThread() only returns a pseudo-handle, so duplicate it into a real handle (2 = DUPLICATE_SAME_ACCESS).
                DuplicateHandle(GetCurrentProcess(), GetCurrentThread(), GetCurrentProcess(), out threadHandle, 0, false, 2);
                threadHandleSet.Set();
                apcSet.WaitOne();
                for (int i = 0; i < 10; i++)
                {
                    Console.WriteLine("thread waiting");
                    Thread.Sleep(1000);
                    Console.WriteLine("thread running");
                }
                Console.WriteLine("thread finished");
            });
        thread.Start();
        threadHandleSet.WaitOne();
        uint result = QueueUserAPC(DoApcCallback, threadHandle, UIntPtr.Zero);
        apcSet.Set();
        Console.ReadLine();
    }

    private static void DoApcCallback(UIntPtr dwParam)
    {
        Console.WriteLine("DoApcCallback: " + Thread.CurrentThread.ManagedThreadId);
    }

}

This essentially allows a developer to inject the execution of a method into any arbitrary thread. The target thread does not have to have a message pump like would be necessary for the traditional approach. One problem with this approach though is that the target thread has to be in an alertable state. So basically the thread must call one of the canned .NET blocking calls like Thread.Sleep, WaitHandle.WaitOne, etc. for the APC queue to execute.

Daveta answered 29/8, 2013 at 18:41 Comment(7)
Thank you brian for clearing APC, but main problem resides still at the same place. Without alertable state too runtime throws ThreadAbortException this is something beyond my knowledge :(Hodosh
Interestingly I had the same bug with GetCurrentThread(). Since it returns a pseudo-handle, when you actually use it from the main thread, it will actually queue the APC to the main thread's queue. I changed my code to Duplicate the handle so I wouldn't use the pseudo-handle.Eventuate
@joshpoley: Oh you're right! I didn't even notice that. My example is fixed now. Now I need to digest the update to your answer. It looks like you are finding some good clues.Daveta
Brian and @Josh It seems you both guys are sitting together one after another and posting answers, same kind of edits. What a coincidence!Hodosh
I think I get it. The point when a thread gets a new time slice but before it resumes execution is also an alertable point. But the MSDN documentation doesn't explicitly state that.Tintype
@confusopoly: You might be on the right track. I don't think it's technically in an alertable state at that point, but the context switch does provide an opportunity for injecting the exception via the instruction pointer hijacking method. Real alertable states can be used for injections without hijacking the IP and stack frame.Daveta
@Tintype Yes, Brian is correct. Because I tested with an infinite loop without a sleep, and requested APC, My Apc never gets invoked in that caseHodosh
L
0

It's easy, the underlying OS does it. If the thread is in any state except 'running on another core', there is no problem: its state is set to 'never run again'. If the thread is running on another core, the OS hardware-interrupts the other core via its interprocessor driver and so exterminates the thread.

Any mention of 'time-slice', 'quantum' etc. is just.....

Latreese answered 9/8, 2013 at 22:49 Comment(1)
The OS itself has no idea that you are running a managed thread and does not have some special state that it checks in order to throw a managed exception.Eventuate
