Isolate exceptions thrown in an AppDomain to not Crash the Application

TL;DR: How do you isolate add-in exceptions from killing the main process?

I want to have a very stable .Net application that runs less stable code in an AppDomain. This would appear to be one of the prime purposes of the AppDomain in the first place (well, that and security sandboxing) but it doesn't appear to work.

For instance in AddIn.exe:

public static class Program
{
    public static void Main(string[] args)
    {
        throw new Exception("test");
    }
}

Called in my 'stable' code with:

var domain = AppDomain.CreateDomain("sandbox");
domain.UnhandledException += (sender, e) => { 
    Console.WriteLine("\r\n ## Unhandled: " + ((Exception) e.ExceptionObject).Message);
};
domain.ExecuteAssemblyByName("AddIn.exe", "arg A", "arg B");

The exception thrown in the AppDomain gets passed straight to the application that created the domain. I can log these with domain.UnhandledException and catch them in the wrapper application.

However, there are more problematic exceptions thrown, for instance:

public static class Program
{
    public static void Main(string[] args)
    {
        Stackoverflow(1);
    }

    static int Stackoverflow(int x)
    {
        return Stackoverflow(++x);
    }
}

This throws a StackOverflowException that kills the entire application every time. It doesn't even fire domain.UnhandledException first; the whole process simply dies.

In addition, calling things like Environment.Exit() from inside the AppDomain also kills the parent application: do not pass GO, do not collect £200, and don't run any finaliser or Dispose().

It seems from this that AppDomain fundamentally doesn't do what it claims (or at least what it appears to claim) to do, as it just passes all exceptions straight to the parent domain, making it useless for isolation and pretty weak for any kind of security (if I can take out the parent process I can probably compromise the machine). That would be a pretty fundamental failure in .Net, so I must be missing something in my code.

Am I missing something? Is there some way to make AppDomain actually isolate the code that it's running and unload when something bad happens? Am I using the wrong thing and is there some other .Net feature that does provide exception isolation?

Infinitive answered 12/6, 2015 at 16:48 Comment(0)

I'll throw in some random thoughts, but what @Will has said is correct regarding permissions, CAS, security transparency, and sandboxing. AppDomains are not quite superman. Regarding exceptions though, an AppDomain is capable of handling most unhandled exceptions. The category it cannot handle is called asynchronous exceptions. Finding documentation on such exceptions is a little more difficult now that we have async/await, but it exists, and they come in three common forms:

StackOverflowException
OutOfMemoryException
ThreadAbortException

These exceptions are said to be asynchronous because they can be thrown anywhere, even between CIL opcodes. The first two are about the whole environment dying. The CLR lacks the powers of a Phoenix: it cannot handle these exceptions because the means of doing so are already dead. Note that these rules only apply when the CLR throws them. If you just new up an instance and throw it yourself, they behave like normal exceptions.
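As a minimal sketch of that last point, the snippet below throws these exception types from user code and catches them like any other exception. The class and method names (AsyncExceptionDemo, CanCatch) are made up for illustration. ThreadAbortException is left out because it has no public constructor.

```csharp
using System;

public static class AsyncExceptionDemo
{
    // Throws the given instance from user code and reports whether an
    // ordinary catch block received that same instance.
    public static bool CanCatch(Exception toThrow)
    {
        try
        {
            throw toThrow;
        }
        catch (Exception caught)
        {
            return ReferenceEquals(caught, toThrow);
        }
    }

    public static void Main()
    {
        // Neither of these is raised by the CLR itself, so both are catchable.
        Console.WriteLine(CanCatch(new StackOverflowException()));
        Console.WriteLine(CanCatch(new OutOfMemoryException()));
    }
}
```

This says nothing about the CLR-raised versions, which is where the special treatment described above applies.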

Sidenote: If you ever peek at a memory dump of a process that is hosting the CLR, you will see there are always OutOfMemoryException, ThreadAbortException, and StackOverflowException on the heap, but they have no roots you can see, and they never get GCed. What gives? The reason they are there is because the CLR preallocates them - it wouldn't be able to allocate them at the time they are needed. It wouldn't be able to allocate an OutOfMemoryException when we're out of memory.

There is a piece of software that is able to handle all of these exceptions. Starting in 2005, SQL Server has had the ability to run .NET assemblies with a feature called SQLCLR. SQL Server is a rather important process, and having a .NET assembly throw an OutOfMemoryException and bring down the entire SQL Server process seemed tremendously undesirable, so the SQL team doesn't let that happen.

They do this using a .NET 2.0 feature called constrained execution and critical regions. This is where things like ExecuteCodeWithGuaranteedCleanup come into play. If you are able to host the CLR yourself, start with native code and spin up the CLR yourself, you are then able to change the escalation policy: from native code you are able to handle those managed exceptions. This is how SQL CLR handles those situations.

Roderick answered 12/6, 2015 at 20:8 Comment(0)

You can't do anything about Environment.Exit(), just like you can't prevent a user from killing your process in Task Manager. Static analysis for this could be circumvented, as well. I wouldn't worry too much about that. There are things you can do, and things you really can't.

The AppDomain does do what it claims to do. However, what it actually claims to do and what you believe it claims to do are two different things.

Unhandled exceptions anywhere will take down your application. AppDomains don't protect against these. But you can prevent unhandled exceptions from crossing AppDomain boundaries by the following (sorry, no code)

Create your AppDomain
Load and unwrap your plugin controller in this AppDomain
Control plugins through this controller, which
Isolates calls to 3rd party plugins by wrapping them in try/catch blocks.

Really, the only thing an AppDomain gives you is the ability to load, isolate and unload assemblies that you do not fully trust during runtime. You cannot do this within the executing AppDomain. All loaded assemblies stay until execution halts, and they enjoy the same permission set as all other code in the AppDomain.

To be a touch clearer, here's some pseudocode that looks like c# that prevents 3rd-party code from throwing exceptions across the AppDomain boundary.

// Derives from MarshalByRefObject so the application AppDomain gets a proxy, not a copy.
public class PluginHost : MarshalByRefObject, IPluginHost, IPlugin
{
    private IPlugin _wrapped;

    void IPluginHost.Load(string filename, string typename)
    {
        // load the assembly (filename) into the AppDomain.
        // Activator.CreateInstance the typename to create 3rd party plugin
        // _wrapped = the plugin instance
    }

    void IPlugin.DoWork()
    {
        try
        {
            _wrapped.DoWork();
        }
        catch (Exception ex)
        {
            // log
            // unload plugin whatevs
        }
    }
}

This type would be created in your Plugin AppDomain, and its proxy unwrapped in the application AppDomain. You use it to puppet the plugin within the Plugin AppDomain. It prevents exceptions from crossing AppDomain boundaries, performs loading tasks, etc. Pulling a proxy of the plugin type itself into the application AppDomain is very risky: any object that is NOT a MarshalByRefObject and that the proxy can somehow get into your hands (e.g., via throw new MyCustomException()) will result in the plugin assembly being loaded in the application AppDomain, thus rendering your isolation efforts null and void.

(this is a bit oversimplified)
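A rough, self-contained sketch of both sides of that arrangement follows. It assumes the .NET Framework (AppDomain.CreateDomain is not supported on .NET Core and later), and every type name here (IPlugin, FaultyPlugin, PluginController, IsolationDemo) is hypothetical. To keep it in one file, the "plugin" lives in the same assembly as the host, which a real setup would not do. The controller returns a plain string rather than rethrowing, so no plugin-defined type crosses the boundary.

```csharp
using System;

// Hypothetical plugin contract.
public interface IPlugin { void DoWork(); }

// Stand-in for untrusted third-party code that always fails.
public class FaultyPlugin : IPlugin
{
    public void DoWork() { throw new InvalidOperationException("plugin failed"); }
}

// Lives in the plugin AppDomain; the application only ever holds a proxy.
public class PluginController : MarshalByRefObject
{
    private IPlugin _plugin;

    public void Load(string assemblyName, string typeName)
    {
        // Runs inside the plugin AppDomain, so the plugin is created there.
        _plugin = (IPlugin)Activator.CreateInstance(assemblyName, typeName).Unwrap();
    }

    // Reports failure as a string instead of letting the exception escape.
    public string TryDoWork()
    {
        try { _plugin.DoWork(); return null; }
        catch (Exception ex) { return ex.GetType().Name + ": " + ex.Message; }
    }
}

public static class IsolationDemo
{
    public static string RunIsolated()
    {
        var setup = new AppDomainSetup { ApplicationBase = AppDomain.CurrentDomain.BaseDirectory };
        AppDomain domain = AppDomain.CreateDomain("plugins", null, setup);
        try
        {
            var controller = (PluginController)domain.CreateInstanceAndUnwrap(
                typeof(PluginController).Assembly.FullName,
                typeof(PluginController).FullName);
            controller.Load(typeof(FaultyPlugin).Assembly.FullName, typeof(FaultyPlugin).FullName);
            return controller.TryDoWork();
        }
        finally
        {
            AppDomain.Unload(domain); // drops the plugin assembly along with the domain
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunIsolated() ?? "ok");
    }
}
```

As noted elsewhere in this thread, this only covers ordinary exceptions; a CLR-raised StackOverflowException in the plugin would still take the process down.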

Butcher answered 12/6, 2015 at 17:1 Comment(5)

I don't really expect to be able to stop the user, but I would expect to be able to protect my app from code launched in a sandbox - it wouldn't be much of a sandbox otherwise. I guess isolation and sandboxing aren't what AppDomains are supposed to do, but without sandboxing the security can be broken. Doesn't that make them pointless? I have to trust the code to handle all its exceptions but I don't trust it to write to disk? What am I missing? – Infinitive 12/6, 2015 at 17:51

@Infinitive well, you can protect your code by using AppDomains, but the protection is limited. They aren't pointless, as I've said you can control CAS, load/unload assemblies (unloading assemblies is big!), and isolate 3rd party code using them. They aren't superman, but they're not aquaman. And you don't have to trust "code" to "handle exceptions" as you can easily prevent unhandled exceptions from crossing the AppDomain boundary (not including uncatchables). You're missing the benefits because you only see what it doesn't give you, I believe. – Butcher 12/6, 2015 at 18:50

There are a few exceptions that do not play by the normal rules. The big three are StackOverflowException, OutOfMemoryException, and ThreadAbortException. These exist beyond what an AppDomain/Thread can handle. The only way to reasonably handle these cases is to host the CLR yourself and use constrained execution and reliability contracts. This is how SQLCLR is able to keep an out-of-memory condition in the CLR from taking down the SQL Server process with it. – Roderick 12/6, 2015 at 19:15

For me AppDomain is not providing any isolation. An unhandled exception raised in one of the child AppDomains is crashing the parent process. The exception is none of the three mentioned above. – Gesellschaft 2/1, 2018 at 6:21

"You can't do anything about Environment.Exit()" - actually you can in a reduced PermissionSet sandbox AppDomain. Exit requires UnmanagedCode permission so as soon as the untrusted code attempts this in the sandbox domain an exception will be thrown and can be caught by the sandbox manager as you said. learn.microsoft.com/en-us/dotnet/api/… – Matos 28/11, 2018 at 4:25
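A hedged sketch of the reduced-PermissionSet domain that comment describes, assuming the .NET Framework. SandboxFactory and CreateRestrictedDomain are made-up names; the grant set contains only permission to execute, so a demand for UnmanagedCode (which the comment says Environment.Exit makes) should fail with a SecurityException inside the sandbox.

```csharp
using System;
using System.Security;
using System.Security.Permissions;

public static class SandboxFactory
{
    // Creates an AppDomain whose code may execute but holds no other
    // permissions (no UnmanagedCode, no file I/O, and so on).
    public static AppDomain CreateRestrictedDomain(string applicationBase)
    {
        var permissions = new PermissionSet(PermissionState.None);
        permissions.AddPermission(new SecurityPermission(SecurityPermissionFlag.Execution));

        var setup = new AppDomainSetup { ApplicationBase = applicationBase };

        // No fully trusted assemblies are listed, so everything loaded
        // into this domain runs with the restricted grant set.
        return AppDomain.CreateDomain("sandbox", null, setup, permissions);
    }
}
```

A controller object meant to catch that SecurityException and report back would itself need to run in the sandbox, which in practice usually means listing its strong-named assembly as fully trusted; that part is not shown here.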
