How do I fix the deadlock on a threadpool thread that has a SynchronizationContext?
Asked Answered
H

2

6

I am getting intermittent deadlocks when using HttpClient to send http requests and sometimes they are never returning back to await SendAsync in my code. I was able to figure out the thread handling the request internally in HttpClient/HttpClientHandler for some reason has a SynchronizationContext during the times it is deadlocking. I would like to figure out how the thread getting used ends up with a SynchronizationContext, when normally they don't have one. I would assume that whatever object is causing this SynchronizationContext to be set is also blocking on the Thread, which is causing the deadlock.

Would I be able to see anything relevant in the TPL ETW events?

How can I troubleshoot this?



Edit 2: The place that I have been noticing these deadlocks is in a wcf ServiceContract(see code below) inside of a windows service. The SynchronizationContext that is causing an issue is actually a WindowsFormsSynchronizationContext, which I assume is caused by some control getting created and not cleaned up properly (or something similar). I realize there almost certainly shouldn't be any windows forms stuff going on inside of a windows service, and I'm not saying I agree with how it's being used. However, I didn't write any of the code using it, and I can't just trivially go change all of the references.

Edit: here is an example of the general idea of the wcf service I was having a problem with. It's a simplified version, not the exact code:

[ServiceContract]
[ServiceBehavior(InstanceContextMode = InstanceContextMode.Single, ConcurrencyMode = ConcurrencyMode.Multiple)]
internal class SampleWcfService
{
    private readonly HttpMessageInvoker _invoker;

    public SampleWcfService(HttpMessageInvoker invoker)
    {
        _invoker = invoker;
    }
    
    [WebGet(UriTemplate = "*")]
    [OperationContract(AsyncPattern = true)]
    public async Task<Message> GetAsync()
    {
        var context = WebOperationContext.Current;
        using (var request = CreateNewRequestFromContext(context))
        {
            var response = await _invoker.SendAsync(request, CancellationToken.None).ConfigureAwait(false);
            var stream = response.Content != null ? await response.Content.ReadAsStreamAsync().ConfigureAwait(false) : null;
            return StreamMessageHelper.CreateMessage(MessageVersion.None, "GETRESPONSE", stream ?? new MemoryStream());
        }
    }
}

Adding ConfigureAwait(false) to the 2 places above didn't completely fix my problem because a threadpool thread used to service a wcf request coming into here may already have a SynchronizationContext. In that case the request makes it all the way through this whole GetAsync method and returns. However, it still ends up deadlocked in System.ServiceModel.Dispatcher.TaskMethodInvoker, because in that microsoft code, it doesn't use ConfigureAwait(false) and I want to assume there is a good reason for that (for reference):

var returnValueTask = returnValue as Task;

if (returnValueTask != null)
{
    // Only return once the task has completed                        
    await returnValueTask;
}

It feels really wrong, but would converting this to using APM (Begin/End) instead of using Tasks fix this? Or, is the only fix to just correct the code that is not cleaning up its SynchronizationContext properly?

Humfrey answered 16/3, 2018 at 0:9 Comment(12)
Do you have any threads? How come you posted no code?Muleteer
@Muleteer There's not much I can add that I haven't already said in the question. I was able to tell that the SynchronizationContext was set in the problem thread based on a process dump, but the thread doesn't have a managed stacktrace, only unmanaged stuff.Humfrey
The Theory that something is initiating a SynchronizationContext and then waiting on it finishing (wich never happens) is solid. But without you showing us the code, there is only so much we can tell you.Wold
Are you awaiting all your Tasks all the way up or blocking at some point? If the latter, either change stop blocking and await or use ConfigureAwait(false) on your Tasks: blog.stephencleary.com/2012/07/dont-block-on-async-code.htmlBrisbane
@Brisbane i was able to get it to stop deadlocking by using ConfigureAwait(false). But I'm still trying to figure out how the SynchronizationContext is getting set in the first place, and then i'll decide based on that whether it makes sense to use the ConfigureAwait(false). Also I wanna make sure adding the ConfigureAwait(false) won't add any unintended side effects, even though it appears to help this case.Humfrey
What kind of an application is this? As I mentioned, I think somewhere upstream of your HttpClient code you are not awaiting a task, which until you added ConfigureAwait(false) was causing the deadlock when the async resumption attempted to grab the same context that was already held by the upstream thread. Here's some more reading: medium.com/bynder-tech/…Brisbane
@Brisbane It's a windows service. And I kind of figured what you said might be the case. And that's why in my question I mentioned ETW events, because there is a lot of code, and it's not easy to figure out where in the code this SynchronizationContext is being set at runtime. So I was hoping there was some way I could capture more info from the process to see the callstack before it's getting created, or something of that nature.Humfrey
How do you call the SampleWcfService.GetAsync method? Do you call Task.Run or anything similar using this GetAsync method?Suboceanic
@Suboceanic I don't specifically call it anywhere, it gets called internally by .Net code in System.ServiceModel.Dispatcher.TaskMethodInvoker. The code I posted isn't where the method gets called, but it's right above that. And then in the code I posted it awaits the Task that gets returned from invoking GetAsync.Humfrey
Also, how do I know what to fix in this question from downvotes w/out comments or suggested edits?Humfrey
Perhaps you're hitting the max of 2 concurrent connections per host limit. See #4092706 and #10404444Model
I'm manually setting that value to higher than 2 and there were no warnings in my wcf traces related to hitting the max connections limit. So, I don't think that is related, but thank you for the suggestion. Even if it was lower than it should be, I'm pretty sure that wouldn't cause it to deadlock like this. I would think it would just make each request take longer to return back to the client.Humfrey
H
4

Update: we now know we're dealing with a WindowsFormsSynchronizationContext (see comments), for whatever reason in a WCF application. It's no surprise then to see deadlocks since the point of that SyncContext is to run all continuations on the same thread.

You could try to to set WindowsFormsSynchronizationContext.AutoInstall to false. According to its docs, what it does is:

Gets or sets a value indicating whether the WindowsFormsSynchronizationContext is installed when a control is created

Assuming someone creates a WindowsForms control somewhere in your app, then that might be your issue and would potentially be solved by disabling this setting.

An alternative to get rid of an existing SynchronizationContext would be to just overwrite it with null, and later restoring it (if you're nice). This article describes this approach and provides a convenient SynchronizationContextRemover implementation you could use.

However, this probably won't work if the SyncContext is created by some library methods you use. I'm not aware of a way to prevent a SyncContext from being overwritten, so setting a dummy context won't help either.


Are you sure the SynchronizationContext is actually at fault here?

From this MSDN magazine article:

Default (ThreadPool) SynchronizationContext (mscorlib.dll: System.Threading)
The default SynchronizationContext is a default-constructed SynchronizationContext object. By convention, if a thread’s current SynchronizationContext is null, then it implicitly has a default SynchronizationContext.

The default SynchronizationContext queues its asynchronous delegates to the ThreadPool but executes its synchronous delegates directly on the calling thread. Therefore, its context covers all ThreadPool threads as well as any thread that calls Send. The context “borrows” threads that call Send, bringing them into its context until the delegate completes. In this sense, the default context may include any thread in the process.

The default SynchronizationContext is applied to ThreadPool threads unless the code is hosted by ASP.NET. The default SynchronizationContext is also implicitly applied to explicit child threads (instances of the Thread class) unless the child thread sets its own SynchronizationContext.

If the SynchronizationContext you are seeing is the default one, it should be fine (or rather, you will have a very hard time to avoid it being used).

Can't you provide more details / code about what's involved?

One thing that looks immediately suspicious to me in your code (though it may be completely fine) is that you have a using block that captures a static WebOperationContext.Current in request, which will both be captured by the generated async state machine. Again, might be fine, but there's a lot of potential for deadlocks here if something waits on WebOperationContext

Hasan answered 11/5, 2018 at 12:58 Comment(15)
As far as the using in relation to be WebOperationContext, all it's doing is copying some stuff from that to a newly created HttpRequestMessage, I don't think that should cause a problem.Humfrey
Also, I believe I have good reason for thinking that the SynchronizationContext is the problem. based on tracing I added, normally threads come through there with a null SynchronizationContext, and only the ones that have SynchronizationContext.Current not null are hanging at the place I mentioned in my question: await returnValueTask;Humfrey
And I'm not saying that there isn't a bug in our code somewhere that is causing this SynchronizationContext to come up intermittently, because there probably is. I'm trying to figure out if I can adjust this wcf service and adjacent code to workaround the issue (or figure out if there is something inherently wrong with the implementation), since none of this code I posted is actually using the thread's SynchronizationContext anyways.Humfrey
What do you mean by "waits on WebOperationContext"?Humfrey
All I'm saying is try to figure out if the added SynchronizationContext is the default one (it's just called "SynchronizationContext", not e.g. AspNetSynchronizationContext). As mentioned in the article, this context can be set by various things if none is already present. Which means it will be hard to track down where it is introduced and also hard to avoid it being added.Hasan
My final thought on WebOperationContext.Current is just that it is static and thus likely shared by other operations, as well as being captured by the async state machine. This is not a problem in itself, but if there are critical resources in it that can cause deadlocks or race conditions e.g. by locking on them, then it can be difficult to track down where exactly that happens. Again, this is just guesswork because we can't see the full picture. I'm merely pointing out that there are other areas that can cause your problem.Hasan
It's actually a WindowsFormsSynchronizationContext.Humfrey
In other words, it's not the default one. I have process/heap dumps to prove this and some other tracing info, I just don't think I necessarily want to directly post the files here, because it may have sensitive data. But, I can take screenshots.Humfrey
@Humfrey how does a WinForms SyncContext get in there? I updated my answer with some suggestions.Hasan
Yeah, that's a good question. I can see the places where it is being referenced with using System.Windows.Forms, but this is a big codebase, and it's non-trivial to try to figure out which of those places might be creating a WindowsFormsSynchronizationContext, and more importantly when. And that's the struggle of this whole issue. I don't think that windows forms should be referenced anywhere in the codebase for this service, but that's a whole other problem that I don't want to get into.Humfrey
TL;DR The main issue is the winforms assembly is referenced in a lot of files.Humfrey
Have you tried turning off that AutoInstall property? According to its docs, simply creating a control is enough to create a WinForms SyncContext, which the property prevents.Hasan
Thank you for the suggestion. I have not tried turning it off yet, but I will when I get a chance.Humfrey
I'm also trying to figure out if that would cause any other side effectsHumfrey
It appears setting WindowsFormsSynchronizationContext.AutoInstall to false seems to work in the testing I've done. I'm a little hesitant though, because I'm not sure what effect it might have on whatever windows forms stuff is happening in the background at runtime. And this issue is not the easiest to reproduce.Humfrey
D
2

Try below; I have found success in similar cases getting into the async rabbit hole.

var responsebytes = await response.Content.ReadAsByteArrayAsync();
MemoryStream stream = new MemoryStream(filebytes);

Response the stream variable.

Hope it helps.

Diachronic answered 10/5, 2018 at 12:56 Comment(5)
I'll try it when I get a chanceHumfrey
Do you mean new MemoryStream(responsebytes)?Humfrey
This doesn't change anything; I mean ReadAsStreamAsync is essentially doing the same thing internally, with some buffering added.Humfrey
@Humfrey Yes sorry, new MemoryStream(responsebytes). I miss-typed. In your code sample you are using two awaits, which is right but try: var stream = response.Content != null ? await response.Content.ReadAsStreamAsync().Result : null; You will still have one await and which should be sufficient for the async method. If this does not work - I am done and sorry. Thank you for trying though.Diachronic
First off, this doesn't compile: var stream = response.Content != null ? await response.Content.ReadAsStreamAsync().Result : null;. You are trying to await the Result in this code, instead of a Task. Also, making it synchronous won't even avoid the deadlock that I mentioned, and could even introduce more problems.Humfrey

© 2022 - 2024 — McMap. All rights reserved.