Using Parallel Extensions with ThreadStatic attribute. Could it leak memory?
S

2

6

I'm using Parallel Extensions fairly heavily and I've just now encountered a case where using thread local storage might be sensible to allow re-use of objects by worker threads. As such I was looking at the ThreadStatic attribute which marks a static field/variable as having a unique value per thread.
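For readers unfamiliar with the attribute, here is a minimal sketch of what it does. The names `ScratchBuffer` and `ThreadStaticDemo` are made up for illustration and are not part of the question's code. The value is created lazily because a field initializer on a `[ThreadStatic]` field only runs once, on whichever thread first touches the type.

```csharp
using System;
using System.Threading;

// Hypothetical helper: one scratch buffer per thread.
public static class ScratchBuffer
{
    // Each thread sees its own copy of this field. A field initializer would
    // run only once (on the first thread to touch the type), so the buffer is
    // created lazily on first use by each thread instead.
    [ThreadStatic]
    private static int[] _buffer;

    public static int[] Get()
    {
        if (_buffer == null)
        {
            _buffer = new int[1024];
        }
        return _buffer;
    }
}

public static class ThreadStaticDemo
{
    // Returns true when two different threads received different buffers.
    public static bool BuffersDifferAcrossThreads()
    {
        int[] fromCaller = ScratchBuffer.Get();
        int[] fromWorker = null;

        var worker = new Thread(() => fromWorker = ScratchBuffer.Get());
        worker.Start();
        worker.Join();

        return !ReferenceEquals(fromCaller, fromWorker);
    }
}
```

Repeated calls to `ScratchBuffer.Get()` on the same thread return the same array, which is the re-use the question is after.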

It seems to me that it would be unwise to use PE with the ThreadStatic attribute without any guarantee of thread re-use by PE. That is, if threads are created and destroyed to some degree, would the variables (and thus the objects they point to) remain in thread local storage for some indeterminate amount of time, thus causing a memory leak? Or perhaps the thread storage is tied to the threads and disposed of when the threads are disposed? But then you still potentially have threads in a pool that are long-lived and that accumulate thread local storage from the various pieces of code the threads are used for.

Is there a better approach to obtaining thread local storage with PE?

Thank you.

Submultiple answered 12/6, 2010 at 17:9 Comment(1)
The correct terminology is "retired" rather than "destroyed" regarding threads being removed from the pool and then shuffling off their stacks.Shimkus
L
5

I would strongly encourage using the normal pattern for thread-local storage, described in this MSDN article.
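The linked article is not reproduced here, so the following is only a sketch of one pattern commonly used with Parallel Extensions: the `Parallel.For` overload that takes `localInit` and `localFinally` delegates. The `World` class and the `Simulation.EvaluateAll` method are hypothetical stand-ins for the asker's simulation. Note that `localInit` runs once per task the loop creates, which is not necessarily once per thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the simulation world.
public sealed class World
{
    public void Reset() { /* restore the initial grid state here */ }

    // Placeholder evaluation: returns a score derived from the agent id.
    public int Evaluate(int agentId) { return agentId * 2; }
}

public static class Simulation
{
    public static int[] EvaluateAll(int agentCount, out int worldsCreated)
    {
        var scores = new int[agentCount];
        int created = 0;

        Parallel.For<World>(
            0, agentCount,
            // localInit: called once per task to build that task's world.
            () =>
            {
                Interlocked.Increment(ref created);
                return new World();
            },
            // body: receives the task-local world and hands it to the next
            // iteration running on the same task.
            (i, loopState, world) =>
            {
                world.Reset();
                scores[i] = world.Evaluate(i);
                return world;
            },
            // localFinally: called once per task; nothing to clean up here.
            world => { });

        worldsCreated = created;
        return scores;
    }
}
```

With this shape the loop itself owns the lifetime of each world, so nothing is left behind in thread local storage after the loop finishes.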

When you use [ThreadStatic], what matters is whether or not a threadpool thread cleans up its TLS variables when it terminates. There isn't any suggestion in the MSDN docs that it doesn't. It wouldn't be hard to implement; it would only have to call the TlsFree() API function. I wrote a little test app and found no evidence of any leak.

Latinalatinate answered 12/6, 2010 at 18:2 Comment(0)
K
4

EDIT: Given Hans's answer, it sounds like the TLS actually would be cleaned up anyway... which just leaves this bit of the answer:

Do you really have no better way of reusing values within a thread? If there are two tasks which use the same thread (one completes, then the other runs) are they really going to want the same value? Are you actually just using this as a way of avoiding propagating the data in a more controlled way through your task?

Koenraad answered 12/6, 2010 at 17:19 Comment(9)
The scenario is a simulation of a grid-based 'world', independently evaluating a set of agents in said world. Hence to run in parallel I can create a new world, use it, and discard it within each parallel loop. My intention was to put a Reset() method on the world to allow re-use. I figure thread static storage gets me out of having to manage my own pool of 'worlds' with associated thread-locked access to the pool, etc.Submultiple
@the-locster: I'm afraid I still don't see the benefit of thread-local storage here. If it's within a task, why not just keep hold of the reference?Koenraad
Each evaluation puts one agent into one world all by itself. Thus if I have 8 CPU cores/threads I have 8 independent worlds being simulated at any given time - one world per thread.Submultiple
@the-locster: But don't you also have 8 agents? If so, why shouldn't the agent know about the world, instead of relying on the thread local storage?Koenraad
@Jon: I think the bit I didn't explain well is that there are many agents (hundreds or thousands) that need to be evaluated in a world. Hence one world per active core/thread rather than one per agent - otherwise I have thousands of worlds allocated in memory, each of which only gets used once.Submultiple
@the-locster: But what's special about the agents which execute in one thread which means they can share one world, but others can't? If agents can actually share worlds at any time, just not concurrently, then I would go for a simple pool rather than thread statics. Why introduce dependencies on threading when they're unnecessary?Koenraad
@Jon: "...then I would go for a simple pool rather than thread statics." Yes, a pool of worlds would fit well here. Essentially I'm fine-tuning to make this code as fast as possible - hence I'm looking at TLS as a means of avoiding the thread lock that would be required to access the pool (the high-speed nature of the code means that lock contention would probably be significant). Possibly this is misguided but I'd like to try it to determine which is fastest.Submultiple
@the-locster: If you're already using Parallel Extensions, then presumably you've got ConcurrentBag available to you, which could act as a pool if you know how many you need. How long does each agent take? The cost of acquiring a lock twice (once to retrieve the world from the pool, once to return it) is going to be insignificant unless the agents are also doing insignificant amounts of work.Koenraad
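A rough sketch of the pool idea from the comment above, assuming a hypothetical `GridWorld` class with a `Reset()` method. `WorldPool`, `Rent` and `Return` are illustrative names, not an existing API. Unlike the comment's suggestion of sizing the pool up front, this version creates a world on demand whenever the bag is empty.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the simulation world.
public sealed class GridWorld
{
    public void Reset() { /* restore the initial grid state here */ }

    // Placeholder evaluation: returns a score derived from the agent id.
    public int Evaluate(int agentId) { return agentId + 1; }
}

// Hypothetical pool backed by ConcurrentBag<T>.
public sealed class WorldPool
{
    private readonly ConcurrentBag<GridWorld> _idle = new ConcurrentBag<GridWorld>();
    private int _created;

    // Number of worlds allocated so far.
    public int Created { get { return Volatile.Read(ref _created); } }

    public GridWorld Rent()
    {
        GridWorld world;
        if (_idle.TryTake(out world))
        {
            return world;
        }
        // Bag was empty: allocate a fresh world.
        Interlocked.Increment(ref _created);
        return new GridWorld();
    }

    public void Return(GridWorld world)
    {
        world.Reset();
        _idle.Add(world);
    }
}

public static class PooledSimulation
{
    public static int[] EvaluateAll(int agentCount, WorldPool pool)
    {
        var scores = new int[agentCount];
        Parallel.For(0, agentCount, i =>
        {
            GridWorld world = pool.Rent();
            try
            {
                scores[i] = world.Evaluate(i);
            }
            finally
            {
                // Always hand the world back, even if evaluation throws.
                pool.Return(world);
            }
        });
        return scores;
    }
}
```

Whether this beats thread statics for very short evaluations would need measuring, as the discussion says.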
@Jon: Points noted. I'll experiment with ConcurrentBag; my limited experience with concurrent collections is that they tend to employ very efficient locking strategies (more so than Monitor.Enter/Exit). Thanks for the discussion.Submultiple
