How to synchronize access to many objects
I have a thread pool with some threads (e.g. as many as number of cores) that work on many objects, say thousands of objects. Normally I would give each object a mutex to protect access to its internals, lock it when I'm doing work, then release it. When two threads would try to access the same object, one of the threads has to wait.

Now I want to save some resources and be scalable, as there may be thousands of objects but still only a handful of threads. I'm thinking about a class design where each thread has some sort of mutex or lock object, and assigns the lock to the object when the object should be accessed. This would save resources, as I would only have as many lock objects as threads.

Now comes the programming part, where I want to turn this design into code, but I don't quite know where to start. I'm programming in C++ and want to use Boost classes where possible, but self-written classes that handle these special requirements are fine. How would I implement this?

My first idea was to have one boost::mutex object per thread, and each object has a boost::shared_ptr that is initially unset (or NULL). Now when I want to access the object, I lock it by creating a scoped_lock object and assigning it to the shared_ptr. When the shared_ptr is already set, I wait on the present lock. This idea sounds like a heap full of race conditions, so I have more or less abandoned it. Is there another way to accomplish this design? Or a completely different way?

Edit: The above description is a bit abstract, so let me add a specific example. Imagine a virtual world with many objects (think > 100,000). Users could move through the world and modify objects (e.g. shoot arrows at monsters). When using only one thread, I'm fine with a work queue where modifications to objects are queued. I want a more scalable design, though. If 128-core processors are available, I want to use all 128 cores, so I'd use that number of threads, each with its own work queue. One solution would be spatial separation, e.g. one lock per area. That could reduce the number of locks needed, but I'm more interested in whether there's a design that uses as few locks as possible.

Augmentative answered 26/4, 2010 at 18:52 Comment(0)
You could use a mutex pool instead of allocating one mutex per resource or one mutex per thread. When a lock is requested, first check the object in question. If it already has a mutex tagged to it, block on that mutex. If not, take a mutex out of the pool, assign it to the object, and lock it. Once the mutex is released and no other thread is waiting on it, clear the slot and return the mutex to the pool.
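As a rough illustration of this pool idea (and of the "when can it go back to the pool?" question raised in the comments below), here is a minimal C++ sketch. The class name MutexPool, the std::map keyed by an object id, and the condition-variable-based blocking are all my own assumptions, not code from the answer:

```cpp
#include <condition_variable>
#include <cstdint>
#include <map>
#include <mutex>

// Sketch of a mutex pool: a slot is tagged to an object id only while
// some thread holds or waits for it; afterwards the slot is recycled.
class MutexPool {
public:
    void lock(std::uint64_t objectId) {
        std::unique_lock<std::mutex> guard(poolMutex_);
        Entry& e = entries_[objectId];   // creates the slot on first use
        ++e.waiters;
        while (e.locked)                 // block while another thread holds it
            e.cv.wait(guard);
        e.locked = true;
        --e.waiters;
    }

    void unlock(std::uint64_t objectId) {
        std::lock_guard<std::mutex> guard(poolMutex_);
        auto it = entries_.find(objectId);
        it->second.locked = false;
        if (it->second.waiters == 0)
            entries_.erase(it);          // nobody waiting: return slot to pool
        else
            it->second.cv.notify_one();  // wake one waiter for this object
    }

    // For testing/inspection: is a slot currently tagged to this object?
    bool inUse(std::uint64_t objectId) {
        std::lock_guard<std::mutex> guard(poolMutex_);
        return entries_.count(objectId) != 0;
    }

private:
    struct Entry {
        bool locked = false;             // does some thread hold this slot?
        int waiters = 0;                 // threads blocked on this slot
        std::condition_variable cv;
    };
    std::mutex poolMutex_;               // protects the map itself
    std::map<std::uint64_t, Entry> entries_;
};
```

The waiters counter is what makes safe recycling possible: a slot only goes back to the pool when no thread holds it and none is waiting on it.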

Estrone answered 26/4, 2010 at 21:26 Comment(8)
That sounds like it will solve the problem. As objects have a unique identifier, the mutex pool has a way to identify the objects. I'm trying it out. Thanks!Augmentative
Doesn't that just move the problem up a level? Now we need a mutex on the object to make sure that it is safe to assign a mutex to the object.Approbate
No, you just need a mutex to protect your mutex pool (a map of object ids to mutexes in use).Augmentative
@vividos. You would require one mutex to protect your mutex pool, but the action of getting a mutex off the pool should be relatively quick as long as one is available. If the OP's purpose is to perform a long action on an item that requires locking for a period of time, it is this "long-held" lock that you want to apply to just the one object.Talishatalisman
This applies to critical sections as well (added so that searching for critical section shows this).Morgun
Did you implement this solution? I have a similar problem and I would like to know how you released the mutex back to the pool. I mean, how do you know, when you release the mutex, that it can safely be put back in the pool if you do not know whether another thread is holding it?Sheepshead
@LeonardoBernardini: Years ago, yes. These days everything is lock-free. But you could possibly implement a reference count.Estrone
I was thinking about that, but how do you implement lock-free access to queued items? Do you have a reference? Many thanks!Sheepshead
Without knowing it, what you were looking for is Software Transactional Memory (STM).

STM systems manage the needed locks internally to ensure the ACI properties (Atomic, Consistent, Isolated). This is an active research area. You can find a lot of STM libraries; in particular I'm working on Boost.STM (the library is not yet ready for beta test, and the documentation is not really up to date, but you can play with it). There are also some compilers that are introducing TM (such as the Intel, IBM, and Sun compilers). You can get the draft specification from here.

The idea is to identify the critical regions as follows:

transaction {
  // transactional block
}

and let the STM system manage the needed locks while ensuring the ACI properties.

The Boost.STM approach lets you write things like:

int inc_and_ret(stm::object<int>& i) {
  BOOST_STM_TRANSACTION {
    return ++i;
  } BOOST_STM_END_TRANSACTION 
}

You can think of the BOOST_STM_TRANSACTION/BOOST_STM_END_TRANSACTION pair as delimiting a scoped implicit lock.

The cost of this pseudo-transparency is 4 bytes of metadata for each stm::object.

Even if this is far from your initial design, I really think it is what was behind your goal.

Yancey answered 26/4, 2010 at 20:34 Comment(5)
Boost.STM sounds like an interesting library. I'm going to have a look at it. Thanks!Augmentative
Note that the vbe branch is more up to date than the trunk. :)Yancey
@Augmentative Does Boost.STM respond to your needs?Yancey
I didn't try out the library yet. I assume that all objects that should be protected by access from two transactions should be stored as stm::object<MyType> somewhere. I think STM could also be used to solve my problem.Augmentative
@Vicente I have looked at Boost.STM and the various articles of research as I have an interest in this issue myself, i.e. a container that allows locking of individual items without having to create a mutex for each one. The transactions on each item should be "pessimistic" but you should be allowed concurrent transactions on other parts of the collection. I have not seen any documentation on your class and how it applies to this particular model.Talishatalisman
I doubt there's any clean way to accomplish your design. The problem is that assigning the mutex to the object modifies the contents of the object -- so you need a mutex to protect the object from several threads trying to assign mutexes to it at once; to keep that first mutex assignment safe, you'd need another mutex to protect the first one.

Personally, I think what you're trying to cure probably isn't a problem in the first place. Before spending much time on fixing it, I'd do a bit of testing to see what (if anything) you lose by simply including a mutex in each object and being done with it. I doubt you'll need to go any further than that.

If you need to do more than that, I'd think of having a thread-safe pool of objects: any time a thread wants to operate on an object, it has to obtain ownership from that pool. The call to obtain ownership would release any object currently owned by the requesting thread (to avoid deadlocks), and then grant ownership of the requested object (blocking if the object is currently owned by another thread). The object pool manager would probably run in a thread of its own, automatically serializing all access to the pool, so the pool-management code could avoid locking the variables that record who currently owns which object.
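A minimal sketch of this ownership-pool scheme follows. All names (OwnershipPool, acquire, release) are invented, and a mutex plus condition variable stand in for the dedicated manager thread the answer describes; the key behavior shown is that acquiring a new object first releases whatever the calling thread already owns:

```cpp
#include <condition_variable>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>

// A thread asks the pool for an object; the call first releases whatever
// that thread already owns (avoiding deadlock), then blocks until the
// requested object is free.
class OwnershipPool {
public:
    void acquire(std::uint64_t objectId) {
        std::unique_lock<std::mutex> guard(mutex_);
        releaseCurrentLocked();                       // give up previous object first
        while (owners_.count(objectId))               // wait until nobody owns it
            freed_.wait(guard);
        owners_[objectId] = std::this_thread::get_id();
        owned_[std::this_thread::get_id()] = objectId;
    }

    void release() {
        std::lock_guard<std::mutex> guard(mutex_);
        releaseCurrentLocked();
    }

    bool isOwned(std::uint64_t objectId) {
        std::lock_guard<std::mutex> guard(mutex_);
        return owners_.count(objectId) != 0;
    }

private:
    // Caller must hold mutex_. Drops the calling thread's current object.
    void releaseCurrentLocked() {
        auto it = owned_.find(std::this_thread::get_id());
        if (it != owned_.end()) {
            owners_.erase(it->second);
            owned_.erase(it);
            freed_.notify_all();                      // wake threads waiting for an object
        }
    }

    std::mutex mutex_;
    std::condition_variable freed_;
    std::map<std::uint64_t, std::thread::id> owners_; // object -> owning thread
    std::map<std::thread::id, std::uint64_t> owned_;  // thread -> owned object
};
```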

Gainless answered 26/4, 2010 at 19:20 Comment(1)
You can use boost::call_once to assign the mutex to the object. The cost is the once_flag variable, which you can reset as you release the mutex back to the pool. The solution may involve some "global" locking and unlocking, but I think the point is that he may wish to retain a lock on an object for a significant period while allowing the rest of the system to remain available for use.Talishatalisman
Personally, here's what I would do. You have a number of objects, all of which probably have a key of some sort, say names. So take the following list of people's names:

 Bill Clinton
 Bill Cosby 
 John Doe
 Abraham Lincoln 
 Jon  Stewart 

So now you would create a number of lists, say one per letter of the alphabet. The two Bills would go in one list, John and Jon in another, and Abraham in his own.

Each list would be assigned to a specific thread, and access would have to go through that thread (you would marshal operations on an object onto that thread -- a great use of functors). Then you only have two places to lock:

 thread() {
     loop {
         scoped_lock lock(list.mutex);
         list.objectAccess();
     }
 }

 list_add() {
     scoped_lock lock(list.mutex);
     list.add(..);
 }

Keep the locks to a minimum. If you're still doing a lot of locking, you can batch the work: instead of processing one object per lock acquisition, process, say, five, to reduce the time spent acquiring locks. If your data set grows or is keyed by number, you can segregate the data as finely as you like to keep the locking to a minimum.
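The bucketing above can be sketched as follows. The class name ShardedDirectory and the modulo "hash" on the first letter are invented for illustration; each bucket carries its own mutex, so threads touching different buckets never contend:

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Objects are partitioned by the first letter of their key; each
// partition has its own mutex, limiting contention to one bucket.
class ShardedDirectory {
public:
    void add(const std::string& name) {
        Shard& s = shardFor(name);
        std::lock_guard<std::mutex> lock(s.mutex);
        s.items.push_back(name);
    }

    std::size_t count(const std::string& name) {
        Shard& s = shardFor(name);
        std::lock_guard<std::mutex> lock(s.mutex);
        std::size_t n = 0;
        for (const auto& item : s.items)
            if (item == name) ++n;
        return n;
    }

private:
    struct Shard {
        std::mutex mutex;
        std::vector<std::string> items;
    };

    Shard& shardFor(const std::string& name) {
        unsigned char c = name.empty()
            ? 0
            : (unsigned char)std::toupper((unsigned char)name[0]);
        return shards_[c % shards_.size()];   // crude bucket choice by letter
    }

    std::array<Shard, 26> shards_;
};
```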

Gemeinschaft answered 26/4, 2010 at 19:32 Comment(0)
It sounds to me like you need a work queue. If the lock on the work queue becomes a bottleneck, you could switch it around so that each thread has its own work queue, and some sort of scheduler gives each incoming object to the thread with the least amount of work to do. The next level up from that is work stealing, where threads that have run out of work look at the work queues of other threads. (See Intel's Threading Building Blocks library.)
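A toy sketch of the least-loaded dispatch described above (all names invented; real implementations such as Intel TBB use lock-free deques and work stealing rather than this naive linear scan):

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <vector>

// One queue per worker thread; a dispatcher hands each incoming job to
// whichever queue currently has the least work.
class LeastLoadedScheduler {
public:
    explicit LeastLoadedScheduler(std::size_t workers) : queues_(workers) {}

    // Dispatcher side: push onto the least-loaded queue.
    void submit(std::function<void()> job) {
        std::size_t best = 0;
        for (std::size_t i = 1; i < queues_.size(); ++i)
            if (size(i) < size(best)) best = i;
        std::lock_guard<std::mutex> lock(queues_[best].mutex);
        queues_[best].jobs.push_back(std::move(job));
    }

    // Worker side: run one job from this worker's own queue, if any.
    bool runOne(std::size_t worker) {
        std::function<void()> job;
        {
            std::lock_guard<std::mutex> lock(queues_[worker].mutex);
            if (queues_[worker].jobs.empty()) return false;
            job = std::move(queues_[worker].jobs.front());
            queues_[worker].jobs.pop_front();
        }
        job();   // run outside the queue lock
        return true;
    }

    std::size_t size(std::size_t worker) {
        std::lock_guard<std::mutex> lock(queues_[worker].mutex);
        return queues_[worker].jobs.size();
    }

private:
    struct Queue {
        std::mutex mutex;
        std::deque<std::function<void()>> jobs;
    };
    std::vector<Queue> queues_;   // never resized: mutexes must not move
};
```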

Approbate answered 26/4, 2010 at 19:40 Comment(1)
I already have a work queue per thread, the question is about locking the objects when working on them.Augmentative
If I follow you correctly ....

#include <pthread.h>
#include <semaphore.h>

struct table_entry {
    void *   pObject;     /* substitute with your object */
    sem_t    sem;         /* init to 0 (empty) */
    int      nPenders;    /* init to zero */
};

/* search(), add_object_to_table() and remove_from_table() are assumed
   helpers over whatever table structure you choose (hash map, tree, ...). */
struct table_entry *  table;
pthread_mutex_t       mutex = PTHREAD_MUTEX_INITIALIZER;

void object_lock (void * pObject) {
    struct table_entry * pEntry;
    int found;

    goto label;                   /* yes, it is an evil goto */

    do {
        pEntry->nPenders++;
        pthread_mutex_unlock (&mutex);
        sem_wait (&pEntry->sem);  /* sleep until the current holder posts */
label:
        pthread_mutex_lock (&mutex);
        found = search (table, pObject, &pEntry);
    } while (found);

    add_object_to_table (table, pObject);
    pthread_mutex_unlock (&mutex);
}

void object_unlock (void * pObject) {
    struct table_entry * pEntry;

    pthread_mutex_lock (&mutex);
    pEntry = remove_from_table (table, pObject);  /* assuming it is in the table */
    if (pEntry->nPenders != 0) {
        pEntry->nPenders--;
        sem_post (&pEntry->sem);  /* wake exactly one waiting thread */
    }
    pthread_mutex_unlock (&mutex);
}

The above should work, but it does have some potential drawbacks such as ...

  1. A possible bottleneck in the search.
  2. Thread starvation. There is no guarantee that any given thread will get out of the do-while loop in object_lock().

However, depending upon your setup, these potential drawbacks might not matter.

Hope this helps.

Micro answered 26/4, 2010 at 19:16 Comment(1)
This seems to implement the ideas behind the mutex pool described by John Dibling. Thanks for sharing the idea!Augmentative
We have an interest in a similar model here. A solution we have considered is to have a global (or shared) lock, used in the following manner:

  • A flag that can be atomically set on the object. If you set the flag, you own the object.
  • You perform your action, then reset the flag and signal (broadcast) a condition variable.
  • If the acquire failed, you wait on the condition variable. When it is broadcast, you check the flag again to see whether the object has become available.

It does appear, though, that we need to lock the mutex each time we change the value of this flag. So there is a lot of locking and unlocking, but you never need to hold the lock for a long period.

With a "shared" lock you have one lock applying to multiple items. You would use some kind of "hash" function to determine which mutex/condition-variable pair applies to a particular entry.
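A hedged sketch of this flag-plus-shared-condition-variable scheme; the stripe count and all names (StripedLocks, LockableObject) are my assumptions. The broadcast is deliberate: several objects share one stripe, so every waiter wakes and re-checks its own object's flag, which also addresses the race question in the comment below:

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Each object carries one atomic "owned" flag; a shared mutex/condition-
// variable pair (chosen by hashing the object id) is used only for
// sleeping and waking waiters.
struct LockableObject {
    std::uint64_t id = 0;
    std::atomic<bool> owned{false};
};

class StripedLocks {
public:
    void lock(LockableObject& obj) {
        Stripe& s = stripeFor(obj.id);
        std::unique_lock<std::mutex> guard(s.mutex);
        // Atomically try to claim the flag; if another thread owns the
        // object, sleep until a release is broadcast, then retry.
        while (obj.owned.exchange(true))
            s.cv.wait(guard);
    }

    void unlock(LockableObject& obj) {
        Stripe& s = stripeFor(obj.id);
        std::lock_guard<std::mutex> guard(s.mutex);
        obj.owned.store(false);
        s.cv.notify_all();   // broadcast: several objects share this stripe
    }

private:
    struct Stripe {
        std::mutex mutex;
        std::condition_variable cv;
    };

    Stripe& stripeFor(std::uint64_t id) {
        return stripes_[id % kStripes];   // the "hash" choosing the shared lock
    }

    static constexpr std::size_t kStripes = 16;
    Stripe stripes_[kStripes];
};
```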

Talishatalisman answered 25/11, 2010 at 13:21 Comment(1)
Wouldn't there be a race between multiple threads waiting on the condition variable? Or is only one waiting thread woken up?Augmentative
Answering the following question under @JohnDibling's post:

Did you implement this solution? I have a similar problem and I would like to know how you released the mutex back to the pool. I mean, how do you know, when you release the mutex, that it can safely be put back in the pool if you do not know whether another thread is holding it?

by @LeonardoBernardini


I'm currently trying to solve the same kind of problem. My approach is to create your own mutex struct (call it counterMutex) with a counter field and the real resource mutex field. Every time you try to lock the counterMutex, you first increment the counter, then lock the underlying mutex. When you're done with it, you decrement the counter and unlock the mutex; after that, check the counter to see whether it is zero, which means no other thread is trying to acquire the lock. If so, put the counterMutex back into the pool. Is there a race condition when manipulating the counter, you may ask? The answer is no: remember you have a global mutex ensuring that only one thread can access the counterMutex at a time.
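The counterMutex idea might look roughly like this in C++. All names are invented, and the std::shared_ptr is my own addition to keep the record alive while a thread blocks on the real mutex; the counter itself is only ever touched under the global mutex, as the answer describes:

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>

// A counter of interested threads plus the real resource mutex; one
// global mutex serializes all counter updates and pool bookkeeping.
struct CounterMutex {
    int counter = 0;          // threads holding or waiting on 'real'
    std::mutex real;          // the real resource mutex
};

class CounterMutexPool {
public:
    void lock(std::uint64_t objectId) {
        std::shared_ptr<CounterMutex> cm;
        {
            std::lock_guard<std::mutex> g(globalMutex_);
            auto& slot = pool_[objectId];
            if (!slot) slot = std::make_shared<CounterMutex>();
            ++slot->counter;              // register interest first
            cm = slot;
        }
        cm->real.lock();                  // may block, outside the global mutex
    }

    void unlock(std::uint64_t objectId) {
        std::lock_guard<std::mutex> g(globalMutex_);
        auto it = pool_.find(objectId);
        it->second->real.unlock();
        if (--it->second->counter == 0)   // no other thread wants it:
            pool_.erase(it);              // return the record to the pool
    }

    // For testing/inspection: is a record currently assigned to this object?
    bool inPool(std::uint64_t objectId) {
        std::lock_guard<std::mutex> g(globalMutex_);
        return pool_.count(objectId) != 0;
    }

private:
    std::mutex globalMutex_;
    std::map<std::uint64_t, std::shared_ptr<CounterMutex>> pool_;
};
```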

Paulsen answered 8/12, 2015 at 11:25 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.