How do I take ownership of an abandoned boost::interprocess::interprocess_mutex?

My scenario: one server and some clients (though not many). The server can only respond to one client at a time, so they must be queued up. I'm using a mutex (boost::interprocess::interprocess_mutex) to do this, wrapped in a boost::interprocess::scoped_lock.

The thing is, if one client dies unexpectedly (i.e. no destructor runs) while holding the mutex, the other clients are in trouble, because they are waiting on that mutex. I've considered using a timed wait, so if a client waits for, say, 20 seconds and doesn't get the mutex, it goes ahead and talks to the server anyway.

Problems with this approach: 1) It does this every time. If the client is in a loop, talking constantly to the server, it has to wait for the timeout on every iteration. 2) If there are three clients and one of them dies while holding the mutex, the other two will each wait 20 seconds and then talk to the server at the same time, which is exactly what I was trying to avoid.

So, how can I say to a client, "hey there, it seems this mutex has been abandoned, take ownership of it"?

Evans answered 24/7, 2009 at 19:29 Comment(3)
If you're relying on the clients to do synchronization, you're doing it backward. You really should fix your server so it can accept multiple connections, even if it just makes the other connections wait while it serves one at a time. That lets you take the interprocess part out of the equation. - Luannaluanne
Fair point. However, my application was originally specified as having only one client at a time - I only recently (as in today) found out that there could be multiple clients. I attempted to solve it the easy way, but I suppose I'll have to come up with something more sophisticated. - Candracandy
It looks like the whole mutex mechanism is flawed without a recovery mechanism. I wish Boost would fix this. - Katharinekatharsis

Unfortunately, this isn't supported by the boost::interprocess API as-is. There are a few ways you could implement it however:

If you are on a POSIX platform with support for pthread_mutexattr_setrobust_np, edit boost/interprocess/sync/posix/thread_helpers.hpp and boost/interprocess/sync/posix/interprocess_mutex.hpp to use robust mutexes, and to handle the EOWNERDEAD return value from pthread_mutex_lock in some way.
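As a rough illustration of the robust-mutex option, here is a minimal sketch written against pthreads directly rather than as a patch to the Boost headers. It assumes the standardised names pthread_mutexattr_setrobust and pthread_mutex_consistent (the _np spellings are the older non-portable variants), and that the pthread_mutex_t already lives in memory shared by all clients. The helpers init_robust_mutex and lock_robust are hypothetical names introduced here, not part of Boost or POSIX.

```cpp
#include <pthread.h>
#include <cerrno>

// Hypothetical helper: initialise a process-shared, robust mutex.
// The caller is assumed to have placed *m in shared memory.
int init_robust_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Hypothetical helper: lock the mutex, taking over from a dead owner if needed.
// Returns 0 when the caller holds the lock; *recovered is set to true when the
// previous owner died while holding it.
int lock_robust(pthread_mutex_t* m, bool* recovered) {
    *recovered = false;
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        // We now own the mutex, but the protected state may be half-updated.
        // Application-specific repair would go here, before marking the
        // mutex usable again.
        rc = pthread_mutex_consistent(m);
        *recovered = (rc == 0);
    }
    return rc;
}
```

A client would call lock_robust in place of a plain lock and, when recovered comes back true, treat the conversation with the server as having been interrupted by the dead client before continuing.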

If you are on some other platform, you could edit boost/interprocess/sync/emulation/interprocess_mutex.hpp to use a generation counter, with the locked flag in the lower bit. Then you can create a reclaim protocol that will set a flag in the lock word to indicate a pending reclaim, then do a compare-and-swap after a timeout to check that the same generation is still in the lock word, and if so replace it with a locked next-generation value.
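The generation-counter idea can be sketched in isolation as follows. This is only one possible reading of the protocol, not Boost's emulation code: the bit layout, the GenLock type and the functions try_lock, unlock and try_reclaim are all hypothetical, and the sketch assumes std::atomic<std::uint32_t> is lock-free and placed in shared memory. It also assumes that a holder which outlives the grace period counts as dead; a real implementation would probably want a live owner to notice and clear the pending-reclaim flag.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical lock-word layout (an assumption, not Boost's):
// bit 0 = locked, bit 1 = reclaim pending, bits 2..31 = generation.
constexpr std::uint32_t kLocked  = 1u;
constexpr std::uint32_t kReclaim = 2u;
constexpr std::uint32_t kFlags   = kLocked | kReclaim;
constexpr std::uint32_t kGenStep = 4u;

struct GenLock {
    std::atomic<std::uint32_t> word{0};
};

// Lock word for the next generation, in the locked state.
inline std::uint32_t next_locked(std::uint32_t w) {
    return ((w & ~kFlags) + kGenStep) | kLocked;
}

// Non-blocking acquire. On success, *held receives the word identifying
// this acquisition; the owner passes it back to unlock().
bool try_lock(GenLock& l, std::uint32_t* held) {
    std::uint32_t cur = l.word.load();
    if (cur & kLocked) return false;
    std::uint32_t next = next_locked(cur);
    if (!l.word.compare_exchange_strong(cur, next)) return false;
    *held = next;
    return true;
}

// Release only if the word still belongs to our generation, so an owner
// whose lock was reclaimed cannot release the new owner's lock.
bool unlock(GenLock& l, std::uint32_t held) {
    std::uint32_t released = held & ~kFlags;
    std::uint32_t expected = held;
    if (l.word.compare_exchange_strong(expected, released)) return true;
    expected = held | kReclaim;  // a reclaimer may have set the flag meanwhile
    return l.word.compare_exchange_strong(expected, released);
}

// Mark a pending reclaim, wait out the grace period, then take the lock
// only if the same generation is still in the lock word.
bool try_reclaim(GenLock& l, std::chrono::milliseconds grace, std::uint32_t* held) {
    std::uint32_t seen = l.word.load();
    if (!(seen & kLocked)) return false;  // not held: use try_lock instead
    std::uint32_t marked = seen | kReclaim;
    if (seen != marked && !l.word.compare_exchange_strong(seen, marked)) return false;
    std::this_thread::sleep_for(grace);
    std::uint32_t expected = marked;
    std::uint32_t next = next_locked(marked);
    if (!l.word.compare_exchange_strong(expected, next)) return false;
    *held = next;
    return true;
}
```

Because every acquisition bumps the generation, the final compare-and-swap in try_reclaim fails whenever the original holder released the lock, or anyone else acquired it, during the grace period.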

If you're on Windows, another good option would be to use native mutex objects, which report an abandoned mutex to the next waiter (WaitForSingleObject returns WAIT_ABANDONED); they'll likely be more efficient than busy-waiting anyway.

You may also want to reconsider the use of a shared-memory protocol - why not use a network protocol instead?

Footwear answered 24/7, 2009 at 19:40 Comment(2)
Great answer. I don't think I'll be implementing it, though; it doesn't seem worth the trouble - I'll think of something else. About your suggestion of using a network protocol, I couldn't agree with you more. Unfortunately, it's just too late in the game to change things so radically. - Candracandy
This saved my day. Hacking the Boost source code is a good way to go. - Corene
