Pthread mutex assertion error

8

49

I'm encountering the following error at unpredictable times in a Linux-based (ARM) communications application:

pthread_mutex_lock.c:82: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.

Google turns up a lot of references to that error, but little information that seems relevant to my situation. I was wondering if anyone can give me some ideas about how to troubleshoot this error. Does anyone know of a common cause for this assertion?

Thanks in advance.

Dinitrobenzene answered 9/7, 2009 at 18:36 Comment(4)
Having eliminated all other possibilities, I decided to invest in some RTFM. It appears I have been using the mutex in a way that is not officially supported. When a thread is waiting for some external stimulus, it waits on its mutex. The thread comes back to life when the mutex is released, always from another thread. So the releasing thread is never the mutex owner. I changed the implementation to use a condition variable. I don't know yet if this is the reason for my troubles. I've been (mis)using the mutex this way for years and haven't had any problems with it until now. - Dinitrobenzene
Aren't pthread_mutexes (and mutexes in general) documented such that they must be unlocked by the same thread that locked them? The fact that it happens to work on other platforms is implementation-specific and not portable. - Acne
I think that's what I said in my comment above. My implementation was misusing the mutex, so I changed it to make correct usage of a condition variable. All that remains is to confirm that this was in fact behind the intermittent assertion. - Dinitrobenzene
I sometimes get the same error when my mutex is not initialized correctly. The fix is to call pthread_mutex_init. - Flossi
38

Rock solid for 4 days straight. I'm declaring victory on this one. The answer is "stupid user error" (see comments above). A mutex should only be unlocked by the thread that locked it. Thanks for bearing with me.
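
For anyone making the same change, here is a minimal sketch of the condition-variable approach, written with the C++11 wrappers rather than raw pthread calls. The names WakeSignal and run_wake_demo are made up for illustration and are not from my application. The waiting thread locks and unlocks the mutex itself, and the signalling thread only sets a flag and notifies, so no thread ever unlocks a mutex it does not own.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: lets one thread sleep until another thread wakes it.
class WakeSignal {
public:
    // Blocks until notify() has been called, then resets the flag.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return signaled_; });  // tolerates spurious wakeups
        signaled_ = false;
    }

    // Called from any other thread; each lock is released by the thread that took it.
    void notify() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Starts a worker that waits for the signal, wakes it, and returns what it wrote.
int run_wake_demo() {
    WakeSignal signal;
    int result = 0;
    std::thread worker([&] {
        signal.wait();
        result = 42;
    });
    signal.notify();
    worker.join();  // join() makes the worker's write visible here
    return result;
}
```

Because the flag is checked under the mutex, it does not matter whether notify() runs before or after the worker reaches wait().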

Dinitrobenzene answered 13/7, 2009 at 21:57 Comment(1)
Your solution only applies to unlocking then, right? I'm getting the same error when trying to lock it. - Wamsley
12

TLDR: Make sure you are not locking a mutex that has been destroyed / hasn't been initialized.

Although the OP has his answer, I thought I would share my issue in case anyone else has the same problem I did.

Notice that the assertion is in __pthread_mutex_lock and not in the unlock. This, to me, suggests that most other people having this issue are not unlocking a mutex in a different thread than the one that locked it; they are just locking a mutex that has been destroyed.

In my case, I had a class (let's call it Foo) that registered a static callback function with another class (let's call it Bar). The callback was passed a reference to Foo and would occasionally lock/unlock a mutex that was a member of Foo.

This problem occurred after the Foo instance was destroyed while the Bar instance was still using the callback. The callback was being passed a reference to an object that no longer existed and, therefore, was calling __pthread_mutex_lock on garbage memory.

Note that I was using C++11's std::mutex and std::lock_guard<std::mutex>, but on Linux these are built on top of pthread mutexes, so the problem was exactly the same.
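
One possible way to guard against this kind of lifetime bug is sketched below. It is an assumption-laden illustration, not my original code: the Foo and Bar bodies, set_callback, fire, and register_foo are all hypothetical. The idea is that the callback holds only a std::weak_ptr to Foo and checks that the object is still alive before touching its mutex.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the class that owns the mutex.
class Foo {
public:
    void on_event() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++events_;
    }
    int events() {
        std::lock_guard<std::mutex> lock(mutex_);
        return events_;
    }

private:
    std::mutex mutex_;
    int events_ = 0;
};

// Hypothetical stand-in for the class that stores and invokes the callback.
class Bar {
public:
    void set_callback(std::function<void()> cb) { callback_ = std::move(cb); }
    void fire() {
        if (callback_) callback_();
    }

private:
    std::function<void()> callback_;
};

// Registers a callback that does not keep Foo alive and does not dangle.
void register_foo(Bar& bar, const std::shared_ptr<Foo>& foo) {
    std::weak_ptr<Foo> weak = foo;
    bar.set_callback([weak] {
        if (auto alive = weak.lock()) {
            alive->on_event();  // Foo still exists, so its mutex is valid
        }
        // Otherwise Foo has been destroyed: skip instead of locking freed memory.
    });
}
```

This only works if Foo is owned by a std::shared_ptr. An alternative under the same assumptions is to have Foo unregister its callback from Bar in its destructor.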

Kelpie answered 9/4, 2017 at 1:15 Comment(1)
To add to this, I had this happen when I was unlocking the same lock twice. The assertion error happened the next time I was attempting to acquire the lock, which made it a bit harder to find. - Tootsy
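
A double unlock like the one in the comment above can be surfaced at the faulty call instead of at a later lock. The sketch below is one way to do that, assuming you can change how the mutex is created: it uses the POSIX PTHREAD_MUTEX_ERRORCHECK type, for which an unlock of a mutex the caller does not hold returns an error code. The helper names init_errorcheck_mutex and second_unlock_result are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Initializes *m as an error-checking mutex. Returns 0 on success.
int init_errorcheck_mutex(pthread_mutex_t* m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Locks once, unlocks twice, and returns the result of the second unlock.
int second_unlock_result() {
    pthread_mutex_t m;
    if (init_errorcheck_mutex(&m) != 0) return -1;
    pthread_mutex_lock(&m);
    pthread_mutex_unlock(&m);           // correct unlock
    int rc = pthread_mutex_unlock(&m);  // bug: mutex is no longer held
    pthread_mutex_destroy(&m);
    return rc;                          // EPERM for an error-checking mutex
}
```

Checking the return value of every pthread_mutex_unlock call while debugging then points at the line that performed the extra unlock. The same mutex type also reports an unlock from a thread that is not the owner.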
4

I was faced with the same problem and Google sent me here. The problem with my program was that in some situations I was not initializing the mutex before locking it.
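
As a rough illustration of what correct initialization looks like, here is a minimal sketch. The Counter struct and its helper functions are hypothetical and are not taken from my program. The point is that pthread_mutex_init runs before the first lock, and that every return code is checked.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical shared counter protected by a pthread mutex.
struct Counter {
    pthread_mutex_t mutex;
    int value;
};

// Must be called before any other function touches the counter.
int counter_init(Counter* c) {
    c->value = 0;
    return pthread_mutex_init(&c->mutex, nullptr);  // default attributes
}

// Returns 0 on success, or the pthread error code on failure.
int counter_increment(Counter* c) {
    int rc = pthread_mutex_lock(&c->mutex);
    if (rc != 0) return rc;
    ++c->value;
    return pthread_mutex_unlock(&c->mutex);
}

// Call once, after all threads have stopped using the counter.
int counter_destroy(Counter* c) {
    return pthread_mutex_destroy(&c->mutex);
}
```

For a mutex with static storage duration, PTHREAD_MUTEX_INITIALIZER can be used instead of calling pthread_mutex_init at run time.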

Although the statement in the accepted answer is legitimate, I don't think it is the cause of this failed assertion, because the error is reported in pthread_mutex_lock (and not in unlock).

Also, as always, it is more likely that the error is in the programmer's source code than in the compiler.

Philanthropic answered 15/11, 2011 at 23:33 Comment(0)
2

In case you are using C++ and std::unique_lock, check this answer: https://mcmap.net/q/356767/-pthread_mutex_lock-c-62-__pthread_mutex_lock-assertion-mutex-gt-__data-__owner-0-39-failed

Seamy answered 23/2, 2021 at 2:25 Comment(2)
@mark you are right. I changed my answer. - Seamy
@Seamy You still have a little typo here. I guess it has to be unique instead of unqiue. Just saying. Have a good day! - Godbey
1

The quick bit of Googling I've done often blames this on a compiler mis-optimization. A decent summation is here. It might be worth looking at the assembly output to see if gcc is producing the right code.

Either that or you are managing to stomp on the memory used by the pthread library. Those sorts of problems are rather tricky to find.

Gorget answered 9/7, 2009 at 18:44 Comment(1)
I've been down the compiler mis-optimization path, which doesn't appear to be an issue in this case:
assert (mutex->__data.__owner == 0);
 154: e5953008 ldr r3, [r5, #8]
 158: e3530000 cmp r3, #0 ; 0x0
 15c: 1a0001a0 bne 7e4 <__pthread_mutex_lock+0x7e4>
- Dinitrobenzene
1

I was having the same problem.

In my case, I was connecting to a Vertica database over ODBC from inside the thread. Adding the following setting to /etc/odbcinst.ini solved my problem; I haven't hit the error since.

[ODBC]
Threading = 1

Credit to: hynek

Pytlik answered 5/2, 2015 at 9:57 Comment(0)
1

I have just fought my way through this one and thought it might help others. In my case the issue occurred in a very simple method that locked the mutex, checked a shared variable and then returned. The method is an override of a method in the base class, which creates a worker thread.

The problem in this instance was that the base class was creating the thread in its constructor. The thread then started executing and the derived class's implementation of the method was called. Unfortunately the derived class had not yet finished constructing, and the mutex in the derived class had uninitialised data as the mutex owner. This made it look like it was actually locked when it wasn't.

The solution is really simple. Add a protected method to the base class called StartThread(). This needs to be called from the derived class's constructor, not from the base class.
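
A minimal sketch of that arrangement follows. Only StartThread() comes from my description above. WorkerBase, FlagWorker, JoinThread, Run, finish and run_worker_demo are hypothetical names used for illustration, and the sketch uses std::thread and std::mutex rather than my actual classes.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical base class: owns the worker thread but never starts it itself.
class WorkerBase {
public:
    WorkerBase() = default;
    virtual ~WorkerBase() = default;

protected:
    // Call from the most-derived constructor body, once all members exist.
    void StartThread() {
        thread_ = std::thread([this] { Run(); });
    }
    // Call from the derived destructor, before derived members are destroyed.
    void JoinThread() {
        if (thread_.joinable()) thread_.join();
    }
    virtual void Run() = 0;

private:
    std::thread thread_;
};

// Hypothetical derived class whose mutex is used by the worker thread.
class FlagWorker final : public WorkerBase {
public:
    FlagWorker() { StartThread(); }  // mutex_ and value_ are already constructed here
    ~FlagWorker() override { JoinThread(); }

    // Waits for the worker to finish, then reads the value it wrote.
    int finish() {
        JoinThread();
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

protected:
    void Run() override {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ = 1;
    }

private:
    std::mutex mutex_;
    int value_ = 0;
};

int run_worker_demo() {
    FlagWorker worker;
    return worker.finish();
}
```

The same care is needed on the way out: the thread is joined in the derived destructor so that it cannot touch the mutex after the derived members are gone. If another class derives from the one that calls StartThread(), the original problem can come back, which is why the sketch marks FlagWorker as final.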

Fanaticism answered 23/7, 2019 at 23:47 Comment(0)
0

Adding Threading=0 to the /etc/odbcinst.ini file fixed this issue.

Inkerman answered 14/9, 2016 at 11:12 Comment(0)
