[sldev] OSX mutex-related lockups

Trevor Powell trevor at gridbug.org
Sat Aug 4 21:35:21 PDT 2007


This may have been raised before.  I'm pretty new to SL development
and haven't read all of the list archives yet, but I haven't seen any
comments about it so far.  As an extra caveat, I'm still working with
the 1.18.0.6 code and haven't looked at the source for 1.18.1.2 yet.
For all I know, this has already been addressed, in which case please
disregard everything that follows.  :)



While testing various patches in an OS X build, I've noticed an issue  
with mutexes.  Inside llthread.cpp, line 269, llMutexes create their  
internal mutex objects using the "APR_THREAD_MUTEX_DEFAULT" mutex  
behaviour.

Apparently, under Win32 (I'm reliably informed), this default  
behaviour is the equivalent of APR_THREAD_MUTEX_NESTED, whereas under  
OSX (I've determined through testing), it's the equivalent of  
APR_THREAD_MUTEX_UNNESTED.  I haven't tested under Linux, but I  
suspect that it's probably treated as UNNESTED there, as well.


So this means that if code within a single thread locks the same
llMutex a second time without unlocking it in between, it will appear
to run correctly in all Win32 builds, but will cause a lockup when that
same code runs under OS X.
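
To illustrate the difference, here is a minimal sketch that uses POSIX threads directly rather than APR. The helper name second_lock_result is hypothetical, and PTHREAD_MUTEX_ERRORCHECK stands in for the unnested case only so that the second lock reports an error instead of hanging; a plain non-recursive mutex would simply never return from the second lock.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Hypothetical helper: creates a mutex of the requested pthread type,
// locks it twice from the calling thread, and returns the result of
// the second lock attempt.
int second_lock_result(int mutex_type)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, mutex_type);

    pthread_mutex_t mutex;
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);               // first lock always succeeds
    int second = pthread_mutex_lock(&mutex);  // behaviour depends on the type

    // Unlock once for every lock that actually succeeded.
    if (second == 0) {
        pthread_mutex_unlock(&mutex);
    }
    pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return second;
}
```

Called with PTHREAD_MUTEX_RECURSIVE (the nested case) the helper returns 0; called with PTHREAD_MUTEX_ERRORCHECK it returns EDEADLK, which marks the point where an unchecked unnested mutex would lock up.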

In my local build, I've modified the llMutex class to explicitly  
request the 'NESTED' mutex type, and this seems to have resolved a  
couple of frequent OS X lockups I've suffered while testing various  
patches.  I'd propose making this change part of the official  
source;  I figure that anything which makes the different platforms  
work more in the same way can only be a good thing, right?
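
A rough sketch of what "always nested" means for a wrapper class follows. It is not the llMutex code; NestedMutex and nested_depth_demo are hypothetical names, and std::recursive_mutex stands in for an APR mutex created with APR_THREAD_MUTEX_NESTED.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a wrapper that always requests nested
// (recursive) semantics, whatever the platform default happens to be.
class NestedMutex
{
public:
    void lock()   { mutex_.lock(); ++depth_; }
    void unlock() { --depth_; mutex_.unlock(); }

    // Number of locks currently held; only meaningful when called by
    // the thread that owns the mutex.
    int depth() const { return depth_; }

private:
    std::recursive_mutex mutex_;
    int depth_ = 0;
};

// Locks the same mutex twice from one thread and reports the depth
// reached. With nested semantics the second lock does not block.
int nested_depth_demo()
{
    NestedMutex m;
    m.lock();
    m.lock();
    int depth = m.depth();
    m.unlock();
    m.unlock();
    return depth;
}
```

Because the nested type is chosen explicitly, the double lock behaves the same on every platform instead of depending on a default.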

Alternatively, it'd be just as good to switch the official source to
explicitly use 'UNNESTED' mutexes and so share the OS X lockups with
the Win32 folks.  Win32-using patch authors would then be able to
debug their own mutex issues, which are currently invisible to them.
The base code appears to have all been written assuming UNNESTED
behaviour anyway;  it's only in patches that I've seen code which
assumes mutexes work the other way.  I don't have any preference one
way or the other, as long as we all end up with the same mutex
behaviour on every platform.
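
If the unnested route were taken, one way to make double locks visible on every platform would be a wrapper that detects them instead of hanging. The sketch below is only an assumption about how that could look; CheckedMutex is a hypothetical name and nothing like it is in the SL source as far as I know.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <thread>

// Hypothetical unnested mutex that reports a relock by the owning
// thread instead of deadlocking silently.
class CheckedMutex
{
public:
    void lock()
    {
        // Only the owning thread can ever observe its own id here,
        // so a match means this thread already holds the mutex.
        if (owner_.load() == std::this_thread::get_id()) {
            throw std::logic_error("CheckedMutex: relock by owning thread");
        }
        mutex_.lock();
        owner_.store(std::this_thread::get_id());
    }

    void unlock()
    {
        owner_.store(std::thread::id());  // reset to "no owner"
        mutex_.unlock();
    }

private:
    std::mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
};
```

A patch that locks the same CheckedMutex twice from one thread would then fail in the same, loud way on Win32, OS X and Linux, rather than only locking up on some of them.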


It's also worth noting that there are a few mutexes created in SL  
which don't go through the llMutex class, most notably in llapr.cpp  
and in llpumpio.cpp.  These also currently use the DEFAULT behaviour,  
and probably also ought to be switched to a specific intended mutex  
behaviour (either NESTED or UNNESTED), instead of letting the  
different platforms treat mutexes in whatever strange manner they  
happen to have set up as the default.

Anybody have thoughts on this?  Or better, confirmation that this  
actually is a real potential problem in the base code, and that I  
haven't made some terrible newbie blunder and totally misinterpreted  
all my test results?  :)

Trev

