[sldev] lltexturefetch race condition

Melinda Green melinda at superliminal.com
Wed May 6 20:39:01 PDT 2009


Rob Lanphier wrote:
> In "Crash in lltexturefetch just after logging in"
> https://jira.secondlife.com/browse/VWR-12775
>
> Robin writes:
>   
>> OK, it's a race condition between the
>> LLTextureFetchWorker::callbackDecoded() and
>> LLTextureFetchWorker::doWork()
>>
>> mState == DECODE_IMAGE kicks off the decode and sets the state machine
>> to DECODE_IMAGE_UPDATE. This does nothing until the decoded flag is
>> set, which happens in LLTextureFetchWorker::callbackDecoded(). But as
>> soon as it's set, the state machine moves on, and in the event of a
>> failed decode it sets mFormattedImage = NULL and sets the state machine
>> back to INIT.
>>
>> But in LLTextureFetchWorker::callbackDecoded() we have already passed
>> the check for mState != DECODE_IMAGE_UPDATE, so the code moves on and
>> then finds itself with a NULL mFormattedImage.
>>
>> Adding a 2nd
>>
>> if (mState != DECODE_IMAGE_UPDATE)
>>
>> { llwarns << "Decode callback for " << mID << " with state = " <<
>> mState << llendl; }
>>
>> after the
>>
>> mDecoded = TRUE;
>>
>> results in the 2nd warning firing every time mFormattedImage is
>> null here. The first warning, *BEFORE* mDecoded = TRUE, never fires, so
>> it's a race condition.
>>
>>     
>
> Based on my reading of things, this comment may have been overlooked. 
> Couple of questions:
> 1.  Does the comment above get us closer to a deeper fix for this problem?
> 2.  Should we go ahead and apply the attached patches for this, even if
> they don't constitute a "real fix", because they do at least seem to
> keep us from crashing?
>
> Rob

Regarding 1: I'd say "Probably". Whoever does attempt a deep fix will 
now know of some great places to set breakpoints.
Regarding 2: I'd say "Why not?". Avoiding a crash is better than not 
avoiding it. Avoiding this crash won't make it harder to find a deep 
fix, but it will remove a lot of the urgency to attempt one. But then 
that has to be a good thing too, right?
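
To make the race Robin describes a bit more concrete, here is a minimal 
sketch of one possible shape for a deeper fix. It is not the viewer 
code: SketchWorker, FormattedImage, the DONE state and the single mutex 
are all hypothetical stand-ins of mine, and I am assuming both 
callbackDecoded() and doWork() could take the same lock. The idea is 
that the callback checks the state and finishes with mFormattedImage 
while holding the lock, and only then publishes the decoded flag, so the 
state machine cannot null the image halfway through.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-ins; these are NOT the real viewer classes.
enum class State { INIT, DECODE_IMAGE_UPDATE, DONE };

struct FormattedImage {
    std::vector<unsigned char> data;
};

class SketchWorker {
public:
    // Pretend a decode has just been kicked off on a fetched image.
    void startDecode() {
        std::lock_guard<std::mutex> lock(mMutex);
        mFormattedImage = std::make_shared<FormattedImage>();
        mDecoded = false;
        mDecodeOK = false;
        mState = State::DECODE_IMAGE_UPDATE;
    }

    // Decoder-thread side. Returns true if the result was accepted.
    bool callbackDecoded(bool success) {
        std::lock_guard<std::mutex> lock(mMutex);
        // State check and image check happen under the same lock as
        // the use below, so doWork() cannot run in between.
        if (mState != State::DECODE_IMAGE_UPDATE || !mFormattedImage) {
            return false;  // stale callback: nothing left to touch
        }
        mLastSize = mFormattedImage->data.size();  // safe: lock is held
        mDecodeOK = success;
        mDecoded = true;  // published last
        return true;
    }

    // Worker-thread side: one step of the state machine.
    void doWork() {
        std::lock_guard<std::mutex> lock(mMutex);
        if (mState != State::DECODE_IMAGE_UPDATE || !mDecoded) {
            return;  // still waiting for the decode callback
        }
        if (mDecodeOK) {
            mState = State::DONE;
        } else {
            mFormattedImage.reset();  // failed decode: drop the image
            mState = State::INIT;     // and start over
        }
    }

    State state() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mState;
    }

    bool hasImage() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return static_cast<bool>(mFormattedImage);
    }

private:
    mutable std::mutex mMutex;
    State mState = State::INIT;
    std::shared_ptr<FormattedImage> mFormattedImage;
    bool mDecoded = false;
    bool mDecodeOK = false;
    std::size_t mLastSize = 0;
};
```

With this arrangement a callback that arrives after a failed decode has 
already reset the worker is simply rejected instead of dereferencing a 
null image. Whether the real worker can afford to hold a lock across 
both functions is something only someone with the code in front of them 
can judge.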

BTW, I don't have the surrounding code to examine, but one strange-looking 
thing in the patches is that image objects appear to be used 
inconsistently: one block, or even a single expression, will use 
both "." and "->" on one of these objects. Since these 
appear not to be simple pointers, I'd expect that we'd want to use 
the dot notation throughout.
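
One caveat on that last point: if these image objects turn out to be 
smart-pointer wrappers (an assumption on my part, since I can't see 
their declarations), then mixing the two operators can be legitimate, 
because "." reaches the wrapper's own members while "->" reaches the 
object it points at. The sketch below uses std::shared_ptr purely as a 
stand-in; Image and decodedWidth() are hypothetical names, not anything 
from the patches.

```cpp
#include <memory>

// Hypothetical image type, for illustration only.
struct Image {
    int width = 0;
    bool decode() {
        width = 64;
        return true;
    }
};

// Returns the decoded width, or -1 if there is no usable image.
int decodedWidth(std::shared_ptr<Image>& img) {
    if (!img) {
        return -1;       // the handle itself is empty
    }
    if (!img->decode()) {
        img.reset();     // "." acts on the handle: drop the reference
        return -1;
    }
    return img->width;   // "->" acts on the pointed-to Image
}
```

If that is what is going on in the patches, the mixed usage would be 
fine as it stands, and the thing worth checking is only that every "->" 
is guarded by a null test.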

-Melinda


