[sldev] Endian bug in 1.15.0.2 message template reader?

Paul TBBle Hampson Paul.Hampson at Pobox.com
Sun May 13 21:38:24 PDT 2007


On Sun, May 13, 2007 at 06:03:34PM -0700, John Hurliman wrote:
> Paul TBBle Hampson wrote:
> >I'm seeing a problem (posted to the Linux Alpha Client forum [1]) with
> >my PowerPC build of the 1.15.0.2 Linux client, which does not happen on
> >an AMD64 build of the same source.

> >WARNING: processParcelOverlay: Got parcel overlay size 4 expecting 1024

> >The significance of the above error is that in 16-bits, 4 byteswaps to
> >1024 (0x0004 => 0x0400). This combined with the fact that it doesn't
> >happen on AMD64 is why I think there's a byteswapping bug.

> >The message template for ParcelOverlay is:

> >        {   Data        Variable    2   }   // packed bit-field, (grids*grids)/N

> >I guess my first question is, what's that '2' mean?

> According to http://www.libsecondlife.org/wiki/Protocol_%28network%29
> the "Variable 2" means two bytes are used for the 'size' or 'count' of
> that block or field.

> "Variable blocks are prepended with either one or two bytes saying how
> many there are. If a block is Variable 2 in the protocol map it uses
> two bytes in big endian format, so a count of 7 would be specified
> with 0x0700."

Thanks, that's exactly what I needed to confirm.

> The page I linked above attempts to describe the endianness of the
> packets. It starts off with:

> Big : Big : Little

> for the header and then everything in the meat of the packet is little
> except for size identifiers are big and IP addresses are big (even
> though other 32-bit integers are little). I'm actually surprised it
> took this long for a platform-endian issue to crop up.

Well, there's a routine htonmemcpy, aliased as ntohmemcpy, which copies
variables out of the data stream, and applies such swizzling as
neccessary. This makes the scheme work, by making the on-wire weirdness
typographically appear to be big- ("n") endianness. It then returns to
previous levels of confusion by using the htonmemcpy in the reading
code... (routine lives in message.h in llmessage if you want to look)

However, the code which processes Variable sizes looks correct to me:
(lltemplatemessagereader.cpp:572 in llmessage) 

	// what type of variable?
	if (mvci.getType() == MVT_VARIABLE)
	{
		// variable, get the number of bytes to read from the template
		S32 data_size = mvci.getSize();
		U8 tsizeb = 0;
		U16 tsizeh = 0;
		U32 tsize = 0;

		if ((decode_pos + data_size) > mReceiveSize)
		{
			logRanOffEndOfPacket( sender );
			return FALSE;
		}
		switch(data_size)
		{
		case 1:
			htonmemcpy(&tsizeb, &buffer[decode_pos], MVT_U8, 1);
			tsize = tsizeb;
			break;
		case 2:
			htonmemcpy(&tsizeh, &buffer[decode_pos], MVT_U16, 2);
			tsize = tsizeh;
			break;
		case 4:
			htonmemcpy(&tsizeb, &buffer[decode_pos], MVT_U32, 4);
			break;
		default:
			llerrs << "Attempting to read variable field with unknown size of " << data_size << llendl;
			break;
			
		}
		decode_pos += data_size;

		if ((decode_pos + (S32)tsize) > mReceiveSize)
		{
			logRanOffEndOfPacket( sender );
			return FALSE;
		}
		cur_data_block->addData(mvci.getName(), &buffer[decode_pos], tsize, mvci.getType());
		decode_pos += tsize;

I haven't verified yet, but I'm pretty sure the second
logRanOffEndOfPacket is the one I'm hitting with the ObjectUpdates.

Note that there's a bug in the above, the case 4: case tries to memcpy
four bytes into a byte, overflowing it, probably hammering any value
that happened to be in tsizeh _and_ letting a size of 0 leak through to
the following code.

Of course, nothing currently appears to be a Variable 4 (or so grep
tells me) so that's not the issue here.

The issue is of course the poor naming of htonmemcpy. Specifically,
to make this work sensibly, there should be a seperate MVT_VARSIZE_1,
MVT_VARSIZE_2, and MVT_VARSIZE_3 which swizzle the big-endian on-wire
format into the host-endian format, different from MVT_U{8,16,32} which
swizzle the little-endian on-wire format into the host-endian format.

That would be the least-visually-disruptive change, anyway.

What I now can't work out is why does this work on the LE arches?

On LE, htonmemcpy translates directly into a memcpy. If they're
big-endian on the wire, they should be failing on the LE arches _and_
the BE arches.

If they're little-endian on the wire, then the above code is fine, and
shouldn't be failing on the BE arches.

Anyway, I'm going to make logRanOffEndOfPacket a little more verbose,
and then rebuild (so I'll get to see tonight if I'm right. >_<) but at
this point the bug is not leaping out to me visibly.

-- 
-----------------------------------------------------------
Paul "TBBle" Hampson, B.Sc, LPI, MCSE
On-hiatus Asian Studies student, ANU
The Boss, Bubblesworth Pty Ltd (ABN: 51 095 284 361)
Paul.Hampson at Pobox.Com

Of course Pacman didn't influence us as kids. If it did,
we'd be running around in darkened rooms, popping pills and
listening to repetitive music.
 -- Kristian Wilson, Nintendo, Inc, 1989

License: http://creativecommons.org/licenses/by/2.1/au/
-----------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
Url : http://lists.secondlife.com/pipermail/sldev/attachments/20070514/bda6f95f/attachment-0001.pgp


More information about the SLDev mailing list