[sldev] UDP to TCP/HTTP - performance issues?
Tateru Nino
tateru.nino at gmail.com
Thu Nov 15 17:55:29 PST 2007
Callum Lerwick wrote:
> On Wed, 2007-11-14 at 23:57 +1100, Tateru Nino wrote:
>
>> I'm noting a suspicious stall in the viewer when a presence popup
>> happens. Disabling the popup does not seem to change the behaviour.
>> Getting a friend to log in and log out, the viewer still stalls for a
>> second or so around the time the popup _would_ have appeared if it were
>> enabled. That's another data point for you.
>>
>
> I've noticed this too. My wife has even complained about it. At one
> point it would even crash, though that seems to have been fixed.
>
> I've also noticed that during periods of high (total?) packet loss, the
> viewer will completely freeze. Which happens often when you're on a less
> than stellar wireless connection. No screen refreshes, CPU usage drops
> to zero. Why is network IO capable of blocking the graphics engine? The
> rest of the machine remains responsive, so I don't think it's an
> X11/OpenGL problem.
>
Anything that prevents the viewer from successfully getting UDP packets
back on a circuit for approximately 20 seconds will essentially kill the
circuit. YMMV: if your network bandwidth slider is set higher, it takes
less 'turbulence' to break the circuit. (Anecdotal evidence only.)
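To put that idea in code form (a rough sketch of my understanding, not the
viewer's actual logic - the names and the exact timeout are assumptions on
my part):

    // Sketch only: treat a circuit as dead if nothing has arrived on it
    // for roughly 20 seconds. The constant and names are my guesses.
    #include <ctime>

    const double CIRCUIT_TIMEOUT_SECONDS = 20.0;

    bool circuit_looks_dead(std::time_t last_packet_time)
    {
        return std::difftime(std::time(0), last_packet_time) > CIRCUIT_TIMEOUT_SECONDS;
    }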
Bonus points: higher latencies seem to artificially throttle circuit
bandwidth. Even if you have a clean end-to-end connection, speed-of-light
lag (through copper, fibre or routers) seems to severely limit the rate at
which data is delivered. E.g. if your normal 'clean' latency is (say)
~320ms (because you're in Australia), you're highly unlikely to ever get
more than 300Kbps from the grid (except for odd, unexplained bursts), and
setting the bandwidth slider higher will actually result in lower transfer
rates. (Again, anecdotal.)
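One possible reason for that latency ceiling (back-of-the-envelope
reasoning on my part, not something I've confirmed in the viewer or
simulator code): if the sender only keeps a fixed amount of unacknowledged
data in flight, throughput is capped at roughly window / round-trip time.
With a hypothetical 16KB window and that 320ms round trip:

    throughput <= window / RTT
                = 16 KB / 0.32 s
                = 50 KB/s
                ~ 400 kbit/s

which is in the same ballpark as the ~300Kbps ceiling I see from here, no
matter how clean the link is.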
As for network IO and the graphics engine: when it comes down to the
metal, the system can handle a network interrupt, or it can render spiffy
graphics (or handle a disk interrupt, or whatever). Down at the metal, a
CPU can only do one thing at a time, and even with multiple cores or
multiple processors, the extra units are sometimes locked out during
certain hardware tasks.
The majority of consumer network cards have a hardware buffer of, at most,
three standard-size ethernet frames. The driver almost always implements
additional buffers on top of that - and some of them do it _so_ badly that
reducing the number and size of buffers in my NIC's driver actually
_helped_. It turned out the thrice-damned driver was sucking up a ton of
not-readily-visible CPU time in buffer management. It could be that what
you're seeing is a sloppily written NIC driver (and honestly? Most of them
are apparently written by poo-flinging monkeys) coping with one of those
'odd, unexplained bursts' I mentioned.
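You can't do much about the driver's internal buffering from an
application, but you can at least ask the kernel for a larger socket-level
receive buffer so a burst of datagrams has somewhere to sit while the main
thread is off rendering. A rough sketch (not the viewer's actual socket
setup - the function and variable names here are made up):

    // Sketch only: enlarge the kernel's receive buffer for a UDP socket.
    // The kernel may clamp the requested size (e.g. to net.core.rmem_max
    // on Linux), so this is a request, not a guarantee.
    #include <sys/socket.h>
    #include <cstdio>

    bool grow_receive_buffer(int sock, int bytes)
    {
        if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0)
        {
            perror("setsockopt(SO_RCVBUF)");
            return false;
        }
        return true;
    }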
That said, the CPU has to be available and interruptible to catch network
packets _often_ - many times per second. It has to drop what it's doing,
grab the mitt and catch the data from the NIC, or that data is lost - and
losing data generally causes worse performance than getting it slowly:
lost packets need to be resent, there are delays in recognizing that a
packet was lost at all, and so on. UDP or TCP, it doesn't matter - as soon
as you implement reliability, you run into the same bag of issues.
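For what it's worth, the usual way to keep the renderer from ever waiting
on the network is to poll the socket with a zero timeout once per frame
and drain whatever has piled up. A minimal sketch of that pattern (again,
not the viewer's actual code - the names are mine):

    // Sketch only: called once per frame. select() with a zero timeout
    // returns immediately, so rendering never blocks on the network.
    #include <sys/select.h>
    #include <sys/socket.h>

    void drain_udp(int sock)
    {
        for (;;)
        {
            fd_set readable;
            FD_ZERO(&readable);
            FD_SET(sock, &readable);

            timeval no_wait = { 0, 0 };
            if (select(sock + 1, &readable, 0, 0, &no_wait) <= 0)
                return;                  // nothing queued (or error) - back to rendering

            char packet[1500];           // one Ethernet-sized datagram
            ssize_t got = recvfrom(sock, packet, sizeof(packet), 0, 0, 0);
            if (got <= 0)
                return;
            // ... hand 'packet' (got bytes) to the message system here ...
        }
    }

If the frame loop comes back here many times per second, bursts get soaked
up by the socket buffer rather than dropped on the floor.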
Whew. More grist for the mill.
--
Tateru Nino
http://dwellonit.blogspot.com/