[sldev] Question: Replacing current group chat with XMPP?

David M Chess chess at us.ibm.com
Wed Sep 10 08:00:19 PDT 2008


>From: "Paul Oppenheim (Poppy Linden)" <poppy at lindenlab.com>
>
>David M Chess wrote:
>> I don't really understand this scaling argument.  In my experience, 
>> neither "push everything everywhere" nor "constantly ask if there's 
>> anything new" scales; what scales is actual pub/sub.  In the case of 
>> group IM, messages get pushed by some server (or some cluster of 
>> entities acting as a logical server) to all and only those places where 
>> someone cares.  There's some overhead in keeping track of where there 
>> are places that someone cares, but in most use-cases I've seen it's the 
>> approach that scales the best.
>
>Do you have any references / links / copypasta / personal stories on this
>kind of architecture research? I've been hungering for scalability research
>lately, and been surprised by much of what I've read. I would assume polling
>with caching would be much faster than pub/sub because you can use much
>dumber machinery, but I've also not investigated too many message queuing
>systems (I also don't work directly on IM). I can't speak for others on this
>list, but if you cook up a scalability resource mail you've got an audience
>of at least one ;)

I will look around for some references; in my experience actual practical 
stories about how stuff works get published far too rarely.  :)

My most recent experience with this is in some distributed computing 
middleware (IBM WebSphere VE, née XD).  The exact situation there is 
quite different, so the details of the solution we used 
aren't really relevant, but the basic calculation is.

If you have (say) a group with 1000 members, spread across 10 ADs, where 
at a typical time 100 of those members are logged in, involving 6 of the 
ADs, with a typical peak rate of 1 message every two seconds, and you 
don't want to impose an additional latency of more than 5 seconds, say 
(group IM with even just a 5-second delay would be unusable imho, but 
we'll stick that in to get a lower bound), the choices seem to be:

(1) Have 100 polls against the message store every 5 seconds, for a 
20-hits-per-second load just for this one rather typical medium-sized 
group, or

(2) Send one message to each of the 10 ADs every two seconds, four of 
which are unnecessary, or

(3) Keep track of the fact that four of those ADs don't have anyone in the 
group logged in right now (which involves some messaging overhead to keep 
track of that, but it's really pretty simple), and send one message to the 
6 actually involved ADs every two seconds.
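
As a rough check of the arithmetic in the three options above, here is a minimal Python sketch.  The variable and function names are illustrative only (they are not taken from any real system), and it counts just the steady-state message traffic, ignoring the who-is-on tracking overhead that option (3) adds.

```python
# Back-of-the-envelope load for one group, using the figures from the example.
members_online = 100       # members logged in at a typical time
ads_total = 10             # ADs the group's members are spread across
ads_active = 6             # ADs that currently have a logged-in member
poll_interval_s = 5.0      # maximum added latency tolerated when polling
message_interval_s = 2.0   # peak rate: one message every two seconds


def polling_hits_per_second(clients: int, interval_s: float) -> float:
    """Each client polls the message store once per interval."""
    return clients / interval_s


def push_sends_per_second(destinations: int, interval_s: float) -> float:
    """Each new message is sent once to every destination."""
    return destinations / interval_s


option1 = polling_hits_per_second(members_online, poll_interval_s)  # poll the store
option2 = push_sends_per_second(ads_total, message_interval_s)      # push to all ADs
option3 = push_sends_per_second(ads_active, message_interval_s)     # push to active ADs

print(f"(1) polling:           {option1:.1f} hits/s")
print(f"(2) push to all ADs:   {option2:.1f} sends/s")
print(f"(3) push to active ADs: {option3:.1f} sends/s")
```

Under these assumptions, (1) costs 20 hits per second, (2) costs 5 sends per second, and (3) costs 3 sends per second, before any presence-tracking traffic is added to (3).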

If people log in and out much more often than they send messages, then the 
overhead of the who-is-on tracking in (3) might make (2) the most efficient 
choice, but I tend to think that (3) generally wins.  Depending on just what you mean by 
"with caching", (1) is pretty bad; you might insert a nearby cache between 
the viewer and the message store in (1), but even in the best case (one 
cache per active AD, say) you still have more than three polls per second 
against each cache (about 16.7 every five seconds) and more than one poll per 
second (six every five seconds) against the message store, just for this one 
group.  This can be made better by polling less often from the caches (but 
then you get unacceptable chatlag), or by pushing from the message center 
to the caches (but then we're really doing (2) or (3), not (1)).
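
The cached-polling figures can be checked the same way.  This is only a sketch under stated assumptions: one cache per active AD, logged-in members spread evenly across those caches, and each cache polling the message store once per interval.  The names are illustrative.

```python
# Polling through one cache per active AD (illustrative names and assumptions).
members_online = 100     # logged-in group members
active_ads = 6           # ADs with a logged-in member, one cache each
poll_interval_s = 5.0    # viewers and caches both poll once per interval

# Assumption: members are spread evenly across the active ADs.
viewers_per_cache = members_online / active_ads

# Load on each cache from its own viewers.
cache_polls_per_interval = viewers_per_cache
cache_polls_per_second = viewers_per_cache / poll_interval_s

# Load on the central message store from the caches.
store_polls_per_interval = active_ads
store_polls_per_second = active_ads / poll_interval_s

print(f"per cache: {cache_polls_per_interval:.1f} polls per interval, "
      f"{cache_polls_per_second:.2f} per second")
print(f"store:     {store_polls_per_interval} polls per interval, "
      f"{store_polls_per_second:.2f} per second")
```

That gives roughly 16.7 polls per interval (about 3.3 per second) against each cache, and 6 polls per interval (1.2 per second) against the message store.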

This is all very top-of-the-head, so I may have overlooked something key, 
or I may have misunderstood what you meant by "polling with caching", or I 
may have just divided wrong.  :)  But hopefully it's at least slightly 
interesting.  :)

Dale Innis
DaleInnisEmail at gmail.com