[sldev] Question: Replacing current group chat with XMPP?

David M Chess chess at us.ibm.com
Thu Sep 11 07:41:47 PDT 2008


>From: "Edward Artaud" <edward.artaud at gmail.com>
>
>There are a lot of tradeoffs of pub/sub vs polling, do a google search
>on the words 'pub sub polling' and you'll see many pages on the
>subject.  True pub/sub (as opposed to RESTful pub/sub aka polling)
>means the server has to be statefully aware of who's subscribed.  In
>practice, not being statefully aware of your subscribers scales better
>than being aware of them, and as Poppy points out, nice things like
>caching and ETags/If-Modified-Since/304 Not Modified can be applied in
>a much more fault tolerant and inexpensive brute force way rather
>than true pub/sub which typically requires a lot more complex (and
>harder to maintain) systems to operate at scale.

Well, pub/sub requires that the system as a whole knows statefully who is 
subscribed.  The publisher doesn't necessarily have to know, and no single 
server has to know, just the system as a whole; pub/sub can be done in 
various cleverly distributed ways to reduce the requirements on any one 
server.  In my experience polling doesn't scale well unless there are 
relatively loose latency requirements (i.e. it's okay if things sometimes 
take a long time to propagate), which I don't think we have in this case. 
Pub/sub doesn't have to be heavyweight and hard to maintain; again, I 
think IRC is basically pub/sub, and IRC implementations aren't afaik 
famously hard to maintain.

>As to the user experience, there's no reason why any of the approaches
>discussed would have to change the existing user experience.

I agree in general; on the other hand the user-experience requirements 
(like typically-short message delivery delays) can determine which 
approach is actually best.

>My point is that
>something I learned while implementing just such a system several
>years ago was that group IM with no constraint on number of
>participants is a totally different beast than 1 to 1 IM.

I agree completely there.  :)

>I think what some people are forgetting in all these statistics and metrics
>and whatever is that Large Groups in SL do not equal "Large number of people
>in chat"
>
>A group may only be used to send notices out to the members; other people
>drop chat as soon as they get the first message and do not "join back in"
>until they feel like chatting.
>
>I have 25 groups, of those 25 groups only 5 of them EVER have chat occurring,
>and of those 5 I very rarely keep all 5 going the entire time I am logged
>in.

That's entirely true; when we do the scaling calculations we have to look 
at what actually happens (and can be expected to happen) in detail, not 
just write down high-level numbers of groups and membership sizes and so 
on.  I think we should assume that ultimately many people will belong to 
more than 25 groups, but (probably?) fewer than 25 that are used 
significantly for chat.  And that the typical group chat channel will not 
be active with chatter 24/7 (although some probably will be). 

> From: Tateru Nino <tateru.nino at gmail.com>

> It's my understanding that 5 seconds group IM latency is routinely
> exceeded in practice.

I would say "often" rather than "routinely".  :)  In the sense that when 
group chatlag gets that bad, the people in the channel perceive it as 
broken, and most of the traffic becomes people laughing or moaning about 
the lag.  So my impression is that for the system to be perceived as 
*working*, typical latencies have to be far lower.

> From: "Dahlia Trimble" <dahliatrimble at gmail.com>

>I don't know if the IRC model is applicable to SL, but from my experience as
>a somewhat heavy IRC user, I just don't see any of the group chat problems
>that SL sees and I haven't seen any evidence that the SL *logged in* user
>and message volume is greater, if anything I would believe the IRC volume is
>much higher.

Do you have a rough feeling for how many different channels a user is 
likely to be in at once?  Again, my impression from Zero is that the IRC 
provider(s) that he talked to cited that as a particular worry-point for 
scaling.

>From what little I know of IRC architecture, each server can serve a limited
>number of users and also forwards messages to other servers in a star
>configuration. I'm having a hard time envisioning a system that could scale
>better than that by just throwing hardware at it as seems possible with the
>IRC model.

That's my impression also.  Distributed pub/sub!  (That is, there's no 
polling, and a message isn't sent to a server unless there's an interested 
user in that direction.)
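
To make that concrete, here is a minimal sketch in Python of the forwarding
idea, assuming the servers form a tree (no cycles).  It is not how any real
IRC server is implemented; the Server class and all of its method names are
made up for illustration.

```python
from collections import defaultdict


class Server:
    """One node in a hypothetical tree of chat servers (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.links = []                     # directly connected servers
        self.local = defaultdict(set)       # channel -> users connected here
        self.interested = defaultdict(set)  # channel -> peers with users beyond them
        self.delivered = []                 # (user, channel, text) delivered here
        self.forwarded = 0                  # messages this server sent to peers

    def connect(self, other):
        self.links.append(other)
        other.links.append(self)

    def join(self, user, channel):
        self.local[channel].add(user)
        self._advertise(channel, source=None)

    def _advertise(self, channel, source):
        # Tell each other peer that there is a subscriber in this direction.
        for peer in self.links:
            if peer is not source and self not in peer.interested[channel]:
                peer.interested[channel].add(self)
                peer._advertise(channel, source=self)

    def publish(self, channel, text, source=None):
        # Deliver to local users, then forward only toward interested peers.
        for user in sorted(self.local[channel]):
            self.delivered.append((user, channel, text))
        for peer in self.interested[channel]:
            if peer is not source:
                self.forwarded += 1
                peer.publish(channel, text, source=self)


# Assumed topology: one hub with three leaf servers.
hub, a, b, c = Server("hub"), Server("a"), Server("b"), Server("c")
for leaf in (a, b, c):
    hub.connect(leaf)

a.join("alice", "#group")
b.join("bob", "#group")
a.publish("#group", "hello")

print(b.delivered)   # bob's server received the message
print(c.delivered)   # nobody on c is in the channel, so nothing arrives
```

In this example the message crosses two links (a to hub, hub to b).  Server
c, which has no interested users, never sees it, and no single server holds
the full subscriber list; each one only knows which directions have
subscribers.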

>I wish it was tongue-in-cheek, but I am a heavy IRC user and so far it's the
>only way I can reliably communicate with others when SL's system fails, or
>if I want to message people on other grids.

Well, sure, but I don't really understand the desire for a normal IRC 
client _in the SL viewer_.  Doesn't KDE / Gnome / Finder / Windows handle 
the problem of offering convenient access to both SL and IRC at the same 
time?

> From: "Edward Artaud" <edward.artaud at gmail.com>

>Yes, that's why I addressed both in my previous email.  The
>point is that single web servers (even with dynamic pages) routinely
>serve well over 20 requests a second (which I think was your estimate
>of # of requests for polling), and http already has ways to further
>optimize with etags and such,

That was 20 requests per second *per group*.  Which I think would be a 
problem for the server, for nearby network bandwidth, etc.  ETags don't 
help with the request rate, only the data volume.
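
For a rough feel for the numbers, here is a back-of-the-envelope sketch in
Python.  The inputs are assumptions chosen only so that the result matches
the 20 requests per second per group figure (100 members each polling every
5 seconds); the byte sizes and the fraction of polls that find new content
are likewise invented, and polling_load is a hypothetical helper.

```python
def polling_load(members, poll_interval_s, changed_fraction,
                 body_bytes, header_bytes):
    """Return (requests/s, bytes/s without ETags, bytes/s with ETags) for one group."""
    requests_per_s = members / poll_interval_s
    # Without conditional requests, every poll returns headers plus the full body.
    plain = requests_per_s * (header_bytes + body_bytes)
    # With ETags, unchanged polls get a headers-only 304; changed polls get the body.
    conditional = requests_per_s * (header_bytes + changed_fraction * body_bytes)
    return requests_per_s, plain, conditional


# All of these inputs are assumed values, not measurements.
rate, plain, conditional = polling_load(
    members=100, poll_interval_s=5, changed_fraction=0.25,
    body_bytes=2000, header_bytes=300)
print(f"{rate:.0f} requests/s, {plain:.0f} B/s plain, {conditional:.0f} B/s with ETags")
```

With these made-up inputs the byte rate drops from 46000 to 16000 per
second, but the server still has to answer 20 requests per second for the
group either way.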

>and more importantly, can be
>administered and scaled by a much less experienced ops team than a
>pub/sub system, whereas your approach will require the Lab to recruit
>very expensive MQ people out of IBM Global Services or Tibco to keep
>it reliably running 24/7.  Most companies just don't have the
>resources and expertise to run large scale pub/sub systems.

Pub/sub doesn't mean MQ or Tibco; it's a style of communication, not a 
product line.  IRC uses pub/sub I believe, and I don't think IRC servers 
are all maintained via IBM service contracts.  :)  For that matter every 
mailman mailing list is using a pub/sub algorithm.

> From: "Celierra Darling" <Celierra at gmail.com>

>This seems to suggest running both systems, shifting the largest
>groups to IRC so they at least can have functional group chat (I think
>that's still broken, no?).  And if IRC uses much fewer resources, one
>might also try using IRC for large rooms not quite near breakage, to
>save on resources, as long as it's not enough to start encountering
>this too-many-simultaneous-joins problem.  But there are some obvious
>downsides, such as having to maintain both systems, and implementing a
>way to transition a room up to IRC (and down from IRC, too?).  I'm not
>sure how the rewards and costs balance here.

Ewww!  :)  Running two different systems at once would be rather a pain, 
as you say.  It's a good point, though, that we should at least keep in 
mind the possibility of having more than just the two obvious kinds of 
groups (with IM and without IM).  It's not impossible that a hybrid would 
be the best solution.  A related question is whether the hybridization 
would be just at the implementation level (in which case we don't need any 
particular consensus on it, and people can play around), or whether the 
intragrid part of the protocol itself (the OGP for group IM) would need to 
have more than one style (which means we'd have to figure out the right 
standard to agree on).

Thanks much to all for the comments, and as usual I will point at the Wiki 
for a good place to write stuff down:
https://wiki.secondlife.com/wiki/User:Dale_Innis/Group_IM_in_OGP 

Dale Innis
DaleInnisEmail at gmail.com