From sldev at free.fr  Mon Apr  8 06:27:00 2019
From: sldev at free.fr (Henri Beauchamp)
Date: Mon, 8 Apr 2019 15:27:00 +0200
Subject: [opensource-dev] SL server issues in the past 4 months
Message-ID: <20190408152700.2176e4856a699767f411313c@free.fr>

Greetings,

I'm writing this email because I'm getting tired of having to pile up
workarounds in my viewer code for all the SL server issues that have
appeared in succession, every few weeks, since mid-December, and I
think it is time for you folks at LL to do some serious review of
the latest changes you made server-side.

1.- Bogus attachment kill messages on region change.

This issue appeared mid-December 2018 and results in derezzed (but still
active) attachments on region change (sim border crossings and TPs alike).

I instrumented my viewer code so as to be able to trace the problem. Here is
how to witness it occurring in real time:

Get the Cool VL Viewer, disable the workaround for lost attachments
(un-check Advanced -> Network -> Ignore bogus kill-attachment messages),
enable the "Attachments" debug tag (from Advanced -> Consoles -> Debug
tags) and the debug console, then TP and/or cross region boundaries.
Just watch the debug messages that will allow you to track every event
dealing with your attachments thanks to the special debug code I added
to understand and fix this issue.

You will see that, often (but not always), the departure sim sends
"kill object" messages for your attachments (which causes them to de-rez
in your viewer) while you are already in the arrival region; in this case,
the arrival region usually sends a re-parenting message for your
attachments (they get parented to your avatar again and thus re-rez).
Sadly, the re-parenting message is often not received, or is incomplete,
or is even received before the kill message from the departure region
(race condition), causing some attachments not to re-rez in your viewer.
Interestingly, the arrival sim always gets the correct, full list of your
attachments since you will notice that, even if not rezzed, they are still
active (their scripts still work): this is also why, when TPing or crossing
a boundary to another sim, your attachments often reappear (the region
still transmitted them correctly to the next sim).

To make things even more complicated, the bake server (which keeps a copy of
the COF) also seems to receive the kill and re-parent messages from the
sims, and is therefore reflecting the same bad state for your attachments
in its copy of the COF; this is why, in my workaround for that bug (which
simply ignores the bogus kill object messages sent from the departure sim),
I also trigger a COF full resync (wearables + attachments) and a rebake.
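
A minimal sketch of the workaround logic described above, written in Python
for brevity (the viewer itself is C++). The class, method and field names
are hypothetical and are not taken from any viewer source; the only rule it
illustrates is "ignore a kill message for one of your own attachments when
it comes from a region other than the current agent region, and flag a COF
resync plus rebake instead".

```python
from dataclasses import dataclass, field


@dataclass
class AttachmentTracker:
    """Hypothetical bookkeeping for the avatar's own attachments."""
    agent_region: str                                # region the avatar is in now
    attachments: set = field(default_factory=set)    # ids of own attachments
    resync_requested: bool = False                   # COF resync + rebake needed

    def on_region_change(self, new_region: str) -> None:
        # Called once the avatar has arrived in the new region.
        self.agent_region = new_region

    def on_kill_object(self, sender_region: str, object_id: int) -> bool:
        """Return True if the object should be derezzed in the viewer."""
        bogus = (object_id in self.attachments
                 and sender_region != self.agent_region)
        if bogus:
            # Kill sent by a region we already left: ignore it and ask
            # for a full COF resync and a rebake instead.
            self.resync_requested = True
            return False
        # Legitimate kill: forget the object if it was an attachment.
        self.attachments.discard(object_id)
        return True
```

With this rule, a kill for attachment 1 arriving from region "A" after the
agent moved to region "B" is dropped, while the same message coming from
"B" is honoured.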

What I do not understand in the first place is why the Hell the departing
sim sends kill_objects messages at all to the departing avatar for its
attachments! The attachments follow the avatar and therefore do not change
parent. Even their position, being relative to the parent avatar, does not
change. Same thing for the bake server, which apparently receives the same
message while it should not (the avatar outfit did not change at all, and
even if a scripted object could get detached on arrival as a result of
a scripted changed() LSL event, that event would still occur in the arrival
sim and be sent from it to both the viewer and the bake server: the departure
sim should never have anything to do with objects attached to the avatar!).


2.- Bogus rebakes with bad body textures.

For about one week now, the above workaround (which worked fine for almost
4 months) gets partly defeated because the rebake it triggers gets the bake
server to return bogus textures for the body parts (sometimes a layer, such
as a tattoo, is missing; other times it is as if the avatar were not wearing
any skin texture).
Of course, the user can still rebake manually to get it fixed (well, at least
in my viewer, since I'm not even sure you still can rebake in LL's official
viewer), but this is extremely annoying and pretty much inexplicable to me
(nothing wrong, seen viewer side, just bad baked textures arriving).
So, I was in the process of coding yet another workaround (double-rebake
after the bogus kill-attachment message is ignored) when I decided to write
this email, because this is just getting *ridiculous* !


3.- Failed event polls.

For about 3 weeks, and almost always since last week, I get failed/retried
event polls, which never happened before. Here is a log of one such failed
poll:

DEBUG: LLCoreHttpUtil::HttpCoroHandler::onCompleted: Error Http_499 - Cannot access url: https://sim10685.agni.lindenlab.com:12043/cap/287c7dfd-63d1-a74c-6670-17f4f2d1d5c3 - Reason: Malformed response contents
INFO: LLSDXMLParser::parse: XML_STATUS_ERROR parsing:<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
INFO: LLSDXMLParser::parse: XML_STATUS_ERROR parsing:<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
DEBUG: LLCoreHttpUtil::HttpCoroHandler::onCompleted: Returned body:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>502 Proxy Error</title>
</head><body>
<h1>Proxy Error</h1>
<p>The proxy server received an invalid
response from an upstream server.<br />
The proxy server could not handle the request <em><a href="http://localhost:13011/agent/b43c4b76-3816-49ce-933d-e1a4eef3226e/event-get">POST&nbsp;http://localhost:13011/agent/b43c4b76-3816-49ce-933d-e1a4eef3226e/event-get</a></em>.<p>
Reason: <strong>Error reading from remote server</strong></p></p>
</body></html>
WARNING: LLEventPollImpl::eventPollCoro: Event poll <13> Retrying in 65 seconds; error count is now 10

Most of the time, these failed event polls happen for neighbouring sims,
but they do also sometimes happen for the agent sim, meaning that should
the failed retries count reach 10, the agent gets disconnected as a result.

By looking at the returned body contents, you can see a reference to
"localhost" (which I assume would be an attempt by the server to access
a service running on itself), and to ports in the 120xx and 130xx range,
which are UDP ports... Could it mean that you forgot to remove calls on
your servers that attempt to access UDP services which have since been
shut down on them?
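
To illustrate why the parser chokes in the log above: the event poll
expects an LLSD XML body but gets an HTML error page from the proxy. Here
is a small Python sketch that tells the two apart before parsing. The
function name and the returned labels are hypothetical, and the check for
an "<llsd" root element is an assumption about how LLSD XML bodies start.

```python
import re

_TITLE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def classify_poll_body(body: str):
    """Return (kind, detail) for an event poll response body.

    kind is "llsd", "html_error" or "unknown"; detail is the HTML page
    title for "html_error" (when present) and None otherwise.
    """
    text = body.lstrip()
    lowered = text.lower()
    # Assumption: LLSD XML starts with an XML prolog or an <llsd> element.
    if lowered.startswith("<?xml") or lowered.startswith("<llsd"):
        return ("llsd", None)
    # An HTML page here means a proxy or web server answered in place
    # of the simulator's event queue.
    if lowered.startswith("<!doctype html") or lowered.startswith("<html"):
        match = _TITLE.search(text)
        return ("html_error", match.group(1).strip() if match else None)
    return ("unknown", None)
```

Fed the body shown in the log, this returns ("html_error", "502 Proxy
Error"), which lets the caller report the real upstream failure instead of
a generic "Malformed response contents".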


4.- Failed TPs

It has been years since I last saw this many failed TPs resulting in timeouts
and disconnections... I mean, I barely got one in a blue moon for the past
5 years (at least), and I'm now seeing one or several every day!


In the hope my observations will help you guys to get things back on
track (because it's getting really badly needed).


Regards,

Henri.


From oz at lindenlab.com  Mon Apr  8 07:49:42 2019
From: oz at lindenlab.com (Oz Linden (Scott Lawrence))
Date: Mon, 8 Apr 2019 10:49:42 -0400
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <20190408152700.2176e4856a699767f411313c@free.fr>
References: <20190408152700.2176e4856a699767f411313c@free.fr>
Message-ID: <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>

On 2019-04-08 09:27 , Henri Beauchamp wrote:
> In the hope my observations will help you guys to get things back on
> track (because it's getting really badly needed).

Thank you Henri.

We are very much aware of these problems and trying hard to correct 
them, but I believe you may have added some useful insights; I've 
forwarded your message to the two teams I have attacking them.

-- 
OZ LINDEN | Senior Director, Second Life Engineering
email: oz at lindenlab.com <mailto:oz at lindenlab.com> | Scott Lawrence 
<https://www.linkedin.com/in/scottdlawrence/>
LINDEN LAB | Create Virtual Experiences <https://www.lindenlab.com>

From sldev at vlan1000.net  Mon Apr  8 14:29:46 2019
From: sldev at vlan1000.net (Alex)
Date: Tue, 09 Apr 2019 07:29:46 +1000
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
References: <20190408152700.2176e4856a699767f411313c@free.fr>
	<4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
Message-ID: <4a73478b4345b0aac3c8dd871f026fac@vlan1000.net>

Hi Oz, 

Perhaps the SL grid status page should list teleports as being in a degraded
state rather than fully operational. TPVs are being blamed for this
issue by many users. Listing a service as "operational" also implies that it
is fully operational, with no issues.

On 2019-04-09 00:49, Oz Linden (Scott Lawrence) wrote:

> On 2019-04-08 09:27 , Henri Beauchamp wrote: 
> 
>> In the hope my observations will help you guys to get things back on
>> track (because it's getting really badly needed).
> 
> Thank you Henri. 
> 
> We are very much aware of these problems and trying hard to correct them, but I believe you may have added some useful insights; I've forwarded your message to the two teams I have attacking them.
> 
> -- 
> OZ LINDEN | Senior Director, Second Life Engineering
> email: oz at lindenlab.com | Scott Lawrence [1]
> LINDEN LAB | Create Virtual Experiences [2] 
> _______________________________________________
> Policies and (un)subscribe information available here:
> http://wiki.secondlife.com/wiki/OpenSource-Dev
> Please read the policies before posting to keep unmoderated posting privileges

-- 
Kind Regards,
Alex. 

Links:
------
[1] https://www.linkedin.com/in/scottdlawrence/
[2] https://www.lindenlab.com

From sldev at free.fr  Wed Apr 10 04:27:48 2019
From: sldev at free.fr (Henri Beauchamp)
Date: Wed, 10 Apr 2019 13:27:48 +0200
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
References: <20190408152700.2176e4856a699767f411313c@free.fr>
	<4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
Message-ID: <20190410132748.9d907b9ddd6da474ae3ea267@free.fr>

On Mon, 8 Apr 2019 10:49:42 -0400, Oz Linden (Scott Lawrence) wrote:

> On 2019-04-08 09:27 , Henri Beauchamp wrote:
> > In the hope my observations will help you guys to get things back on
> > track (because it's getting really badly needed).
> 
> Thank you Henri.
> 
> We are very much aware of these problems and trying hard to correct 
> them, but I believe you may have added some useful insights; I've 
> forwarded your message to the two teams I have attacking them.

Thank you !

In the meantime, I managed to diagnose the rebake issue and fixed it:
what I found out could, perhaps, also be a hint about what happens for
failed TPs.

The problem was a race condition (as is often the case in
server/viewer communication): my workaround for derezzing attachments
triggered a rebake in the arrival sim, but I did not check whether the
capabilities for that sim were received or not; this did not cause any
issue before last week (i.e. rebaking with old (cached, in my viewer)
capabilities URIs did not matter), but it does now (whether it is the
result of a changed timing, algorithm or policy on the server side is
of course unknown to me).

In my new code, I simply flag the rebake as needed and now actually
perform it only once the capabilities for the new agent region are
received, and everything works like a charm (well, not fully, since that
workaround should not even be needed in the first place, and the bogus
kill-objects message on agent attachments is still wrongly sent by the
departure region).
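
The "flag it, then perform it once the capabilities arrive" logic can be
sketched as follows, in Python for brevity. The class and method names are
hypothetical and the rebake callable stands in for whatever actually
triggers the rebake in a viewer; this is only an illustration of the
deferral, not the actual viewer code.

```python
class RebakeScheduler:
    """Defers a rebake until the agent region's capabilities are known."""

    def __init__(self, rebake):
        self._rebake = rebake        # hypothetical callable: rebake(region)
        self._agent_region = None    # region the agent is in now
        self._caps_region = None     # region whose capabilities we hold
        self._pending = False        # a rebake was requested but not done

    def on_region_change(self, region):
        self._agent_region = region
        self._try_rebake()

    def on_capabilities_received(self, region):
        self._caps_region = region
        self._try_rebake()

    def request_rebake(self):
        # Only flag the rebake; it runs now if the capabilities are
        # already current, otherwise when they arrive.
        self._pending = True
        self._try_rebake()

    def _try_rebake(self):
        caps_current = (self._agent_region is not None
                        and self._caps_region == self._agent_region)
        if self._pending and caps_current:
            self._pending = False
            self._rebake(self._agent_region)
```

The point of the design is that stale (cached) capability URIs from the
previous region are never used: the request is simply held until the
capabilities for the new agent region have been received.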

Now, why might it be related at all to failed TPs?... Well, if the
capabilities are received too late, and seeing how viewers implementing
region-Windlight are way more prone to timeouts than mine, it could
indeed be that the viewers are attempting to use a capability that is
not yet available, causing a timeout...

Regards,

Henri.

From nagle at animats.com  Wed Apr 10 11:15:02 2019
From: nagle at animats.com (John Nagle)
Date: Wed, 10 Apr 2019 11:15:02 -0700
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <20190410132748.9d907b9ddd6da474ae3ea267@free.fr>
References: <20190408152700.2176e4856a699767f411313c@free.fr>
	<4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
	<20190410132748.9d907b9ddd6da474ae3ea267@free.fr>
Message-ID: <6defbf45-c4f2-153c-8a38-7072bea1788d@animats.com>

On 4/10/19 4:27 AM, Henri Beauchamp wrote:
> In the mean time, I managed to diagnose the rebake issue and fixed it:
> what I found out could, perhaps, also be a hint about what happens for
> failed TPs.
> 
> The problem was a race condition (as it is often the case in the
> server/viewer communication): my workaround for derezzing attachments
> triggered a rebake in the arrival sim, but I did not check whether the
> capabilities for that sim were received or not; this did not cause any
> issue before last week (i.e. rebaking with old (cached, in my viewer)
> capabilities URIs did not matter), but it does now (whether it is the
> result of a changed timing, algorithm or policy on the server side is
> of course unknown to me).
> 
> In my new code, I simply flag the rebake as needed and now actually
> perform it only once the capabilities for the new agent region are
> received, and everything works like a charm (well, not fully, since that
> workaround should not even be needed in the first place, and the bogus
> kill-objects message on agent attachments is still wrongly sent by the
> departure region).
> 
> Now, why might it be related at all with failed TPs ?... Well, if the
> capabilities are received too late, and seeing how viewers implementing
> region-Windlight are way more prone to timeouts than mine, it could
> indeed be that the viewers are attempting to use a capability that is
> not yet available, causing a timeout...

    Hm. On failed teleports, I see lots of capability retrieval
failures in the Firestorm log. Like this:

2019-04-09T19:57:01Z WARNING #CoreHttp#  llcorehttp/_httppolicy.cpp(434) stageAfterCompletion : HTTP request 0x7f81c4f9a5f0 failed after 5 retries.  Reason:  Not Found (Http_404)
2019-04-09T19:57:01Z WARNING #CoreHTTP# llmessage/llcorehttputil.cpp(282) onCompleted : Possible failure [Http_404] cannot POST url 'https://sim10658.agni.lindenlab.com:12043/cap/4d310fee-b2c7-cb62-357d-0317811990e7' because Not Found
2019-04-09T19:57:01Z INFO #  llcommon/llsdserialize_xml.cpp(417) parse : LLSDXMLParser::Impl::parse: XML_STATUS_ERROR parsing:cap not found: '4d310fee-b2c7-cb62-357d-0317811990e7'
2019-04-09T19:57:01Z WARNING #LLEventPollImpl# newview/lleventpoll.cpp(222) eventPollCoro : Canceling coroutine
2019-04-09T19:57:01Z WARNING #CoreHttp#  llcorehttp/_httppolicy.cpp(434) stageAfterCompletion : HTTP request 0x305e8e50 failed after 1 retries. Reason:  Not Found (Http_404)
2019-04-09T19:57:01Z WARNING #CoreHTTP# llmessage/llcorehttputil.cpp(282) onCompleted : Possible failure [Http_404] cannot POST url 'https://sim10412.agni.lindenlab.com:12043/cap/d3b0c971-112d-b7c7-00d9-17dcb92b5027' because Not Found
2019-04-09T19:57:01Z INFO #  llcommon/llsdserialize_xml.cpp(417) parse : LLSDXMLParser::Impl::parse: XML_STATUS_ERROR parsing:cap not found: 'd3b0c971-112d-b7c7-00d9-17dcb92b5027'
2019-04-09T19:57:01Z WARNING #LLEventPollImpl# newview/lleventpoll.cpp(222) eventPollCoro : Canceling coroutine
2019-04-09T19:57:02Z INFO #  newview/llviewerdisplay.cpp(239) display_stats : FPS: 21.70
2019-04-09T19:57:02Z WARNING #CoreHttp#  llcorehttp/_httppolicy.cpp(434) stageAfterCompletion : HTTP request 0x40ff54b0 failed after 1 retries. Reason:  Not Found (Http_404)
2019-04-09T19:57:02Z WARNING #CoreHTTP# llmessage/llcorehttputil.cpp(282) onCompleted : Possible failure [Http_404] cannot POST url 'https://sim10658.agni.lindenlab.com:12043/cap/11bb494c-1a04-24f1-8644-826273c548d8' because Not Found
2019-04-09T19:57:02Z INFO #  llcommon/llsdserialize_xml.cpp(417) parse : LLSDXMLParser::Impl::parse: XML_STATUS_ERROR parsing:cap not found: '11bb494c-1a04-24f1-8644-826273c548d8'
2019-04-09T19:57:02Z WARNING #LLEventPollImpl# newview/lleventpoll.cpp(222) eventPollCoro : Canceling coroutine

This was from yesterday, when Oz had people from Server User Group
TPing between three mostly empty Linden sims as a test.

I've seen occasional errors like that before. That there are ever
404 errors for a cap seems wrong. The sim told the viewer to fetch
that URL directly from the sim, and then the sim didn't have it
available.
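
For anyone wanting to check their own logs for the same pattern, a rough
Python sketch that tallies failed capability POSTs per sim host follows.
The function name is hypothetical and the regular expression is only
fitted to the "cannot POST url" lines quoted above; other viewers or
versions may word these messages differently.

```python
import re
from collections import Counter

# Matches e.g.: [Http_404] cannot POST url 'https://HOST:PORT/cap/UUID'
# \s+ also spans line breaks, so mail-wrapped log excerpts still match.
CAP_FAILURE = re.compile(
    r"\[Http_(\d+)\]\s+cannot\s+POST\s+url\s+"
    r"'https://([^:/']+)[^']*?/cap/([0-9a-f-]+)'"
)


def count_cap_failures(log_text: str) -> Counter:
    """Count failed capability POSTs, keyed by (host, HTTP status)."""
    counts = Counter()
    for status, host, _cap_id in CAP_FAILURE.findall(log_text):
        counts[(host, int(status))] += 1
    return counts
```

Run over the excerpt above, it reports two 404s for
sim10658.agni.lindenlab.com and one for sim10412.agni.lindenlab.com.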

				John Nagle


From oz at lindenlab.com  Wed Apr 17 06:29:14 2019
From: oz at lindenlab.com (Oz Linden (Scott Lawrence))
Date: Wed, 17 Apr 2019 09:29:14 -0400
Subject: [opensource-dev] Discontinuing the viewer-development-commits list
Message-ID: <9e4d4844-c6a3-22bd-4f6e-8f2ec66a72b9@lindenlab.com>

As part of cleaning up little-used infrastructure, I am discontinuing 
the posts for commits to viewer-release on the 
viewer-development-commits list. If you would like notices, you can 
create your own personal notices by 'watching' the repository using 
bitbucket.

-- 
OZ LINDEN | Senior Director, Second Life Engineering
email: oz at lindenlab.com <mailto:oz at lindenlab.com> | Scott Lawrence 
<https://www.linkedin.com/in/scottdlawrence/>
LINDEN LAB | Create Virtual Experiences <https://www.lindenlab.com>

From nickyperian at gmail.com  Tue Apr 23 09:11:41 2019
From: nickyperian at gmail.com (Nicky Perian)
Date: Tue, 23 Apr 2019 11:11:41 -0500
Subject: [opensource-dev] Last linux 3p-vlc-bin
Message-ID: <CAF34W=EWPt5YvvfmJ-W9JAP1XwOnkU=7JCSJggGMBmW45OBBPg@mail.gmail.com>

https://automated-builds-secondlife-com.s3.amazonaws.com/hg/repo/3p-vlc-bin/rev/315283/arch/Linux/vlc_bin-2.2.3-linux-201606011750-r10.tar.bz2

Can the node on which this LL build (id 315283) was built be identified?

From jack at raspberrypi.org  Fri Apr 26 06:24:32 2019
From: jack at raspberrypi.org (Jack Lang)
Date: Fri, 26 Apr 2019 14:24:32 +0100
Subject: [opensource-dev] Firestorm viewer on RaspberryPi
In-Reply-To: <CAF34W=EWPt5YvvfmJ-W9JAP1XwOnkU=7JCSJggGMBmW45OBBPg@mail.gmail.com>
References: <CAF34W=EWPt5YvvfmJ-W9JAP1XwOnkU=7JCSJggGMBmW45OBBPg@mail.gmail.com>
Message-ID: <CAOdmOsQhexm+ciBjb5jOuQZdW6nVsNFGvvSs7C1bvPYY+RKa-A@mail.gmail.com>

I'm trying to port Firestorm to Raspberry Pi (Debian 9), following
https://wiki.phoenixviewer.com/fs_compiling_firestorm_alexivy_debian_9

However it fails to find pthread_create, even though it is installed. Where
might the filepath be defined?

Thanks

Jack Lang

From sldev at free.fr  Fri Apr 26 11:09:30 2019
From: sldev at free.fr (Henri Beauchamp)
Date: Fri, 26 Apr 2019 20:09:30 +0200
Subject: [opensource-dev] Firestorm viewer on RaspberryPi
In-Reply-To: <CAOdmOsQhexm+ciBjb5jOuQZdW6nVsNFGvvSs7C1bvPYY+RKa-A@mail.gmail.com>
References: <CAF34W=EWPt5YvvfmJ-W9JAP1XwOnkU=7JCSJggGMBmW45OBBPg@mail.gmail.com>
	<CAOdmOsQhexm+ciBjb5jOuQZdW6nVsNFGvvSs7C1bvPYY+RKa-A@mail.gmail.com>
Message-ID: <20190426200930.e180a27dfd3a7b89affa3ac1@free.fr>

On Fri, 26 Apr 2019 14:24:32 +0100, Jack Lang wrote:

> I'm trying to port Firestorm to Raspberry Pi

Are you serious ? O.O

Since mesh was implemented, the viewer code has been tightly bound to
x86 processors, because it makes heavy use of SSE2 math. You'd have
to translate all that code into an ARM equivalent (NEON); while doable,
it's not for the faint-hearted.

Even if you could get the code to compile (which would also involve
recompiling all the pre-built libraries, such as Dullahan, Collada and a
few others you likely don't have in the Pi's distro), there are also
the *big* problems with the CPU and GPU speeds; the resulting viewer
would likely render at under 5fps, at best (and in a skybox with no
avatar around)...

And finally, there is the *wall*: 1 GB of RAM is simply too little to
run a viewer (even a v1 viewer such as mine, even compiled for 32 bits).

Henri.