From sldev at free.fr Mon Apr 8 06:27:00 2019
From: sldev at free.fr (Henri Beauchamp)
Date: Mon, 8 Apr 2019 15:27:00 +0200
Subject: [opensource-dev] SL server issues in the past 4 months
Message-ID: <20190408152700.2176e4856a699767f411313c@free.fr>

Greetings,

I'm writing this email because I'm getting tired of having to pile up workarounds in my viewer code for all the SL server issues that have appeared in succession, every few weeks, since mid-December, and I think it is time for you folks, at LL, to do some serious review of the latest changes you brought server-side.

1.- Bogus attachment kill messages on region change.

This issue appeared mid-December 2018 and results in derezzed (but still active) attachments on region change (sim border crossing or TP alike). I instrumented my viewer code so as to be able to trace the problem. Here is how to witness it occurring in real time:

Get the Cool VL Viewer, disable the workaround for lost attachments (un-check Advanced -> Network -> Ignore bogus kill-attachment messages), enable the "Attachments" debug tag (from Advanced -> Consoles -> Debug tags) and the debug console, then TP and/or cross region boundaries. Just watch the debug messages, which will allow you to track every event dealing with your attachments thanks to the special debug code I added to understand and fix this issue.

You will see that, often (but not always), the departure sim sends "kill object" messages for your attachments (which causes them to de-rez in your viewer) while you are already in the arrival region; in this case, the arrival region usually sends a re-parenting message for your attachments (they get parented to your avatar again and thus re-rez). Sadly, the re-parenting message is sometimes (often, even) not received, or is incomplete, or is even received before the kill message from the departure region (race condition), causing some attachments not to re-rez in your viewer.
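
The sequence described above suggests a viewer-side guard: drop kill messages for the agent's own attachments when they come from a region other than the one the agent is now in. The Python below is only a sketch of that idea under stated assumptions; it is not the Cool VL Viewer's code (which is C++), and the class, field and method names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AttachmentKillFilter:
    """Hypothetical guard; every name here is illustrative only."""

    agent_region: str                                   # region the agent is in now
    attachment_ids: set = field(default_factory=set)    # ids of the agent's attachments
    ignored: list = field(default_factory=list)         # dropped (region, object) pairs

    def should_apply_kill(self, sender_region: str, object_id: str) -> bool:
        """Return False when a kill message looks bogus and should be ignored."""
        is_own_attachment = object_id in self.attachment_ids
        from_other_region = sender_region != self.agent_region
        if is_own_attachment and from_other_region:
            # Assumption: the departure region has no business killing
            # objects that travel with the avatar, so record and drop it.
            self.ignored.append((sender_region, object_id))
            return False
        return True
```

With `agent_region="arrival"` and a known attachment id, a kill for that id sent by `"departure"` is dropped, while kills from the arrival region, or for objects that are not attachments, are still applied.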
Interestingly, the arrival sim always got the correct, full list of your attachments since you will notice that, even if not rezzed, they are still active (their scripts still work): this is also why, when TPing or crossing a boundary to another sim, your attachments often reappear (the region still transmitted them correctly to the next sim).

To make things even more complicated, the bake server (which keeps a copy of the COF) also seems to receive the kill and re-parent messages from the sims, and therefore reflects the same bad state for your attachments in its copy of the COF; this is why, in my workaround for that bug (which simply ignores the bogus kill object messages sent from the departure sim), I also trigger a full COF resync (wearables + attachments) and a rebake.

What I do not understand in the first place is why the Hell the departing sim sends kill_objects messages at all to the departing avatar for its attachments ! The attachments follow the avatar and therefore do not change parent. Even their position, being relative to the parent avatar, does not change. Same thing for the bake server, which apparently receives the same message while it should not (the avatar outfit did not change at all, and even if a scripted object could get detached on arrival as a result of a scripted changed() LSL event, that event would still occur in the arrival sim and be sent from it to both the viewer and the bake server: the departure sim shall never have anything to do with objects attached to the avatar !).

2.- Bogus rebakes with bad body textures.

For about one week now, the above workaround (which worked fine for almost 4 months) gets partly defeated because the rebake it triggers gets the bake server to return bogus textures for the body parts (sometimes a layer, such as a tattoo, is missing, other times it is as if the avatar was not wearing any skin texture).
Of course, the user can still rebake manually to get it fixed (well, at least in my viewer, since I'm not even sure you still can rebake in LL's official viewer), but this is extremely annoying and pretty much inexplicable to me (nothing wrong, seen viewer side, just bad baked textures arriving).

So, I was in the process of coding yet another workaround (double-rebake after the bogus kill-attachment message is ignored) when I decided to write this email, because this is just getting *ridiculous* !

3.- Failed event polls.

For about 3 weeks, and almost always since last week, I get failed/retried event polls, which never happened before. Here is a log of one such failed poll:

DEBUG: LLCoreHttpUtil::HttpCoroHandler::onCompleted: Error Http_499 - Cannot access url: https://sim10685.agni.lindenlab.com:12043/cap/287c7dfd-63d1-a74c-6670-17f4f2d1d5c3 - Reason: Malformed response contents
INFO: LLSDXMLParser::parse: XML_STATUS_ERROR parsing:
INFO: LLSDXMLParser::parse: XML_STATUS_ERROR parsing:
DEBUG: LLCoreHttpUtil::HttpCoroHandler::onCompleted: Returned body: 502 Proxy Error

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request POST http://localhost:13011/agent/b43c4b76-3816-49ce-933d-e1a4eef3226e/event-get.

Reason: Error reading from remote server

WARNING: LLEventPollImpl::eventPollCoro: Event poll <13> Retrying in 65 seconds; error count is now 10

Most of the time, these failed event polls happen for neighbouring sims, but they do also sometimes happen for the agent sim, meaning that should the failed retries count reach 10, the agent gets disconnected as a result.

By looking at the returned body contents, you can see a reference to "localhost" (which I assume would be an attempt by the server to access a service running on itself), and to ports in the 120xx and 130xx range, which are UDP ports... Would it mean you forgot to remove the calls on your servers that attempt to access UDP services that got shut down on them ?...

4.- Failed TPs

It has been years since I last saw that many failed TPs resulting in timeouts and disconnections... I mean, I got barely one in a blue moon for the past 5 years (at least), and I'm now seeing one or several every day !

In the hope my observations will help you guys to get things back on track (because it's getting really badly needed).

Regards,

Henri.

From oz at lindenlab.com Mon Apr 8 07:49:42 2019
From: oz at lindenlab.com (Oz Linden (Scott Lawrence))
Date: Mon, 8 Apr 2019 10:49:42 -0400
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <20190408152700.2176e4856a699767f411313c@free.fr>
References: <20190408152700.2176e4856a699767f411313c@free.fr>
Message-ID: <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>

On 2019-04-08 09:27 , Henri Beauchamp wrote:
> In the hope my observations will help you guys to get things back on
> track (because it's getting really badly needed).

Thank you Henri.

We are very much aware of these problems and trying hard to correct them, but I believe you may have added some useful insights; I've forwarded your message to the two teams I have attacking them.
--
OZ LINDEN | Senior Director, Second Life Engineering
email: oz at lindenlab.com | Scott Lawrence
LINDEN LAB | Create Virtual Experiences

From sldev at vlan1000.net Mon Apr 8 14:29:46 2019
From: sldev at vlan1000.net (Alex)
Date: Tue, 09 Apr 2019 07:29:46 +1000
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
References: <20190408152700.2176e4856a699767f411313c@free.fr> <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
Message-ID: <4a73478b4345b0aac3c8dd871f026fac@vlan1000.net>

Hi Oz,

Perhaps the SL grid status page should list teleports as in a degraded state rather than fully operational. TPVs are being blamed for this issue by many users. Listing a service as "operational" also implies that it is fully operational, with no issues.

On 2019-04-09 00:49, Oz Linden (Scott Lawrence) wrote:
> On 2019-04-08 09:27 , Henri Beauchamp wrote:
>
>> In the hope my observations will help you guys to get things back on
>> track (because it's getting really badly needed).
>
> Thank you Henri.
>
> We are very much aware of these problems and trying hard to correct them, but I believe you may have added some useful insights; I've forwarded your message to the two teams I have attacking them.
>
> --
> OZ LINDEN | Senior Director, Second Life Engineering
> email: oz at lindenlab.com | Scott Lawrence [1]
> LINDEN LAB | Create Virtual Experiences [2]

--
Kind Regards,
Alex.
Links:
------
[1] https://www.linkedin.com/in/scottdlawrence/
[2] https://www.lindenlab.com

From sldev at free.fr Wed Apr 10 04:27:48 2019
From: sldev at free.fr (Henri Beauchamp)
Date: Wed, 10 Apr 2019 13:27:48 +0200
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
References: <20190408152700.2176e4856a699767f411313c@free.fr> <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com>
Message-ID: <20190410132748.9d907b9ddd6da474ae3ea267@free.fr>

On Mon, 8 Apr 2019 10:49:42 -0400, Oz Linden (Scott Lawrence) wrote:
> On 2019-04-08 09:27 , Henri Beauchamp wrote:
> > In the hope my observations will help you guys to get things back on
> > track (because it's getting really badly needed).
>
> Thank you Henri.
>
> We are very much aware of these problems and trying hard to correct
> them, but I believe you may have added some useful insights; I've
> forwarded your message to the two teams I have attacking them.

Thank you !

In the meantime, I managed to diagnose the rebake issue and fix it: what I found out could, perhaps, also be a hint about what happens for failed TPs.

The problem was a race condition (as is often the case in server/viewer communication): my workaround for derezzing attachments triggered a rebake in the arrival sim, but I did not check whether the capabilities for that sim were received or not; this did not cause any issue before last week (i.e. rebaking with old (cached, in my viewer) capabilities URIs did not matter), but it does now (whether it is the result of a changed timing, algorithm or policy on the server side is of course unknown to me).
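
The race above (acting on a new region before its capabilities have arrived) can be illustrated with a small sketch. The Python below is not the Cool VL Viewer's code (which is C++); the class, method and capability names are hypothetical, and it only shows the general idea of deferring an action until the current region's capabilities are known instead of using cached URIs from the previous region.

```python
class AgentRegion:
    """Hypothetical per-agent state; all names are illustrative only."""

    def __init__(self):
        self.capabilities = None     # None until caps arrive for the current region
        self.rebake_pending = False  # set when a rebake was requested too early
        self.rebakes_done = 0        # stands in for the real rebake request

    def on_region_change(self):
        # Assumption: capability URIs cached from the old region must not be reused.
        self.capabilities = None

    def request_rebake(self):
        if self.capabilities is None:
            # Too early: remember the request instead of acting on stale URIs.
            self.rebake_pending = True
        else:
            self._do_rebake()

    def on_capabilities_received(self, caps):
        self.capabilities = dict(caps)
        if self.rebake_pending:
            self.rebake_pending = False
            self._do_rebake()

    def _do_rebake(self):
        self.rebakes_done += 1
```

A rebake requested right after `on_region_change()` is only recorded; it runs once `on_capabilities_received()` is called, and later requests run immediately.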
In my new code, I simply flag the rebake as needed and now actually perform it only once the capabilities for the new agent region are received, and everything works like a charm (well, not fully, since that workaround should not even be needed in the first place, and the bogus kill-objects message on agent attachments is still wrongly sent by the departure region).

Now, why might it be related at all to failed TPs ?... Well, if the capabilities are received too late, and seeing how viewers implementing region-Windlight are way more prone to timeouts than mine, it could indeed be that the viewers are attempting to use a capability that is not yet available, causing a timeout...

Regards,

Henri.

From nagle at animats.com Wed Apr 10 11:15:02 2019
From: nagle at animats.com (John Nagle)
Date: Wed, 10 Apr 2019 11:15:02 -0700
Subject: [opensource-dev] SL server issues in the past 4 months
In-Reply-To: <20190410132748.9d907b9ddd6da474ae3ea267@free.fr>
References: <20190408152700.2176e4856a699767f411313c@free.fr> <4c257e6a-55b3-f129-ad7c-68743aad9404@lindenlab.com> <20190410132748.9d907b9ddd6da474ae3ea267@free.fr>
Message-ID: <6defbf45-c4f2-153c-8a38-7072bea1788d@animats.com>

On 4/10/19 4:27 AM, Henri Beauchamp wrote:
> In the mean time, I managed to diagnose the rebake issue and fixed it:
> what I found out could, perhaps, also be a hint about what happens for
> failed TPs.
>
> The problem was a race condition (as it is often the case in the
> server/viewer communication): my workaround for derezzing attachments
> triggered a rebake in the arrival sim, but I did not check whether the
> capabilities for that sim were received or not; this did not cause any
> issue before last week (i.e. rebaking with old (cached, in my viewer)
> capabilities URIs did not matter), but it does now (whether it is the
> result of a changed timing, algorithm or policy on the server side is
> of course unknown to me).
>
> In my new code, I simply flag the rebake as needed and now actually
> perform it only once the capabilities for the new agent region are
> received, and everything works like a charm (well, not fully, since that
> workaround should not even be needed in the first place, and the bogus
> kill-objects message on agent attachments is still wrongly sent by the
> departure region).
>
> Now, why might it be related at all with failed TPs ?... Well, if the
> capabilities are received too late, and seeing how viewers implementing
> region-Windlight are way more prone to timeouts than mine, it could
> indeed be that the viewers are attempting to use a capability that is
> not yet available, causing a timeout...

Hm. On failed teleports, I see lots of capability retrieval failures in the Firestorm log. Like this:

2019-04-09T19:57:01Z WARNING #CoreHttp# llcorehttp/_httppolicy.cpp(434) stageAfterCompletion : HTTP request 0x7f81c4f9a5f0 failed after 5 retries. Reason: Not Found (Http_404)
2019-04-09T19:57:01Z WARNING #CoreHTTP# llmessage/llcorehttputil.cpp(282) onCompleted : Possible failure [Http_404] cannot POST url 'https://sim10658.agni.lindenlab.com:12043/cap/4d310fee-b2c7-cb62-357d-0317811990e7' because Not Found
2019-04-09T19:57:01Z INFO # llcommon/llsdserialize_xml.cpp(417) parse : LLSDXMLParser::Impl::parse: XML_STATUS_ERROR parsing:cap not found: '4d310fee-b2c7-cb62-357d-0317811990e7'
2019-04-09T19:57:01Z WARNING #LLEventPollImpl# newview/lleventpoll.cpp(222) eventPollCoro : Canceling coroutine
2019-04-09T19:57:01Z WARNING #CoreHttp# llcorehttp/_httppolicy.cpp(434) stageAfterCompletion : HTTP request 0x305e8e50 failed after 1 retries. Reason: Not Found (Http_404)
2019-04-09T19:57:01Z WARNING #CoreHTTP# llmessage/llcorehttputil.cpp(282) onCompleted : Possible failure [Http_404] cannot POST url 'https://sim10412.agni.lindenlab.com:12043/cap/d3b0c971-112d-b7c7-00d9-17dcb92b5027' because Not Found
2019-04-09T19:57:01Z INFO # llcommon/llsdserialize_xml.cpp(417) parse : LLSDXMLParser::Impl::parse: XML_STATUS_ERROR parsing:cap not found: 'd3b0c971-112d-b7c7-00d9-17dcb92b5027'
2019-04-09T19:57:01Z WARNING #LLEventPollImpl# newview/lleventpoll.cpp(222) eventPollCoro : Canceling coroutine
2019-04-09T19:57:02Z INFO # newview/llviewerdisplay.cpp(239) display_stats : FPS: 21.70
2019-04-09T19:57:02Z WARNING #CoreHttp# llcorehttp/_httppolicy.cpp(434) stageAfterCompletion : HTTP request 0x40ff54b0 failed after 1 retries. Reason: Not Found (Http_404)
2019-04-09T19:57:02Z WARNING #CoreHTTP# llmessage/llcorehttputil.cpp(282) onCompleted : Possible failure [Http_404] cannot POST url 'https://sim10658.agni.lindenlab.com:12043/cap/11bb494c-1a04-24f1-8644-826273c548d8' because Not Found
2019-04-09T19:57:02Z INFO # llcommon/llsdserialize_xml.cpp(417) parse : LLSDXMLParser::Impl::parse: XML_STATUS_ERROR parsing:cap not found: '11bb494c-1a04-24f1-8644-826273c548d8'
2019-04-09T19:57:02Z WARNING #LLEventPollImpl# newview/lleventpoll.cpp(222) eventPollCoro : Canceling coroutine

This was from yesterday, when Oz had people from Server User Group TPing between three mostly empty Linden sims as a test. I've seen occasional errors like that before. That there are ever 404 errors for a cap seems wrong. The sim told the viewer to fetch that URL directly from the sim, and then the sim didn't have it available.
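
For anyone wanting to tally such failures per sim host, the log excerpt above can be scanned with a short script. This is a rough sketch only: the regular expression is derived from the few "cannot POST url" lines shown above, so it assumes that exact wording and may not match other viewer versions; the function and pattern names are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted above (an assumption, not a
# documented format): host, port and cap UUID of a capability 404.
CAP_404 = re.compile(
    r"\[Http_404\] cannot POST url "
    r"'https://(?P<host>[^:/']+):(?P<port>\d+)/cap/(?P<cap>[0-9a-f-]{36})'"
)


def count_cap_404s(log_text):
    """Count capability 404 failures per sim host in a viewer log."""
    return Counter(match.group("host") for match in CAP_404.finditer(log_text))
```

Calling `count_cap_404s()` on the text of a log file returns a `Counter` keyed by host name, which makes it easy to see whether the failures cluster on particular sims.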
John Nagle

From oz at lindenlab.com Wed Apr 17 06:29:14 2019
From: oz at lindenlab.com (Oz Linden (Scott Lawrence))
Date: Wed, 17 Apr 2019 09:29:14 -0400
Subject: [opensource-dev] Discontinuing the viewer-development-commits list
Message-ID: <9e4d4844-c6a3-22bd-4f6e-8f2ec66a72b9@lindenlab.com>

As part of cleaning up little-used infrastructure, I am discontinuing the posts for commits to viewer-release on the viewer-development-commits list. If you would like notices, you can create your own personal notices by 'watching' the repository using bitbucket.

--
OZ LINDEN | Senior Director, Second Life Engineering
email: oz at lindenlab.com | Scott Lawrence
LINDEN LAB | Create Virtual Experiences

From nickyperian at gmail.com Tue Apr 23 09:11:41 2019
From: nickyperian at gmail.com (Nicky Perian)
Date: Tue, 23 Apr 2019 11:11:41 -0500
Subject: [opensource-dev] Last linux 3p-vlc-bin
Message-ID:

https://automated-builds-secondlife-com.s3.amazonaws.com/hg/repo/3p-vlc-bin/rev/315283/arch/Linux/vlc_bin-2.2.3-linux-201606011750-r10.tar.bz2

Can this LL build id 315283 be traced back to the node on which it was built?

From jack at raspberrypi.org Fri Apr 26 06:24:32 2019
From: jack at raspberrypi.org (Jack Lang)
Date: Fri, 26 Apr 2019 14:24:32 +0100
Subject: [opensource-dev] Firestorm viewer on RaspberryPi
In-Reply-To:
References:
Message-ID:

I'm trying to port Firestorm to Raspberry Pi (Debian 9), following https://wiki.phoenixviewer.com/fs_compiling_firestorm_alexivy_debian_9

However, it fails to find pthread_create, even though it is installed. Where might the filepath be defined?
Thanks

Jack Lang

From sldev at free.fr Fri Apr 26 11:09:30 2019
From: sldev at free.fr (Henri Beauchamp)
Date: Fri, 26 Apr 2019 20:09:30 +0200
Subject: [opensource-dev] Firestorm viewer on RaspberryPi
In-Reply-To:
References:
Message-ID: <20190426200930.e180a27dfd3a7b89affa3ac1@free.fr>

On Fri, 26 Apr 2019 14:24:32 +0100, Jack Lang wrote:
> I'm trying to port Firestorm to Raspberry Pi

Are you serious ? O.O

Since mesh has been implemented, the viewer code is tightly bound to x86 processors, because it makes heavy use of SSE2 math. You'd have to translate all that code into an ARM equivalent (NEON); while doable, it's not for the faint-hearted.

Even if you could get the code to compile (which would also involve recompiling all the pre-built libraries, such as Dullahan, Collada and a few others you likely don't have in the Pi's distro), there are also the *big* problems with the CPU and GPU speeds; the resulting viewer would likely render at under 5fps, at best (and in a skybox with no avatar around)...

And finally, there is the *wall*: 1 GB of RAM is notably too little to run a viewer (even a v1 viewer such as mine, even compiled for 32 bits).

Henri.