capnmidnight/Calla

Avatars are not always removed from map after leaving room

binarica opened this issue · 4 comments

Sometimes when exiting a room (either by closing the tab or clicking the "Leave" button), user names from previous sessions might still appear, which slows down the session when there are multiple avatars in the view:

[Screenshot: Screenshot_2020-10-07_23-39-44]

I've seen this issue on occasion.

Sometimes I wish I hadn't built on top of Jitsi. I first built this with Jitsi's IFrame API, which made the setup really fast and everything just worked. Then people complained about not having webcam feeds. I think webcam feeds are a mistake, but I caved and reimplemented using Jitsi's lib-jitsi-meet API. With the IFrame API, there was no way to share a video feed from one window to another. The audio wasn't shared between windows either, but that didn't matter because you could still hear it, whereas the graphics of my game occluded the video stream.

Then all the trouble started. The IFrame API is just a system for controlling Jitsi's own interface. Lib-jitsi-meet is the underlying library for building Jitsi's interface. And it's not easy to use. Building out a from-scratch WebRTC implementation would probably have been easier.

One of the issues that is difficult to manage is disconnecting users. For whatever reason, using the documented "leave" commands doesn't always succeed. Jitsi gets into a state where it still thinks people are connected, even though there are no feeds coming from them. I don't know what causes it, and I haven't had a lot of time to troubleshoot it lately.

When it gets really bad, I have to restart the Jitsi server.

This is a bug in Jitsi. If it doesn't happen when using the regular Jitsi Meet interface, then there must be some secret, undocumented way of leaving a conference that I'm not doing.

JaKXz commented

"Sometimes I wish I hadn't built on top of Jitsi."

Curious, is there another framework to build on top of that could achieve this?

If I were doing this again, I'd probably run a WebRTC server like Janus Gateway (https://github.com/meetecho/janus-gateway) and just do regular WebRTC calls. Jitsi has too many features for Calla. Most of what Jitsi does is just overhead for this project.

The WebRTC part is not hard. Actually, I'd say it's a lot easier than the audio spatialization portion that I've already built here with Calla. The hard part is that something like 20% of users in the wild are behind corporate or mobile network firewalls that prevent pure P2P connections from being established. That means that a room of 10 randomly selected people has only about a 13% chance of every user successfully connecting directly to every other user.
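The arithmetic behind that estimate can be sketched as follows. This is only a back-of-the-envelope model: it assumes each user is blocked independently with probability 0.2, which real networks will not satisfy exactly, and the function name is made up for illustration. Under that assumption, the quoted 13% corresponds to the nine peers seen by one participant (0.8^9); counting all ten participants gives a slightly lower figure (0.8^10).

```typescript
// Probability that none of `count` independently sampled users is behind a
// firewall that blocks direct P2P, given a per-user blocked rate.
// The independence assumption is a simplification, not a measured fact.
function allDirectProbability(count: number, blockedRate: number): number {
    return Math.pow(1 - blockedRate, count);
}

const tenUsers = allDirectProbability(10, 0.2);  // all ten participants unblocked
const ninePeers = allDirectProbability(9, 0.2);  // the nine peers of one participant

console.log(`${(tenUsers * 100).toFixed(1)}%`);  // prints "10.7%"
console.log(`${(ninePeers * 100).toFixed(1)}%`); // prints "13.4%"
```

Either way, the odds of a fully P2P room drop off quickly as the room grows.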

So to mitigate that issue, you need something called a TURN server (Traversal Using Relays around NAT), which is just a fancy way of saying "give up on P2P entirely and stick a centralized server in the middle". It's a fallback; WebRTC attempts a P2P connection first and only uses the centralized relay if it can't make a direct connection.
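In browser terms, the fallback is expressed through the ICE server list handed to `RTCPeerConnection`. Here is a minimal sketch of that configuration. The hostnames, the credentials, and the `buildPeerConfig` helper are placeholders invented for illustration, and the two interfaces are local stand-ins for the browser's `RTCIceServer` and `RTCConfiguration` types so the snippet also compiles outside a browser.

```typescript
// Local stand-ins for the browser's RTCIceServer / RTCConfiguration shapes.
interface IceServer {
    urls: string | string[];
    username?: string;
    credential?: string;
}

interface PeerConfig {
    iceServers: IceServer[];
    iceTransportPolicy?: "all" | "relay";
}

// Hypothetical helper: lists a STUN server (for discovering a direct path)
// and a TURN server (the relay used when no direct path works).
function buildPeerConfig(turnUser: string, turnPassword: string): PeerConfig {
    return {
        iceServers: [
            { urls: "stun:stun.example.org:3478" },
            {
                urls: "turn:turn.example.org:3478",
                username: turnUser,
                credential: turnPassword
            }
        ],
        // "all" lets ICE try direct candidates first; "relay" would force TURN.
        iceTransportPolicy: "all"
    };
}

const peerConfig = buildPeerConfig("demo-user", "demo-password");
// In a browser this would be passed as: new RTCPeerConnection(peerConfig)
console.log(JSON.stringify(peerConfig, null, 2));
```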

But TURN servers are a sensitive commodity on the internet. You have to be very careful when setting them up that other people on the internet don't steal your bandwidth without your permission. So it's not enough to just set up a TURN server; you also have to set up a secure authentication system between your web server and your TURN server.
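One common way to do that authentication is with short-lived credentials derived from a secret shared between the web server and the TURN server. coturn, for example, supports this through its `use-auth-secret` option. The thread doesn't say which scheme was actually used, so treat the following as a sketch of the general technique only. The function name, the secret, and the user ID are made up, and the code runs on the web server (Node.js), never in the browser.

```typescript
import { createHmac } from "node:crypto";

interface TurnCredentials {
    username: string;   // "<expiry unix time>:<user id>"
    credential: string; // base64(HMAC-SHA1(sharedSecret, username))
}

// Hypothetical helper: mints credentials that the TURN server can verify
// with the same shared secret and that stop working after the expiry time.
function makeTurnCredentials(
    sharedSecret: string,
    userId: string,
    ttlSeconds: number,
    nowSeconds: number
): TurnCredentials {
    const expiry = nowSeconds + ttlSeconds;
    const username = `${expiry}:${userId}`;
    const credential = createHmac("sha1", sharedSecret)
        .update(username)
        .digest("base64");
    return { username, credential };
}

// Fixed timestamp so the example is reproducible; a real server would use
// Math.floor(Date.now() / 1000).
const creds = makeTurnCredentials("not-a-real-secret", "alice", 3600, 1600000000);
console.log(creds.username); // prints "1600003600:alice"
```

The web server hands these values to the client as the `username` and `credential` of its TURN entry, so the long-lived secret never leaves the server.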

Jitsi VideoBridge (https://github.com/jitsi/jitsi-videobridge) is Jitsi Meet's TURN server. It's called a "Selective Forwarding Unit", which means it does double-duty in deduplicating streams between users. But JVB is not documented very well. It basically just says "use XMPP to communicate with the server", so not only do you have to figure out how to set up JVB, but you also have to figure out how to talk XMPP to it. This complication is necessary for that security reason I mentioned before.

My initial goal with Calla, back in late February, was to see if I could get something running in a weekend. I also didn't know as much about the TURN server ecosystem at the time, just that I needed one. Back when I first learned how to write WebRTC software, there wasn't much available, so this year I was learning the server ecosystem from scratch.

So when I found Jitsi Meet, on the surface it seemed like a good idea. It's extremely easy to install: basically one apt install command, plus a small number of configuration changes. And using lib-jitsi-meet to interface with it, at first, looks like it's going to be easy. Unfortunately, it's extremely brittle with regard to making sure everything is done in the order it expects. It's also extremely large, probably about 80% of the code in Calla, and still 50% of the code in a large WebXR project I'm building at work. It has a significant impact on the performance of the overall system, which is a problem for me in the WebXR scenario.

I'm in the process of abstracting the Jitsi parts of Calla out to swappable components. In the current version of Calla, all communication goes through lib-jitsi-meet. I'm going to maintain that going forward, but provide different options for being able to integrate with different TURN servers. I have the position sharing going through my own server now, so I just need to get the audio streams and room management.
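As a rough illustration of what "swappable components" could look like, here is a minimal sketch. None of these names come from Calla's actual code: `ConferenceBackend`, `FakeBackend`, and `trackAvatars` are hypothetical. The idea is that the app depends only on a small interface, so a Jitsi-backed implementation could later be exchanged for another one.

```typescript
// Hypothetical interface covering room management and leave notifications.
interface ConferenceBackend {
    join(room: string, user: string): Promise<void>;
    leave(): Promise<void>;
    onUserLeft(handler: (userId: string) => void): void;
}

// In-memory stand-in used here instead of a real Jitsi or Janus backend.
class FakeBackend implements ConferenceBackend {
    room: string | null = null;
    user: string | null = null;
    private handlers: Array<(userId: string) => void> = [];

    async join(room: string, user: string): Promise<void> {
        this.room = room;
        this.user = user;
    }

    async leave(): Promise<void> {
        this.room = null;
        this.user = null;
    }

    onUserLeft(handler: (userId: string) => void): void {
        this.handlers.push(handler);
    }

    // Test hook: pretend the server reported that a user left.
    simulateUserLeft(userId: string): void {
        this.handlers.forEach((handler) => handler(userId));
    }
}

// App code only sees the interface, so the backend can be replaced freely.
function trackAvatars(backend: ConferenceBackend, avatars: Set<string>): void {
    backend.onUserLeft((userId) => {
        avatars.delete(userId);
    });
}

const avatars = new Set<string>(["alice", "bob"]);
const backend = new FakeBackend();
trackAvatars(backend, avatars);
backend.simulateUserLeft("bob");
console.log(avatars.has("bob")); // prints false
```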

Janus Gateway is another TURN server, which can also act as an SFU with an additional plugin. Mozilla Hubs uses Janus Gateway. Hopefully I can get an instance of it up and running. Unfortunately, I'm not finding the documentation any easier to understand.

archiving