meetecho/janus-gateway

ICE Failed

shamil614 opened this issue · 46 comments

I'm having some problems with the echo test. The server is deployed to a public IP on DigitalOcean, and the SSL cert is installed and configured. Yesterday at work everything seemed to work fine, but after I got home either I changed a config or there's a networking situation that I don't understand. The local video renders but the remote video fails.

Here's what Janus logs:

Creating new session: 1423452863
Creating new handle in session 1423452863: 418682067
[418682067] There's a message for JANUS EchoTest plugin
[418682067] There's a message for JANUS EchoTest plugin
[418682067] Creating ICE agent (controlled mode)
[418682067] ICE send thread started...
[418682067] Done! Ready to setup remote candidates and send connectivity checks...
ICE started and trickling, sending connectivity checks for candidates retrieved so far...
[WARN] [418682067]    Unsupported transport tcp!
No more remote candidates for handle 418682067!
[ERR] [ice.c:janus_ice_cb_component_state_changed:634:] ICE failed for handle 418682067...
No WebRTC media anymore
[418682067] ICE send thread leaving...

Below are some screenshots
[three screenshots, 2014-11-14, 6:33 am]

Which browser? I still get a lot of ICE failure in Firefox.

I've been using Chrome primarily. Though I tried on FF and got the same results.

Did you set any client side ICE servers? I am guessing you need to set a STUN and a TURN server so that the server can communicate through your NAT. The media could very well "make it" to the server but sending it back, it could be getting blocked by your network set up.

I did set a Google STUN server, and my public IP in the config file

[nat]
public_ip = 104.236.35.218
stun_server = stun4.l.google.com
stun_port = 19302

Is there more I should do?

Forgive me because I'm a newbie to WebRTC.

Yeah, sometimes STUN is not enough and you have to use a TURN server (depending on your NAT settings). There are public TURN servers available for testing from Viagenie. Try using one of those as well to see if you still do not get a feed.

Oh ok. And I just use the STUN/TURN from Viagenie in the janus.cfg file in place of the Google STUN? It looks like they require that I pass a user name and password too. Uhh... never mind, I guess this is what I need to be looking at.

server: server,
// No "iceServers" is provided, meaning janus.js will use a default STUN server
// Here are some examples of how an iceServers field may look like to support TURN
//      iceServers: [{url: "turn:yourturnserver.com:3478", username: "janususer", credential: "januspwd"}],
//      iceServers: [{url: "turn:yourturnserver.com:443?transport=tcp", username: "janususer", credential: "januspwd"}],
//      iceServers: [{url: "turns:yourturnserver.com:443?transport=tcp", username: "janususer", credential: "januspwd"}],

https://github.com/meetecho/janus-gateway/blob/a0be3d0cf171456f1823c0d5602e23e56ae398f2/html/echotest.js

The echotest.js in the demo pages shows you how to set iceServers in the JavaScript client API wrapper. Add iceServers: [{url: "turn:yourturnserver.com:3478", username: "janususer", credential: "januspwd"}] to your Janus constructor before your success definition. FYI: credential is your password...
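A minimal sketch of where that option goes, assuming the `url`/`username`/`credential` field names from the echotest.js comments quoted above. `buildJanusOptions` is a hypothetical helper (not part of janus.js), and the server URL, TURN host, and credentials are placeholders.

```javascript
// Hypothetical helper: builds the options object handed to `new Janus(...)`.
// Field names (server, iceServers, url, username, credential) follow the
// comments in echotest.js; the TURN host and credentials are placeholders.
function buildJanusOptions(server, turn) {
    var options = { server: server };
    if (turn) {
        // One entry per TURN endpoint; "credential" is the TURN password.
        options.iceServers = [{
            url: "turn:" + turn.host + ":" + turn.port,
            username: turn.username,
            credential: turn.password
        }];
    }
    // Without iceServers, janus.js falls back to its default STUN server.
    return options;
}

var options = buildJanusOptions("https://example.com:8089/janus", {
    host: "yourturnserver.com",
    port: 3478,
    username: "janususer",
    password: "januspwd"
});
console.log(JSON.stringify(options, null, 2));
```

In the demo page, the success and error callbacks would be added to the same object before it is passed to the Janus constructor.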

Well I gave it a try and I'm still getting the same results. When I disable my home firewall, I can connect, but even with the turn server, I still have a failing connection. Here's the log:

Creating new session: 980060701
Creating new handle in session 980060701: 3078398561
[3078398561] There's a message for JANUS EchoTest plugin
[3078398561] There's a message for JANUS EchoTest plugin
[3078398561] Creating ICE agent (controlled mode)
[3078398561] No stream, wait a bit in case this trickle got here before the SDP...
[3078398561] ICE send thread started...
[3078398561] Waiting for candidates-done callback...
[3078398561] Waiting for candidates-done callback...
[WARN] [3078398561]    Unsupported transport tcp!
[3078398561] Done! Ready to setup remote candidates and send connectivity checks...
ICE started and trickling, sending connectivity checks for candidates retrieved so far...
No more remote candidates for handle 3078398561!
[ERR] [ice.c:janus_ice_cb_nice_recv:680:] Still waiting for the DTLS stack for component 1 in stream 1...
[ERR] [ice.c:janus_ice_cb_component_state_changed:634:] ICE failed for handle 3078398561...
No WebRTC media anymore
[3078398561] ICE send thread leaving...

Here's what I added to the echotest.js
[screenshot: echotest.js changes]

Interesting... Do other WebRTC solutions work from your home network? Apprtc.appspot.com, perhaps, which has its own TURN servers and just sets up a peer connection. (You will need somebody on the other end rather than testing against yourself, because otherwise the host candidate will be chosen and not a relay one.) It obviously has to do with the firewall, and a TURN server should help with that.

You know, I'm not 100% certain if I've tried webRTC from my home since I switched out some networking gear. TimeWarner cable did some upgrades that made me swap out my modem/router. I've since come into work and echo is working...so yeah good times. I'm gonna head home and try again with Apprtc.appspot.com and see what happens.

Good luck and let me know what's up!

Ok, so I can make a successful WebRTC echo test when I disable my firewall. Otherwise the TURN servers don't seem to help. Here's the output I get now that my firewall is disabled.

Creating new session: 1624144628
Creating new handle in session 1624144628: 2016229237
[2016229237] There's a message for JANUS EchoTest plugin
[2016229237] There's a message for JANUS EchoTest plugin
[2016229237] Creating ICE agent (controlled mode)
[2016229237] ICE send thread started...
[WARN] [2016229237]    Unsupported transport tcp!
[2016229237] Waiting for candidates-done callback...
[2016229237] Waiting for candidates-done callback...
[2016229237] Waiting for candidates-done callback...
[2016229237] Done! Ready to setup remote candidates and send connectivity checks...
ICE started and trickling, sending connectivity checks for candidates retrieved so far...
No more remote candidates for handle 2016229237!
[ERR] [ice.c:janus_ice_cb_nice_recv:680:] Still waiting for the DTLS stack for component 1 in stream 1...
[WARN] [2016229237]     Missing valid SRTP session (packet arrived too early?), skipping...
[2016229237] The DTLS handshake has been completed
WebRTC media is now available
[2016229237] Started thread: setup of the SCTP association
[2016229237] Starting thread for SCTP association
[2016229237] Connected to the DataChannel peer
[2016229237] There's a message for JANUS EchoTest plugin

I was able to use Apprtc.appspot.com from my home network.

On two computers inside the same network, or with one computer outside your network? Using it on the same computer or between two computers on the same network is a useless test, as the ICE candidate used would be the host one and definitely not one gathered through STUN or TURN (and thus the traffic would never leave your network).

TURN servers help if they're reachable: if your firewall filters the ports the TURN server listens on, then they cannot help you either. That's why most deployments use 80 and 443, both UDP and TCP/TLS, as ports for their TURN servers, to maximize the chances of getting through.
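As a rough illustration of that point, here is a sketch that lists one TURN host on ports 80 and 443 over UDP, TCP, and TLS, using the same entry shape as the echotest.js comments. `turnUrls` and `toIceServers` are hypothetical helpers, the host and credentials are placeholders, and the TURN server itself would have to be configured to listen on those ports for any of this to help.

```javascript
// Hypothetical helper: TURN URLs for one host on ports 80 and 443.
// URL forms follow the echotest.js examples ("turn:", "turns:", "?transport=").
function turnUrls(host) {
    var urls = [];
    [80, 443].forEach(function (port) {
        urls.push("turn:" + host + ":" + port + "?transport=udp");
        urls.push("turn:" + host + ":" + port + "?transport=tcp");
    });
    // TURN over TLS, placed on 443 so it resembles ordinary HTTPS traffic.
    urls.push("turns:" + host + ":443?transport=tcp");
    return urls;
}

// Turns the URL list into iceServers entries sharing one set of credentials.
function toIceServers(host, username, credential) {
    return turnUrls(host).map(function (url) {
        return { url: url, username: username, credential: credential };
    });
}

console.log(toIceServers("yourturnserver.com", "janususer", "januspwd"));
```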

Right you are. Yeah I thought you were going to say that. It was the same computer on the same network so it's not a good test. I can get a WiFi hotspot from work and test using Apprtc.appspot.com on Monday.

On a related front, I was doing some tests of my company's webRTC implementation (OpenTok) from my home network. The other participant was on the work network. Long story short, the results were mixed when I had the firewall enabled. Sometimes it worked and sometimes not. Really just odd behavior.

Thank you for the inclusion of that point @lminiero. So I need to try an ICE server on 443? Looks like this company has Turn servers available as a service. http://xirsys.com/guide/
Good idea to give 'em a try?

All this may be a good thing, because my router is the standard issue from the cable company, which many people in our area are getting. The "enabled firewall" setting doesn't really explain what it's doing when it's "enabled".

It's definitely useful if you can try one. Otherwise, you may try deploying one yourself: there's a popular open source tool you can use for the purpose, just for testing maybe:

https://code.google.com/p/rfc5766-turn-server/

Configuring it is easy and there are several guides around.

As a side note, also have a look at the admin interface in Janus (admin.html in the demos). It will display sessions and related handles. Look for the handle associated to your echo test, and check what the admin has to say with respect to candidates, ICE, DTLS etc.

So this morning I was able to successfully connect with a coworker via Apprtc.appspot.com from my home network, yet echo test was still failing. My coworker was able to connect to echo test from our work network via http/https.

I am using some Turn servers in the client side code via Xirsys.com

@lminiero in regards to your suggestion about the admin portal. I'm seeing all sorts of interesting data when I have a failing test, though I'm not sure what all to make of it.

Here's the part on DTLS, which appears to be failing

"dtls": {
  "fingerprint": "<removed>",
  "remote-fingerprint": "<removed>",
  "dtls-role": "active",
  "dtls-state": "trying",
  "valid": 0,
  "ready": 0
 }

I'm not sure about what to make of the local / remote candidates

"local-candidates": [
  "a=candidate:1 1 udp 2013266431 104.236.35.218 37266 typ host\r\n"
],
"remote-candidates": [
  "candidate:1841357947 1 udp 2122194687 192.168.0.7 56256 typ host generation 0",
  "candidate:2400953187 1 udp 25042687 75.126.93.125 57765 typ relay raddr 72.177.19.210 rport 55028 generation 0",
  "candidate:2400953187 1 udp 41819903 75.126.93.125 50467 typ relay raddr 72.177.19.210 rport 50763 generation 0"
]

Another thought, how do I know if the Xirsys Turn servers are being used properly?

The two most relevant bits to look for are the ICE state (and possibly the candidates Janus is aware of) and the DTLS state. The former tells you whether a "channel" was correctly established, while the latter tells you whether a DTLS handshake succeeded on top of that. In your case it's ICE that is likely failing, due to too restrictive network settings that can't be mitigated by the setup you're making use of.

You only pasted the candidates, so I don't know what the ICE state is. I can see that the only remote (browser) candidates available to Janus are a private address (useless) and a couple of relay (TURN) addresses. The fact that relay addresses are available should imply that the browser managed to contact the TURN server and ask for relay functionality, and so they should help establish a connection. Not sure what's not working in your case. Try capturing some dumps using Wireshark to see what's happening there (e.g., connectivity checks in any direction).
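To make that reading of the candidate list concrete, here is a small sketch that parses the candidate strings pasted above and flags the private host address. The helper names are hypothetical, and the "reachable" flag is only a heuristic (a private host address cannot be reached by a server on the public Internet); it says nothing about whether a relay actually forwards traffic.

```javascript
// Parse a candidate string as shown by the Janus admin page.
// Accepts both "candidate:..." and "a=candidate:..." forms.
function parseCandidate(line) {
    var fields = line.trim().replace(/^a=/, "").split(/\s+/);
    var typIndex = fields.indexOf("typ");
    return {
        transport: fields[2].toLowerCase(),
        address: fields[4],
        port: parseInt(fields[5], 10),
        type: typIndex >= 0 ? fields[typIndex + 1] : "unknown"
    };
}

// RFC 1918 private IPv4 ranges: 10/8, 172.16/12, 192.168/16.
function isPrivateIPv4(address) {
    var p = address.split(".").map(Number);
    return p[0] === 10 ||
        (p[0] === 172 && p[1] >= 16 && p[1] <= 31) ||
        (p[0] === 192 && p[1] === 168);
}

// Heuristic: a host candidate on a private address is useless to a public server.
function summarizeCandidates(lines) {
    return lines.map(parseCandidate).map(function (c) {
        c.reachableFromPublicServer = !(c.type === "host" && isPrivateIPv4(c.address));
        return c;
    });
}

var remoteCandidates = [
    "candidate:1841357947 1 udp 2122194687 192.168.0.7 56256 typ host generation 0",
    "candidate:2400953187 1 udp 25042687 75.126.93.125 57765 typ relay raddr 72.177.19.210 rport 55028 generation 0",
    "candidate:2400953187 1 udp 41819903 75.126.93.125 50467 typ relay raddr 72.177.19.210 rport 50763 generation 0"
];
console.log(summarizeCandidates(remoteCandidates));
```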

IIRC you said that you specified a public IP in the Janus config: if Janus is deployed on a publicly reachable machine, that setting is not needed, so try removing it.

Thank you @lminiero that's very helpful. I'll take another look and gather more data.

Here's the full output from the admin screen

{
    "session_id": 766793855,
    "handle_id": 1095879909,
    "plugin": "janus.plugin.echotest",
    "flags": {
        "processing-offer": 0,
        "starting": 1,
        "ready": 0,
        "stopped": 0,
        "alert": 0,
        "bundle": 1,
        "rtcp-mux": 1,
        "trickle": 1,
        "all-trickles": 1,
        "trickle-synced": 1,
        "data-channels": 1,
        "plan-b": 0,
        "cleaning": 0
    },
    "sdps": {
        "local": "v=0\r\no=- 4887134209681225351 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\na=group:BUNDLE audio video data\r\na=msid-semantic: WMS janus\r\nm=audio 1 RTP/SAVPF 111 103 104 0 8 106 105 13 126\r\na=mid:audio\r\nc=IN IP4 104.236.35.218\r\na=sendrecv\r\na=rtcp-mux\na=ice-ufrag:cAzE\r\na=ice-pwd:ZmoYd4WtoFsXLvZSv6uWZv\r\na=ice-options:trickle\r\na=fingerprint:sha-256 05:FF:68:61:E2:CE:8E:68:1F:2B:E6:5D:73:E7:22:7F:F0:72:8F:B3:B7:69:36:64:DD:66:70:14:B8:6D:1D:98\r\na=setup:active\r\na=connection:new\r\na=rtpmap:111 opus/48000/2\r\na=fmtp:111 minptime=10\r\na=rtpmap:103 ISAC/16000\r\na=rtpmap:104 ISAC/32000\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:106 CN/32000\r\na=rtpmap:105 CN/16000\r\na=rtpmap:13 CN/8000\r\na=rtpmap:126 telephone-event/8000\r\na=maxptime:60\r\na=ssrc:1201668004 cname:janusaudio\r\na=ssrc:1201668004 msid:janus janusa0\r\na=ssrc:1201668004 mslabel:janus\r\na=ssrc:1201668004 label:janusa0\r\na=candidate:1 1 udp 2013266431 104.236.35.218 37266 typ host\r\nm=video 1 RTP/SAVPF 100 116 117 96\r\na=mid:video\r\nc=IN IP4 104.236.35.218\r\na=sendrecv\r\na=rtcp-mux\na=ice-ufrag:cAzE\r\na=ice-pwd:ZmoYd4WtoFsXLvZSv6uWZv\r\na=ice-options:trickle\r\na=fingerprint:sha-256 05:FF:68:61:E2:CE:8E:68:1F:2B:E6:5D:73:E7:22:7F:F0:72:8F:B3:B7:69:36:64:DD:66:70:14:B8:6D:1D:98\r\na=setup:active\r\na=connection:new\r\na=rtpmap:100 VP8/90000\r\na=rtcp-fb:100 ccm fir\r\na=rtcp-fb:100 nack\r\na=rtcp-fb:100 nack pli\r\na=rtcp-fb:100 goog-remb\r\na=rtpmap:116 red/90000\r\na=rtpmap:117 ulpfec/90000\r\na=rtpmap:96 rtx/90000\r\na=fmtp:96 apt=100\r\na=ssrc-group:FID 3420609499 3524326210\r\na=ssrc:134869651 cname:janusvideo\r\na=ssrc:134869651 msid:janus janusv0\r\na=ssrc:134869651 mslabel:janus\r\na=ssrc:134869651 label:janusv0\r\na=candidate:1 1 udp 2013266431 104.236.35.218 37266 typ host\r\nm=application 1 DTLS/SCTP 5000\r\na=mid:data\r\na=sctpmap:5000 webrtc-datachannel 16\r\nc=IN IP4 
104.236.35.218\r\na=sendrecv\r\na=ice-ufrag:cAzE\r\na=ice-pwd:ZmoYd4WtoFsXLvZSv6uWZv\r\na=ice-options:trickle\r\na=fingerprint:sha-256 05:FF:68:61:E2:CE:8E:68:1F:2B:E6:5D:73:E7:22:7F:F0:72:8F:B3:B7:69:36:64:DD:66:70:14:B8:6D:1D:98\r\na=setup:active\r\na=connection:new\r\na=candidate:1 1 udp 2013266431 104.236.35.218 37266 typ host\r\n",
        "remote": "v=0\r\no=- 4887134209681225351 2 IN IP4 127.0.0.1\r\ns=-\r\nt=0 0\r\na=group:BUNDLE audio video data\r\na=msid-semantic: WMS fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv\r\nm=audio 1 RTP/SAVPF 111 103 104 0 8 106 105 13 126\r\nc=IN IP4 0.0.0.0\r\na=rtcp:1 IN IP4 0.0.0.0\r\na=ice-ufrag:smFYJIW8/jbOYYkg\r\na=ice-pwd:l8VaO3Up3oViifbqLUlxSljf\r\na=ice-options:google-ice\r\na=fingerprint:sha-256 DE:0F:AC:72:D1:8E:8C:DA:32:72:05:B7:09:E5:A0:4E:A7:F5:2E:21:E1:36:4B:9F:9F:F6:56:87:D5:05:73:07\r\na=setup:actpass\r\na=mid:audio\r\na=extmap:1 urn:ietf:params:rtp-hdrext:ssrc-audio-level\r\na=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=sendrecv\r\na=rtcp-mux\r\na=rtpmap:111 opus/48000/2\r\na=fmtp:111 minptime=10\r\na=rtpmap:103 ISAC/16000\r\na=rtpmap:104 ISAC/32000\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:106 CN/32000\r\na=rtpmap:105 CN/16000\r\na=rtpmap:13 CN/8000\r\na=rtpmap:126 telephone-event/8000\r\na=maxptime:60\r\na=ssrc:466379917 cname:PVA9l4yP8DzSg/Gi\r\na=ssrc:466379917 msid:fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv 90139a52-d556-4024-8292-7b083b6931fb\r\na=ssrc:466379917 mslabel:fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv\r\na=ssrc:466379917 label:90139a52-d556-4024-8292-7b083b6931fb\r\nm=video 1 RTP/SAVPF 100 116 117 96\r\nc=IN IP4 0.0.0.0\r\na=rtcp:1 IN IP4 0.0.0.0\r\na=ice-ufrag:smFYJIW8/jbOYYkg\r\na=ice-pwd:l8VaO3Up3oViifbqLUlxSljf\r\na=ice-options:google-ice\r\na=fingerprint:sha-256 DE:0F:AC:72:D1:8E:8C:DA:32:72:05:B7:09:E5:A0:4E:A7:F5:2E:21:E1:36:4B:9F:9F:F6:56:87:D5:05:73:07\r\na=setup:actpass\r\na=mid:video\r\na=extmap:2 urn:ietf:params:rtp-hdrext:toffset\r\na=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time\r\na=sendrecv\r\na=rtcp-mux\r\na=rtpmap:100 VP8/90000\r\na=rtcp-fb:100 ccm fir\r\na=rtcp-fb:100 nack\r\na=rtcp-fb:100 nack pli\r\na=rtcp-fb:100 goog-remb\r\na=rtpmap:116 red/90000\r\na=rtpmap:117 ulpfec/90000\r\na=rtpmap:96 rtx/90000\r\na=fmtp:96 apt=100\r\na=ssrc-group:FID 3420609499 
3524326210\r\na=ssrc:3420609499 cname:PVA9l4yP8DzSg/Gi\r\na=ssrc:3420609499 msid:fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv 66313a29-c224-483a-a6d0-f68ebbff1afd\r\na=ssrc:3420609499 mslabel:fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv\r\na=ssrc:3420609499 label:66313a29-c224-483a-a6d0-f68ebbff1afd\r\na=ssrc:3524326210 cname:PVA9l4yP8DzSg/Gi\r\na=ssrc:3524326210 msid:fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv 66313a29-c224-483a-a6d0-f68ebbff1afd\r\na=ssrc:3524326210 mslabel:fYQTuxIeTRCk6tI2OkZYWRfRpp4s9MakN0Kv\r\na=ssrc:3524326210 label:66313a29-c224-483a-a6d0-f68ebbff1afd\r\nm=application 1 DTLS/SCTP 5000\r\nc=IN IP4 0.0.0.0\r\na=ice-ufrag:smFYJIW8/jbOYYkg\r\na=ice-pwd:l8VaO3Up3oViifbqLUlxSljf\r\na=ice-options:google-ice\r\na=fingerprint:sha-256 DE:0F:AC:72:D1:8E:8C:DA:32:72:05:B7:09:E5:A0:4E:A7:F5:2E:21:E1:36:4B:9F:9F:F6:56:87:D5:05:73:07\r\na=setup:actpass\r\na=mid:data\r\na=sctpmap:5000 webrtc-datachannel 1024\r\n"
    },
    "streams": [
        {
            "id": 1,
            "ready": -1,
            "ssrc": {
                "audio": 1201668004,
                "video": 134869651,
                "audio-peer": 466379917,
                "video-peer": 3420609499
            },
            "components": [
                {
                    "id": 1,
                    "state": "connected",
                    "local-candidates": [
                        "a=candidate:1 1 udp 2013266431 104.236.35.218 37266 typ host\r\n"
                    ],
                    "remote-candidates": [
                        "candidate:1841357947 1 udp 2122194687 192.168.0.7 56256 typ host generation 0",
                        "candidate:2400953187 1 udp 25042687 75.126.93.125 57765 typ relay raddr 72.177.19.210 rport 55028 generation 0",
                        "candidate:2400953187 1 udp 41819903 75.126.93.125 50467 typ relay raddr 72.177.19.210 rport 50763 generation 0"
                    ],
                    "dtls": {
                        "fingerprint": "05:FF:68:61:E2:CE:8E:68:1F:2B:E6:5D:73:E7:22:7F:F0:72:8F:B3:B7:69:36:64:DD:66:70:14:B8:6D:1D:98",
                        "remote-fingerprint": "DE:0F:AC:72:D1:8E:8C:DA:32:72:05:B7:09:E5:A0:4E:A7:F5:2E:21:E1:36:4B:9F:9F:F6:56:87:D5:05:73:07",
                        "dtls-role": "active",
                        "dtls-state": "trying",
                        "valid": 0,
                        "ready": 0
                    }
                }
            ]
        }
    ]
}

The ICE state is connected, so connectivity shouldn't be the issue. Have you verified that it's like that in chrome://webrtc-internals (Chrome) or about:webrtc (Firefox)?

What doesn't seem to be working is the DTLS handshake, which is still on "trying". You'll have to do the Wireshark capture I mentioned and check, using the DTLS filter, if any handshake message is actually there or not. Try capturing on both the server and client side.
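The same reading of the dump can be sketched as a small check: look at the ICE state first, and only then at the DTLS state. The field names (`state`, `dtls`, `dtls-state`) are the ones visible in the admin output above; which state values count as success (`connected`/`ready` for ICE, `connected` for DTLS) is an assumption here, and `diagnoseComponent` is a hypothetical helper.

```javascript
// Report the first layer that has not completed for one "components" entry.
function diagnoseComponent(component) {
    // Assumption: these ICE states mean a working candidate pair exists.
    var iceOkStates = ["connected", "ready"];
    if (iceOkStates.indexOf(component.state) < 0) {
        return "ICE: no working candidate pair (state: " + component.state + ")";
    }
    var dtls = component.dtls || {};
    // Assumption: "connected" is the dtls-state value after a finished handshake.
    if (dtls["dtls-state"] !== "connected") {
        return "DTLS: ICE is up but the handshake is not done (dtls-state: " +
            dtls["dtls-state"] + ")";
    }
    return "OK: ICE and DTLS both completed";
}

// Values copied from the admin dump above (fingerprints omitted).
var component = {
    id: 1,
    state: "connected",
    dtls: { "dtls-role": "active", "dtls-state": "trying", valid: 0, ready: 0 }
};
console.log(diagnoseComponent(component));
```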

So I know this much. I can complete an echo test on my home network with the firewall enabled when going over plain http. Whenever I switch to DTLS, it fails. Of course all that is supported by "dtls-state": "trying".

As far as I can tell Wireshark is showing DTLS handshake. Correct me if I'm wrong.
[screenshot: Wireshark capture, 2014-11-19]

Chrome Internals is reporting a failure on the ice servers when connecting via https
[screenshot: Chrome internals over https, 2014-11-19]

And showing success when connecting via http
[screenshot: Chrome internals over http, 2014-11-19]

Taking all this in....my guess is that my IceServers via Xirsys are the likely suspect?
Is that a valid conclusion?

Lastly, on passing in iceServers in the echotest.js file, should I only pass in one, even though Xirsys gives me two, and the example shows an array being passed?
[screenshot: iceServers in echotest.js, 2014-11-19]

You can pass two. When you say HTTP vs. HTTPS, do you mean accessing the demos using HTTP/HTTPS, or do you mean something different?

Yes, I'm testing using the echo test demo. In particular I'm testing using long polling, like the example code.

if(window.location.protocol === 'http:')
    server = "http://" + window.location.hostname + ":8088/janus";
else
    server = "https://" + window.location.hostname + ":8089/janus";

So when I run the test via http://domain.com/echotest.html the test works, and chrome-internals reports ICEConnectionStateConnected ICEConnectionStateCompleted.
When I run the test via https://janus.conf.meetecho.com/echotest.html the test fails, we get dtls-state: trying, and chrome-internals reports ICEConnectionStateFailed.

Not sure it's related, that's just the signalling transport. The only difference with respect to WebRTC should be that with HTTPS you don't have to always accept permissions as they're remembered.

For what concerns DTLS, from the Wireshark snippet you pasted the DTLS handshake starts but doesn't complete. 104.236.15.218 (Janus?) sends a Hello, the private address 192.168.0.7 (your browser?) then sends its Hello and more info, first trying some retransmits as there's no answer, then sending the grouped messages separately in case the message was lost due to fragmentation. If those are all the messages you see, 104.236.15.218 never sends a message back. Have you done a trace on both the client and server sides? Just to verify whether such messages are actually sent/received on both sides.

This is what I got when I ran tshark on the server (104.XXX)
Looks to me like the server is responding but the messages aren't getting through to the client. Is that correct?

906  20.611776 104.236.35.218 -> 72.177.19.210 DTLSv1.0 239 Client Hello
907  20.682088 72.177.19.210 -> 104.236.35.218 DTLSv1.0 877 Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done
908  20.688160 104.236.35.218 -> 72.177.19.210 DTLSv1.0 1863 Certificate, Client Key Exchange, Certificate Verify (Fragment), Certificate Verify (Reassembled), Change Cipher Spec, Encrypted Handshake Message
917  21.690719 72.177.19.210 -> 104.236.35.218 DTLSv1.0 877 Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done
918  21.691218 104.236.35.218 -> 72.177.19.210 DTLSv1.0 1838 Certificate, Client Key Exchange, Certificate Verify, Change Cipher Spec, Encrypted Handshake Message
941  23.682346 72.177.19.210 -> 104.236.35.218 DTLSv1.0 877 Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done
942  23.682590 104.236.35.218 -> 72.177.19.210 DTLSv1.0 1838 Certificate, Client Key Exchange, Certificate Verify, Change Cipher Spec, Encrypted Handshake Message

Yep, if you can't see them on the client side that looks like it.

Any advice on where to go next?

Not sure. As far as Janus is concerned, it looks like it's doing the right thing, retransmitting the DTLS message as it should. You should check if you have any firewalls in place that may interfere with this behaviour. What's weird is that bidirectional communication should be available: in theory, since ICE succeeded and DTLS was started, connectivity checks worked in both directions. Have you looked at some dumps that include both the ICE exchange (STUN messages) followed by the DTLS setup on both sides to see if this is actually the case?

Another thing that comes to mind is that this may be related to fragmentation. I see the DTLS packets sent by Janus are quite large (~1800 bytes), which may get them discarded since they're larger than the MTU (~1500 bytes). Are you using the default certificate and keys that come with the code, or did you generate some on your own? If the latter, how did you create them? In any case, try with the default certificate to see if things work for you instead.
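A quick way to check the size argument against a capture is to parse the tshark summary lines and flag frames that exceed the MTU. This is only a sketch: it assumes the column layout of the output pasted above, an Ethernet capture (so 14 bytes of link-layer header are subtracted from the frame length), and a 1500-byte MTU. The function names are hypothetical.

```javascript
var ETHERNET_HEADER_BYTES = 14; // assumption: capture taken on an Ethernet interface
var PATH_MTU_BYTES = 1500;      // assumption: typical Ethernet MTU

// Parse one tshark summary line: frame, time, src -> dst, protocol, length, info.
function parseTsharkLine(line) {
    var m = /^\s*(\d+)\s+([\d.]+)\s+(\S+)\s+->\s+(\S+)\s+(\S+)\s+(\d+)\s+(.*)$/.exec(line);
    if (!m) return null;
    return {
        frame: Number(m[1]), time: Number(m[2]),
        src: m[3], dst: m[4], protocol: m[5],
        length: Number(m[6]), info: m[7]
    };
}

// Frames whose IP packet would not fit in a single MTU-sized datagram.
function oversizedFrames(lines) {
    return lines.map(parseTsharkLine).filter(function (p) {
        return p !== null && p.length - ETHERNET_HEADER_BYTES > PATH_MTU_BYTES;
    });
}

var capture = [
    "906  20.611776 104.236.35.218 -> 72.177.19.210 DTLSv1.0 239 Client Hello",
    "907  20.682088 72.177.19.210 -> 104.236.35.218 DTLSv1.0 877 Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done",
    "908  20.688160 104.236.35.218 -> 72.177.19.210 DTLSv1.0 1863 Certificate, Client Key Exchange, Certificate Verify (Fragment), Certificate Verify (Reassembled), Change Cipher Spec, Encrypted Handshake Message"
];
oversizedFrames(capture).forEach(function (p) {
    console.log("frame " + p.frame + " from " + p.src + ": " + p.length + " bytes");
});
```

Oversized frames have to be fragmented at the IP layer, and some firewalls and NATs drop IP fragments, which would match a handshake message that leaves the server but never shows up on the client.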

@lminiero that's a good idea about the certificates. I'm using our domain-specific certs. How they were generated, I don't know, as they were created before I started this job. I'll try the default certs.

OMG! Looks like the default certs are working. I don't freaking believe it. I never would have thought the certs were incorrect because we've been using them on our various servers without incident. Looking at the file size there's quite a difference. It does appear our certificate chain includes the root certificate, so maybe removing it would reduce the file size enough?

The certificates are not incorrect, just too large. You never had issues as you never used them on UDP, I guess, but just on TCP, e.g. for HTTPS. AFAIK the DTLS implementation in openssl should take care of fragmenting too large packets when retransmitting too many times (by default more messages are grouped as you can see from your dump) but not sure what triggers it. If you can manage to reduce your certificates somehow it would definitely help.
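To get a feel for how much a certificate file contributes, the sketch below counts the certificates in a PEM bundle and decodes each one to its binary (DER) size, which is roughly what the DTLS Certificate message has to carry. It runs under Node.js (it uses `Buffer`), `pemChainSizes` and `fakePem` are hypothetical helpers, and the demo bundle is made of zero bytes rather than real certificates.

```javascript
// Return the decoded (DER) size in bytes of every certificate in a PEM bundle.
function pemChainSizes(pemText) {
    var re = /-----BEGIN CERTIFICATE-----([\s\S]*?)-----END CERTIFICATE-----/g;
    var sizes = [];
    var match;
    while ((match = re.exec(pemText)) !== null) {
        var base64 = match[1].replace(/\s+/g, "");
        sizes.push(Buffer.from(base64, "base64").length);
    }
    return sizes;
}

// Demo input only: a PEM-shaped block of zero bytes, not a real certificate.
function fakePem(byteCount) {
    var body = Buffer.alloc(byteCount).toString("base64").replace(/(.{64})/g, "$1\n");
    return "-----BEGIN CERTIFICATE-----\n" + body + "\n-----END CERTIFICATE-----\n";
}

var bundle = fakePem(1200) + fakePem(900);
var sizes = pemChainSizes(bundle);
var total = sizes.reduce(function (a, b) { return a + b; }, 0);
console.log(sizes.length + " certificates, " + total + " bytes in total");
```

With a real file, the text would come from something like `require("fs").readFileSync(path, "utf8")`. A bundle with several certificates, or a single large one, can push the total past what fits in one ~1500-byte datagram.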

Looks like this bug ticket incorporates this issue.
http://rt.openssl.org/Ticket/Display.html?id=2089&user=guest&pass=guest

This seems to refer to the usage of DTLS to transfer data, though, and not to the handshake. This should be the issue to look at:
http://rt.openssl.org/Ticket/Display.html?id=2755&user=guest&pass=guest
Not sure whether the OpenSSL you're using includes it or not.

Looks like it was patched a while ago. I was on 1.0.1f (the default Ubuntu package). I just changed the build to download and compile the latest patch (OpenSSL 1.0.1j 15 Oct 2014).

The latest build of OpenSSL did not help.
I don't see how I can get my key size down since the min size is 2048 bit for every cert. provider.
I'm coming to the conclusion that we'll just have to use the echo test as is. Assume some people that have restrictive firewalls will get failing tests when they could actually perform a webRTC session.

Generate a shorter certificate just for DTLS, as no verification is involved as of now. You can still keep your certificates for HTTPS and other services in your application. Closing the issue as it doesn't seem Janus related; feel free to reopen in case you find more information.

I tried to use a CSR under 2048 but the authority wouldn't allow it.

Thank you for the help. I do very much appreciate it. This is a great project.

What was the solution here?

I ran into this over the weekend and I absolutely could not resolve it. I tried setting STUN & TURN Servers, but consistently ran into ICE Failed messages.

Looks like a solution was never found for this on this forum. We are facing the exact same problem, but only with an Alpine Docker image.
With an Ubuntu Docker image, we have no such issues. If anyone needs it, I can provide the Dockerfile and dependent stuff.

We are doing it on AWS, and we made sure the AWS firewall rules are fine.

The above issue has nothing to do with the firewall; instead it looks like either a build environment issue or something else.

Hello nswarnkar,
Please provide dockerfile and dependent stuff for me too.
Thanks!

Hi team, I am Thiru.
Please let me know why it failed and where we can fix it.

Dec 07 07:00:27 ip-172-31-18-114 sudo[55457]: [WSS-0x7f377c0097a0] Destroying WebSocket client
Dec 07 07:00:27 ip-172-31-18-114 sudo[55457]: Destroying session 2903022895602233; 0x7f3788006880
Dec 07 07:00:27 ip-172-31-18-114 sudo[55457]: Detaching handle from JANUS VideoRoom plugin; 0x7f3788006950 0x7f3788007f90 0x7f3788006950 0x7f3788006>
Dec 07 07:00:27 ip-172-31-18-114 sudo[55457]: [8692756865077481] Handle and related resources freed; 0x7f3788006950 0x7f3788006880
Dec 07 07:05:42 ip-172-31-18-114 sudo[55457]: [WSS-0x7f377c00ae90] Destroying WebSocket client
Dec 07 07:05:42 ip-172-31-18-114 sudo[55457]: Destroying session 1907904165100295; 0x7f3788006ac0
Dec 07 07:05:42 ip-172-31-18-114 sudo[55457]: Detaching handle from JANUS VideoRoom plugin; 0x7f37880022d0 0x7f37880068e0 0x7f37880022d0 0x7f3788003>
Dec 07 07:05:42 ip-172-31-18-114 sudo[55457]: [janus.plugin.videoroom-0x7f37880068e0] No WebRTC media anymore; 0x7f37880022d0 0x7f37880034a0
Dec 07 07:05:42 ip-172-31-18-114 sudo[55457]: [65804917152261] Handle and related resources freed; 0x7f37880022d0 0x7f3788006ac0
Dec 07 07:26:09 ip-172-31-18-114 sudo[55457]: [ERR] [transports/janus_http.c:janus_http_handler:1436] Invalid url /ws/v1/cluster/apps/new-application

I am new to this Janus setup. I followed the installation guide Meetecho provides and it works fine. When we try to connect Android to Janus using ws://localhost:8188, video opens on one side but not on the other. There is no error either. Would you please help me?