w3c/webrtc-pc

Should media capabilities influence what is exposed in what is exposed in WebRTC offers and answers

youennf opened this issue · 13 comments

WebRTC implementations currently tend to expose the whole list of codecs that are available. On contrary media capabilities is telling whether a particular codec is supported.

For some codecs, like VP8, VP9 and H264, it might be fine in terms of privacy since most UAs support them.

For newer codecs though, it is an actual issue.
Maybe what WebRTC exposes in terms of codecs should be based on what MC already exposed to the web page, in addition to the base offering that has no privacy issues.

This could apply to offers as well as receiver/sender getCapabilities.

As an example, HEVC codec would not be exposed by default in getCapabilities/SDP.
It would get exposed by default if the web page checks whether HEVC is supported for WebRTC via media capabilities.

@henbos At one point, you noticed that getCapabilities() could return a different answer when it was called before createOffer() as compared with after. This was because createOffer() is async and would have enumerated all the hardware resources before returning (which would then be reflected in what was returned in getCapabilities()).

I am wondering if the same effect is seen with MediaCapabilities.

I believe #2925 would help in resolving this issue - hiding specialist codecs until they are asked for would reduce the scope of privacy exposure.

So in the terminology of #2925, we should say that "if a codec is queried for WebRTC usage, it should be enabled for (sending/receiving)?
It does change the query function from being stateless to affecting the state of an otherwise unrelated interface, which seems a bit odd. We might want to make that function explicit (extra flag in query?)

To be clear, my 2-week-old comment was intended to say "no" unless we add a flag to the request desiring this behavior.

AIUI, you are suggesting to have an explicit opt-in somewhere, which sounds reasonable.
This opt-in could be in MC APIs (the flag) or in WebRTC APIs (new API, or related to codec preference API for instance).

If we merge w3c/media-capabilities#186 (return RTCRtpCodecCapability when querying for "webrtc"), a minimal API change would be to allow adding this codec description to setParameters() and setCodecPreferences(), thereby enabling the codec.

Discussion on WG call Feb 20, 2024: Support for idea, should be worked into a proposal.

This issue had an associated resolution in WebRTC February 2024 meeting – 20 February 2024 (Should media capabilities influence what is exposed in what is exposed in WebRTC offers and answers #2929):

RESOLUTION: Proceed with proposal based on data-minimization appraoch

Assuming we finalise w3c/media-capabilities#186, we would use RTCRtpCodecCapability as the output of Media Capabilities.

On WebRTC side, API wise, there could be different approaches:

  1. Context wide approach. Add RTCRtpSender.addCodec and RTCRtpReceiver.addCodec, applies to all peer connections of the given context
  2. RTCPeerConnection wide approach. Add senderCodecs and receiverCodecs fields to RTCConfiguration
  3. Transceiver approach. Use RTCRtpSender.setCodecPreferences and RTCRtpReceiver.setCodecPreferences

3 has no API overhead which is nice.
I might have a slight preference for 1 though: from a web developer point of view, this is the least invasive approach and would make it easy to handle UAs with different default codec sets in one single place.

@alvestrand, thoughts?

Since we're now declaring that codec definitions (and their enabled status) are per-sender/receiver properties, I think #1 is the best approach.
I do think that RTCRtpSender.SetParameters(encodings.codec = new codec) may be an obvious enough surface that we don't need to have RTCRtpSender.AddCodec() - if you're not going to send it, why are you bothering to enable it?

Similarly, RTCRtpReceiver.SetCodecPreferences(new codecs) should be enough API surface for the receiver - if you're not going to announce that you can receive it, why bother?

In retrospect, it was probably a mistake to put SetCodecPreferences() on the transceiver.

(2 is a bad idea because it goes at cross purposes with the idea of deriving codecs from installed transform. 3 is a bad idea because setCodecPreferences() isn't defined to touch the sender.)

Oh option 1 is slightly unclear.
I was meaning static methods on RTCRtpSender/RTCRtpReceiver so that it applies to all peer connections of a context.
Basically, the spec is defining a list of implemented receive/send codecs. We would make this list per context and modify it via these APIs.

It seems you are talking of option 3 (per transceiver/sender/receiver instance).

Aha, I did not read that carefully enough. Yes, I think we need to preserve the independence of PeerConnections - if there are two libraries on a page both using PeerConnection, leaking the fact that one of them is enabling a specific codec to the other one would come as an unwelcome surprise for both of them.