w3c/mediasession

Introduce way to expose supported MediaSession actions?

Closed this issue · 18 comments

The current specification allow a website to set events in order to control MediaSession actions. Having an event listener for an action is used by the UA to decide whether to show a button in its UI (per spec). In addition of introducing side effects when listening to an event, this is creating strange behaviour where a page will modify the MediaSession UI when adding the event listeners (on Android, the notification will flicker). This would also happen when a page will change its event handler (ie. session.onplay = foo; session.onplay = foo; will add/remove twice).

Should we have a more straightforward way to expose that a page can handle/support some actions without creating an event listener? or are we happy with the current situation and we expect implementers to find workarounds for the side effects?

/CC @annevk @slightlyoff for general feedback
/CC @jernoble for implementer feedback

The DOM specification recommends against this pattern.

@annevk This proposal is copacetic with the DOM recommendation. No page-visible behavior is observable due to setting this eventListener. The spec is (correctly) silent about how the UA exposes these actions to the OS.

@mounirlamouri It sounds like your display problems would be solved by "enqueuing a task" in spec parlance.

@jernoble if the only way this can be implemented is via adding event listeners having side effects, I wouldn't call it copacetic.

That's incorrect. In the very basic use case, hitting a hardware pause button would fire playpause event which, unless the page calls preventDefault(), will toggle the play state of the video element.

So why is OP talking about UI modifications based on the presence of event listeners?

Both Android and iOS have software controls for changing playback state. The UA (or the platform on which the UA runs) has to decide what controls to display, which is why @mounirlamouri brought it up.

For those it would make more sense if the app could indicate what UI actions have an effect somehow. Listeners don't really tell you that one way or another.

If we provided a way for the page to indicate it supports "next track" commands (e.g.) which the page author set but then didn't add a "next track" event listener, I'm not going to display a "next track" UI because I'm pretty sure it won't do anything.

@xxyzzzq @beaufortfrancois and I have been talking about these. None of us is super happy with the current situations. One of the reason is the side effect of adding an event handler. Another is that some events should definitely have a default action but some probably shouldn't. For example, Chrome will likely force to pause all players after pause is fired if they are not paused but some actions like nexttrack or previoustrack would be hard and pointless to implement by the UA. Similarly, because in the current model, listening to the play event has a side effect, it wouldn't be clear to a website if they can do onplay = doUIChanges(); without handling playback themselves.

We came up with the following proposal:

enum MediaSessionAction {
  "play",
  "pause",
  "previoustrack",
  "nexttrack",
  "seekforward",
  "seekbackward",
};

[Constructor(sequence<MediaSessionAction> actions)]
interface MediaSessionActions : EventTarget {
  attribute EventHandler onplay;
  attribute EventHandler onpause;
  // onplaypause was removed in favour of using the MediaSessionPlayback.state
  attribute EventHandler onprevioustrack;
  attribute EventHandler onnexttrack;
  attribute EventHandler onseekbackward;
  attribute EventHandler onseekforward;
};

partial interface MediaSession {
  attribute MediaSessionActions? actions;
};

That way, a website can specify which actions it supports by doing navigator.mediaSession.actions = new MediaSessionActions(['play', 'pause', 'seekforward', 'seekbackward']).

Then, the MediaSessionActions object can receive the appropriate events following web platform conventions. It will also allow us to be more aggressive with regards to default behaviour and we can simply assume the browser will entirely delegate the behaviour to the page. For example, handling the play event will automatically prevent the default event from the UA.

In summary, the main changes from this proposal are:

  • preventDefault() is no longer needed: there is no more default behaviour if an action is marked as handled by the website (exception for pausing);
  • playpause is removed;
  • handling actions is done by creating a MediaSessionActions object which will then receive the events.

There are likely better ways to do this so comments welcome :)

I see the proposal of @mounirlamouri moved the control into a new object, which is the opposite of #45 . I think there's a reason why we flattened the media controls into MediaSession. We need to keep that in mind.

What would happen if MediaSessionActions contains play but website doesn't handle onplaypause event?

I'm not sure why we need to create MediaSessionActions object first. I believe code below should be enough to tell if an action is marked as handled by the website or not (no preventDefault in there)

navigator.mediaSession.actions.addEventListener('play', function(event) {
  // Do stuff...
});

playpause would be removed because we could use the playbackState to figure out the current state and call the right callback.

Otherwise, what you propose is what we have today with various events, right? The idea with creating the object is that it's easier to set a contract that the page will then handle the actions it says it will. It also solves the issue of the side effect of event listening. Another approach is to have a method or attribute that will be used to mark the handled actions.

In general, I'm not strongly against the side effects of listening to the events but I would prefer if there was no default behaviour apart from pause because it would otherwise be very confusing for developers because the default behaviour couldn't be specified.

@annevk @jernoble WDYT?

After discussing with @xxyzzzq and @beaufortfrancois, we came up with an alternative. The idea would be to not use addEventListener but something else so we don't break the web platform semantics. We could for example add a method call setActionHandler that will take a MediaActionType in parameter and a callback.

It would look like this:

navigator.mediaSession.setActionHandler('nexttrack', () => {
  Playlist.next();
});
navigator.mediaSession.setActionHandler('pause', () => {
  Player.pause();
});

Obviously, it is very similar to the event listener pattern but we can then specify our own rules which is that handling an event means that the user agent will step doing any default behaviour and will use this as a signal that the website supports the action.

This is only an idea to get the discussion moving so please feel free to leave feedback :)

I'm not a huge fan of the method name setActionHandler but I don't have anything else to propose so LGTM.

I like the idea of making these "action handlers" rather than "event listeners". I might suggest calling this setAction(name, handler) rather than setActionHandler, or better yet just using WebIDLs named property getters and setters to define the MediaSessionActions object:

callback MediaSessionActionHandler = void ();

interface MediaSessionActions {
    getter MediaSessionActionHandler getActionHandler(DOMString action);
    setter creator void setActionHandler(DOMString action, MediaSessionActionHandler handler);
    deleter void removeActionHandler(DOMString action);
}

partial interface MediaSession {
    MediaSessionActions actions;
}

Which would allow clients to write code like:

var actions = navigator.mediaSession.actions;
actions.play = () => { doPlay(); }
actions.play();
delete actions.play;

// as well as

actions.setActionHandler('pause', () => { doPause(); });
actions.getActionHandler('pause'))();
actions.removeActionHandler('pause');

// or even

for (action of actions) {
    delete actions[action]; // Not sure if this is legal syntax, but you get the idea.
    actions.removeActionHandler(action); // The more verbose syntax
}

That sounds reasonable to me. My understanding of the named property is that we would need to define a getter, setter and deleter in addition of the get/set/remove methods. Furthermore, we would need to reject if the action isn't supported. Those look like details to me though.

@annevk @slightlyoff any objection?

What you proposed seems reasonable, though I thought some Chrome engineers had reservations about new callback-based APIs.

You don't want to use named getters/setters/deleters. They cause conflicts between actual members of the object and things you want to look up. Just use short method names. (Once https://www.w3.org/Bugs/Public/show_bug.cgi?id=26986 is fixed for IDL this should be more clear.)

To summarize, I've added setActionHandler() based on all the above comments. It seems that there was a consensus for this to be better than the event model we had. I personally liked @jernoble proposal but based on @annevk's comments, I didn't go with it. I decided to not add removeActionHandler() and getActionHandler() to reduce the fingerprint of the API at the moment. We can add them later without any breakage. Finally, I'm happy to discuss the name in another issue but I felt that setAction() could be too confusing.