 Hi, welcome to Ask Chrome, the Q&A show where you get to ask questions of the Chrome team and we get to answer them. My name's Tom Greenaway and I'm the games lead for the web developer relations team here at Google. And I'm joined by Francois Beaufort. Hey! Francois is our expert on all things media in the team. And today's topic, you may have guessed it, is media, by which I mean video and audio APIs on the World Wide Web. The first topic that we're going to dive into is autoplay. Now the questions we have here are: how does Chrome set the media engagement score for autoplay permission by origin, and can it be set at a subdirectory level? This comes from Jack Palmer on Twitter, with the Twitter handle Usain Blute. So I think the best way to start this question, to break it down, is by explaining what autoplay is. So autoplay is this idea that you arrive on a website and maybe it has an audio or a video element, and that element just starts playing audio. So it's pretty straightforward; you may have been able to guess it from the name autoplay. I think that the natural extension of that is to understand what autoplay blocking is. Do you want to explain that, Francois? Sure. So back in 2018 we changed how Chrome handles autoplay. The goal was to improve one specific media experience issue that we have all experienced in the past. We browse from website to website looking for new stuff, learning new things, and at some point, out of nowhere, our speakers start to yell at us because one random website decided to play some audio without telling us. Yeah, I really hate that. I do as well. So that's why we've made those changes. The new rules are pretty simple. Muted autoplay is always allowed, meaning that a website can still autoplay video without any sound thanks to the muted and autoplay attributes. On the other hand, autoplay with sound is only allowed under some conditions.
The first one is if the user has started to engage with the website, let's say with a button click for instance. In that case autoplay with sound will be allowed, as it's a good sign that the user has started to engage. So that interaction shows sort of a level of trust from the user towards the website. That's interesting, because I think there's actually also an exception for PWAs, right? So PWA means progressive web application, which means a website that's embracing more application-level kinds of technologies. So a user can actually take a PWA and install it or add it to the home screen of their device, and then PWAs get a special exception from this autoplay blocking, right? Exactly, yeah. PWAs do not get autoplay blocking, basically. Yeah, yeah, because that's probably a very strong sign that the user trusts that website and, you know, really likes it. Okay, well, back to the questions then. Part of it also was: what is the media engagement score? So again, and we actually worked on a blog post last year about this, if I remember right, the media engagement score is this sort of score that Chrome tracks for websites, and we consult that score to see whether or not we believe that the user has demonstrated trust towards that particular website. Exactly, yeah. Do you want to explain how that's actually calculated? Sure. First of all, the Chrome media engagement score is not a hidden thing. You can actually go right now to the internal page chrome://media-engagement to get a look at what's happening under the hood. Now, Chrome's current approach is a ratio of visits, or sessions, to the number of significant media playbacks. And what you need to know about a significant media playback is that the media consumption must be at least seven seconds, that audio must be present and unmuted, that the tab must be active, and finally that the size of the video must be at least 200 by 140 pixels.
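As a rough illustration of the four criteria just listed, they can be written as a simple predicate. This is only a sketch of the conditions as described in this episode, not Chrome's actual implementation; the `PlaybackSample` type, its field names, and `isSignificantPlayback` are hypothetical, and the exact boundary handling (for example "at least" versus "more than" seven seconds) is an assumption.

```typescript
// Hypothetical description of one media playback, for illustration only.
interface PlaybackSample {
  playedSeconds: number;  // how long the media was actually consumed
  hasAudioTrack: boolean; // audio must be present...
  muted: boolean;         // ...and unmuted
  tabActive: boolean;     // the tab must be active
  videoWidth: number;     // rendered size in pixels
  videoHeight: number;
}

// Thresholds as stated in the episode.
const MIN_SECONDS = 7;
const MIN_WIDTH = 200;
const MIN_HEIGHT = 140;

// Returns true when a playback meets every criterion mentioned above.
// Assumption: boundaries are inclusive; the browser may treat them differently.
function isSignificantPlayback(sample: PlaybackSample): boolean {
  return (
    sample.playedSeconds >= MIN_SECONDS &&
    sample.hasAudioTrack &&
    !sample.muted &&
    sample.tabActive &&
    sample.videoWidth >= MIN_WIDTH &&
    sample.videoHeight >= MIN_HEIGHT
  );
}
```

A ten-second, unmuted 640 by 360 video in the active tab would count under this sketch, while the same video muted, or shrunk to 100 pixels wide, would not.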
And from that Chrome calculates what we call a media engagement score, which is highest on websites where media is played on a regular basis. I see, yeah. And when this score, or when the threshold, is crossed, autoplay with sound will always be allowed. Okay, right. So it's so that Chrome now believes that there is a strong sign of trust between the user and that website. And yeah, that's really interesting about the minimum video size; that makes sense. I think another thing that's important to mention, though, is that when you clear your history in Chrome, that also clears the score, right? So that's important from a privacy perspective. Well, also in the question was another topic, about the subdirectory level and how that operates for the blocking and the autoplay. So basically the short answer is no, it cannot be set at the subdirectory level. On the web, the identity card of the website is the origin. And that's what we use for all privileges associated with a website, such as permissions or pop-up blocking. And we have no plans to change that. For what it's worth, this is a known issue that some websites may face today when they share their origin between, let's say, a video streaming service and a news website. Sadly, there is no programmatic way to solve this for now. We have thought about having some virtual origin on the web, but that idea hasn't made a lot of progress so far. If that happens, though, we could use that for media engagement. Okay, cool. I do have one trick to share about autoplay, if you're interested in that, of course. Today, a user can actually change the autoplay policy. It's simple: they simply have to go to a website, click on the lock icon, and mark the website as being allowed to play sound. In that case, autoplay with sound will be allowed. So I think we can move on to our next topic then, which is picture-in-picture. These questions come from Rob Patton.
The questions are: can we use picture-in-picture for audio playback, and can we customize the playback controls in the picture-in-picture window? So Francois, let's start with the basics. What is picture-in-picture, or PIP? Yeah, so this browser feature, picture-in-picture, allows users to watch a video in a floating window that is always on top of other windows, so that they can keep an eye on what they're watching while interacting with other sites or applications. And it was meant for video indeed. Now, the web is full of awesome primitives that, when used together, can create some magic. And that's exactly what we need to have a PIP window for audio playback only. I see. So this sounds like something that would be really great for podcast or music players, right? So that means you can be on a podcast or music player website and you can pop out controls. So even though it's just audio, you can control the playback, even when you're multitasking. Exactly. So let me show you how this would work. First, you would fetch and decode an image like you would normally do. Then create a canvas element and fill its 2D context with all the image pixels that have been decoded. Now you create a video element and assign its source object (srcObject) to the result of canvas.captureStream(). And finally, by simply setting the muted attribute to true, the video can be played automatically, so that when video.requestPictureInPicture() is called, a PIP window with a still image gets presented to the user. Now, this is not the full story, because right now the snippet of code doesn't allow users to navigate through all available tracks. And for that, we need the Media Session API. So with Media Session, by simply setting some media session action handlers for previous track and next track, the user will get two new buttons in the PIP window and can click them to listen to the previous or the next track. And that's basically it.
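The steps just described can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact snippet shown in the episode: the `CanvasLike`, `VideoLike` and `MediaSessionLike` interfaces and the `showStillImageInPip` and `wireTrackButtons` helpers are hypothetical names, written structurally so the sketch does not depend on DOM typings. In a browser, an `HTMLCanvasElement` (with the artwork already drawn on its 2D context), an `HTMLVideoElement` and `navigator.mediaSession` would be passed in.

```typescript
// Minimal structural shapes of the browser objects used below (assumption:
// only the members this sketch touches are listed).
interface CanvasLike {
  captureStream(): unknown;
}
interface VideoLike {
  srcObject: unknown;
  muted: boolean;
  play(): Promise<void>;
  requestPictureInPicture(): Promise<unknown>;
}
interface MediaSessionLike {
  setActionHandler(
    action: "previoustrack" | "nexttrack",
    handler: (() => void) | null
  ): void;
}

// Feeds the canvas (holding the still image) into a muted video and asks
// for a picture-in-picture window. Browsers generally expect this to run
// in response to a user gesture such as a button click.
async function showStillImageInPip(canvas: CanvasLike, video: VideoLike): Promise<void> {
  video.srcObject = canvas.captureStream();
  video.muted = true; // muted playback is allowed to start automatically
  await video.play();
  await video.requestPictureInPicture();
}

// Registers previous/next track handlers so the PIP window shows the two
// extra buttons. Assumes trackCount is at least 1; wraps around at the ends.
function wireTrackButtons(
  session: MediaSessionLike,
  trackCount: number,
  onTrackChange: (index: number) => void
): void {
  let current = 0;
  session.setActionHandler("previoustrack", () => {
    current = (current - 1 + trackCount) % trackCount;
    onTrackChange(current);
  });
  session.setActionHandler("nexttrack", () => {
    current = (current + 1) % trackCount;
    onTrackChange(current);
  });
}
```

In a page, `onTrackChange` would typically point the real audio element at the selected track and redraw the canvas with that track's artwork.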
Oh, by the way, this is what users actually get on Spotify.com with the media player right now. Oh, I see. OK, cool. And what about custom controls? So that would be great indeed. We can think of a custom Hangup button, or a Fullscreen, or even a Like button. That is sadly not supported today. Hopefully, one day we will solve this issue with what we call picture-in-picture for arbitrary content. That would allow web developers to put any HTML element in a picture-in-picture window, not just a video element. Hopefully, one day we'll get there. It is not supported today. And if developers really want it, is there a way for them to show that? I would recommend that they go to the GitHub repository for that. Please do that. OK, cool. And file an issue or something, or star it. Yeah, that's pretty cool. I mean, that's a great example of how the web evolves, right? So I mean, all software does that. It's sort of the beauty of software. But I think in particular, the web is more democratic in that nature. Definitely. On the topic of evolving APIs, let's move on to the next subject, which is Wake Lock. So we were asked if we could help explain the Wake Lock API and why it's needed. Plus, related to this topic is a question about background audio too, which I think connects. I'll start by explaining Wake Lock. So this is the idea that the web browser decides to prevent the operating system from auto-dimming the screen or turning it off completely, because the browser intentionally keeps the screen active without user interaction. But why is this needed? And how does it relate to media? So sure. So first of all, this is very related to media, actually. Because when you watch a movie or a video on the web, you usually press Play, then sit back and watch, right? So now, if the video is a film or a TV show, for instance, then it's going to go for 30 minutes or even two hours. And as a user, you don't want the device to dim or switch off the screen, right?
Right. Yeah, that makes sense. The only alternative would be for the user to have to interact with the device continually to keep the device awake. So that would be quite frustrating, right? So instead, right now, web browsers detect when a video is playing in the active tab, and they tell the operating system not to lock the screen. In other words, they keep the screen awake. Hence the name, Wake Lock. And that makes absolutely perfect sense for media. But I think there is a new dedicated API for Wake Lock, right? I think it's in origin trials. It is, actually, yeah. Yeah. And that's actually separate from video and audio, because obviously we've already got the video and audio elements working the way you're talking about with the interaction with the screen and the operating system. So this new API will allow developers to directly request a wake lock. Why is this needed outside of video and media? So the reason is that there are other use cases where developers need this behavior outside of media. Let's say, for instance, that you are baking some chocolate cookies from a recipe website and that every 30 seconds or so, your phone screen turns off. How would you prevent that, except by maybe touching your device screen with your nose, given that your hands are full of flour, right? Right, yeah. Now, that's actually a really good use case. It's definitely one where the developer needs to keep the screen awake. And I actually feel like, though, I have used websites that already do this. Maybe it was a recipe website, I'm not sure. But I don't think there was any video or audio playing. So I'm starting to think, how were they doing it? Well, actually, I think I can probably guess. But I'll ask you to explain. How are they doing that if the Wake Lock API isn't actually out yet? Actually, what we found is that many developers are placing hidden videos in their page and playing them in the background. OK, yeah. So that's probably not very efficient, I imagine.
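For comparison with the hidden-video workaround, here is a minimal sketch of requesting a screen wake lock directly. It assumes the `navigator.wakeLock.request("screen")` shape of the Screen Wake Lock API, which may differ from the origin-trial version discussed in this episode. The `WakeLockLike` and `WakeLockSentinelLike` interfaces and the `keepScreenAwake` helper are hypothetical names, written structurally so the sketch does not depend on DOM typings.

```typescript
// Minimal structural shapes (assumption: only the members used here).
interface WakeLockSentinelLike {
  readonly released: boolean;
  release(): Promise<void>;
}
interface WakeLockLike {
  request(type: "screen"): Promise<WakeLockSentinelLike>;
}

// Tries to keep the screen awake. Resolves to the sentinel on success, or
// to null when the API is unavailable or the request is refused (a browser
// may refuse, for example, when the page is not visible).
async function keepScreenAwake(
  wakeLock: WakeLockLike | undefined
): Promise<WakeLockSentinelLike | null> {
  if (!wakeLock) {
    return null; // no Wake Lock support: the screen may dim as usual
  }
  try {
    return await wakeLock.request("screen");
  } catch (err) {
    return null;
  }
}
```

On a recipe page this might be called as `keepScreenAwake(navigator.wakeLock)` when the user starts cooking, with `release()` called on the returned sentinel once they are done.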
Nope, not at all. OK, so this explains how things have evolved and why we're building this API. Because we want to enable developers to just bypass this hack. Because obviously, they need this feature and they're working around it at the moment. And hopefully, this will be much cleaner and more efficient. Got it, OK. Again, this is an example of the web evolving. Developers leveraged an existing API, maybe not in the best way, to achieve something they wanted out of the web platform. And now the web community is standardizing this as a new API and giving the people what they want. Bingo. Another related question, and it's related because it's about the browser and operating systems: if a tab is playing audio and the user focuses on another tab or multitasks to another app, are there any chances that the operating system or the browser will reclaim memory and end the process? I know on desktop, if I play audio and I change tab, Chrome never seems to kill the audio process. But what about mobile? So you're right about desktop. That's what happens. But it's quite different on Android. If audio is playing, Chrome will mark itself as a foreground process, which will make it a priority for the system. But if the system is running into a high memory pressure situation, then Chrome may still be killed. If everything's normal, it shouldn't happen. Now, if audio playback is paused, on the other hand, Chrome will act like any other Android application. And it may be killed as well. Interesting. It's really amazing to think about all the work that the web browser does to manage the developer's needs, the operating system, and the user's needs. By the way, that's why we call a browser a user agent. Interesting. So those were the three main topics we had to cover, Francois. But I do have a couple of bonus questions. So firstly, I'd like to ask you, what is your favorite web media API and why?
Oh, so even though I worked a lot on picture-in-picture, I think my favorite API is the Media Session API. Without it, all our media notifications would be blank and sad. By simply providing metadata for the media your web app is playing, users will get customized media notifications that include the title, the name of the artist, the name of the album, and also some nice artwork. And that's only five lines of code. And as we've seen before with picture-in-picture as well, it also allows web developers to handle media events, such as track changing or seeking, that may come from a notification or even a hardware key. Let's say, for instance, a user hits the next track keyboard button. Bam, that will be handled by a Media Session action handler. By the way, this is also what powers the new Media Hub launched in Chrome recently. So if your website shows a blank notification without any controls today, please do me a favor and go add those lines of code. What about you? Do you have any favorite API? Yeah, I mean, it's not really a specific API, but one of the features I really love about web browsers is the ability to have programmatic subtitles. I think this is really good for accessibility. And for me personally, because I'm actually learning French at the moment, or perhaps I could say, j'apprends le français maintenant, it's very useful because I can be listening to the French audio and also watching the French subtitles. And that really helps connect the two. OK, so moving forward though, we have one last question, which comes from Ashley Gullen on Twitter. The questions are: what's next for media on the web, what features are on your radar, and what are you thinking of doing in the future? So the Chrome media team is working hard on maintaining and optimizing the current scope of media APIs. But I'm quite sure that, like me, you're looking for cool new stuff for 2020. So I'd like to mention one thing in particular, WebCodecs.
WebCodecs aims to provide low-level and efficient access to built-in media encoders and decoders. The goal here is to expose all the codecs the browser currently uses for the HTML media element, for instance. This will hopefully enable better support for latency-sensitive game streaming, client-side effects, or transcoding. Nice. OK. Yeah, I mean, as the games expert, I noticed you mentioned latency-sensitive game streaming. That makes perfect sense, right? Because earlier forms of video streaming weren't really related to bi-directional communication. But obviously, for a video game that's streaming from the cloud, it sends the video down, and then the user's interactions need to be sent back up. And so that means that the latency is just extraordinarily important, because you just cannot buffer. Exactly. OK, Francois, that's all our questions. This was super fun. It was. We're finished, but a quick note for everyone watching. If you have any follow-up questions, you can ask us on Twitter with the hashtag #AskChrome. Or you can post them on the YouTube video, and we'll try to cover them either on Twitter or in a follow-up episode. So yeah, thanks again for watching. See you. Ciao.