I can start. Let's go. Okay, we're good. Yep. Okay, so if you do want to ask questions, you're welcome to chime in. I can't see chat or anything though, so somebody might need to tell me. Otherwise, you can ask them at the end, or not ask questions. My Twitter DMs are open, so you can message me there if you want. Or you can publicly tweet at me, although if you do that, nice things would be preferred. All right, cool. Let's go.

Not an important slide, but that's my Twitter handle if you want to DM me, and a bunch of my demos are on CodePen as Mandy Michael. Some of the demos that I'll show today are up there and they're public. Some aren't, because I haven't had a chance to clean them up yet, but most of the ones I'll show you are up if you want to go check them out later. Not at all important, but this is my dog, Jello. I mention him for two reasons: I use the word Jello a lot in my demos and then people think I'm talking about the food, but actually I'm talking about my dog, and all of my character names rhyme with Jello. So when you notice the theme, it's because of him. This isn't really important at all. It's just an excuse to have pictures of my dog in my talks.

As we said, this is an off-the-cuff version of my talk, so hopefully it goes okay, but I am going to talk pretty fast because I want to get through it in 20 minutes. If I'm talking too fast, maybe somebody say so. We'll see how we go.

So what we're going to talk about is using sensor and browser APIs on the web. We're going to focus on my three favorites: the Speech Recognition API, the Device Orientation API and the ambient light sensor. What I want to make clear at the beginning is that I call this talk "Fun with Browser and Sensor APIs" because I've not made anything particularly serious with them. These are experiments; I like to tinker and have a play. As I mentioned before, I like to play with variable fonts, so a lot of my demos involve those. But for me, the experiments and the discovery are part of how we figure out what we want to do on the web in the future, and how we want to use these technologies when they're more widely available, or whether we can think of something cool: different interfaces or devices that we might be able to use them in.

Because they're experimental, the browser support varies, so this is my legend. If a browser's logo is shown in full, support is good and you can use it in that browser. If it's got a flag, it means you have to enable a flag. And if it's got a star, it means it's coming soon. Things have shifted a bit on a couple of them since I last gave this talk; I'll mention that where it's relevant. But this is a pretty solid guide.

So with that out of the way, we're going to talk first about the Web Speech API. That's another picture of my dog. Amazing. So the Web Speech API basically just enables you to incorporate voice data into your websites or your apps. Browser support is pretty good, as you can see. The Firefox star: last time I spoke to Mozilla, they were really close to releasing this. I think they're just working on permissions, you know, giving the browser permission to use the speech recognition. It's currently split up into two separate interfaces. There's speech recognition, for understanding human voice and turning it into text, and there's speech synthesis, which reads out text in a computer-generated voice. Today I'm only going to talk about speech recognition. If you want to know about speech synthesis, MDN has a bunch of great resources on it.
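For completeness, speech synthesis is about this small to try. A one-liner sketch using the standard speechSynthesis global (the phrase is just a placeholder):

```js
// Speech synthesis in one line: the other half of the Web Speech API.
// (Not covered further here; see MDN.)
speechSynthesis.speak(new SpeechSynthesisUtterance('Hello, Jello!'));
```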
Some of the things that you use in speech recognition you also use with speech synthesis, so we'll keep a couple of those things in the code that I'll show you in a sec. First up, let me show you how it works. I'll just bring this up. Okay. I hope this works; I've never demoed this in a live stream before, so let's see what happens. I'm just going to click and then see if it gets my words right. Can you see the page? Yep. Amazing. Okay. Cool. So you can see that it got one of them wrong. Yeah, it said "paint" at the end. I forget what I actually said, but I didn't say paint. Okay, I clicked again. So every time I click, it's basically trying to determine what I'm saying and render it on the screen as text. It got some of that wrong too: "Brenda" instead of what I actually said. Amazing. That's fantastic.

So as you can see, it's not perfect. I particularly find that accents aren't represented very well. If you're American, fantastic. If you're Australian, it often doesn't get some of my words right, so I can't imagine what it would be like for people that have much stronger accents than I do. But it's not too bad, at least for me, and obviously the more that we use it, the better it gets. So that's an example of how it works.

In order to do that, it's actually quite straightforward. We need a couple of things. First, we need the SpeechRecognition constructor. It's a bunch of JavaScript, pretty much: we use it to create a new SpeechRecognition object instance, and what that does is give us access to the API's methods and properties. There are quite a lot of them for speech recognition. I'm not going to go through every one, just the ones you need to get started and do something simple like the example I just showed.

Along with the constructor, we need some methods, specifically start and stop, and they pretty much do what you would expect. start() starts the speech recognition service listening to incoming audio. Usually we would trigger this with a click or some kind of user interaction, because browsers don't like listening to people without them saying it's okay, which is probably a good thing. And then stop() stops the speech recognition service from listening to the audio. What stop also does is attempt to return a result from the audio that's been captured. So while start just starts listening, stop has that extra step of trying to return a result.

With that, we also need a couple of event handlers. Most of them simply listen for changes in recognition status, and there are a bunch of them, for error logging and that kind of stuff. Two of the main ones: onresult, which runs when the service returns a result. This is really special because it's executed every time the user speaks a word or several words in quick succession. Then we have onspeechend, which runs when speech has stopped being detected. The way speech recognition works is that it will keep listening until it hears a pause for long enough. So if I stopped speaking and was silent for a few seconds, it would stop and try to return a result. Obviously, you can also end it yourself, but as soon as it stops hearing something, it ends.

The last thing we need is the most complicated thing I'm going to tell you today: actually getting the result out of speech recognition is a little bit convoluted, and we kind of need to go through a few steps to get there.
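Before we get to the result, here's roughly what those pieces look like wired together. A minimal sketch: the webkit prefix is what Chrome currently needs, and the button selector is just an assumption for illustration:

```js
// Minimal skeleton (sketch): Chrome still uses the webkit-prefixed constructor.
const SpeechRecognition =
  window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();

// Browsers want a user interaction before listening, so start on a click.
document.querySelector('button').addEventListener('click', () => {
  recognition.start();
});

recognition.onresult = (event) => {
  // Getting the transcript out of this event is the convoluted part;
  // see the next step.
};

recognition.onspeechend = () => {
  recognition.stop(); // stop listening and try to return a result
};
```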
So first, we have to access the SpeechRecognitionResultList object, and that's just a collection of all the information about the result. That contains the first SpeechRecognitionResult object. Now, that result contains an item called the SpeechRecognitionAlternative object, and the SpeechRecognitionAlternative object contains what we really want, which is the transcript of what it thinks it heard from the microphone.

This sounds really confusing and convoluted, and that's because it is. The reason it's like that is that the API was designed to contain multiple alternative interpretations of what you said. By default it returns only one, but you can also get access to things like confidence ratings of what it thinks someone said. So in the example that I showed you, sometimes the words change as I speak, because the more context it gets, the more it tries to figure out what I said. I have never found a situation where I needed to do something different to this, so I don't know if maybe they'll simplify the API in future, but I can't see how you would use it any other way at the moment, and all of the examples from other people that I've found do it this way as well. So I guess we'll see.

So this is it in one line of code: first we get the list object, then we get the first result, then we get the first alternative, and then we get the transcript. That's how we get the actual text that it thinks I said. We can chuck all of that in a function. So I've got a testSpeech function, which I run on click. It pretty much uses everything that I just told you: we set up our constructor; I've also got a div that I'm finding on the page called output to chuck my text in; we start it; when we get a result, we find the transcript and put that on the page; and then we've got an onspeechend event handler to stop the recognition. That's pretty much it. That's how I ran the text example I showed you before. One thing to note: Firefox doesn't do the streaming, so the text coming up as you speak won't work in Firefox. It only works in Chrome. But I like it because it looks really cool.

So as an example, if you wanted to do something fun with this, let me go back into Chrome. I'll just get rid of that. This is my dragon. His name is Marshmallow and he is a fire-breathing dragon. We're going to say a word and he is going to bring that text alive in fire. I really hope this works. Fire. Yay! Okay, cool. So as I mentioned, I do a lot of stuff with variable fonts, and this uses a variable font called DecoVar that makes it look like the text is animating. So it's just text. It's an H1. I think it's an H1 on the page. Actually, it might be a paragraph. Anyway, not important. I say the word, render it into the text node that I have on the page, and animate it using CSS. The only JavaScript used in this is the JavaScript that does the speech recognition and sends the text to the page, and then it just adds a class to trigger the animation.

So you can do a lot of really fun things to create more interactive experiences. This kind of stuff is great for storytelling, or maybe marketing and advertising websites, if you want to give people a way to be a part of what you're doing. You can also create stuff where people can control things that are going on the page. So for example, in this case, this is all an SVG. I can say: this is Mello, my dog. We're going to say "who's a good boy?", and then we can animate the SVG. Or we can say, what was it? That's right: "where's the ball?"
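Putting it together, the click handler looks roughly like this. It's a sketch of the pattern I just described, not the exact demo code; the #output div and the "animate" class are placeholders:

```js
// Sketch of a testSpeech-style function: start on click, grab the
// transcript from the result chain, put it on the page.
function testSpeech() {
  const SpeechRecognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new SpeechRecognition();
  const output = document.querySelector('#output'); // placeholder element

  recognition.start();

  recognition.onresult = (event) => {
    // results list -> first result -> first alternative -> transcript
    const transcript = event.results[0][0].transcript;
    output.textContent = transcript;
    output.classList.add('animate'); // trigger the CSS animation
  };

  recognition.onspeechend = () => recognition.stop();
}

document.querySelector('button').addEventListener('click', testSpeech);
```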
So you can give a bunch of commands. In this case, I'm making a dog do a thing, but you can give a bunch of commands and animate things or change things or create different interactions. This kind of stuff might be really useful for video interfaces, if you want to say play or pause or next or whatever at the screen. Different ways for users to navigate your websites or your web apps. Really good for accessibility, for people that maybe aren't using traditional input devices. And you can build these experiences into your websites.

Another thing that I would love to see happen with speech recognition, but which is actually a little bit difficult right now, is using speech recognition and variable fonts to determine the tone and the intent of people's words. At the moment, when we do speech-to-text interfaces, it's all very static, right? You say a word and it prints it out, and it's exactly the same for everything that you say. But I think if we were able to combine variable fonts with speech recognition, and we could combine it with web audio to detect volume or tone or pitch, then we could represent that text in more meaningful ways and create more meaningful speech-to-text interfaces. So I'd love to see that, but there's no way to do that easily with speech recognition at the moment, because there are no timestamps attached to the transcripts, so you can't match up, say, volume to a transcript. I have a loose example, but it's so dodgy that I'm not going to show you, because it doesn't really work properly. But I'd love to see that working with speech recognition in the future, for really cool, meaningful interfaces.

One thing I wanted to mention about speech-to-text: the way it works in Chrome is that the Web Speech API goes straight to Google's speech-to-text service. So you do need the internet in order for this to work. It won't work without it. With Google, they record what you're saying and they store what you're saying, which you may or may not want. You can actually buy their speech-to-text service and pay to not have it record, but for normal web users, whatever you say to Google, it's going to record and keep, and this is how they improve their speech-to-text service. For Firefox, Mozilla actually have a speech proxy in between the Speech API and the speech-to-text provider. At the moment, Firefox are using Google's speech-to-text provider, because it's better than DeepSpeech, which is Mozilla's speech-to-text engine. You can actually switch to DeepSpeech. There's a blog post I can share later that explains how to do it, but it's not as good, and it doesn't have as much language support, obviously, because Google has many more resources. The difference with Firefox is that they actually pay Google not to keep any data about you. So if you want to test it out and not be recorded, use Firefox, because it basically goes through their proxy: no information about you gets sent to Google. The audio goes through, the result gets returned, and Google doesn't keep it or know anything about you. So it's a little bit more private, if that's a concern you have. There are also other speech-to-text providers, but Google and DeepSpeech are the two main ones that I'm familiar with and probably the ones that get the most use. You should also contribute to DeepSpeech, to make it better so that we have more options.

Yeah, so I guess to finish up and move on from web audio. Sorry, not audio: speech recognition.
For me, I feel like it's going to allow us to have more interactive and more accessible experiences, and I would really love to see people experimenting and working with it a bit more. It's really close to landing without a flag in Firefox, so we'll have access to it in Chrome, Edge and Firefox. That's quite a few browsers supporting it soon, so I think there's a lot of opportunity there.

Wow, I'm already over time. I'm so sorry. We're going to talk about orientation, and I will speed up a little bit. Basically, to do device orientation, we need devices that have an accelerometer or gyroscope. So while the browser support for device orientation is really good, you obviously need a device that can detect changes to orientation. It's seen a lot in mobile devices, to automatically rotate the display so it's the correct way up. Really common use case. I hate that, because when I'm lying in bed it goes the wrong way and it's really annoying, so I turned it off. But that's a feature you can use.

One thing I wanted to mention before I show you the code: when I first did this presentation, it worked perfectly in all the browsers. Since iOS 13, you now need to trigger an interaction for device orientation to work in iOS 13 and above. So you'll see in the code, and in some of my examples, a button that you have to click. This is because people got hold of device orientation, did really dumb stuff and pissed a lot of people off. So now we have to trigger it in order to use it. Thanks a lot, all those people. It was a lot of ad networks, just so you know. So, another reason to hate ads on the internet.

Much like speech recognition, we only need a few things. The first thing is the deviceorientation event, which provides information on the physical orientation of the device that we're using. Then we also need the deviceorientation event properties. There are four; I'm only going to talk about three: alpha, beta and gamma. These each represent different number values depending on the orientation of the device. The first one is alpha, and this represents rotation on a flat surface. So, for example, if you put your phone or whatever flat on a table and spun it around, that's alpha, and it's a number between 0 and 360. It's kind of like a compass. Beta represents tilting backwards and forwards, so it's kind of like this, and that's a range of -180 to 180. And then gamma is tilting left or right, so it's like this, and that represents a range from -90 to 90.

In order to use it, we have to check that deviceorientation exists, just to make sure we have access to it before we try and do anything. Otherwise the browser will not be happy with you and have a sad. Then we attach an event listener for the deviceorientation event and pass in a function that does some code. Really basic. You could do something like this: in this case, I'm detecting gamma, the tilt left and right, and then you can say, you know, if the current gamma is less than -50, do a thing. I actually use this code, and I made this: when I tilt my phone, I slam all the text into the side. It uses Splitting.js, so once it hits -50, I just position the letters all over to the left. It's also tilting: I use the constant reading from the tilt and pass it into the variable font to change the italic, the slant, of the font. It's really subtle, but it does actually shift.
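Stripped down, that demo's logic looks something like this. A sketch: the "slammed" class and the --slant custom property are placeholders, and the threshold is the -50 from the slide:

```js
// Sketch: check the event exists, then listen and react to left/right tilt.
// (On iOS 13+, call DeviceOrientationEvent.requestPermission() from a
// click handler before this will fire.)
if (window.DeviceOrientationEvent) {
  window.addEventListener('deviceorientation', (event) => {
    const gamma = event.gamma; // left/right tilt, -90 to 90

    // Past -50, slam the split letters over to the side.
    document.body.classList.toggle('slammed', gamma < -50);

    // Feed the constant reading into a variable font axis via a custom
    // property, e.g. h1 { font-variation-settings: 'slnt' var(--slant); }
    document.body.style.setProperty('--slant', `${gamma / 9}`);
  });
}
```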
So you can kind of create fun stuff like that, or you can do something like this, which is just, you know, making the moon move. This is an SVG. My favorite one is this one. I've made a lot of text effects with split vertical layouts. I like this because it's quite subtle, and you could just leave it split straight down the middle for when device orientation isn't working. But when it is, you get a nice little effect that some people will see. I also like this one, which changes the weight and transform of each letter as you rotate. This is on CodePen, and it works with any number of letters, assuming it fits on the screen. Maybe if you rotate it, it would be better. Device orientation would be good.

So what I like about device orientation is that a lot of what we do on the web is quite static and predetermined. You know, we have hover states, and we click to do a thing, and we've decided how that will work. But I like the idea that we can give our users the opportunity to shape their experience based on the situation that they're in, in that moment. It's used a lot at the moment for games, augmented reality, maps and subtle UI effects. Apple use it a lot in their native apps; if you rotate on, like, premium articles, you get a sheen, a shiny thing happening. So we can use it for subtle UI effects or parallax and stuff like that, but driven by what our user is experiencing. And I think that's really exciting, because to me predetermined animations and experiences aren't fun. I love the idea that I could be sitting next to someone and their experience of my website could be completely different to mine. I think that's really cool, and it gives us a lot more opportunity to create more engaging stories and more interesting things on the web. Kind of like how it used to be in the 90s, when we had fun designs.

So with that, I'm going to finish up by talking about the ambient light sensor, which is actually super quick, because there are only two things that you need for the ambient light sensor. And just for those that don't know: very limited browser support. If you look on Can I Use and MDN, you will get conflicting messages, which I'll explain in a moment. But at the moment, it only works in Chrome behind a flag. To my understanding, it will be landing in Chrome without a flag pretty soon, but I'll explain why it's been delayed, because it's actually been worked on quite a bit.

So an ambient light sensor is just a photodetector that's used to sense the amount of ambient light in a room. It's in all sorts of devices: laptops, TVs, mobile phones. You'll see it most commonly used to dim screen brightness when you're in a dark room, so your eyes don't get burnt out by the light. That's the most common feature. For the web, this is the second attempt at implementing the ambient light sensor. Firefox had implemented it first, but Mozilla were not happy with the design of the original API, so it was redesigned. The spec writers took the opportunity to broaden the scope and include things like Node as well as the web browser, so the API can reach not just browsers but other JavaScript runtimes and contexts, like wearables and IoT, that kind of thing. My understanding is that Fitbit adopted the shape of the API in their SDK. So if you understand this, you can probably use it in other contexts as well. The reason that its release has been delayed, and the reason that it hasn't been more widely implemented, is security concerns.
So researchers raised some concerns over the ability to steal user data using the ambient light sensor, which frankly I think is incredible. How people figure this stuff out is mind-blowing to me. As a result of this, the spec writers and the developers worked with the researchers to mitigate some of the known issues. The main solution they implemented was to introduce a frequency cap. So in the initial implementation, you'd get a constant light reading from the API. It's now capped at 10 hertz, and readouts are rounded to the nearest 50 lux, lux being the unit used for measuring light. This solution removes some of the accuracy, so it's harder to do bad things. I have not found it to be limiting in any of my demos. It had no significant impact, so I think it's great. Fewer security concerns, still works, no major problems. So this is now shipping in Chrome behind a flag. Also Edge, because Edge is now Chromium, so we get it in two. Two for one, I guess.

In order to use it, much like speech recognition and device orientation, we need to set up the AmbientLightSensor object. You might have noticed that the sensors are quite similar. This is intentional design: they're trying to make the sensor APIs very similar in how you use them, to make it easier for us, which is great, and I'll mention something about that in a sec. So we need to set up our new AmbientLightSensor object. Then we need the onreading event handler. This is not part of the ambient light sensor spec specifically; it's part of the Sensor interface, which contains a bunch of properties, event handlers and methods that a bunch of different sensors can use. So once you've learned this one, you can use a bunch of the others as well. The onreading event handler is called when a reading is taken, and it will give you the current light level when you access the properties on it. Reading frequency is decided by you: you can pass an optional value to the constructor, and that specifies the number of readings per second, keeping in mind that it also has a cap, for security reasons. Finally, the only thing that exists on AmbientLightSensor itself is the illuminance property, and that returns the light level as a lux value.

In order to use it, you could literally copy and paste this code into an HTML page and, as long as the elements were there, it would work. Check that AmbientLightSensor exists in the window. Create your object. Check for a reading. Find the illuminance value. So if it's less than 20, you could change the background to black, and if it's more than 20, it could be blue. There's also sensor.start(), which again is common across a bunch of different sensors.

Now, if you wanted to do something with this, my favorite example: this is going to be fun, because I'm going to cover the camera in order to demo it. So it's going to go black, which kind of works with the whole mood. If I cover the light, we can change the screen to animate or do something different. And I love the idea of a website where, you know, it's dark, so maybe you want it to be super moody or something; or it's light and you're like, cool, this is fun, it's the middle of the day, it will be pretty chill. I desperately want to make a story where, if you're reading it in the day, it has this really nice vibe, but when it's dark and you're in a room that's really dark or whatever, it starts to feel ominous and creepy. I think that would be a really fun experience for people.
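That copy-and-paste example boils down to something like this. A sketch: the 20 lux threshold and the colors are from the slide, and in Chrome you'd currently need the experimental sensor flag enabled:

```js
// Sketch: dark room -> black page, brighter room -> blue page.
if ('AmbientLightSensor' in window) {
  // Optional frequency option: readings per second (capped for security).
  const sensor = new AmbientLightSensor({ frequency: 10 });

  sensor.onreading = () => {
    // illuminance is the light level in lux, rounded to the nearest 50.
    document.body.style.background =
      sensor.illuminance < 20 ? 'black' : 'blue';
  };

  sensor.start(); // start taking readings
}
```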
So just to be clear, though, the sensor isn't in the web camera. It's actually next to it, but it's quite difficult to cover just the sensor. Oh, there we go. I got it without covering the camera. Most people think it's the camera, but it's not. It's actually right next to it. So another, oh, how do I get there now that that thing's covered? Let's do that. Another example. I like this example; it's my first one. Sorry, I'm going to blind you all now with light. This one changes the variable font weight of the text based on the amount of light in the room. This I really like for accessibility purposes. As we move between rooms, whether they're low-light or really bright, you can change the weight or optical sizing or different axes in a variable font to make it more legible and more readable. I think this is really great because, again, we have these really static experiences that don't take our users' environments into account. So this gives us the opportunity to change things based on their environment. And we can also tie this into things like color contrast modes and dark mode and stuff like that, and make adjustments to create a more refined and legible experience.

Or, if you want to do something fun, this is great for storytelling. This font is called Blue GX, by Arthur from Typearture, who makes the best, most exciting fonts on the internet. And all I'm doing is using the light to make the flowers bloom, because flowers need light, and that makes sense to me. But this would be great for storytelling and marketing and advertising, so you could change the way it's experienced based on the amount of light. So I really love the ambient light sensor. It's my favorite. I wish it was more widely supported. There are two specifications; the old spec is in Firefox. Don't use that. It won't work. It always confuses people. There are two links on MDN; don't go to the one that's for Firefox. I've got a link at the end of my slides that you should go to. But literally every time I post about this, people go to the wrong link and then get confused. So use the one that I have at the end of my slides.

So, to finish up, because I've gone way over. I'm so sorry. This is what happens when I ramble. Don't be limited by what we can already do. The web's really young, and there's a lot of stuff that we can make, and these are experiments that get me really excited about what we can do in the future. But also, we can use them together. They don't have to be used independently. This uses the ambient light sensor and device orientation to make a spotlight. It says "Jello, at your service". It's a message I left my husband one day. But you get to experience this message only if you're in the right kind of environment, and you can create puzzles or other experiences that require these inputs to complete them. And I think that's super exciting.

Another example, and I don't know if this is going to work; this one is so touchy because of my accent, so we'll see. This is Smello. He's a grumpy wizard. He only casts spells if it's dark. So if I cover my sensor and I say: please cast your spell. Yeah! Okay, cool. So it worked. That was lucky. Normally this one takes me like three goes, because for some reason it can't understand what I'm saying. So you can combine them together to create a bunch of different things. One of my favorites is this unicorn one that I made. I can't show you the device orientation part because that's difficult in a phone demo. So that's it working with the device orientation, right?
But when you use it on a phone, all together, you can make the beam show, and you can say, oh, let's see if this will work: commence abduction. Yay! So it'll only work if the light is over the catacomb and you say "commence abduction". If you get it wrong, it'll call you a fool and you'll have to try again. So these are fun experiments that I made that I think are really cool, and I think they demonstrate possibilities for us in the future, whether it's for practical interface reasons like accessibility or legibility or readability, or just for fun: games or advertising or whatever. There are opportunities here for us to discover things, and we have the chance to determine how this stuff is going to be used in the future. So that's it. Thanks, everybody. And as I said, this is the link. Go to this one. Don't go to the other one, otherwise it won't work for you. If you have any questions, you can either ask me now or you can hit me up on Twitter. Done. Thanks.

Thanks, Mandy. That was amazing. So you can just hang around, I guess, and you can stop sharing your screen so the next speaker can share theirs. Done. Stopped sharing.