All right, okay. After that I'd like to introduce Vinci Rufus. Vinci heads the e-commerce and usability practices division at Neev Technologies, and he's been helping teams build applications for about 12 years now. He actually started his career building Flash games and other rich internet applications, and then moved on to JavaScript. He's passionate about front-end tech and usability, loves dabbling in HTML5, CSS3 and JavaScript, and enjoys conducting workshops and speaking on topics related to front-end technologies. What he's going to talk about right now is motion detection in JavaScript for gesture-based interaction, no plugins involved. Take it away, Vinci, all yours.

Houston, do you copy? Houston, do you copy? Can you hear me? Better now? Okay. Yeah, so don't believe all the nice things he just said. I'm going to be talking about motion detection, and I'm going to try to show how you can do gesture-based interaction with it. When I say gestures, that's not about your swipes, your pinches, your zooms. It's about standing two or three feet away from your computer, doing something, and controlling your application with that. Hopefully that's interesting.

Before I get into that, a quick look at the evolution of how we have been interacting with computing devices. You obviously start off with the keyboard, right? There was a time when you only used a keyboard to interact with the computer. Then you moved on to the mouse: you had a GUI, you clicked on it, and that passed commands. Then the Apple guys came in and said, hey, you can now swipe, you can pinch, you can zoom on your screen, and that passes commands. Great. That was kind of an introduction to gestures. After Apple, the Samsung guys came in and said, hey guys,
you don't need to touch, you just move your hand on top of the screen and things happen. Great. Then the Micromax guys came in and said: no touch, no hand over the device; you blow onto the device and things happen, right? So people are really hooked on to gestures. Two classic examples. Your Xbox Kinect: awesome thing, people are going crazy playing with it. You spend hours playing with it because you don't need anything; you just move your hands and legs and things work. And my TV, man, the Samsung smart TV. You don't have to run around hunting for a remote. You sit on the couch, move your hand, and control the application.

So gestures are really coming in, and if you look at gestures as such, at the end of the day it's all about motion detection, right? I do something, there's a motion happening, something detects that motion, and that fires some command. Ultimately gesture-based interactions boil down to motion detection. Now you're probably wondering: this guy is talking about gestures, cameras, motion detection, and you're at a JSFoo conference. Most probably you're going to be working with either Flash, HTML, or JavaScript. Flash and Flex were really great earlier; we could do a lot of stuff with them. Unfortunately, Flash got killed by somebody, so you're stuck with HTML, CSS and JavaScript, and till about a few years back there was little that you could do with those. Thankfully WebRTC came in. That was about two years back, I think; two years back is when the project really started, and now Firefox and Chrome support it. With WebRTC and your good old HTML5 canvas, you can start doing stuff with motion detection. So, a small intro into what motion detection is and how you do gesture recognition. According to Wikipedia, there are two basic ways you could do that.
The first one is machine vision. Machine vision is where you're probably using infrared beams, electrical signals, or some kind of capacitive signals, and when something breaks that beam or signal, that triggers a motion event and fires some commands. That's a little complex: you need external hardware, you need some setup for it. A much simpler way is image processing. In image processing, you have a camera that's taking images. You take one image, take another image a few seconds later, compare those images, and if you see some change, that's motion detection. That's what we have up here. Those are two different images, and they look seemingly similar, but when I superimpose them you realize that there's only a handshake happening there, right? That's where the motion is taking place.

Visually that's very easy for us: we know, okay, there's a hand movement out there, that's motion happening on the hands. How do you tell that to a computer? That happens through a process called blend mode difference. A very fancy name, I think, for something as simple as pixel color subtraction. So what we do over here is this: you've got two different images, right? You pick up a pixel from each of those images, and you pick up the RGBA values of each of them, the red, green, blue and alpha values of that particular pixel. You do a plain, simple subtraction of the two, and whatever the output is, we use that. In an ideal scenario, if there was no motion detected, the pixel in the current image and the pixel in the previous image are the same, so when you subtract them your answer is zero. And if your answer is zero, and you push that back as an RGBA value into a canvas image, you get a black pixel. Now isn't that great?
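The per-pixel subtraction just described can be sketched roughly like this. This is a minimal sketch, not the speaker's actual code: the function name is my own, and I'm assuming the flat RGBA layout that canvas `getImageData` returns.

```javascript
// Blend mode difference: subtract two RGBA frames pixel by pixel.
// `current` and `previous` are flat [R, G, B, A, R, G, B, A, ...] arrays,
// as returned by canvasContext.getImageData(...).data.
// `output` receives the per-channel absolute difference.
function blendModeDifference(current, previous, output) {
  for (let i = 0; i < current.length; i += 4) {
    output[i]     = Math.abs(current[i]     - previous[i]);     // R
    output[i + 1] = Math.abs(current[i + 1] - previous[i + 1]); // G
    output[i + 2] = Math.abs(current[i + 2] - previous[i + 2]); // B
    output[i + 3] = 255; // keep the diff frame fully opaque
  }
}
```

Where the two frames match, every output channel is zero, so the diff frame renders as solid black, which is exactly the effect the talk describes.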
So what I've tried to do over here is a pixel subtraction on that same scene. I did an RGBA subtraction, and whatever the output was, I just pushed it back. If you notice, the whole image is black, and only in the hand area, where there's something happening, is motion being detected. That's basically how motion detection works.

I'm going to try and show a small example now. Using WebRTC, I invoke my camera and give access to it. What I have over here are two simple canvas tags. This way... not sure, is it okay? Okay, we'll live with it. So I've got two canvases. The one on the left-hand side is streaming my webcam as it is; the one on the right-hand side is where I'm doing the pixel subtraction. Now if I stand straight and don't shake, everything you see turns black. The moment I start shaking something, there's motion detection happening, right? Very, very simple pixel subtraction, and that's giving you motion detection. Now I take this motion detection and create hotspots, and I say: if you detect motion in this certain area, trigger a certain event. I've done something on the top right corner, so if I move my hand to the top right corner, you see there's motion detection happening and that shows something, right? As simple as that.

You guys want to look at the code? It's extremely simple. I think that's the beauty of it, and probably also the reason why I'm able to explain this to you guys: it's really, really simple. If you look at the HTML part, there's nothing else; I've just got three elements. A video tag, and two canvases: the first canvas is where I'm rendering the video as it is, and the second canvas is where I do the blend. It's called the blend canvas; that's where I'm doing the subtraction, and the output of the subtraction is what I push over there. On the JavaScript side, I think it's a standard start for any HTML5 application that you'd use with canvas: create the variables,
get the context of each canvas, the 2D context, and save them into variables. Since I've got two canvases, I'm doing it for two variables. Then this is how you invoke the camera, or rather get the stream from it. It's part of WebRTC: a method called getUserMedia, and it's got three parameters. The first one is the constraint, where I say video: true. Then, if it can get the video, it goes to the function called gotStream, and if it cannot get the video, it goes to a function called noStream. In the noStream part I'm just going to say, sorry, no video available. But if I do get a stream, then I'm just going to pass the stream on to the source variable. So that's that function over there.

And this is the one where the subtraction is taking place. Again, a very simple function; I call it checkDiff. It takes three parameters: the data of my source image, the data of the previous image, and an output object. If you see here, I take the average of a pixel in the first image, take the average of the same pixel in the second image, and I'm just subtracting them. I create a variable called diff, average one minus average two, and whatever that is, I pass it back to my output as its respective RGBA. You might have some questions about why there's a number four coming in over there; I'll address that later. So at the end of it, I'm just picking up the pixel values, picking up the RGBA, and averaging them out, because I don't want to work with every channel at the pixel level, right? At the end of the day I just need to know: if the answer is zero, great; if it's not, then there's motion. So I just sum them up, average, and subtract.

Then this is the function where I check for the hotspot. If you remember, the top right corner is where I created a hotspot. So I get the pixels for that corner, zero-zero to fifty-fifty, and run that in a loop.
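The checkDiff routine and the hotspot lookup being described might be reconstructed along these lines. This is a sketch based on the description, not the speaker's actual code: the function and parameter names are assumptions, and the threshold of 10 is the value mentioned in the talk.

```javascript
// checkDiff: average the R, G, B channels of each pixel in the current
// and previous frames, subtract the averages, and write the result back
// into the output frame as a grey pixel. The loop steps by 4 because
// canvas image data is a flat [R, G, B, A, R, G, B, A, ...] array.
function checkDiff(currentData, previousData, outputData) {
  for (let i = 0; i < currentData.length; i += 4) {
    const avg1 = (currentData[i] + currentData[i + 1] + currentData[i + 2]) / 3;
    const avg2 = (previousData[i] + previousData[i + 1] + previousData[i + 2]) / 3;
    const diff = Math.abs(avg1 - avg2);
    outputData[i] = outputData[i + 1] = outputData[i + 2] = diff;
    outputData[i + 3] = 255; // keep the diff frame opaque
  }
}

// Hotspot check: read only the 50x50 corner region of the diff frame,
// average the channel differences there, and report motion only when
// the average clears a threshold (10 in the talk), so that sensor
// noise and tiny flickers don't fire the event.
function hotspotTriggered(diffData, canvasWidth, threshold = 10) {
  let sum = 0;
  let count = 0;
  for (let y = 0; y < 50; y++) {
    for (let x = 0; x < 50; x++) {
      const i = (y * canvasWidth + x) * 4; // index of this pixel's R channel
      sum += (diffData[i] + diffData[i + 1] + diffData[i + 2]) / 3;
      count++;
    }
  }
  return sum / count > threshold;
}
```

In the browser these would be fed from `context.getImageData(...).data` of the video canvas on every frame; here they operate on plain arrays so the logic stands on its own.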
I pick up the RGBA values for each of those, sum them up, and then I check if the average of that is greater than 10. Why 10? Because if I keep it really accurate, every small change, any minor detection, is going to trigger an event. Rather than that, I want to capture only significant motion. That's the reason I'm using a threshold value of 10: any detection where the value is greater than 10 is going to trigger a motion. And in the HTML I just put out a message: if the average is greater than 10, show this message; if not, show the other message. It's as simple as that. So that's the code behind the application I showed you.

Now, using the same concept, you can probably go and build some demos. Here are two demos that we worked on; I'm going to try and show you one of them. Okay, let me see if I can do this right. You guys go to your Myntra.com, your Jabong.com, ready to buy dresses. How nice it would have been if you could have a virtual trial room: you go there, check how a dress looks on you, and if you like it, you purchase it. So we're trying to do that. On the right-hand side there are... did the resolution just change? Wow. Okay, okay, let's try this. So, blue shirt. Not that great. This one looks better. Which one? As simple as that. Maybe I could take it a step further and say: if I like this, I move my hand to this side, that takes an image, and I could post that image on Facebook or Twitter or something like that. So that's one example.

The other thing which came out of this was... one second, let me close this. Yeah, this was the other one. The guys in our office did something at a hackathon event we had: a bunch of guys sat late into the night over a Saturday and Sunday and tried to do something with WebGL, Three.js, and this motion detection. What they did is try to control a car in 3D without touching anything. So I'm standing here: move left, move right, and I stop. Go front, turn. It's as simple as that.
Yeah, reverse. Okay, I don't know that one. So maybe there's a guy you can talk to: this demo was actually done by a bunch of these guys. The guy in the blue shirt is Piyush. You know, get up and say hi. He's one of the guys who did it, so questions can be directed to him.

Yeah, so I'm done. That's about it. And my final slide, in true Bollywood style: this is not the end, but I think this is the beginning of how you could try and use gestures in all your applications. The concept is very simple; it's about being creative and trying to use it in different ways.

Okay, thank you. So I just wanted to know: how are you doing this at the JavaScript level? For example, if you want to detect the motion, there has to be a function which is running on an interval, right? Is that not a performance issue? Have you checked the memory profile, whether you're getting memory peaks when you want to detect a motion? Because you're doing a lot of subtraction, RGBA subtraction, and if you're doing all that in the same thread, that's highly performance-intensive, right? So what are you doing for that?

Yes. So, you'll have to start playing with it; these things change from application to application. Okay, so I do have a requestAnimationFrame function that's running, and it's running at about 60 frames per second. Now, depending on the kind of canvas that you're capturing, you'd probably decide: if you're just picking each and every pixel, you could probably skip a couple of pixels instead.
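The pixel-skipping idea from that answer, reading only every Nth pixel of the RGBA array instead of all of them, can be sketched roughly like this. The function name, the returned score, and the default factor are my own; only the stride trick itself comes from the talk.

```javascript
// Coarse motion score: instead of visiting every pixel (stride 4 over
// the flat RGBA array), sample only every `skip`-th pixel. skip = 1
// reads everything; skip = 2 or higher reads a fraction of the pixels,
// trading accuracy for speed on large canvases.
function motionScore(currentData, previousData, skip = 2) {
  let sum = 0;
  let samples = 0;
  const stride = 4 * skip; // 4 channels per pixel, skipping (skip - 1) pixels
  for (let i = 0; i < currentData.length; i += stride) {
    const avg1 = (currentData[i] + currentData[i + 1] + currentData[i + 2]) / 3;
    const avg2 = (previousData[i] + previousData[i + 1] + previousData[i + 2]) / 3;
    sum += Math.abs(avg1 - avg2);
    samples++;
  }
  return sum / samples; // average difference per sampled pixel; 0 = no motion
}
```

A score of zero means the sampled pixels did not change at all; the larger the skip factor, the cheaper each frame is, at the cost of possibly missing very small movements between the sampled pixels.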
If you looked at my function which gets the RGBA values, I was using a multiple of four. Instead of using four you might use eight or ten, so that the number of values you read comes down: you don't need to inspect every single pixel, you can work with a group of pixels. So you need to start playing with those two functions, either from the hotspot's perspective or from that checkDiff function. You'll have to play with it, but yes, the bigger the canvas, the bigger the number of pixels that you're reading, and that's going to be a problem.

So one of the things that a lot of computer vision libraries do is that you really don't need to use 32-bit information on every pixel, and that's one of the things you did: you took an average instead, right? And that reduces it to 8 bits. Do you think we maybe need something like this in the browser, hardware-accelerated, in a separate thread? So you can say: give me this canvas, but only as an 8-bit image, either a color 8-bit image with 256 colors, or grayscale, so I can do the analysis much faster and much better.

Yes, I think that's a good idea. I think you should be able to do that.

The thing is, once you have the data, you can pass it to a web worker and it can analyze it the way it wants, right? Then you're not doing anything in the main thread: the browser gives you the data, and you do everything in the worker.

True, but my only concern with that would be grayscale. Right now I'm taking the full image and doing an RGBA subtraction, so if there's a change in my R, G, or B, it's going to detect that; with a grayscale image I might lose that.

Okay, but you don't always have to do grayscale; 8-bit doesn't always mean grayscale, right? It could be 256 colors as well. So yeah, that's one of the performance tricks employed in a lot of computer vision libraries.

That's true, exactly.

A question here? Sure, please. I wanted to know how this is different from image stabilization.
Like image stabilization: are we using the same principles, detecting motion and its sensitivity, and then doing stabilization, like a lot of cameras actually do?

Yes... oh, I wouldn't know how it works on cameras. I'm using the canvas out here, so I just pull the stream and put it into the canvas. But then, what I didn't show were some functions where I'm using a threshold value, where any pixel values greater than, say, 50 or 60 are negated out: they're converted to either a black or a white. If you saw the video, it was either black or white; you didn't have any grayscale or other colors. That's because anything greater than a certain limit was converted to either a black pixel or a white pixel. So I was using that, but that still doesn't answer your question, I know.

Ladies and gentlemen, a big round of applause for Vinci Rufus. Blow the roof off!
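The black-or-white thresholding mentioned in that last answer, where any pixel above a cutoff becomes pure white and everything else pure black, might look like this. A sketch under assumptions: the function name and default cutoff are mine; the 50-to-60 range is the value the speaker mentions.

```javascript
// Harden the diff frame to pure black and white: any pixel whose
// average channel difference exceeds the cutoff becomes white,
// everything else becomes black. This removes greys and color noise,
// which is why the demo's blend canvas shows only black and white.
function thresholdFrame(diffData, cutoff = 50) {
  for (let i = 0; i < diffData.length; i += 4) {
    const avg = (diffData[i] + diffData[i + 1] + diffData[i + 2]) / 3;
    const v = avg > cutoff ? 255 : 0;
    diffData[i] = diffData[i + 1] = diffData[i + 2] = v;
    diffData[i + 3] = 255; // keep the frame opaque
  }
}
```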