Hi everyone. First of all, I would like to thank each and every one of you for coming here. It's an absolute honor for me to be amongst some of the best data scientists in the world, especially Fabio, who is here with me tonight and has always helped me, and Deepti as well.

So, tonight I am going to talk about recognizing human features in deep networks. But before that, I would like to tell you a bit about myself. Who am I? My name is Aksha Bahadur. I am a software engineer at Symantec, and I am also a machine learning researcher. So by day I am a software engineer, and by night I am mostly doing machine learning research. These are the three links where you can reach me: you can connect with me on LinkedIn; if you want to see my latest work and my open-source implementations, you can connect with me on GitHub; and I also have a personal website, so you can go through that as well.

A special mention to ODSC India, because I think they did a great job of bringing people who share the same interests in data science and machine learning onto the same platform. So, yeah, thanks for that. The agenda for tonight is essentially just four points, and at the end I will be thanking you for being here.

So, now that the agenda is defined, let's start. Initially, when I started out with machine learning, I wanted to work with some cool technologies, let's say autonomous driving. But a couple of months back, one of my friends showed me this video, and seeing it had a profound impact on how I thought machine learning worked. So let me share this video with you, and after that we can talk.

So, yeah, that was Tonya's story, and it made a really powerful point.
Because, you know, even though she was differently abled, she made sure that she didn't feel that way.

So, coming back to the topic, this was one of the first projects that I worked on. It's not that difficult; I think almost every researcher who has started out in machine learning has started with digit recognition using the MNIST dataset. It's a simple, basic classification problem in which you have to classify digits from zero to nine. I did that in different phases. First I studied simple logistic regression and implemented that. Then, when I came to know about neural networks, I implemented the same problem with a neural network. And then, when I started deep learning, I came back to the same problem and tried to solve it using a deep network. Ultimately, it was an effort of about one and a half months, during which I was researching all the time the different techniques that could be used to solve this problem.

And if you want to see the results, I have them with me. What I'm doing is using computer vision techniques to track the digits. If you look closely, I have a blue-colored object in my hand. I'm tracking just the path of this blue-colored object, tracing out that path, and then feeding the resulting digit to the model to get predictions. And if you can see, the deep network mostly performs quite well compared to the shallow network or the logistic regression model; the deep network outperforms all the other models. I think, because of my prior intuition with computer vision, I was able to do this pretty accurately: I was using computer vision to get the data through the webcam feed, and I was able to get predictions in real time.
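The blue-marker tracking step can be sketched roughly as follows. This is a minimal NumPy-only sketch, not the project's actual code (which would use OpenCV and a live webcam feed), and the color thresholds here are made-up placeholders:

```python
import numpy as np

# Hypothetical RGB range for the blue marker; real thresholds would be
# found by trial and error against the actual webcam feed.
LOWER = np.array([0, 0, 150])
UPPER = np.array([80, 80, 255])

def color_mask(frame, lower=LOWER, upper=UPPER):
    """Boolean mask of pixels whose RGB values fall inside [lower, upper]."""
    return np.all((frame >= lower) & (frame <= upper), axis=-1)

def centroid(mask):
    """Center of the masked region, or None if nothing matched."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return int(xs.mean()), int(ys.mean())

def trace_path(frames, canvas_size=(28, 28)):
    """Draw the per-frame marker centroids onto a small MNIST-sized canvas."""
    canvas = np.zeros(canvas_size, dtype=np.float32)
    for frame in frames:
        c = centroid(color_mask(frame))
        if c is not None:
            x, y = c
            h, w = frame.shape[:2]
            # Scale the tracked point into canvas coordinates.
            canvas[int(y * canvas_size[0] / h), int(x * canvas_size[1] / w)] = 1.0
    return canvas  # this 28x28 image is what gets fed to the digit model
```

The resulting 28x28 canvas is in the same format as an MNIST training image, which is why the classifier can be reused on the traced path directly.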
So, yeah, this project was, I think, one of the first that I did. After that, I went on to make a different project, known as Hindi alphabet recognition. The idea remains the same, right? I'll play the video for you. For everybody who knows the Hindi alphabet, I think it's pretty difficult, because there are different shapes involved; for example, the difference between the letters wa and ba is minimal, because there is just a single slanted line in between. I started out with the networks from the previous project, logistic regression and the shallow and deep networks, but those were not very effective. So I went ahead and learned about CNNs, convolutional neural networks, because CNNs are very effective when it comes to picking different shapes out of an image. With those I was able to do it very accurately. That was, I think, my second project, and my mom was very proud of it, because she's a big supporter of Hindi.

So, the next project that I worked on was facial recognition. This was pretty cumbersome, because it was based on a research paper: FaceNet, from Google. I used transfer learning: I took the official weights from the FaceNet repository, but I implemented an additional feature, which was that if you are not looking at the camera, the facial recognition system would not work. This was taken from the iPhone X release: the iPhone X had the same behavior, where if you don't look at the camera, the phone will not unlock. If you want to see the result, this is actually my brother; he was, you know, the lab rat for my project. So if you see, the system detects him correctly while the eyes are open, and now they are closed.
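The reason CNNs work so well for shapes like that single slanted stroke is that each convolutional filter responds to one local pattern. A minimal NumPy sketch of the core operation, with a hand-made vertical-edge filter standing in for a learned one (the toy image and filter are illustrative, not from the actual project):

```python
import numpy as np

def conv2d(img, kernel):
    """'Valid' 2D cross-correlation: the core operation of a CNN layer."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter: it fires strongly on a vertical stroke, the kind
# of small shape difference that separates two similar-looking letters.
vertical_edge = np.array([[1., 0., -1.],
                          [1., 0., -1.],
                          [1., 0., -1.]])

# Toy 5x5 "image" with a vertical stroke down the middle column.
img = np.zeros((5, 5))
img[:, 2] = 1.0
response = conv2d(img, vertical_edge)
```

In a trained CNN, filters like this one are learned from data rather than hand-written, and the strong positive/negative responses on either side of the stroke are what the later layers use to tell the letters apart.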
So, whenever the eyes are closed, the prediction does not happen, the facial recognition does not work; and when the eyes are open, it detects pretty accurately. During this project I came across several concepts, such as one-shot learning and the Inception network and how it works. So, yeah, I think it was pretty fascinating, because this was a bit difficult for me to achieve. But I would like to keep it short and move on to the next project.

So, this next one is, I think, a very interesting project. First I will play the video of the result, and then I will talk about it afterwards. What happens is that based on different hand gestures, I am displaying emojis. The main question is: how do you get just the hand from the image? From the whole image, how are you going to isolate just the hands, and where exactly are the hands in the image? A friend and I came up with the idea that we would only work with a certain fixed area of the image. So, let's say only this part of the image is what we are concerned with: the region of interest is just this corner. Then we used a concept similar to the color tracking from before, except that now you are tracking skin color. The way we did that is that skin color is mostly brownish, so we did some trial and error and found a certain range in the RGB channels for brown. Then we take a bitwise AND of that mask with the entire image, and eventually we get just the hands. Then we created our own dataset, and we went ahead and overlaid the emoji image based on the prediction from the model.

So, one of the interesting constraints was that we had only 10,000 images, because we were creating our own dataset: we had 10 gestures, and for each gesture we created 1,000 images. So, let's say I rotated my hand a little bit.
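The region-of-interest plus skin-mask step described above can be sketched like this. Again a NumPy-only sketch: the RGB range for "brownish" is a hypothetical placeholder for whatever the trial-and-error search actually found, and the ROI coordinates are arbitrary:

```python
import numpy as np

# Hypothetical brown-ish skin range in RGB, found by trial and error.
LOWER = np.array([80, 40, 20])
UPPER = np.array([220, 180, 140])

def extract_hand(frame, roi_box):
    """Crop a fixed region of interest, then keep only skin-colored pixels.

    roi_box = (y0, y1, x0, x1): the corner of the frame the user holds
    their hand in, so we never have to search the whole image.
    """
    y0, y1, x0, x1 = roi_box
    roi = frame[y0:y1, x0:x1]
    in_range = np.all((roi >= LOWER) & (roi <= UPPER), axis=-1)
    # Bitwise AND of the mask with the image: background pixels go to
    # zero, leaving just the hand to feed to the gesture classifier.
    return roi * in_range[..., None].astype(roi.dtype)
```

The output is a cropped image with everything but the hand zeroed out, which is what the gesture model would be trained and evaluated on.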
So, let's say my hand was like this, and I rotated it 10 degrees to the left: the model was not recognizing it properly, because I was definitely overfitting at that point. So, what we did was image augmentation. We rotated each image, let's say, 10 degrees to the left and 10 degrees to the right, and then added those rotated copies back to the dataset. Eventually, after a couple of rounds of this augmentation, we had a dataset of around 40,000 to 50,000 images. Then I trained the model again on that, and after that I got pretty good accuracy.

So, next is, I think, a fun project that I did. It's just an extension of something that was already being done: rock, paper, scissors, lizard, Spock. It is inspired by The Big Bang Theory; in one of the episodes, Sheldon has the idea of playing this game. If you want to know the rules, I have them written here, and they are simple: scissors cuts paper, paper covers rock, rock crushes lizard, lizard poisons Spock, Spock smashes scissors, and so it goes on like that. So, yeah. The implementation was pretty similar to what I did earlier; I just made a few changes in the conditional statements. The point is that you can do these things just for fun as well.

So, the main idea behind doing these two projects, this one and the emoji one, was that I thought they were pretty cool, but a similar implementation could be used for, let's say, sign language recognition: you could use the same technique for that, and I think a lot of people have already done it. So I did not do that implementation; I went ahead and did these ones instead, because I thought they were pretty interesting.
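The rotation-based augmentation can be sketched with a small nearest-neighbour rotation written from scratch (the real project may well have used a library routine; the angle set here just mirrors the "10 degrees left, 10 degrees right" idea from the talk):

```python
import numpy as np

def rotate(img, degrees):
    """Nearest-neighbour rotation of a 2D image about its centre."""
    theta = np.deg2rad(degrees)
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, find where it came from.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.round(src_x).astype(int)
    src_y = np.round(src_y).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out

def augment(dataset, angles=(-10, -5, 5, 10)):
    """Each image plus four rotated copies: 10,000 images become 50,000."""
    out = []
    for img in dataset:
        out.append(img)
        out.extend(rotate(img, a) for a in angles)
    return out
```

The key point is that the rotated copies carry the same label as the original, so the model learns that a slightly tilted hand is still the same gesture.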
So, yeah, and also one other project that I had in mind that I wanted to show you; let me just go ahead and show you that project. So, this is the behavioral cloning project that I did. The data is taken from a simulator, so this is not an actual road, but you can see that the steering wheel is turning in the exact direction of the road. We got this dataset from Udacity, along with the steering wheel angles, and we trained the model accordingly: we implemented a CNN and did some regularization as well, to make sure that we were getting correct results. But the thing is, it was not very fascinating, because you are not doing it in the real world.

Then I came across a research paper from NVIDIA; NVIDIA was doing it in the real world. So, if you can see, this is actually the real world, with somebody driving the car. It is not very accurate, because here it is taking a left turn just because it is seeing some objects, and we had to take care of that. But other than that, the main problem in this project was that the dataset was pretty huge, around 4 GB or something, and I have a quite old laptop, so I was not able to train a model on it directly. So, what I did was pre-process the images: I reduced the size of the image, did some normalization, and also changed the color channels, converting from RGB to grayscale or HSV format, just to make sure that I had data I could actually work with. And the model that I trained, I think I improved quite a bit, because it is now showing pretty good results.

So, these are some of the projects that I have done, and during this learning experience there are a couple of points that I thought were really important.
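The preprocessing steps mentioned above (shrink, grayscale, normalize) can be sketched like this. The target size is an assumption on my part, borrowed from the NVIDIA paper's 66x200 input, and this naive strided resize stands in for whatever interpolation the actual pipeline used:

```python
import numpy as np

def preprocess(frame, target=(66, 200)):
    """Shrink, grayscale, and normalize one RGB camera frame.

    The goal is simply to cut the data volume per frame so that
    training fits on a modest laptop.
    """
    h, w = target
    # Naive downsampling by index selection (no interpolation).
    rows = np.linspace(0, frame.shape[0] - 1, h).astype(int)
    cols = np.linspace(0, frame.shape[1] - 1, w).astype(int)
    small = frame[rows][:, cols]
    # RGB -> grayscale with the usual luminance weights: 3 channels -> 1.
    gray = small @ np.array([0.299, 0.587, 0.114])
    # Normalize to [-1, 1] so training is better conditioned.
    return gray / 127.5 - 1.0
```

A 160x320x3 uint8 frame (about 150 KB) comes out as a 66x200 float image, an order of magnitude fewer input values for the CNN to process.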
So, first and foremost, of course, you have to be helpful towards each other, because this ML community, I think, is very strong in terms of helping each other. And also make sure that you enjoy the process, because once you start enjoying it, it is no longer a burden. It comes very naturally to me, because even though I have to work a bit hard, I really enjoy the whole process. So, yeah, if you have any questions related to any of the projects that I did, you can go ahead and ask, and I will try to answer them.

Yeah, sure. Yeah, so, definitely. So, the dataset that I worked on had road markings, but I did check it against different road qualities, and I was not getting very good results. Hence, I did not go ahead and test it on the other footage that I was considering. But definitely, if, let's say, the road markings were not there, or if certain other objects came onto the road, then the prediction went haywire. So, I mean, that was just the initial implementation that I did; maybe you could take that implementation and build on top of it, so that you can have a better model and improve the accuracy step by step.

So, for a flying car, I think there are a couple of things involved. First you need to have a predefined route, right, and then along that predefined route you may have certain objects in between. So you have to track those, and then you need to come up with different maneuvers: maybe you swerve towards the left or swerve towards the right, and let's say you swerve towards the right, then how will you come back to the original path that was planned from the beginning? So, there are certain things like that you need to take care of. I mean, the project that I did is not an entire autonomous driving system.
It is just that I took the image and was able to predict the steering angle. There are a couple of other things involved: you need to be able to spot pedestrians on the sidewalk, spot the traffic signals, and know where you need to stop. So there are several different things involved, not just the angle of the road and the steering wheel, and there are a couple of things you need to make sure of before you actually put this kind of model into production.

Yeah. So, in that case, what I would probably suggest is to keep everything on the cloud: keep all the calculations, all the computation, in the cloud itself, so that you can just send the image to the cloud and get the response back, maybe per frame, because if you keep everything on the vehicle itself, that becomes very hard to run. So, I have seen certain implementations of autonomous vehicles, not flying ones, but autonomous toy-car vehicles, that did the same thing: the car communicated with the cloud using a Flask API in Python, the model lived entirely in the cloud, and they were using some GET APIs to predict whether the person standing in front was an obstacle or whether the car could go ahead on the same path.

Yeah. So, that is definitely the case: you need to make sure that you have a good network connection. It can be one of two things. Either you keep everything on the car, and in that case the internet doesn't matter, but that becomes a bit bulky, and you may not have that amount of computing capacity on the device itself. Or you go with the cloud, and for that you need good internet, so that even though the model is in the cloud and you are sending high-resolution images over the network, you could still do it.
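The cloud-inference architecture described in that answer can be sketched end to end with the Python standard library (the talk mentions Flask, but plain `http.server` keeps the sketch dependency-free; the `/predict` route name and the stand-in model are my own placeholders):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

def predict_steering(frame_bytes):
    """Stand-in for the real model living in the cloud.

    A real service would decode the JPEG and run the CNN here."""
    return {"steering_angle": 0.0, "frame_size": len(frame_bytes)}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        frame = self.rfile.read(length)          # one camera frame per request
        body = json.dumps(predict_steering(frame)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):                # keep the demo quiet
        pass

def serve():
    """Start the 'cloud' side on an ephemeral local port."""
    server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]

def send_frame(port, frame_bytes):
    """What the car-side client does for every frame it captures."""
    req = Request(f"http://127.0.0.1:{port}/predict", data=frame_bytes)
    with urlopen(req) as resp:
        return json.loads(resp.read())
```

The trade-off from the answer shows up directly here: the car only needs enough compute to encode and send a frame, but every prediction now depends on the network round-trip.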
I mean, I haven't explored that part of TensorFlow. So, if you are saying that, I would take your word for it, but doing that would also require some amount of computation on the device itself. If you want to remove that entirely, then definitely the cloud is the way to go. Yeah, MobileNets, you could use that. Yes. So, SqueezeNet and MobileNet squeeze the entire network into a smaller model, so that you can get faster inference. So, that was about it. Thank you for being here.
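As a footnote to that last answer: the parameter saving behind MobileNet-style depthwise separable convolutions is easy to see with a little arithmetic. This is a back-of-the-envelope sketch, not either library's actual layer code, and the example layer shape is arbitrary:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution layer (ignoring biases)."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise separable version, as used by MobileNet: one k x k
    filter per input channel, then a 1x1 pointwise mix of channels."""
    return k * k * c_in + c_in * c_out

# Example layer: 3x3 kernel, 128 -> 256 channels.
standard = conv_params(3, 128, 256)        # 294,912 weights
separable = separable_params(3, 128, 256)  # 33,920 weights
```

For this layer the separable form needs roughly 8-9x fewer weights, which is where the "squeeze the network into a smaller model" effect comes from.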