Our first speaker is Ankush Sharma. He is an undergraduate from IIT (BHU) Varanasi and has been a Pythonista since his college days. I would like you to give him a big round of applause. The second speaker is Shivam Verma, also from IIT (BHU), in his third year. He is also a Pythonista; his interests are image processing and network programming. So please, can we have a huge round of applause for both of them. Now for the important part: you can tweet using the hashtag #PythonIndia. So just open up your laptop and start tweeting, whether you like something or you don't — flood the volunteering team with your comments. Second thing: the Wi-Fi password is geeksrus, that is G-E-E-K-S-R-U-S, and the SSID is Python India. For any help you can just come up to me, or Tenzing, or any of the volunteers here. I hope you will have a great session and enjoy it. Thank you.

Hello. So, shall we start? Good morning, people. This is our talk: Python, the Eye of Real-World Computer Applications. An interesting title, and a little bit cryptic too, but you will know its meaning by the end of this talk. What we are going to do today is introduce you to a new — should I say cooler — way to make computer applications which can be controlled by hand gestures. About us: I am Ankush Sharma. We are from IIT (BHU) Varanasi, from the ECE department, along with my colleague Shivam, sitting over here. And meet my friend who was part of this project, Abhishek, sitting at the back. Hi, Abhishek. So, what powers us: gesture control. This is the cool thing I was talking about earlier. You are going to move your hands in front of your computer's camera; the camera will capture your hand gestures from the physical world and translate them to actions, and these actions will be executed by your computer.
That means you will have the power to control your computer with gestures in the physical world: you move your hands in front of the camera, and your operations are carried out on the computer. Sounds cool, people? So, you are going to need some technology for doing that. Does anybody have a guess which technology we use? Any guesses? Yes — image processing. So this talk is about image processing and how to do it in Python. First of all, I will walk you through the basics: the idea of an image, how images are stored in a computer, how a computer manipulates images, and so on. Then I will walk you through the SimpleCV library and some Pygame stuff — Pygame is a gaming library for Python. At last, I will show you certain algorithms, or workflow models, you can use to make your own computer vision app. Okay, so let us start.

First, a brief overview — the order in which we are going to talk today. First, what is image processing. Second, image processing with Python. Third, how to use Pygame; Pygame, as I already said, is a gaming library and a co-dependency of SimpleCV, the computer vision library for Python. Lastly, Shivam, my colleague, will walk you through the building process of the touchless Flappy Bird game and the virtual mouse — the applications we have made for this talk.

So, meet our friend Bob. Bob is a Python programmer. He has all sorts of experience working with Python, from web development to system programming. But Bob is new to image processing, and Bob wants to build this gesture-controlled game. How many of you are novices at image processing? Just raise your hands. Okay, thank you. So, what tools does Bob have in his toolkit? First of all, I will say Bob is an excellent Python programmer. So, he knows Python.
And most of you also know Python. Second, he knows the SimpleCV library, the computer vision library for Python, and Pygame, the game library for Python. So we will keep our talk centred on SimpleCV and Pygame.

Okay, so let's talk first about image processing — the basic idea of an image. How do humans see images? Suppose I click a random picture, like I click one now. I can say that picture has mountains and beautiful lakes, so I can say it is a landscape. That means I can see things and interpret them. Similarly, I can say that these rows have more people than those rows. So I can see pictures and make sense of them. But a computer sees things in a different way: it sees things as numbers. These numbers are what we call light intensity values at different places in the image, and these different places are what we call the pixels of the image. So basically, an image is a grid of pixels, and each pixel has a light intensity value. You can see this colour wheel over here: different colours are mapped to different numbers. Suppose you have a red pixel — it will be mapped to some number X in the computer; a green pixel will be mapped to some number Y. So the computer sees colours as numbers, and these numbers are the light intensity values. There can be different schemes for representing these intensities: you can use an 8-bit value or a 24-bit value. We will talk about them later.

So, now you know what an image is. What about image processing? Suppose I have a picture of the Mona Lisa and an image processing program running on my computer. So, what can I ask the program?
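To make "an image is a grid of intensity values" concrete, here is a tiny sketch of my own (not from the talk) using NumPy, which is also how SimpleCV represents images internally — the 3×3 array and its values are made up for illustration:

```python
import numpy as np

# A grayscale "image" is just a grid of light-intensity values.
# Here each pixel is an 8-bit number: 0 = black, 255 = white.
img = np.array([
    [  0,  64, 128],
    [ 64, 128, 192],
    [128, 192, 255],
], dtype=np.uint8)

print(img.shape)   # (3, 3): a 3x3 grid of pixels
print(img[0, 2])   # intensity at row 0, column 2 -> 128
```

A 24-bit colour image is the same idea with three such 8-bit values (R, G, B) per pixel.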
I can ask my computer: hey, tell me the number of red pixels in the image. The computer will look through all the pixels in the image, check which intensities match the intensity of red, and display the count. So, for an image, the computer gives me some information. I can use this information to process the image, and after that I can take certain actions depending on it. I can ask the computer: hey, shut yourself down if the number of red pixels in the image is equal to 100. So: you have an image, you get information from the image, and you take certain actions depending on that information. This is what we call image processing.

Now, about colour spaces. The two most popular ones people use are RGB and HSV. RGB stands for red, green, blue, and HSV stands for hue, saturation, value. People generally tend to prefer HSV over RGB because HSV separates out the effect of the luminosity of light at different places.

So, here comes the point: image processing with Python. Python is a powerful programming language. It has libraries for all sorts of things, from network programming to system programming. For image processing, Python has a good library called SimpleCV — simple computer vision. SimpleCV is an open source framework, or I should say a library, used to develop computer vision applications rapidly. It is a high-level library, meaning you don't need to know the low-level details of image processing — bit fields, matrix manipulation techniques and other things. You can build a good computer vision app knowing just the basics of image processing. So, let's dive into SimpleCV. A note: SimpleCV is not the same as OpenCV. SimpleCV is just a wrapper around OpenCV. OpenCV is a multi-platform library, and as you may know, it has bindings for many programming languages, like Java, C++ and others.
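The "count the red pixels, then act on the count" idea can be sketched in a few lines of NumPy — this is my own minimal illustration, not the speakers' code, and the 2×2 image and the threshold are invented values:

```python
import numpy as np

# A 2x2 colour image: each pixel is an (R, G, B) triplet.
img = np.array([
    [[255, 0, 0], [0, 255, 0]],
    [[255, 0, 0], [0, 0, 255]],
], dtype=np.uint8)

# "Tell me the number of red pixels": compare each pixel to pure red.
red = np.array([255, 0, 0], dtype=np.uint8)
red_count = int(np.all(img == red, axis=-1).sum())
print(red_count)  # -> 2

# Extract information, then act on it -- the essence of image processing.
if red_count >= 2:
    action = "shutdown"
```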
SimpleCV also adds lots of fun stuff, like OCR and barcode reading, which can be used at different points in your application. SimpleCV provides a rich set of classes dealing with different things: cameras, displays and so on. So, you are doing image processing, right? You need an image. How do you get one? You can use your camera to capture it, you can fetch an image from a URL, or you can use your phone. Let's talk about getting an image from a camera. There can be many cameras connected to your computer at a given time, so you need to choose the camera you are going to use for capturing the image. This camera is identified by a camera ID — we can call it a descriptor. First of all, I import the Camera class from SimpleCV: from SimpleCV import Camera. Now I instantiate a camera object by passing it 0; 0 is for my default camera, the built-in one on my laptop. Once I have instantiated a camera object, I can ask the camera to display a live feed of the video — the camera's live() method will do it. In the live feed, you can get the pixel intensity at particular mouse coordinates just by clicking on the image. Similarly, I can get an image from my camera: I call getImage(), and it returns an image object, an instance of the Image class. Once you have an image instance — we will look further at the properties of this image object shortly. A note about camera properties: you can set camera properties, such as the resolution, by passing a dictionary when instantiating the camera object. But these properties are highly manufacturer-dependent, and they are generally poorly documented by the manufacturer, meaning that setting a particular property may or may not work in a particular case. A little quirky, but that is how it is. Now you have an image, right?
You need to display it to people. So how do you display it? You need a display object for showing images. So let us create one. From SimpleCV we import the Display class, and we instantiate a display object by passing in a tuple giving the resolution. Now you have a display object. Some of you might be interested in showing some information on that display, or pointing out some information about the image on it. Suppose I want to display the HSV values at particular mouse coordinates. I need the mouse coordinates, and I can get them from the display object using its mouseX and mouseY attributes: x, y = display.mouseX, display.mouseY gives the coordinates on the display object where the mouse is. Now, you might be interested in cropping the image, or doing some operation on the image on a particular event. Suppose I want to crop the image on a left click: I can use the display's mouse-left event, which lets me attach an action on the image to that left-click event.

So, let us talk about the Image class in depth. As I said earlier, SimpleCV can deal with a variety of sources for images and videos: you can have a URL for a particular image, you can use a local file from your hard disk to create an image, or you can simply use your camera to capture one. Suppose I use a local image, local.png, to create an image instance. There are various methods on this image object. You can add text to it with img.drawText(), or — suppose you have a group picture in which you want to mark some random person by drawing a circle around him —
you can draw that circle using the draw layer of the image object, and on it you can draw shapes like circles, rectangles, et cetera. This is how images are stored in a computer: as a matrix, a 2D array, where each element represents the light intensity at that particular pixel. If you want to get the array, the matrix of the image, you can call img.getNumpy(); it returns a NumPy array. Once you have a NumPy array, you can do operations on it: manipulate the image, create your own image, and then save it.

So, hello world in SimpleCV — a simple SimpleCV hello world program. First I will run it; I am just going to show you a video of basic, simple stuff. What is happening is that we are getting a live feed from the camera and displaying some information on the image. First of all, we import all the necessary classes — Image, Display, Color, Camera — from SimpleCV; using the default camera, we create a camera object; then we create a display object by specifying the resolution. Now, in a while loop, I capture images continuously from the camera, add text to them, and save them to the display. Just basic stuff.

So, I am going to show you a demo of basic image manipulation techniques. First of all, I have switched to the virtual environment I created. Is this visible? Okay. I am increasing the font. Okay. Thank you. Is it fine? First, I import the necessary classes — it will take a little while. Now I create an image from a local image file. Sorry, I need to switch the directory. Right. So you have an image instance now. You can get the height of the image by just doing img.height — 253 pixels — and likewise the width. Okay. So, suppose I want to halve the image. I can do it like this: first, I get the NumPy array. So, I have the array now.
Then I create a new array from it which is half the size along the x coordinate of the image — w by 2 — keeping all the rows. Now I need to show the image, so I create an image instance from this NumPy array and call img1.show(). This was the cropped image, and here is the original one. You can do other operations too, like binarizing the image — you can see the binarized image here. So SimpleCV is quite a powerful library for image processing; you can do a lot of operations on images.

So, let's talk about something different. Now you people know about image processing; let's talk about the fun part — gaming. Pygame. Pygame is a beautiful gaming library for Python, and it is a co-dependency of SimpleCV: SimpleCV uses Pygame to draw stuff on the screen, as you saw with the images drawn by the SimpleCV library — the surfaces and other things. With Pygame you can add media to your games: it supports the common image formats, and it handles sound quite nicely. You can use your keyboard, mouse, joystick and other hardware to control your games; Pygame provides a good event-driven API for that.

So, let's talk about commonly used classes. In Pygame you would normally need to set up image, display and other surface objects before you actually write your game logic, which adds a little boilerplate before you start. Instead, you can call pygame.init(), and it will initialize the modules for you — the display, the fonts and the rest. Now, you are creating games, so you need to display an outer window. How do you create an outer display window using Pygame?
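The halving and binarizing operations from the demo can be sketched with plain NumPy — this is my own minimal reconstruction of the idea, not the exact demo code; the 4×4 array and the threshold of 100 are invented values:

```python
import numpy as np

# A 4x4 grayscale image with values 0..240.
img = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16

# "Make the image half": keep all rows, but only the first w//2 columns.
half = img[:, : img.shape[1] // 2]
print(half.shape)   # (4, 2)

# "Binarize": pixels above a threshold become white, the rest black.
binary = np.where(img > 100, 255, 0).astype(np.uint8)
```

In SimpleCV you would then wrap the resulting array back into an Image to show it.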
After pygame.init(), use the display module: pygame.display.set_caption() with the title of your game will set the caption of the window, and you get a surface object — named screen here — by calling pygame.display.set_mode() and passing in the resolution.

So, surface objects. Games are designed around events: depending on which events fire, you update your UI — you add an image to your game, you shoot a ball, and other things like that. This updating is done through surface objects. In Pygame, every 2D object is a surface object; from images to displays, all of these are surface objects. The power behind surface objects is that you can draw one surface object over another. You have the window we created on the last slide; now, if you want to add something to it, you use the methods of the surface object. These are the useful ones: fill(), which fills your display object with a colour given an RGB triplet; you can get the height and width of the display object; and you can blit onto your display object. Blitting means: suppose you have two images A and B, where the size of A is 100×100 and the size of B is 10×10. You can blit the small image onto the bigger one. It is just like a matrix operation — you put the elements of the smaller matrix onto the bigger matrix, specifying the source and destination positions.

So, a basic Pygame program. You can see that these few lines of code produce a nice little program like this. Just basic stuff.
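The "blitting is a matrix operation" idea can be shown directly with NumPy — a sketch of my own, not Pygame's actual implementation; the sizes and the (30, 40) offset are arbitrary:

```python
import numpy as np

# Blitting, viewed as a matrix operation: copy a small surface (B)
# into a region of a big surface (A) at a destination offset.
A = np.zeros((100, 100), dtype=np.uint8)    # the 100x100 surface
B = np.full((10, 10), 255, dtype=np.uint8)  # the 10x10 surface

def blit(dst, src, x, y):
    h, w = src.shape
    dst[y:y + h, x:x + w] = src   # overwrite the destination region

blit(A, B, 30, 40)
print(A[40, 30], A[0, 0])  # -> 255 0
```

In Pygame itself the equivalent call is `screen.blit(surface, (x, y))`.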
First of all, the imports and initialization — that is the boilerplate code. Then we load the surface of the ball, the image, with pygame.image.load("ball.bmp"). So we have the image, and we have the screen — the outer display object — from pygame.display.set_mode(). Now, in the while loop, we update the position of the ball, fill the screen with a colour, do the blit operation, and update the ball and the font — just basic stuff. As I said earlier, Pygame provides a good event-driven API with which you can trigger events and add behaviour to your program. And sprites are quite awesome in Pygame. Sprites are basically the characters in your games, and the reason the sprite classes are popular is that you can use the methods of the superclass on your Pygame objects — for example, to check whether two objects are colliding or not. Cool stuff. So now my colleague Shivam will carry on from here.

Shivam: Hello, I am Shivam. So, now, after getting to know all these libraries for image processing and gaming, we are moving on to the implementation part — the application part. Here you can see Bob is putting this toolkit to work; he is going to build something new. Together we decided to build two new, cooler applications, which I am going to discuss here one by one. First, we take the touchless Flappy Bird. How many of you have played Flappy Bird on your Android phone? Okay. If you remove the word "touchless", Flappy Bird is just a game: you need to cross the pipes by moving the bird up and down, and the motion of the bird is controlled by the keyboard, or on mobile by touching the screen. But here, you can see, we introduce the word "touchless" by using image processing. In this game we control the motion of the bird from the hand gesture captured by the camera — the bird moves up and
down. So, moving on to the algorithm and the building process of the application, first I want to show the demo video. Here you can see we perform a gesture in front of the camera and the bird moves up and down, while the webcam of the laptop continuously captures images — I will cover that in the algorithm. So, that's it.

Now, what is the building process of this application, the algorithm behind it? It basically consists of two steps: capturing the hand gesture from the physical world, and finding the coordinates of the marked object. As you can see, we need to detect where the gesture is going and update the bird's position in the game accordingly, so we need to distinguish the gesture from the rest of the physical world. For that purpose we use a marked object. The marked object can be a light, or a small coloured cap which we can wear on a finger. It must be distinguishable — what does that mean? As you know, we can deal with images in RGB or HSV; in this app we are using RGB. Distinguishable means the RGB range of the marked object is different from the RGB range of the background. That is what distinguishable means, and that is why we use a marked object.

Now, what is actually going on there? As you can see in the next slide, we make gestures in front of the camera, and the camera continuously captures images and passes them in parallel to the image processing program — what we call the computational part. The image processing program continuously takes images as input and processes them. What does the processing do? It traverses the whole image, checking the RGB value of each and every pixel; whenever a pixel falls within the RGB range of the marked object, it is taken as a marked-object point. So from the image we get all these marked-object points, and after getting them we do some
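The pixel-scanning step just described can be sketched with NumPy — this is my own illustration, not the speakers' code; the 6×6 frame, the white marker patch and the [240, 255] range are invented values:

```python
import numpy as np

# Scan every pixel and keep the ones whose RGB value falls inside the
# marker's range -- these become the "marked object" points.
img = np.zeros((6, 6, 3), dtype=np.uint8)   # black background frame
img[2:4, 3:5] = [250, 250, 250]             # a small near-white marker

lo, hi = np.array([240] * 3), np.array([255] * 3)
mask = np.all((img >= lo) & (img <= hi), axis=-1)

ys, xs = np.nonzero(mask)   # coordinates of the marker pixels
print(len(xs))  # -> 4 marker pixels found
```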
calculation to find their centre, and that centre is taken as the position of the marked object. Accordingly, we need to update the bird's coordinates. So the output of our image processing program is the coordinates of the marked object — that is the first part, getting the coordinates of the marked object. Now, the second part: we need to update the bird's coordinates with the coordinates of the marked object. Here we have a gaming application — the gaming program — and here we have an image processing program from which we gather the coordinates of the marked object. We need to interface between them. Before interfacing, we need to consider a certain point: scaling. What is scaling? I am using a laptop webcam with one resolution, and I have built a game with a different resolution — different scaling factors. We need to scale between the two in such a way that the bird can cover the whole gaming display, because otherwise there is a problem: you move your hand across the full frame, but the bird can only move a small distance and cannot cover the whole area of the game. That is the whole process going on there.

So, an overview of all of this: the hand gesture is captured by the camera, the camera continuously takes images and passes them to the image processing program — the computational part — which outputs the coordinates of the detected colour; then we interface that program with the gaming application, and the bird updates according to the gesture.

Now, diving into the code, I will show you some important code snippets. Here is the main program: the boilerplate and initialization, where we initialize the camera, which is needed to take the images. Then we have the main process, which, as I said, has two steps: first image processing, then interfacing. Here in the main program you can see a while loop running continuously, taking images. Now here you can see we
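The scaling step between the webcam frame and the game window can be sketched in a couple of lines — my own illustration, not the project's code; the two resolutions are assumed example values:

```python
# Map a point from camera coordinates to game-window coordinates so the
# bird can reach every part of the display, whatever the webcam resolution.
CAM_W, CAM_H = 640, 480      # webcam frame size (assumed)
GAME_W, GAME_H = 400, 300    # game window size (assumed)

def to_game_coords(cx, cy):
    # simple linear scaling of each axis
    return (cx * GAME_W // CAM_W, cy * GAME_H // CAM_H)

print(to_game_coords(320, 240))  # centre of the frame -> (200, 150)
```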
use 250, 250, 250 — here we assign the RGB range of the marked object. I pre-assigned that I would use a white marker for the gesture; for white, as you know, the RGB is 255, 255, 255. So here the RGB values are hard-coded, but in the next application I will show you that before running the application I choose the marked object according to my will — I can pick any object before running the application. Here, though, I pre-assigned the RGB range because I used a white light. So here we do the processing and calculate the y coordinate. We only need to update the y coordinate of the bird — we don't need to move it horizontally — so I calculate the y coordinate of the centre of the marked object, and here you can see I update the bird: bird.fly(y). That is the main part.

And we need to design the game itself. In the gaming part, as you can see, we have a background, and pipes keep being created and removed; here is the code for that — creating new pipes, moving pipes and removing old pipes. And here I will show you the beauty of sprites. What are sprites? With sprites we can deal with objects as rectangles: suppose the bird is taken as one rectangle and a pipe as another rectangle. Now we need a game-over condition — the game cannot go on forever — so the game should end when the bird touches a pipe: the bird dies. This is the beauty of sprites: we make two rectangles, and if the rectangles overlap, the game is over. That is our part; this is the main program. So that is the overall building process of the touchless Flappy Bird application.

Shall we move to the next application? Our next application is the virtual mouse. In this application we are controlling the functions of the computer mouse. Here are the differences from the touchless Flappy Bird app: there, we needed to control only one function — moving the bird up and down; there is just that one
function. But here we need to perform many functions: mouse right click, mouse left click, cursor tracing, mouse scrolling and so on. And secondly, here we don't have a gaming app where we can easily update the properties ourselves — mouse functions are controlled by the operating system. Before going over how we overcame these differences, I want to show the demo of this application. Here we are using the webcam with different marked objects — yellow, red, different coloured markers in my hand — and different functions are performed on the display: I open a file, I move the cursor; you can see the functions being performed. I will tell you the whole algorithm behind it.

Now, the algorithm behind this application. As I said, we have to perform various functions. To handle that, we use different marked objects — different colours — and map each colour to a different function. It will be clearer like this: here I mapped red to cursor tracing, blue to double click, and yellow to left click. We use different marked objects, that is, different coloured caps on the fingers. Now, what goes on here? We need to update the system properties, and for system properties we need an API to deal with the system. So we made a wrapper across the win32 API and the Xlib API — win32 for Windows and Xlib for Linux — and through that wrapper we control the system properties. Now, who tells the system what to do? Our image processing program, the computational part. It says: if red appears in the image, do cursor tracing; if green appears, do a right click; if blue appears, a double click; if yellow, a left click. So what does the computational part do? The load falls on the image processing program: it also has to detect the colour. The same processing runs there — we perform gestures in front of the camera with the different marked objects, and the camera
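The colour-to-function mapping can be sketched as a simple dispatch table — my own illustration of the idea, not the project's code; the recorded "events" list stands in for the real mouse calls:

```python
# Map each marker colour to a mouse action; when the image-processing
# part reports a colour, look up and run the matching function.
events = []   # stands in for real mouse-API calls in this sketch

actions = {
    "red":    lambda: events.append("trace-cursor"),
    "green":  lambda: events.append("right-click"),
    "blue":   lambda: events.append("double-click"),
    "yellow": lambda: events.append("left-click"),
}

for colour in ["red", "yellow"]:   # colours detected in one frame
    actions[colour]()

print(events)  # -> ['trace-cursor', 'left-click']
```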
just continuously captures images and passes them to the image processing program, which has to detect the colour as well as compute the coordinates. The coordinates are required in the case of cursor tracing; in the other cases we don't need to compute them. So that is the whole algorithm, the building process of the application.

Now, diving into the code, I will show you a demo of how we update the mouse properties. I have a wrapper across the Xlib API, and I can use it like this. First we create a mouse object. Now, suppose you can see the cursor there at the corner — you can see the cursor is here. Let's say I need to move the cursor here, to the coordinates (100, 100). I call the function; initially the cursor was there, and now you can see it here — I moved the cursor. Similarly, I can perform a double click, or a single click: if I move the cursor onto a file, this does the single click. And here is the code. I told you before that in this application we use instant marked-object detection, so that we can choose the marked object according to our will. Here is the code for that: we run the loop n times — once for each marked object we are going to use — and for each one it takes an image of the marked object and calculates its RGB range, which is then used in the rest of the application. So that is the whole building process of the application. Any questions?

For questions there is a mic; I will give it to you, or you can walk up to that mic. So, questions. First thing: do you have the source code for this online? — I will open the link; you can go to this particular link. — Okay, and the other thing is, do you have support for multiple marked objects in the same view? Have you tried that or not? — We have some small coloured caps, yes, two coloured caps
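The calibration loop just described — photograph each marker before the app starts and derive its RGB range from the sample — can be sketched with NumPy; this is my own illustration, not the repository's code, and the sample patch and the tolerance margin of 10 are invented values:

```python
import numpy as np

# "Instant marker calibration": take an image of the marker and derive
# its RGB range from the sample instead of hard-coding it.
sample = np.array([
    [[200, 30, 40], [210, 25, 45]],
    [[205, 35, 35], [195, 28, 50]],
], dtype=np.uint8)                  # a small patch of the marker

margin = 10                         # tolerance around the observed range
lo = sample.reshape(-1, 3).min(axis=0).astype(int) - margin
hi = sample.reshape(-1, 3).max(axis=0).astype(int) + margin
print(lo.tolist(), hi.tolist())  # -> [185, 15, 25] [220, 45, 60]
```

The resulting (lo, hi) range is then used for the per-pixel check during detection.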
— and then, multiple marked objects in the same view, can you use that or not? — In the same view, yes, you can handle it: you can tell the computational part that if yellow and red both appear, then do a particular function. It depends on you — if this colour appears, do this. Is it visible here? I wrote the link; on GitHub you can go to this repository.

Next question: if we use multiple markers in the same space, is there any lag? In the demo the bird moves in the y direction; if we move in both the x and y directions, is there any lag between the gestures and the update? — No. Just as we compute along the y direction, you can similarly compute the x coordinate and update the position accordingly. The only delay comes in the processing part; we take the images the same way, and the difference is created only in the computational part. You can compute the x coordinate and update according to it; no extra delay comes from that.

Next: we also have an RGBA value — why does your program not consider the alpha value? — We are starting from RGB there. We can also use HSV, which is more powerful: it removes the brightness effect. If you use the same colour under different lighting, the RGB values will change completely, but HSV will remain roughly the same, so HSV is the better choice.

Next: similar to image recognition, can we do voice recognition in Python? — Voice recognition, I don't know. — Maybe that will be your next step, then. Okay, can we have a round of applause for both of them.

Okay, before I begin — first of all, thank you, Ankush and Shivam. The Wi-Fi password is geeksrus, that is G-E-E-K-S-R-U-S, all in lowercase. Secondly, please don't create your own access points; we have a few more volunteers who will come up to you and ask you to stop your access points. And you can tweet about this using the hashtag
#PythonIndia. The next talk we have is by Arun Ravindran, and it's on Django design. He's been tinkering with Django since 2007 — I was in my ninth grade then. So can we have a huge round of applause for him. And one more important thing: there's a meeting at 3 p.m. in RD3, so if you guys want to join, you can. — Yeah, it's just starting up. Sorry, my laptop just hibernated. Okay — actually, it will be in 30 minutes.