Yeah, thank you so much, and good morning, everybody. Today we are going to look at one of the use cases where you can apply machine learning in cyber security: an intelligent intrusion detection system. This is one of my research projects. I have been working on it for nearly two years, and it has been published at a couple of conferences, as well as a poster from Intel, and Intel also approved a grant for it. So let us get started.
First, a quick background: why intrusion detection systems, and what was my motivation for choosing this? In today's world, network and system security matters a great deal. You hear about attacks happening every day, and brand-new attacks keep coming up. By the time I finish this sentence, one more new malware or virus has probably been created. Information privacy has also become of paramount importance, and every industry and business has to take proper care of it. These are the main motivations for me to work on this particular problem.
To put the whole problem in a nutshell: intruders have become smart. Nowadays they can easily break into systems and launch all kinds of attacks, such as denial of service or malware injection. You have seen recently how much hype the ransomware attacks caused, and a lot of other such attacks are happening. So I was thinking: can I build a system that handles these kinds of attacks intelligently, and how exactly would I tackle this kind of situation?
So I started looking at what kinds of intrusion detection systems are on the market. There are a lot of IDS, such as Snort, that are popular right now. They use traditional techniques; they are mostly rule-based systems. They preset some rules and then try to identify what exactly the attack is.
My approach was quite different. I asked: as humans, what are the things we are able to do? How does a network or systems administrator go about it while he or she is inspecting a log file? So I planned to design a hybridized intrusion detection system using machine learning and deep learning techniques, so that we can bring human intelligence into the system and automate the identification of intrusions, rather than having a huge number of network administrators inspecting entire log files for days on end. If we can automate that process and give the administrator a summary of what has happened in the past, it becomes much easier for him or her to understand what exactly is going on.
Let me give you more of a feel for what exactly an IDS is and how things work. An IDS is ordinary network security software. Apart from a firewall, most businesses maintain an IDS as one more layer of security. All it does is try to identify malicious activity happening in the network, especially malicious users coming in from outside the organization. There are two major types of IDS: active and passive.
An active IDS, as the word says, acts the moment it identifies an intrusion. Just to define the term again: an intrusion is a suspicious activity performed by a person who does not have the authorization to do it. We will look at some use cases as well, and you will see what kinds of problems these can be. When such a case is identified, or such a case is hit, an active IDS usually goes ahead and blocks the user, completely cutting off his access.
There are a lot of problems with that. Suppose a network administrator, or a user or employee who is authorized and authenticated, performs some activity on the system that goes beyond the rules by one or two percent. Since it is a rule-based system, more or less a set of if-else conditions designed into the system, it cuts off access for that administrator or employee as well, even when he crosses the limit by just one percent. That is one of the major issues with an active IDS.
At the opposite extreme there is the passive IDS, which takes no action of its own when it identifies an intrusion. All it does is inform or alert the network administrator, by SMS, by email, or by any other mechanism we set up.
On top of that, there are a couple of other distinctions we can look into. One is signature-based detection. In security, every attack or behavior can be defined by a signature, so any piece of malware can probably be described as a signature.
As system administrators, we usually collect a huge set of signatures and put them in a database. When the system observes any kind of activity, it tries to match it against a signature in that database, and if it matches, the activity is identified as an attack. The problem, again, is that the database has to be kept up to date at all times, and whenever there is a new attack it has to be updated immediately. That is one of the major issues. Ultimately, you can see that it will not be able to identify new attacks; it only sees past attacks and tries to predict from those.
Anomaly-based detection is what addresses this. It extracts baseline patterns. If you have a database of signatures of all the known attacks, it extracts the patterns in them, so if a new attack comes in and resembles the patterns in those signatures, it may still be predicted as an attack. That is the other approach.
Then there are a couple of other distinctions: an IDS can be network-based or host-based. Host-based means we install the software at the host level, on each and every employee's system, whereas network-based means we deploy it at a central switch or central router.
To put everything together, here is a good visualization: a tree of intrusion detection systems. One branch is the detection technique and the other is the source of data. By source of data, it is classified into host-based detection and network-based detection, whereas by technique we see anomaly detection as well as signature detection. Moving on, there are three types of approaches that can be followed. A couple of the approaches you see here are programmed. When I say programmed, I mean the rule-based systems I mentioned.
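A programmed, rule-based detector of the kind just mentioned might look like the following minimal Python sketch. The `Rule` class, the `requests_per_second` metric, and the limit of 100 are hypothetical, chosen only for illustration and not taken from the system in the talk. The sketch shows how a hard limit treats a 1 percent overshoot as a violation, and how an active and a passive response to that violation differ.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # Hypothetical rule: a metric name and a hard upper limit.
    metric: str
    limit: float


def violated(rules, observation):
    """Return every rule whose limit the observation exceeds, by any margin."""
    return [r for r in rules if observation.get(r.metric, 0) > r.limit]


def respond(mode, user, broken_rules):
    """Active mode blocks the user; passive mode only alerts the administrator."""
    if not broken_rules:
        return f"allow {user}"
    if mode == "active":
        return f"block {user}"
    return f"alert admin about {user}"


# Illustrative rule set (assumed values).
RULES = [Rule("requests_per_second", 100)]

# An authorized user who is just 1 percent over the limit.
observation = {"requests_per_second": 101}
broken = violated(RULES, observation)

print(respond("active", "alice", broken))
print(respond("passive", "alice", broken))
```

Because the comparison is a plain if condition, the active mode locks out a legitimate user for a marginal overshoot, which is the weakness of fixed rules described earlier.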
With programmed systems, you define a set of rules, a set of if conditions, and you say what counts as malicious activity and what counts as non-malicious activity. With self-learning, you use time series or machine learning. Here you do not define a complete set of rules. Instead you make the machine learn, and you give it the freedom to identify what could be malicious and what could be non-malicious. There are a lot of algorithms that can be used here.
This is just what comes up when you search for ML in cyber security. Right now there is a really huge number of applications where machine learning is needed, especially in building tools. When you build a penetration testing framework, you can automate it using machine learning. And if I talk about network security, the intrusion detection system is one of the major places where you can use machine learning because, as the problem was defined at the beginning, if you can inject human intelligence into the machine, it automates the painful process that we are doing by hand.
This is one of the research papers that was published, on machine learning for signature detection, or malware detection. The techniques used there are called fuzzy techniques. How many of you here know probability? Most of you, right. Does anyone know what fuzzy is? Cool, so a few people know fuzzy as well. When you state a probability, say the probability of rain tomorrow is 60 percent, or 0.6, you mean that out of 100 such days it rains on about 60.
That is what we generally mean by probability, whereas fuzzy is completely different. Fuzzy also uses similar values, like 0.6, but there it means the statement is true to a degree of 0.6 and false to a degree of 0.4. Fuzzy logic allows a particular statement to belong to both classes at once. Why fuzzy here? Because a situation can be malicious to some extent and non-malicious to some extent. So a fuzzy rule can say it is 0.6 malicious and 0.4 non-malicious, which allows that particular signature to be in both states. I can then set my rules: if the fuzzy value we get is, say, 0.8 malicious, I can simply block it from the system, and if it is 0.2 or 0.3, I can happily allow it. That is how the logic works.
And this, in general, is why we need to move to anomaly-based IDS: signature-based IDS cannot identify the new attacks that come up every day.
These are the major types of attacks that we are attempting to solve right now. One is DoS, denial of service, which most of you probably know. In a fraction of a second, a particular user tries to bring the whole system down by sending multiple packets or multiple requests to it. A probe attack tries to gather whatever information it can about the target host, such as the user database. Then there is user to root (U2R): a person has user-level access but tries to run root commands, access root files, and so on.
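Coming back to the fuzzy rule described a moment ago, the idea can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the input (a count of failed logins), the linear ramp between `low` and `high`, and the middle "alert" band are all hypothetical and not taken from the talk. Only the thresholds, block at about 0.8 and allow at about 0.2 to 0.3, follow what was said.

```python
def malicious_degree(failed_logins, low=2, high=12):
    """Degree in [0, 1] to which an event counts as malicious.

    Assumed membership function: 0 at or below `low`, 1 at or above `high`,
    and a straight-line ramp in between.
    """
    if failed_logins <= low:
        return 0.0
    if failed_logins >= high:
        return 1.0
    return (failed_logins - low) / (high - low)


def decide(degree, block_at=0.8, allow_at=0.3):
    """Map a malicious degree to an action using the thresholds from the talk."""
    if degree >= block_at:
        return "block"
    if degree <= allow_at:
        return "allow"
    # Assumption: anything in the middle band is sent to a human as an alert.
    return "alert"


for count in (4, 8, 11):
    m = malicious_degree(count)
    # The event belongs to both classes at once: m malicious, 1 - m non-malicious.
    print(count, round(m, 2), round(1 - m, 2), decide(m))
```

The point of the sketch is that one event carries a degree for both classes, and the response depends on where that degree falls rather than on a single hard cut-off.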
User to root is one of the important issues to tackle, and R2L, remote to local, is the other side of it: the person is not an authorized user at all but still tries to attack or gain access to the system.
Now, some of the experimentation. As with any machine learning problem, the basic approach starts with the data, and it is the same here. Initially I developed a small detection system that was trained on data sets such as the KDD Cup data set. As I moved forward, I divided the work into different layers. These are the layers I am currently working with: the network layer, the application layer, and the transport layer. The system is able to identify an attack at the respective layer.
The technique chosen was simple: a simple LSTM, basically a recurrent network with gated units, with just a single hidden unit and a single hidden layer. Why LSTM is a question that comes to everybody's mind. As I said, attacks are more or less a time series. I cannot say that an attack happened just by inspecting one particular packet. I would call it an attack when I can identify the behavior across the past series of packets coming from one particular source within a certain time frame. So I need a technique that can take all of that into consideration, process the information, and give me the result. This is the simple technique I chose for that.
For feature selection, I first chose decision trees and random forests, specifically to extract the important features.
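The feature-selection step can be illustrated with a much simpler stand-in. The sketch below ranks categorical features by information gain, the split criterion used by classic decision trees, and keeps the top k. It is not the decision-tree and random-forest pipeline from the talk, and the feature names (`syn_flag`, `protocol`, `ttl_bucket`) and the toy rows are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_features(rows, labels, k):
    """Return the k feature names with the highest information gain."""
    names = list(rows[0].keys())
    ranked = sorted(names, key=lambda f: information_gain(rows, labels, f), reverse=True)
    return ranked[:k]


# Toy data (assumed): eight packets, four labelled attack and four normal.
syn = [1, 1, 1, 1, 0, 0, 0, 0]
proto = ["tcp", "tcp", "tcp", "udp", "tcp", "udp", "udp", "udp"]
ttl = ["high", "low", "high", "low", "high", "low", "high", "low"]
labels = ["attack"] * 4 + ["normal"] * 4
rows = [{"syn_flag": s, "protocol": p, "ttl_bucket": t} for s, p, t in zip(syn, proto, ttl)]

print(top_features(rows, labels, 2))
```

In this toy set one feature separates the classes perfectly, one helps a little, and one carries no information, so keeping the top two drops the uninformative one. A real packet feature set would also contain continuous values, which this sketch does not handle.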
If you inspect the whole packet and extract all the information from it, you get around 225 features. That is a really huge set. In a production environment, if I have to inspect all 225 features and then apply machine learning, it may take seconds, which is not acceptable for me. My system has to be fast and accurate enough that when an intrusion comes in, I detect it quickly and block the intruder immediately, so that I do not give him access or lose important information. So it is clear that only the features that really matter should be considered, and there is a strong need for feature selection. That is why we use simple decision trees and random forests at the initial level to identify the best features.
Again, we identify the attacks layer by layer, rather than at just one layer, so the detection happens across a series of layers. First, at the application layer, these are the features that help in finding the attacks, whereas at the network layer, these are the features that helped the most.
Now some results. On the training data set that we collected, the accuracies we got were around 98 to 99 percent. But we have also tested it with 19 businesses in India. It is a product of a startup called iCyberSol.
It is one of our major products. We tested it with 19 businesses for about 100 days and got around 88 to 90 percent accuracy. One major thing we observed is that the penetration testers there continuously tried to gain access and attempted new attacks that were not in the database we created. Out of every 100 attacks they attempted, nearly 70 to 80 were new, and the system was able to identify most of them. That is where we can say how well the system is working.
The best thing is that it does not all happen in the cloud: for production, we deployed the whole system on a Raspberry Pi with an Intel Movidius NCS stick. That is one of the fantastic things I can say: the entire computation happens on just a Raspberry Pi board. Even the hardware is just a small thing; I guess most of you know what Raspberry Pi boards are. You can take any small microcontroller, plug in the stick, and deploy it.
As for the open research issues right now: first, as I said, it has to be much faster. It should be able to detect within a fraction of a second. Another issue is data sets. The data sets are still not updated for current-day attacks, and we need really, really huge data sets to improve efficiency and accuracy. That is one more issue we have right now.
And yes, this can be improved further. At 89 percent accuracy, there is still 11 percent, a lot of cases, that it is missing. I cannot compromise on even one such failure, because the whole organization's security is going to depend on this. So the accuracy has to be really high, around 96 or 97 percent.
This is a very critical system, and it has to work. So that is about the system. I will be available here, and we can talk in more detail about it. In just 25 minutes I gave a quick introduction to how the whole system was developed, and if you have any questions you can always feel free to reach out to me on these links.
A quick introduction about myself: I am the founder of iCyberSol and of a non-profit organization called society. iCyberSol is in the cyber security domain, and society is in the AI domain, with a not-for-profit motive. We train students, developers, faculty, and others in machine learning and deep learning. I am proud to say that in the past 6 months we have trained nearly 10,000 students and 1,200 faculty members through different programs. Since it is not for profit, for anybody who wants such trainings, we are happy to do it online as well; you can always reach out to me. And if it is in India, the whole team can fly to your place and teach your students. That is what we are into. Thank you so much. I will be available here, and you can post any questions.
Right now, the community edition will be under the OWASP organization. OWASP recently approved my project, so we are open sourcing the community edition under it. The source with the basic features is being open sourced under OWASP, and it will be published soon.
Yes, the entire artificial intelligence and machine learning part is all Python. Yes. And the whole packet inspector that we have written uses some Bash.