Hello, everybody. My name is Svetlana, and it's my first time here. It's cool to be here, and I'm glad to be part of the party. As for my co-speaker, well, I hope he's on his way here. You know, he likes canoeing, so I took an airplane, he took a canoe, and I'm here now. My co-speaker might be in the middle of the ocean.

Today I'm going to talk a lot, and I'm going to talk about things I'm sure you know well. It's all about shellcodes, zero-days, and memory corruption vulnerabilities. We are in 2012 now, so why should we care about shellcodes at all? We all know that it's a pretty old technique, too old for Web 2.0, too old for some cloud stuff, and if I remember everything correctly, such attacks were first published in 1999. We can also look at some Microsoft security reports about malware propagation, which say that the number one reason for malware propagation now is not zero-days. Surprisingly, it's user unawareness, and the role of zero-days is less than one percent. As for endpoint security, okay, it deals well with malware, and in such circumstances, why should we care about the unknown at all?

But let's look at the other side of the coin. Yes, memory corruption is still there. We still have a huge bunch of code written, and programmers are still making mistakes, still introducing vulnerabilities into their code, into their products. We remember the Microsoft report about the vulnerability in the Remote Desktop Protocol. We also know that tools like the Metasploit Framework and those related to it are widely used now, by black-hat communities and by pentesters also. And we shouldn't forget about targeted attacks on critical infrastructure such as planes, trains, and water pumps. It's really serious, and we should care about it because it can lead to human casualties. So it's worth detecting such attacks as fast as possible. And about endpoint security again: it's mostly signature-based, and it can do nothing against zero-days. Okay, that's another point of view.
So, this is why we should care about shellcodes. Take a CTF competition. During CTF games, teams usually write zero-day exploits with the help of automatic tools such as Metasploit and others, or even manually from scratch. So during the game, the whole game network is full of exploits all the time. If your team is able to detect such exploits and analyze them, you can profit from it: you can gather ideas from other teams, and you can increase your defense level.

From another point of view, we live in a digital era. The truth is that we trust almost all fields of our life to digital devices such as cell phones, laptops, and others. We trust them with bank accounts, health records, and personal private information. We also share it with social networks and cloud providers. The problem is that such devices use a huge amount of different software written by programmers who work under time pressure and resource limitations, with managers who control quantity and not quality. And here are IBM statistics: the number of vulnerability disclosures is increasing each year. As you can see, the trend is not so positive.

Okay, what do we have in terms of shellcode detection methods? Mostly they exist in research papers only. If someone knows of available open source tools, please tell me, I would be very happy. As for the tools described in research papers, we can divide them into two classes, static analysis methods and dynamic analysis methods, plus hybrid ones. The most common techniques used by static analysis are listed here: signature matching, control flow graph analysis, instruction flow graph analysis, NOP-sled detection, and methods of abstract execution. Dynamic analysis methods are represented by emulation and automata analysis techniques. Hybrid analysis methods use all of those techniques. If we look at those methods, we can notice that none of them can detect every type of shellcode.
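To make the last point concrete, here is a minimal Python sketch of one of the static techniques named above, NOP-sled detection. It is an illustration only, not the detector from the talk: the function names and the `min_len` threshold are assumptions, and it only looks for runs of the single-byte x86 NOP opcode `0x90`.

```python
def longest_nop_run(data: bytes, nop: int = 0x90) -> int:
    """Return the length of the longest run of the given NOP byte."""
    longest = current = 0
    for byte in data:
        if byte == nop:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def has_nop_sled(data: bytes, min_len: int = 16) -> bool:
    """Naive check: flag data containing at least min_len consecutive 0x90 bytes.

    min_len is a hypothetical threshold chosen for this illustration.
    """
    return longest_nop_run(data) >= min_len
```

A sled built from other single-byte instructions, for example `0x41` (`inc ecx` on 32-bit x86), passes this check unnoticed, which is one small instance of a single technique not covering every type of shellcode.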
So, if we want to detect everything, we can simply try to execute the methods one after another, but it would be extremely slow. It's a boring slide, let's skip it.

Now I'm going to do a little bit of science. Why is shellcode detection feasible at all? In contrast to viruses, which are rich in features, shellcodes have certain size limitations and structure limitations. And given a set of shellcode detection algorithms, why don't we try to construct a classifier that will be optimal in terms of false positive rate and execution time? At that point, we also shouldn't forget about the false negative rate.

At the first step, we try to identify shellcode features. They could be generic, or they could be specific. Some of them can be detected only by static analysis, some of them only by dynamic analysis, and some examples are listed on the current slide. Given such a set of shellcode features, we can divide the shellcode space into several classes, in such a way that one class can include one or more shellcode features.

Okay, here's an example of what I said on the previous slide. As specific features we could name, for example, correct disassembly from each and every byte offset, or the existence of a multi-byte instruction. As common features we could name, for example, correct disassembly into a chain of at least K instructions, and so on. In total we identified 19 classes, and here's a significant remark: none of the existing shellcode detection methods provides complete coverage of the identified classes.

During the analysis of existing shellcode detection methods, we noticed that almost all of them can be represented as some kind of combination of elementary classifiers, or detectors of specific shellcode features. Moreover, all of them use some common steps during their analysis, like the disassembly stage, reconstruction of the control flow graph, and reconstruction of the instruction flow graph.
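The coverage remark can be illustrated with a small Python sketch. Everything in it is a toy assumption: the three class names stand in for the 19 classes mentioned in the talk, and the mapping from methods to the classes they cover is invented for the example, not taken from the research.

```python
# Hypothetical, heavily reduced stand-in for the 19 shellcode classes.
SHELLCODE_CLASSES = frozenset({"plain", "nop_sled", "encrypted"})

# Invented coverage table: which classes each method is assumed to detect.
METHOD_COVERAGE = {
    "signature_matching": frozenset({"plain"}),
    "nop_sled_detection": frozenset({"nop_sled"}),
    "emulation": frozenset({"plain", "encrypted"}),
}


def uncovered(classes, methods):
    """Return the classes that none of the given methods covers."""
    covered = set()
    for name in methods:
        covered |= METHOD_COVERAGE[name]
    return set(classes) - covered
```

In this toy table, `uncovered(SHELLCODE_CLASSES, ["emulation"])` leaves the `nop_sled` class undetected, while passing all three methods leaves nothing uncovered. That is the motivation for combining elementary detectors rather than relying on one method.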
Thus, it seemed reasonable for us to implement the shellcode detection library in the way described on the current slide. Here is the main idea of our hybrid shellcode detector. We try to construct an optimal data flow graph from the elementary classifiers implemented in the shellcode detection library. If some classifier concludes that a flow is legitimate, that flow isn't passed to the other classifiers. And if we put the classifiers that run faster at the top of such a topology, we can filter out legitimate flows as fast as possible.

So, that's how it works. We are given a set of elementary classifiers. At the next step, we choose from them those classifiers that provide complete coverage of the shellcode classes detected by the entire set and that are optimal in terms of false positive rate and execution time. Then we construct a layer of the resulting graph and repeat that step. We end up with a decision-making model that analyzes all the output from the elementary classifiers and concludes whether the flow is legitimate or malicious.

As one of the important goals of this work was to minimize the false positive rate, we noticed that we could also achieve that goal with a simple linear topology, where we execute one elementary classifier after another and no flow is filtered out. And here are the evaluation results. We compared the hybrid topology with the simple linear topology on four different datasets: an exploit dataset, with exploits generated by the Metasploit Framework; benign Windows and Linux binaries; random data; and multimedia. And here is the visualization. The red line stands for the linear topology, and the blue line stands for the hybrid topology. As you can see, on some datasets the hybrid topology is up to 45 times more effective than the linear one.

So, there are a couple of use cases for the hybrid classifier. It can be used as a detection and filtering tool for zero-days in the network.
It can also be used in CTF competitions, as it could help to increase the defense level of a team and could help to gather zero-days from other teams.

Just a second. It's a little bit frustrating. So, you can download the tool from Gitorious. Build it... Sorry. So, here is the use case of detecting shellcodes in the network. And here you can see the use case of detecting shellcodes in files. We're using exploits generated by different modules of the Metasploit Framework. See, we are detecting different classes of shellcodes: plain shellcodes, shellcodes that contain a NOP sled, and also ones that contain a decryptor.

So, I'm done. Good news, everyone: shellcodes may now be detected, sometimes up to 45 times faster than before. You can download our tools and use them. If you have any questions, here's some information about me and my co-speaker Dennis, and also information about the research papers used in this presentation. Thank you.
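As a rough illustration of the hybrid idea from the talk (cheap classifiers first, with legitimate flows dropped early), here is a minimal Python sketch that compares it with a linear topology in which every classifier always runs. It is not the tool's implementation: the `Stage` type, the three toy checks, their relative costs, and the use of `all(...)` as the final decision are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    cost: float  # hypothetical relative execution time
    may_be_shellcode: Callable[[bytes], bool]


def run_hybrid(data: bytes, stages: List[Stage]) -> Tuple[bool, float]:
    """Run the cheapest stages first; stop once a stage calls the flow legitimate."""
    spent = 0.0
    for stage in sorted(stages, key=lambda s: s.cost):
        spent += stage.cost
        if not stage.may_be_shellcode(data):
            return False, spent  # flow dropped early as legitimate
    return True, spent


def run_linear(data: bytes, stages: List[Stage]) -> Tuple[bool, float]:
    """Run every stage regardless of earlier verdicts, then decide."""
    verdicts = [stage.may_be_shellcode(data) for stage in stages]
    return all(verdicts), sum(stage.cost for stage in stages)


# Toy stand-ins for elementary classifiers; real ones would disassemble or emulate.
STAGES = [
    Stage("emulation_stand_in", 50.0, lambda d: b"\xcd\x80" in d),  # int 0x80
    Stage("size_check", 1.0, lambda d: len(d) >= 16),
    Stage("high_bytes", 2.0, lambda d: any(b >= 0x80 for b in d)),
]

benign = b"GET /index.html HTTP/1.1\r\n"
suspicious = b"\x90" * 16 + b"\x31\xc0\xcd\x80"
```

With these toy stages, both topologies return the same verdict for `benign` and for `suspicious`, but the hybrid run on `benign` stops after the two cheap checks and never pays for the expensive stage. That is the effect behind the speedup on mostly legitimate data.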