Okay, nowadays most companies face the problem of needing to accelerate the development process; the average sprint is about two weeks. Because of that, most companies use open source more and more. Actually, this number is an under-approximation: in some domains the real use of open source is much higher. And this is basically great, because open source is there for us to use, in most cases the license is fine, and it solves our problems very quickly. The main problem is that open source, like basically any other software, contains problems, and among other things it contains security vulnerabilities, and we must remediate those vulnerabilities to make sure they do not put our application at risk. In an ideal world we would fix all the problems immediately. But in real life we need to prioritize the work, because we have limited resources and cannot remediate everything at once. So we need to decide which vulnerability to work on first, and which we can postpone to later sprints. Without any special technology, the standard prioritization technique is just looking at the severity. But severity is not enough: there are many examples where a high-severity vulnerability does not put the application at risk at all, while a medium-severity CVE can put the application at very serious risk. So we need more than severity in order to prioritize the work and decide which CVE we will work on now, and which we will postpone to a later sprint. What if we had a technology that could tell us that some of the vulnerabilities do not put the application at risk at all, even if they carry a high severity rating? If we knew that, we could postpone the work on those CVEs to a future sprint, or not work on them at all.
It doesn't matter, because they are not hurting the application. The main idea is to look at the library and at the CVE. A CVE, a vulnerability, is basically just a piece of code in the application, so what we need to do is check whether that piece of code is reachable by the application. If we know that the vulnerable code cannot be reached by the application, then it is perfectly safe to use that open source, regardless of the fact that it contains a security vulnerability. This leads us to the main notion of vulnerability effectiveness. What is vulnerability effectiveness? We say that a CVE in some open source component is effective if there exists some execution path from the application, from the proprietary code, to that piece of vulnerable code. In that case we know an attacker can take advantage of the CVE: they can craft some input that leads the program there and does something harmful, so it puts our application at risk. On the other hand, if we know that the piece of code behind the CVE is unreachable on every input, that it is basically dead code and we cannot reach the vulnerability at all, no matter what the input or the use case is, then we can safely say that this CVE is ineffective: it is there, but it will not cause any problem to our application. Okay, so this is the technology we want to develop: technology that takes the application, takes all the open source components, and decides which CVEs can be reached and, more important than that, which CVEs cannot be reached. Before doing that, let's look for a moment at this illustration.
This illustration describes the ratio between dependency code and proprietary code. When the proprietary code uses dependencies, it has a small number of direct dependencies, the ones we pull in through our dependency manager. But each of those dependencies is itself a program, so it has its own direct dependencies, which for our proprietary code are second-level dependencies, and those have third-level dependencies, and so on. So obviously most of the code lies in the indirect dependencies, and therefore most of the CVEs, most of the open source vulnerabilities, will also come from indirect dependencies. Therefore, if we want to analyze the program and understand which CVEs can actually damage our application, we need to analyze everything: not just the proprietary code, as some SAST tools do, and not just the direct dependencies, as some SCA tools do. We need to analyze the proprietary code, the direct dependencies, and the indirect dependencies, and only then can we conclude that some vulnerabilities are ineffective. Okay, so let's examine this idea in a bit more detail using the illustration. Suppose we have some application, proprietary code, and this application uses some libraries as direct and indirect dependencies. One of the dependencies has a function G that uses another library, which in turn uses another library with two reported vulnerabilities, CVE1 and CVE2. Suppose that the function G is the only connection from the proprietary code to that library, and that G uses only two of its APIs, API1 and API3.
Now, API1 calls the code of CVE1, so we say CVE1 is effective, because we have a trace from the proprietary code to the vulnerable code: the trace goes through G, then API1, and then to the CVE. But, more importantly, CVE2 is ineffective: regardless of the input, we cannot reach CVE2, because it would only be reached through some other API of the library, not API1 and not API3. So from the application's point of view it is perfectly safe to use that library with respect to CVE2. This is the main idea of giving these marks to CVEs. A standard SCA tool will say: these are your dependencies, and some of them can put the application at risk because they contain security vulnerabilities. With this technology we again get the list of libraries and their CVEs, but each CVE gets an additional mark, effective or ineffective. If something is marked ineffective, it is unreachable on every execution path, including reflective calls into the library, if we have reflection in Java or something similar in other languages. So we want to develop a technology like that, and please notice that the important mark is "ineffective", not "effective", because without this technology we are assuming that everything is effective. With this technology the effective CVEs stay effective, but we succeed in proving that some of them, hopefully most of them, are ineffective. It means they cannot harm the application on any execution path, so in any practical use we can use that open source safely with respect to those CVEs. Okay, so we want to develop technology that calculates that. How can we do it? It is not in the scope of this presentation to explain deeply how we do it, but I will try to give you some intuition, and if you have questions, just stop me and ask as we go, or at the Q&A session. So let's look at this hello world Python program.
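The program described next might look something like the following sketch; the exact class and library names (Monkey, Banana, Worm, Carrot) are assumed from how the talk describes them, not taken from real code.

```python
# --- "animal" library (dependency 1) ---
class Monkey:
    def __init__(self):
        self.food = None

    def put(self, food):
        self.food = food

    def eat(self):
        self.food.eat()

# --- "food" library (dependency 2) ---
class Banana:
    def eat(self):
        print("eating a banana")

class Worm:
    def eat(self):
        print("eating a worm")

class Carrot:              # never instantiated: unreachable from main
    def eat(self):
        print("eating a carrot")

# --- proprietary code ---
def main():
    m = Monkey()
    m.put(Banana())
    m.eat()                # dispatches to Banana.eat
    m.put(Worm())
    m.eat()                # dispatches to Worm.eat

main()
```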
In this program we have some proprietary code, the main function, and two dependencies: an animal library that contains, for example, a Monkey class, and a food library that contains, for example, implementations of Banana, Carrot and Worm. So we have proprietary code and two dependencies, and we want to understand the relationships between the components of the program. We start analyzing the program and see, for example, that at this location we are calling a constructor. A location where we call some function is called a call site, and the function being called is the target of the call site. So at this call site the target is the Monkey constructor, and at that location the target of the call site is the Banana constructor. At this location we are calling put, and if we keep examining the program we see another call to put, a call to the Worm constructor, and then a call to eat. Inside eat we are calling food.eat, and if we examine the program we see that at that location the only functions that can be called are Banana.eat or Worm.eat; those are the only two possible targets at that location. This is basically the relationship we need in order to understand which elements are reachable and which are not: we need to find all the call sites in the program, and at each call site we need to determine the exact targets, and we need to do it for the whole program, for the open source, the indirect dependencies, and the application itself. If we have these relationships, we can use them to solve the vulnerability effectiveness problem. Luckily, there is a data structure that helps us build exactly this. The data structure is called a call graph.
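The call-site/target relations of the toy program can be written down as a small graph and queried with a plain breadth-first search; this is a sketch with hand-written edges (the function names are the assumed ones from the example, not extracted by a real tool):

```python
from collections import deque

# Nodes are functions; an edge means "this call site may call that target".
call_graph = {
    "main":       ["Monkey.__init__", "Monkey.put", "Monkey.eat",
                   "Banana.__init__", "Worm.__init__"],
    "Monkey.eat": ["Banana.eat", "Worm.eat"],
}

def reachable(graph, roots):
    """BFS from the proprietary-code entry points."""
    seen, work = set(roots), deque(roots)
    while work:
        fn = work.popleft()
        for callee in graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                work.append(callee)
    return seen

seen = reachable(call_graph, ["main"])
print("Banana.eat" in seen)   # True  -> a CVE there may be effective
print("Carrot.eat" in seen)   # False -> a CVE there is ineffective
```

Given an accurate graph, deciding effectiveness is exactly this reachability check from the proprietary functions.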
So, a call graph is just a standard graph in computer science: the nodes of the graph are the functions in the program, and there is an edge between two nodes if there is a potential call from one function to the target function. Going back to the hello world Python program from the previous slide, we have an edge from main to the Monkey constructor, from main to Monkey.put, from main to Monkey.eat, and so on. If we examine Monkey.eat, we see that it has an edge to Banana.eat and to Worm.eat. And if we keep looking at the program, we see that the Carrot constructor is not reachable from main. So if we have this call graph, we can reduce the problem of calculating vulnerability effectiveness to the problem of building the call graph: if we know that Carrot.eat has some vulnerability, we just need to build the call graph and do some kind of reachability, BFS, DFS, whatever, from the proprietary code functions. If the vulnerable element is reachable from one of the proprietary functions, we know the vulnerability may be effective; otherwise we know for sure that the function cannot be reached, therefore the vulnerability is ineffective, and we can safely live with it in our code. All of this is under the assumption that we have an accurate call graph; if the call graph is wrong, this will not be true. So we need an accurate call graph.

Okay, so this is what we want to do, and it sounds very easy: just build the call graph. The problem is that building an exact call graph is an undecidable problem. It is impossible, not today, not tomorrow, and not in a thousand years, on any machine equivalent to a modern computer. So we cannot build the exact call graph; what we can do is approximate it. There are several approximation techniques for building a call graph. One family of techniques produces a call graph which is an under-approximation of the original call graph; such techniques are called complete. To understand that, look at the tiny program on the bottom left, and suppose all of it is inside the main function. The original call graph of this tiny program contains two edges: from main to F and from main to H, and G is not reachable. But the algorithm produces a call graph with just one edge, from main to F. If an algorithm produces such a graph for every program, it is called complete, and the graph it generates is an under-approximation. Each missing edge, for example the missing edge from main to H here, is called a false negative. So a complete algorithm produces false negatives, the data missing from the original graph, because it is an under-approximation. Now let's think for a moment whether this is good for our use case. Of course not, because we want to calculate vulnerability effectiveness, and we said the important thing is the ineffective vulnerabilities. With a complete call graph we know we may be missing edges, so when we see that something is not reachable, maybe it is unreachable only because of a missing edge. So we cannot use a complete call graph to solve our problem, although a complete call graph is very easy to build: we just need to run the program and record what happens during execution.
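A minimal sketch of that run-and-record idea, using Python's `sys.settrace` hook: every call observed at runtime becomes an edge, and anything the inputs never exercise is simply missing, which is exactly the false-negative problem.

```python
import sys

edges = set()

def tracer(frame, event, arg):
    # On every function entry, record a caller -> callee edge.
    if event == "call" and frame.f_back is not None:
        edges.add((frame.f_back.f_code.co_name, frame.f_code.co_name))
    return None  # no line-level tracing needed

def f(): pass
def g(): pass          # never called on this run

def main():
    f()

sys.settrace(tracer)
main()
sys.settrace(None)

print(("main", "f") in edges)  # True: this edge was observed
print(("main", "g") in edges)  # False: a missing edge, a false negative
```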
Then we can see the relationships between the functions. Dynamic tools are basically based on that approach, and they generate an under-approximation, which is not good for us because we cannot calculate the ineffective vulnerabilities with it. In addition, we need a good approximation of the indirect dependencies, as we discussed, and even if we have great tests covering the application, even 100% test coverage, we measure that coverage only on the proprietary code, not on the open source. Getting tests that cover the reachable open source, not even all of it, at 100% is almost impossible and very difficult to achieve, especially for the indirect dependencies. So for all of those reasons, dynamic algorithms do not fit our problem.

Okay, so if we cannot use an under-approximation, maybe we can use an over-approximation: we can generate a graph that contains the actual graph but maybe some more edges. The extra edges that are not real are called false positives, and an algorithm that produces all the edges we need, plus some false positives, is called sound. A sound algorithm is basically better, because if something is not reachable in a sound call graph, we know it is not reachable in reality, in the actual call graph. This is very good for us and may solve our problem. To illustrate, let's look again at our tiny program. This graph is an over-approximation of the actual graph: we have an edge from main to H and from main to F, which is great, because that is the actual call graph, but we also have some false positive edges, an unnecessary edge from main to T, and an edge from main to G, because the algorithm didn't notice that there is an infinite loop and we can never actually call the G function. Those kinds of algorithms are good, and we can basically use them for our problem, but we need to make sure we don't have too many extra edges, because it is very easy to build a sound call graph; it is a trivial problem. At every location we can just say that everything is reachable. That gives a perfectly sound call graph, but not an accurate one: everything is a false positive, everything is reachable, and in that case every vulnerability will be effective and we will not find any ineffective ones. Our thesis, which we want to prove, is that most vulnerabilities are ineffective, because we use only a fraction of the APIs of our libraries, and intuitively it looks like most CVEs should not affect the application. So if everything comes out effective, it is useless for us. We need to build a sound call graph with very good accuracy, designed for our problem, so we can show that most of the vulnerable code is unreachable. There are also tools that are neither sound nor complete: they just generate some graph with both false positives and false negatives, for example a graph that contains the real edge from main to F but misses the edge from main to H and has an extra edge from main to G. Those tools are obviously not good for us either.

Okay, so we want to build a sound call graph, and this is not an easy task. Let's look at the trivial ways to build a sound approximation of the call graph. If we have a strongly typed language like Java, there is a very easy and trivial way: if we want to know what can be called from some location, we know it can only be a method coming from the inheritance hierarchy of the declared type, so the type itself, if it is not an interface, and all the descendants of that type. This is the class hierarchy approach, and it could give a good sound approximation of the call graph.
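The class-hierarchy idea can be sketched in Python terms like this: at a call site `food.eat()` where `food` is declared as `Food`, assume the target may be `eat()` in `Food` or in any of its subclasses. The class names here are illustrative, not from any real tool.

```python
class Food:
    def eat(self): ...

class Banana(Food):
    def eat(self): ...

class Worm(Food):
    def eat(self): ...

class Carrot(Food):
    def eat(self): ...

def cha_targets(cls, method):
    """Every class in cls's hierarchy that defines `method`."""
    hierarchy = [cls] + cls.__subclasses__()
    return {f"{c.__name__}.{method}" for c in hierarchy if method in c.__dict__}

# Class-hierarchy analysis over-approximates: even Carrot.eat is listed
# as a possible target, although a more precise analysis could rule it out.
print(sorted(cha_targets(Food, "eat")))
# ['Banana.eat', 'Carrot.eat', 'Food.eat', 'Worm.eat']
```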
The problem is that this approach leads to a lot of false positives and will not work for our task, because almost everything becomes effective. For example, imagine you have some interface in a library, and many other libraries implement that interface, hundreds of implementations; then every use of that interface jumps to every implementation. Eventually most of the code becomes reachable with that approach, so it is not good for us. Still, it is a nice and very efficient method, so if someone can live with that approximation, it can be a good way to approximate the call graph. In addition, it does not deal with reflective calls, like reflection in Java. And in some languages, like Python, we don't have static types at all, so we cannot use it there. For languages without types there are other approximations. Some IDEs do something like this when you press "jump to declaration": these are called name-based approaches, a family of very similar algorithms. They look at a function call and at the signatures of candidate functions, the name, the arguments and so on, and assume that every function with a matching signature is a possible target. For example, Monkey.eat calls an eat function, and eat has the same signature as Monkey.eat itself, so the algorithm mistakenly thinks there is a recursive call. So this is obviously also not good for us: again, it does not deal with reflective calls, and even if it did, it produces too many edges in the call graph, a lot of false positives, and again everything becomes effective. Here, by contrast, is the actual call graph we need for this tiny program.

Okay, so these are the methods we have discussed so far, besides the one entry we will get to in a moment. There is the dynamic method; it is good in the sense that it contains no false positives, but it is complete, an under-approximation, so it is not good for our use case: we cannot calculate vulnerability effectiveness with it, because when something looks unreachable we don't know whether we are missing an edge or it is really ineffective. In addition, it is very complex to deploy such a tool, because it requires inputs to execute the program and an execution environment; we cannot just point the tool at some repo and say "scan it". We need to execute the code, and we need to generate data that exercises the code with good coverage of the open source, which is even more difficult. So dynamic methods are not good for us. Static methods are better if we give a sound approximation of the call graph, and we showed, just for illustration, the simple static methods: the class hierarchy method if we have a strongly typed language, and the name-based methods if we don't. But those are not good either, because they do not deal with reflective calls and they generate too many edges, so eventually everything becomes effective. There is one more well-known method in the literature, which is basically great: points-to analysis, in the style of Andersen's analysis. By itself it also does not deal with reflective calls, but it can easily be extended to do so; that is not so difficult. It is very precise and good for our use case, and there are many open source implementations of it. The problem with that method is that it does not scale. If we try to apply it to even thousands of lines of code, not to mention the millions or hundreds of millions of lines in our use case, because we analyze everything, the application and all its dependencies, it will not terminate in reasonable time, and we cannot use it.
So basically, in the literature as it stands, there is no good way to approximate the call graph for this use case, because of scalability: we want to analyze the application together with its dependencies, and that means millions of lines of code. Because of that, in our company we developed a new approach: a new algorithm designed for this security-vulnerability problem which, from the evaluation we will see in a moment, proved to be almost as precise as pointer analysis on one hand, and almost as fast as class hierarchy analysis on the other; when we compared them, scalability was nearly the same as the simple static methods, with precision close to pointer analysis. Okay, so this is what we did. We will not describe the algorithm here, but we will try to give you the intuition. The class hierarchy method is a great method: it gives us the types, basically all the information. But we cannot use it, because most languages do not have static types, like Python; that is one reason. Another reason is that it yields too many types, too many false positives. And it does not deal with reflective calls, but let's leave that aside for a moment. So we want to improve on the class hierarchy method. What we do is calculate types, but the dynamic types rather than the static types. Basically, if we have some interface with hundreds of descendants, but one factory that generates one particular instance, then we know the type is that instance's type, and not all the other descendants. So we built an abstract interpreter for the program, one that can very efficiently interpret the program and understand the dynamic type at each location of the program. I will not go into detail, but I will try to give you the intuition using this running example.

On the top right you see the types we are calculating, and on the top left you see the interpreter location. So, we want to analyze this program. We have a call site, a call to the Monkey constructor, so the interpreter builds the call graph on the fly and records that there is a call to the Monkey constructor; that is the first edge of the call graph. When we jump into a method, we look at the types we know so far and bind them, carrying the types already calculated up to this point. So when we jump into this constructor, we know that self is bound to the Monkey type, because this is Monkey's constructor. Now we know that the food field of the class is bound to no type, because we assign None to it in Python. This is an interpreter, so it knows to return to the call site, and it knows that the return type of a constructor call is the class, so it assigns the variable m the dynamic type Monkey. Then there is some integer operation, and then we call put. As before, we jump into the Banana constructor and return from it, so the returned type is Banana. Now we want to jump to put. But which function is put? We know the type of m is Monkey, so at that location we are calling Monkey.put. We jump into Monkey.put, and at the jump point we bind the types we know: m's type is Monkey, so self is bound to Monkey, and we just returned from the Banana constructor, so the first argument is bound to Banana. Inside, there is an assignment of the local variable food, with type Banana, to the food field, so at that point the food field is Banana; previously it was None, now it is Banana. Then we get to the Worm and run the same scenario again, but now the food local variable is bound to Worm, not Banana, so after the assignment we know that the food field can be Banana or Worm during execution. The interpreter then continues to the eat call. We want to jump to eat; which eat is it? We look at the type of m: it is Monkey, so we jump to Monkey.eat. Inside, we want to jump to food.eat. What is food.eat? We look at the type of food: it is Banana or Worm, so we know we are calling eat of Banana or Worm, and we add those call targets. This is just a very high-level description, but by calculating the dynamic types during the abstract execution, we know how to build the call graph, and we know how to deal even with reflective calls.

Okay, but how accurate is it, and how fast is it? Does it scale to our problem? Let's see the evaluation. First, we want measurements for different metrics. The first metric is accuracy: does it build the right call graph? This is very difficult to measure, because we don't have the actual call graph for a real application. So we did something a bit tricky: we took many projects, ignored all their open source, but chose projects with very good test coverage, close to 100%. In that case we assume, and this is an assumption, that when we run the application with those tests, the resulting call graph, restricted to the scope of the application without the open source, is close to the actual call graph, because the tests basically cover all the execution paths of the application. Ignoring the open source is something we cannot do in our real use case, where we must analyze the open source, but it is fine just for the measurement.
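Before the numbers, the dynamic-type-tracking idea from the walkthrough above can be sketched very roughly: the analysis keeps, for each variable and field, the set of types it may hold, and resolves each call site against that set instead of against the whole class hierarchy. This is illustrative only, not the real algorithm.

```python
types = {}  # abstract state: name -> set of possible dynamic types

def assign(name, type_name):
    """Record that `name` may hold a value of `type_name`."""
    types.setdefault(name, set()).add(type_name)

def targets(name, method):
    """Resolve a call site `name.method()` against the tracked types."""
    return {f"{t}.{method}" for t in types.get(name, set())}

# Abstractly interpreting the toy program:
assign("m", "Monkey")          # m = Monkey()
assign("m.food", "Banana")     # m.put(Banana()) binds the food field
assign("m.food", "Worm")       # m.put(Worm()) adds a second possible type

print(sorted(targets("m.food", "eat")))  # ['Banana.eat', 'Worm.eat']
# Carrot is never assigned anywhere, so Carrot.eat is never a target.
```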
So just for this measurement, we ignore the open source, build the dynamic call graph and the static call graph, and compare them, under the assumption that the dynamic call graph on those projects is close to the actual call graph, because we chose tests with good coverage. This table is from the evaluation we did for Python. A word on the metrics: precision is a way to measure false positives, so 90% precision means 10% of the edges are false positives; recall measures false negatives, so 80% recall means 20% false negatives. We saw that the precision is almost 90%, meaning only about 10% false positives, so we have a very accurate call graph, and we saw that the recall is also great. But this is a bit different from what I told you earlier: I told you we have a sound algorithm, and if the algorithm is sound, the recall must be 100%. So why do we have missing edges here? Because in this evaluation we did not use the reflective-call feature. The algorithm supports reflection, but we did not enable it in this evaluation, so we missed all the edges that come from reflective calls, and those edges now show up as false negatives. Okay, so we saw the graph is really accurate, but what about performance, and the ability to calculate ineffectiveness? We ran tests in .NET and in Java; let's see a piece of the .NET evaluation and look at Java in more detail. This is part of the .NET evaluation: we took many public projects from GitHub that contain vulnerabilities, ran our analysis, checked what is effective and what is ineffective, and saw that 63% of the vulnerabilities were ineffective. We did the same in Java, on hundreds or even thousands of projects, with automation that scanned GitHub offline; here we see just a piece of that evaluation.
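The comparison described above, static graph against a dynamic ground-truth graph, can be sketched as a simple set computation over edges (the edge sets here are made up for illustration):

```python
def precision_recall(static_edges, dynamic_edges):
    tp = len(static_edges & dynamic_edges)   # edges both graphs agree on
    fp = len(static_edges - dynamic_edges)   # extra static edges (false positives)
    fn = len(dynamic_edges - static_edges)   # edges the static graph missed
    return tp / (tp + fp), tp / (tp + fn)

# Dynamic graph recorded under near-100% test coverage, taken as ground truth:
dynamic = {("main", "f"), ("main", "h"), ("f", "g")}
# Static graph produced by the analysis, with one spurious edge:
static = {("main", "f"), ("main", "h"), ("f", "g"), ("main", "g")}

p, r = precision_recall(static, dynamic)
print(p, r)  # 0.75 1.0 -> 25% false positives, no false negatives
```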
On average, more than 70% of the vulnerabilities that these GitHub applications were pulling in through their dependencies were ineffective. And here we can see the running time against the number of lines of code. The line counts are huge, because even a tiny application uses a lot of open source, and here we count everything, the open source included. So we saw that most of the vulnerabilities are ineffective. Okay, let's see a small demo now. We talked about Python, .NET and Java, but we didn't talk about JavaScript, so let's do the demo in JavaScript. For this demo I just downloaded an arbitrary project from GitHub that has vulnerabilities and scanned it; this is the link to the project. So let's scan that project. When we scan a project, we analyze it, find the dependencies, and see which dependencies have vulnerabilities. We can see that it took 27 seconds to scan this project, and it contains four vulnerabilities. This one contains two CVEs, and the green arrow means they are not effective. This CVE is also not effective. On the other hand, this CVE is effective, and here we can see its trace. I think we are almost out of time, so I cannot walk through the trace, but if you want, come over at the break and we will look at it in more detail. We also have all the traces from the application to the vulnerable element. The scanning time was about 27 seconds; this is not a huge project, almost 28,000 lines of code, a small project, and still it took only 27 seconds to scan. And we saw in the evaluation that even a huge project takes about a minute to scan.
Okay, so this is the summary. We saw that this technique helps us prioritize the work: we can postpone most of it to later sprints, or drop some of it altogether. We saw that the algorithm scales to real-life scenarios, and it is implemented in Mend's tool. And that's all, basically. Now, if you have some questions, I will be happy to answer. No questions at this late hour? Sorry, I didn't hear; what is the vulnerable function? Okay, so the question was: when we have an open source component that contains a CVE, how do we know which element is the vulnerable one? That is a different lecture. We have a research group working on exactly that, and we also developed automation tooling to calculate it automatically, so in our database we know, for each CVE, exactly which element is the vulnerable one. Yes? No, this is part of our product at Mend; we have a community edition, but it is not open source. Yes? We do code analysis, with techniques very similar to what happens in a SAST tool, and we build a call graph. Once we have the call graph, generating the trace is trivial: it is just reachability on the graph. Given a vulnerable element and a proprietary function, we simply extract the trace from the call graph. Building the call graph itself is the algorithm we developed, the one we talked about; we didn't describe the algorithm, but we tried to give the intuition behind it. So I think that's all. If you have more questions, you are welcome to come and ask.