OK, so today I'm going to talk about Infer, which is a static analyzer. I'll speak up a little bit — there we go. So why are we in this room today? Because programming is hard. Actually, it's pretty hard. When you write a program, you want it to work as well as possible: you need to handle all the common cases and all the corner cases. You need to keep track in your head of all the possible values that a function can be called with, or that can flow through your program from the user. If something bad can happen to your program, it will happen someday to someone, and your program will crash. So you ship bugs to your users, and you pay for that. And sometimes fixing one bug is a great way to introduce other bugs. So what do we do, as engineers, to combat that? Well, we try to write better code. We develop best practices. We add tests. We try to architect the code in a way that minimizes the likelihood of bugs at the boundaries. We add more tests. And then you can choose a language that has more or fewer footguns and pitfalls. If your language will not let you forget to close files, for instance, then you don't forget to close files. Depending on your language, more or fewer of these classes of bugs are fixed for you. Although you cannot always choose your language: there are tons of constraints out there — existing code, hiring, business reasons — and sometimes you have to use a particular language to target a particular platform. So what I'm going to talk about now, static analysis, is just one additional way to gain confidence that the code is correct and doesn't have too many bugs. In contrast to testing, which runs one program on particular values, static analysis attempts to check all paths and all values at the same time, and to make sure that the program works in every case.
And you can also think of it as a way to recover safety from permissive language features. For instance, if your language, for some reason, decided that all objects can be null, and there's no requirement to test that at runtime, then static analysis is a way to make sure that things are not null at the point where you dereference them. So with that, let's look at Infer. Infer analyzes Java and the C family of languages: C, Objective-C, and we have C++ support as well. Its main characteristics are: it's interprocedural, so it tries to follow values from function to function, and across translation units, to find inconsistencies in the code — the source of a bug can be deep inside one function and its manifestation deep inside another function. And the other characteristic I want to stress is that it's incremental: it's very fast at analyzing code changes, and that works very well, as we'll see. It's on GitHub, where it has gathered many stars — this is from back when we open-sourced it; people at the time seemed to like it. Since then, we've opened collaborations with other companies, so it's not only Facebook: there's Spotify, Uber, Mozilla, and more and more. If you want to use it, there are instructions on the website to get started on your code base, or you can even try it in your browser if you want to quickly test some functionality. The online demo is for an older version, but it gives a good idea of what Infer can do for Java. So that's it for the introduction. Now let's start with a quick demo. This is a tiny project: a bunch of C files, and a Makefile that says how to compile them. If I want to run Infer on this, I invoke infer followed by the build command I would normally use for the project — here, make. Infer then hijacks the compiler calls, and the build is done.
Infer drives the build to find out what it needs to build, and hijacks the compile calls to record the files themselves and analyze them. So here you see that it detected the make build — this is make talking, compiling the files. Once the files are captured, the analysis starts. And here it finds three issues. One is a null dereference: a pointer allocated up here may be null, and it is dereferenced here. Another is a memory leak. One of them we can understand straight away: we declared a variable and dereferenced it without checking. For the others, we can explore the report with the trace tool that ships with Infer. [To the audience:] Can you read the terminal at the bottom? We'll try. So this prints a big trace, which is informative, but let's look at it in Emacs instead. Here it repeats the bug, but then we see the actual symbolic trace: these are the steps of symbolic execution that Infer took to reach the error. It says: I entered this other procedure — and if you go inside, it's in another file. That procedure calls malloc, which can fail. Then you get to a conditional, and the trace says: condition is false. So "p not equal to null" is false — at this point we know p is equal to null. We return p and get out of the function, and that means bug is null here. Then we get into get_age, which is, again, in another file, and what we see in that procedure is just a dereference of the pointer: there you go, null dereference. So what we're going to do to fix this is add the missing null check, and then rerun Infer. If I rerun make here, you'd expect that only the file I changed will be recompiled — and you can tell Infer exactly that: it should capture reactively, only the change.
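The shape of the bug in the demo — a function that can return null on a "not found" path, and a caller that dereferences the result without checking — can be sketched in Java (the demo itself is in C, and all names here are made up for illustration):

```java
// Sketch of the demo's null-dereference pattern (hypothetical names).
public class NullDerefDemo {

    // May return null: when the condition fails, p stays null and is
    // returned as-is.  This is the "condition is false, so p == null"
    // step in Infer's symbolic trace.
    static String findPerson(boolean found) {
        String p = null;
        if (found) {               // trace: "condition is false" on the buggy path
            p = "Alice";
        }
        return p;                  // may be null here
    }

    // Buggy caller: dereferences without a null check.
    static int getAgeBuggy(boolean found) {
        String person = findPerson(found);
        return person.length();    // NullPointerException when found == false
    }

    // Fixed caller: the null check that Infer's report leads you to add.
    static int getAgeFixed(boolean found) {
        String person = findPerson(found);
        if (person == null) {
            return -1;             // handle the "not found" case
        }
        return person.length();
    }
}
```

Infer's report on the C original walks through exactly this: the false branch inside the callee, the null return, and the dereference in the caller.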
Infer has to react to the change I'm making and do an incremental analysis. So we do not throw away what it captured before; we just capture whatever is new — here, just person.c — and reanalyze just that file, plus the files whose results it can invalidate. Here, in particular, it reanalyzes bug.c, because bug.c depends on person.c. And now you see that the issue in bug.c is gone, because we reanalyze everything that depends on a change. That leaves the memory leak, which should be easy to fix — we don't even need a trace for that. We malloc'd something here, did some work with it, and never freed it. So we need to remember to free it somewhere, and run again. Now I get no issues — well, no more reported issues. OK, so that was a quick demo. What we ran on here was a very simple toy project, but Infer runs on very big and very hard projects too. Let me go quickly over the kinds of things it finds; they are listed on the website, with more explanation there. Briefly, for the C, C++ and Objective-C side, there are null dereferences, memory leaks and resource leaks — if you open a file and forget to close the file descriptor, that kind of thing. For C++, it doesn't do array-bounds checks yet; that's a hard problem in static analysis, and we haven't come up with something we're happy with. But what it can check — a pattern that happens sometimes — is that a vector is actually empty and you try to access an element anyway. For Objective-C, which we analyze because we run Infer on our iOS apps, the one extra thing I'll mention is that it detects retain cycles: cycles in the object graph that defeat the reference counting, so the memory is never reclaimed.
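The resource-leak pattern just mentioned — acquire a resource, then take an early return on which it is never released — looks like this in Java (hypothetical code; the fix shown is the standard try-with-resources idiom):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class LeakDemo {

    // Leaky: if the early return is taken, close() is never called.
    // This is exactly the "resource acquired at one line, never
    // released" shape of an Infer RESOURCE_LEAK report.
    static int readFirstLeaky(byte[] data, boolean bail) {
        InputStream in = new ByteArrayInputStream(data); // resource acquired
        if (bail) {
            return -1;                                   // leaks `in`
        }
        try {
            int b = in.read();
            in.close();                                  // released only on this path
            return b;
        } catch (IOException e) {
            return -2;                                   // unreachable for in-memory streams
        }
    }

    // Fixed: try-with-resources closes the stream on every path,
    // including the early return.
    static int readFirstFixed(byte[] data, boolean bail) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            if (bail) {
                return -1;                               // `in` still closed here
            }
            return in.read();
        } catch (IOException e) {
            return -2;                                   // unreachable for in-memory streams
        }
    }
}
```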
Java is an example of how static analysis interacts with language assumptions. In Java there's a GC, so we don't care about memory leaks; but we can have null dereferences and resource leaks. There's also an annotation-based performance check — an interesting thing we built for some of our engineers. You can annotate some methods in your code as performance-critical — for instance, things that run on the UI thread — and you can annotate other methods as expensive. Infer then checks for you that there is never a call chain from a performance-critical method to an expensive one. You can see the website; we also have some Android-specific checks that you may find quite useful. I mentioned that the demo was just a toy project, so here's a real one: we run on DuckDuckGo, the Android app for the DuckDuckGo search engine. Here's an example bug. Infer reported: a resource acquired at one line is never released at some other line. What that means in the code is that we acquired the resource here, by doing this DB query, and here there's an if branch, and in that branch you can return from the function before you release it. So that's that. Another example: Infer says that this field, reached through this nested method call, can be null and is then dereferenced. Let's see: it goes into the function, which starts by initializing the field to null. Then it does a DB query. At this point the cursor can be empty, so we never enter the branch that assigns the field. We close the cursor and return the field, which we never changed from the beginning — so we just returned null. So at the call site the field is null at this point, and we go inside this other function, and — it's one of the first things it does — it dereferences it. OK, you get the idea.
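The annotation-based performance check can be sketched as follows. The annotation names follow Infer's @PerformanceCritical/@Expensive convention, but treat the exact spelling and package as assumptions — here they are simply defined locally:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class PerfDemo {

    // Marker annotations, defined locally for this sketch.
    @Retention(RetentionPolicy.RUNTIME) @interface PerformanceCritical {}
    @Retention(RetentionPolicy.RUNTIME) @interface Expensive {}

    @Expensive
    static long loadFromDisk() {
        return 42L;                // stands in for slow I/O
    }

    static long helper() {
        return loadFromDisk();     // not annotated itself...
    }

    @PerformanceCritical           // e.g. runs on the UI thread
    static long onDraw() {
        // The checker flags this: a call *chain*
        // (onDraw -> helper -> loadFromDisk) from a performance-critical
        // method reaches an expensive one, even though the intermediate
        // method carries no annotation at all.
        return helper();
    }
}
```

The point of doing this interprocedurally is precisely the unannotated `helper` in the middle: a purely syntactic check would miss the chain.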
OK, that's it for the demos. Now I'll say a little bit about how Infer works — not going into the details, just the various phases of the analysis, which should help you understand how to use it. And then I'll tell you a little story of how we use it at Facebook. The architecture is like this. This is your program: a bunch of source files, plus the build system that knows how to compile them. The first phase is the capture: Infer has frontends that know how to drive the build system to get to the source files, and translate the source files from all these languages into a common, very simple intermediate language — basically gotos, loads, stores and assignments. Then, once it has captured all the files in your project, it starts the analysis. For each procedure found in the source files, it will produce summaries: it will attempt to prove that the procedure has no bugs. At some points it may not manage to prove that, and at that point it decides whether to report that as a defect. So, back to the first step, the capture. Here's a simple Java class; let's look at this method, computeSomething. It does an if on a boolean and assigns a string in each branch — nothing deep. Infer translates this method into a control-flow graph over the intermediate language. You probably cannot read it from where you sit, but what you can see is the two if-branches — these are the nodes, and the instructions in them are just assigning a string — and at the end both branches join at the exit node. That's it for capture: it does that for all the source files, translating each procedure into something like this. Then the analysis runs on those CFGs, and what it attempts to do is to discover specifications for them. What do I mean by that?
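The method on the capture slide, reconstructed from the description above (the exact code on the slide is an assumption), together with the shape of the CFG that capture produces for it:

```java
public class CaptureDemo {

    // A method with two if-branches, each just assigning a string.
    static String computeSomething(boolean flag) {
        String s;
        if (flag) {
            s = "then-branch";
        } else {
            s = "else-branch";
        }
        return s;
    }

    // Capture translates this into a CFG over a tiny intermediate
    // language (loads, stores, gotos), roughly:
    //
    //            [start]
    //           /       \
    //   [prune flag]  [prune !flag]    <- the two if-branches
    //   [s := "..."]  [s := "..."]     <- one store instruction each
    //           \       /
    //            [exit]                <- both branches join here
}
```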
So, for instance, for this method the analysis will discover that if the parameter is false then it returns something non-null, and if it is true then it returns null. The state before the procedure is called is the pre-condition, the state afterwards is the post-condition, and together they form a specification of the function: if you start in the pre-condition and you run the function, then you will end up in the post-condition. For this function there are two specs. Now, if we look at this other method, it calls the first one. The analysis works bottom-up: if Infer is smart enough to see that this method depends on the other one — or, if it hasn't analyzed it already when it reaches the call, on demand — it will first compute the specs for the callee. So at that point in the method it has these two specs for the callee. We call computeSomething with true, so Infer matches the call against the specs, and only one of them applies: when you call the function with true, you know that it returns null. So at this point Infer knows that this variable is null, and on the next line it is dereferenced — null dereference. And that's how Infer works in its main mode of operation. I want to briefly talk about two other modes of operation. One is Eradicate. In Java, objects can be null, and if you wish it weren't so, you can use Eradicate: you annotate the objects that can actually be null with @Nullable, and everything else will be assumed to be non-null. What Eradicate does is check that the codebase is consistently annotated in that fashion — whenever you dereference something that was marked @Nullable, you'd better have checked it beforehand, and Eradicate holds you to that. If you keep your codebase like this, in theory you should have no NullPointerExceptions at runtime. So it's a lightweight type system on top of Java, rather than the main mode of Infer, which really tries to prove the absence of bugs.
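The two-specs example can be sketched like this (hypothetical names; the point is that the analysis matches the call site against the callee's specs rather than re-analyzing the callee's body):

```java
public class SpecsDemo {

    // The analysis infers two specs for this method:
    //   pre: flag == false  ->  post: result != null
    //   pre: flag == true   ->  post: result == null
    static String computeSomething(boolean flag) {
        if (flag) {
            return null;
        }
        return "value";
    }

    // At this call site the argument is the constant true, so only the
    // second spec applies: the result is known to be null, and the
    // dereference on the next line is reported as a null dereference.
    static int caller() {
        String s = computeSomething(true);
        return s.length();   // NullPointerException at runtime
    }
}
```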
Another mode, on the Clang side of Infer, is linters: simple checks that are purely syntactic, walking over the AST and looking at the nodes. What's interesting is that you can write your own checkers easily — either inside Infer, by adding OCaml code, or you can even use a simple DSL we have. I wanted to show what one of these checkers looks like. This is a checker we run on Objective-C: a property with a pointer type should not have the assign attribute. The checker for that is just a record: a condition under which the checker triggers, and a message to report. The condition is easily expressed here: if this is an assign property, and it has a pointer type, and I'm inside an Objective-C declaration, then report. So you can write a whole range of checkers this way. OK, so Infer is open source, and I'm very happy about that. Now: how do we use it at Facebook? When the team arrived, they wanted to get this tool adopted with a minimal amount of friction. So there was the question of how to showcase bugs to developers: you have this tool that can output a list of bugs, and you have people working on the code base. The first thing that was tried was just to run Infer, get the bug list, and try to be smart about assigning issues to people — some people in the room remember that. That worked so poorly that in a couple of cases we got "we solved this already" stories from developers who had completely moved on to different tasks. Unless a bug has caused a crash in production, it's hard to get anyone to act on it. And for bugs reported by static analyzers, it's very hard to know whether they are actual bugs: the analyzer analyzes functions locally, it doesn't try to start from entry points or anything like that. So the real metric here is: does the developer think the report is worrying enough that he fixes it. Instead, what we found out is that we have to integrate Infer inside the developer workflow, at the right time.
And the right time is diff time, when code is submitted for review: Infer comments on the code just like the rest of the CI, and like the reviewers do. So the developer sends a diff. This is more or less the setup at Facebook: we use Phabricator as our code review system — it's also open source. The diff goes into the CI; the CI runs the tests, and one of the things it runs is Infer. Meanwhile the code reviewers give their opinion of the code, and Infer tries to find issues with it. So if there is, say, a null dereference, then the developer will see something like this: an inline comment — "hey, I think this can be a null dereference". At some point every reviewer is happy, Infer is happy as well, and the diff goes in. In a bit more detail, here is what happens on the Infer side when we get a diff. We run Infer on the top revision, the one with the changes of the developer, and we run it also on the base revision, the one before the change. Then we compute only the new bugs — the ones presumably introduced by the diff — and we report those. We don't report old, pre-existing bugs because, as we saw, developers are not really interested in fixing those; only new bugs. This gives a high signal to the developer that their diff is responsible, or at least that their diff surfaces the issue. For now these steps — running Infer twice and diffing the reports — live in Facebook infrastructure rather than in Infer itself, but these days we're working on bringing that logic into Infer so that more people can use it. So, yeah, that's the setup. Every month, Infer analyzes thousands of code modifications, and it reports hundreds of issues per month that get fixed by the developers. And as I was saying before, what we measure is the rate at which people fix the bugs we report. We want that to be very high, because otherwise people start ignoring the tool.
And the numbers do hold up: approximately 70% of the issues reported by Infer get fixed. And that's it — thank you. [Q&A] [Question about recursive functions.] So, in theory, what you do in Infer when you have mutually recursive functions is: you start by analyzing one of them, you analyze the others based on the result of that, and then you go around the loop some number of times until the summaries converge — at some theoretical point the fixpoint is reached and the summaries stop changing. In practice, Infer used to go around the loop twice, and I think now we don't loop: we just start somewhere, because in practice people don't write that much mutual recursion, so we don't have to. If someone wants it, the theory supports it and the tool supports it. [Question:] how do you compute the call graph, given, for example, function pointers and dynamic dispatch? There are several issues in what you mention that all need to be tackled. The CFG itself is just another representation of the AST, so there's nothing special there. In Infer, when you get to a method call, you do the method resolution at that point: Infer tracks the precise type of the object where it can, and uses that type to figure out the method. But that means the biggest problem we have to overcome is that the call graph is now dynamic: you cannot first compute the complete call graph and then analyze the functions in order starting from the leaves, because you don't know which one is calling which other one. The way we cope with that in Infer is that we run an on-demand analysis: when we get to a call site, it could be that we haven't analyzed that function yet — because it's dynamic dispatch and we didn't know this call was coming — and at that point Infer goes and analyzes the callee. These are all very tricky issues; we solve some of them, like dynamic dispatch, and others, like function pointers, less so.
[Question about concurrency — sorry, which one was your question again?] So: Infer assumes everything is sequential for now; we're starting to develop a concurrency analysis for Java. [Question about external functions.] Ah, calls to functions whose code you don't have — that's an interesting one. We have two approaches. One is, for libraries that are used often and for which it matters, we have models inside Infer — it knows what a HashMap does — and you can always add models. But the basic thing it does, when it finds a function that it doesn't know anything about, is what we call angelic mode: we assume that function does the best possible thing for the rest of the program. So if it returns some object and you dereference it, well, it was probably not null. We found that doing this enabled us to analyze a lot more code than when we were more conservative. [Question:] there are 30% of issues that are not fixed by the developers — are those false positives? As I said, it's practically impossible to measure a false-positive rate for this kind of tool automatically — you can't just compute one number and figure it out — so we don't try to measure that directly. What we do is: in our team we have someone who triages the reports that developers didn't act on. We go look at those and we have a discussion with the developers — "maybe you should fix that, it looks like a real bug", or "oh sorry, that's clearly a false positive, we'll try to improve the tool for that case". So we improve the tool continuously based on that. [Question:] for C, which compiler is it based on, and can you skip the source code, since there is an intermediate language? So, the analysis runs on the intermediate language, not on the source code. To elaborate: for the C family we do the translation with Clang, on top of the Clang AST, so we start from the sources; for Java we start from bytecode. Could you have a bytecode-only strategy? In theory you can — there is no particular difficulty there — but we don't have a mode today where you can do that.
There is no mode where you just give bytecode to Infer and it analyzes it; you have to give it the source code and the build. If somebody wants that, it would be a nice contribution. [Question:] how does it compare with other static analyzers? There are a lot of static analyzers in the wild, and what we found, when trying to compare them, is that they all find pretty much different sets of bugs. So the best strategy, if you can afford it, is to run several of them. As for the technical differences in the tools: most static analyzers tend to analyze a single translation unit, while Infer works across compilation units. And in general it's not just finding shallow, local bugs: it's deeply interprocedural, so it will find bugs that span whole call chains, not just the call site in front of you. [Question from the audience, hard to hear — something about review-system integration.] Sorry — how do I write the what? I'm sorry, I don't know what that is; we can talk about it afterwards, and if you think it matters, please file an issue on the repo. That would be great, thank you. [Question:] do you plan to port the diff integration to GitHub? I don't see any difficulty with that. I don't think we want to do it ourselves, because we don't use GitHub internally, so it doesn't make much sense for us. But what we're doing, as I said, is putting most of the logic you would need inside Infer, so that you could ask Infer for a differential analysis of this revision and that one, and it would automatically do what I described — analyzing the top revision and the base revision.
The reports from Infer are simple JSON objects, so hopefully someone will write interfaces on top of that. [Question:] do you have any plans for a taint analysis — detecting, say, injection attacks? Right, yes — the question was whether we have ideas about taint analysis. Not at the moment; it would be nice to have. And again, we can format the output in a way that makes it easy to integrate and to run on current changes. [Question:] Infer is written in OCaml, and you said that you run on diffs — does running just on diffs imply that you need some storage for intermediate results? Not really, no, because what we do is we really run Infer on the diff, on the full code, and we run it twice for every diff — base and top — so that we can do the comparison and so forth. The longer story is: most of our projects live in one big repository, and we used to have a cache for the results of the analysis, so it was very often the case that some parts of the analysis results were already in the cache and we didn't recompute them. [Question:] if I recall correctly, I saw one slide where the output from Infer was pushed to a source-code-management or review system — was that correct? And do you support integration with, for instance, Bitbucket or GitLab, so that the remarks of Infer come into the review? So the question was how the comments get inside a review tool. It's just integrated with Phabricator — and it's not even Infer talking to the Phabricator API: it's the CI reading the results from Infer and then talking to Phabricator. We only have this sort of integration for now, but that's something we're hoping to build tools for. [Question about supporting other languages.] So the thing is, it's just a matter of writing a translator from your language to Infer's intermediate language.
Once you target the intermediate language, you can reuse whatever analysis backends we have, and we've been asked about that for several other languages. [Question:] how long does a run take, say on a big app? So, the way it works in our CI: for the diff analysis, we aim to report within 10 minutes, and our P50 is around that. A full from-scratch analysis is much worse; and for the diff analysis, in practice it depends on how much a small change triggers, so it depends on what's in the diff. [Question:] you mentioned that you go look at bugs that people don't fix — do you also look at bugs that people do fix? Sometimes we audit those too, because it can happen that a report gets "fixed" in a way that just silences the tool rather than fixing the real issue; we looked at that at some point. [Question about bugs missed on a diff.] Remember these are not runtime checks — and Infer will be running on the next diffs as well, so if a bug is missed this time, there's a chance it's found next time. OK — maybe no more questions? Thank you very much.