I think so. Our first talk today is about dependency analytics. We have Sunil Samal and Agam Shah, both from Red Hat. They're here to entertain you and maybe scare you a little bit, because they're part of the group of people that had never seen the Terminator movies.

Good morning, everybody. As you heard, I'm Sunil. I'm from Red Hat, and this is Agam, my colleague. Before I start with my presentation, I would like to tell you about an incident that happened back in May 2018. The npm security team received and responded to a report that an npm package named getcookies contained a malicious backdoor that allowed attackers to inject arbitrary code into a running server and execute it. The problem here is that any application using the package is at risk of being exploited. The npm security team later resolved the issue and unpublished the package from the npm registry.

So here you can see, we use dependencies on a daily basis to build our applications. Dependencies are nothing but third-party libraries or modules that we use to build our application. And there are lots and lots of dependencies. Hundreds of dependencies get published every day. As you can see in the graph, take npm for example: each day, thousands of npm packages get published to the npm registry. That is how many choices a developer has to choose from.

When we started looking into this space, we found a potential problem to solve so that we can ease the development process. We figured out that there are four parameters we need to look at while choosing a dependency. The first one is security vulnerabilities. The second one is license. The third one is popularity and maintainability. And the fourth one is completeness of the application stack. Next slide.

OK, what is a CVE? A CVE is a publicly catalogued security vulnerability that may or may not be present in your application dependencies.
So it's always better to check for CVEs before using any dependency to build your application, so that at a later point in time you are not at risk of being exploited by an attacker.

The second one is license. A license can be tricky to choose. Most developers don't have an idea of how open source licenses work. How many of you here know that the Apache license and the GPL license don't go with each other? Most developers don't know this fact. Suppose, for example, you are using a package which is under GPL and you decide to release your project under Apache 2. You can't do that, because Apache and GPL are in conflict. At this point in time, there is no tool or application that checks that for you.

So when we started looking into this space, we created an extension for VS Code which helps developers choose the right dependencies in the development phase itself. Let's see a demo of how you can use our extension and choose the right dependencies for your application. My colleague, Agam, will show you the demo and explain our goals and future use cases. Over to you.

Yeah, thank you, Sunil. As Sunil mentioned, there are a lot of dependencies to choose from, and these ecosystems are growing rapidly. For Maven, actually, there was 102% growth in the number of packages in the last two years. So you can imagine.

Now coming to our goal, we are presenting our goal in three aspects. First is the higher confidence aspect, where you can choose dependencies which are free of any vulnerabilities and which do not have any license conflicts. So you can have higher confidence while you are productizing your application. You have higher confidence in your deployments, and your enterprise has higher confidence in you that you are building and shipping the best software out there. And the second is increased productivity.
What we mean here is that you don't need to go anywhere to search for security vulnerabilities, and you don't need to go anywhere to search for the license of a specific dependency, because we show you all of it right inside your editor. Third is reduced development risk. As we already discussed, there are a lot of malicious packages out there. As developers, we should know about them while developing an application, because not knowing increases our risk. So yeah, it might save your company a few dollars, and save you your time.

These are the current statistics of our platform. As you can see, for Node we have around 577,000 packages and 1.2 million versions that we track for CVEs and license conflicts. We also have 438 CVEs as of now for Node, 531 CVEs for Java, and 195 for Python. And we have 28 open source licenses in our system. So if you're using any of those 28 licenses, we can show you whether there is a conflict, or under what license your project can be released.

Now coming to the demo part. This is a VS Code extension. The presentation is already on the FOSSASIA website, so when you go to the slide, you can just click this image and it will land you at the VS Code marketplace. This is the marketplace of Visual Studio Code. As you can see, the extension is Dependency Analytics. You can just click on install, and it will open the link inside your VS Code editor right here. I already have it installed, so let's just quickly go through the demo.

Let me just zoom in a little bit. Yeah, I think that's visible now. Here I'll be showing two kinds of applications. First is a Node application, which is a color chooser, so you can style your terminal using a color of your choice. And second is a Java application, which we created just for this demo to show license conflicts. So yeah.
So we are using pom.xml for Java, as many of you might be using. And we are using package.json, that is the npm manifest file, for Node. Here, as you can see, we are using four dependencies. First one is ansi-styles. Second is escape-string-regexp. Third is supports-color. And fourth is bootstrap. We are using specific versions; for example, we are using bootstrap 4.1.1.

Now as you can see, there is a red line down there. That means it has a security vulnerability. If you hover over it, you can see that the application dependency bootstrap is vulnerable to the following CVEs. Our recommendation is to use version 4.1.2, and the latest version is 4.3.1. So we actually show you, right inside your editor, how you can get rid of the CVEs. If you see, there is a tooltip: switch to recommended version 4.1.2. You can just click on it, and it's done.

The CVEs that we saw, we can also see as a part of the stack report. So how can you generate a stack report, the whole report with CVEs, licenses, and everything? You right-click, and you see the dependency analytics report right down there. You can press Command D on Mac or Control D on Windows to directly generate the report. So let's see the report now.

Coming to the report part, as we can see, we have four cards based on the four perspectives that we have. First is the CVE card. As you can see, bootstrap has CVE-2018-14040. If you Google it, you can see in the NVD database that in Bootstrap before 4.1.2, XSS is possible. So if you're using a Bootstrap version before 4.1.2, which was our case since we were using 4.1.1, your application is vulnerable to an XSS attack. Now, you as a developer might not know that before this extension tells you. And it's very useful for you in the development phase to make those decisions.
Because once your product is shipped, and a third party is able to exploit your application using an XSS attack, then it comes down to responsibility. So yeah, it's better if you do it beforehand.

Second is the licenses tab, where right now the suggested license is MIT because we have no license conflicts. So how is it MIT? We can go to the dependency details part, where all four dependencies that were in the package.json have been analyzed. We have no unknown dependencies here, but that might not always be the case. As I said, there are thousands of packages being published every day on npm, so we might not have data for each and every package out there. But we made sure that whenever we see an unseen package, we ingest it for you, and the next time you run the report, we will have all the data you need for that particular package and version. We have automated data pipelines to do that.

Coming to the dependency details part, as you can see, supports-color is licensed under MIT. And it has these GitHub statistics. So why are GitHub statistics important over here? You can check whether your package is popular across GitHub or not. How many contributors are there? How many stars? How many forks? How many dependent repos? Those are the usage statistics. So you can make an informed choice about whether this package is still being used and maintained, based on the stats you are seeing right now. That helps in making good choices. Along with the license, as I just said, it also has tags. These tags are just from the npm registry, if you're familiar with the tagging mechanism in npm.

And this is bootstrap. As you can see, for bootstrap we see a warning sign in the security part because we had a CVE, right? If you remember, in the 4.1.1 version.
Now, these are the CVE IDs, because we had a CVE for this package. And the license is also MIT for this particular dependency. The third package is escape-string-regexp, and it is also licensed under MIT. And ansi-styles is also licensed under MIT. Hence, our project license can be MIT.

Now if there were a restrictive license on any of these packages, there could be a license conflict. I think Sunil already mentioned Apache and GPLv2. They don't go together because of some patent clauses, so that can make a license conflict. We will see a license conflict in our Java application soon. It's a part of the next demo.

Then comes the insights part. Insights is basically: what packages can you use? Think of it as a recommendation system for your application dependencies. If you're using these four packages, then you might be interested in using these packages as well. It's based on public data that we collected from GitHub about what developers mostly use. We've created a machine learning model that recommends packages based on the packages that you're using right now.

For example, in Python, you might be using NumPy and SciPy, because they're pretty common nowadays. Then our model will say you might be interested in pandas or TensorFlow or scikit-learn or Matplotlib or Seaborn, those kinds of things. So basically, we have trained our models on data from the developers' usage point of view: what kind of packages developers usually use together. It's helpful because if you're developing a web application or a machine learning application, that is not an uncommon thing nowadays. Everybody is doing it. So you might be interested in what other people usually use along with those packages.
So you don't have to go and search every time for the best package out there for, let's say, a plotting library, right? So that's useful. We also show a confidence score: how confident we are while recommending this package to you. Let's say for chalk, we are 66% confident.

Now, as you can see, there are also feedback buttons. This is plus one and this is minus one. If you're happy, if you're using it, then you can just say, OK, I'm using it, it works for me. That's a plus one. And if you don't like it, if it doesn't make sense for you, that's a minus one. Also, if you expand the drop-down, you can see a recommended version, that is, what version of that particular dependency to use. You can also see the same statistics and licenses for those recommended dependencies. So you can make a choice of what dependency to use based on the confidence score, or on the statistics, or on the license that the dependency is using. Maybe you want to release your project under MIT. Then you might not want to choose any dependency which is under a more restrictive license such as GPLv2, right? It doesn't make sense for you. So those are the kinds of things.

Coming to the next part, where we are using a Java application. Here I have just generated the stack report, the same way I did last time. We are using seven dependencies. As you can see, in the licenses card there is a license conflict. Now, how do we show the license conflict? We show you the name of the dependency whose license is conflicting. The dependency is javax.servlet-api. And what is the license? CDDL plus GPLv2 with classpath exception. That is the license this dependency is under. And we also show you the other one, org.javassist:javassist, which is licensed under Apache. As I think we already discussed, Apache and GPLv2 don't usually go together.
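As an illustration of the pairwise check described here, a minimal Python sketch follows. It is not the extension's actual logic, which the talk does not show. The conflict table, the function name `find_license_conflicts`, the simplified license labels, and the third dependency are all assumptions made for this example; only the Apache and GPLv2 pairing comes from the talk.

```python
from itertools import combinations

# Hypothetical, hand-written table of license pairs treated as conflicting.
# Only the Apache / GPLv2 pairing is taken from the talk; a real tool would
# need a much larger, legally reviewed table.
CONFLICTING_PAIRS = {
    frozenset({"Apache-2.0", "GPL-2.0"}),
}


def find_license_conflicts(dependency_licenses):
    """Return (dep_a, dep_b) pairs whose licenses appear in CONFLICTING_PAIRS."""
    conflicts = []
    # Sort so the result order does not depend on dict insertion order.
    for (dep_a, lic_a), (dep_b, lic_b) in combinations(
        sorted(dependency_licenses.items()), 2
    ):
        if frozenset({lic_a, lic_b}) in CONFLICTING_PAIRS:
            conflicts.append((dep_a, dep_b))
    return conflicts


# License labels are simplified for the sketch: the report in the demo shows
# "CDDL + GPLv2 with classpath exception" for the servlet API.
deps = {
    "javax.servlet:javax.servlet-api": "GPL-2.0",
    "org.javassist:javassist": "Apache-2.0",
    "example:mit-licensed-lib": "MIT",  # hypothetical dependency
}

conflicts = find_license_conflicts(deps)
for dep_a, dep_b in conflicts:
    print(f"License conflict: {dep_a} <-> {dep_b}")
```

Storing each pair as a `frozenset` makes the lookup independent of the order in which the two licenses are compared.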
So here is the license conflict that you just saw. You can also see the current version you're using and all the statistics right over here. So you might want to change your dependency to some other one with a compatible license.

Here is the dependency details part, where we see the seven dependencies that we are using. As you can see, there is a check mark in every field except the license field for servlet-api, and the same goes for javassist, because they are conflicting. So you can know what kind of things are broken right inside your editor.

Then there's the insights part again. Here, we are using many Apache packages. So as you can see, the third recommendation is Apache Commons Lang3, which is used with most of the Apache packages out there, if you're using some Java Apache packages. So the confidence score is really high. We are highly confident that you might want to use this dependency.

So yeah, that's I think pretty much it for the demo part. You can try it on your laptops right now if you're using Visual Studio Code. Right now, we support Java and Node. We are releasing support for Python and Go soon. We will see that in the future use cases part.

So coming back to the presentation. What are the future use cases that we have? As I said, Python we are ready to support. It's already in beta and going through internal testing. Second is Golang. Third is transitive dependencies, which is already in internal testing. I will talk a little bit about transitives. In this particular case of npm, we only had four dependencies. Now, if you go to the node_modules folder once you run npm install, you can see many dependencies, like ajv and acorn, which you didn't mention here at all. At all, right?
They don't appear at all, because those are dependencies of dependencies. Maybe ansi-styles needs it, or bootstrap needs it, right? You don't know. So we are going to show you the complete stack of what your dependencies are using, as in the whole transitive report as well. And we are going to show you the statistics, the license conflicts, the CVEs, everything for all of your transitives, right inside your IDE, so you can make an informed choice. Sometimes what happens is that there is no problem in a direct dependency. For example, there might not be a problem in bootstrap, but one of the dependencies of bootstrap has a problem. So how would you know without actually looking at the report? That is kind of tricky as well. So we are looking at transitives. It's already in beta, and I think it will be released along with Python.

Fourth is container application scanning, where we are planning to integrate this whole thing inside Clair, if you're aware of it, a tool by CoreOS which scans your container images for runtime vulnerabilities. We are planning to do it for application vulnerabilities as well, so that we have a complete end-to-end picture from your system to your application. And you can get a whole report on how your application is vulnerable and at what levels.

Fifth is probable vulnerabilities based on AI. Right now, what happens is that 60% of vulnerabilities are never reported to NVD. Think of NVD as a public database for vulnerabilities. It's not easy to get through. It's not easy to just look at it and tell what package is vulnerable. But it's public. Anybody can see it. Anybody can use it for their purposes. And around 60% don't even end up in NVD. So what we are going to do is look at GitHub.
In particular, we are going to look at GitHub commits, GitHub PRs, and GitHub issues. Using machine learning on the text, if it looks like a CVE or any security vulnerability has been fixed, then we are going to show that right inside your IDE, even before it gets reported. So that's one part.

And the sixth is predicting service upgrades for OCP. This is more to do with OpenShift. Say you're using OpenShift 3 and you want to migrate to OpenShift 4. Before you even deploy, we will tell you whether your application is compatible with OCP 4, using machine learning on how similar applications were deployed and whether they were compatible at all. So you can make an informed decision beforehand on whether you want to migrate to OCP 4 or remain on OCP 3.

So that's about it. You can find us on GitHub. And you can chat with us on Mattermost. Mattermost is just like Slack, but it's open source, and it's all public. You can file an issue, or you can rate us on the Visual Studio Marketplace, and you can leave us some feedback on Mattermost. If you have any queries or questions or feedback for us, you can just reach out to us here. Thank you.

I think we have time for questions, right? Yeah, one or two minutes.

OK, so you're basically asking about other platforms, right? Like Sublime or IntelliJ or Vim or anything. Yeah, we have plans to do that. We have received a number of feature requests from developers who need it for IntelliJ, so we are prioritizing it accordingly. I think once we go back and give this feedback to the team, we will prioritize it and let you know. You can follow us on GitHub or on Mattermost to see where we are. And you can file an issue for this, marking it as a feature request. I would suggest you do that so that we can look at it and it doesn't get lost.
Do we have any other questions? Oh, you can find us at the booth. If you have any questions, or you want us to demo something for you, or you just want to try the extension, you can drop by, and we will see what we can do. Thank you. Thank you.