Thank you very much for being here, and thank you [unclear]. [Opening remarks partly unclear.] So today we'll talk about JAR files, and specifically about the dependencies you can find within JAR files. [Unclear.]
I'm [name unclear], a security researcher at Palo Alto Networks. I study physics at the Open University of Israel, and [unclear].
Why did I want to talk to you about this? Over the past few years I've been working on [unclear], and I needed to look at a JAR file and analyze the dependencies it includes. So I started to work out how I can do that, and I discovered that there is no single good resource you can use. You have to go looking through the documentation of various projects, and you have to read hundreds of blog posts that various people wrote. You even need to actually open books. Way too much stuff. So I decided that there needs to be some kind of single resource that you can take a look at and understand where you can find those dependencies.
Now, I acknowledge there are various automatic tools that extract all of those dependencies and give them to you in a nice organized list. But I think [unclear] it still matters to understand how this works. [Unclear.]
So today I'll go over the agenda. We'll talk about Java and JAR files in general. We'll talk about third-party dependency managers.
For example, we'll look at Maven, which uses pom.xml, and at the OSGi initiative [unclear]. And finally we'll see what happens when all of those third-party dependency management tools [unclear]. [Unclear passage; the speaker appears to describe the problem that Java was created to solve.]
So Java came as a solution to that, where developers could write once and then run it anywhere they want. This was solved using something called the Java Virtual Machine. Basically, each machine that wants to run some kind of Java code first compiles and installs a virtual machine, which takes the Java bytecode and runs it itself, so the machine doesn't actually have to know Java. And this allows a sort of universality.
I'm sure pretty much everyone has seen this picture at some point or another. It's even become a meme, I think, at this point. Everything runs Java. And I think that's why this topic is important: anywhere you go, you'll find things that run Java, and things that run Java usually depend on other things.
Java itself is actually quite old. It was first released in 1995, and the first officially supported version was released early in 1996. The problem that creates for the way we program today, with open source and with sharing of various projects, is that open source and sharing of resources wasn't as big a thing for Java back then. Today lots of projects include dependencies, as opposed to back then. So many dependencies, in fact, that your dependencies have dependencies themselves. Transitive dependencies are quite a significant topic in today's open source world, and they create quite a few problems.
The tools we'll look at today all address this problem in one way or another. But before we continue, I want to talk about JARs.
So what is a JAR? A JAR is basically a big archive file that can include anything. You can put any file you want in it. But if you want a specific kind of JAR called an executable JAR, which is the kind of file you can run with the java command, it has to follow a specific structure, which looks like this. There are two main folder chains. The first one includes your classes; here, for example, it's the org/example folder structure. The other one is the META-INF folder, which holds various metadata about the JAR file and includes several interesting files to look at, such as the pom.xml file and the manifest file.
So how do I actually find those dependencies in the JAR file? Let's take a look at Maven, which uses a file called pom.xml. A few fun facts about Maven: the name means "one who understands" in Yiddish. It's a tool for building and managing Java programs, and it provides a way to share those projects with other people. This is done through the Maven Central repository, where you can upload your projects and download others', also using Maven.
This is what a pom.xml looks like. It's quite a simple one. You'll notice that at the bottom there's a section for dependencies. Each dependency element is made up of a group ID, an artifact ID, and a version. Maven uses all of those fields to find which specific artifact you're looking for in the Maven Central repository.
Now, as for the tools that Maven provides for managing those dependencies, one that is quite commonly found is something called a dependency scope.
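As an illustration of looking for a pom.xml inside a JAR, here is a minimal sketch. It assumes the JAR was built with Maven's default archiver settings, which normally store the project's POM under META-INF/maven/<groupId>/<artifactId>/pom.xml; JARs built otherwise may not contain it. The class name EmbeddedPomScanner and its helper methods are hypothetical, not part of any tool mentioned in the talk, and the parsing is deliberately naive: it lists every <dependency> element, including ones declared under dependencyManagement or for plugins.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class EmbeddedPomScanner {

    /** True for entries shaped like META-INF/maven/{groupId}/{artifactId}/pom.xml. */
    static boolean isEmbeddedPom(String entryName) {
        return entryName.startsWith("META-INF/maven/") && entryName.endsWith("/pom.xml");
    }

    /** Lists every dependency element in a POM as "groupId:artifactId:version". */
    static List<String> dependencies(InputStream pomXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations: the POM comes from a JAR we do not trust.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            NodeList nodes = factory.newDocumentBuilder().parse(pomXml)
                    .getElementsByTagName("dependency");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element dep = (Element) nodes.item(i);
                result.add(text(dep, "groupId") + ":" + text(dep, "artifactId")
                        + ":" + text(dep, "version"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse POM", e);
        }
    }

    /** First matching descendant's text; a missing version may be inherited from a parent POM. */
    private static String text(Element parent, String tag) {
        NodeList found = parent.getElementsByTagName(tag);
        return found.getLength() == 0 ? "(unspecified)" : found.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java EmbeddedPomScanner.java <path-to-jar>");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            for (JarEntry entry : Collections.list(jar.entries())) {
                if (!isEmbeddedPom(entry.getName())) {
                    continue;
                }
                System.out.println(entry.getName());
                try (InputStream in = jar.getInputStream(entry)) {
                    dependencies(in).forEach(d -> System.out.println("  " + d));
                }
            }
        }
    }
}
```

Note that the embedded POM only records what was declared at build time; whether a given dependency is actually present and loaded at run time is a separate question.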
Dependency scopes basically allow Maven to adjust the way those dependencies are handled, from the way they are handled in the class path that Maven sets, which we'll talk about in a moment, to the way that transitive dependencies themselves are included in those projects. There are six scopes. We won't talk too much about them; it's a bit out of scope, pun not intended, for our discussion. But the slides I've uploaded online are available, and there's a little bit more information there.
So now that Maven has found all of your dependencies, how does it actually place them into a JAR file? We need to acknowledge the fact that Maven uses something called plugins. Some people even call Maven a sort of plugin-running framework, because there's a whole lot of plugins and they're all very important. The plugins we'll take a look at today are called the Maven Assembly Plugin and the Maven Shade Plugin. Those are the ones that are currently recommended and the ones you'll find most often.
The Maven Assembly Plugin has several options. We'll look specifically at an option called jar-with-dependencies, which, as the name suggests, creates a JAR that has all of the dependencies included in it. It does that by including them all within the same JAR file, which is often called an uber JAR.
Now, the problem with including all of your dependencies in a single JAR file is that you can run into classes overriding each other. For example, if you include log4j, and then some project you include also includes log4j, at some point when those classes get loaded you have the same class appearing twice. We'll touch on class loading in a moment and why that's a problem, but it is a problem and it's not a good situation.
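The collision described above, the same class file arriving from more than one JAR, can be checked for before anything is merged. The following is a minimal sketch, not something the Assembly Plugin itself provides; the class DuplicateClassFinder and its methods are hypothetical names. It compares entry names only, so two JARs carrying different versions of the same class look identical to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

class DuplicateClassFinder {

    /** Maps each .class entry that appears in two or more JARs to the JARs containing it. */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> entriesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        entriesByJar.forEach((jar, entries) -> {
            for (String entry : entries) {
                if (entry.endsWith(".class")) {
                    owners.computeIfAbsent(entry, k -> new ArrayList<>()).add(jar);
                }
            }
        });
        // Keep only the class files that more than one JAR provides.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    /** Reads the entry names of one JAR file from disk. */
    static List<String> entryNames(String jarPath) {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream().map(e -> e.getName()).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass the JARs that would be merged into one uber JAR as arguments.
        Map<String, List<String>> entriesByJar = new LinkedHashMap<>();
        for (String jarPath : args) {
            entriesByJar.put(jarPath, entryNames(jarPath));
        }
        findDuplicates(entriesByJar)
                .forEach((cls, jars) -> System.out.println(cls + " <- " + jars));
    }
}
```

Entries such as module-info.class will legitimately show up in several JARs, so a real check would probably want an allow-list for them.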
The way it happens is that the plugin takes all of those dependencies and unpacks them, and it takes all of the class files that were unpacked from those JARs and places them into your JAR file. You can see the problem: when you start loading too many dependencies this way, you eventually run into a situation where the same classes appear in the same place with the same fully qualified name, and that's a problem.
So a different plugin was created to solve this situation, called the Maven Shade Plugin. It does basically the same thing, but what it changes is that it shades all of your dependencies. Now, what is shading? Shading basically means taking those classes that you added and changing their name a little bit, somewhere in the path. That means that if you include more and more dependencies, they won't have the same name, and because they won't have the same name they won't override each other. That solves the collision problem.
The next tool that Maven provides to help manage dependencies is the parent POM. It's not really a feature, it's more of a way that Maven works. A pom.xml file can inherit from another pom.xml file, and it inherits all of its fields. So if a parent defines some kind of dependency, the child will have the same dependency. The child can override each field, and that enables some interesting dependency management scenarios. For example, the parent defines a specific group ID and artifact ID but doesn't define the version, and the children define their own version. Or any other combination you can think of.
So what happens if you have a project that depends on the same dependency twice, with two different versions? Maven has a specific solution to that, which is that it includes the dependency nearest to the project. So what do I mean by nearest? Let's take a look at this tree of dependencies.
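One way to picture "nearest" is a breadth-first walk of the dependency tree in which the first version reached for a given name wins. The sketch below is a simplification under stated assumptions, not Maven's actual resolver: the names NearestWins and Dep are hypothetical, scopes and exclusions are ignored, and the tree used in the usage note is invented for illustration rather than copied from the slide. It needs Java 16 or later because it uses a record.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NearestWins {

    /** One node of the dependency tree: an artifact name plus the version requested. */
    record Dep(String name, String version) {}

    /**
     * Walks the tree level by level from the root. The first version seen for a
     * name is the one closest to the root, so it is kept and later ones are skipped.
     */
    static Map<String, String> resolve(Dep root, Map<Dep, List<Dep>> graph) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Deque<Dep> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Dep current = queue.poll();
            if (chosen.containsKey(current.name())) {
                continue; // a nearer declaration already won; this subtree is dropped
            }
            chosen.put(current.name(), current.version());
            queue.addAll(graph.getOrDefault(current, List.of()));
        }
        return chosen;
    }

    public static void main(String[] args) {
        Dep a = new Dep("A", "1.0"), b = new Dep("B", "1.0"), c = new Dep("C", "1.0");
        Dep e = new Dep("E", "1.0"), d1 = new Dep("D", "1.0"), d2 = new Dep("D", "2.0");
        // Assumed shape: A -> B -> D:1.0 and A -> C -> E -> D:2.0
        Map<Dep, List<Dep>> graph = Map.of(
                a, List.of(b, c), b, List.of(d1), c, List.of(e), e, List.of(d2));
        System.out.println(resolve(a, graph));
    }
}
```

With the assumed tree, D resolves to 1.0 because it sits two levels below A while 2.0 sits three levels below. Adding D 2.0 as a direct dependency of A puts it one level below the root, and it then wins.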
Let's assume A is our project. As you can see, through all of those dependencies we eventually depend on dependency D in two different versions. What happens in this scenario is that Maven would include version 1.0. This means that if C uses any kind of functionality that is only provided in 2.0, it won't have it, and this can cause various exceptions and problems when running your code. If you want to include the other version of the dependency, then by the same kind of logic you can just add it as a direct dependency. When you do that, version 2.0 specifically will be included, and that's the one that will be used. And if you're looking at it from an adversarial standpoint, or as a researcher, that's the one you should be looking at. Version 1.0 will just be completely discarded.
The next one we'll talk about is called OSGi. OSGi is a framework for modular Java programs. It was made to solve a problem: back then there was no modularity in Java. It was a monolithic program that loaded monolithic programs and sometimes loaded dependencies. It goes back as early as 1999. It defines a specific kind of manifest, a specific kind of handshake, that bundles, as they're called, use so they can interact in a more modular way.
The main configuration for OSGi is the manifest file, which also appears in JARs that don't use OSGi. So let's take a look at it right now. This is a very generic manifest file. The first three lines are mandatory. The next ones are included when you use an executable JAR file. We'll talk about the Class-Path and Main-Class fields in a moment. But when you use the OSGi framework, you have a few more headers that are mandatory, and if you see those headers in the manifest file, you know that the manifest file probably belongs to an OSGi bundle and that the JAR can run as an OSGi bundle.
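To make the manifest discussion concrete, here is a minimal sketch that parses a manifest with the standard java.util.jar.Manifest class and checks for an OSGi header. The talk does not name the mandatory OSGi headers, so treating Bundle-SymbolicName as the marker is an assumption of this sketch; the class ManifestInspector and the sample manifest text are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestInspector {

    /** Parses manifest text; every header line must end with a newline. */
    static Manifest parse(String text) {
        try {
            return new Manifest(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumption: a Bundle-SymbolicName header marks the JAR as an OSGi bundle. */
    static boolean looksLikeOsgiBundle(Manifest manifest) {
        return manifest.getMainAttributes().getValue("Bundle-SymbolicName") != null;
    }

    /** Class-Path holds space-separated relative locations; returns them as a list. */
    static List<String> classPathEntries(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        return value == null ? List.of() : List.of(value.trim().split("\\s+"));
    }

    public static void main(String[] args) {
        // Hypothetical manifest of an executable JAR, for illustration only.
        Manifest manifest = parse(
                "Manifest-Version: 1.0\n"
                + "Main-Class: org.example.Main\n"
                + "Class-Path: lib/a.jar lib/b.jar\n");
        Attributes main = manifest.getMainAttributes();
        System.out.println("Main-Class:  " + main.getValue(Attributes.Name.MAIN_CLASS));
        System.out.println("Class-Path:  " + classPathEntries(manifest));
        System.out.println("OSGi bundle: " + looksLikeOsgiBundle(manifest));
    }
}
```

For a JAR on disk, JarFile.getManifest() returns the same Manifest type, so the two helper methods apply unchanged.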
Now, how does an OSGi bundle actually include those dependencies? Well, it's another field in the manifest file. It usually looks like this. I know it's scary, lots of text, but the important part is the Import-Package header. When you see that, everything that comes after it is a comma-separated list of packages, in the form that OSGi defines.
So that was OSGi. Now let's talk about what happens when we take those third-party dependency tools and hand things over to Java itself. The manifest file: we've talked about it, we've just seen it. Now let's talk about those two fields that I glossed over a moment ago. The Main-Class field must contain the fully qualified name of your program's main class. When you want to actually run the program, that's where Java will look when it starts running. The next one is the Class-Path, which gets its own separate slide because it's quite important.
So this is the class path. You may not like it, but that's the class path. The class path is basically an information booth. When Java wants to actually find the classes that you're using, this is where it goes. It goes to the class path and looks for all of those classes. The class path tells it where they are, and then it sends the class loaders, which we'll touch on in a moment, to those places. By default it is the current directory from which you're running the Java program, but you can set it manually: using the environment variable, using the class path argument of the java command, or in the manifest file, as we saw before.
But what happens to the things that are on the class path? How does a class get resolved? Let's say we have the org.test.Example class, which is located in mypackage. Then mypackage, that top-level directory, must be on the class path, and the class file must be in the org/test subdirectories.
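The lookup described above, turning a fully qualified class name into a relative path and trying each class path entry in order, can be sketched as follows. This is an illustration of the idea rather than the JVM's own implementation; ClassPathLookup and its methods are hypothetical names, and wildcard entries such as lib/* are not expanded.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.jar.JarFile;

class ClassPathLookup {

    /** org.test.Example becomes org/test/Example.class. */
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns the first class path entry (directory or JAR) that contains the class. */
    static Optional<Path> locate(String className, List<Path> classPath) {
        String resource = toResourcePath(className);
        for (Path entry : classPath) {
            if (Files.isDirectory(entry)) {
                // Directory entry: the package structure must exist as subdirectories.
                if (Files.isRegularFile(entry.resolve(resource))) {
                    return Optional.of(entry);
                }
            } else if (Files.isRegularFile(entry)) {
                // JAR entry: the same relative path must exist inside the archive.
                try (JarFile jar = new JarFile(entry.toFile())) {
                    if (jar.getEntry(resource) != null) {
                        return Optional.of(entry);
                    }
                } catch (IOException e) {
                    // Not a readable JAR; move on to the next entry.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String className = args.length > 0 ? args[0] : "org.test.Example";
        List<Path> classPath = new ArrayList<>();
        for (String part : System.getProperty("java.class.path").split(File.pathSeparator)) {
            classPath.add(Path.of(part));
        }
        System.out.println(className + " -> "
                + locate(className, classPath).map(Path::toString).orElse("not on the class path"));
    }
}
```

Because the search stops at the first hit, the order of entries decides which copy of a class is used when several entries contain it.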
Now, if we want to include a JAR file, we need to put the JAR file itself on the class path, and the class that we want to use has to sit in the matching internal structure inside the JAR file, for example org/test/Example.
We need to acknowledge that there is a split in Java: what happened before Java 9 and what happened after Java 9. Java 9 introduced various features into Java, specifically around modularity. We'll touch on those in a moment, but let's first focus on what happened before Java 9, the classic way.
So what's class loading? When you have a class that appears on the class path, someone has to load it into the Java Virtual Machine so you can use it. That's where the class loader comes in. The class loaders load your classes, but they also load various Java internal classes. java.util, java.lang, all of those need to be loaded themselves. There are three class loaders. They are called the bootstrap class loader, the extension class loader, and the user or system class loader. Each one handles the loading of slightly different classes.
The bootstrap class loader is kind of an exception compared to the other two. What it does is load the Java core platform classes themselves: java.util, java.lang, as I said before. It loads them from rt.jar. You can modify it from code, but we won't talk about that. It's a little bit out of scope. We just need to acknowledge that it's there.
The second class loader is called the extension class loader. What it does is load all of the files from jre/lib/ext, and you can't modify it. Again, it's out of scope and we won't touch on it too much.
The next one is the system class loader. What it does is load the classes from the class path that we've provided. It loads them into the Java Virtual Machine for user use, but it also has another function: it loads classes that you generate at runtime.
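The loader hierarchy can be observed from a running program by following getParent() from the system class loader upward. This is a small sketch with a hypothetical class name, LoaderChain. On Java 8 and earlier the chain shows the extension class loader described above; on Java 9 and later that slot is taken by the platform class loader instead. In both cases the bootstrap loader has no Java object and is typically reported as null.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {

    /** Follows getParent() upward and records each loader's implementation class. */
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            names.add(loader.getClass().getName());
        }
        // The bootstrap loader is native to the JVM, so the walk ends at null.
        names.add("bootstrap (represented as null)");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Loader chain for application classes:");
        chain(ClassLoader.getSystemClassLoader()).forEach(n -> System.out.println("  " + n));

        // Core classes such as java.lang.String come from the bootstrap loader.
        System.out.println("Loader of java.lang.String: " + String.class.getClassLoader());
        // This class was found on the class path, so the system loader handled it.
        System.out.println("Loader of LoaderChain: " + LoaderChain.class.getClassLoader());
    }
}
```

The exact class names printed differ between JVM versions and vendors, which is why the sketch prints whatever it finds instead of assuming them.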
But when that happens, there's a situation we've already touched upon: what happens if you include the same class twice? This is where it all comes together, because there can be only one. Once you import a class, you can only import it one time. If you try to import it a second time, it just won't get loaded. This is because every class gets imported using its fully qualified name. If two of them were allowed, you'd end up with undefined situations where one name refers to two things at the same time, and that can't happen.
In Java 9, what we got is called the module system. The module system was made to increase the modularity of Java programs and to reduce the size of everything that gets loaded, including Java itself. What the Java developers noticed is that whenever you run a program, you load the entirety of the Java platform, which by that time had become quite large, so they wanted to reduce that. By providing a sort of modularity you reduce the size, because you can separate java.util, java.lang, and so on into different modules of smaller size. And by the way, it can also improve the security of Java. If you don't load something, any adversary that eventually gets access to your system somehow, and has the ability to run Java code or run as your Java code, just won't have access to some of the functionality it would have had if the Java core wasn't split into modules.
In practical terms, what changes is that we have a new variable called the module path. The module path is basically just like the class path, just for modules. Instead of using the class loader, it uses a module loader. It's quite similar in its functionality for our purposes.
The specific change we have to acknowledge is what happens if you try to import from one side to the other. For example, if from a modular JAR you want to load a non-modular JAR, it won't happen. You can put it on the class path and it still won't get loaded.
That's because of the module system. It doesn't really want to go outside. It doesn't want you to get all messy with the regular class path system. So what you have to do is load that JAR through the module path. What happens then is that it gets converted into an automatic module. Now, the other way around: if you try to load a modular JAR from a non-modular JAR, it again won't work through the module path, because the module system was never invoked, so it doesn't go looking for it there. You can override this using the --add-modules option of the java command, which tells the module system that it needs to actually start working and do the loading itself.
So let's do a quick recap of what we've talked about so far. We looked at the pom.xml. We looked at the manifest file. We've spoken a little bit about class loading, and we took a look at Java 9 modularity and how it affects that.
So why is it important? Why did I actually choose to talk to you about all of this? As we said before, Java is a huge programming language. It's everywhere. Pretty much every computer, and even things that aren't computers, any kind of electronic device, runs Java. Because of that, the security exposure of the language is quite wide. But the resources for this programming language just aren't as good as for other languages. Even the information that I've shown you so far, you can't go to a single place and find it. You have to go and discover it for yourself.
But the thing that is actually important here, and the reason I've chosen to speak about this, is supply chain security. If we want to secure our dependency management and actually have a more secure supply chain, we need to understand how it works. We need to know how those dependencies are managed. We need to know how dependencies are added. We need to know how they actually run.
And even if we look at Maven and how it works, if we don't take a look at class loading and its specifics, we'll lose the bigger picture. Things that Maven adds sometimes just won't end up on the class path. Or even if something is on the class path, it won't get loaded into the Java Virtual Machine. At that point, if we don't focus our security attention on the right places, we'll end up with a lot of wasted work.
But more than wasted work, if we alert on every dependency that we've just found somewhere, we'll create alert fatigue. Too much alert fatigue can be quite a problem for the managers of various projects and for developers. I think that reducing alert fatigue is one of the biggest challenges and one of the most important things we can focus on right now. Even if we could solve every vulnerability, not having the time to do it, not having the focus to actually go and take care of the important parts, will just make all of that work irrelevant. So by looking at the right places, and knowing what goes where, we can reduce that.
If anyone wants to read a little bit more about what we've discussed, about dependencies and how they work, take a look at these places. Ivy and Gradle are both third-party dependency management tools. They provide similar functionality to Maven, and they often use Maven itself. We didn't discuss them because they don't leave a footprint in the JAR file, so they were a little bit out of scope.
You can look at the internals of Maven and OSGi. Let me warn you, OSGi can get quite complex. They've thought everything through quite deeply. For example, when we talked about the way Maven handles depending on two different versions of the same dependency, OSGi has a completely different solution to that, and it involves quite complex mathematics. So we won't go into that.
You can read more about class loading, but I warn you, it goes pretty much hand in hand with the Java security model. They're interesting subjects, but they're also quite complex and quite technical.
And yeah, so thank you very much. You can find me at that handle on Twitter, or X, or whatever. I don't really tweet or X, so your choice. And that's my email. So yeah, thank you so much. Does anyone have any questions?
Yeah. [Audience question, inaudible.]
Yeah. I think that both behaviors can be justified. You can justify looking at all of the dependencies and everything that may be included. But I think we should focus more on the things that are actually run. As I said, alert fatigue is quite a significant problem, and we need to find ways to reduce it. At the end of the day, if something doesn't get run, if something doesn't actually reach the point of execution, it's less important. It is still important, because malicious users can make use of things that are not actually running but are present. But we need to prioritize things by how big a problem they are, by how much risk that exposure poses.
Okay. Anyone else? Any questions? All right. So thank you very much.