My name is Miroslav Suchý, and I come from Red Hat. I have been working with SPDX for about a year as a side project, and that is where my interest in the software bill of materials comes from: I kept running into it and wondering, what the hell is this?

So before I start, how many of you had heard about the software bill of materials before this presentation? OK, almost everyone, I guess. And how many of you know what a software bill of materials is? OK, we have two experts who know more than I do. So check who raised their hand and, after the presentation, ask them the questions, not me. I promise to explain it in very easy terms. So forgive me if I am not precise; this is for dummies, for the entry level, as I am.

The software bill of materials is very easy. The best analogy I have heard is a list of ingredients: it tells you which ingredients the product is made of. Does anyone want to guess what this product actually is? Usually the order goes from the most important ingredient to the least important one. Yes, that guess is very close: it is a shampoo. But the list doesn't tell you how the product is made. It's not the recipe. It doesn't tell you whether you put everything in one pot and cook it together, as I normally do, or whether you do it in some precise steps.

The software bill of materials in the IT world gives you this picture and nothing else. It doesn't tell you how to handle this teeny part, this problematic part. It doesn't tell you what the big blocks are or how to work with them. It just gives you this picture, and then it's up to you. This is maybe the biggest problem of the software bill of materials: it is not a solution.
It gives you just one thin layer which you can uncover, and under that is a huge pile of other problems you may need to solve: what to do with the dependencies, what to do with the vulnerabilities, how to replace something. So it's a whole rabbit hole into other stuff which you previously didn't know about and didn't care about, and now you probably should care about it because of security and so on. So it's just a map of your product. It is no solution for what to do with it.

Back to the food analogy, to show you what can happen with your product and how the software bill of materials may help you. This is my favorite sauce from my local supermarket, from Delilo, and I was hit by this several times. You buy it, it's a nice tomato sauce for pasta, and there is nothing on the label that would warn you. These are the ingredients; it's in Czech, and again you start with the most important stuff. So it's tomato, tomato sauce, then some onion, and that's it, plus some silly stuff at the end. But the silly stuff at the very end is chili, and it's actually so spicy that my daughter can't eat it. So I have a whole pot of pasta for myself. It's very, very spicy, and the front of the label doesn't say anything like "hot Italian pasta sauce". So you have to have the bill of materials, the ingredients, to actually find it.

The very same thing goes for the software bill of materials. Only when you see the big picture, that xkcd diagram, can you find out: okay, this part is at the very bottom, very tiny, but it supports all these big blocks and may be problematic, so maybe we should do something about it in the future.

So how does a software bill of materials actually look? It can look like this. This is the software bill of materials of some artificial, made-up company. It just lists what you are using in your project, in your company. But it's not very interchangeable. If you send it to another company, they can't merge it, they can't process it very well.
So it doesn't work very well in our current distributed world, and we probably want something else.

Around 2020 and 2021 we had some incidents: the famous SolarWinds one, and Microsoft had some issues as well. So the US issued the cybersecurity executive order, which said: you have to use a software bill of materials. It will help you with auditing and give you a map of what you have in your system. It will help you find vulnerabilities in those teeny boxes at the bottom. It can also help you with the licensing of the project, because again some teeny box at the bottom can have a strange license which prohibits everything at the top. Since then it has become a thing, and people started caring about it because they were forced to, and it's actually a good thing that they were forced.

The cybersecurity executive order said that the software bill of materials should have at least these fields; there are other fields, but they can be optional. So you have to have the supplier name, the component name, et cetera.

From that moment we have two standards. They are not really competing; they coexist. One of them is SPDX. It started around 2011 as a license auditing tool, so its origins are in licensing and license management, but then it grew into a full software bill of materials format. Then we have SWID tags, which are not actually a software bill of materials format specification; they just allow you to identify a component. And last is CycloneDX, which is more recent. It comes from a DevOps background, it's very lightweight, and it focuses on describing which component has which vulnerabilities and whether it is good for you or whether you should upgrade some components. So they have different goals and different audiences, and both SPDX and CycloneDX are fine. If you are interested in this for your project, either one of them is fine.

How does the software bill of materials actually look?
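The slide with the SPDX document is not reproduced in this transcript. The following is a minimal sketch of how such a document for a package named hello might be rendered in SPDX 2.3 tag-value form. The function name, the tool name, and the example.com namespace are hypothetical, the file list is left out, and the output has not been validated with any SPDX tooling.

```python
from datetime import datetime, timezone


def render_spdx_tag_value(name, version, declared_license="MIT", created=None):
    """Render a minimal SPDX 2.3 tag-value document describing one package."""
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        # Document header: who created the SBOM, when, and in which format.
        "SPDXVersion: SPDX-2.3",
        "DataLicense: CC0-1.0",
        "SPDXID: SPDXRef-DOCUMENT",
        f"DocumentName: {name}-{version}",
        # Hypothetical namespace; a real one should be a unique URI you control.
        f"DocumentNamespace: https://example.com/spdx/{name}-{version}",
        "Creator: Tool: hypothetical-sbom-sketch",
        f"Created: {created}",
        "",
        # Package section: similar in spirit to an RPM preamble.
        f"PackageName: {name}",
        f"SPDXID: SPDXRef-Package-{name}",
        f"PackageVersion: {version}",
        "PackageDownloadLocation: NOASSERTION",
        # No file list in this sketch, so the files are marked as not analyzed.
        "FilesAnalyzed: false",
        # Concluded: result of our own audit (none done here).
        "PackageLicenseConcluded: NOASSERTION",
        # Declared: what upstream says about its own license.
        f"PackageLicenseDeclared: {declared_license}",
        "PackageCopyrightText: NOASSERTION",
        "",
        f"Relationship: SPDXRef-DOCUMENT DESCRIBES SPDXRef-Package-{name}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_spdx_tag_value("hello", "2.10"))
```

The declared and concluded license fields are kept separate on purpose: the first records what upstream states, the second what the author of the SBOM found out, or NOASSERTION if nobody looked.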
So this is an example of a software bill of materials, an actual SPDX document. If you look, there is some header, a preamble, and then there is a part which describes the component: a package named hello. There is some information about it, and if you come from the Red Hat or Fedora world, it may remind you of the RPM preamble section. Most of the things come from there. Then we have the list of files, again something you can easily get with rpm -q --list, and then some identification. So if you are using RPM or any other package manager, this is very easy to retrieve with a few RPM query commands, and within 15 minutes you can have a software bill of materials.

From my point of view, the most problematic part is this one: the license. In this case the declared license is MIT, which means that upstream, the author of the package, declares that it is under the MIT license. And the concluded license is "no assertion", which means that I didn't care about it: I just took it and passed it on, and I didn't try to do anything about it, to find it out or audit it.

This is something we are trying to change in Fedora now, because right now we can't even do this. In Fedora we have been using an old system of identifiers. We now call it the Callaway system, because it originates from Tom Callaway, who was the legal person in Fedora at that time, and he invented identifiers for licenses. So he said, okay, GPL version 2 should have the identifier GPLv2, and that's it. But it was not a standard; no one actually used it but Fedora. So now we are moving to the SPDX license list from the standard, and we are actually trying to audit again which licenses the packages are using. And this is where you may have heard about SPDX in Fedora: every two weeks I send statistics on how we are doing with the conversion to SPDX identifiers.
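The move from Callaway identifiers to SPDX identifiers can be sketched as a lookup. This is a minimal illustration, assuming a tiny hand-written table; the authoritative mapping lives in the Fedora license data, and the names TRIVIAL, AMBIGUOUS, and convert are hypothetical. Some legacy identifiers map to exactly one SPDX identifier, while others, such as BSD, cover several and need a human to pick.

```python
# Illustrative subset only, not the real Fedora license data.
# Legacy identifiers with exactly one SPDX equivalent.
TRIVIAL = {
    "GPLv2": "GPL-2.0-only",
    "GPLv2+": "GPL-2.0-or-later",
    "ASL 2.0": "Apache-2.0",
}

# Legacy identifiers that covered several distinct licenses.
AMBIGUOUS = {
    "BSD": ["BSD-2-Clause", "BSD-3-Clause"],
}


def convert(callaway_id):
    """Return (status, candidates) for one legacy Callaway identifier."""
    if callaway_id in TRIVIAL:
        return ("trivial", [TRIVIAL[callaway_id]])
    if callaway_id in AMBIGUOUS:
        # Someone has to read the sources to choose the right candidate.
        return ("needs-audit", list(AMBIGUOUS[callaway_id]))
    return ("unknown", [])


if __name__ == "__main__":
    for legacy in ("GPLv2", "BSD", "Some Custom License"):
        print(legacy, "->", convert(legacy))
```

Even a "trivial" result only means the string has a single equivalent; it says nothing about whether the original license field was correct.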
This is the current chart, not a burn-down but a burn-up chart, and hopefully next year after summer we will be finished. If some miracle happens, we may be even faster. Right now, I and a few other people are focusing just on the license identifiers. There are people from Product Security who work on the software bill of materials itself.

For the first time in history we have the data about licensing in a machine-readable format. Previously, in the Callaway system, it was just on the wiki, described in an HTML page. Right now we have the licenses in JSON and TOML files, with attributes saying whether they can be used for anything or just for data or fonts, et cetera. We even have a formal grammar, so you can build your own parser and say what is good and what is bad.

One interesting thing is that we changed the rule to "no effective license evaluation", which slightly complicated this migration, and that's actually why it is so slow. Previously, and this is a real situation, there was one Perl package which said: this package can be licensed under any open source license. When you came across that, you said, okay, I'm choosing GPL version 2, and you put that in the License field of the package. Right now the guidelines say that you should not do that, and you should list all of them. That was the longest license string of all the RPM specs: GPL version 1 or GPL version 2 or GPL version 3 or WTFPL or MIT and so on, and it was 800 characters long.

SPDX is not so young, but it's definitely younger than Fedora. So Fedora has a longer list of licenses than SPDX has, and some of our licenses are not on the SPDX list, so we are trying very hard to push them there. A lot of licenses have recently been added because of Fedora.
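The rule of listing every alternative instead of picking one can be sketched with SPDX license expressions, which join alternatives with OR. This is a minimal illustration and not the formal grammar mentioned above; the helper names any_of, identifiers, and not_allowed are hypothetical, and the regular expression only splits identifiers from operators and parentheses.

```python
import re

# Operators used in SPDX license expressions.
OPERATORS = {"AND", "OR", "WITH"}


def any_of(licenses):
    """Join alternative licenses into one expression without choosing one."""
    return " OR ".join(licenses)


def identifiers(expression):
    """Extract the license identifiers from an expression, in order."""
    tokens = re.findall(r"[A-Za-z0-9.+-]+", expression)
    return [token for token in tokens if token not in OPERATORS]


def not_allowed(expression, allowed):
    """Return the identifiers in the expression that are not in the allowed set."""
    return [ident for ident in identifiers(expression) if ident not in allowed]


if __name__ == "__main__":
    expr = any_of(["GPL-2.0-only", "GPL-3.0-only", "WTFPL", "MIT"])
    print(expr)
    print(not_allowed(expr, {"GPL-2.0-only", "GPL-3.0-only", "MIT"}))
```

A real checker would parse the expression properly, including parentheses and exceptions after WITH, rather than just collecting identifiers.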
In case a license is not in SPDX and for some reason they don't want to add it there, the standard lets you use LicenseRef-something. It means that it's your own license, it will never be on the SPDX list, and it's up to you to describe what that license means.

Now you may think: okay, thank you, Miro, you explained to me what the software bill of materials is, now I'm super wise, that's it for me, and I don't need any other information. But this just scratched the surface. What I described, and what you probably thought about, is the analyzed bill of materials: the list, the document you create when you are building a package or a container. But you can have various types of bills of materials.

For example, the build type: you may describe which GCC you used while building the package, which version it was, and whether it had some vulnerabilities at the time when you were building the software. Or the source type: was the source hosted on GitLab or GitHub at a time when there was some security incident there. Then deployed: is your software deployed together with some other software, which together may cause some problems? Or what is actually needed at runtime: your software uses S3 storage buckets from Amazon, but that is not described anywhere, and it may be a problem. So you can have a runtime bill of materials, which tries to capture these relations and the dependency on S3 buckets. It can be Amazon's S3 or some small provider which may have some vulnerabilities, and it depends on what you are actually using. There is other stuff I don't want to mention here, because it would no longer be easy and no longer for dummies.

So I will conclude with what we have learned: that an SBOM is just a map, a list of the ingredients that make up your software; that we have two parallel standards, one from SPDX and one from CycloneDX, and both
are acceptable; that a bill of materials is actually easy to generate if you are using RPM, while if you are using containers made from scratch, or something from GitHub deployed directly, you may have some problems; that licensing is likely the most tricky part; and that we have various SBOM types and the rabbit hole goes deep. That's it. Any questions for me?

So the question is what will happen if we use two pieces of software, one under BSD and one from the kernel, which are incompatible. I may answer it for Fedora, but I can't answer it with regard to the software bill of materials, because the software bill of materials doesn't care. It's just a map. So you may say, okay, I have this whole deployment, and I have one part under BSD and one part under GPL version 2, and it's fine. Or if it is some proprietary software, or some hidden secret that no one knows about and no one sees, you may be fine, and the customer may be fine or not. It's just a map, so it doesn't tell you what you should do; it's up to you. And in Fedora, I don't know: if they are different components then it's fine, if they are linked together then it's not fine, and it's more a question for the lawyers. I'm pretty sure the answer will not be straightforward, so I don't know.

So, for the record: what do I consider most important for generating an SBOM? That was the question, right? I'm not sure whether I can answer it, because I'm coming from the licensing part. The SBOM for me is an additional part where I was just curious what the hell it is. I'm not directly working with software bill of materials documents, so I don't even know whether SPDX or CycloneDX is better. But what I found interesting in one discussion is that you should have tools that generate it. I'm not even thinking about people generating it by hand; it should be fully automated. And there is one initiative which tries to provide a tool so that an SBOM will be added to every software project. But how can you do that? Because when you retrieve the tarball from upstream and
you retrieve this software bill of materials with it, you may or may not trust it. If you don't trust it and you want to validate it, then you should have a tool which generates your own bill of materials, and then you can compare them and see whether it's valid or not. But if you have a tool which can generate it on your machine, then you probably don't need the software bill of materials from the vendor or from upstream. So this is an interesting situation. But definitely, everyone, including the author, the vendors, and the customer, should have tools which can generate the bill of materials, and they should produce the same output. This is for me the most important part.

Okay, so the question is what the hell this graph is. It shows our migration of the licenses from the old Callaway system to the new SPDX format. It starts in December of last year; this is point zero. The blue part shows how many packages are already converted, and this is an estimate of how long it will take us to get to 100% at this pace. The yellow part is how many trivial conversions are available. I'm in a group with two lawyers, and they probably hate me for calling it a trivial conversion, because the audit is not straightforward and, because of the no-evaluation rule, you should still check the license. But it's trivial in the sense that the license in the old Callaway system is, for example, GPLv2, and we have only one identifier in the SPDX format which exactly equals the old one.

On the other hand, this red part is terra incognita, because it means there are more options. For example, the old Callaway system had one BSD identifier for both BSD 2-Clause and BSD 3-Clause, and in SPDX we have to choose which of them it is, because they have different identifiers. With MIT the situation is even worse, because MIT in the Callaway system represented, I think, eight or 10 SPDX licenses
and it may even hide some licenses which don't have an SPDX identifier yet, which you have to apply for, and it may take anywhere from a week to two months to actually get it from SPDX and into the Fedora license data. So if you want to work on that, you should probably start quite soon.

This is all packages in Fedora. Packages in Fedora don't have an SBOM attached. As far as I know, in Fedora we generate the SBOM for containers, where we take the container as only one part, one file. So as far as I know we don't dive into the container itself, so it's a super easy bill of materials.

Yep, so the question is why the trivial conversions are not simply converted by automation. I'm in a group of four, two lawyers and two engineers, and I'm one of the engineers, and all the others think that we should evaluate it manually, because at the same time we changed the meaning of the license. Previously we could evaluate the effective license; now we should not. So the license string in some cases actually changes even if it were a conversion from the Callaway system to the Callaway system, because you add more licenses with some operator, license one or license two, for example. And sometimes the license evaluation was done a pretty long time ago and it may no longer be true. That actually happens, and it's not so rare. Last week I was trying to convert RPM itself, the rpm package, and it was not straightforward. The RPM website said that it is under the GPL version 2 license, the COPYING file said it is GPL version 2 with some exceptions for rpmio and librpm, and the license string in the RPM header said something different. So we have three cases, the issue is still open on the RPM GitHub, and I'm discussing with Panu, Miro Hrončok, and Neil what the final concluded license actually is. So it's not straightforward even in those trivial cases. Trivial is not so trivial.

So the question is whether it is a problem that we have two standards, and whether we can convert from one to the other. I
don't know. I would like that, but I don't know; I have never worked with these formats directly. As I mentioned, I'm coming from the licensing part, so I was just exploring what was around the licensing part. So I don't know.

So the question is whether you, as Fedora package maintainers, can help. Yes, yes, you can, and please do, because you know your package well. You know whether it is some old project like RPM, which is 20 years old or even more, or whether it's new; whether what upstream says is true; which files are used there; and whether it's really trivial, so you can convert it from GPLv2 to GPL-2.0-only and that's it, a five-second job, or whether you have to audit it. You know which files are there and what the problematic parts can be. So you know it very well. We have tools like license-fedora2spdx which can help you convert the strings, and there are other tools like ScanCode, askalono-cli, and licensecheck which can help you audit the files. So this is where you can help. We are organizing workshops, so if you hesitate about something we can help you, but you know more than we do, so please help us.

And the ultimate reason why we are doing this conversion: the bill of materials is one thing, SPDX license strings are another thing, and any future software which builds on auditing or management of licensing will build on some industry standard, and that seems to be SPDX, especially for licensing. So we don't want to do something just for the moment which will handle the Callaway system. We want to use something new, an industry standard, so that when somebody else writes other tools, they will use the new standard, and we will be there as well.

Any other questions? Okay, no other questions. If you have any later, find me. If you want help with the conversion of your package to SPDX identifiers, let me know and I will do my best. So thank you.