I'm going to give you a talk, an overview of what Koji is. It's the build system that Fedora uses: how it works, and how you can use it to build your own software. So what is Koji? Koji is a task scheduler and a results gatherer. It's been the Fedora build system since the Core and Extras merge, and it's also used internally inside of Red Hat to build Red Hat Enterprise Linux. There's a whole bunch of other companies and people that are using it to build their own software today. It schedules tasks based on a priority system that's built in. So when you submit something, it magically happens, and it tracks the results. When you build an RPM, it tracks what came out of that build, and it also tracks what went into it: what software you used to build your software. It's quite a convenient and powerful tool.
What is Koji not? It's not a software release lifecycle management tool. It's not a deployment tool. It's not a repository management tool. And it's not a continuous integration tool. If any of you have used Hudson or something like that, where every time you commit it builds a new version and sees if you've broken the world or not, you can't do that with Koji. Well, you could, using the API, but it doesn't have that built into it. It's not something that by default is going to do your continuous integration. You can pull repositories out of it, but Koji by itself doesn't make repositories. There are deployment tools like Puppet and other tools to deploy your software. And you can use software external to Koji to manage the lifecycle, but it's not built in.
The design goals of Koji: the main one was build reproducibility. If we built this software today, and six months from now we're trying to debug a problem with it, we want to know exactly how we built it and what we built it against, so that we can reproduce that environment and work out what the problem is.
And it heavily uses existing components. It uses Mock, yum, rpmbuild, and createrepo, all open source tools that are out there, under the hood to do its job.
So, Mock. Mock is a tool that creates a chroot and executes commands in the chroot. Koji takes advantage of Mock: it builds its source RPMs and its binary RPMs inside of Mock chroots. You can execute commands in there, privileged or not. We copy the output out, which we then upload to the server. Mock ensures that you've got a reproducible environment. It installs whatever's in the repo that it's configured with. We have what's called the build group, and that tells Mock, these are the things you put in first. Then we look at the BuildRequires of the spec file and we install all of those things. Once we've installed all of that, the package should be able to build. So it's always building against the minimal set of things needed to build. Unless you configure your build group to have the world, and then you're not really going to know what is actually needed to build your software, but it will build. And it uses yum in the back end to create the chroots.
So what is yum? Yum is a software installation tool. It reads repository metadata and installs packages based on that, resolving dependencies as it goes. It's a pretty simple but powerful tool that probably anyone who's used RPM, or used a Fedora or RHEL system, has complained about being slow or messy or whatever. But it is a powerful and complex tool.
We then use rpmbuild. Everybody first learned how to build an RPM by running rpmbuild -ba or --rebuild. Koji uses rpmbuild to create your binary RPMs. rpmbuild converts the bare components, starting with the spec file, which is your recipe.
It tells you what you need to have installed to build against, how to build it, what commands you need to run to build it, and what you should get out of it at the end. In the source RPM, we have the spec file, the upstream tarball, and patches. We end up with a source RPM and a binary RPM.
So Koji itself has a bunch of components. I actually don't have them all here, but these are the ones that deal with RPM building. We have Koji Hub, which is the central brains; kojira, which is the repo administrator; kojid, which does the building; Koji Web, which is a front end; and the koji command line.
So, Koji Hub. It uses XML-RPC for all of its communications. It's a passive application. You can set up the hub by itself and it's not going to do anything. It runs under mod_python. It's not active in any way, shape or form; it's entirely passive. It is also the brains of the operation. It tells you what you can and can't do. It does all of the writing to the file system. It does all the communication with the Postgres database in the back end. And it has all the brains as to which task is the next one to be run and which machine that task should run on. You don't want a PPC build to end up on an x86_64 builder. It's just not going to work.
kojira manages the repositories that are used internally inside of Koji. It watches the tags, sees when you've built something and it's landed in a tag that feeds a build root, and goes, okay, we need a new repository, and creates one with the latest in the tags.
kojid is generally the thing that is the most active. It runs Mock, creates the source RPMs, and creates the binary RPMs. It runs createrepo to actually generate the repositories.
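To make that division of labor concrete, here is a minimal sketch of a passive hub handing out work: a builder asks for the next task and gets the most urgent one whose architecture it can run. This is an illustration only, not Koji's actual scheduler. The `Task` and `Host` classes, the `next_task_for` function, and the rule that a lower priority number runs sooner are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    task_id: int
    arch: str      # architecture the task has to run on
    priority: int  # assumption for this sketch: lower number runs sooner


@dataclass
class Host:
    name: str
    arches: List[str]  # architectures this builder can run natively


def next_task_for(host: Host, open_tasks: List[Task]) -> Optional[Task]:
    """Pick the most urgent open task that the polling host can run."""
    runnable = [t for t in open_tasks if t.arch in host.arches]
    if not runnable:
        return None
    # Ties on priority go to the task that was submitted first.
    return min(runnable, key=lambda t: (t.priority, t.task_id))


if __name__ == "__main__":
    open_tasks = [
        Task(task_id=1, arch="ppc", priority=20),
        Task(task_id=2, arch="x86_64", priority=20),
        Task(task_id=3, arch="i386", priority=10),
    ]
    builder = Host(name="builder1", arches=["i386", "x86_64"])
    print(next_task_for(builder, open_tasks))
```

The PPC task is never offered to the x86_64 builder, which is the point made above: the matching happens on the hub side, and the builder only asks.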
kojid polls the hub for new tasks, which keeps the brains in Koji Hub running and keeps things moving along.
Koji Web gives you a visual status of what's going on. You can look at Koji Web and see, oh, my build's running. Or you can look up build logs: say you did a build of GCC last week, it failed, and you forgot to do anything about it. You can go back, grab the build logs, and check what went on. It gives you a way to sign up for notifications of builds. The notifications inside of Koji itself are pretty simplistic, but you can say, I want to know every single time that a tag happens or a build happens, or just for this particular package. It's rudimentary. It also allows you to cancel and resubmit tasks.
The CLI is a much more powerful way to interact with Koji. It lets you create users. It lets you submit tasks, the task being to regenerate a repository, submit a build, do a scratch build, or move a package from one tag to another. It manages permissions, and it lets you add tags and manage the tags and targets: change their names, or where things end up.
There's a bunch of tools that use Koji that let you do some other things. Mash is a repository generation tool. Sigul is a tool for signing your RPMs. Bodhi is a package life cycle management tool; it allows you to transition your packages through states of testing, going to stable, and all that kind of stuff. And then there's the package database, which manages the package lists and who's the person responsible for each package.
Koji is also capable of doing other types of tasks and builds than just building RPMs. You can create live CDs, which is something for Fedora 15 with release engineering.
We're going to create all of our live CDs and appliances inside of Koji. You can create Windows builds, which is not something Fedora is ever going to do. And you can also do native Maven builds. It creates a Linux chroot and uses Maven to pull in all the JARs and stuff from a Maven repository, builds your Mavenized Java product, and spits out WARs, EARs, JARs, all the POMs, and all the stuff that goes with the Java world. It can then optionally convert those into RPMs that you can use in regular RPM builds. So it's kind of a cool thing for the Java world. They like to ship things a bit differently, because they want it to run everywhere and don't so much care about packaging for one distribution or another.
So inside Koji, things are organized by tags and targets. A tag is a collection of things, generally package names, and optionally architectures. In Fedora, we build all of our RPMs against a target; for Rawhide right now, we use dist-f15 as our target. It builds things using the build tag that is defined for that target, which is dist-f15-build, which has the group metadata and all of the packages that are available to it. And then at the end it puts the result into a destination tag. So we tag the package into dist-f15, which the build tag picks up through inheritance, which I'll show you a bit later on. I'm going to demo that, and hopefully it works. It puts it into the destination tag, and then Koji looks at that and goes, oh, we've got this new package in dist-f15, we need to make a new repository that includes that new package. So tags have inheritance.
So you can say dist-f15 inherits from dist-f14-updates, which inherits from dist-f14, which inherits from dist-f13-updates, and so on down, so that you're not starting from scratch every single time you're building a new release. Yeah. Tags. Yeah. So, yeah, the tags inherit from each other. Feel free to ask questions at any time, because I don't like talking all the time. The tags can inherit from each other so that if you've only built a package for Fedora 13, we use the tag inheritance to make that package available for Fedora 14 and Fedora 15 as well. Instead of having to build it for all of them, you can build it for the lowest common denominator, and it then propagates through the inheritance to be available in all the other tags that are built upon it.
One thing that tends to trip people up a little bit is that every RPM has to be entirely unique in its NVRA, which is name, version, release, and architecture. So you can have, for instance, three different source RPMs called foo, bar, and baz that each spit out an RPM called foo-1.1-1, as long as each one has a different architecture. The three different source RPMs can build a binary RPM that has the same name, but for a different architecture. But you can't have two source RPMs that provide the same binary RPM. Let's say GCC decided it was going to bundle glibc and make a sub-package called glibc-whatever: you can't have two source RPMs producing a binary RPM by the same name. It trips people up because they think, well, the source RPM is unique, but every single RPM has to have a unique NVRA. You can use epochs and things to drop back versions, and Koji actually doesn't take the epoch into consideration.
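The uniqueness rule can be sketched in a few lines. This is a toy model, not Koji's code: `register_rpm` and the `seen` set are hypothetical names, and the only thing taken from the talk is that the key is name, version, release, and architecture, with the epoch left out.

```python
from typing import Optional, Set, Tuple

NVRA = Tuple[str, str, str, str]  # (name, version, release, arch)


def register_rpm(seen: Set[NVRA], name: str, version: str, release: str,
                 arch: str, epoch: Optional[int] = None) -> None:
    """Record an RPM, refusing any NVRA that already exists.

    The epoch is accepted but deliberately left out of the key, mirroring
    the point above that an epoch does not make a build unique.
    """
    key = (name, version, release, arch)
    if key in seen:
        raise ValueError("duplicate NVRA: %s-%s-%s.%s" % key)
    seen.add(key)


if __name__ == "__main__":
    seen: Set[NVRA] = set()
    register_rpm(seen, "foo", "1.1", "1", "i686")
    register_rpm(seen, "foo", "1.1", "1", "x86_64")  # fine: different arch
    try:
        # Same NVRA again; adding an epoch does not help.
        register_rpm(seen, "foo", "1.1", "1", "i686", epoch=1)
    except ValueError as err:
        print(err)
```

Changing the release is the usual way out, since that changes the key.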
So if you use an epoch to rebuild an old version, you actually have to do something else to make it unique for the build to complete.
Another thing, while I'm here: Koji makes the repositories from what it calls the latest package. Most people tend to think of the latest package as, well, which is the highest NVRA? So say you build glibc 2.1.0 and then you go and build glibc 2.0.9, which is an older NVRA. In Koji terms, that old one will be the latest package, because it was the most recently tagged package, and that tends to trip people up. They're like, well, I've built this newer version, why isn't it available in my build root? It's just because it goes by the latest tagged package, which allows you to drop back to old versions and do some things like that. And we only have the latest version of each package in the repositories, so you can't have a hundred different versions of GCC and pick in your spec file which one you want. It doesn't allow you to do that, but you can get there by having a bunch of different tags that each have a different GCC version, and build that way.
So Koji tags inherit from other tags. They define which architectures you want to build for. In your one Koji Hub, you might have 50 tags, and you have builders that do i686 only, 32-bit machines, and you've got other 64-bit machines, and you have some Itanium. You can use different tags to build for different architectures, and that tells Koji where to build.
Targets are what you use when you submit your build. You say which target you're building for, and it defines what's used to populate the build root and where the package goes at the end.
So, building your own software using Koji.
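Before moving on, the "latest package" rule and the tag inheritance described above can be sketched together. This is a simplified model under stated assumptions, not Koji's implementation: the `PARENTS` and `TAG_HISTORY` data, the package releases, and both function names are hypothetical, and the rule that a nearer tag wins over an ancestor is my reading of how inheritance is meant to behave.

```python
from typing import Dict, List, Tuple

# Hypothetical example data: each tag lists its parent tags, and each tag has
# a history of (package, build) pairs in the order they were tagged.
PARENTS: Dict[str, List[str]] = {
    "dist-f15": ["dist-f14-updates"],
    "dist-f14-updates": ["dist-f14"],
    "dist-f14": [],
}
TAG_HISTORY: Dict[str, List[Tuple[str, str]]] = {
    "dist-f14": [("gcc", "gcc-4.5.1-1")],
    "dist-f15": [("glibc", "glibc-2.1.0-1"), ("glibc", "glibc-2.0.9-1")],
}


def inheritance_chain(tag: str, parents: Dict[str, List[str]]) -> List[str]:
    """Return the tag followed by its ancestors, nearest first."""
    chain: List[str] = []
    pending = [tag]
    while pending:
        current = pending.pop()
        if current in chain:
            continue
        chain.append(current)
        # Reverse so the first-listed parent is visited first.
        pending.extend(reversed(parents.get(current, [])))
    return chain


def latest_builds(tag: str, parents: Dict[str, List[str]],
                  history: Dict[str, List[Tuple[str, str]]]) -> Dict[str, str]:
    """Map each visible package to its most recently tagged build."""
    latest: Dict[str, str] = {}
    for t in inheritance_chain(tag, parents):
        in_this_tag: Dict[str, str] = {}
        for package, build in history.get(t, []):
            in_this_tag[package] = build  # a later tagging replaces an earlier one
        for package, build in in_this_tag.items():
            latest.setdefault(package, build)  # nearer tags win over ancestors
    return latest


if __name__ == "__main__":
    print(latest_builds("dist-f15", PARENTS, TAG_HISTORY))
```

Note that glibc resolves to 2.0.9, the build tagged last, even though 2.1.0 has the higher version, and gcc is picked up from dist-f14 purely through inheritance.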
The simplest path is to use external repositories. If you use internal repositories, say you wanted to build for Fedora 14 or RHEL or whatever, you need to import it all, and then you need to continuously import the updates, keep that updated, and manage it yourself. Using external repositories, you can just point Koji at a mirror, and whenever there's an update available, it'll get picked up and be available for you to build against. You can only use a single mirror. Yum supports mirror lists; by default, Fedora out of the box is configured to use mirror lists. It talks to a server and says, give me all the mirrors that are available. You get a big list back, and if downloading fails, it jumps to the next one and continues on and you're always happy. With Koji, you can't use mirror lists. You have to point it at a single mirror. It doesn't have that ability to jump through the different mirrors. And if you wanted to build Mandriva or SUSE or some other distribution, you'd need to build the supporting tools. You'd need to build yum or Mock and whatever the underlying core components are, if they're not available for your distribution.
So we're going to do a demo now. Last night I set up a little VM, and I have a Koji instance configured on it. Right now we have no packages configured and no tags configured. We've not done any tasks. We don't have any build targets. We have a single host that can build the base architectures, i386 and x86_64, and we have a couple of users: the admin user and kojira, which is the user that kojira uses. I mirrored Fedora 14 onto the VM. So I'm just going to go through some steps, and feel free to ask questions if you want. Is that better? So I just put the steps we're going to do in a little file.
So we're going to add a tag here. Added a tag, Fedora 14 base. Now I can't see my mouse. There it is. So let's add another tag here for Fedora 14 add-ons. We're going to add another tag for Fedora 14 add-ons testing, and we're going to tell it that its parent is Fedora 14 add-ons. So then if we go back to our little tags page here, we now see the few different tags, and if we go to Fedora 14 add-ons testing, it shows the inheritance and says, oh, we've got add-ons as a parent.
It keeps the users in the database. For authentication, Koji will use either Kerberos or SSL certificates. This particular instance is using password authentication, but that's not recommended, because it sends the password in clear text and you've got to put the password on the command line. I've actually aliased koji to koji --user admin --password password, just for the demonstration. So if you use LDAP for your users, or if you're using Active Directory, you should be able to use the Kerberos side of it to do the authentication. But it keeps its users internally in the Postgres database, and it doesn't really have any ties into LDAP or something like that.
So now we've got this. It goes over the screen. Let me just make it a little smaller. So we'll add another tag. Yep, I did too. In this tag here, we're saying the parent is the add-ons, and we're defining architectures. You don't have to define architectures on every tag, but you do have to define them on the tags that you're going to use to build against. It needs to know what architectures we're going to build for. In this instance, we're going to build i686 RPMs and x86_64 RPMs. And then we will add tag inheritance with a lower pr... What do I do here? Yeah, good job. That'll teach me for doing it at the last second. So we added the tag inheritance. The first time you add a tag in the inheritance chain, it assumes a priority of zero.
And if you just try to add a second tag to the inheritance chain without defining the priority, it will throw an error saying, oh, we've already got one. So you need to define a priority for the second tag. If we now look at the build tag in our web interface, we have the add-ons and we have the base.
Then we're going to add an external repository, which is mirrored on the VM. Hopefully I've got the spelling right here. So we're adding the external repository. The -t here says to add it to this tag. We're going to call the external repository Fedora 14. We have a URL that is available via the Apache server on the VM. So it's created the external repository and added it to the tag. Now if we refresh here, it shows up: external repository Fedora 14. It's inherited from the tag that it comes from.
So we now add a build group, because whenever Koji tries to do any builds, it pulls from the build group, creates a comps file, and uses the build group to populate the initial... I failed miserably here. I'm sorry about that. Probably should. So now we've added the group, and I grabbed the package list here. The next command is to add the group packages for this tag. This is the group, and this is the list of packages we're going to add. I grabbed that list of packages from the Fedora 14 build tag on Fedora's Koji. Those are the packages that are defined as the minimal build root in Fedora. They're ones that you don't need to add as BuildRequires, because you can guarantee that they're always going to be available. So I've added that group. If we run koji list-groups, it lists the groups that were added and tells us the packages.
So now we're going to add a target. When adding the target, we say here, this is our target name.
This is where we populate the build root from, and this is where the packages that we build end up, where we want to tag them when the build completes. So now that we've added that target, we should have a task. Hopefully kojira is running. I should have checked to make sure kojira is running. It's not. And if it's not running, kojid is probably not running either. So now we've got a new repo task, because kojira has gone, oh, we've got a target, so now we need a repo for it, let's make a new repo task. kojid has then gone, oh, we've got a new task, and it's now doing its thing to run these two createrepo jobs.
So we want to add packages. We're adding a package. You have to define an owner for a package when you add it. We're saying that admin is the owner, because it's the only user that we can actually use. This is the tag where we're adding the package, and then it's available to all the tags that inherit from that tag. And we're going to add bash and binutils. So they're added. Now if we go to packages, we see the two packages. Click on one and it says it's in this tag. We've got no builds. Get the whole page in.
So now what I want to do is wait while it runs createrepo. What the createrepo task is actually doing is it gets the list of all the packages that are in the tag, gets the latest version of each, populates a list, and then creates a repository from that. And this is where we've got this external repo as well. It grabs the repository metadata from the external repository and then merges the two together to give you a single repository that is referenced in the Mock configs that Koji builds.
It's running on a VM on my laptop. No, it's a Fedora 14 VM on my laptop that I threw together yesterday. It took me about 20 minutes or so to set up, but I've done it a whole bunch of times.
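For reference, the setup steps of the demo so far can be written down as a dry run. This sketch only prints command lines; it does not talk to a hub. The tag names, target name, repository name, URL, and the short build group package list are hypothetical stand-ins, since the recording does not capture the exact strings typed. The subcommands and options are the ones I believe the koji CLI provides; check `koji help` on your version before relying on them.

```python
import shlex
from typing import List

# Hypothetical names: the talk does not show the exact tag, target, or URL strings.
BASE_TAG = "f14-base"
ADDONS_TAG = "f14-addons"
TESTING_TAG = "f14-addons-testing"
BUILD_TAG = "f14-addons-build"
TARGET = "f14-addons"
REPO_URL = "http://localhost/mirror/fedora14/$arch/"
# Illustrative subset only; the demo used the list from Fedora's own build tag.
BUILD_GROUP_PACKAGES = ["bash", "coreutils", "gcc", "make", "rpm-build"]


def demo_commands() -> List[List[str]]:
    """Return the demo's koji invocations as argument lists, in order."""
    return [
        ["koji", "add-tag", BASE_TAG],
        ["koji", "add-tag", ADDONS_TAG],
        ["koji", "add-tag", "--parent", ADDONS_TAG, TESTING_TAG],
        ["koji", "add-tag", "--parent", ADDONS_TAG,
         "--arches", "i686 x86_64", BUILD_TAG],
        # The first parent got priority 0, so the second needs an explicit one.
        ["koji", "add-tag-inheritance", "--priority", "1", BUILD_TAG, BASE_TAG],
        ["koji", "add-external-repo", "-t", BASE_TAG, "fedora14", REPO_URL],
        ["koji", "add-group", BUILD_TAG, "build"],
        ["koji", "add-group-pkg", BUILD_TAG, "build"] + BUILD_GROUP_PACKAGES,
        ["koji", "add-target", TARGET, BUILD_TAG, ADDONS_TAG],
        ["koji", "add-pkg", "--owner", "admin", ADDONS_TAG, "bash", "binutils"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in demo_commands():
        print(shlex.join(argv))
```

Keeping the steps as argument lists rather than one shell string avoids quoting mistakes, for example with the space inside the architecture list and the `$arch` placeholder in the URL.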
The first time you set up Koji, it usually takes a while to get how it all works and what it does and get it all going, and once you understand it, it's pretty simple to set up a new Koji. Yeah, I could have just installed it on my laptop. I did it on the virtual machine so that when it's done, I can delete the LVM partition and it's gone and cleaned up. That's the only reason I ran it as a VM; it could have just been on the machine.
If you're going to build packages for Fedora, what you would do is go through the process of becoming a Fedora package maintainer. You do the development work locally, and then you submit the build to the Koji build system that Fedora runs, and it goes through there.
Yep, there are patches around that will cross-compile, but they won't ever be accepted upstream. The goal of using Koji, and having the reproducibility, also kind of mandates native compilation. Mock uses yum to populate the chroot, so if you wanted to, say, cross-compile ARM on x86_64, you don't really have any way to run the post scripts in the RPMs, and you don't get the environment set up exactly how it's intended. You can use a VM for it: you could run kojid inside of an ARM VM on an x86_64 box. But the way we've developed it is always to do native compilation. It will do 32-bit builds on a 64-bit box. You can build any compatible architecture. The builders in Fedora are all x86_64 boxes or PowerPC64 boxes, and they build 32-bit and 64-bit packages. As long as the machine is capable of natively running the code, it will do the builds for that architecture.
Correct, they're done in a chroot on the build host, and each build is done in a brand-new chroot.
So we're getting there. Looks like we're probably... Nope. So we have some repositories created now. I have a couple of source RPMs here, so we'll run koji build.
We're going to send it to our target, which was this, and let's build bash. So it uploads the source RPM. By default, administrators are allowed to build from source RPMs and everybody else has to build from SCM checkouts. That's configurable, however you want to do that. You can even force it so that everybody has to build from a source code management checkout. So we can see here that we've got this task. We're watching the task. We're creating the new build. We're going to build it for i686 and x86_64. It's now being picked up by the host. You can watch on the command line if you want. You can also go to your tasks here. So we can see the task, and we have all our logs. If we click on watch logs... well, it's logs. It's installed the packages that we defined in the build group, plus their dependencies. Once they're installed, it then installs the source RPM and rebuilds it natively on the architecture, so that any architecture-specific build dependencies are picked up correctly, for example if you put an %ifarch s390x in the spec file and add a BuildRequires specifically for that architecture.
And... No, it just builds from spec files or source RPMs. The way that Fedora builds, we have a command-line utility called fedpkg, which is a tool that interacts with the Fedora revision control system, the git repositories that we use for each package. Those host the spec file and the patches, and then we put the tarballs in a lookaside cache. The build-from-SCM task, the build source RPM task, does a git checkout, because you tell it, I want to do this build from this git repository and this git hash. It does a checkout, gets the hash that you want, and then it runs fedpkg sources, which is a configurable command. So you could configure it to run a command to, say, make...
tarball, or whatever command you need to run to get a tarball of your source. So fedpkg sources then downloads the tarball from the lookaside cache. It finds the spec file that's in the git repository, does an rpmbuild -bs, and creates the source RPM that it then feeds into the build tasks. If you wanted to build from an SCM, the supported ones are CVS, Subversion, Git, and Mercurial, and there are some patches floating around for Perforce and a couple of other proprietary ones. You can do that, and then you can define a command that you need to run to create your tarball so that you can make your SRPM. So we first create the SRPM, and then we use that to create the binary packages.
So it looks like we're still actually installing. It's still running here. We'll just cancel that and run top.
So Fedora uses a tool called Bodhi for its software release management. When you build a package, you create an update in Bodhi, and Bodhi then manages the package through different tags. We have an updates candidate tag, then an updates testing tag, and then the updates tag. When you do a package build for an update, it gets tagged into the updates candidate tag. Then you create the update in Bodhi, and Bodhi goes through that and says, okay, we're pushing updates, the package has been signed. It moves the package from the updates candidate tag to the updates testing tag, and then we create a repository from the updates testing tag and push that out for testing. When testers test it, they can give karma in Bodhi, and when it gets enough karma, it goes to updates and becomes a stable update.
Yep. As long as the RPM on the build box is capable of installing the RPMs, you can build for different releases.
So Fedora builders right now run RHEL 6, because Fedora 14 and newer requires a 2.6.32 kernel, with glibc optimized to be on a 2.6.32 or newer kernel. So the builders are running that, and that RPM supports all the SRPMs all the way back. But we had RPM changes in Fedora 10, which meant that the RPM on RHEL 5 was no longer capable of installing the RPMs into the chroot, because it's the host's yum and RPM that install the packages into the chroot; it's not using the target's. So we had to build a new RPM that supported the newer features and put it on all the builders, and we ran them on RHEL 5 until RHEL 6 was available and we moved the builders to that. So as long as the host's RPM is capable of reading the RPMs and installing them, you can build across releases. The builders in Red Hat run RHEL 5, and they build all the packages all the way back as far as we support. That goes as far as you need.
Yeah, so you could run Koji on Ubuntu and build all of your RPMs on Ubuntu, and it's going to work just fine. We don't currently support building debs, but there's no reason that, if a Debian equivalent of Mock was available and a little bit of porting of kojid was done, you couldn't build .debs as well. It's just not something that we do, and no one in the community has stepped up and said, hey, I really want to use Koji to build my debs, here's a patch that does it. If somebody wanted to do that, I'm pretty sure we would accept it.
What's that, sir? You could build Mandriva packages or SUSE packages. If you used an external repository, you'd need a yum external repository. We don't support the native tools, but that could maybe even be done as well.
So let's see where the build's at.
So it's done, so let's go to all tasks, and our build failed for some reason. Okay. It's failed because it reads in the external repository metadata and says, hey, we've got this RPM already. Even if it comes from an external repository, we keep track of all of them in there, and it's saying we already have this package in there for some reason. I think it was actually in the external repo: it's saying the version of fedora-release was different between, I guess, the 32-bit and the 64-bit, which it should not be. So I'll try binutils, but I have a feeling that something, in my rush to set it up, is not quite right with the repository that I'm using.
Okay. I mean, Koji started life as the internal Red Hat build system. When we got to the point where we were merging Fedora Core and Extras together and making everything available externally for anyone to contribute to, Plague, the build system that was used by Fedora Extras, wasn't quite up to what we needed from a build system to do it all. At that point, Red Hat open sourced Koji and made it available under the GPL for everybody.
Authenticating users. Yep. But we have an LDAP infrastructure to authenticate stuff. Okay. So at the moment, the options are SSL certificate, Kerberos, or username and password. The username and password option is very insecure, but Koji Hub supports plugins, so you could quite easily write some kind of authentication plugin that would talk to the LDAP server. It's just not something that we've had to do, so we have... Yeah. But if you're using Active Directory, it has Kerberos and you could use that.
Yep. So a build, it's permanently stored in the database.
So at the end of a build, when you do an actual build, we store in the database what was in the build root and what we got out of the build, and we know where we put the files, and we keep track of everything to do with that build, so that if we ever need to, we can reproduce it. With a scratch build, we do the build in a Mock chroot, we put it in a directory called scratch, and we leave it there. We don't bother tracking what was in the build root; that's only stored in the logs. And we don't import the build information into the database, so it's essentially as if that build never happened. So it's a good way to do a quick test. Is this change really right? You just do a scratch build, it builds, you can grab it and test it, and if it works, push it out. So a scratch build is designed to be temporary, an I-want-to-check-if-this-thing-does-what-it's-supposed-to-do type thing, whereas an actual build we store and keep track of.
Excuse me. Unfortunately, this is all the time we have for this now. Okay. Yeah, we'll have to wrap up. All right. So my build is still going, and hopefully it will actually complete. The only other things I had to quickly add: Mash is a really good tool to actually get usable repositories out. It uses createrepo to make the repos, and you could use something like rhnpush to push to Satellite to deploy your software. And koji list-api lists the whole Python API that's available for you to extend and build upon Koji and do lots of cool things on top of it. If you have any questions, I'm pretty easy to get a hold of. I'm dgilmore on Freenode, or you can email me at dgilmore at fedoraproject.org.
So, a note of appreciation from the conference organizers. Next slide. Thank you.