Well, hello everybody, and welcome again to another OpenShift Commons briefing. Today we've got another fascinating new project going on at Red Hat that we'd like to introduce you to, called Buildah. We have Nalin and Dan from the CRI-O team, who are going to give us an overview and some examples and demos as well. We'll have Q&A in the chat, and after the brief overview and probably a little demo, we'll have live Q&A with Dan and Nalin. So I'm going to let Nalin take it away and we'll get rolling here.

Okay. Hello everyone. As Diane mentioned, my name is Nalin Dahyabhai and I work on the containers team here at Red Hat, and with us on the phone is also Dan Walsh. Dan?

Yeah, my name is Dan Walsh. I run the containers team at Red Hat, and we've been working on a whole series of tools to make using and running containers easier and more robust. So I'll let Nalin take over.

Okay. So today we're here to talk about Buildah, which grew a bit out of the CRI-O project. In terms of background, CRI-O exists primarily to be a runtime for the Container Runtime Interface that Kubernetes and OpenShift use, and as far as Kubernetes is concerned, building an image is not part of that, so it's out of scope for CRI-O. So a while ago, I guess it's getting on to several months at this point, Dan had been telling me about how we really needed something in this space to handle this problem, and because it wasn't going to be part of CRI-O, it was probably going to have to be something else. So we knocked something together, and the result is Buildah.

But let's back up a second and look at the problem we're trying to solve. Essentially an image, because that's what we're trying to produce, is described by its manifest. The manifest is a JSON document, and it lists a few things along with information about them: their digests, and what type of content each one is.
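As a rough illustration of the manifest Nalin is describing, here is a sketch in the Docker v2/OCI style; the sizes and digests below are made up, and real manifests carry additional fields:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1510,
    "digest": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 48304914,
      "digest": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
    }
  ]
}
```

Each entry names a blob by digest, states its size, and tells the consumer what kind of content it is, which is how a tool knows which blob is the configuration and which blobs are filesystem layers.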
For the purposes of the images we're building, the things we're most interested in are the configuration blob and the filesystem layers. Now, when I say an image, we're actually talking about multiple slightly different image formats, but for the most part the tools that consume them are able to consume all of these formats, because they're all out there in the field right now.

So the first item in the manifest that we care about is the configuration blob. It's really just another JSON document, which contains some encoded settings and defaults for how to run the container: which command you're going to run, what environment variables you set for it, who it's going to run as. The key thing to note here is that it is not the same as the configuration you would pass to a tool like runc. It actually contains a much smaller set of information than you can pass to runc; it doesn't even include fields for things like process limits or cgroup configuration. The upshot is that you can't just take it and hand it to runc and have it go, because you have to do some processing first to turn it into a runtime configuration.

The other type of thing you keep in an image is filesystem layers. The reason they're called layers is that when you're building up the root filesystem to run your container, you're extracting one on top of another on top of another until you have the finished product. And there's a wrinkle in the way those work: in addition to being able to add things, like you normally would if you were just untarring a bunch of archives, layers include special entries called whiteouts, which are specially formatted entries that mean "delete this thing" rather than "create a new file with this weird name." That's how a layer can actually remove content from a layer below it.
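As a concrete sketch of that whiteout convention (the `.wh.` prefix shown here is the one used in Docker-format layer archives; the file names are invented for the example), a layer that wants to delete `etc/secret.conf` from the layer below it just carries a marker entry in its tarball:

```shell
# Build a tiny "layer" tarball by hand to show the whiteout naming convention.
mkdir -p layer/etc
touch layer/etc/.wh.secret.conf        # whiteout: "delete etc/secret.conf"
echo hello > layer/etc/new.conf        # an ordinary file addition
tar -C layer -cf layer.tar etc

# Listing the archive shows the whiteout entry alongside the real file.
tar -tf layer.tar
```

A tool extracting this layer treats `etc/.wh.secret.conf` not as a file to create, but as an instruction to hide `etc/secret.conf` from the layers underneath when assembling the root filesystem.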
It doesn't actually make the image smaller, but you don't see the deleted content in the final product when you actually go to run the container. The good thing about layers, and the primary reason they exist, is that they can be shared among multiple images. If you build one image based on the Fedora base image, then build three or four more on top of that and push them all to a registry, they're still sharing the same underlying layers from the Fedora image, and the incremental space used by the new images is really just the set of changes you made in them. So it really does cut down on space.

So if you're talking about building an image... well, okay, let me back up a second. When you're actually going to run the container, and this is a little bit redundant, you're just going to build up the root filesystem, do some processing using the configuration blob to build a configuration for your runtime, and kick it off. This is actually the simplest part of CRI-O, and CRI-O does quite a bit more than this in order to do the things that Kubernetes and OpenShift expect it to be able to do.

The hard part of building an image from your container is this: you've got the set of layers you started with when you created the container, and you've made some changes to its filesystem. You need to figure out what those changes are in order to produce a new layer, because images are essentially collections of layers, and when you're building an image based on another image, you're always adding additional layers. So the new image is not going to be any smaller in terms of the space its layers use when you're storing them on a registry, though at runtime the container may actually be using less space. Anyway, the hard part, and the most time-consuming piece of building any container image, is finding the set of changes for the uppermost, that is, the newest, layer. So how do you do this?
Well, if you're using a union filesystem, something that lets you efficiently compute the differences between layers, or that actually has layering built into it, you can do something clever: you can extract just the changes for that layer. In fact, originally, and this carries over into the layer format, the format of whiteouts closely matches the format used by AUFS, which is an implementation detail, but okay, fine. If you're not using a union filesystem, say you're using block devices, or you're not doing any copy-on-write logic, you basically walk the filesystem tree and do the equivalent of a recursive diff. This can be very time-consuming; in fact, a lot of time and work has been put into trying to make it faster.

Once you've done that, the rest is fairly simple. You create a new layer for your changes. You update the manifest to include the new layer. You generate a new configuration blob, which may include information from the previous configuration blob if you started with another image. And the net result is a brand-new image.

So we produced a tool called Buildah that automates all of this for you. `buildah from` is a command that pulls down an image and generates the filesystem, creating a new writable layer if you're using a copy-on-write filesystem. You can mount that container for direct manipulation, you can have it run commands in it using runc, and then you can commit your changes, producing a new image. One of the other primary drivers of this project was that we would be able to reuse a lot of the libraries we're using in CRI-O. So when you pull down an image with Buildah, CRI-O sees that image. If CRI-O has already pulled the image down, Buildah is able to use it without having to download a new copy. And as soon as you commit a new image with Buildah, CRI-O sees it, because they're using the same storage.
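The from/mount/run/commit cycle Nalin just described looks roughly like this on the command line; this is a sketch rather than a verbatim session, the image and file names are examples, and exact flags may differ between Buildah versions:

```shell
# Create a working container from a base image (pulling it if necessary).
container=$(buildah from fedora)

# Mount its root filesystem and modify it directly from the host...
mountpoint=$(buildah mount "$container")
echo 'built with buildah' > "$mountpoint/etc/motd"
buildah umount "$container"

# ...or run a command inside it with runc.
buildah run "$container" -- dnf -y install nginx

# Commit the accumulated changes, producing a new image in local storage.
buildah commit "$container" my-nginx
```

Because the committed image lands in the same local storage CRI-O reads, it is visible there immediately, without going through a registry.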
And yes, it can also push to a registry, because that's a thing people expect to be able to do with a build tool. Now, the Buildah command line is closely modeled after Dockerfile instructions, but in addition to being able to do things on the command line, you can feed it a Dockerfile. It actually uses the OpenShift imagebuilder library to drive this, so the behavior is a little bit different: you don't get multiple layers, you don't get layer caching, but it produces a perfectly workable image. Buildah itself is primarily a library, used via its API, and that's how we're going to be trying to integrate it into OpenShift. Actually, that's the entirety of the overview; it's a fairly standard core project. Now, I can cut to a demo, or, Dan, did you have any comments you wanted to add?

I've got a quick question. Yeah, go ahead. You know, we have source-to-image. I know Buildah's objective is really building or modifying images without having a big Docker environment or OpenShift environment. Instead of having one more extra tool, is it possible to make it a subsystem of source-to-image? That way there's one less thing people have to remember and update. I'd like to know your thinking behind that.

Yeah, I mean, if you're building in an environment where you're using command-line tools, then Buildah is a perfectly usable thing on its own, but longer term we want to have it built into the build facility that we have in OpenShift, so you don't actually have to know that it's Buildah doing the image builds. Got it.

Yeah, and one of the things we've done is put most of the functionality into a library. So if OpenShift wants to call into the library directly, it can do that, or it can exec the command.
So right now, my main goal with Buildah is potentially giving us a more secure way of doing source-to-image builds from an OpenShift point of view, so that we could potentially have source-to-image no longer requiring the Docker socket, and just be able to do everything internal to OpenShift. Understood. Thank you.

Now, one of the interesting things Dan touched on just now is that we don't use the Docker socket. When you're building an image inside of a container, the normal procedure is to run the Docker client inside the container and have it connect over a socket to the Docker daemon. Buildah is not a client-server application; it's actually doing everything in-process. That's a trade-off, and the thing we have to work on in order to better integrate with OpenShift is to get the set of privileges it needs to do what it does down to a much smaller, more manageable level. Right now you can't run the whole thing inside of a container. Well, if you make it a privileged container you can, but we're working on that.

So, a quick question: if you're not using OpenShift, can you use this directly with Kubernetes? Because it's a tool that's useful for people outside of the OpenShift ecosystem. So, Buildah can be used as just a tool that replaces docker build. One of the things I think he's going to show in his demos is that there actually are alternative ways of building without using a Dockerfile at all. But bottom line, yes, you can use Buildah to build any image, using a workflow similar to what you've done with docker build in the past.
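For the docker build replacement Dan mentions, the drop-in is a single command; a sketch, assuming a directory containing a Dockerfile, where `bud` is the short form of `build-using-dockerfile`:

```shell
# Where you previously ran:
#   docker build -t myimage .
# the Buildah equivalent is:
buildah bud -t myimage .
```

No daemon or Docker socket is involved; the build runs entirely in the invoking process.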
Also, since CRI-O and Buildah can share the same storage, you could actually build a container image into your local storage and then instantly have Kubernetes launch that container in your environment. Buildah also has some really cool features in that it can push to different types of destinations. When you commit and push, you can push directly into container registries like Docker Hub, but you can also push right into the Docker daemon, or extract the image out into tarballs. Everything you can do with a tool like Skopeo, you can also do with Buildah after you commit. It's built on the same libraries that Skopeo, CRI-O, and other tools are being built on, so it's got a ton of flexibility.

All right, so let's see a demo. Okay. As Dan described, this first one is just a recording of using most of the command-line operations. What this recording does is, first we check to make sure we don't actually have any images on the system, and that we don't have any containers on the system; these are all commands that are built into Buildah, after all. Then we create a scratch container, which we're going to use to install most of a user space. We mount it first, because otherwise we wouldn't have access to it; that tells us where it's mounted, and we set an environment variable to hold that path, because it's a lot to remember. Then we're executing DNF on the host to install the entirety of, well, Fedora 26's user space into that previously empty container filesystem. So this is going to run for a bit. We'll grab the source package for... nope, that's not what it's called. Nope, that was down. Unfortunately, the package server was down at the time, so I had to download the source RPM from somewhere else.
So in this case, having downloaded a source package and copied it into the filesystem, we're able to just run yum-builddep to install the build dependencies for it inside of the container. At that point, having installed the entire build environment into that container, we're able to rebuild the source package there. Or rather, we built an image with just the build dependencies; this actually takes a while. We removed that working container. This was the build-environment image we generated. We create a new container using that, mount it, copy the source RPM into it, and then just run rpmbuild to rebuild the source package in the container. This is not entirely dissimilar from the mock tool that people who build packages for Fedora are probably already familiar with. At this point, we copy the binary packages out of the container, remove it, and create a brand-new one which only contains runtime dependencies. We install the packages for this, a much smaller set than we installed in the first image, because it doesn't contain any build tools. We install the binary RPM, and then we run it.

But, as Dan pointed out, a lot of people are more familiar with using docker build, so we wired that in using the imagebuilder library from OpenShift.

Yeah, let's hold off on this demo for a second. Can you just pause it? So one of the things I wanted to point out about the previous demo: we were using all standard tools that are provided by Linux. If you're familiar with Docker, you'd have to do a docker cp to get content in and out of container images. In the demo you just saw, he's using the standard cp command to copy stuff in and out of images. The image is mounted on disk, so you're able to point DNF at it to install, and you can actually do a make install with DESTDIR directly into the container image.
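Putting Dan's points together, the scratch-container pattern from the demo might be sketched like this; it assumes a Fedora host with Buildah and DNF available, and the package and file names are illustrative:

```shell
# Start from an empty container rather than a base image.
container=$(buildah from scratch)
mountpoint=$(buildah mount "$container")

# Use the host's DNF to populate the initially empty root filesystem.
dnf -y install --installroot="$mountpoint" --releasever=26 bash coreutils

# Ordinary host tools work directly against the mounted filesystem:
cp myapp.src.rpm "$mountpoint/tmp/"        # instead of docker cp
# make install DESTDIR="$mountpoint"       # installs straight into the image

# Commit the result; none of the host's build tools end up in the image.
buildah commit "$container" my-runtime-image
```

Because DNF runs on the host, the resulting image doesn't need to contain DNF, or any package manager, at all.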
So it basically gives you the full breadth of tools available in a Linux distribution for building your container image. And then the final output, as he showed: one of the big problems with Dockerfiles and building Docker images is that the artifacts needed to build the container end up included in the container. For instance, building a container that way requires DNF to be inside the container, or requires apt-get to be inside the container. But really, if you're running a container that's just going to be running, say, a web interface, you don't want all those tools in it. Similarly, things like secrets have always been a problem when doing Docker builds: if you need secret information, say Kerberos keys or something like that, to get access to content you want to install in your container, you end up having to put it inside the container.

Lastly, the other thing he showed during the demo was buildah run. buildah run is kind of an interesting tool in that it actually launches a container to run the command. He showed you dnf install running from the host and installing into the container, but if you want to follow the traditional Dockerfile method of running yum or dnf inside the container, then you can use the buildah run command, which uses runc on the system to create a containerized environment in the little chrooted environment you have, and then runs the commands. So that demo really demonstrated the full breadth of commands outside of using Dockerfiles directly. Now, the second demo is going to show how, if you have lots and lots of Dockerfiles around, you could use Buildah to support the Dockerfile build method. Yep, so I'm going to un-pause it.
Those of you who have built Docker from source have probably noticed that the first thing it does is build a build-environment container image, just so you can build the entirety of Docker in a container. So as a test case, we actually used Buildah to build that one, and that's what we're watching here. We're just stepping through the top-level Dockerfile that you find in the normal Docker source tree. It's building a lot of dependencies from source, and this is going to run for a while, but this is essentially what happens when you run buildah bud, or buildah build-using-dockerfile, which is the longer version of the command.

Now, one thing I want to touch on that Dan was mentioning earlier: we do have a buildah run command, and that does use runc to run a command using the filesystem you're creating. What I've found is that people tend to try to use it as a shortcut for docker run. It is really not comparable; it's much more comparable to the Dockerfile RUN instruction and how that's handled. Now, this compile does take a while; those of you who have built Docker from source are well familiar with that. This is getting toward the end, I think. This actually requires a fix that we haven't yet integrated into imagebuilder, and both of these demos include some code that we haven't yet integrated into Buildah, but that's ongoing work. This has been edited for brevity, because normally some of these tests take a lot longer than we just saw.

So, a couple of other interesting things: depending on your use case, Buildah is either faster or slower than docker build. For the general developer use case, Buildah at this point is a little bit slower, because we're not caching. If you're continuously modifying your Dockerfile and then rebuilding, Buildah will start from scratch each time. So if you're running a DNF command inside of a Dockerfile, it will go back and rerun the DNF, whereas docker build currently notices that that command already succeeded and that you haven't changed anything up to that point, so it continues from there. So in those cases, for developers, it can sometimes be a little slower. But in the case where you're just running a Dockerfile that's already been worked out, sort of a build-system environment, Buildah is actually a lot quicker, because it doesn't cache each step. Say you had a Dockerfile with 50 steps in it: with Docker, each line in the Dockerfile ends up being executed and committed, so every operation goes through a commit. Whereas in a Buildah environment, you get to choose when you do the commit; if you're doing a build from a Dockerfile, you end up with just one commit, at the start of the container and then at the exit of the container. We're hoping in the future to add the ability to optionally specify caching, but at this point we're doing one commit per Dockerfile. That also gives you the ability to do stuff inside of the Dockerfile and then remove it, without having to worry about it accidentally getting cached and shipped in your product. So there are costs and benefits to each method.

Just a quick question: can you talk a little bit about the relationship between source-to-image and Buildah? Is Buildah going to be incorporated into source-to-image? Yeah, so right now the current plan with OpenShift is, as far as CRI-O is concerned, OpenShift Online is moving toward using CRI-O as a replacement for Docker for running standardized containers, but we're going to still be using Docker builds for source-to-image. It's just that the operations teams didn't want to take on switching over both ends of OpenShift Online and OpenShift at once. But in the future we want to replace that, either using the library directly or just using Buildah.
It'll be moved into source-to-image, so that we can build directly on top of Buildah, building images using it as the underlying technology. So this is basically removing any dependency on Docker? Well, source-to-image doesn't rely on Dockerfiles now, but this removes the dependency on Docker itself.

Also, one of the other issues Nalin mentioned in the beginning was that Docker is a client-server operation. So if you're running containers, say in a Kubernetes environment, and you leak the Docker socket into the container, then the container process is not really controlling the actual building of the image; it's really just a client connecting to the Docker server, which does the image build. Once OpenShift moves to using Buildah, the process that's actually building the container is all inside of your source-to-image container, so we can associate cgroup constraints with users much better with the Buildah method than you can doing it through Docker. If you say that a user who's launching a process to build a container only gets so much CPU, or only gets so much memory, or can only use so much disk space, it becomes a lot easier for OpenShift to control those types of limits, especially for things like OpenShift Online, or even if you're using OpenShift to build in-house.

So, on the road ahead: I mentioned earlier that this is something you could use with straight-up Kubernetes, and I'm curious. Right now it's on GitHub under Project Atomic. Are there any plans or thoughts about maybe moving this to be an incubated Kubernetes project? Yeah, we've been talking internally about potentially submitting it to the CNCF. Since it's really not tied to Kubernetes, I don't think it makes a whole lot of sense to put it under the Kubernetes incubator, but we would contribute it to the CNCF if that's something people want. Cool.
So if people want to contribute to Buildah, they can just go to the GitHub repo and connect with everybody there, or reach out to Nalin or Dan via IRC or, if you're on Twitter, through Twitter. So there are lots of good ways to connect with everybody. This has been the perfect length for a briefing; I've been trying really hard to get everybody to do it within a half an hour, and you have done it. So if there aren't any other questions, is there anything else either of you would like to add, Dan or Nalin? Nope, I think that's it. You have hit the nail on the head here in terms of thirty minutes, so perfect timing. We look forward to more talks; I know we have a talk coming up sometime soon on something called kpod, as a teaser, and hopefully we'll get that in in the next couple of weeks as well. One thing in the chat... oh, just a "nice job," then. So thanks again, guys, and we'll talk to you all soon. Yeah, great topic. Thank you.