discussion, so thank you for being here. Great. Thanks. Welcome, everyone. I also want to thank Scale for this opportunity. This is a fantastic conference. This is my first time here, and it's really just incredible. And there's an added bonus. I live in Raleigh, North Carolina, where Red Hat's headquartered. And they're iced in right now. It's one of these disastrous southern ice storms where the power lines are coming down, gusty winds. So everybody's miserable, and I escaped. I know. Unfortunately, the next stop is Belgium and then the Czech Republic, where it's going to be even colder. But hey, I escaped the great snowpocalypse in the south by just a few hours this week. So it's really just fantastic to be here.

So let's talk about Pulp. Here's the problem that we're trying to solve: managing and distributing software is really messy. There are a lot of problems when you try to insert yourself into that process that someone like Debian or Fedora or Red Hat or any number of proprietary vendors go through, of making repositories of software and other types of content available to their users. One of the problems is that there are just a tremendous number of content types. We have, of course, Debian packages and RPMs, which have been the classic staples of the Linux ecosystem for the last decade plus. But then you have things like Puppet modules. And we're talking about how Chef packages their stuff, how Ansible doesn't really package or version their stuff. We have Docker images. We have other types of system images, ISOs. OpenStack has some image format options. How do you manage and relate all these different types of content to each other? How do you manage them in a sane way without just using one, two, or three different tools for each different type of content? And what happens when these repositories change? You get new content. Software engineers are always fixing bugs, releasing new features.
How do you make decisions when an update for a package comes out? Do I put that additively into the repository? Do I replace the previous version, yank the one that had some critical bug, and put the new one in? When is it time to say, here's a clean break: version 3.0 is going into a clean, brand new repository? Similar to how a distribution works: we get Fedora 22, and then we break compatibility and get all new versions of things in Fedora 23. How do you draw those lines, and how do you have tools that help you draw those lines?

And then locality of content is really a big problem. Red Hat, for example, has the use cases that are on my mind the most, for obvious reasons. We distribute a tremendous amount of software via a CDN. We use Akamai primarily to distribute software to our customers. But then what do those customers do? Do they have access to the internet all the time when they want access to this content? Maybe not. We have a lot of really interesting use cases. What about inside a public cloud? If you're running in Amazon EC2, do you want to be pulling your content from the public internet all the time? Or can we help you get on-site mirrors of content? And then what about your own content within your infrastructure? Do you have a development team that's producing your own packages, and you need some software to help you manage those repositories and deliver those packages to the right parts of your infrastructure?

And then with locality comes control of access. This isn't DRM-style control, necessarily, of making sure only people who have paid get this, but more along the lines of stability and quality control: we have the brand new bleeding-edge bits that just came out of development this afternoon, and those are in a repository that our testing environment has access to.
And we can promote that to a different set of repositories that our production environment has access to. And we draw a hard line in between and ensure that these different sets of infrastructure only have access to the software that's been through whatever process for quality assurance we want to put in place.

So with that, let's talk about what Pulp is. Primarily it's about managing repositories of content. We started, not surprisingly, with RPMs and the yum family of content. There's a lot in there: RPMs and delta RPMs and source RPMs and various other things. Errata get into an interesting area. We quickly discovered that the concepts of managing a repository are fairly common across types. So then we looked at Puppet modules and discovered we can have a core set of tools that knows how to take a piece of content and a repository and associate them with each other. And we can have a lot of pieces of content, make associations with a lot of different repositories, mix and match, move them around, and have a whole REST API and a common set of tools and user interface that knows nothing about specific types. What is an RPM? What is a Puppet module? What do you do with this? The stuff in the middle doesn't have to care.

So Pulp supports a lot of different content types. We're going to list a bunch of them soon. But that's one of the primary goals of Pulp: to be a general framework that you can plug support for many different types of content into. We do have this pull-through cache feature. This is brand new. It's in our 2.8 beta right now. We've not released it yet, but the release will be forthcoming in the next few weeks, hopefully. We're going to talk about that in detail soon, and if you're really interested, you could help us test it and try it out. It's a very exciting feature that's been missing, especially from the yum and RPM family, for a long time. I've certainly missed it.
Pulp's open source, totally open source, GPL. It's on GitHub. Everything's developed in the open, just as you'd expect. You can see the pull requests, comment on them, and see how the team reviews each other's code. We do peer review on everything. We always love contributions, even if it's just bug reports or compliments or beers after the conference or anything else you might like to share.

And Pulp is a Python web application. Now, before you get your hopes up, being a web app does not necessarily mean that it has a beautiful graphical web interface. Pulp does not. Pulp has a REST API and a command line interface. This doesn't rule out the possibility of a beautiful graphical web interface in the future, but right now we have a command line interface that we'll get to know in just a minute.

So let's dive into what we do with Pulp. There are two primary things that we're going to demo. The first is creating a repository. Now, don't think about this in terms of running the createrepo command line tool that writes some files on disk so that you have a yum repository. This is a more abstract concept of a repository in Pulp. It's really just a record in the database. It's empty. You tell Pulp what type it is so it can get the right plugins plugged into both sides of that repository.

Now that you have your repository, you want to get content in. There are two primary ways of doing this. One is to synchronize with some remote repository, like Fedora or Debian. You give the repository a feed URL and say, I want you to synchronize with that remote source, and it'll grab all that content and pull it in. You can do that manually or you can schedule those kinds of operations. Uploading is the other primary way of getting content in. Pulp's REST API allows you to provide a file and tell Pulp what type it is. So you'd say, I have an RPM. Here's this RPM.
I'd like you to put it into this repository over here, and Pulp will do that for you. And then the third way is, once you've got content into Pulp, you can move it around to different repositories and make copies. An important property of Pulp is that copies are cheap. They're really just records in the database. So if you have 10,000 RPMs in a CentOS repository and you want to, for example, we were asked about snapshotting, make a snapshot of what this repository looks like at this point in time, you can copy all 10,000 into another repository in a split second. They're just a bunch of association records in the database. And now you have a point-in-time snapshot, so you can change that original repository however you like.

Once you have your repository populated with some content, you're ready to publish it. This is an important conceptual distinction in Pulp: the act of changing a repository doesn't necessarily make those changes visible to the clients that are consuming that content. It's not until you run a publish operation that, for example, Pulp does the equivalent of running createrepo and actually writes some files to disk. Now the content you've put into that repository is available to clients over HTTP, for example, although that's not the only way you can distribute content. Pulp also has the ability to write out content repositories to ISOs. It'll generate the ISOs, and you can tell it how big they should be. A community group has written another plugin that will rsync the contents of a repository to some remote location, which is pretty nice. And for Puppet modules, we actually have a really interesting use case.
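The "copies are cheap" idea can be sketched with plain data structures. This is a hypothetical model, not Pulp's actual schema: the binary content lives once in a shared store, and a repository is just a set of association records pointing at unit IDs, so copying duplicates records, never files.

```python
# Hypothetical sketch of the copy-on-association model (not Pulp's real schema).
# Content units live once in a shared store; a repository is only a set of
# association records (unit IDs), so copying a repository is cheap.

content_store = {f"rpm-{i}": f"<bytes of package {i}>" for i in range(10_000)}

centos_repo = set(content_store)   # a "repository": 10,000 association records
snapshot = set(centos_repo)        # "copy" = duplicate the records only

# Mutating the original repository touches neither the snapshot nor the bytes.
centos_repo.discard("rpm-0")

assert len(snapshot) == 10_000
assert "rpm-0" in snapshot         # the snapshot still sees the old unit
assert len(content_store) == 10_000  # no file data was duplicated
```

The same mechanism is what makes the promotion workflows described later fast: promoting a repository to "testing" copies association records, not gigabytes of packages.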
Rather than publishing the repository as files to serve, we had a very strong use case for running Pulp on the same machine as a Puppet master, where the act of publishing actually installs the Puppet modules into a local environment, which is really just a directory on the local file system of the Pulp server. That's what the publish operation means there. It could mean many other things potentially, like distributing your content with BitTorrent or anything else you could think of.

So with that, let's do a quick demo and we'll see how this goes. This demo does in fact require the Wi-Fi to work, but the Wi-Fi's been remarkably performant at this conference. We'll see. Is this readable in the back? Oh yeah, let's get the lights down; now would be the time. Cool, better? Okay, great. Keep me posted. Okay, now I can't see my keyboard. It'll be fun.

So pulp-admin here is the command line interface to Pulp, and pulp-admin itself is really just a REST API client. It's laid out hierarchically, which makes it easy to plug in new commands for new content types that you add. So in this case, we run pulp-admin, and the next part of the command, you see the cursor, great, is rpm: we want to do an operation on RPM content. Then repo: we want to do an operation with a repository. And the action we want is list. So we've listed, and we see there are no repositories.

So let's create one. Here's the create command. We have that same rpm category, we want repo, and the action is create. I've given it a repository ID for this thing I want to create, I've called it the zoo, and I've given it a feed URL. This is a remote yum repository that's live on the internet. You can go and install packages from the zoo too if you choose; it's a great testing repository. So I've created it and we're done, success. But we don't have anything yet.
We can run the list command again, and we see we have this zoo repository, but there's no content. So as we discussed before, the next step is to do a sync. We're going to synchronize, and run here is distinct from schedule. That's the other option: instead of syncing right now, we could schedule this operation to either run once in the future or run repeatedly on some specified interval. And I've told it which repository I want to synchronize. This is going to fly by a little bit, but we're going to scroll back and parse through what the real action is.

Okay, we're downloading metadata. Now we're getting RPMs, that was real fast. Got all 32 before we could get a progress update. Published, okay, we're done. So let's look at what just happened real quick, and then our demo will be over. We started synchronizing. Pulp downloaded some metadata. This is essentially a manifest: what's in this remote repository? It chewed on that for a moment and decided which of these things it already has, which ones are available, made some decisions about what it needs to download, then it downloaded them. It looked for some other types of content that it didn't find, not important. That sync task completed. By default, Pulp will then kick off a publish operation, which it did here. So it initialized some metadata, started writing to that metadata, wrote out all 32 RPMs to the appropriate metadata files, did some other things, and we're done.

So that's the general user experience of Pulp. Oh, and now we can run the list command one more time, and we see that we have some content now. We have 32 RPMs, we have some package groups, we have four errata. Errata are the way that RPM-type distributions express available bug fixes in a standardized way. So that is the end of the demo. Any questions? I kind of tore through that fast. Just raise your hand any time you have a question, or yell at me if I don't see you.
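For reference, the demo session roughly as typed. This is an illustrative recap, not a paste from the terminal; the zoo feed URL in particular is from memory and may have moved:

```shell
pulp-admin rpm repo list
pulp-admin rpm repo create --repo-id zoo \
  --feed https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/zoo/
pulp-admin rpm repo sync run --repo-id zoo
pulp-admin rpm repo list   # now shows 32 RPMs, package groups, 4 errata
```

The hierarchical layout reads left to right: content type (rpm), resource (repo), then action (list, create, sync run).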
Let's go back to our presentation, and we can get the lights back on if you like. Thank you. Where were we? We're at the demo. Okay, great.

Content types. These are the content types that Pulp supports right now. Of course, the RPM family. Docker images are a very important one right now. Very popular. It's a very interesting exercise keeping up with all the change that's happening there, like the change from their V1 API to the V2 API. They've added a lot of really exciting features, but at the same time have changed essentially everything multiple times. Puppet modules are another very important one. They've similarly changed a lot over the years, with several different versions of their API, each one better than the last. Python packages. OSTree. If you're not sure what OSTree is, it's worth Googling, or whatever your favorite search engine is. Really fascinating technology. It's the foundation of things like Project Atomic, which is another thing worth knowing about. Regular files: you can just have a repository of whatever old files you want to stick in there. We have community support for Debian packages. And then we have a community user who's developed a plugin for npm, but has not shared it with us yet. So we're eagerly awaiting that contribution.

Let's talk a little bit about who uses Pulp. Red Hat Engineering was customer number one. All software that Red Hat distributes to our customers and to our users is managed by Pulp. Internally there's a giant Pulp deployment that has, I don't know, hundreds or thousands of repositories available. A tremendous amount of content. Public clouds: if you've ever started up a Red Hat Enterprise Linux AMI in Amazon EC2, for example, and then installed software, you have pulled that software from Pulp. We have Pulp running under a different name, Red Hat Update Infrastructure, which is basically Pulp with an extra user interface on top of it for one very specific workflow.
Most of the public clouds have this running inside them, and the images that you instantiate have the information baked in so that they automatically pull from, for example in Amazon, whatever region you're in. So you get the data faster, and of course free bandwidth.

Katello is the upstream of Red Hat Satellite 6. It's a whole content and systems lifecycle management project and product; we could talk for hours just about that. If you're actually looking for more help, in terms of having a graphical interface and a set workflow that uses Pulp to manage content and go through a promotion workflow, Katello and Satellite are a fantastic option. And of course we have a thriving and growing community that we'd love to add you to as well.

So here's a specific use case, a real basic one: just mirror some content. Python packages are a nice, popular example, but it could be any of these content types. You can synchronize packages directly from the Python Package Index. You can add and remove, get exactly the packages and exactly the versions that you want, and keep them. I've been using Python for a long time. This is not such a big problem anymore, I think, but years ago it was a huge problem that sometimes versions would disappear from the Python Package Index. You'd be using, say, Django at some version, depend on that version, and build all your custom stuff around it. And then when version-next comes out, the old one would disappear, and it's endlessly frustrating. So Pulp is one way to take control of that and have your own well-curated slices of that upstream content, exactly the versions that you need, exactly where you need them.

So another use case is this promotion idea that I alluded to earlier, of development, testing, production. Essentially we accomplish the promotion by using the copy operation.
You start with a repository that either your development team uploads directly to, or maybe your Jenkins or whatever continuous integration or build system you have is automatically uploading content to, dumping it into some Pulp repository. And that is attached to maybe some basic testing infrastructure that will run some automated tests and give a thumbs up or a thumbs down. And if the thumb goes up, then all of that can get copied into the next repository you've set up, which would be some flavor of testing or quality assurance. And maybe some lab has access to that, and you can see where this is going. Finally to production, which could either mean deployment, if you're in that business, or production could be, as in more of Red Hat's example, just a CDN or some public space where you're making that software available to the people who are going to use it and deploy it.

Yes, great question. The question is whether the promotion workflow handles dependencies. The answer is sometimes. It depends on the content type. This is one of the messy parts of dealing with all these different types of content: what does a version on an RPM mean versus what does a version on a Python package mean? Different things. The algorithm for taking two versions and comparing them is different for all these different types of content. And the way you express dependencies is different too; it's a lot of work to keep up with it all. So right now RPM is, I think, the only content type we have that supports dependency resolution. You absolutely can say, I want you to copy vim from this repo to that repo and pull all its dependencies with it, and it will do that.

Other questions? That's certainly one option, yeah. So the question is about more detail on the curation process: how do you choose what you want to promote or not, essentially, and perhaps even what you want to back out? There's a whole search interface.
When you do this copy, you can pick and choose: I want these specific names and these specific versions of things. Here's a list, copy only those things. You can remove things in exactly the same way, or you can just go whole hog and copy the whole thing over. The power's all in your hands with the tools. Let's keep moving.

Oh yes, please. Oh, this is a very good question. Is there a way to find the difference between two repositories? This is a feature that's been requested in the past. We don't have it, but we are on the cusp of offering it. So stay tuned. I think shortly after Pulp 3.0 comes out, which I hope will be in the next year or so, maybe even six months, that is going to be a possibility. There are some tools for specific content types that will let you compare. RPM, I happen to know, has at least one tool that will help you compare two different repositories. But not yet. It's a much asked-for feature.

So Pulp's a distributed application. We're just going to talk through the pieces a little bit so you can see what you can do with Pulp and how you can scale it. There's the REST API. There is content served via HTTP. As we discussed, that's not the only way you can serve content, but by default, that's what Pulp does with most of these types of content: serve them via HTTP. And then we have these long-running jobs, which kind of throw a wrench in the gears of what you would think of as a normal, simple web application.

So here we have these smiling users on the top left, and they're interacting with a web service. This web service speaks HTTP. Pulp mostly deploys and is tested with the Apache web server, but doesn't necessarily have to be tied to that one. Normally, you have this web service, it talks to a database, it keeps state there, and that's the end of it. But having these long-running jobs really does throw a wrench in.
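The point above about version comparison differing by content type can be made concrete. Below is a deliberately simplified, rpm-flavored segment comparison, contrasted with naive string comparison. This is a toy, not the real rpmvercmp algorithm, which has extra rules for epochs, tildes, and mixed segments:

```python
import re

def simple_vercmp(a: str, b: str) -> int:
    """Toy rpm-style comparison: split versions into numeric and alphabetic
    segments and compare segment by segment. Returns -1, 0, or 1.
    NOT the real rpmvercmp; it only shows the general shape."""
    def segments(v):
        return [int(s) if s.isdigit() else s
                for s in re.findall(r"\d+|[a-zA-Z]+", v)]
    sa, sb = segments(a), segments(b)
    for x, y in zip(sa, sb):
        if x == y:
            continue
        if isinstance(x, int) and isinstance(y, str):
            return 1           # numeric segments sort above alphabetic ones
        if isinstance(x, str) and isinstance(y, int):
            return -1
        return 1 if x > y else -1
    return (len(sa) > len(sb)) - (len(sa) < len(sb))

# Naive string comparison gets "1.10" vs "1.9" backwards; segments get it right.
assert ("1.10" > "1.9") is False        # lexicographic: the wrong answer
assert simple_vercmp("1.10", "1.9") == 1  # 10 > 9: the right answer
assert simple_vercmp("2.0", "2.0") == 0
```

A semver comparator or a Python PEP 440 comparator would be different algorithms again, which is exactly why per-type plugins own this logic.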
So imagine one of these users on the left says, see that repository over there with 10,000 or 100,000 RPMs? I want you to publish that now. That's going to take substantially more time than we can reasonably wait before responding to an HTTP request. So we need some way to deal with that. So we added some extra pieces: an AMQP message broker, and we support a couple of different ones, and this pool of workers. And there's some magic that happens in those workers. Actually, it's not magic. I kind of don't like when people refer to software as magic. It's very well defined and very well thought out. But for these purposes, we're going to think of it as a black box, because I could do a whole presentation just on the guts of the algorithms there, in terms of prioritizing and keeping different resources separate from each other. Anyway, I've already said too much.

Let's trace real quick through how a request flows and bounces around through here, kind of pinball style. The user in the top left wants this publish operation. He's requested, I want to publish this 100,000-RPM repo. That request goes to a process that's handling web requests. That process does web application type things and adds a record to the database about this job: what repository it is, what kind of action, you can imagine what kinds of things might go in there. Once that information is there, it puts a message on a queue on the message broker. From that point, the web process is done, and it responds back to the user and says, okay, I queued a job, and here's a unique ID you can use to track that job over time. And at some point in the future, one of these workers will notice the message on the queue, grab it off the queue, go access the database, and actually do the work.
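The request flow just described can be sketched as a toy simulation, with plain Python standing in for the web process, the database, and the AMQP broker. The names here are illustrative, not Pulp's actual internals:

```python
from collections import deque
from uuid import uuid4

db = {}            # stands in for the database's task records
broker = deque()   # stands in for the AMQP message queue

def web_handle_publish(repo_id: str) -> str:
    """Web process: record the task, enqueue a message, return immediately."""
    task_id = str(uuid4())
    db[task_id] = {"repo": repo_id, "state": "waiting"}
    broker.append(task_id)
    return task_id   # the client polls this ID instead of blocking

def worker_run_once() -> None:
    """A worker: pull one message, look up the record, do the long job."""
    task_id = broker.popleft()
    db[task_id]["state"] = "running"
    # ... the actual long-running publish would happen here ...
    db[task_id]["state"] = "finished"

tid = web_handle_publish("big-100k-rpm-repo")
assert db[tid]["state"] == "waiting"   # web process has already responded
worker_run_once()
assert db[tid]["state"] == "finished"
```

The decoupling is the point: the HTTP response time no longer depends on how long the publish takes, and adding workers drains the queue faster without touching the web tier.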
The beauty of this is that you can scale these different pieces independently. So if you have a workflow where you have a tremendous number of client machines that are accessing content very frequently, but your content doesn't change very frequently, then maybe you want to scale up your web processes across a lot of infrastructure, throw hardware at it. Conversely, maybe you have a lot of churn. For example, this is one story I enjoy telling: we have one community user with hundreds of thousands of RPMs that used to be all in one repository. It was kind of nuts. We talked them into splitting it apart a little bit, but the general use case is this: they had a whole bunch of small, slightly reusable web applications. And if you imagine the matrix of those web applications and then all the different skins they put on in terms of branding and who knows what else, they end up with all these different permutations of packages that need to go together, be built together, and be shipped together. And they had these small, very agile teams that made little changes on these things all day, every day. And they would rebuild essentially everything every 10 minutes, all day. So there's just an incredible amount of churn, and that was actually part of our inspiration and motivation to make these different components of Pulp very scalable. So they're now able to scale out this worker pool: either just scale up a tremendous amount of infrastructure and retain a huge pool of workers, or we can facilitate the cloud-burst concept, where at midnight every night everything gets rebuilt and retested, and we only need to spin up some new virtual machines in some public or private cloud for that period of time, when it's cheap or whenever it's opportune. So we can just add a bunch of workers, do a bunch of work, and then throw them all away.

On to the next topic: Pulp's extensible.
Getting back to that concept: there's a general way you can approach managing a repository and associating content with it without caring about the details of what this piece of content is, how it works, how you use it. Pulp has this central core with all these tools, and what a plugin author provides is just how content gets in, some kind of funnel you can use to hand content into that core, and then, when a user asks Pulp to publish something, a distributor, as we call it. That's the piece where Pulp says, I have all this content, the user asked me to publish it, so here, you know what to do with this, go do that. So how content comes in and how content goes out are the two important concepts that we've separated from the core.

So all we need are three primary things. First, a type definition. Again, Pulp is a Python application; we use MongoEngine right now. We're in the process of migrating toward using Postgres instead of Mongo. As you can imagine, it's a big deal, it's taking a lot of time, and we have to be very, very careful. But in any case, you define your type, for example Docker blob or Docker manifest, and in this very simple class you tell Pulp what makes it unique. For example, with an RPM: can anybody recite what makes a unique RPM off the top of your head? Yes, you get extra stickers at the end. Name, epoch, version, release, and architecture. For a Puppet module, it's just the name, the author, and the version; or instead of author you might say username, they've kind of switched terminology. Anyway, you tell Pulp what makes a unique one, and that's it.

Then you provide an importer. This is the thing that implements how content gets in. How do you interrogate a remote source and figure out what's there? What do I need to grab? Actually download it and then stuff it into Pulp. And then there's a distributor. This is what gets you from, I have all this content sitting here in Pulp.
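The "what makes this unique" idea, the unit key, can be sketched like so. These are hypothetical helpers, not Pulp's real MongoEngine models, but the key fields for RPMs and Puppet modules are the ones named in the talk:

```python
# Hypothetical sketch of per-type unit keys (not Pulp's real model classes).
RPM_UNIT_KEY = ("name", "epoch", "version", "release", "arch")
PUPPET_UNIT_KEY = ("author", "name", "version")

def unit_key(unit: dict, key_fields: tuple) -> tuple:
    """A unit's identity is the tuple of its key fields; everything else
    (summary, checksum location, etc.) is just metadata."""
    return tuple(unit[f] for f in key_fields)

a = {"name": "vim", "epoch": "0", "version": "8.0",
     "release": "1.el7", "arch": "x86_64"}
b = dict(a, arch="aarch64")   # same NEVR, different arch -> a different unit

assert unit_key(a, RPM_UNIT_KEY) != unit_key(b, RPM_UNIT_KEY)
# Same key fields means the same unit: Pulp stores it once and repositories
# merely hold associations to it.
assert unit_key(a, RPM_UNIT_KEY) == unit_key(dict(a), RPM_UNIT_KEY)
```

The core never needs to know what the fields mean; it only needs the type definition to tell it which fields constitute identity, so deduplication and copy work the same for every plugin.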
Now I want to write it out to disk as a yum repository. Or now I want to serve it through the Docker V2 API, or one of the several different Puppet APIs, to my Puppet clients, for example. And that's it. Those three concepts are how our plugin writers add support for extra types. And those plugins can be versioned separately from Pulp, which has given us some nice flexibility. Any questions about this? I'm just gonna move on.

Good question: how do you deal with versioning of plain files? You don't; Pulp doesn't give you that. Pulp manages unique files, and it establishes them as unique based on their checksum and file name. That's correct, you cannot assign a version. It would be an interesting idea. It's something we've thought about, and if you have some specific ideas, we could maybe brainstorm about it afterward. There's a general need for that. I would strongly suggest that semantic versioning is the right way to go. If you're not familiar with it, go to semver.org, read, and be enlightened. That is largely, in my humble opinion, the right way to do versioning, and it's the way a lot of different types of content do their versioning. So we could add that; it could be an option in Pulp in the future.

The command line interface is also pluggable. This is not very interesting, but suffice it to say that the hierarchical layout I demonstrated is how you would, in addition to RPM content, add a new branch of the tree called docker or puppet or iso or whatever other content you want.

In terms of integrating with Pulp, this is where the exciting part happens. If you want an out-of-the-box, ready-to-go experience where you've got promotion workflows, where you can group different types of content together and promote them and snapshot them, something like Katello is really the way to go.
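The plain-file uniqueness rule just mentioned, checksum plus file name, might look like the following. The choice of sha256 here is an illustrative assumption, not a claim about which digest Pulp uses:

```python
import hashlib

def file_unit_key(name: str, data: bytes) -> tuple:
    """A plain file's identity: its name plus a digest of its bytes."""
    return (name, hashlib.sha256(data).hexdigest())

# The same bytes under two names are two distinct units...
assert file_unit_key("a.conf", b"x=1") != file_unit_key("b.conf", b"x=1")
# ...and the same name with changed bytes is also a new unit. There is no
# version field anywhere in the key, which is exactly the limitation the
# question was about.
assert file_unit_key("a.conf", b"x=1") != file_unit_key("a.conf", b"x=2")
```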
If you are, for example, Red Hat Release Engineering, you have very specific needs and specific workflows in terms of build infrastructure, test infrastructure, quality assurance, and signing infrastructure: being very strict about who gets to sign packages, what the process is for reinjecting them into Pulp, and then what the gate for releasing is. There are embargoes on bug fixes that we're not allowed to talk about, where everybody agrees we're going to release a fix for some security issue at noon on a certain date, or whatever it may be. If you have a lot of process and custom workflows like that, you'll want to integrate with Pulp directly. So we have these REST APIs, well documented and generic, so you can manage all the different types of content through one REST API, which keeps it very simple. At least, as simple as it can be; I shouldn't oversell that point. It's still a messy business, as I said right up front.

We publish events to an AMQP topic exchange. So if you want to respond to events, for example, after a publish has succeeded, I want to kick off a test job with Jenkins that consumes those packages, maybe installs them on a machine and runs some tests, you can do that. We also have HTTP callbacks, which work a similar way. That's what Katello largely uses at this point to figure out what Pulp is doing and when, and when it needs to respond to something.

Okay, this is a very exciting feature. Oh, yes, question. Yeah, good question. The question is, essentially, is there a Python library that's usable as a REST API client to Pulp? Yes, we call it the bindings library. You can absolutely install the Pulp bindings by themselves and use them to call all of these operations. There is also a Ruby gem. I don't speak Ruby. Katello is written in Ruby, so I have to at least tolerate Ruby, I'll say; I just don't really know it. But they also have a library in Ruby that they use to talk to Pulp.
I don't know how easy it is to install by itself, but if anybody has an interest or a need to interact with Pulp from Ruby, that would be my first stop: check in with Katello. Or just get on the Pulp email list or the IRC channel, and we'll point you in the right direction. Ah, K, excuse me. K-A-T-E-L-L-O. And again, Katello is the upstream of Red Hat Satellite 6. Who's familiar with Foreman? Cool, so Katello brings together Foreman and Pulp and this other thing called Candlepin, which is really only important to Red Hat; it's the entitlement management part of Satellite. Katello brings those three things together into a unified workflow and user experience.

Okay, pull-through cache. This is the exciting, bleeding-edge, brand new feature. I was a Debian and Ubuntu user for a long time, until I started working at Red Hat. I still sometimes use Debian and Ubuntu at home. I enjoy it. I don't like getting into distribution wars or anything like that. There are a lot of really interesting merits on a lot of different sides. I like keeping my brain in different areas and seeing what other people are doing. What are the best practices over there, and what can we learn? Anyway, one of the things I really liked in my time with Ubuntu and Debian was this thing called apt-cacher. It was a simple pull-through cache: if you had a lot of machines sitting in your private network and you didn't want them to each download an entire copy of the base install of Debian, you could use apt-cacher as your proxy, and it would do deduplication, do the right things. Sometimes it worked better than others, but it was there, and it was very, very useful. That's never really existed for RPM. There've been a couple of attempts. There was an attempt to add RPM support to apt-cacher, and it seems like that didn't go very well. So Pulp is on the brink of releasing this feature.
It's first supporting the whole yum family of content types, and then once we get that out and proven and tested and okay, we've done this sanely, because it's a huge, complex feature, then we'll expand that out to all the different, well, most of the content types we support. There are a couple that for technical reasons really are not a good fit for that model. But so this is currently available in our 2.8.0 beta. You can go to the Pulp website and find the beta repository and install it and play with it. What it basically does is you still create your repository. Actually, you know what? I'm just gonna pull this up and show this real quick. I don't think we need to worry about the lights. I'll increase the font size a little bit. Wait, did that work? No, it did not. Okay, we're gonna look at the help text for the RPM repo create command, and we have all these knobs we can turn, oh, sorry about the line wrapping. Okay, the download policy at the top here is how you tell Pulp how you want to make your content available. Immediate does what Pulp has always done. Background will finish your sync very quickly, add all these records to the database, and then start a second job later to go download the files in the background. And in the meantime, you can go ahead and publish your repository. You can start copying things around. All that content is usable through the Pulp API while it's being downloaded in the background. The third option is on demand. Pulp does not download any of the actual files until a client requests them. So we'll look at exactly how that works here. This did not stay on the slide I was on. Very good question. Am I in approximately the right place? Here we are, okay. The question is how does background downloading scale if you want to background download a whole bunch of repositories at once? To some extent it's up to you to manage that. So for one, you can tell Pulp how much bandwidth it's allowed to use for each download operation.
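The difference between the three download policies can be sketched as a toy model. This isn't Pulp's implementation, just an illustration of what a sync does under each policy: immediate fetches the files as part of the sync, while background and on-demand both record the metadata and defer the actual file transfer:

```python
from enum import Enum


class DownloadPolicy(Enum):
    IMMEDIATE = "immediate"    # download every file during the sync itself
    BACKGROUND = "background"  # finish the sync, download in a later job
    ON_DEMAND = "on_demand"    # download a file only when a client asks


def sync(policy, units):
    """Toy sync: returns (downloaded, deferred) lists of content units."""
    if policy is DownloadPolicy.IMMEDIATE:
        # classic behavior: everything is on disk when the sync finishes
        return list(units), []
    # background and on_demand both complete the sync without the files;
    # they differ only in *when* the deferred downloads happen later
    return [], list(units)


downloaded, deferred = sync(DownloadPolicy.ON_DEMAND, ["vim", "emacs"])
# the repo is already publishable; vim and emacs arrive when requested
```

Either way, the content units exist in the database immediately, which is why you can publish and copy a background or on-demand repo before a single RPM has been fetched.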
You can also control how many workers are available to Pulp, and it'll only do one job per worker, and those jobs include syncs and publishes and these download operations, all these things. So limiting the number of workers is another way. You can also limit the number of concurrent download operations that happen within any given worker. By default, one worker will do five downloads at a time. You can dial that down to one if you like, or dial it back up. So essentially, as a distributed application, we've avoided getting into the essentially futile attempts to do smarter things that almost never really work out and cover all the bases of trying to understand what your user really wants to do right now. So here's how that, this is what that looks like. So starting from the left, we have a yum client where some user typed yum install vim. And yum talks to Pulp and says, Pulp, I want this file. Pulp first looks at its local file system, and if it has the file locally, great, it's just gonna respond and serve that file. If not, it goes through this new workflow where the first stop is Squid. There's nothing Squid-specific about what we've done, so you could use something like Varnish or any number of other similar types of proxies. Squid just happens to be very widely available, so it was a natural choice for us at least. Squid's in reverse proxy mode, and it talks to this streamer thing, a microservice that's also part of Pulp. We've made this Pulp sandwich with Squid in the middle, so all of this conversation happens locally, either on one machine or at least on your local network, so we don't have to worry about a man in the middle on your SSL connection if you want to do that.
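The per-worker limit mentioned a moment ago, five concurrent downloads by default, is the classic bounded-semaphore pattern. Here's a small self-contained sketch (not Pulp's code) showing how a semaphore caps how many downloads are ever in flight at once, no matter how many are queued:

```python
import threading
import time

MAX_CONCURRENT = 5  # stand-in for the default of five downloads per worker
sem = threading.BoundedSemaphore(MAX_CONCURRENT)
lock = threading.Lock()
in_flight = 0
peak = 0


def download(unit):
    """Fake download: the semaphore blocks a sixth concurrent transfer."""
    global in_flight, peak
    with sem:
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.01)  # stand-in for the actual network transfer
        with lock:
            in_flight -= 1


threads = [threading.Thread(target=download, args=(n,)) for n in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# twenty downloads were requested, but never more than five ran at once
assert peak <= MAX_CONCURRENT
```

Dialing the knob down to one serializes downloads entirely; dialing it up trades bandwidth for faster completion, which is exactly the trade-off Pulp leaves to you.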
So you can still stay secure, and the streamer is responsible for knowing, okay, I know what file you want, I know that that file is available in these eight different repositories, so I'm gonna pick one and go access it, and if it's not there, then I'm gonna try the next one and go access it and start streaming the bits back. That's why I called it the streamer. So there are actually four unique HTTP requests happening here, and bits get streamed all the way back the other direction to the yum client on demand. Does that all make sense? Any questions about that? It's fairly straightforward. There are a lot of edge cases that we needed to cover to deal with this, but it's a very exciting feature I think is gonna be very useful to a lot of people. Okay, we have this idea of consumer tracking. We've in the past done quite a bit more with support for this concept of consumers, especially managing what is installed on various machines in your infrastructure. There's a Pulp agent you can run on each of these machines, and Pulp will talk to it and tell it to install and update and whatnot. We're getting out of that business, so I don't wanna talk too much about that. Probably as of Pulp 3.0, most of that's gonna go away. Certainly the agent is gonna go away. What's not gonna go away is the ability for Pulp to help you keep track of what is installed on which machines in your infrastructure, and especially which recently available updates each of those machines needs. This is another feature that Katello and Satellite do a really good job of displaying in a graphical way, putting that whole workflow together to identify that Heartbleed just happened. And there's actually a really interesting blog post, and there's a presentation, I think the video is not available. But a really good blog post from last summer about IKEA, how they responded to Heartbleed using Red Hat Satellite 6, which uses this feature in Pulp under the hood to identify that updates are available.
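The streamer's try-each-repository behavior described above boils down to a fallback loop. This is a hypothetical sketch, not the streamer's actual code, with made-up names (`stream_unit`, `fake_fetch`, the mirror URLs) standing in for the real machinery:

```python
def stream_unit(path, repositories, fetch):
    """Try each repository that advertises this file until one serves it.

    `fetch(repo_url, path)` returns the file's bytes or raises LookupError.
    All names here are illustrative, not Pulp's real API.
    """
    last_error = None
    for repo_url in repositories:
        try:
            return fetch(repo_url, path)
        except LookupError as exc:
            last_error = exc  # not there; fall through to the next repo
    raise LookupError("no repository could serve %s" % path) from last_error


# fake fetcher: only the second repo actually has the file
def fake_fetch(repo_url, path):
    if repo_url == "https://mirror2.example.com":
        return b"rpm-bytes"
    raise LookupError(path)


data = stream_unit(
    "Packages/vim.rpm",
    ["https://mirror1.example.com", "https://mirror2.example.com"],
    fake_fetch,
)
```

In the real feature, the bytes aren't buffered like this; they're streamed back through Squid to the yum client as they arrive, and Squid caches them for the next requester.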
These are all the places that need that update; push the button, it gets the update out to all those places, and you're done. So Pulp can still help you with all that reporting of who needs what. We have some documentation, as most projects do. pulpproject.org slash docs is the central place to start, and from there we have links to docs for specific content types, and each one is separated into user docs and then integrator docs and developer docs. If you want to write some software that integrates with the REST API, those dev docs would be the ones for you. This slide is more a reminder for me than something for you, but I have these fantastic Pulp stickers up here. They're very nice stickers, too. They have this vinyl finish on top. They look very stylish. I would love to hand some out. I also have a limited number of Red Hat Satellite stickers and I think three or four Foreman stickers. So if you like stickers, come up and I'll hook you up with some stickers. And with that, that's really all I have to talk about. So what questions do we have? Thank you, thank you.