Oh, I'll show you. Go ahead, go ahead. Can you hear me? Good? Great. Thanks for coming. Welcome. My name is Michael Hrivnak. I'm the team lead for the Pulp project. We have some nice t-shirts, and I have colleagues here. We're here today to talk about Pulp, and about managing OSTrees with Pulp. We'll start with a quick overview of Pulp. Then we'll go on to chapter two, about OSTree: what it is and how you manage it with Pulp. And then we'll take questions. That's a lot of stuff in 20 minutes, so we'll move fast. Stop me if you have questions. Pulp is here for solving problems around managing repositories of software and distributing repositories of content. It's a very messy business, and there are some classic problems that we're trying to make better. We can manage OSTrees too, and all these artifacts. [inaudible] Pulp can unify that and do it all in one place. [inaudible] Where is your content, physically? Do you want to put it close to the infrastructure that needs to install it? Maybe you have big images of some kind and you want to distribute them to your remote data centers, so they're near the machines, so when they need to boot up, it's already pre-populated, that kind of stuff. And how do you want to control which content is available to which parts of your infrastructure at any given time? That's the general space we're playing in. So what is Pulp? We manage repositories of content. We support many different content types; we'll see a list in a minute. We started with yum content, for reasons you can guess, and branched out from there.
We do have a pull-through cache feature, not with OSTree, but we started with our yum content. So if that's something that's of interest, we could talk about that another time, but it's an exciting new feature in Pulp, relatively recent. Of course, we're open source. We're on GitHub, so you can join us there. All of our development happens on GitHub, and we have a public Redmine, so we'd love for you to join us there. And it's a Python web application, so we use Django, we use Celery, standard Python things. So the first step in the basic workflow of using Pulp is to create a repository. It's just a record in the database. We haven't put any content in yet, so the next step is to get some content from some remote source, like Docker Hub, or the Red Hat CDN, or a Fedora mirror, whatever kind of content you're bringing in. Or if you're building your own, you can upload your content into this repository. And then a key aspect of Pulp that brings a lot of value is that copies are very cheap. So, for example, thinking of RPMs, Pulp recognizes one unique RPM, and if it sees that RPM in a dozen different repositories, it only stores one copy of it on disk. And then within Pulp, if you want to make, for example, a snapshot repository, you can associate that RPM with your snapshot repository. Pulp's not going to make a second copy of that RPM on disk; it's just a record in the database. So this enables a lot of very interesting workflows to do promotions and snapshots and these sorts of things with relatively low expense and overhead. So now that you've got content in your repository, a key feature of Pulp is that we've separated content coming in from content going out. So this publish operation says, all right, I'm Pulp, I have all this stuff in my database, and I have some things on disk, all this content I know about. Now it's time to make it available to clients. So in the yum case, it's the equivalent of running createrepo.
It'll spit out symlinks to where these RPMs are stored and write out the yum metadata. Or for other content types, it does similar things. For some of these content types, it even has to serve an API, for example, with Docker. So that's the basic workflow of how Pulp works. Now, Pulp is extensible. This is a key part. We found that there are a lot of common workflows and common ways of thinking about content that cross the boundaries of any individual type. So Pulp provides this core that has a REST API and the infrastructure of asynchronous workers and this kind of stuff. And then a plugin writer tells Pulp how to interact with some particular kind of content. For example, yum content is where we started. So as a plugin writer, you're responsible, one, for getting content in. So you would write a bit of code that knows how to do that: given a URL to a Fedora mirror, it would go retrieve the yum metadata from that URL, interrogate it, parse it, figure out what you want to download, download RPMs, and then hand them to Pulp in a specified way, through Pulp's API, where it knows how to work with this piece of content. Then the other side is you would write the distributor that knows how to make this available afterward. So this would be the equivalent of createrepo. So Pulp says, hey plugin, here's this content I know about in this nice generically packaged way. You know how to make it available to users, so here's your problem, do something with it. And the plugin writer's code is what would then lay down the symlinks and the yum metadata and that sort of thing. These are the content types that we support right now. Debian I put at the bottom; it's a community effort at the moment. We'd love to have more involvement. I'm pretty sure it doesn't work right now in its current state, but a lot of people are interested in it. So if you are interested in it and interested in helping in any way to make it happen, please get in touch.
Testing, documentation, brainstorming, any of those things in addition to writing code are all very helpful. Pulp is a distributed application. So this is a key feature as well, to understand that we have essentially three different types of services we run. One is a web service; it runs our REST API. A separate, distinct service is the one that makes content available: serving static files, or serving a registry API for Docker or Puppet, that kind of thing. And then the third kind is these worker processes. We have the need to do long-running jobs, long-running tasks, so we have separate worker processes that do that. And to illustrate this quickly, here's what a classic web application looks like, like your WordPress or something. You have some kind of web process that handles requests, and it stores its state in a database. Pulp adds this concept of workers and an AMQP message broker, because we need to do long-running jobs that exceed the time in which we want an HTTP client to be hanging on and waiting for a response. So to quickly walk through the actual workflow, if you look at the top left there of the screen at one of these happy smiling clients, there's that one that's blushing because I'm pointing him out. He says he wants to make a request to publish a repo. So his request goes to httpd, which is what we use to run our application. It writes a little bit of state into, oh, my clicker stopped working. Nope, my computer stopped working. Let's try this again. Okay, here we are. Okay, httpd writes some state into the database and then puts a message on a queue in the message broker. And at this point, the web handler is done. It sends a response to the client. It says, I queued that job for you. Here's an ID you can use to track it. See you later. And at some point in the future, one of those worker processes, when it becomes available, will pick up that message off the queue and do the actual work.
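The request flow just described (the web handler records state, queues a message, answers right away with a task ID, and a worker does the job later) can be sketched in a few lines of Python. This is only an illustration of the pattern, not Pulp's code: a dict stands in for the database, a `queue.Queue` stands in for the AMQP broker, and every name in it (`handle_publish_request`, `task_db`, and so on) is made up for the example.

```python
import queue
import threading
import uuid

# Hypothetical stand-ins: a dict plays the database and a Queue plays
# the AMQP message broker. None of these names come from Pulp itself.
task_db = {}
broker = queue.Queue()


def handle_publish_request(repo_id):
    """Web handler: record the task, enqueue it, and return immediately."""
    task_id = str(uuid.uuid4())
    task_db[task_id] = {"repo_id": repo_id, "state": "waiting"}
    broker.put(task_id)
    # 202 Accepted: the job is queued, not finished.
    return {"status": 202, "task_id": task_id}


def worker():
    """Worker: pull task IDs off the queue and do the slow work."""
    while True:
        task_id = broker.get()
        if task_id is None:  # sentinel value tells the worker to stop
            broker.task_done()
            break
        task_db[task_id]["state"] = "running"
        # ... the long-running publish would happen here ...
        task_db[task_id]["state"] = "finished"
        broker.task_done()


def run_demo():
    thread = threading.Thread(target=worker)
    thread.start()
    response = handle_publish_request("F25")
    broker.join()     # wait until the queued task has been processed
    broker.put(None)  # shut the worker down
    thread.join()
    return response, task_db[response["task_id"]]["state"]


if __name__ == "__main__":
    print(run_demo())
```

The client only ever holds the task ID, so it can poll for state instead of keeping an HTTP connection open while the job runs.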
Oh, a key aspect of this is that this is scalable in different ways. You can scale each one of these pieces individually. So if you have a lot of clients making requests all the time for content, you can scale that part by itself. If you have a lot of churn, you can scale the workers. We have one big user; they have like 10 or 12 different teams that are constantly making changes to their code all day long, and they rebuild the whole world like every 10 minutes. It's kind of nuts, but they do it. And so they need a lot of workers to do all that churn and do all that rebuilding and republishing of content. So you can scale these in a lot of ways across many machines and make it highly available. And finishing up with the Pulp overview, chapter one: Pulp is meant to be integrated with. It's fine standalone, but it really shines when you integrate it with a build system and whatever your workflow is for your organization. We have a REST API, of course, with documentation for that on our website. We publish events to an AMQP exchange, so you can monitor what's going on and respond to those. And then we can also do HTTP callbacks for certain events to trigger some action in your workflow, as required. Okay, so we covered a lot of ground real fast just now. I'm going to take a sip of water. Before we move on to chapter two, which is OSTree, are there any questions about that? Please. Okay, so the question is, roughly, does Pulp have the capability to sign things, like the repository metadata it creates? The answer is no, but we'd love to. That would be a great feature. And we could brainstorm about that sometime if you want. Yeah, we would love to have that. Right now it's an external operation. We'll create it for you; if you want to sign it, go for it. Okay, we'd better move on in the interest of time. So what is OSTree? Now, you all came to a talk about managing OSTrees, so I'm guessing most of you know. A few of you are probably just fans of Pulp. Maybe one or two of you wandered into the wrong room. Like this person.
Come on in. Join us. Welcome. Thank you. So I'm going to try to give a real quick overview of OSTree. In fact, Colin has some great talks that you could watch on YouTube that go into much more detail on what it is. But here are the three key points that I take away from this. So think about immutable file systems. A nice starting point that many of us are familiar with is the idea of a golden image deployment, where you build some virtual machine image to deploy in your infrastructure. It has your operating system at a certain state. It has maybe an application built into it, and most of what you need to go. And after you build it, it's immutable. You track it, you can hash it, you can sign it, you can then ship it off to your infrastructure and deploy it, and you know exactly what's there. And you know it's the same everywhere. And then when you need to change something, whether it's a security update or an update to your application with new features, you build a new image, throw away the old one, and deploy the new one. And again, you can track it. So this is sort of an old concept of immutable file system trees. Atomically switching between those is another feature that we're familiar with in this same workflow, where you're not going in and running an update on this infrastructure. You are taking an image that you built completely new, you're testing it, and only then rolling it out to your infrastructure and completely replacing this other totally immutable file system tree. Now, here's where OSTree gets interesting. It has Git-like content-addressed repositories. What does that mean? Most of us, I think, are familiar with Git. So imagine taking your entire root file system, approximately, and putting the whole thing in Git and committing it. And then you have a branch name where you can track that state of your file system. It's now immutable, it's hashable, it's signable, and you can track it.
And if you want to make a change, instead of creating a whole new image and deploying it and testing it, as in the golden image idea, you commit something new, just like in Git. You create a new commit, a new hash, a new trackable thing, and then you can move your branch head or create a new branch, just like you would in Git to keep track of source changes. So it's really, really fantastic. That brings along a lot of interesting opportunities, including the ability to now atomically switch between two branches within an OSTree instead of having to ship an entire new image out to the periphery of your infrastructure. Okay, any questions about that? I hope not. Okay, we gotta move on. And, oh, the last thing. OSTree is a library primarily, which I really appreciate, and then a client secondarily. So there's a command line client, but it's billed as essentially a demo and a showcase for the library. So Pulp makes heavy use of the library in doing everything that we do with OSTree. So let's look at examples. This is what pulp-admin looks like. That's our command line interface to Pulp. pulp-admin is just a REST API client, so everything you're gonna see here with pulp-admin you can do directly through the REST API, and more, to be frank. It's hierarchical. So in this case, let's walk through this left to right. pulp-admin. OSTree is the plugin we're operating with right now. Repo: we wanna do an action on a repository. Create is the action we want to take. I've given it a unique ID. I can make up anything I want; I chose F25. I gave it a feed, a URL to where I know that Fedora has an OSTree repository. And then this last line, the third line, is a branch name. So I chose a specific branch I wanted, and Pulp will only retrieve that branch. By default, if I did not do that, it would retrieve all of the branches that are available, which in some cases could be a lot.
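To make the "Git-like, content-addressed" idea from the OSTree overview concrete, here is a toy Python sketch. It is not the OSTree on-disk format or the libostree API; the class `TinyTreeRepo`, its methods, and the branch name are hypothetical. It only shows the two properties the talk leans on: identical content hashes to the same object and is stored once, and a branch is just a name pointing at the newest commit hash.

```python
import hashlib
import json


class TinyTreeRepo:
    """Toy content-addressed store (hypothetical, not OSTree's format)."""

    def __init__(self):
        self.objects = {}  # SHA-256 hex digest -> bytes
        self.refs = {}     # branch name -> commit digest

    def _put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.objects[digest] = data  # identical content lands on one key
        return digest

    def commit(self, branch, files):
        """Store a tree of {path: bytes} and move the branch head to it."""
        tree = {path: self._put(content) for path, content in files.items()}
        commit = {"parent": self.refs.get(branch), "tree": tree}
        commit_id = self._put(json.dumps(commit, sort_keys=True).encode())
        self.refs[branch] = commit_id
        return commit_id


def demo():
    repo = TinyTreeRepo()
    branch = "example/branch"  # made-up branch name
    first = repo.commit(branch, {"/etc/motd": b"hello", "/usr/bin/tool": b"v1"})
    # Only one file changes; the unchanged file is not stored again.
    second = repo.commit(branch, {"/etc/motd": b"hello", "/usr/bin/tool": b"v2"})
    return repo, first, second


if __name__ == "__main__":
    repo, first, second = demo()
    print(first != second, len(repo.objects))
```

After the two commits the store holds five objects (three file contents and two commits) rather than six, because the unchanged file is shared between both trees.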
Once we've created this, and it's interesting actually to point out here, at this moment nothing has happened except we've created some records in the database. We haven't written anything to disk. If you went looking on the disk for an actual OSTree repo, you're not gonna find one. So far it's very lightweight, just tracking data in our database. So now it's time to retrieve some real data from an OSTree. So we run this sync command. This goes through that asynchronous workflow that we looked at in the diagram earlier. So the REST API client, which is pulp-admin, makes its request, so on and so forth. Some progress spinners were going here, as you can see. But this is what it looks like when it's all done. A worker process was writing progress state into the database as we were going, pulp-admin was polling to get that progress, and the whole thing just worked seamlessly. So what did we get from this sync operation? This is the important part here. So I've run a repo search command. We're searching within a repo. I didn't provide any filter argument, so it's showing us everything, which is exactly one commit, because we only wanted that one branch. And I've highlighted the three key attributes of this commit that Pulp tracks and that make this useful to you. So the remote ID is really just a hash of the URL of the remote source, so it's a way for us to keep track of, for any given commit, what remote repo it came from. The branch name, I think you're familiar with that. We saw that. We specified that. And then the commit ID is the thing that will change over time, just like in Git. So you can see from the dates, I did this several days ago, sometime last week. If we did another sync right now, we might get a different commit. And when we did this search, we would see a second entry here: same remote ID, same branch name, but a different commit. And this is where Pulp enables you to start being able to do things like reconstruct at a point in time.
What was the latest of this branch? What was at this branch at this point in time? Or do a snapshot, so you could recreate that later. We'll see an example of how to create a snapshot in just a moment. This history tracking is very powerful. And this version attribute I'll point out real quick. It's not part of the OSTree spec itself, formally. My understanding is it's like a de facto standard. They're kind of pushing along to make it unofficially official. So it's highly recommended. And it's a way that Fedora or anybody else using OSTrees can express the relationship between one commit and another as history moves along on any given branch, or even among different branches, depending on how you want to use them. Okay, how do we make a snapshot? We're going to go through it right now. You can create a brand new repository. And you see it highlighted in orange here; this is Pulp, after all. I just picked a new name, and I put a date in there just to keep track of it. That was the date I did my sync. And again, there's nothing in this repository yet. We didn't make any changes to disk just now. We just created a record in the database. So now we're going to copy all the content from the F25 repo to this new repo that I just created. And that copy, again, is just records in the database. It makes a new association between that commit and this new repository. And now, if at some point in the future we want to make that available to clients over HTTP, for example, then we would run this publish operation. Publish is, as we talked about earlier, that operation of actually laying down some files on disk and making them available to a client. And for all those kinds of operations, we're using libostree, which, again, is fantastic to work with. And we're very thankful that we're able to leverage that to do these kinds of operations.
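The tracking and snapshot steps above can be sketched as plain data manipulation. This is a hedged illustration, not Pulp's data model: the class and function names, the feed URL, the branch name, the commit IDs, and the dates are all invented, and SHA-256 is only an assumption, since the talk says no more than that the remote ID is "a hash of the URL". What the sketch does reflect is that each synced commit is identified by remote ID, branch, and commit ID, that a copy only adds associations, and that the history allows a point-in-time lookup.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CommitUnit:
    remote_id: str  # hash of the remote URL
    branch: str
    commit: str
    created: str    # ISO date, so plain string comparison sorts by time


@dataclass
class Repo:
    repo_id: str
    units: set = field(default_factory=set)


def remote_id_for(url):
    # Assumption: SHA-256. The talk does not say which hash is used.
    return hashlib.sha256(url.encode()).hexdigest()


def sync(repo, url, branch, commit, created):
    """Record the commit currently at the head of a remote branch."""
    repo.units.add(CommitUnit(remote_id_for(url), branch, commit, created))


def copy_units(source, dest):
    """Snapshot or promote: add associations only, no content is copied."""
    dest.units |= source.units


def latest_as_of(repo, branch, date):
    """Return the newest unit on a branch created on or before a date."""
    found = [u for u in repo.units if u.branch == branch and u.created <= date]
    return max(found, key=lambda u: u.created, default=None)


def demo():
    url = "https://example.com/ostree/repo"  # made-up feed URL
    branch = "example/branch"                # made-up branch name
    f25 = Repo("F25")
    sync(f25, url, branch, "aaa111", "2017-01-20")
    snapshot = Repo("F25-snapshot-2017-01-20")
    copy_units(f25, snapshot)
    sync(f25, url, branch, "bbb222", "2017-01-27")  # a later sync
    return f25, snapshot, latest_as_of(f25, branch, "2017-01-25")


if __name__ == "__main__":
    f25, snapshot, as_of = demo()
    print(len(f25.units), len(snapshot.units), as_of.commit)
```

The snapshot repository keeps pointing at the one commit it was given, while the source repository accumulates history with each later sync. The same `copy_units` step is all a promotion from a testing repo to a production repo would need in this model.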
So, a couple of quick use cases to go through that we've been kind of hinting at and beating around the bush about anyway. You can just do mirrors, raw mirroring of OSTrees. If you want to have an onsite mirror, this is a fine way to do it and have a little extra functionality around managing what content is available where, to which parts of your infrastructure, for example, and maybe metering it out: I want to receive a copy of the latest and greatest thing right now in this upstream repo, but I'm not going to copy it over to the production repo just yet, until I run through some tests, that kind of stuff. And then the other prime use case for Pulp is promotion. So you can walk these things, just like RPMs or Docker images or something else, through a promotion workflow with the same copy operation. It's that same idea: I start with, like, a raw latest-and-greatest repo, then I have a testing lab, and I can make stuff available there and see how that goes. And then only when I'm totally satisfied, promote that up to my production infrastructure and make it available there. So it gives you just that kind of better control. So I'm out of time. But the good news is I do have stickers. Actually, Brian has stickers over there. Brian will say hi. So he'll give out some stickers after the talk. But for now, what questions do you guys have? There's a lot of territory to cover. Please. Ah, okay. So the question, I think, is about OSTree, and it's about why change the... For people familiar with Git, why not use the exact same interface and look and feel as Git with OSTree? The answer is I have no idea. I don't work on OSTree. Please. So I could try to repeat that for the record. I'll try to remember the things you just said and say them quickly. One is that Git does not do... OSTree does not do extended attributes. Other way around: Git does not do extended attributes. Thank you. Okay. And then SHA-256. You want to use SHA-256 with OSTree.
And the third one is that Git requires a service to serve things, whereas OSTree does not. You can just put static files on a web server. You can rsync it around and whatever. Other questions? Does Pulp have a push feature? A push versus a pull? Can you elaborate on what you were getting at? I see. So if I understand, I think what you're getting at is: in addition to retrieving content and giving you an onsite copy of content, if you want to manage it there and then push that out to somewhere else, definitely we can do that. Is that not the question that you're getting at? It is. Okay. So yeah, we even have an rsync functionality that was added recently where, for certain content types that we manage, you can, after a publish, have Pulp rsync that content to some remote location to be served by a CDN or something like that. You're interested in? Oh, uploads. Thank you. Okay, great. Yes, you can also upload individual artifacts to Pulp. I don't think you can do that with OSTree at this point, because it's just a lot more efficient to do a sync, even if you were doing a sync from your local machine. But you can, for most other content types, upload artifacts directly to Pulp. Out of time? One more question? One more question. One lucky person. Yeah, so the question is, can Pulp do snapshotting or...