Good afternoon, everybody. It's nice to have you all here. We're going to be talking about the Linux System Roles project, and I think we'll start out by introducing ourselves. My name is David Lehman. I'm an associate manager at Red Hat; I focus on storage management for the platform. Before that I was an engineer, and I worked on Anaconda, the OS installer, for many, many years.

Hi, can you hear me? Okay. I'm Shirly Radco, from the Tel Aviv office in Israel. I've been at Red Hat for the past six years. I'm a senior BI software engineer working on the oVirt project and also on the Linux System Roles.

So let's go over what we're going to discuss today. We're going to give you an overview of what Linux System Roles are, introduce you to two new roles we've been working on, the storage role and the logging role, and show you a quick demo of how it actually works.

I think many of us have been here before: sitting pretty with some cool automation scripts, and then something changes in the operating system, and all of a sudden they're not so cool. New releases with new features and new configuration tooling have a tendency to break management scripts. Some examples of this: do you remember when Linux transitioned from System V init to systemd, and all the heartache that happened then? How about going from manually editing System V network configuration files to NetworkManager? More recently we switched from iptables to firewalld. The list goes on and on, but the common theme is that things are changing and it's breaking people's automation. Sometimes, in the thick of it, it's hard to remember that these changes could also be thought of as progress, or as technological advancement, because they present not only a real but an ongoing problem for systems management.

So we set out to try to improve the situation somewhat, and one thing that became clear right away is that the thing that's changing
is not the what, it's the how. We realized that we're still doing all the same tasks, right? We're setting up services, we're configuring network interfaces, firewall rules, and so on and so forth. The thing that's changing is the tooling used by the operating system to manage these configurations. So we realized that a big piece of getting past this problem is a way to express the configuration such that it conveys the essentials of the configuration itself without getting bogged down in the details of the implementation currently used by the OS. Another way to say that, in more common software terminology, is that we wanted to abstract away the implementation so that the user can just express the important part: the pieces that transcend the current tooling of the operating system.

Okay, so for what David mentioned, we've developed the Linux System Roles. So what are the Linux System Roles? They are a collection of Ansible roles and modules. How many of you are familiar with Ansible? Nice, nice.
So I think at least one of you is not, so let's quickly go over what Ansible is. Ansible is an open-source automation platform. It's very simple to set up, but very powerful. It's composed of an engine and YAML files in which you create your recipes and call out to the modules from that engine. It helps with configuration management, application deployment, and task automation. Like I said, you can just create a recipe and deploy it on top of your environment, across flavors of Linux like CentOS and Fedora, and on multiple hosts, and get to the desired state of the machine. And it does not use an agent on the remote hosts, like Puppet or Salt do.

So with the Linux System Roles, what we did is basically create a consistent configuration interface for RHEL, Fedora, and CentOS, like an API, basically. We created an abstraction layer over the implementation, so you don't need to be bothered by the tooling underneath; your configuration will be the same even if the underlying technology changes. It's maintained by the subject matter experts, experts in networking, storage, and so on, and it evolves with the subsystems. So even if in the future we change the technology underneath, across updates your configuration will stay the same.
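As a rough sketch, applying one of these roles from a playbook looks something like the following. The role name matches the project's Ansible Galaxy namespace; the variable shown is illustrative rather than a complete reference:

```yaml
# Sketch: apply the timesync system role to a group of hosts.
# The variable shown here is an illustrative assumption, not a
# complete reference for the role's interface.
- hosts: all
  vars:
    timesync_ntp_servers:
      - hostname: 0.pool.ntp.org   # example NTP server
        iburst: yes
  roles:
    - linux-system-roles.timesync
```

The point of the abstraction is that this same playbook keeps working whether the host ends up running ntpd or chronyd underneath.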
That's the intention. It's compatible with and tested on RHEL 6, 7, and 8, and also on Fedora. Currently we have several roles that are already released and tested, like network, selinux, timesync, postfix, and kdump. There are also roles that we've started working on or are planning to work on, like storage, which is supposed to be released in RHEL 8, and the logging role, which is still in development, plus additional roles for Image Builder, Cockpit, SAP HANA, and so on.

Okay, so now we're going to dig in a little bit to some of the details of two of the emerging roles. We're going to start out talking about the storage role.

All right. The first thing that I think everybody thinks about when they think about storage is complexity, and this is unfortunate, because we're not all trying to optimize something to get the absolute maximum number of IOPS or whatever. The truth is that for most of us, storage management should be pretty simple, because most of us have pretty pedestrian needs: we want to create a couple of volumes, put a file system on there, and set it up to be mounted, and that's it. So when we set out to do the storage role, the overarching principle, or goal, was to simplify local storage configuration for the vast majority of cases.

Within that goal there were several other principles, building blocks or pillars if you will. The first one is that if you're going to make things easier on people, you have to provide a nice, concise way for them to express or define the configuration they would like to see on their system: a minimum of boilerplate, and just as little typing as possible. The next thing, which helps us achieve the first, is to provide reasonable defaults where possible. So, for example, the default volume type is going to be LVM, and the default file system type is going to be XFS. That may change in the future, but the interface is going to remain the same.
So, for example, who knows, in Fedora 35 the default volume manager may be Stratis, but that won't change the interface of the storage system role.

The next thing is that we wanted to handle non-essential details automatically. Partition allocation is a perfect example of this: if you're creating an LVM volume, you shouldn't have to know how partition allocation works. You shouldn't have to specify a disklabel type, a partition table type, flags, start sector, end sector, none of that. All of that is taken care of for you. If you want to create an LVM volume group, you tell us what disks to use; we do the rest.

Lastly, we wanted, as much as possible, to use existing logic to do the heavy lifting. There are already storage management APIs that have been tested and used for years. One of those is blivet, a Python module that's been used for storage configuration during the OS installation phase since Fedora 11. It was split out into its own package in Fedora 18, so it's been tested quite a bit, and that's what we're using to do the heavy lifting here.

All right, now we're going to do a couple of examples. The first example is, I think, as simple as it gets. What we're talking about here is just putting a file system on an unpartitioned disk, mounting it, and also setting it up to get mounted on boot. You'll see here that there's a list called storage_volumes. You put an item in the list, you set the type to disk, you specify a disk, you specify a mount point, and then you can optionally specify the file system, but it's going to default to XFS, so that's strictly optional. And that's it. That's all you have to do, and when it's done, your file system will be created, you'll have an /etc/fstab entry, and you'll be ready to go. Okay, the next example is a little more complicated.
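Before that, the simple whole-disk case just described might look something like this. The sketch uses the variable names mentioned in the talk; exact option spellings should be checked against the role's documentation:

```yaml
# Sketch: put a file system on an unpartitioned disk and mount it
# on boot. Variable names follow the talk (storage_volumes); the
# details are illustrative rather than authoritative.
- hosts: all
  vars:
    storage_volumes:
      - name: data
        type: disk
        disks:
          - sdb              # the whole disk to use
        mount_point: /opt/data
        # fs_type: xfs       # optional; XFS is the default
  roles:
    - linux-system-roles.storage
```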
This example creates what we're calling a pool, which in this case translates to an LVM volume group, and we're going to use that volume group to store the data for a MongoDB installation. There are two volumes we want to have here: one for data and one for logs. So again, what we're doing here is adding an item to the storage_pools list. There are just two lists, storage_volumes and storage_pools. storage_volumes is for volumes that are not in a pool, and that means disks. So we put an item in the pools list, we give it a name, we don't have to say that its type is LVM (but we can if we want to), we say which disks to use, and then we create a list of the volumes in that pool. The first one is data; you can see, again, we're commenting out the defaults here, hopefully as a visual aid. Really, all you need to define a volume is a name, a mount point, and a size. The second one is similarly simple.

Okay, then one more example, and this is just going to show some other options that are available to you. Again, we're creating a pool with one volume in it, and we're showing that you can specify a file system label, you can specify options to mkfs, and you can specify options to mount, and those mount options are obviously going to go into /etc/fstab as well. So that should hopefully cover a good bit beyond the absolute simplest of use cases.

Let's see. All right, now I'll give you a little bit of a feel for where we are and where we're going. The current status of the project is that we just released version 1.0.0 this week.
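Put together, the MongoDB pool example and the extra options just mentioned might be sketched like this. Variable names follow the talk; option spellings and sizes are assumptions to verify against the role's documentation:

```yaml
# Sketch: an LVM pool (volume group) with two volumes for MongoDB,
# plus the optional label / mkfs / mount options described above.
# Names, sizes, and option spellings are illustrative.
- hosts: all
  vars:
    storage_pools:
      - name: mongo
        # type: lvm         # optional; LVM is the default pool type
        disks:
          - sdb
          - sdc
        volumes:
          - name: data
            size: 60g
            mount_point: /var/lib/mongo
            # fs_type: xfs              # optional; XFS is the default
          - name: logs
            size: 10g
            mount_point: /var/log/mongo
            fs_label: mongologs        # file system label
            fs_create_options: "-b size=4096"   # passed to mkfs
            mount_options: noatime     # ends up in /etc/fstab
  roles:
    - linux-system-roles.storage
```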
It went out to Galaxy on Thursday, and it's currently in testing in preparation to be released in an update of RHEL 8. We're hoping it will hit Fedora 32 as well. The support there is for the whole-disk example that I showed you. You can also do a whole disk with a single partition that spans the whole disk; the purpose of that is, if you have a multi-OS situation, the partition table will signal to non-Linux operating systems: don't clobber this, somebody put something here. And then the last one is LVM. We don't yet have support for thinly provisioned LVM, or cache volumes, or RAID, or anything like that, but all that stuff is in the pipeline. It's just not in version 1.0.

Okay, so the roadmap. The order here is kind of arbitrary and subject to change, but obviously we're going to want to add support for block device encryption using LUKS, and software RAID; the advanced LVM functionality like thin provisioning, cache, and LVM RAID; support for compression and deduplication using VDO; and support for Stratis. All that stuff can be added without changing the interface. Another thing that's good about this roadmap is that the underlying library, blivet, already supports all but I think two of these things, so the amount of work is really limited to just the work in the role itself to plumb through to the underlying library. That means this roadmap should get traversed quickly, compared to if we had to implement our own storage library, that is.

All right, the next slide here is some challenging but possibly high-value features that are maybe or maybe not at risk: things that seem like they would be really useful for a lot of users, but there's a possibility they're not going to fit within the constraints of an Ansible role. When I say that, I'm talking about some of the principles of Ansible: that things should be
idempotent, meaning if you run it twice you should get the same result; and, at least, I'm not totally sure about this, but I think things are supposed to be deterministic as well. I think that's kind of part of the first one. We haven't worked out the details of how we can make these work, but we have varying levels of optimism for the various features here, so I'll spare you the suspense.

Automatic device names: a lot of the time you don't care what your logical volume is called, right? You just care about the file system and the mount point. So it seems like we could make things even easier for people; you could go from having to specify three pieces of information to just two.

Another one is automatic sizes: if you're creating a pool and you only put one volume in the pool, it stands to reason that you could make that volume occupy the whole pool if the user doesn't say otherwise. So that's another one that could be good, but I'm not sure if it's going to work or not.

Automatic disk selection, same deal: you can go out and list the disks in the system, figure out which ones have something on them, and use the ones that don't have anything on them.

The last one here is, I think, possibly the most useful of the bunch, and also the one I'm most optimistic about, and that's percentage-based sizes. I've been thinking about this the last few days, and I think it's going to work; in fact, I implemented it the other day. If you have a pool, and that pool could be either a partitioned drive or a volume group, and you have multiple volumes within that pool, it might be nice, if you don't know the exact size of the drive, to be able to specify the sizes of the volumes as percentages of the total size of the pool. So I think it's looking pretty good for that. Honestly, I don't think that's a risk for idempotence at all.
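Percentage-based sizes, as described, might eventually look something like this. This is a speculative sketch of a feature that was only a prototype at the time of the talk, so the syntax is an assumption, not released behavior:

```yaml
# Speculative sketch: volume sizes as percentages of the pool's
# total size. This feature was only prototyped at the time of the
# talk; the syntax shown is an assumption.
storage_pools:
  - name: app
    disks:
      - sdb
    volumes:
      - name: data
        size: 80%           # 80% of the pool, whatever its size
        mount_point: /srv/data
      - name: logs
        size: 20%
        mount_point: /srv/logs
```

Because the percentages resolve to the same absolute sizes on every run against the same pool, this is plausibly idempotent, which matches the speaker's optimism.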
So, pending review and all that kind of stuff, we'll see how it goes, but I think that one's going to be good. And that's really it for the storage role, so thanks, guys, and Shirly's going to tell you now all about the logging role.

Okay, so, the logging role. Like we said, we intend to provide a higher-level architecture, sort of an API, in the Linux System Roles, and for logging the idea is pretty basic: what we want to do is collect logs from different target locations and then ship them to different destinations. We want to make the configuration as simple as possible, and to be able to spread it across our systems, across the multiple hosts that we have. So: collect multiple logs to multiple destinations, and apply default settings when we can.

As the base of the logging role we used rsyslog. rsyslog has been in RHEL since RHEL 6, almost 10 years now, and it's become the basic default logging tool for RHEL. It gives us multi-threading, secure connections, and diverse destinations; it can filter on any part of the syslog message; it has a fully configurable output format; and it's suitable for building relay chains. And by the way, we did try other collectors, but at least for now this is the one we feel most comfortable with in regard to performance.

When you run the logging role by default, just like this, what you'll get is that the role itself will make sure you have the latest rsyslog package, and it will deploy the default rsyslog config, which collects journal records and sends them to files based on the application that sent them, for example /var/log/messages and /var/log/cron, and based on their severity; for example, if you have urgent logs, logged-in users will get a notification. So this is the default.

Another use case that is already implemented: if you already have your own rsyslog configuration file and you want to spread it
across your environment, you can use the custom files option. In the variables you have logging_outputs, which is a list of the outputs you want to configure. In this case there's the custom-files output, and you simply need to state where your file is located, and it will be deployed all across your systems.

Another interesting use case is being able to ship your syslog data to Elasticsearch. This is already implemented; we have an output to Elasticsearch, both with certificates and without. So you can send your journal information to Elasticsearch and have one source of logs to ship everything into, and you'll be able to use that platform to dig into your logs and see what's going on. This is an example of how we do that: again, we have logging_outputs; in this case the type will be elasticsearch. You simply need to state where your Elasticsearch instance is, what index you want to ship to, and, if you have certificates, where they are; and you mention which logs you want to collect, in this case the journal.

We are already using the rsyslog role in production in oVirt, which is the virtualization management system. We are using it for shipping both metrics and logs to Elasticsearch. For metrics we are using collectd, and we added a plugin that sends the metrics in syslog format to rsyslog over TCP; and we are also collecting the oVirt application logs, and we are sending them to Elasticsearch.
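The Elasticsearch output described above might be configured roughly like this. The logging role was still in development at the time of the talk, so the key names here are illustrative assumptions rather than the role's final interface:

```yaml
# Sketch: ship journal records to an Elasticsearch instance via the
# logging role. Key names are illustrative assumptions; the role
# was still in development when this talk was given.
- hosts: all
  vars:
    logging_outputs:
      - name: to-elasticsearch
        type: elasticsearch
        server_host: es.example.com    # your Elasticsearch instance
        server_port: 9200
        index_prefix: project.         # index to ship into
        # ca_cert: /etc/rsyslog.d/es-ca.crt  # optional certificates
        logs:
          - journal                    # which inputs to collect
  roles:
    - linux-system-roles.logging
```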
This gives us a full monitoring solution. It provides a way to visualize everything with Kibana, create prebuilt dashboards, and do alerting. We are also going another step in our next release and creating a UI in oVirt based on the Linux System Roles, so we will have the option to select the list of hosts on which we want to deploy rsyslog and collectd, and then the underlying infrastructure will create the variables for the role and simply run it, and all of the hosts we've selected will be configured to send their logs to Elasticsearch. And this is the result you'll get: the Kibana dashboard. You'll be able to drill down into your logs and do the analysis there.

So, the status of the logging role: it's still in development, and it's still missing some of the features we're planning to add. Currently we support, like I said, the default rsyslog config, sending the journal to Elasticsearch, and deploying custom configurations, and it's already used by oVirt. On the roadmap we're planning to add profile-based configuration; we have the general use case, but we also want to add additional profiles for resilience and security. We want to add additional inputs; this is something we're still thinking about, and we're not sure what the additional inputs will be, so if you can think of good use cases, we'd appreciate it if you go to the project and create an RFE. This will certainly help. And additional outputs: currently we plan to add remote rsyslog, and remote message buses like Kafka and AMQP.

Now it's demo time. Okay. So this is a demo that was put together.
It's going to use several of the system roles to configure a system: it's going to configure the firewall, some network interfaces, and time sync, and it's also going to set the system up to run the Cockpit web console and Composer, the image builder. This is not live. The beginning here is just going through some of this, showing that you have to have a couple of packages installed, and you have to have several of the system roles installed via Galaxy; those are some of the prerequisites.

Then we're going through the top-level variables for the file. Basically, these variables are to set up a couple of network interfaces and to set up time sync. As you see there, the way this playbook is structured is that the timesync role is invoked up there directly, and the other roles are done as tasks. That does two things: one, it allows us to control the order of execution, and two, if one of the earlier roles or tasks fails, it won't try to keep going.

That one right there is going to install the Cockpit web console; nice and simple. The next one is going to set up the firewall for Cockpit; again, hopefully the language there looks concise. The next thing is to set up a volume for Image Builder to use. Using the storage role, it's going to create a pool with a single volume in it, and set that up to be mounted at /var/lib/lorax/composer. And that's all conditionalized, if you wanted to skip it for some reason. The next thing we're going to do here is set up the Image Builder GUI; again, I think that looks like less typing than the alternatives. And then this is just an example down here of using the networking role to set up a bonded network interface. I'm going to level with you:
I don't really know the technical details of setting up a bonded network interface. I was an RHCE for RHEL 5; it's been a while. [Audience comment, inaudible.] Yes, exactly. And then of course the other thing is that this is going to run on RHEL 6 or RHEL 7 or RHEL 8, or RHEL 25, or, you know, Fedora 32. Then we set up the components of the bond down there.

All right, so then here, this is just showing the initial network interfaces. You can see there are two extras there that are not configured, not being used. The next step is to run the playbook and then verify the results. Okay: setting up the time sync, configuring that, setting up Cockpit, setting up its firewall, that's doing the storage, now that's done, setting the mounts, then that's setting up Image Builder, and setting up the bonded interface, and that's it. You can see that it succeeded; six of those plays changed things. And now the bonded network interface is going to show up.

What's next? Okay, this verifies that the Image Builder services are running, and then you can see, among all the other file systems, the new one we created for Image Builder Composer. To get a little more detailed view of that, here's a look at the LVM setup; the last one there is the one we just created. Incidentally, there are already fixes written; they're not upstream yet, but there are fixes. You can see there he chose to make the volume 19.5 gigs.
That's because of the metadata used by LVM. If you just tried to run an lvcreate with 20 gigs, it would not work, because the disk is 20 gigs and there's that metadata usage. I've got a fix in my local tree that I've tested that will just trim the requested size as needed in that case, and then of course the percentage-based sizes will solve it even more elegantly, I would say.

So this is the way you can get the roles from Galaxy; you simply need to call them. And here are the links to the landing page, documentation, and GitHub repo. We encourage you, if you are an expert in your field and want to contribute, to go there; that would be great. And if you want to try it out, some of the roles are already in Galaxy and some are under the GitHub repo. Give it a try, please, and tell us what you need, what other features you need, and we'll try to integrate them. Any questions? Yeah, go ahead.

My name is Alex. First off, let me thank you for the work that you've done; as a sysadmin, I appreciate it a lot. It's going to help us a lot in our future projects. The first question is: does it currently support non-Red-Hat Linux systems?

That's a good question, and I think I could either be slimy and say that's unknown, or I could say no. I don't think we've put any effort toward running it on Debian or Ubuntu yet.

Thank you. Are there any plans to convert these to a Red Hat supported product? Because all of us probably know that Puppet got the equivalent, Puppet Enterprise, where they encourage all the other software developer communities, even proprietary companies, to develop Puppet modules.

Great question, thank you. My name is Terry and I'm on the RHEL product management team. They do all the wonderful work, so thank all of you. But yeah, I can answer that question.
So yes, they are supported today. They're in Galaxy as the Linux System Roles, so you can install them that way. They are shipped: Michael DePaulo, here with us, does the Fedora packaging, so he packages them up as a Fedora RPM called linux-system-roles. In RHEL, they are in the Extras repository for RHEL 7, and they are in the RHEL 8 AppStream repository as the RHEL System Roles. Galaxy is the native Ansible way of getting them, and that's going to match GitHub, kind of like the latest and greatest upstream as they add new functionality; then, as we test it, document it, and do all of the CI testing against it, we ship it as part of the RHEL System Roles package. So, fully supported.

Ansible is a little bit tricky. We make Ansible Engine accessible with your RHEL subscription so that it's easy to get to, and we have so many layered products that depend on it now. But if you want full, broad Ansible support, you still need to buy an Ansible Engine subscription. We support this as, like, our user contract, if you will, because we need to make RHEL consistently configurable as you move from RHEL 6 to RHEL 7 to RHEL 8 and beyond. We owe that to you, so we're committed to supporting that for free.

Yes, that's one of our friends who planted the seed for this project, so thank you. So, that's a long-winded answer.
I hope that helps. Thank you, Terry.

So, as somebody who's done Ansible for a long time, one of the things that often frustrated me is that as my code got more and more complicated, I couldn't refactor it, because I'm in YAML and not in Python, and the underlying mechanism for all this stuff is Python. Has there been any thought, now that you've got it working, of taking what you've got and making it into Ansible modules, and perhaps even cleaning up the internal object model within Ansible so that those are reusable components? So that, for example, I don't have to go across the wire for each call, because that really does slow things down: I have to do an SSH call for each individual module task. There's also maintaining state inside of Ansible: if you look at the playbooks you're specifying for these, you have global variable names, which, as a long-time programmer, makes me wince. I know that's the tool you have to work with, that's what Ansible puts in here, but if they were Ansible modules, then they would be explicitly namespaced by those modules. Has there been any thought, and I understand you've got something working, and this is the reason Ansible is so successful, you can share your code this way, but the next step is to make this stuff maintainable. Is any effort going into moving it to the right abstraction level, which is the Ansible module, not roles? I know I shouldn't be asking this question, it's something on the sales side, but I have to.
It's something on the sales side, but I have to So I think it varies from role to role what the architecture is actually I initially implemented the storage role using pure ansible using the existing ansible modules that are out there But like you described the yaml became a nightmare And I got to the point where the next thing was block device encryption And I was just like I would rather quit my job than even try this in the yaml so Because in storage, right there's arbitrary Stacking of the different layers and and the animal just is not equipped to deal with that So what I did was and I refactored everything into a python module So really if you were to look at the tasks the main tasks file for the storage role, there's almost nothing like There's like this really embarrassing thing where I spent 200 lines Filling in default values in some dictionaries because I don't know if yaml very well And then there's one module called to obliv a module called blivitt And then there's another one more call to a mount the mount module and that's it So I don't know if that's standard practice, but I think that to a large extent I know that the networking module looks like that or the networking role is similar So we did think about that. I don't know. It's probably not perfect, but we thought about it Thank you for this and I really appreciate the fact that you've got it connected to red hat virtualization That's very cool The thing about storage though is it's really scary to get right and it's really easy to screw stuff up And from what I saw here today was hey whatever's in the playbook just happens if I find a disc there I just write all over it. Is there any sort of safety in there that says Ooh, there's already a file system there or oh, by the way, it's already mounted somewhere That that protects me against doing stupid stuff Well, so for example if there is already a file system there that is of the of the Same type. 
I think it's just going to get used There storage is it's funny because I agree with you that storage is scary But I I don't agree about how a store just scary I don't think it's scary that when I push go it goes But what I do think is terrifying is all the different permutations of how things can be right like I can say create me an xfs file system on sda and mount it at foo And then when I go to look at it in ansible there may be An xfs file system on sda, but it may not be mounted at foo There may be one already there and it's mounted at foo, but it's got the wrong stuff on it There could be one, you know, like the possibilities of what you have to manage are incredibly complex and For the most part you're going to get what you asked for Um, so if you for example if you have some if you have an lvm stack on vdc And then you tell us to create a file system on vdc. We're going to wipe it off before we do that Um, and you know, that's the thing is everybody likes to put the safety buttons in storage, right? Are you sure but you can't do that with automation because it's no longer automated So I would love to have a gentler Response to that, but I don't know what it would be So to kind of follow up on that actually, um a similar question I had to give a concrete example that concerned me You commented on having defaults for say the file system or the volume manager. Um, and that can change Oh, is it? I'm sorry Uh, should I just start from the beginning? Yeah, that'd be nice. I'm having trouble with mine too. So don't feel bad Okay. 
Yeah, my bad. So, to give a concrete example of what he just brought up: with the storage role, you commented on having defaults for, say, the file system or the volume manager or something like that, and there's of course the possibility that could change, which is fine. But let's say I run this on a system, I upgrade, and then I run it again, and the default file system changed. What happens? Do you reformat my disk, or what happens there?

That's an excellent question. If the thing is already set up, that's kind of our problem to solve: if it's already set up and it already has the same mount point, we're going to assume it's what it needs to be. What we'll have to do (we don't have it there right now, because the default hasn't changed yet) is add an additional layer of logic that differentiates between "you said LVM" and "you didn't say anything". You know what I mean? It will just be a little more gentle: if you didn't say anything, we'll know to be a little more flexible in using what we found.

Okay, thanks. I'm glad you thought of that. Yeah, I mean, you know, we don't know completely till we get there, but we'll work it out.

What? Oh, the screen was asleep. Okay. Like my colleague here, I want to congratulate you on this effort as well; it's much appreciated. Thank you.
I have many questions, but I'll go to the most glaring one for me. Is there a plan to incorporate iSCSI or other block-over-wire, or even NFS or other network file systems, into the storage role? Or is there a different storage role that is dedicated to this kind of technology?

Well, I'm suspicious of you now, because, as it happens, there is a sort of in-development, pretty nice prototype out there that is specific to remote storage. And we have talked about it. We haven't worked out the details, but we have considered dedicating this role to local storage and then having a dedicated remote-storage role that can manage those sorts of things.

Thank you. Thank you.