Alrighty, everybody. We're at the beginning of our time slot, so I'm just going to go ahead and get started as people filter in, and we'll catch them up to speed. I'm actually kind of excited that we have a smaller group, because then we can discuss the things you see in the presentation that you might want to expand on in a little more detail. That's always super exciting. We titled this project "Insight Out of CHAOSS: A Community Data Analysis Tool Workshop," and it's a pun that we're surprised no one else has made before, as far as we're aware. The objective of our work is to find insight using the data analysis brain share that CHAOSS has, so applying metrics and knowledge models to open source communities, but also to find new insights in the overwhelming amount of data and information streams that open source communities generate. And the way we're going to do that today is by showing you the tools we use to make that happen.
Here's a little reminder of the technical requirements for the follow-along component of this talk. We're going to need Docker and Docker Compose, because we're going to bring up a multi-container application, and then obviously Git, which hopefully everybody has on their laptop, to clone the repository where we have all the source. For the second part of the talk, we're going to need some relatively modern version of Python so that we can hopefully avoid any package discrepancies. I don't know how optimistic I am about that particular thing, so if we have any issues, we can settle on a virtual environment version that makes sense. And then we'll need Jupyter Notebooks. Jupyter Notebooks will let us go through an analysis project together, step by step, so that you can take it with you when you leave, maybe play with it, and get involved with the community that way.

Who are we to be giving this talk in the first place? It was on the initial slide, so I'm going to go back to it with our names. I'm James Kunstle, a software engineer in Red Hat's open source program office. And I'm Cali Dolfi, a senior data scientist in Red Hat's OSPO. We're both part of the community outreach team within our OSPO, and specifically we are the two of two for the data team within that. About two and a half years ago, under a different name, Cali started OSS Aspen, which at the time was called Project San Diego, the name it has carried in different conference talks. From there we have a couple of different repositories, named 8Knot and Repel, where we do all of our work. It's all open data app development and open data analysis.

So, our team's day job: our responsibility is to assist and support a variety of personas in the open source software space, particularly in Red Hat. Community architects are kind of our bread and butter.
They're the individuals we work with day to day and who we learn from. Business leadership frequently has questions for us. Technical contributors to projects have questions about how their decision-making will impact communities. And finally, community members: we talk to community members pretty frequently to help them figure out, at a higher level, what the right move might be for their community. We help them by providing data science resources to understand communities and to make good decisions about them in the context of their role. Someone in business leadership has a totally different perspective on community than someone who is strictly a community architect, or a community member, so we help figure out what the right strategy is for a community given their particular role. In this talk we want to share our process for doing this, and the resources that we have, so that more people can get involved in this space. Our fundamental objective is to support community sustainability by preparing and upskilling community members to make more data-supported decisions. So: bring data rather than gut. Or really, don't replace gut, because gut is really, really important when it comes to community data, but bring data into the mix so that people can see a higher-level perspective on community.

We're going to pause for just a second and get a survey of the room. Why was this workshop appealing to you? All honest answers are accepted; it might be that you didn't like the PG workshop or the Kubernetes workshop. We're just interested in what the level set is, who you are, and what you'd ideally like to get out of a workshop like this. Then we'll share our objectives and we'll try to meet in the middle, so that we tell you what we think is important from our experience and also put more pointed emphasis on the things you're interested in.
So, anybody who'd like to volunteer their perspective, just raise your hand and we can talk about it. [An attendee shares their background.] Awesome. And for the recording: I was just going to say that we designed 8Knot very specifically so that the barrier to entry, especially for people coming out of school or out of the data sciences, is really low. We'll go into this a little later, but it's very modular: okay, I want to work on a visualization; here are the files you need to touch, here's the process we'll need to follow. It's used in some entry-level software engineering classes for people to get their first taste of how committing to GitHub works and how contributing upstream works. So I'll be curious to hear your feedback at the end about how accessible it really is. Anybody else? [Another attendee responds.] Cool. Okay, phenomenal, fantastic. And I think we'll be able to really help with that, because a lot of what we do operates at multiple levels. Do you just need data that you can work with and trust? Do you want to go a step further and have pre-developed visualizations that you can consume and use directly? And then the next step, which Cali will cover, is how you extend the portfolio we've developed, which we think covers the table stakes of what most people, and what builders, have needed so far. That will be the later portion of this talk, so we hope it's really helpful for you. Cool, thanks for sharing. Anybody else? No sweat. Thank you so much to those who did share.

Here's what we initially wrote down that we really wanted attendees to get out of this talk. We wanted to level set a little bit in the first place and explain the motivations and novelties of the tools we developed. The two tools we're going to be talking about are Augur and 8Knot, to use the proper nouns.
Then we wanted to give you a working knowledge of how to get 8Knot running for yourself, which should be super easy: we've designed it so that if you have Docker Compose and the right credentials, it's just immediately possible. We do make the assumption, however, that you have a publicly available Augur instance, which we provide in our project documentation, and we'll share the credentials to access one that will be available for this demonstration. In the future, for any specific team, it's usually nicer to run your own, and we're working with the upstream community maintainers to make that even easier to get running. Then we're going to provide a framework for designing a visualization, or a brand-new metric, that addresses a question you might have about some OSS community. We've already developed that visualization walkthrough so that it's fine-tuned, ready, and consumable, but hopefully you can apply it to another idea of your own: you see something in what we've developed and think, oh, I really want to see this thing; how can I make that happen? We'll go through our framework for choosing specific metrics, and specifically we'll intersect with the CHAOSS project, which is, I've actually never said it out loud, and I don't want to short-circuit the whole definition: Community Health Analytics for Open Source Software. That's what I thought. It's one of the most fantastic resources for getting started in this space, because it gives you the first 99% of thinking about how to approach a community health problem, or to try to understand something in a ton of depth, and it's a bunch of documentation written by really, really smart people.
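Before the live demo later in the talk, it may help to see the whole follow-along setup sketched in one place. This is a sketch under stated assumptions: the repository path matches the oss-aspen/8Knot project named on the slides, the `env.list` variable names below are placeholders rather than the project's official names, and the real names and credential values come from the shared links document mentioned during the session.

```shell
# Sketch of the follow-along setup (assumptions noted above).
# 1. Clone the workshop source; per the talk, it starts on the dev branch.
git clone https://github.com/oss-aspen/8Knot.git
cd 8Knot

# 2. Create env.list in the top level of the repo. Variable names here
#    are placeholders; copy the real names and values from the shared doc.
cat > env.list <<'EOF'
AUGUR_HOST=<from the shared doc>
AUGUR_PORT=<from the shared doc>
AUGUR_DATABASE=<from the shared doc>
AUGUR_USERNAME=<from the shared doc>
AUGUR_PASSWORD=<from the shared doc>
EOF

# 3. Build and start the multi-container app (Docker must be running;
#    on a Mac, start Docker Desktop first). The first build takes a while.
docker compose up --build

# 4. Once the logs settle, browse to http://localhost:8080
#    (0.0.0.0 and 127.0.0.1 work too).
```

The placeholder values make this deliberately non-runnable as written; the point is the shape of the flow, clone, configure, compose up, browse, which the demo walks through live.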
I was going to say for this workshop the things that are under the chaos branch, the auger project which we'll go into and then specifically looking at their documentation around metrics and metrics models and how that goes into choosing what visualizations you look at and then how to develop your own visualizations from my personal experience with a lot of these visualizations I'm like the data scientist who made them the technical side of actually coding it is probably one of the easier parts of it the developing the idea out from I have this question or even what is my question to what visualization should I make is much more difficult than I think people put credit towards. Absolutely. So we'll show how to get one of those visualizations running after we've digested the information that we can find in chaos and then apply it to a problem and then we'll show how to complete the loop so we have a visualization like we've already ready baked the thinking of what kind of problem we want to address we know how to actually get it running in a notebook how do we add it to an instance of 8-0 connect 8-0 to an available auger instance here's a really brief overview of the timeline today these numbers are pretty much made up some things will take longer some things will be really really really fast all depends on how things go so meet and potatoes of the talk our motivation with this whole project the whole like essentially our day job is that there are challenges with working with open source communities at scale when you're an organization or you're a community member really the roles aren't hugely different many corporate relationships with open source communities and at consuming the software and the documentation that the communities are generating that they're creating and that they're working really hard on and part of the reason that it ends there is because open source communities are challenging to understand like when you're a brand new person or you're a 
brand new contributor to a project it's really hard to dive in there's a lot of documentation to read there's a lot of code base to read there's a lot of people to know it's really not easy to get involved unless you can maybe interpret things from a higher level going beyond consuming these resources as just a consumer from the perspective of an organization requires a more fundamental understanding of how a community works and so addressing that gap from our perspective is bringing data science into the conversation there ought to be a way to aggregate the trends that we see in communities and visualize the trends to find meaningful ways to promote community sustainability that's grounded in professional community architect strategy forgive me for literally writing down what I'm saying but I want these slides to be able to be consumed later rather than just bullet points and finally business decisions need to be in conversation with the realities that a community faces this will hopefully prevent missteps and optimize for the shared future where a company can be invested in communities communities can feel like they're being heard by the company that consumes their resources and a more equally profitable relationship can go forward principally we use two tools to help make this a reality the first part is a data collection and engineering application called auger auger consumes git so repository data and git hub and git lab api data it's one of the projects in the chaos project and I think unambiguously it's owned by Professor Sean Goggins of the University of Missouri eight knot is an extension of auger it's a data visualization and analysis application from us in red hats ospo from us the community data team those are our names so auger started out in 2017 as a research project in chaos the core objective of auger is to prepare open source software project data with traceable provenance and a high degree of verifiability we were talking to Sean and his quip was 
quote unquote somehow this was very interesting to numerous organizations ospo's and communities auger has participated in google summer of code for I think five years it's had a lot of contributors it's a major part of the chaos project along with other great projects here's how auger works in the abstract so gonna kind of break from the slides a little bit and just talk ad hoc first step is to clone that repository and from the available information there start building out a table of the contributors to the project based on the git commits the commits themselves and the available files so build out the file tree see who has contributed to which files who those contributors are at least the information we can get from a git commit I thought that they cloned it this is a different conversation that I had with Sean yeah it has to mine certain things locally because they're not of it so it's two major pillars yeah so then the second step is to get everything that we could ostensibly mine from the git repo itself then we go communicate or auger communicates with the github and gitlab api to flesh out more of the auxiliary details so we collect event data primarily pull requests comments on those pull requests what kind of comments are being made on pull requests issues and all the comments on those labels on issues etc that fills out a baseline high level perspective of community and the source code that the community works on and then a suite of day two so not immediate like not the primary collection but the second degree of analysis workers are run on that data which provides stuff like the OSS OSSF scorecard and other really useful metrics this is a really high level architecture of how auger works there's an auger tool which is quote unquote auger and there's a postgres database which is what auger writes to and that's the consumable resource from the work that auger does all of the second day data analysis goes in the database communication with github and 
gitlab is the external interface that auger has so we don't go communicate with auger we look at the database that auger fills in this is a more expanded technical architecture of how auger works so let's go through the life cycle of a brand new request auger and that kind of request could be I want to see I want to do a full stack collection of let's just take an example like the containers podman repository on github or docker docker or whatever so we schedule a task we schedule a task and one process worker picks up that task to clone the repo mine it etc and then the data goes through multiple validation layers and finally gets populated into the database it's a relatively simple flow with a ton of engineering behind it Sean has said it before that it's tons of data carpentry on top of mountains of data so in the end we have an excellent consumable resource I want to take a pause to go look at the schema so you can see kind of the high level boundaries of what data is available that auger provides to a relatively well filled in instance starting with the schema and I hope this is okay on this projector so this is the big blown up schema view of auger and I'm going to start in this little quadrant where we have data on commits so commits are referenced by a repository commit hash, author name author raw email etc I won't do it you've ever seen a database view before like this I just wanted to take a quick it's all little lines to show how all the different boxes are connected I just wanted to make sure key relationships so I'm just doing kind of a whirlwind tour so you know the boundaries we've got commits with all of the data that's necessary to describe a commit issues contributor affiliations if that information is available from github contributors I'm going to go out here and look up here the yellow boxes are usually the ones that we work with pull request commits so a back reference from a pull request to the commits that make it up pull requests and all 
this crazy amount of data to describe what the actual particulars of a pull request mean etc you could go crazy in the schema and then going another step come over here mouse looking at the data just from the from dvver so you can see the tables that are available linearly and exactly from the schema for instance commits data and indeed we can look at the available data and look at an example of a query commits are identified by their hash so those are in dvd like each hash per commit is unique we have the author name if it's available author rot email you get the gist this is the kind of data that's available in auger just so that when we go on to the next step which is visualization you have a clear picture of what data model we're working with under the hood so ideally now we have a database and we're just going to say okay this exists and we have a lot of great data about communities the particulars of how auger does all the verifiability layers and the validation is out of scope for this talk as is getting auger running but the docs are very good for that so we have all these events we have commits contributors etc but it's just a database so we need some tool to analyze the data and prepare visualizations and when we start talking about this people have said oh yay another data analysis application how is this going to be different from the one before the many before this is I hop in for a little bit because I think sometimes it's nice to know where the project comes from and how it's got started and so in around 2020 I was brought into red hat as an intern I was still in college and I was doing a lot of one-off data requests for the Ospo team and we were kind of all from a place of like this is we've never done any of this what do we want to look at what do we care about and so I spent probably about the first six months just doing those one-off requests like building stuff out directly from scratch for every single question that was had and spent a lot of 
time looking at the other tooling that was available and like looking at auger looking a lot of the different stuff and the biggest problem that I was having at the time was that I wanted to be able to directly access the data and to be able to like capitalize on the research and the work that's been done in data science in other fields so pretty much looking at Python or R and looking at the 25 plus years of research that's been done I'm like how I need to be able to access this data directly before I can use these packages for I can do all the data pre-processing and be able to develop those complex visualizations and some of the more goals that Red Hat had internally around the analysis that they wanted to do around our communities and so that's pretty much how 8knot got started was really just trying to take data science workflow and principles that were done in other topic areas and apply it to open source communities and see where we could go with it this is a more abstract version of that same story but it's more of the technical considerations on top of what Callie's already said so we went and built a new app from the ground up instead of taking advantage of some of the awesome things that already exist and this is our rationale so there's some absolutely fantastic applications for visualizing the data that Auger has in a relational database specifically Postgres Grafana, Tableau and SuperSet I've done a talk on this before absolutely fantastic and what they do where they sit in this stack is they really solve SQL based visualization preparation so in the context where you have data that is in a SQL database or database that's accessed with SQL you can prepare a visualization and these tools are fantastic at that but they don't support very easily at least we couldn't do it super ergonomically more like ad hoc visualization pipelines and like Callie said we really wanted to be able to leverage what we could find in the literature so that would mean taking 
advantage of machine learning machine learning opportunities and using distributed compute and specifically working with stuff in Python native pandas because it's very accessible for people to come in and use what they already knew in school so pandas is a huge part of that so here these were the packages that we were specifically excited about using and that we couldn't use in the context of Grafana and Tableau and SuperSet so PyTorch, Dask, Pandas, TensorFlow NetworkX etc so once we had committed to using some Python based stack we had to make a decision about how we were going to actually build a visualization app around that there are awesome frameworks for doing this and the way that we made our decision is we looked at the range of complexity Streamlet who here has used Streamlet ever so I know how deep I should go so Streamlet is really cool Streamlet is based on the concept that if you have a Python script you go from top to bottom logically and in Streamlet each part of a page on an app gets rendered in order so you can use any Python tools you want to render out a relatively simple page and so for most applications that's more than sufficient you have access to the full Python data science stack, you get a really pretty performant web application from it for data visualization but on balance we decided that it was a little bit more limited than we wanted because we wanted to have more user extensibility and we wanted users to be able to have their own accounts and create groups of repositories and stuff like that so that wasn't an ideal fit the next level up in complexity was plotly- most data scientists personas are very familiar with plotly and plotly started building this web application meta framework called dash I'll get into those details in a minute but spoiler alert that's what we ended up picking the most custom option that we probably could have picked was React plus fast API or React plus Flask or any front end UI framework plus any Python 
back end framework realistically but neither Kali nor I at the time was a UI designer we aren't now no which will be very apparent when we show you 8knot and so we made the concession we're not going to roll our own UI from top to bottom we're not going to get super into back end we're not going to go that far at that time so we trade the convenience of declarative Python UI development and I'll show you what that means in a minute which is inevitably slower than client local JavaScript UI code for the unrivaled ergonomics the data analysis integration that Dash gives you and again I'm just speaking in the abstract we'll see that literally in a minute so the final conclusion was that Dash was the ideal candidate for this and this is kind of the data flow in the abstract for how Dash works there's one common React front end and for every Dash application developed by everybody it's the same and its job is to consume information from some flask back end or I think Julia and R are also supported and so it takes some piece of information that says this is the name of the component these are the styling things for the component this is the data that we want and then the React front end just renders it so all of the UI reactivity to the user and everything is driven from the back end the front end is pretty thin it's relatively slow because user interactions are the wire but it's also really easy to work with because we didn't have to go learn React to get this thing booted up it was very nice this is how we manage scaling this visualization task set at a higher order of abstraction so we have the application server which communicates with two caches cache that does back end processing for visualizations for us and another that goes and collects data from auger and then cache is it we'll see a literal representation of that a little bit later but the fundamental thing to take away from this is that 8nought is capable of scaling pretty arbitrarily for the amount of data 
you need to collect and cache and the amount of analysis you need to do for incoming consumers and that's a lot of the plotly-architectural framework that's available so building this as a meta application where we use auger as the data back end and 8nought as the visualization front end looks kind of like this so 8nought queries data from the auger database caches it and analyzes it the user communicates with the flask back end with this React UI it's all very nice and convenient and the key observation for this is that 8nought communicates with auger over some arbitrary distance the expectation isn't that you deploy auger and 8nought together you can use one common auger back end for any arbitrary number of 8nought instances and we haven't built 8nought with the consideration or with the expectation that auger is going to be running side by side with it which makes it really convenient for anybody to deploy in their own instance of the application for their use case so as promised we're going to go through a little demo of actually getting 8nought booted and then do a little tour and show how we connect it to auger and how we can work with some user accounts so if you're a user and you want to see a specific slice of repositories on aggregate how you can do that all of the links are on the scale if you look at on the scale website on the schedule there will be a doc with all the links I wasn't able to figure out how to do public sharing so everyone please request I'm just going to be sitting here on my email waiting for those to come in for then everyone can get access to everything but yeah all of the links from now until the end of this workshop should or will be available on there I'm going to figure out real quick I'm going to mirror my displays so that we can work together on this so so I yeah yeah well I'm going to go through that too so I'm just starting from an empty ish directory where I've already moved the configuration files that we'll need into it 
just so that it's easier and you don't have to watch me copy paste but I'll show you where we get those environment variable configuration things in a sec so first step the slide back I don't know where that slide went thanks we have a reference to our source repository which is OSS Aspen 8 knot and we can clone clone that and I'm just doing it into oh gosh an access request oh I see so if you go onto the scale website for this and so we have it in there and so then that's the one link so we can get access to it all the stuff that we need from here it would be in that web document the ergonomics of sharing the a document from the company google drive are not ideal is anyone still waiting for like his weight I don't know which links you can use yeah absolutely and if you're running if you're doing this on a mac just want to take a sec to say start the docker desktop application so that you have docker running and if you're on linux you probably you already know what's up okay so change directory into the 8 knot directory I think it comes by default on the dev branch could you confirm that for me so it starts on the dev branch which is where we want to be and I have already prepared the environment that we're going to start out with so I'm going to copy it here but this is what we're expecting to have for this demo so all you have to do is create a file called m dot list in the top level of the 8 knot directory with these credentials filled in and all of the credentials should be in that shared links document up at the top don't share our passwords with anyone they're very secret and proprietary right anybody still need a sec okay cool funnily enough we've done the hard part the only thing that remains I'm actually going to clear the screen so that's the top this is my recommendation for like a development scale of 8 knot locally I'm going to make that bigger sorry that went to the bottom and then when you run this it'll take a little while to get it all built for me 
it's instant because I've already built this and you are fine if you just do docker compose up dash dash build scaling just makes it a little bit faster whose docker is still building cool awesome anybody already done and you see all the pretty colored text cool everything going okay over here inevitable yeah yeah yeah yeah yeah yeah yeah yeah yeah yeah yeah yeah yeah yeah yeah oh yeah oh sure it's still going to take the lab off is docker desktop on on your computer oh they might but I just use brew to install it I don't know if you've got and is that currently running like you're currently installing brew how are things is it running for you yes no no what's going on you're lost okay I don't know that's the point so yeah sometimes you got both so oh we'll see I think I don't want to be able to pull up the it was also the desktop as well oh yeah that wasn't sure that was just a different number so this is docker desktop so you should have it so yeah I just installed everything last week but I did yeah let's see so yeah docker desktop just makes everything really convenient okay so essentially docker docker on mac requires that we run a virtual machine because docker processes like containers don't work on mac so we run a linux but it's just one of the primitives of how docker works like containers require linux then so we run a linux virtual machine and then docker desktop is the convenience yeah I just want yeah give me a second right now what I'm just doing is your docker is not or your brew is not picking up the docker readback and it's fairly up I mean you got the M3 right yeah so that's a load down I don't know if this is so I just should work resources we're going to give you 18 gigabytes memory swap resources sorry I shouldn't swear so you'll just have to remember that when docker is running it's consuming this much of your because it's running a virtual machine that big yeah so now we have that I think maybe brew doesn't have docker desktop well that's 
what I would expect yeah but it didn't do it for it I don't know what the deal is there might be a difference I thought maybe the repos were out of date so it didn't I don't know I this is beyond my awareness I'm kind of confused today because I'm just putting out photos of the scientists so I'm going on my caught up you should be good like this was the hardest part okay yeah we tried I don't know thank you for the yeah yep and this is why we hired a software engineer because before it was me and I was like this is not going to work out well for anybody yeah good James gone have you ever had a not running on your computer before exciting yeah this is going to be great hey and that's why very much been like you're not the only one of being like for this and for the one for SSNA there's a lot of people where I'm like if you just want to get up and running and don't want to figure any of this out yourself this is it this will be the easiest three hours of your time to be completely set up and ready to go oh my god I think you'll I'll be very curious to see what your thoughts are about the first response visualization because we're literally going to be diving super deep into that one because like the you know the puzzle piece slide that I do it for ours that's pretty much we're doing that live we're taking the pieces of two and switching out PRs and issues to make a new visualization because then you're able to kind of see the power of having a structural database and being able to see okay I like I want to see this visualization on a similar but different amount of data how quick you can just switch things out and then you have the whole a whole new visualization whole new view of things should be fine if any of the people who are if you're not able to fully build if you could open an issue on eight not documenting what your setup is and it's screenshots that'd be really really great and helpful for them we can actually spend some time figuring out why that's 
happening, even if we're not able to fix it live. I know that's not super helpful for you right now, but it is very helpful for us. Both of us have the same exact computer setup, and while he's really good at trying things out in a lot of different virtual machines, day to day most people working on it are on Mac with the same chip and everything, so the problems you'll run into on different computers are new to us, and documenting them is seriously so helpful. When it works, you're just done; whenever it doesn't, you're like, I don't know if I can ever make it out of the black box. Okay, for those of you who have gotten past the build phase: you can go to any localhost address, so 0.0.0.0 or 127.0.0.1 or just localhost, on port 8080, and that's where it lives. It'll start by saying, essentially, we're collecting stuff for a while. In an ideal world it is that easy: you move your environment variable file into the root, compose it up, and it just pops up for you next to Augur. If you have a problem, it'll complain a bunch in the logs, and the visualizations will all say nothing works, but it's less of a black box and more of a complain-in-one-place, fail-violently design, so that it doesn't do anything dumb in the front end; it'll complain like a developer would expect it to. Anyway, right now we're not going to pause. Did you guys make any progress over there? Was it the VM size? Okay, phenomenal. Okay, cool. And what about
for you? Heartbreaking. You're still going to do it? Oh god, what did I do? There's a mysterious black control strip up here that's very sensitive. Okay, so we'll just keep moving forward and do a little tour of the visualizations that are available out of the box. There are a lot of features in 8Knot that aren't immediately obvious, because they're either baked into Plotly or not enabled by default, so we're going to do a little tour of what those are so you know where to get started. On any individual visualization we can use all the nice features that Plotly gives us: we can focus in on an area, and that selects a smaller window of the available data; double clicking exits the focus, so we go from zoomed in back to zoomed out; we can deselect categories of data by clicking on the legend; and there's a little toolbar up at the top, which will be clearer at full resolution, where we can do a bunch of stuff: scale, take a PNG, et cetera. And then we have a little card at the bottom of every graph that describes what the graph is supposed to be showing, and our admittedly abstract interpretation of how you can use it. Next is a breakdown of what each of the individual pages is for. Generally it's hard to categorize these visualizations into one specific angle or facet, so these are very gentle divisions, mostly so that no individual page is overwhelming. We have a separate section of... what?
Yeah, and we'll see this in a minute: we have Augur accounts. There's a notion of using Augur as a user-preferences backend, which is very convenient. Like I said before, you can group different repositories together, so if you're always checking out the five CNCF projects, or always checking out your own group of projects, you don't have to put them all in the search bar together every time and waste cycles on that. In general we make that an opt-in thing. The reason we didn't preconfigure it, though we will configure it in a minute, is that it requires a lot of configuration to support the OAuth 2.0 flow, which we didn't want to dump on you out of the gate: we have to specify a bunch of URLs and go through the process of linking the app to Augur in a reasonable way. So we're starting with a clean slate. Good question, though. And this is how you add a user group: if you want a new group for yourself, instead of going to the Augur side to collect on repositories, you can add them to a user group and we'll start collecting on those repositories. So if at the end of this workshop you're like, I don't want to host any of this myself, I just want to use the platform, and you want repositories you specifically care about in that database, you can add them using user groups. By default, in the configuration we gave you, the target repo set is the CHAOSS project organization, which lives up here. If we want to select another thing, like chaoss/augur for instance, we can select it like this; it's the GitHub URL that points directly to it. Alternatively, you can slice by an org, like we have for CHAOSS. Kubernetes isn't available in this database, but we can look at other orgs; I don't know what all the orgs are in here, but for
instance, on our publicly available instance we have kubernetes and kubernetes-sigs and containers, et cetera, and if you want, containers/podman. Anyway, it's there. So we're starting with the CHAOSS organization, and we can break it down: this is the overview of the programming languages used in the CHAOSS organization. It looks at all of the files and breaks down their types; Augur does this for us, including what the package versions are and how old they are. Below that we have per-repo analysis. Right now we're looking at GrimoireLab Sorting Hat, but we could look at Augur, for instance, which is what Callie and I are most familiar with. I think when we built this database it didn't include Augur specifically, but here's the OpenSSF Scorecard and some general repo information: what license it uses, whether it has a code of conduct, whether it has a contributors file (that's how we check for this), whether it has a security policy, et cetera. Then we have a page that slices by contributor; we just call it the contributions page. Pull request activity includes an axis and concept of staleness, how old a pull request is, and you can parameterize that yourself. You might say, there's no way I'm calling a PR stale after 7 days, it's actually going to be 37 days, and it'll re-render for you. If it errors because days-until-staling must be less than days-until-stale, we can make that 60 days, and then our number of stale pull requests decreases, because we increased the days-until-staling range. We can also track the number of pull requests over time as a trend, merged versus closed; I won't go into tremendous detail on each of these. Issue activity shows how much issue backlog is building up in the project, and the trends of how many issues were closed versus opened; we can see, for example, the trend decrease dramatically.
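The staleness parameterization described above can be sketched roughly like this. The function and field names are my own illustration, not 8Knot's actual implementation:

```python
from datetime import datetime, timedelta

def count_stale_prs(opened_dates, now, days_until_stale=37):
    """Count still-open PRs older than the staleness threshold.

    opened_dates: datetimes at which each still-open PR was opened.
    Raising days_until_stale shrinks the stale count, as in the demo.
    """
    cutoff = now - timedelta(days=days_until_stale)
    return sum(1 for d in opened_dates if d < cutoff)

now = datetime(2024, 3, 1)
# four open PRs, opened 3, 10, 40, and 90 days ago
opened = [now - timedelta(days=d) for d in (3, 10, 40, 90)]
print(count_stale_prs(opened, now, days_until_stale=37))  # 2
print(count_stale_prs(opened, now, days_until_stale=60))  # 1
```

Re-rendering with a larger threshold, as in the demo, simply reruns this count with a later cutoff.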
And we can see a bunch of issues got closed, issue assignment status, et cetera. Then some of the more interesting deep dives on pull request conversation engagement are available: as PRs get opened, how many are getting a response from someone who is not the PR opener within 2 days, for instance, or say 20 days. This takes a little bit to render, and we see pretty much the same picture; we're not seeing a huge gap. Then we slice by contributor statistics. Some of these are a little too detailed to go into verbally in this presentation, but we encourage you to check the about section for the details. We can see new contributors by month; new contributor types over time, whether they show up repeatedly or show up a couple of times, leave, and we haven't seen them since; and contributors by action type, so how many contributors were opening a PR in this span, April 2023, and how many were reviewing a PR (the number goes down because there are fewer reviews than open PRs), et cetera. One of the most exciting pages that Callie and I developed, mostly Callie, is this slice of organizational affiliation for a given project. We can take a second look at exactly what we know the most about, which is oss-aspen/8Knot. That's our project, so we'll collect the data, cache it in 8Knot, and look at the statistics. Callie and I know exactly who contributes to this project, so we can inform what these things mean: unique contributor email domains, so 22% of the domains are redhat.com, 17% are Gmail users, then noreply.github.com, then other; you kind of expect to see these. Yeah, this gets at the idea that a lot of the time, when it comes to community or anything you're analyzing, you're not going to be able to directly answer the question that everyone would love to know.
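The first-response metric just described, what fraction of PRs get a response from someone other than the author within a window, can be sketched like this on synthetic data; the field names are illustrative, not 8Knot's schema:

```python
def first_response_rate(prs, window_days=2):
    """Fraction of PRs that got a comment from someone other than
    the author within `window_days` of opening.

    prs: list of dicts with 'author', 'opened_day', and 'responses'
    as (responder, day) tuples; integer day numbers stand in for
    real timestamps.
    """
    hit = 0
    for pr in prs:
        for who, day in pr["responses"]:
            if who != pr["author"] and day - pr["opened_day"] <= window_days:
                hit += 1
                break  # one qualifying response is enough
    return hit / len(prs)

prs = [
    {"author": "a", "opened_day": 0, "responses": [("b", 1)]},   # outside reply, day 1
    {"author": "a", "opened_day": 0, "responses": [("a", 1)]},   # self-reply only
    {"author": "c", "opened_day": 5, "responses": [("d", 30)]},  # reply after 25 days
]
print(first_response_rate(prs, window_days=2))   # 1 of 3 PRs
print(first_response_rate(prs, window_days=30))  # 2 of 3 PRs
```

Widening the window, as in the 2-day versus 20-day comparison in the demo, only changes the threshold, not the shape of the computation.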
I mean, the ideal would be: this is the exact number of contributors working at this company, and these people are working as individuals. That's just not realistic. But you can look at different slices of it: how many unique domains are being used, what the breakdown is by the different types of contributions. You can look at it from a bunch of angles and get a really good holistic view, understanding what the landscape looks like without being able to answer that direct question. I think this is a pretty good example of the type of care and analysis that actually happens and what you're able to get to. For me, one of the most useful examples of the problem with the organizational affiliation question, and of Augur's value proposition and the data analysis we do, is actually this visualization, where we try to associate activity in the project with individual emails, for accounts associated with multiple emails. It's a lot to digest, so I'll break it down. In this visualization we see a column that says, for redhat.com, there are 3021 contributions. What that means is that in this project there are about 3000 contributions attributable to user accounts associated with at least one redhat.com email, so that's ostensibly Callie and I; that includes issues, pull requests, commits, et cetera. The same goes for users.noreply.github.com: there are that many events associated with an account that has that email. So all of these columns are double counting in a sense, but it gives you a really clear picture: this email domain, generally, is associated with this volume, and someone could have many different affiliations. I know that I have a bu.edu email, and roughly 50% of my time on this project was under that BU email, so that kind of makes sense, and you can even see that my MacBook Pro got put into the commit history. It's really,
really useful to see slices like this to help paint a broader picture of affiliation in a project. And then this page digests some of the cool perspectives available on the CHAOSS website: all of the visualizations on this page, and we've only implemented two, come directly from definitions from the CHAOSS project. Lottery factor looks at the top contributors by commits: which individuals have done at least k contributions, how many they've done, and how few of them account for roughly 50% of the total contribution base. When we look at project velocity, the axes are number of commits on a logarithmic scale versus PR and issue actions, and up and to the right ostensibly means more velocity: Grimoire ELK has 3300 commits and 900 PRs. Let's see, where does Augur live? Oh, Augur is not in this slice, that's okay. We see the layout generally: the projects in the common set we're looking at, how much contribution volume they see, et cetera. We're taking PR reviews, merges, issues, a bunch of different contribution types into account in one visualization, and we're able to weight those contributions in different ways. That's the level of analysis that made us really want to use a Python environment for these kinds of things: with a lot of the more built-in dashboard tooling that was available, we weren't able to do that aggregated analysis and then populate it into different visualizations. I'll talk a little bit about this page as well; it's a good example of that next-phase analysis we like to talk about. This is a heat map series; you can read the about text for the graphs, but it looks at activity in a repository not just at a high level but at a directory and file level, and you can choose which view you want to have on it.
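The lottery (bus) factor mentioned above, the smallest number of contributors who together account for half the commits, can be sketched in a few lines; the input shape is my own simplification of what a commit log provides:

```python
from collections import Counter

def lottery_factor(commit_authors):
    """Smallest number of contributors whose commits together cover
    at least 50% of all commits (the CHAOSS lottery-factor idea).

    commit_authors: list of author names, one entry per commit.
    """
    counts = sorted(Counter(commit_authors).values(), reverse=True)
    total, running = sum(counts), 0
    for i, c in enumerate(counts, start=1):
        running += c
        if running * 2 >= total:  # reached half of all commits
            return i
    return len(counts)

# one contributor owns 6 of 10 commits -> lottery factor of 1
print(lottery_factor(["ana"] * 6 + ["bo"] * 3 + ["cy"]))  # 1
# evenly spread commits -> it takes 2 of 4 people to reach half
print(lottery_factor(["a", "b", "c", "d"]))  # 2
```

A low number here is the risk signal the talk describes: a tiny group carries most of the project.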
You can see how long it has been since somebody contributed to this section of the code, how long since they've been active anywhere, whether you have sections of your code base that are in contention to be maintained, whether you have people contributing a lot who maybe should now be reviewers, and whether your reviewer base is going down, all from the combination of these three different views on your contributor activity and your code base. Each one shows a different view, and together they give you the information you'd need to take preventative action that could help your community. Then it's not, I don't know why PRs against a certain section of the code base sit open, and that's how you discover you no longer have a maintainer familiar with that area; and you can spot people who should probably be given more responsibility in your community, given more opportunities. This is something we finished more recently, and I think it shows where we'd like to go with the visualizations and analysis we're doing. Yeah, so this is a series of three visualizations, and I'd actually start with this one, to ramp up complexity from relatively simple upward. This is just a heat map of activity in the code base by number of contributions: it looks at PRs opened per subsection of the repo at a given level. Right now we're at the very top of GrimoireLab Sorting Hat, which is a project from GrimoireLab, and we see that in the sortinghat directory one PR was opened in September 2023, for instance. So we break it down: the pure heat map; then the reviewer heat map, which looks at the same idea but counts how many PRs were reviewed targeting a specific subcomponent of the project; and finally the contributor file heat map, which is exactly what Callie was describing. The interpretation is easier for me to do verbally, so
here we see that the last time three contributors to the grimoire_elk folder in this project were seen was June 2020, for instance. That helps you look back and say: the last time we saw these contributors anywhere in the project, or in the schema directory, was June 2020; these people are no longer showing up. And you can identify that relatively few people for the utils subdirectory have been around in a while; there's only one person who has, and that could be a problem for us. For any definitions you need while inspecting these graphs, we have a page of all the definitions we use. That's a really big overview of the application and all of the faculties that come with it, without extending it to include Augur accounts. What I'm going to do now, and I'm not expecting this to be a follow-along because it requires registering the app with an Augur instance, is skip that part for you and just show it as an example. I'm going to rebuild. Oh yeah, I'm moving on in about five minutes; you can force-stop this stuff. My recommended way of bringing everything down is docker compose down --volumes, which cleans up all of the Postgres cache volumes along with all of the running containers. So I'm going to copy in the environment file that includes all of the stuff needed for OAuth. We're connecting to the same database; what I'm doing is using the credentials I have for a previously registered instance to communicate with the Augur registration endpoint and re-register this application as a net-new thing; you'll see what that means in a second. Same way to boot this back up, just with different credentials, and now we have this new menu up here: Augur log in and sign up, refresh groups, manage groups, log out. All of those are disabled for the time being, and I'm just going to log out so we can start from
scratch. We go to this page and we're greeted by the Augur frontend, which lets you manage 8Knot instances registered with a given Augur database as extended endpoints, manage your user groups, et cetera. I have an account previously created here, and normally, if we were just logging in, we'd do the OAuth authorization flow, but I'm going to short-circuit that and go through the process of registering a new application. I have this connected app that I don't want to use, so we'll create a new application; I'll call it 8Knot SCaLE demo. The redirect URL is where the OAuth flow will try to go back to with the code that's needed to authenticate your user. Because I'm running this locally, all I need to give it is the root of the local application server, http://localhost:8080, and, following the convention OAuth likes, the /authorize endpoint is what we make available. Create this, and I've got a new application, 8Knot SCaLE demo, with this application ID. I have to bring down the web server to restart it with the correct application ID, so I change it here, and here, and here; all we're doing is changing wherever it says client ID or app ID. Then we need the client secret, which gets replaced here and here. Now we can bring up this new application that's registered with Augur; keep an eye on this connected-users count. We authorize our user account to use this instance of Augur that we've registered, and when we come back, my username is up here, and connected users says we have one connected user on this particular Augur instance. So what's the point of these user accounts? Well, like I've said a couple of times: what if I don't want to have to type in oss-aspen/8Knot and the other OSS Aspen projects every single time?
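Stepping back to the registration step for a moment: the /authorize redirect just described is the standard OAuth 2.0 authorization-code handshake. A hypothetical sketch of the URL the flow builds, with generic OAuth parameter names rather than Augur's exact contract:

```python
from urllib.parse import urlencode

def authorize_url(augur_host, client_id, redirect_uri):
    """Build a generic OAuth 2.0 authorization URL (illustrative;
    parameter names follow RFC 6749, not necessarily Augur's API)."""
    params = urlencode({
        "client_id": client_id,        # the app ID from registration
        "response_type": "code",       # ask for an authorization code
        "redirect_uri": redirect_uri,  # where the code is sent back
    })
    return f"{augur_host}/authorize?{params}"

url = authorize_url("https://augur.example.org", "abc123",
                    "http://localhost:8080/authorize")
print(url)
```

The redirect URI here mirrors the demo's local setup: the app server root plus the /authorize route, which is why changing the client ID and secret in the environment is all the re-registration requires.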
So I have the ability to go into Augur and create this SCaLE group 2; you can see that I rehearsed this. We have this group with no repos in it, and if we add oss-aspen/8Knot and oss-aspen/repel to SCaLE group 2, we now have two repos in it. We can see that repel and 8Knot have issues and commits associated with them, so we know the data is available. If the data is not available, Augur will show those two counts as zeros, which just means data is still collecting; we're working on extending that so it literally says "data collecting", but that's what it means if you see it. Then we refresh the group, and under my username I have SCaLE group 1 and SCaLE group 2. If I select one and search for it, we get these two repos as the aggregate set, which is really convenient, because now I don't have to look that up again. We've had user stories where people want to slice, like I said earlier, specifically on CNCF incubating status, for instance, or the Apache Software Foundation: they want to see what the activity has been there, whether people are still interested. That's a slice people have asked for specifically. So that pretty much wraps up this part of the demo. My recommendation: if you're interested in signing into Augur and going through this user-groups workflow, work through the docs, and if there's something that's not ergonomic and you don't like it, let us know; if it's great, also let us know. Some things will probably be extended in the near future for API simplification: this part will change, because we essentially reference the same endpoints twice for two overlapping implementations of OAuth in two contexts. But otherwise a lot of things should be the same if you go play with it. Anyway, thank you so much for following along with me up to this point.
I really hope that you learned something about the data visualization work we do, and that you'll go use 8Knot and contribute to it at your leisure. And then I'll turn it over to Callie for the second part of this talk. Do you want the live version, or, at a conceptual level, how you would start to build these out? We're going to go from start to finish building your own visualization: taking in the CHAOSS documentation, looking at some of the visualizations that already exist in 8Knot, and putting all the available resources together to make a new visualization. But first I'm going to talk about it from a conceptual angle. In this portion we'll talk about the value of community metrics, the methodology, and some analysis lessons learned. I think this will also be good for people who aren't as familiar with community metrics at all: some of the different angles you might want to consider if you're starting to do community metrics with a community that's never done it before. First, we look at what strong community metrics can start to enable, beginning with building on community knowledge. Data analysis, and getting this type of information, is not going to be the one thing that informs your community decisions, and it really shouldn't be; it should be something that helps enable them. From experience: things that might take 10-20 hours to look into manually, or things you'd never be able to confirm without data behind them, are what become tractable when you start to have this community knowledge. You take what you already know, you're able to confirm intuitions you have, and you're able to make better and, honestly, quicker decisions; you can skip the step of just discussing it.
If something is happening, you're able to say: I can see in X, Y, and Z ways that this is happening; now what are we going to do about it? The next portion is staying informed in a sustainable way. Anybody who works in communities, or anywhere else, has a thousand and one things to keep up with; if it's not a maintainable process, if it's not something you have time for, you're not going to keep up with it. So a lot of the visualizations I think about making come down to: what is a large 20-hour process, is there a way I can make it a 5-minute process, and how would I go about doing that? The last thing is filtering through the sheer amount of data. There's a lot of pressure to use data, to make data-driven decisions; you have this mountain of information, now what are you going to do with it? It can be stressful and hard to even start. If you look at the Augur schema in one huge view, you're like, these are just the tables; how do I even get started, how do I use this in an intelligible way? Those are some of the things we start thinking about. The next thing is the kind of perspective you want to gain or share through these metrics. One of the first things to think about is whether your goal is to gain information or to influence action: is there an area of your community that's not understood, and you're just trying to take that first step toward understanding it, or is there an initiative you're trying to decide on, or one that's already happening, whose impact you're trying to see? The second is whether you want to expose areas of improvement or highlight your strengths. There are times when you really want to hype up your community and show
how great it is. As an OSPO within a larger company, I might want to highlight all of the wonderful things about some community and show them in an accessible way. But those hype-it-up metrics, the things that make it look great, might not be the same metrics you want when informing your community decisions. If you're trying to surface shortcomings, or inform yourself on things you could do better, looking only at the things you're great at isn't going to help. So there's a time and a place for highlighting strengths, and a time and a place for figuring out where your shortcomings are. The last thing, which we touched on earlier, is community impact versus business impact. Businesses are used to seeing numbers and data. In a lot of our conversations with the more business-minded people within Red Hat, just having the data behind whatever we're discussing is something they're familiar with; it makes community seem less like this big, scary, super-unstructured thing they don't understand. That's what makes it more accessible and puts value behind the things we're saying. And then you can look at how your community impacts open source overall, how it sits in the ecosystem. These aren't always either-or situations, but they're the kind of framing you want to do when developing a very deliberate metric; otherwise there's just a lot going on, and honing in on your individual goals for each visualization, or set of visualizations, makes this a more manageable process. So in this next portion we're going to be looking at
codifying the projects and the metrics. There's a lot of different tooling out there, and we've talked about some of the options; for this workshop we're using Augur and 8Knot, but there's always room for debate on that, and if you use a completely different tool chain, everything I'm saying still applies. This is where you start thinking about: what do you actually want to know, what data is accessible, and what a thoughtful execution of the data analysis looks like. We'll go through some different analysis angles; these are just examples among many, but they generalize to different situations. First, scenario 1A: building off of current data analysis. You've already started down this path within your community of looking at different visualizations, so let's figure out the iterative process to make them better. The idea is to build off some of the more common or traditional open source community analysis. Commits over time: that's cool to know, but what does it actually show you or tell you, or give you any way to act on? At that point it's just a base number. Say you have contributors over time, and you know there have been 120 total contributors over the lifespan of the entire project. That's a number you can put on a slide, but how do you take it a step further so you can make decisions from it? That's numbers over time; the next step further, and we've actually seen some of these visualizations, is looking at active versus drifting contributors. Of all the contributors who have ever been involved in the project, how many have been involved in the last six months, the last 12 months? Is that value staying
consistent? Is your active contributor base going down? That starts to tell you something more in-depth about your community and lets you make decisions: if your active contributor base got cut in half in the last year, something's happening and you want to investigate. Another example builds on commits over time: you can look at commits by a subset of the contributors. That's the lottery factor, or bus factor: is there a very small portion of your contributor base responsible for, say, 50 percent of those commits? That's a lot to rest on a single person, and it's a lot of risk: what happens if that person leaves? If you don't know who that person is, finding out is the first step, because you want to make sure they aren't going anywhere, or start figuring out how to spread their knowledge to other people and see who else is involved. The 1B of this, which I've just started talking about, is building off of prior work. There's conceptual work under CHAOSS, and a lot of other people have made visualizations and metrics, so from a conceptual side or a technical side you can take things from a structural-similarity standpoint. I have a little visualization with puzzle pieces, where everything fits together but you're only changing one piece out. A good example is first response: you can take what you do conceptually for PRs and apply it to issues, and when you're using a structured database like we are, that works even from a technical standpoint. We're going to walk through this; it's the visualization we're going to build: how to take the code and the queries and make that one change so you have a brand new visualization, pulling in those resources to make a more complex, more informed visualization.
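The active-versus-drifting split described above reduces to a simple windowed partition. A minimal sketch, where the input shape (contributor mapped to days since last contribution) is my own illustration:

```python
def active_vs_drifting(last_seen_days_ago, window_days=180):
    """Split contributors into 'active' (contributed within the
    window, e.g. the last six months) and 'drifting' (gone quiet).

    last_seen_days_ago: contributor -> days since last contribution.
    """
    active = {c for c, d in last_seen_days_ago.items() if d <= window_days}
    drifting = set(last_seen_days_ago) - active
    return active, drifting

last_seen = {"ana": 12, "bo": 200, "cy": 400, "di": 90}
active, drifting = active_vs_drifting(last_seen)
print(sorted(active))    # ['ana', 'di']
print(sorted(drifting))  # ['bo', 'cy']
```

Running the same split at 6 and 12 months, and tracking the ratio over time, gives exactly the "is the active base shrinking" trend the talk describes.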
But also, the point is not to repeat work, because nobody likes that. Another way of thinking about this is data similarity: a lot of times your visualizations share a bunch of pieces. For context, PR review assignments was one of the visualizations: it looks at assignments per contributor, so you can see if a vast majority of the assignments are going to one contributor. The same data, and pretty much the same code, can be applied to the status counts: seeing, over the span of a month, how many PRs are assigned or not assigned at all, so you're not even looking at it from the contributor standpoint. You can see, by how long this query is taking to load, why we use a materialized view: it's a complex layering, because you have a row for every single message, or comment, however you want to think about it, on a PR, and a row also exists even if there are no messages associated with a PR at all. This is just to see what that query looks like: you have your pull request ID, the ID of the repository, which contributor opened the pull request, then the message associated with that PR and the person who sent that message or made the comment, and then the information about when the PR was created and closed. It's actually a little simpler to do this for issues than for PRs, because there are no reviews on issues. Now I'm going to do a little bit of black magic, because I have a feeling this would take a while; as somebody told us in a company meeting recently, this is the baking-show version of a tutorial. When I developed this visualization the first time, and pretty much any time I make a new query, I look at the structure of things that I already
know and then I start translating in this case there is pretty much a direct to oh this is not the one that I wanted pretend like you're not seeing the snow book that I'm about to open it definitely doesn't exist and I'm doing this all live with you for the first time that's the one thing the mirroring doesn't allow you to have any movie magic so well this is the query that I was talking about and if you look at it it looks very structurally similar to the one that you just saw for the PRs and this is the issues version of this what I ended up doing to be able to develop this was I went line by line for the PRs version and then I just did the translation to the issues table so you can kind of see up here like issue pull request ID actually I think I might be able to open this and I was hoping it was going to let me do an additional opening up for us to see them next to each other but we'll be looking at you kind of like can be able to see this is the pull request repo ID the contributor ID so on and so forth and that's pretty much the exact thing that I did I just looked at the PR table and I looked at the issue table and I did the direct translation over and did it to where we were able to get the messages which is the comments that are associated with each of the issues and so there's just a it's pretty much all translation from PRs to that issues building so the next step in doing this once you have your query that you want to work with take this and then we will put it into our plot lead tutorial notebook in the exact same way there's a link directly in that document to the completed version of this visualization notebook of any of the things that I copy and paste over you want to be able to do locally or I can end up putting it in the document as well just communicate with me and let me know so this is what we have here now we have our import for the repo statement we have our query and let's run the query and see what happens so if you've ever used Jupyter 
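As a rough sketch of that PR-to-issue translation, the result might look like the query below. The table and column names here are assumptions modeled on an Augur-style schema, not the exact workshop query:

```python
# Hypothetical issues version of the first-response query, produced by the
# line-by-line PR -> issue translation described above. Table and column
# names are assumptions modeled on an Augur-style schema.
ISSUE_RESPONSE_QUERY = """
SELECT
    i.issue_id,
    i.repo_id,
    i.cntrb_id,          -- contributor who opened the issue
    m.msg_timestamp,     -- comment time (NULL if the issue has no comments)
    m.cntrb_id AS msg_cntrb_id,
    i.created_at,
    i.closed_at
FROM issues i
LEFT JOIN issue_message_ref r ON r.issue_id = i.issue_id
LEFT JOIN message m ON m.msg_id = r.msg_id
WHERE i.repo_id IN (%s)
"""

# In a notebook you substitute concrete repo IDs for the %s placeholder;
# in 8Knot the caching layer fills it in for the requested repositories.
print(ISSUE_RESPONSE_QUERY % "12345")
```

The `%s` placeholder is the repo statement mentioned later in the walkthrough.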
If you've ever used Jupyter notebooks before, you know that when the number pops up next to a cell, that cell has run. And there's the result: the same query output we saw in the database editor. Now that we have our query to work with, let's go back to the resources available to us and see how the PR version processes this information. We're basically going to take the whole "process data" section, put it in here, and look together at the code that processes the query results we just saw.

I've been talking a lot, so I'm going to pause for a second to make sure people are keeping up. You're lost? I'm so sorry; SQL, even with doing the build... Yeah, I've told James that I almost exclusively like working in virtual environments like this, because then whenever I screw up I can just destroy it with no consequences for my actions. You want to know why I do this? In college I completely destroyed the file system on my computer using Jupyter notebooks. That's when I learned that I needed to put myself in a playpen, because I cannot be trusted with anything locally. If you're not able to run things locally, you'll be the person following this the most closely, so stop me whenever you're thinking "I don't know why you just did that, Callie," and let's talk about it. It might also be a good idea for us to get a Jupyter notebook environment running for OSSNA, the other conference where we're doing this; we'll see.

Okay, a little content-breather pause, and let's get back at it. We're in the tutorial notebook, which has all the structure; we made our query to get the data directly from Augur, and now we're looking at the code for PR first response and how it maps to issues. Going line by line: this first part converts everything to a pandas datetime object. If you've ever worked with pandas, or any kind of datetime handling, it's a nightmare, so I always opt to convert to a consistent datetime object up front so I know everything will play nice. Since we're working with issues and not PRs, honestly, whenever I do this I just search for where "PR" appears: we don't want "PR created at," we want just "created at," because that's the data we have now, so I change that over, and the same for "closed at."

The next line of code drops the messages from the issue creator themselves; before, it was the PR creator. If we're measuring first response, we don't want messages where somebody opened an issue and was also the first person to comment on it. That happens all the time, and it's not a first response, it's themselves. Next, we sort everything by the message timestamp, since that's the thing we really care about, and drop the duplicates, because we only want the first response; so instead of pull request ID, we dedupe on issue ID. Then we grab the first and last elements in the data frame, so we know the earliest issue created and the latest issue created or closed, giving us the full date range to iterate across.

Looking at these comments, I'm going to edit all of them right now too; I keep the comments in step with the code so that somebody can later walk through this the same way I'm walking you through it now.
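The preprocessing steps just described can be sketched roughly as below. The column names (`created_at`, `closed_at`, `msg_timestamp`, `cntrb_id`, `msg_cntrb_id`, `issue_id`) are assumptions based on the Augur-style query discussed above, not the project's exact code:

```python
import pandas as pd

def first_outside_response(df):
    """Keep one row per issue: its earliest comment from someone other
    than the issue's creator. Column names are assumptions."""
    # Convert every timestamp to a consistent pandas datetime up front
    # so later date arithmetic behaves predictably.
    for col in ("created_at", "closed_at", "msg_timestamp"):
        df[col] = pd.to_datetime(df[col], utc=True)

    # A self-comment by the issue's creator is not a "first response".
    # (NaN comment authors -- issues with no comments -- survive this filter.)
    df = df[df["msg_cntrb_id"] != df["cntrb_id"]]

    # Sort by comment time and keep only the earliest comment per issue.
    df = df.sort_values("msg_timestamp")
    return df.drop_duplicates(subset="issue_id", keep="first")
```

The `drop_duplicates(..., keep="first")` after the sort is what turns "all comments" into "first response per issue."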
On this next step, we make a new data frame. Actually, I went a couple of steps too far; my bad. With the earliest and latest dates, we use pandas to get a date range, which gives us a date entry for every single day from the start date to the end date. That's the scaffold data frame we'll put all of our information into to build the visualization. So: we got that date range and turned it into a data frame. That sets up the next step, which we'll translate from PRs to issues: we look at every single day and count, for that specific day, how many issues are open and how many of those got a response within our defined timeline. For PR first response, the initial response window is two days: how many PRs are open, and how many got a response within that two-day threshold.

So let's go back over here... which one? This one, thank you. You can ask James: spelling is really not my strong suit. There may or may not be about five commits on this notebook that were purely me trying to spell "response" correctly. I wish I was joking.

What we're looking at here, and I don't know how familiar y'all are with Python, is applying a function to every single row in the data frame. I've started doing this a lot for these visualizations, and it really enables some more complex analysis. So we have a function that we're applying across the entire data frame, but that function isn't in our notebook yet, so let's go get the PR version and dive into what it means.
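The date-range scaffold just described might look like this; the `created_at`/`closed_at` column names and the `Date` output column are assumptions:

```python
import pandas as pd

def date_scaffold(df):
    """One row per calendar day between the earliest and latest dates in
    the data -- the frame we later fill with per-day counts."""
    earliest = df["created_at"].min()
    # Take the max over both columns so never-closed issues (NaT) still work.
    latest = df[["created_at", "closed_at"]].max().max()
    # pd.date_range yields an entry for every single day in the window.
    dates = pd.date_range(start=earliest, end=latest, freq="D")
    return pd.DataFrame({"Date": dates})
```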
Okay, can you give me a ten-minute warning from right here? We may need to pick up the pace, because I want to show y'all how to integrate all of this into 8Knot, but I also want y'all to understand what the code means, since we're mostly just translating over.

So, we're applying this function to every row, and we can look at what the function does in the context of PRs. It takes a date and determines how many PRs are open on that date and how many of them got a response within a number of days, which is perfect, because that's exactly what we want, except for issues. So we can go through this pretty quickly with the same plug-and-play of changing PRs over to issues, which pretty much just means deleting "PR" from everything. What it's doing is looking at a single day: we drop all the rows for issues created after the date we're looking at, we drop all the rows closed before that date, and we also keep all the issues that haven't been closed yet.

[Audience question] Yes; if it were closed, it wouldn't be open on that day anymore, so it would never get past that second conditional. We're only looking at which PRs, or in this case issues, are open on that day, and whether, as of that day, they've been responded to. In the situation where the "first response" is a merge or a close, the item is no longer open, so it wouldn't be counted in this scenario. We're comparing how many are currently open against how many have a response within the time interval; that's the difference between looking at something over a month and looking at it on a day. Once the issue or PR is closed, it no longer applies to this visualization.

So, back into the code: we look at PRs being open or closed, make sure we're only keeping the ones actively open, and then, of the actively open ones, whether they have a response within the time interval we've defined, "number of days." That's what each of these steps is going toward. I'm trying to be a bit more efficient with my time because I want to get into the 8Knot side of things, but overall, this is the function being applied to each row, and that's how you get, for every single day, a data frame with how many are open and how many have a response.

To save a little time, I'm going to pop over to the notebook we've already built out, which is line by line what we just did. Here's that number of days, and here's all the translation over, which really is just me removing "PR" from every single line. And you can see what the data frame looks like after applying that function: for every date, how many are open, and how many have a response within what we've defined as two days. That's the initial data frame we'll use to populate our visualization.

The next thing we do to populate the visualization is take the code from the "create figure" section. For every visualization, the "process data" section is where you process the data and turn it into a data frame, and then you have your sectioned-out "create figure" part where you build the Plotly visualization. Comparing it line by line with what we did for PRs: this is all Plotly, and what I find very interesting about Plotly visualizations is that you're pretty much building layers on top of each other. You can make a go.Figure, or use Plotly Express
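The per-day function being described can be sketched as below. The column names and the exact response rule (a comment within `num_days` of creation) are assumptions capturing the logic of the walkthrough, not the project's literal code:

```python
import pandas as pd

def open_and_responded(date, df, num_days):
    """For one calendar day: how many issues were open, and how many of
    those had an outside comment within `num_days` of creation."""
    # Keep issues created on or before this day...
    day = df[df["created_at"] <= date]
    # ...and still open: closed after this day, or never closed at all.
    day = day[(day["closed_at"] > date) | day["closed_at"].isna()]
    open_count = len(day)
    # Of those, count issues whose first comment landed inside the window.
    # (NaT comment times compare False, so uncommented issues don't count.)
    responded = (
        day["msg_timestamp"] <= day["created_at"] + pd.Timedelta(days=num_days)
    ).sum()
    return open_count, int(responded)

# Applied to every row of the date scaffold, expanding into two columns:
# scaffold[["Open", "Response"]] = scaffold["Date"].apply(
#     lambda d: pd.Series(open_and_responded(d, issues_df, num_days=2)))
```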
to define, say, a pie chart, and then just start building your customizations on top of it, like updating the layout. Their documentation is honestly phenomenal; I can go in thinking "I want my pie chart 20 percent bigger" and it tells me exactly how to build that customization on top. So here, instead of PRs, we're doing issues: "are open," the date, the open count, the lines; the PRs become issues; response, response; PRs, issues. This will make more sense when we get there, but we haven't defined a color sequence in this notebook; we're just making sure everything works, which makes the move over to 8Knot a little easier.

Let me make sure this has all been run, because I don't think it has... no, this is the other notebook; you're right, this is the workshop one, and it's in a different location. Cool; it's just running the same things as before, so we don't have to go back through them. Why is it mad at me? Oh, because I haven't defined that; it's what we just did in the other notebook. I just wanted something pre-built so we can actually get to working in 8Knot. So this builds the data frame, and this builds the visualization. Okay, now it's mad at me because I took this over directly, and in 8Knot it's just "df," while here it's "df_responses." And this is how we do the movie magic. When I tell you this is exactly what I do every single time I make a visualization, I'm not being facetious; I'm being very literal. There's nothing more miserable than working in a full-blown app development environment when you're doing data preprocessing: there are so many little changes you need to make, and if you have to build and rebuild the whole app every single time you make one of those changes, it takes forever.
And would you look at that: we've got a visualization that tells us the number of issues that are open and how many of them got a response within those two days. So we've got all the bones we need to make this work in 8Knot. (I don't know why this one wants to save something; I don't want it saved... never mind.)

Sweet. We'll go over here, straight to the visualization template, and just go line by line through what we need to do. The first thing I notice, and it's in there, is that we need to import the query; but we made a new query, so we also have to add the query itself to 8Knot. So I copy the query template, paste it in here, and rename it to issue_response_query. That first TODO is to replace all instances of the placeholder query name, so I change those over to issue_response everywhere. Step one done. Step two: paste my SQL query into the query string. I've got my SQL ready to go, so in it goes. The template also explains that this placeholder is how our caching system knows which repositories to fetch: instead of the repo statement you'd use in a notebook, you put a %s right here. So that step is done.

Now over to the index callbacks, where we need to import the query: from queries.issue_response_query, import issue_response_query. Imported. Then I add it to the list of queries that get run on the build of the application. Done; it's registered. Next we need to create the table in db-init, so let's go over there... there it is. We add a specific form for this query, and would you look at that, the one right above it is PR response, so we know the structure is very similar. I'll copy and paste it and turn it into issue first response: instead of pull_request_id, issue_id, and I'm changing the type to TEXT. We can explain why later, but the short version is that these integer IDs are very large; I learned that the hard way over the weekend, so I just know this works. This is all to make sure our data is stored correctly: these are all the columns for that query. We'd also update the docstring for the new table in db-init; I'm not going to do it right now because I want to go quicker, and I would never push this without updating the docstrings, because documentation, but for the sake of this workshop we'll hop right along.

So now our query is ready to go; let's build the visualization. I copy the visualization template and do the exact same thing I just did for the queries. I want to add it to the contributions page, so I paste the template... actually I didn't put it in the right spot; it needs to go in the visualizations folder. Here's the visualization template, and I rename it issue_first_response. Now we go line by line through the variables to change. First, the page: "contributions." Second, the vis id, a short name for the visualization, so issue_first_response. Then the gc prefix on the visualization, which stands for graph card; it's just another unique identifier for the visualization card.
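The db-init step just described, creating a table whose columns mirror the query and whose `issue_id` is stored as TEXT, might be sketched like this. Column names are assumptions, and an in-memory SQLite database stands in just to show the DDL is well formed:

```python
import sqlite3

# Hypothetical table mirroring the db-init step described above; column
# names are assumptions. issue_id is TEXT rather than an integer type
# because the upstream IDs can be too large for ordinary INT columns.
CREATE_ISSUE_RESPONSE = """
CREATE TABLE IF NOT EXISTS issue_response (
    issue_id      TEXT,
    repo_id       INTEGER,
    cntrb_id      TEXT,
    msg_timestamp TIMESTAMP,
    msg_cntrb_id  TEXT,
    created_at    TIMESTAMP,
    closed_at     TIMESTAMP
);
"""

# Parse and execute the DDL against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_ISSUE_RESPONSE)
```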
A lot of this structure exists so you can easily reuse code, which we'll see come together; it's the reason we're able to take so much from the other visualizations. Next we update the title: "Issue First Response." I'll write the context-of-the-graph text later, wink wink; that's just the popover.

Now the IDs of the Dash components. If you haven't worked with Dash or Plotly and want to get really into the weeds, talk to me; there are a couple of videos I'd recommend just for understanding the callback structure, which is what makes this so interactive. For our visualization, I know that PR first response has exactly the callback structure we want, because the only input we need is the response number-of-days. So I go to the form of PR first response, which has the form, the about text, and everything we'd need, and bring it over into issue first response. The template itself has a lot of options already in there, like a date interval, which is a common one, and a lot of commented-out pieces for you to use when building the user inputs for your visualization. I take all of that out, and I just need to remember exactly where I copied from, so, from the form... yay, brackets. Okay, I copy and paste it all over. By putting this directly into the form, our visualization gets the exact same inputs as the PR one, the response number-of-days, plus the about-graph text. We're good to go, and the IDs are already all handled and unique, because we use the page plus this vis id: even though we took this from a different page, we can use it here because it's defined uniquely for this page. So we have our response days specifically for issues and barely had to think about it.

One more note: a lot of what I'm talking about that isn't on screen is covered in an entire document listing every single thing you need to do to create a visualization from scratch, so you don't need me monotonously reciting it.

Now we change the name of the visualization graph; same thing: issues_first_response. I got a bit distracted with the new Dash IDs: we know it's the response days, and those go directly into a callback. Again, if you want to make your own visualizations, it's worth spending thirty minutes going through the Dash structure, but you can follow along just from the inputs, knowing that we're wiring the callback for our visualization. These lines are commented out because they're things we don't need; this one was the pre-baked option, but that's not what we have. Because of the structure up above, we just take the ID from there and use it here. And I'm going to keep the bot switch; we have it in 8Knot, and you could make this without it, but I know it ensures that a response from a bot isn't automatically counted. Now look at the function definition: we have our repo list, num_days, and the bot filter. The comment says to uncomment if the bot filter applies, and in this case it does, so we do that; and actually it's called bot_switch, and the bot switch is on.

So that's set up. Back up to our list: the comments with the datetimes and the sort-by I'll come back to; let's hop over to doing the queries. When I go through this checklist, you don't have to do it in order; I just delete each TODO as I actually do what it says. We need to import the queries: we know our query's name, so import issue_response_query. Then down here we change this call, which goes to our caching system and says, please get us this query's results for the specific repositories that have been requested. So now it's set up to get the data; this bit is just preventative error handling; and now let's do the process-data work we already did in our notebook, with num_days as the input for the day window. For process data, you can read through all of this when we're not walking through it; it tells you how to check whether any other visualizations have a similar process. And it... oh no. Please reopen. Please; I will be sad. That was tragic; I was on a really good track. Come on. I'm just going to reopen my notebook environment really quickly. Actually, you know what: remember how I said I put myself in a playpen? I make other people build the playpen; this is an internal OpenShift environment our team runs. This is just the notebook we were looking at before, so I'll copy what we had in process data and, shocker, put it right into process_data. We know we need num_days, and we need that open-versus-response computation we were looking at earlier.
So I paste that over here, and now we have all the data preprocessing. Next I want to create the figure, which takes num_days as well. I take the create-figure code and put it directly into the create_figure section. Actually, I'll have to re-edit this, because this isn't the notebook we just did; we generalized it back down to "df." Big shocker: when I started going deep into the visualization code, people started leaving quickly. So yes, the movie magic is pretty much just copying and pasting from that notebook. Now I know the queries TODO is done, so I can delete it; comments and such will hypothetically come later; and the sort-by, the datetimes, all of that was handled by what we did in the notebook. The last thing is importing the visualization into the page, which is right there in the notes: from visualizations.issue_first_response import gc_issue_first_response, and then we make a new column, which is just placing that visualization on the page we were showing. Issue response, issue response; that's it, all done.

Cool: we've done everything we need to do based off of the template. Obviously we'd want to clean everything up, but let's see if it works on the first run; that would be surprising. We build it the same way as before... we don't have time to... it'll work, I've already run it with... and would you look at that, "Issue First Response" is there. Let's see. Okay, no, it's actually not; I've demoed this exact failure before. The result is in df_responses, so I just need to make sure that's what comes out of our process_data function. Let's try this one more time.

[Audience question about dependencies] Oh yeah, that's just the pipenv thing: the requirement is that you have pipenv installed, and then you cd into the repository and run pipenv install, and it installs all of the requirements set for that repository. Yes, the requirements are in a file, roughly like a requirements.txt. That was actually a point we weren't sure about on the notebook side of things: how much of an environment we wanted to set up for everyone.

And would you look at that, we got a working visualization. Everything's good, and obviously whenever you plug something back in there will be little things you'll want to tweak, but we have a functioning first-response visualization. I know we went through that pretty fast; good things to note for when we do this again and try to simplify some of it. But that is an end-to-end process of making a visualization: from concept, to the documents that are available, to the code of things already existing in 8Knot, to getting it working in a notebook, to getting it, based off of the templates, working interactively in a dashboard environment.

So, that's that. Questions, comments? I've just talked a lot for a long time. Yeah, for sure: please reach out to us, let us know, open issues. Y'all are honestly such a good litmus test for us: you should be able to do this with our docs, and if you're not able to follow the documentation to create a new visualization, that means there's stuff we need to add. Mutually productive. I know that was a lot of information for three hours, and we're going to be doing a similar workshop at another event.
So if there are things you'd flag as "maybe next time don't spend as much time there" or "this is the stuff that needs more time," that's great feedback; always welcome. That's all we got. Thanks, everyone, for staying till the end. Yeah, thanks. We just had Open Source Summit North America; it's one of the events run by the Linux Foundation. [Post-session conversation, partly inaudible.] Nothing has changed since we talked, and I haven't heard anything different from what I was originally told; Chama's gonna try to make it. Thanks, guys. Yeah, thanks; appreciate it, hope that was valuable. We'll be around the Red Hat booth all weekend, so if you want to stop by with other questions, or want to work on some of this more one-on-one, let us know.

It's useful for people with an open source community to analyze, but I think it's also good even if you don't have one: this code base is so much lighter than a lot of other projects, so if you're trying for the first time to figure out how to get something running, and then how to contribute to something where you only need to understand one portion of the code instead of an entire system, or if you want to learn about data science in general, this is a good entry point.
Yeah, I came up with the concept of this project and then the structure of everything. And this is actually strategic, but it's true: I don't want to learn any more about software engineering than I know right now, because I want to be the data-science persona. If I'm the person who only needs to understand Python, Dash, and the graphing, is that knowledge base enough to build visualizations like these? That's the idea: somebody else who's just a data scientist, who maybe doesn't know anything about community or about the software engineering, can come in and contribute, or come in and just learn how to use Dash and how to do data visualization. I do have some friends who are transitioning into data science, and I tell them: come contribute a visualization, and then you have something to show in job interviews, like "I've contributed to a data visualization platform end-to-end." So if you're trying to go into data science, definitely come talk; I'll be at the booth tomorrow. Today was just the workshop day, and then the full conference starts. Okay, perfect, thank you.

[Room changeover; microphone and A/V chatter before the next session: careful with the speaker volume, or it'll feed back over your voice.] Prerequisites: internet access and a Chromium-based web browser; Firefox is most likely fine and should work.
How's the Wi-Fi here; is it pretty solid? I've been using my hotspot, and it's kind of slow right now. It's all right, it works. Yep, Chromium-based; you should be fine with Firefox. About eight more minutes until we start. [Pre-session chatter about slides and room logistics.]

All right, seeing as it's officially time to start, I'll slowly let more people trickle in, but I'll begin by introducing both Brian and myself. My name is Austin Iveson; I'm a solution architect at SUSE. This is Brian Six, my counterpart; he'll be walking around helping you out in case there are any troubleshooting issues, because we do run into those every now and then.

So, the agenda and objectives for today. This is going to cover the basic concepts of containers and, building on that, Kubernetes orchestration, which is container orchestration. We'll start off with some basics, cover some architecture within Kubernetes concepts, and then, once I'm done giving you a nice little slide show, we'll actually get our hands dirty. We have some instances that we'll spin up for you in the cloud, and you can hop in and deploy your own Kubernetes cluster, as well as our Rancher tool, and so forth. One piece of forewarning: this is not intended, by any means, to be something you use in production.
The environments we're spinning up are single nodes, so there's no resiliency, no highly available aspects. Don't copy and paste this exact setup and take it to a production environment; you will have a bad time. But it is a really fun lab environment, so by all means, if you follow along with the scenario and then take those steps and put them into your own little home lab, you can absolutely do that. I do that myself, obviously. So, some prerequisites: an up-to-date browser. Make sure you're using either a Chromium-based browser or Mozilla Firefox. Safari, for instance, has issues, so don't use that. All right, any questions before we start off? Cool, cool, cool. Oh, a question. Oh yeah, how you would set that up. That is a conversation I can get into; there will be more high-level discussion of what the architecture would look like in production, both in the slides and in what I show you. It's actually not many more steps. So yeah, we'll touch on that. Thanks. Okay, so containers are great. They're fantastic. They allow for an immutable system that can be deployed anywhere, and it's consistent. Think of it as a file system with a read-only layer and a writable layer on top. Because of the simplicity of it all, you just have your application's code and its dependencies, and you package that into what we call a container image. So again, you're going to have a container image, and then the container runtime takes that read-only file layer and puts a read-write layer on top of it. And so you can deploy it anywhere and it's going to be consistent. So here's what building a container image looks like; we'll start off with this. At the very top of this Dockerfile, we reference where we're going to get the base image from. The base image here is the latest version of the Ubuntu container image.
We can add labels, so maintainer equals this. And then we can add some RUN commands: apt-get update, et cetera. We're gonna install Apache on there, and we're going to set things like the working directory and so forth. And what this essentially does, down at the bottom, is expose a web server serving on port 80. So that's what this container does from the get-go, and that's what it will do the moment you deploy it. So how do we run containers? A container image is going to be referenced, and these little arrows represent something like a docker run. When you do docker run, it takes that container image and deploys containers. So the docker run command takes that image and runs the container. Again, the images are immutable, so they're a single source of truth, and so forth. And they're portable: a built application and all its dependencies can be built once and deployed anywhere. Launching a new container from the same image guarantees a clean runtime environment when you start it, and it will be in the same state in each of those containers. Perfect. This is roughly the architecture of containerized applications. We're going to compare against some traditional workloads there on the left, your left, my right, and then on the right side we have the containerized applications. What containers have done is essentially make the host operating system a layer, just like the hypervisor. We've got the infrastructure that stays the same, but instead of having guest operating systems, we have just one operating system. We deploy a container runtime; this example is going to be Docker, and there are others like containerd, et cetera. And that directs the building and running of those containers. Then we need a place to store container images, and a common registry is Docker Hub.
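The Dockerfile walked through a moment ago would look something like this; a sketch, assuming Apache as the example service, with the maintainer label and paths being illustrative rather than the slide's verbatim contents:

```dockerfile
# Base image: pulled from the default registry (Docker Hub)
FROM ubuntu:latest

# Optional metadata about who maintains the image
LABEL maintainer="you@example.com"

# Install Apache; chaining commands keeps the image to fewer layers
RUN apt-get update && \
    apt-get install -y apache2 && \
    rm -rf /var/lib/apt/lists/*

# Working directory for subsequent instructions
WORKDIR /var/www/html

# Document that the web server listens on port 80
EXPOSE 80

# Run Apache in the foreground so the container keeps running
CMD ["apachectl", "-D", "FOREGROUND"]
```

You would then build and run it with something like `docker build -t my-apache .` followed by `docker run -d -p 8080:80 my-apache`, and `docker push` can send a tagged image up to a registry, which is exactly where the next part picks up.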
So these container images are stored in there, and this is where you can start running commands to either pull an image down or push an image into the registry. And if you run an image that hasn't been pulled already, it will pull it automatically. There are some other tools around container registries, such as Artifactory and Quay, or sorry, those are repositories themselves. You can deploy tools like Trivy, et cetera, on top of your container registries to provide some scanning. But these are just some examples of what container registries look like. Hey, Austin, real quick, do we have a slide on microservices at all? We don't. So go back to the virtual machine comparison. So there was a decision that was made: a bunch of people sat around developing software one day and said, there has to be a better way to do this. And as a developer myself, I wish this had been something I thought of years ago, because it would have started this whole thing a long time ago. The idea behind the microservice was this: instead of building an application the way we had always done it, think of Microsoft Word, for example. It's one big giant program: the thesaurus, the spell checker, the thing that counts your words, your English syntax, all compiled together. So Word is this big giant thing. Well, a group of people sat down and said, you know what, there's gotta be a better way, because when one of those things blows up, the entire thing shuts down and dies. So how can we do this better? They came up with this idea of microservices. Let's make Word not just one program, but 30. Let's break it up into a whole bunch of really teeny pieces, and we'll call them microservices; makes sense. And of course there's a downside to this: instead of launching one program called Word, you have to launch 37 of them to make Word work.
And one of them happens to be the user interface to Word. But they're all different, and they all talk to each other over an IP connection. They do not have the in-memory capability that we would normally have within an application, where I can talk to things that were compiled into memory together. It's as if all of us in this room were a single program together, but we talk to each other over an IP connection. And this is important, because it's a whole different world from just loading one thing up on top of an operating system as a singular unit. Imagine over here that all of those were app A; forget apps B, C, D, and E. All of those pieces were responsible for making app A work. The beauty in having them all separated is that they each have their own life cycle. They do not all go from version four to version five at the same time. So the UI can go to version four, five, six, seven, eight, nine, ten, but the spell checker doesn't have to; there may not be much new to add to the spell checker. Or the thesaurus can go through a bunch of revs. None of it relies on everybody else. And you build in the notion that anybody else may have died: if I can't talk to my counterparts, don't just blow up in a blue screen and leave me hanging. Give me a message: can't reach the spell checker, don't do that right now. Can't get to the thesaurus, don't do that right now. Oh, the thesaurus is back online; yay, everybody's happy. So we're microservice-based, we're IP-based, we're all talking to each other over IP, and there's some resiliency in it: the notion that I can just restart the spell checker if it stops, and nobody else has to know. So I can ship upgrades faster. As a developer, I can push things out to my customers way faster. I can just develop my own little piece here; I don't have to worry about what everybody else is doing, because all you're doing is coming to me with a request over IP. Now, IP becomes important.
And managing that: think of it, if all of us were part of this program that made this application work, someone has to orchestrate everybody starting, monitor everybody's IP address. Are you alive? Are you working? And if you're not, restart you. So there are some challenges, because as an administrator in the IT department, I now have to worry about all of you instead of just the one singular Word.exe that I used to have before. So containers got really popular, and microservices got really popular, and they spun into this notion of a container: now I'm gonna take the application and a little bit of the operating system that's needed to run that part of the application, and I'm gonna build a little bubble around it and say, here's your container. Austin was just telling you, it can run anywhere a container runtime exists. Wherever there's a container runtime, I can plop this thing down and it'll run. It'll spin up, and it has everything it needs. Each one of these can have its own version of PHP or Python. Every one of us can have our own versions of what we need, because that's how we built it. Maybe I'm a Java programmer and someone else uses Python or PHP. Build it your way, create this container that has everything in it to make it run, then set it free, and off it goes. It can run anywhere I've got a container runtime. And the term container runtime is very important; we'll talk about what that means as we go through this. But this is the idea of why we're doing this at all: it's a better way to program. I can develop and ship things faster, everybody's revving at their own speed, and we're not waiting on everybody else. We don't have to. Okay. Well said. So yeah, exactly. And if you have a couple of those containers, it's really easy to manage them.
But once you start having a whole bunch of these, essentially the microservices Brian was referencing, it becomes quite an issue. Back in 2015, a team at Google wanted to address this. There was a lot of competition out there; Docker Swarm was also a container orchestration tool, but Kubernetes became the de facto standard for container orchestration. It's a powerful open source tool, it's extensible, and it allows us to manage all of those workloads. It uses declarative syntax: you state in code, I want this number of replicas to be running, and Kubernetes takes care of the automation of that and checks, through certain controllers, whether those deployments of containers are working. In Kubernetes, everything is a resource, and one thing that I like to emphasize is that it is all API-driven. So CRUD is essentially built into Kubernetes: create, read, update, delete those resources, et cetera. And so now we can talk about what Kubernetes looks like from an architecture standpoint. The smallest deployable unit in Kubernetes is a pod. This is where those containers live now. We've gone up a layer, right? We have containers, and then on top of that, inside of Kubernetes, we have pods. A pod will have either a single container running, or maybe another container alongside called a sidecar, et cetera. This is where your applications live. Each pod has a unique IP address; that's something to note, and we can show you what that looks like in a moment. Quick question: why might you think it would be interesting to have a pod with multiple containers in it? Interesting answer. To simplify the communication, everybody should be going over IP. But he's got an answer real quick. Yours? Yeah, yeah. Sometimes there are these, gosh, they have a name for them. Yeah, the init containers, the init containers that spin up.
They sort of prep things for you. The other notion is that maybe these two things should always go together. So there are a couple of different reasons why containers may need to be together in a pod. The idea is that we have this little microservice application that we put into the container; the container is sort of the base operating system plus everything that makes the app work, and now we've added the pod layer. The pod layer is very much Kubernetes-speak; that's a Kubernetes term. But it encompasses a single container or multiple containers and provides some structure to them. And as he was saying, it's very declarative. If I'm deploying the software, I can declare, because it's declarative: I want 37 copies of the UI to be available, because I think there are going to be a lot of people using it. That was not possible in the previous world of deploying applications. You could never say, I want 35 instances of the same interface running, but nothing else, just the UI. You can't do that; it's all part of one thing. But with this, like I said: I want 35 of the UI running, and I want 27 of our example from earlier, the thesaurus, and 18 spell checkers, because that seems to scale with what we're building. And you can do all kinds of cool things. So it's declarative. You tell it: here is how I want everything to look. You go do it, and you keep it that way. You maintain that. You monitor the failures and you maintain the numbers that I give you. Kind of an interesting approach, but it's great from an administrator's standpoint, because I just say, here it is, and let Kubernetes take care of making sure the environment matches what I tell it. Yeah, and to segue from that: the nomenclature surrounding Kubernetes deployments is essentially going to be pods. A deployment is an abstraction on top of pods; it takes a group of them and monitors the replicas through the deployment controller.
And I can actually show you what that looks like in code. On the left here we have yet another manifest language, YAML; I'm sure some of you may have heard of it. And we define what this deployment is gonna look like. It has Deployment as the kind. It has the replicas, so it's going to have three of those running. And then down there in the spec and containers, if you can't see it in the back, it says spec, containers, and you define the container's image. So this is going to be an Nginx deployment, versioned at 1.7.9, et cetera. The deployment is monitored by the Kubernetes API, which makes sure all of those pods are up and ready. So this is a representation of what those pods are doing. The API is monitoring them, sending information to the controller and saying, oh no, one of your containers crashed; let's kill it and bring it back up automatically. And then it will do that for you, making sure it meets that resiliency with the three replicas and so on. So, to talk about more pieces within Kubernetes: we now have services. These are an abstraction on top of a logical set of pods, and they give you addresses to access pods. There are types such as ClusterIP, for communication within the Kubernetes cluster; NodePort, which can expose outside traffic through a single node, so the host; and LoadBalancer, which can expose services on a cloud provider, et cetera. So, to kind of demo. Come back just a little, quick. The cool thing about services: in the previous slide, remember how he killed all the pods and then they just regenerated themselves? Well, if I do that, what happens to all of the IP addresses that were assigned? What happened to all the connections, all the things that knew about the old pods? How do they find them? So they came up with this idea called a service.
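A manifest along the lines of the one on the slide, paired with a Service of the kind just introduced, might look like this; a sketch, where the names and the label selector are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3                  # Kubernetes keeps three pods running
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9     # the image and version from the slide
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  type: ClusterIP              # in-cluster address; NodePort and LoadBalancer also exist
  selector:
    app: nginx                 # finds pods by label, not by IP address
  ports:
  - port: 80
    targetPort: 80
```

You would apply it with `kubectl apply -f nginx.yaml`, inspect it with `kubectl get pods`, and remove it with `kubectl delete -f nginx.yaml`; the label selector is the reason the service keeps working when pods are replaced and come back with new IPs.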
Well, we'll just create these things called services, and here's what's really cool: they're not really objects so much as things we make up and maintain. We'll call them services, and we'll keep a list of what's supposed to be out there and who's trying to talk to it, and then we'll be that interface between what you need to know about, how to talk to it, and what's really running. So we'll abstract a lot of this away. In fact, Kubernetes abstracts a whole host of things away from you so you don't have to worry about them. But services are very cool because they keep you from having to know, when there's a failure, well, where did it go? What's its new name? What host did it run on? We haven't even talked about the fact that this thing could scale to who knows how many hosts: could be 5, 10, 100. Where did it go when it respawned itself? The services keep track of all that. Yep, and to give you a visual representation: we have a service here that is referencing a deployment. Again, each of those pods has a unique IP address, and this load balancer makes sure that traffic gets to those endpoints. The load balancer is monitoring that, but what if one of these pods dies again? It's not going to come back up with the exact same IP address, right? That endpoint changes. And as you can see here, it takes care of that for you automatically. So now you have a new pod up with a new IP address, but it's still getting load balanced perfectly, and there's no manual configuration that you have to do. So, to talk about another piece: we now need to allow traffic into a Kubernetes cluster, and for that we have a resource called an Ingress. This is going to expose your cluster to outside traffic via HTTP and HTTPS. It gives an externally reachable URL which we can send our traffic to, and there are rules that we can define within the Ingress.
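Those rules live in the Ingress manifest itself; a minimal sketch, assuming an ingress controller is installed in the cluster, with the hostname and service names made up:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
  - host: example.com           # the externally reachable name
    http:
      paths:
      - path: /foo
        pathType: Prefix
        backend:
          service:
            name: service-a     # /foo traffic goes to service A
            port:
              number: 80
      - path: /bar
        pathType: Prefix
        backend:
          service:
            name: service-b     # /bar traffic goes to service B
            port:
              number: 80
```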
So we can show you on this next page what that looks like. Can you go back one more slide? There. Notice the bottom: what IP addresses are we looking at? 10-dot-somethings; 10.23s and 10.24s are what you see. They're not routable to anybody. In fact, when you build a Kubernetes cluster in your data center, it is a data center in and of itself inside of your data center. Nobody in your infrastructure can route into it; there are no routes into Kubernetes. We're abstracting all the networking away, so there's no way for you to get into it. So how am I gonna get to the UI? If I'm building this fictional word processor called Word inside of this, how do I get to the interface to do something with it if I can't see it? We need this. We need services to define it, and then we need this Ingress controller that we're talking about here, and rules that say: oh, you want to get inside of this cluster and see something? Point to it. Well, let me build an interface for you. And they called it an Ingress. Yep, and that is exactly what this is. As you can see here, we have a user whose traffic is going to two sections of this website, and this Ingress is defining, over here, what paths that traffic goes to. So, slash foo and slash bar. The Ingress says there are two paths we want to direct that traffic to: the slash foo path goes to service A, and service A load balances that and sends it to the pods somewhere down here; and slash bar, same concept, but now it goes to service B. So again, we're abstracting away the network, and this line here is demarcating the Kubernetes boundary: this is all external traffic coming into the Ingress, and so forth. So let's continue on. There are a lot of resources out there. Again, in Kubernetes everything is a resource. You have ConfigMaps, which store non-confidential data in key-value pairs.
We have Secrets, which are base64-encoded: TLS certs, bootstrap tokens, et cetera. Persistent volumes, so storage in the cluster that has been provisioned by an admin; you can do that manually, or you can do it dynamically through a storage class such as Longhorn, which is a tool we have at Rancher for that. And then horizontal pod autoscalers, which let you say, via some metrics, if pods are reaching a certain amount of CPU, automatically scale out more pods, distribute the traffic, et cetera. Oh, and cron jobs as well, in case you just want something that runs once to completion. So, to talk about the architecture here: because this is all done via API calls, we're using kubectl, the command-line tool, to speak to the control plane. As we can see here, the control plane houses things like the API server; a scheduler, which binds pods to nodes; and the controller manager, which takes care of running controllers for deployments, replication controllers, services, et cetera. And then just off to the left, we have etcd. Again, it's storing key-value pairs; etcd is a leader-based distributed system, and so forth. And oops, I wanted to go over here, and I'll do that. And on the right, it doesn't say it, but they're labeled as nodes here; they're worker nodes. This is where you end up deploying your workloads, as opposed to the things that are part of Kubernetes itself, like the control plane and so forth. Did you talk about CRDs at all? A little bit, yeah. Custom resource definitions. I didn't mention them yet, but we will. Yeah. So, the worker node has a kubelet, and that kubelet serves information to the kube-apiserver, notifying it: hey, this is what's going on in the worker nodes.
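One aside on the Secrets mentioned a moment ago: base64 is an encoding, not encryption, and the round trip is easy to see in a terminal (the value here is made up for illustration):

```shell
# Encode a value the way a Secret manifest stores it
echo -n 'supersecret' | base64
# prints: c3VwZXJzZWNyZXQ=

# Anyone who can read the manifest can reverse it just as easily
echo -n 'c3VwZXJzZWNyZXQ=' | base64 --decode
# prints: supersecret
```

Which is why, when real credentials are involved, people layer tools like sealed secrets or an external secrets manager on top.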
The etcd that you're seeing there, by the way, knows about all the resources that are running on the worker nodes and all the resources that exist in Kubernetes; it knows everything. And as Austin mentioned, it's a unique database that was built to do this, what's called a key-value store. It's real simple, very fast lookup, but it stores a ton of information, and it is replicated when you have multiple control plane nodes. In our picture, we only have one. Our friend here has a question. It's a node; consider it a virtual machine or physical bare metal. Those are all individual real servers doing something. They have an operating system loaded on them, and then they run Kubernetes. So it's a really slim format, if you will, a very slim layer. You have the smallest operating system; Kubernetes doesn't need much from your operating system at all. I just want you to run. In fact, we have something called SLE Micro at SUSE that is this very small, hardly takes up any room, you-can-put-it-in-your-back-pocket kind of small. We don't need much from the operating system itself; all it has to run is the container runtime that Kubernetes uses. Deployments have pods, and then pods have containers, which, yes, you were right. And so you tell Kubernetes, you tell one of the control plane nodes: here's my deployment manifest that you saw earlier. It reads through that and says, okay, I know what you want me to do, I know how many copies of things you want, and I'll go grab them all. And when he pointed to something called Nginx earlier and said, I want an image at a certain version: by default, we know where to go. There are these things called registries that everybody piles their images into, and we can just pull from them. Was there a question in the far back? No? Okay. So, yes.
The YAML file. No, correct. So the question was: the YAML file, the statement, defines the deployment; it does not define the nodes themselves. And that is absolutely correct. Those are specific things that are built outside of this deployment. Yes, yeah. If you do things in the cloud, through some tools we have, you can actually just automatically spin up more. Yeah, we'll continue. So, setting up Kubernetes takes a lot. There is a fun little project, Kubernetes the Hard Way, on GitHub. You can go through that and manually set up a Kubernetes cluster, essentially build your own distribution, in a sense. But just like Linux From Scratch, you don't usually compile Linux from scratch unless you're a bit of a masochist, and there are distributions out there. We at SUSE acquired Rancher back in 2020, and with that came a couple of things: tools such as Rancher itself, and also the Rancher Kubernetes Engine 2, RKE2, or Rancher Government. This is a distribution of Kubernetes that is more security-focused, so it meets FIPS compliance and CIS benchmarks, and it uses containerd as the runtime. Back in the day, Docker was actually the runtime, but that was deprecated and removed from the upstream Kubernetes world, and so now there are things like containerd and some other container runtimes out there. But it meets that standard in the cloud native world. And then we also have, kind of the princess in my mind, K3s. K3s was developed to be lightweight. The name is actually a visual joke: K8s is the abbreviation for Kubernetes; slice that in half and it's K3s. So it's extremely small. You can deploy it on edge, IoT, et cetera. It's similar to RKE2, with containerd, but it also comes with SQLite; you can obviously use others, so Postgres, etcd, MySQL, et cetera. And it is optimized for ARM deployments as well. So, you know. Is K3s a replacement?
Sorry, it's a distribution of Kubernetes. Yes, sorry. There are multiple out there: GKE from Google, RKE2, like I just said, et cetera. Just like there are different Linux distributions, there's SUSE, there's Red Hat, there's Ubuntu, Rocky, Alma, there are a whole bunch of people that build a Kubernetes distribution, the infrastructure that supports these containers as they run. Cool. And so, like that earlier picture: how do you manage a couple of containers? Well, you start building environments, production environments, and you start having clusters where there are different teams, different products on different clusters. You need to be able to give access to certain people, and how do you end up managing all that? That's where Rancher Labs decided to create their tool. Managing the Kubernetes ecosystem also ends up being really difficult because it's quite large. This is just, yeah, databases. This is all just databases. If you look at the cloud native landscape, it's massive. You have everything from container runtimes, registries, networking, service meshes, et cetera. So, oops, we went ahead and built out something so that you can bring a cluster up and have things going, like day-two operations, pretty quickly. With Kubernetes down here, these are the distributions that we support. Anything that's in the cloud native landscape can work with Rancher, can work with us. You can deploy on top of any of your Linux distributions, such as RHEL, Ubuntu, Oracle, these ones right here. And then we deploy Rancher on top of a Kubernetes cluster. With that come tools that we manage and build for you: Prometheus and Grafana, Istio for service mesh, a Terraform operator, and then we obviously have our Longhorn storage, which I mentioned earlier.
So that's gonna be block storage that is built into your Kubernetes cluster. So now I can be done with slides and we can actually get into it. And real quick, before we go, go back to the CNCF slide. So, this notion of the CNCF, the Cloud Native Computing Foundation. Think about where all this originated: when Google built this, it was for building out their cloud services. Think about every time you open up your browser and type in a search phrase: it spawns off a container that performs your search, and then, when you're done, it goes away. So containers are expected to exist for moments in time and not any longer. There can be some that are persistent, and we talked about StatefulSets, or at least we listed those earlier; they're kind of an advanced idea. But this landscape, these are all the people that are currently building projects that contribute to this notion of a cloud native world, that run in this Kubernetes landscape, that provide different kinds of things. You saw the one screen he had that was just databases, but I don't know, there are 200 or so here, and they all do a whole bunch of little things. So there is huge momentum and support; a lot of people are behind this effort. Kubernetes is not some little fledgling thing; it's got momentum, and these are all the players in the space, the ones we would say comply with the standards. As always, I can build something, and then everybody takes some of it and does their own thing. So with the CNCF, you look for things that are CNCF-certified: all of these are expected to run in the same place, execute at the same time, live together nicely. It defines the space for everybody to live in and work successfully. So you'll see Cloud Native or CNCF, you'll hear these kinds of things as we go. A lot of people doing a lot of good things here. Cool. And so, yeah. This is the distribution layer.
Yeah, that's the distribution layer. That bottom layer is going to be your host operating system, your OS layer, and then on top of that we actually deploy Rancher as a Helm chart. And this next one will be a better visual representation; yeah, we kind of skipped ahead. Is that one of those distributions? Yep. So this is the operating system layer; this represents those nodes we looked at. This is the Kubernetes layer, and notice some of the entries in the middle: those are things that you don't even have to build, they just exist in the cloud providers, and you can go pay to use their services without building any of the lower stuff. And what's interesting is this stuff up here that Austin was talking about. Remember earlier we talked about a single Word.exe that turned into a whole bunch of microservices, and now there are 30 of them? We've got to manage that, so you build a Kubernetes cluster to manage it. But what happens when you have 50 Kubernetes clusters? I've got to manage that. Right, it just keeps going up and up and up. And so this layer here is where we get into: well, what happens when I've got more than one Kubernetes cluster? What if I've got this one running, and a bunch of those running, and this running over here, and all of a sudden I've got 50? Give me an interface, show me where I can see them all, help me deploy them, help me delete them when I want them to go away. Give me some ability here, because you're killing me with all these Kubernetes clusters you keep launching. That's what we do with this layer. Yep, and now we can get hands-on. I'm gonna give you some time. Can you see the instructions up here? This is going to be Hobby Farm, and this is where we have some instances up and running for you now. So if you go to learn.na, maybe I can zoom in and that will make things better. I can't; that was foolhardy.
So yeah, learn.na.hobbyfarm.io. What we're gonna do is register with an email and a password, and then you enter this access code, oops, you enter this access code over here. You'll probably need to refresh your browser, but I'll give you some time to do that, and maybe walk around and see if anyone needs help. Sorry that this, yeah, let's see if I can. Learn.na.hobbyfarm.io. If you ever wanna do another rodeo, yes; if you don't, then no. I will say, and I don't know if any of you have done a rodeo before, we don't have a way to retrieve passwords. So if you wanna end up doing a rodeo, and we offer rodeos for other products as well, definitely memorize the password or put it into your password manager. You're gonna register at learn.na.hobbyfarm.io. So, okay, okay, yeah. Okay, I'm going to swap my screen real quick. I wanna make sure that everyone got over there: if you're on the Hobby Farm landing page, in the top right you should have your email address listed. Click on that, then go down to manage access codes, and there you will enter this access code, which is scale, lowercase, 2024. Then you should see a scenario pop up, and I will show you what that looks like. Let me just change some things real quick. After, did you enter that access code when you registered? Yes? Okay, perfect. Just making sure, because sometimes that doesn't get cleared. I'm gonna change some settings to mirror. While Austin switches over, let me just throw something out to you. We talked about these things called Kubernetes resources: deployments, StatefulSets, services, Ingress controllers, Ingresses; there's a whole bunch of things. These are called objects; these are resources. The beauty they built into this idea of Kubernetes is: well, what happens if we wanna come up with a new object later? What if we wanna build a new one?
So I was asking earlier about CRDs, these custom resource definitions. You saw the manifest file earlier that was just a Deployment, where we kind of declared what the deployment was gonna be? Well, now I can just throw at this cluster: guess what, I want a brand new object to be created. Here is its definition. I want everybody to know what it is, so that when I start making calls to it and asking for things and passing information and referencing it, everybody knows what this is. So they came up with this idea of a CRD. This thing is ever-expanding. It can have all kinds of things. If you decide in your mind that something you need is missing, go create a CRD, build a whole thing around it, publish it, and then everybody can use it. It will not break anything. It's meant to be extensible. Cool. So everyone should see this. If you don't, then there are some steps we're gonna have to take. So the question is, does anyone not see it? Okay, can you go help him? One thing: you might have to refresh your page. Again, please use either Mozilla Firefox or a Chrome-based web browser. You can manage the access codes here, and today's access code is scale2024. So if we all click start scenario, we're requesting our virtual machine resources. We have these pre-running. Make sure you're on this page, and then we can begin the scenario; I'll give you a second to go from this page onto the next. Cool. Let's make sure we can get everyone caught up here. Does anyone have any other questions? Real quick, just to outline the agenda: what we're gonna be doing here is, again, provisioning a cluster. We're gonna do that with our distribution, RKE2. Then we put Rancher on top of that. And then once we have Rancher running, we provision another cluster using Rancher. So, yeah. Perfect.
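To make Brian's CRD point concrete, here is what a minimal custom resource definition actually looks like on disk. The `demo.example.com` group and `Backup` kind are made up purely for illustration, not anything RKE2 or Rancher ships; in the lab cluster you would hand the file to `kubectl apply -f`:

```shell
# Write out a minimal CustomResourceDefinition manifest. The group, kind,
# and plural names below are invented examples.
cat > crd.yaml <<'EOF'
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.demo.example.com      # must be <plural>.<group>
spec:
  group: demo.example.com
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
EOF
# Against the real cluster you would now run: kubectl apply -f crd.yaml
grep 'kind:' crd.yaml
```

Once applied, the API server treats `Backup` like any built-in object, which is exactly the extensibility being described.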
And again, you guys should be seeing the exact same thing as me. We have an info tab, a cluster01 tab, and then a rancher01 tab. What's going to happen is, as we go along these instructions, we're just gonna click on some commands and it's going to run the installation on the appropriate VM. But there are some situations, and I'll call them out as we get there, where we actually need to manually paste something into the terminal. Please make sure that you pause there and let me walk you through it, because if we run a couple of commands in the wrong spots, it will cause an issue. So, let's go next. Again, this is not a highly available installation; this is a single node. The first thing we're gonna do here is actually use a curl command. We're gonna click on this right here, and if we click on the rancher01 tab over here, there's going to be a sudo command that we can click on, and that's going to run the installation script that we have for RKE2. Again, this is a Kubernetes distribution. From there, we need to do some configuration. So we click on the rancher01 tab, correct? Yep, just to see it, yes. It will automatically run inside of your rancher01 terminal when you're clicking these, but this is the command that's being run. If you click on it, as you can see here, we've got everything situated. So now we need to configure the RKE2 cluster. We're gonna make a directory at /etc/rancher/rke2, we're gonna create a config file and define what its permissions are. And yep, perfect. And then we run sudo systemctl enable and start, and this is actually starting that RKE2 server. Just give that a second. You'll notice down here it just pauses for a brief moment. And is anyone stuck?
Yeah, so the first command installed Kubernetes on that node, the rancher01 node. That's all it took to build a Kubernetes server. Now, we are building a single Kubernetes server that's doing all three roles. It's an etcd server, so it's running the database. It's a control plane, so it's monitoring everything. And it's a worker node. If you were to build this out as a 25-node cluster, you could say, I want three etcd nodes, and I want five control plane nodes, and I want everybody else to be a worker node. You can break it up that way. But today we've just got one, doing everything. So that first command created a Kubernetes server using our distribution. That's all it took: one simple command, and you are up and running. You can access the logs here and see what that actually looks like. It has some wait time, and that's fine. It just takes a second to catch up, but it is up and running. So we're gonna go ahead and go next here. If everyone clicks on next, make sure you're at the top. I did it, it's just this log right here, if you look up. It's just saying that it's starting Kubernetes. It tells you if it's running, sort of like a troubleshooting test, validating along the way that things are working. Yep. And so now we actually need to install kubectl, which is the command-line tool we use to work with Kubernetes clusters at the terminal level. If we run this curl command, it will do all of that for us. No, actually you can get out of there. That's fair. Okay, so if you ran the journalctl command, make sure that you get out of it, because I didn't. Now let's run that command and boom, we've got kubectl. We have it in /usr/local/bin, et cetera.
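Condensed, the install in this step comes down to a couple of privileged commands plus a config directory. The two root commands are left as comments here because they need network access and root on the lab VM; the runnable part just sketches the config file the step creates, using a scratch directory in place of /etc/rancher/rke2 (`write-kubeconfig-mode` is a real RKE2 option, but treat the exact contents as illustrative):

```shell
# On the lab VM (as root) the install itself is essentially:
#   curl -sfL https://get.rke2.io | sh -     # install the rke2-server package
#   systemctl enable --now rke2-server      # start the single-node server
# The step also creates the config directory first; a scratch dir stands in
# for /etc/rancher/rke2 so this sketch runs anywhere.
mkdir -p rke2-demo
cat > rke2-demo/config.yaml <<'EOF'
write-kubeconfig-mode: "0644"    # let non-root users read the kubeconfig later
EOF
cat rke2-demo/config.yaml
```

Everything else the transcript describes (Canal, CoreDNS, the kube-apiserver pods) comes up automatically once `rke2-server` starts.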
So to ensure that kubectl can connect to Kubernetes clusters, you actually have to reference that kubeconfig we created in the previous step, under /etc/rancher/rke2 and so forth. Yeah, so one of the keys to the kingdom here is that config file. When you installed your Kubernetes cluster, that process created a file, and the file could be under a couple of different names, and it doesn't matter what it's called, but its content is very important. That content is how you're going to talk to that Kubernetes cluster, because that cluster has an SSL certificate that is required. It is generated by the Kubernetes build process. Kubernetes itself creates this certificate and uses it for all communication. So everything that communicates inside the cluster goes over SSL via that certificate. We need this file, or we cannot talk to that cluster; we don't have the certificate to do so. This config file, for us, initially lands at /etc/rancher/rke2/rke2.yaml, but kubectl expects the file to be called config, c-o-n-f-i-g. And if you put that file in your home directory under the .kube directory, which is what one of these steps will do for you, the kubectl command knows by default: whoever's running kubectl, go look under the .kube directory in their home directory, find the config file there, and use that to connect to the cluster so that you can look at it and query it and do things. Exactly. So now that we have that config, we're gonna reference it from that .kube directory. And then we should have the ability to run kubectl commands, because kubectl can see the Kubernetes cluster's configuration. So let's run kubectl get nodes, and we will have the name of the node, the internal IP address, and those roles like Brian referenced earlier. Again, all three: control-plane, etcd, master. And then it shows the version.
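The kubeconfig wiring Brian describes boils down to one symlink. Here is a runnable sketch using relative scratch directories in place of the real home directory and /etc/rancher/rke2; the file content is a stand-in, not the real generated kubeconfig:

```shell
# Scratch dirs stand in for $HOME and /etc/rancher/rke2 so this runs anywhere.
mkdir -p demo-etc demo-home/.kube
printf 'apiVersion: v1\nkind: Config\n' > demo-etc/rke2.yaml  # stand-in content

# kubectl looks for $HOME/.kube/config by default, so link the RKE2 file there.
ln -sf "$(pwd)/demo-etc/rke2.yaml" demo-home/.kube/config

# With the real files in place, kubectl get nodes would now work.
readlink demo-home/.kube/config
```

On the lab VM the same idea is the soft link from ~/.kube/config to /etc/rancher/rke2/rke2.yaml, which is exactly the step the permission-denied troubleshooting below checks for.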
So this is important, because we won't be able to go further if we don't have this working. Is this not working for anybody? Does anybody not have kubectl returning? Is it giving you something like localhost:8080? If you need to install kubectl, there's this curl command, and it should give you this nice little progress display of how much has been installed. It's a curl command that follows the documentation for kubectl. Yeah, so to do that one: your cluster's probably working, but we don't have the tool that queries it. So get that, follow along from the steps, and see where you go. Is there something over here? We're getting both. Permission denied. Interesting. So that tells me that you're not pointing at the right kubeconfig file. I would go back and validate the previous step that you clicked on, the one that made the .kube directory and created the soft link for it. Make sure that happened. The connection refused, was that when you ran the kubectl command? Yeah, if it can't find the config file, it'll give you all kinds of grief. Make sure that you ran the ln -s. What did we do, a soft link? Yeah, we did a soft link. I went ahead and typed in this command here at the top. If anyone's having issues, let's make sure that we have the RKE2 service running. So the rke2-server here should be active. You can check with systemctl status rke2-server; just type that in there. You know, just some brief troubleshooting. All right, I'm gonna get out of this right here, and I'll have Brian hop around and help any of those who need to spend some time getting caught up. So again, just to backtrack a little, we're gonna run kubectl get nodes. Those are those hosts. This is the VM, the thing that has a Linux distribution on it. And then we are going to run kubectl get pods.
So if we run kubectl get pods --all-namespaces, you can actually see all of the pods, and therefore containers, that get deployed with our distribution, RKE2. You'll see things such as, again, the controller manager and the kube-apiserver I mentioned. This is cut off, but these are all 1/1 with status Running, or there'll be 0/1 Completed, which were essentially jobs that needed to run to completion. And so we've installed some things here: Canal as the CNI, CoreDNS, ingress-nginx, and a metrics server. Those are all things that come with our RKE2 distribution and so forth. So again, the actual pod that's running Canal is right here, rke2-canal, same thing with the DNS and ingress controllers. So yes, all of those run as pods within the cluster, but these are the working pieces of the Kubernetes cluster, and that's why they are in the kube-system namespace. We will see some other namespaces here momentarily, once we start installing other applications into our Kubernetes cluster. So if we click next, everyone should be on the page where it says install Helm. Helm is a package manager, and it allows you to deploy these YAML files as Helm charts. Just to clarify: you saw those YAML files with the services, deployments, et cetera. A Helm chart is essentially a lot of those, referenced together in a package, and you deploy them all as that chart. So the first thing we need to do is curl the install script to get Helm onto the machine, then check the version. So I clicked on this curl command, and I'm now gonna click on helm version --client. It looks like we're good, we've got the right version. And then we can use Helm to list the chart releases across all namespaces.
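To make "a Helm chart is essentially a lot of those YAML files" concrete, here is the smallest possible chart layout, with made-up names. `helm install` renders the template under `templates/` and applies the result, which is why one chart can stand in for a whole stack of manifests:

```shell
# A chart is just a directory: metadata in Chart.yaml, manifests under
# templates/. Names here are invented for illustration.
mkdir -p demo-chart/templates
cat > demo-chart/Chart.yaml <<'EOF'
apiVersion: v2
name: demo-chart
version: 0.1.0
EOF
cat > demo-chart/templates/deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-web   # Helm fills this in at install time
EOF
find demo-chart -type f | sort
```

With Helm installed you would deploy it via something like `helm install my-release ./demo-chart`, and `helm list --all-namespaces` would then show the release, which is the command being run in the lab.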
So as you can see here, RKE2 actually delivered those services we provide, Canal, CoreDNS, and the metrics server, as versioned Helm charts. The question was whether Helm is like an architecture or a version. Well, Helm was listed on that cloud native landscape. So the question is whether Helm is essentially part of Rancher. It's not. We do utilize it, because it's a package manager that allows you to deploy applications inside CNCF-certified clusters. It's just a tool that has basically become standard, because it's a graduated project. It's got a lot of funding, and it's maintained by a lot of people. So let me continue. Great question. Now, there are some requirements that Rancher needs, and for Rancher we're gonna have our TLS certs self-signed, so we're gonna install cert-manager. We're gonna get that from Jetstack, and you can do this with a helm command: we do helm repo add, and then it says jetstack has been added to your repositories. There's going to be a Helm chart in there, and when you do helm install, it references that chart, and you can use some flags to set the version, set the namespace it gets installed into, and, if that namespace doesn't exist yet, have it created. So we're just gonna list this out. And briefly, Brian had mentioned CRDs, custom resource definitions. Those are basically resources outside of the typical built-in Kubernetes objects, custom resources. So we're just gonna click on helm install. Yeah, let's just pause here. Is everyone all nice and caught up? We're just currently installing. Perfect, awesome, guys. Now we'll just wait for this to finish, and then we can check the rollout status of our cert-manager. It looks like everything's good here.
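The cert-manager steps just described boil down to two commands. They are echoed rather than executed here, since they need the lab cluster to actually run; the repo URL is the public Jetstack one, and the `installCRDs` flag reflects common cert-manager usage, so verify against the scenario's exact command:

```shell
# Echoing the commands so this sketch runs without a cluster. On rancher01
# you would run them directly (the lab's clickable commands do the same).
echo "helm repo add jetstack https://charts.jetstack.io"
echo "helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set installCRDs=true"
```

The `--create-namespace` flag is the "create it if it doesn't exist" behavior mentioned above, and `installCRDs=true` is cert-manager installing its own custom resource definitions, tying back to Brian's CRD aside.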
And this is, again, the helm install referencing cert-manager inside of jetstack: it references the release name, and then inside of that jetstack repo there's the cert-manager Helm chart, and then you're giving it the namespace and version and creating the namespace if it hasn't already been created, which it hasn't. So if we check the status of the rollout, we can just click kubectl -n cert-manager rollout status, and it says cert-manager successfully rolled out. Awesome. Does everybody know what Helm does? Did you describe what Helm is? Yeah, I have a reference, it's a package manager. The next thing we do here is, we can also check the status of, I believe, the webhook. Yep, that successfully rolled out. So we're good to go: we've got the cert-manager webhook and the cert-manager deployment all taken care of. So let's click next, because the segue I wanted to make is that Rancher is a Helm chart, essentially, just like most applications for Kubernetes clusters; we deploy it as a Helm chart. So when we do helm repo add, we're actually going to add the Helm repo that we have for Rancher. If we click this here, we're now on the install Rancher step. We should be about 25% of the way through the lab, or the practice, whatever you wanna call this. We clicked on helm repo add, and it should say rancher-latest has been added to your repositories, and from there we're gonna go ahead and install Rancher. For the hostname, this installation actually uses sslip.io, which gives us something like DNS that we can reference, and then we're gonna set it to version 2.7.3 and create a namespace like we did with cert-manager. So if we click on that, this will take a second, and it's gonna say happy containering right there.
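Assembled, the Rancher install from this step looks roughly like the sketch below. The command is echoed, not executed, because it needs the lab cluster; 203.0.113.10 is a placeholder for your rancher01 public IP, and the chart repo URL and version pin are my best reading of the lab, so double-check them against your scenario text:

```shell
# Placeholder for rancher01's public IP; the lab substitutes the real one.
NODE_IP="203.0.113.10"

# Echoed rather than executed; on the VM these run against the RKE2 cluster.
echo "helm repo add rancher-latest https://releases.rancher.com/server-charts/latest"
echo "helm install rancher rancher-latest/rancher \
  --namespace cattle-system --create-namespace \
  --version 2.7.3 \
  --set hostname=rancher.${NODE_IP}.sslip.io"
```

The sslip.io hostname works because sslip.io is wildcard DNS: any name containing an IP resolves to that IP, which is why no real DNS setup is needed inside Hobby Farm.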
So everyone should have run that helm install rancher command, and we're just gonna go next. Oh, one thing I forgot to mention, or did you mention the importance of DNS? No. Kubernetes loves DNS. Why? Because there are so many IP addresses. Every one of these things has an IP address, and we're using that 10-dot number scheme, an IP numbering scheme that can go up into the bazillions. So how do we manage things? We cannot manage them by IP; we've gotta know names. So DNS is important. One of the resources you'll see is a CoreDNS pod deployed into your Kubernetes cluster to help manage all of the DNS names created for every one of the resources that shows up in the cluster. Hobby Farm can get very overloaded, because it's up in AWS with everybody doing something. Is someone having an issue with their Hobby Farm? Is it slow, or is it just working? Within the terminal, or the actual page? Give it a refresh. Are you on Windows? What's the shortcut to hard-refresh the website on Windows? On Mac it's Command Shift R. Control Shift R? Okay, yeah, that would make sense. Command Shift R is for Mac, so. All right, so if you're on this page with me: verify Rancher is ready to access. We ran a while true command, and it's basically checking whether or not the HTTPS website is actually available. It did that for a brief amount of time and said it wasn't ready, but now it says Rancher is ready. So I will go on to the next page. Again, raise your hand if you need to get caught up, and Brian can come over and assist. So now we're on the page that says Accessing Rancher. To note, we're gonna start doing some manual command pasting to get a bootstrap password. The first thing that we're gonna do is go to this page here.
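The "while true" readiness check mentioned above has roughly this shape. A counter stands in for the real probe so the sketch runs anywhere; in the lab the condition is a curl against your Rancher URL over HTTPS (the exact URL and flags come from the scenario, so treat the commented line as an approximation):

```shell
# Poll-until-ready loop. The counter fakes the probe so this runs offline;
# the lab's real probe is roughly: curl -sk https://rancher.<ip>.sslip.io/
attempt=0
until [ "$attempt" -ge 3 ]; do   # stand-in for: until the curl succeeds
  attempt=$((attempt + 1))
  sleep 1                        # the lab waits between tries the same way
done
echo "Rancher is ready after $attempt checks"
```

The point of the loop is just that Rancher's pods take a minute or two to come up after `helm install`, so the scenario keeps probing instead of failing on the first refused connection.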
If you just click on it, it should automatically open. The connection is not private; we're just using self-signed certs. So what you're gonna have to do is go to Advanced and click on Proceed to rancher, your IP address, .sslip.io. If you do that, you should be good. If you don't have the Proceed button, there's "this is unsafe": if you just type that into the webpage, you can actually get past the connection-is-not-private warning. I had a question back here, Austin. The sslip.io thing, that is only what we're doing in Hobby Farm. You will not see this in your own labs, and you'll never use this for yourself. This is just for us to provide DNS inside of Hobby Farm. So don't worry about remembering sslip.io. Does it matter? It's just in this lab. But you need to see this; this should work. Yes, so now this is where you need to actually get that bootstrap password. As you can see here, it says for a Helm installation, you need to run kubectl get secret, and it's just gonna decode the bootstrap password, because it's just base64 encoded. And you're going to copy this. So if you click on that kubectl get secret command and then come back into your terminal, paste it, and run it, it'll spit out your bootstrap password. Then you copy that. And again, this is in rancher01; I wanna make sure that everyone's got the rancher01 tab selected right now. Then you take that bootstrap password, and we can verify some things here: bvg, bvg, and then cwv at the end for me. I guess you guys don't need to hear me saying that. And you log in with the local user. There is a way, before you install Rancher, to specify your own bootstrap password. We didn't, so it generated its own. But we talked about how things are declarative. One of the things we can do when we install Rancher is use a declarative file to tell it: here's what I want your bootstrap password to be.
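The bootstrap password lives in a Kubernetes Secret, base64-encoded, which is why the lab command decodes it for you. The commented command below is roughly what the Rancher docs give for a Helm install; the runnable part just demonstrates the decode with a stand-in value:

```shell
# Against the real cluster the lab command is roughly:
#   kubectl get secret --namespace cattle-system bootstrap-secret \
#     -o go-template='{{ .data.bootstrapPassword | base64decode }}'
# Stand-in value so the decode step can run anywhere:
encoded="aHVudGVyMg=="                        # base64 of a made-up password
password=$(printf '%s' "$encoded" | base64 -d)
echo "$password"
```

If you wanted to skip this dance entirely, passing `--set bootstrapPassword=<your-password>` at `helm install` time is the declarative option Brian mentions.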
Here are some other values that I want you to know about. Go, and then it will use your own password instead of a generated one. Cool. And so here I'm just making my own password real quick. The server URL should automatically populate for you using that sslip.io hostname. Make sure you check this box: you accept the end user license agreement, the T's and C's, and click continue. So we're logging in as the admin here, and then we end up on the Rancher landing page. Yes, it is admin. So if you, for whatever reason, needed to close down the browser, bring it back up, and go to your sslip.io Rancher URL, you would log in with admin. You can actually see that up here in the top right; there's going to be a little creature and admin. So, we've got Rancher deployed. Just to reference this, this page is the dashboard. I'm going to try to zoom in. That, perfect. This is the local cluster. Like I said, Rancher is deployed into a Kubernetes cluster, and then it manages other clusters from that cluster. I have seen this on a few customer calls: the local cluster is not for workloads. Don't; you'll mess up a whole bunch of stuff. This is just for Rancher. We obviously have some tools that you can deploy in there, like Prometheus and Grafana for observability of your local cluster, as well as a backup tool so you can back up your actual Rancher deployment. But this is not where you put workloads. It is good to note, though, because you can see things like how many pods are deployed, how much CPU and memory this node has, the version here, and so forth. So everyone should see this; everyone should have it up and running. I'm going to, oh, yes, the local cluster. Yeah, perfect, great. Thank you. So the local cluster is the Kubernetes cluster where Rancher lives.
So 110 is essentially the default limit on pods for a single node. You can change and adjust that, but it's saying that there are just 28 of 110 pods deployed. Do you remember when we did kubectl get pods? If we did that again over here, it would have a larger list. So I can go off script here real quick. If you want, either go up with the arrow keys or type it out: kubectl get pods --all-namespaces. You can still do things within this Kubernetes cluster from this terminal, and you'll see kube-system, but now you'll also see things like the cattle-system namespace, which has our Rancher webhook. You can see the cert-manager that we deployed, et cetera. And I can actually show you what that looks like further. So what we'll do is go to this next step: we're in Hobby Farm right now, and we are going to go to the step that says creating a Kubernetes lab cluster within Rancher. I'll show you how you can actually run the terminal within Rancher for your Kubernetes cluster, and other places within Rancher and therefore your Kubernetes clusters. So let's navigate back to the web browser where you have Rancher's webpage, and we're gonna click on this create button. If you're not on this page, just go to the hamburger menu, click on home, and come click on this create. This is saying we're gonna create a downstream cluster, and what we're gonna do is create another RKE2 cluster that we can actually put workloads on. For this demo, it's going to be a WordPress deployment. So just to walk through this: if you deployed this cluster in the same manner at home, and you have access to Azure, AWS, or GCP, you can actually use your secrets and access keys to deploy more clusters downstream, but they'll live in those clouds. Yeah, there'd be some networking hurdles, but yeah, exactly.
And so then, if you wanted this, you'd be able to deploy, sorry, this top section is actually for deploying different genres of Kubernetes distributions: AKS, EKS. This section right here actually deploys our version of Kubernetes, our distribution. See, here we have it toggled as RKE2 or K3s. We also have a version called RKE1. It's got Dockershim as the runtime, so we really just recommend staying up to date. With RKE1, from a support standpoint, people are still using it to this day, but we prefer people use RKE2 because it fits more of that Kubernetes standard. But anyway, you could deploy EC2 instances, or you could use Linode, VMware, DigitalOcean. There are actually even other drivers that you could come here and look at. So there are cluster drivers. These are not necessarily out-of-the-box ready to go; there could be some configuration. But these are some other Kubernetes distributions, cluster drivers. And then there are also different node drivers that would deploy with RKE2. And you can look at all of these. So you've got Nutanix in there, OpenStack, et cetera. But let's get back; I'm gonna go back to the home page and click on that create button again. So for where we're deploying, we're gonna scroll down to the bottom. Make sure that, oh, why is this not scrolling? There it is. Make sure that RKE2 is toggled, and then we're gonna click on custom. This says we're essentially gonna deploy onto a bare-metal node. In this instance, it's a virtual machine, but yeah. So we're gonna click create, and we're gonna change some things. Let's make sure that we're following along with the steps. We went to the Rancher home page, we clicked create, then we scrolled down to the custom cluster on existing nodes. In this section, we have some things that we need to reference.
We need to give it a cluster name, and then, for later purposes, we're also gonna give it an older version. So let's go back to our Rancher deployment. Let's give it a name: a rodeo, let's do scale-rodeo. You can name it whatever you want, but do use lowercase, please, otherwise you will get an error. Then come into this dropdown menu, and you can select the 1.23 version of RKE2. We're gonna leave everything else as default. Just like with the RKE2 cluster where we ran that curl command and did kubectl get pods, we noticed that the services, CoreDNS, nginx, the metrics server, were all deployed by the curl command. But here we can actually start doing things like adding member roles. If we wanted to do some role-based access control here, we could add a member to this cluster and give them read-only access. So the end user is somebody who doesn't get to actually deploy anything, but they have access to some internal documentation or something like that, and so forth. But I'm gonna go back to this basic cluster configuration portion right here, scroll down to the bottom, and as long as we have the name and the Kubernetes version as 1.23, we click create. When we click create, we now have a registration command, and then we have some other steps. If we go to the Hobby Farm, we've accomplished the steps up to seven, and now we're on steps eight and nine, et cetera. So what we need to do is look at that registration command. We're going to check what roles we're defining for this downstream cluster, and then we're gonna click on show advanced. So we look at the roles: we have etcd selected, control plane, worker, and then we also have the command itself. First and foremost, we're gonna click on show advanced, and we're gonna copy and paste our private and public IP addresses from our Hobby Farm tab.
So this is where we've gotta make sure that we're using the right tabs, et cetera. To be clear, make sure that you have this tab selected, go to the private IP right here, do a little copying, come back to Rancher, and paste it into the private IP field. Yep, there it is. And then come to the public IP and copy that. Boom. And we now have it. Hostname you can just leave blank, and it will automatically name it. You can give it something if you wanted for your own case, but I would just leave it blank, and it'll automatically assign a name. Okay, the cluster's IP, Rancher's IP: so again, because all of this is done through HTTPS, we're actually referencing Rancher's DNS name, right? That's that sslip.io address. So this registration command talks up to our Rancher deployment in order to have the downstream cluster get observed. Does that make sense? Because it's not just an IP address being referenced in the command, it should have your, oh, you know what? I am super silly. Holy cow. Thank you. Thank you. I messed up. See, this is what I'm talking about; I need to talk this through better. We're not referencing Rancher here. Please do not paste anything yet. Come back to this cluster01. What's your name, in the red? You deserve something. Max? Thank you, Max. So actually come to the public IP of cluster01, and we're going to replace those values we just entered. We're going to change those to the cluster01 private and public IP addresses. They shouldn't be the Rancher ones. So in the node public field, use the public, and the node private for the private. Yes, you could just use the instructions. So this is cluster01's public IP address, and this is cluster01's private. If you notice here, I have this tab selected, and I had previously selected rancher01 on accident and started copying. Big mistake.
But if you actually followed the instructions, you would have been able to just copy these into those fields. See how it says very important. Thank you so much, Max, that was so great. So then we come back here and make sure that we've got the right IP, just for a sanity check. You can look and make sure that this node public IP is not the same as your Rancher deployment's IP. Then, if we scroll to the bottom, we actually need to skip the TLS verification, because we just have self-signed certs. So make sure you have this highlighted. Now we have a registration command with the appropriate IP addresses. Cool. So let's not paste yet; we're gonna go to the next step. Let's briefly look at this: start the Rancher Kubernetes cluster bootstrapping process. We need to make sure we have cluster01 selected in Hobby Farm. And then we will run the command that we got from our Rancher deployment. So come here, click, it says copied, you should be good to go. Then we run this command. I've pasted it in there; just click enter. I double-checked and made sure that I had cluster01. And then, if we want some provisioning steps, you can navigate to the Rancher page. You see here you've got some bootstrapping information, and you can click on the provisioning logs. This will say updating. You can also look at the machine pools. It automatically gave the node a name, because we didn't name it ourselves. So it's got custom, it has the OS as Linux right here, and it has all the roles, because we defined all the roles in the registration command. You can look at the conditions. You can see that there is no cluster agent deployed yet. That cluster agent is what talks up to Rancher. That's a distinction I like to make, because it allows us to operate in an air-gapped environment. The Rancher cluster is not reaching down and polling.
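The registration command assembled in these steps has roughly the shape below. It's echoed, not executed: the real one comes out of the Rancher UI and carries a per-cluster token, and the script path, flag names, and hostname here are reconstructed from memory of Rancher's custom-cluster flow, so trust what your Rancher page generated over this sketch. Everything in angle brackets is a placeholder:

```shell
# Placeholder Rancher hostname; yours is rancher.<rancher01-public-ip>.sslip.io.
RANCHER_HOST="rancher.203.0.113.10.sslip.io"

# Echoed so the sketch runs without a cluster; in the lab this runs on cluster01.
echo "curl -fL --insecure https://${RANCHER_HOST}/system-agent-install.sh | \
  sudo sh -s - \
  --server https://${RANCHER_HOST} --token <cluster-token> \
  --etcd --controlplane --worker \
  --address <cluster01-public-ip> --internal-address <cluster01-private-ip>"
```

The role flags mirror the checkboxes in the UI, the address flags are the two IPs pasted in above, and the insecure curl is the skip-TLS-verification box, needed because the certs are self-signed.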
The downstream clusters are sending information up to the Rancher cluster. So that's something to note for air-gapped environments. Let's go back to the provisioning logs and just wait this out. It should take about five minutes, and we should have another cluster up and running. Again, this is the downstream cluster. So for the downstream cluster, yeah, it's just a single node, like a test environment. But if you wanted it highly available, the minimum is three. So, question in the back. You can deploy it, yeah. Best practice would be to deploy it into a Kubernetes cluster. You could use Docker instead, the way that we didn't, but that's a deprecated method. Those are really the two deployment methods. Yes, you would stand up the Rancher management cluster first, before making any downstream clusters. But to note, that's a great question, because I can actually show you some information here. I'm gonna let this just run. If I come to cluster management, I can actually import existing clusters. And you could do that for clusters you already have in your cloud providers: you would just go there, enter your project ID and cloud credentials, and then it would give you a list. Or, for instance, if you made a vanilla cluster with kubeadm, you could just pick that option, click create, and it will give you a command to import the existing Kubernetes cluster for Rancher to then manage. So you could do that, but the one thing about imported clusters is they don't do certificate rotation automatically, so you'd have to do that yourself. If you provision it through Rancher, whether it's in the cloud or on bare metal, the cert rotation is a lot more accessible. So I'm gonna go back here and go to cluster management.
I'm going to go into the cluster, which I named scale-rodeo, and I'm gonna check on those provisioning logs. Should just take a little bit. Is anyone having any issues? Is everyone pretty much waiting for this to update? Yours is done? Yours was quicker than mine. Yep, when it's in machines, yep, mine just finished as well, as I was saying that. So as we can see, we now have two clusters running. It took about eight minutes to get the second one going, from sending the command down. And you can look at things like the machines. You can see what IP addresses, both external and internal; those are the ones we provided. You can go through the conditions and make sure everything's up. So it looks like the cluster agent condition, yep, that is no longer there; obviously it deployed one minute ago. And there's just some more information here, related resources, and so forth. So if we go back to HobbyFarm. Great question. The question is, would we use the same command for every node? That depends on how you want to set it up. If you wanted every node with all roles, then yes, but that's usually not the case, right? You would end up selecting and de-selecting roles depending on what your needs are. For instance, if somebody's workloads take a lot more CPU, they need more worker nodes. So they would add worker nodes via this command, and you'll see in here there is a --worker flag. And if we were to remove that, or --controlplane, et cetera. The big thing is that it's resource constraints, right? Because it's going to deploy essentially those same kube-system pods onto those nodes as well. And especially once you start doing things with edge deployments, that's where that can become quite important, right?
So if you have a worker node that you wanna manage, but you don't want it to have the data store and you don't want it to have all of the control plane tools on it, you just want it to be a worker node on, like, a Pi. Then, yeah, so it didn't. Because it's a custom deployment, it didn't create a virtual machine; it was a virtual machine that we had already set up. That's why in HobbyFarm, if you go here, there's two tabs. It was as if we had bare metal. So if you had a Raspberry Pi at home, or a NUC or something like that, already set up with an OS, ready to go. That's what happened. In that case, you would just run this curl command on each of those Pis, and it would become one cohesive cluster. Yep, exactly. Perfect, great questions, guys. Oh yeah, and if you wanted to use a Windows machine, you could, but, you know, anyway. So as you can see here, we went through the machine pools, we went through the conditions, and we made sure that everything changed to active. Just for some clarity real quick, I'm gonna go to the machines. We've got it running, we have it active, and everyone else should be in that state as well. So I'm gonna click on next. And now we can actually start interacting with our Kubernetes cluster, the downstream cluster. We've already been interacting with the local cluster, which is also obviously a Kubernetes cluster. So if you come into cluster management and you want to look at the things that are deployed on your Kubernetes cluster, you can go to explore. And this will give us some cool information immediately. So this is the cluster dashboard. Yeah, absolutely, there's a few ways to do this. If you go to the hamburger menu, there are the global apps, and under those global apps there's cluster management. Or they have the explore cluster tools built in.
If you click on your cluster that's not the local one, just click it right there, or go to cluster management and then go to explore. And we should be on the cluster dashboard. The downstream one, so not the local cluster. Whatever you named it, yep, the custom cluster, correct. Yes. And so we're gonna leave that Rancher cluster alone; we're not gonna deploy anything else on it. We're gonna be operating pretty much solely now in this, what we would call, downstream cluster, which is a custom cluster. So I'd like to show you some things right off the bat. You can see the provider that we used, RKE2. If it was K3s, it would say K3s right there. It gives you the total number of nodes; we have one node, 10 deployments, 410 resources. It gives you some information about CPU, the memory, and then the total number of pods. But the cool thing that I like on this page is you can actually go into the kubectl shell if you click this. And depending on who you're signed in as, you can start running commands. So if you tap on that real quick, you'll see that in the top right you basically have a profile photo. This is the default admin. And so if I'm signed in as an admin and I click on this kubectl shell, it's just a greater-than sign, you know. Anyway, it will bring up a shell and you can start running commands as the admin of the cluster. So kubectl get all, or kubectl get pods -A. And you can come in here, and you have another way to interact with the cluster. Previously we did that on the terminals and were able to see all of our pods, but now we can see what we've deployed with RKE2. We can see that in kube-system we have those same things that we've already seen before, and in the cattle-system namespace we have our helm operator, our fleet agent, et cetera. And that's where the Rancher webhook exists, to speak to the upstream, or local, cluster.
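The commands run in that built-in kubectl shell are just ordinary kubectl, and the same workflow carries over to a local terminal once a cluster's kubeconfig has been downloaded; something like this, where the filename is hypothetical:

```shell
# Inside Rancher's built-in kubectl shell (already authenticated as the logged-in user):
kubectl get all                    # everything in the default namespace
kubectl get pods -A                # all pods, across all namespaces
kubectl get pods -n cattle-system  # Rancher agents: cluster agent, fleet agent, webhook

# From a local terminal, after downloading a cluster's kubeconfig from the dashboard
# (the path below is an assumed example):
export KUBECONFIG=~/Downloads/scale-rodeo.yaml
kubectl config get-contexts        # list the contexts available in that file
kubectl config use-context scale-rodeo   # switch between downloaded clusters
```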
So yeah, pretty cool. You can also download the kubeconfig. And if you have kubectl installed on your own terminal, you can take that config, make it into a file on your local machine, export it, and then just start running commands, and easily switch between different clusters from your own terminal, as long as you download the kubeconfig for each of those. So pretty cool. You can also just copy it if you wanted. So those are just some quick ways to interact with your cluster. We did the kubectl commands, I showed you all the kubeconfig file, and now we're gonna start deploying applications. The first thing we're gonna do is enable monitoring with Prometheus and Grafana. So if you come into your Rancher deployment, you'll have this hamburger menu; make sure that you've selected your downstream cluster, and if you have, you should then see cluster, workloads, and apps, and then some stuff below them. If we select apps, we're gonna go into the charts page. So we're selecting charts, and the really cool thing, we had a slide that represented all of this: the things that come built in with Rancher, the things that we manage ourselves over at SUSE. We've got the CIS benchmarks, we have Istio, Longhorn, and we have NeuVector, which is a really cool container security tool that is basically built around zero-trust principles. If it doesn't recognize the processes being run in your containers, it just doesn't let them run. Pretty cool. As you can see here, these are the applications that come with Rancher, all in blue. If we go lower, these are our partner charts, and these contain Helm charts. Just like we previously deployed Helm charts with the command line, we can now do that within the apps catalog. So we're gonna click on monitoring, like it says here: enable Rancher monitoring.
To deploy Rancher monitoring, go to apps, we're there, under charts, locate monitoring, we did that, and then we should select the install button. This gives you some information, some versioning, and so forth; you've got a nice little README here. You click install and it takes us into an install wizard. You can select what project to install into, and I'll explain what a project is later, but we're just gonna leave this as the default options, because we're actually gonna change some information once we click next. So if you see up here, we have metadata and then values as the last step. If we select values by clicking next, we should now have some option pages. It automatically detects the cluster type. We're gonna leave all of these things as default, look at these options over here, and click on Prometheus. So if we go to the instructions, it says, in the values step, select Prometheus, and then we're gonna change our resource limits. Because we're on a single machine, the default CPU request that monitoring uses is going to be this, but we don't have that many resources on these single nodes. So we're gonna change this from 750, lowercase m, to 250, lowercase m, and then the requested memory we're also going to change from 750 to 250. Make sure that that's what it says in the HobbyFarm steps. Yep, so for CPU and for memory, and then we're gonna click on install. And like I said, Helm is pretty much the standard that we and many other people use for package management. And now you have an application deploying. It's essentially a Helm command, because it is a Helm chart. Yep, helm upgrade, but with the install flag set to true; we create this namespace, we've defined some versions like we did previously with Rancher and with cert-manager, and then we see success.
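Those wizard fields map onto Helm values for the rancher-monitoring chart. As a sketch, with the caveat that the exact value paths can differ between chart versions, the resource change being made is roughly:

```yaml
# Sketch of the rancher-monitoring values the wizard edits.
# Value paths are approximate and may differ between chart versions.
prometheus:
  prometheusSpec:
    resources:
      requests:
        cpu: 250m      # lowered from the 750m default for a single small node
        memory: 250Mi  # likewise lowered from the default
```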
So if we come back over into HobbyFarm, we should be good to go; it says installed apps, and now we should be on the step that says working with Rancher monitoring. So in the left menu of the cluster explorer, we're gonna select monitoring. We might need to refresh the page, so I'm gonna minimize this. We'll have cluster, workloads, apps, service discovery, et cetera, but if we refresh the page here, monitoring should have successfully deployed, and now we have a monitoring option. So if you select that, you now have a dashboard for monitoring, and if you click on the actual metrics dashboard for Grafana... oh, it looks like it just needs to come up; it's still deploying. Let's check on the workloads. If you want to check in on your monitoring deployment, you can come up into this all-namespaces section, scroll all the way to the bottom where it says not in a project, and select cattle-monitoring. Do note that you're selecting this, because we're gonna want to change it back after. And if you look here on the left, we'll have some information. It's looking at the cattle-monitoring-system namespace, and we actually have 15 pods. We have two stateful sets, eight deployments, and five daemon sets. So we're gonna see if all of these are up and running, and as you can tell, all of these should be active. Yeah, it looks like they just barely became active, like a minute and a half ago. That's probably why, when we selected the monitoring page, it gave us that error. So we've got a lot of information immediately. Let's go ahead and go into the monitoring page down here. If you're following along, you'll be on the monitoring dashboard. Let's make sure that we deselect this cattle-monitoring filter. This is a namespace filter, slash, project filter. Let's just do only user namespaces up there, and make sure that that's what's selected.
So click on Grafana, and this is the application we just deployed. You can federate your other clusters and have them all reference the same Grafana. You can obviously go in here, follow their documentation, and change what's being represented here, but memory utilization, CPU utilization, disk utilization: you immediately have visibility into your cluster. And in fact, if you're in Rancher again, I'm just gonna close this Grafana webpage, if you're in Rancher and you select cluster and end up going to the cluster dashboard, we'll immediately start having some metrics showing up. So this is Grafana, embedded into that cluster dashboard, and you can see those metrics. Quite cool. So if we go back to HobbyFarm, it looks like we're good to go to next. We've covered everything in the working with Rancher monitoring step, and now we're gonna go into the create a deployment and service step. So if you remember, we talked about pods and deployments. We're now actually gonna create a deployment, which is made up of pods. If you go to Rancher and you go down to workloads, you can select workloads and click create, and you have a list of things that you can create: cron jobs, daemon sets, and so on. Daemon sets essentially say, I want this to run on every single node or host within the Kubernetes cluster. Stateful sets are for stateful applications that need to be persistent, so they're not ephemeral; they're gonna have some persistent storage attached to them. But for this demo, let's click on deployment, and now we've got some fields to fill out. So if we go back to HobbyFarm, the first thing we're gonna do is grab this container image. It's going to be rancher/hello-world:latest. And then we're gonna go down to this first step and start listing off some more things that we're gonna add.
So if we go into Rancher, the first thing we're gonna do is give this deployment a name: hello-world. We're gonna go down to the container image, and the container image is going to be that rancher/hello-world:latest. So I'm gonna paste that right here. And then in this section right here, we're gonna pump the replicas up to two or three, or however many you want. And then from here, we're gonna go down to a port and we're gonna add a node port, which, if you remember correctly, node ports expose an application outside of the cluster. It's essentially an application that opens up a port for outside use. So if we do that, we're gonna scroll down. Did I see a hand, by the way? Oh, okay. So we're gonna click on add port, if you saw that, add port or service. This is what it looks like: add port, and for the service type, we're gonna select node port. I'm gonna just name it hello-world-np. And then it's gonna have some instructions. We're gonna serve it on the private container port, 80, and I believe that's all that we need to do. There are obviously some other options, like the high node port will automatically be created for us, but let's see, I'm pretty sure, yeah. So now we can scroll down to create, and now we have a deployment. This immediately says, hey, this is how old this is; we have zero of three in the ready state, and now we have three of three in the ready state. Took two seconds, and we have a whole deployment running. That goes at the node level. So for instance, when you have a downstream cluster node with just the control plane role installed on it, it will actually apply some of those taints, right? And it will say, schedule nothing else onto this except the control plane workloads. So you automatically don't have all these workloads deploying onto your control plane. You wouldn't do that via the UI here; you would just use the CLI. Or you can do it within Rancher.
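What that form builds under the hood is a Deployment plus a NodePort Service; the equivalent manifests look roughly like this (Rancher's generated labels and names may differ slightly):

```yaml
# Roughly what Rancher generates from the create-deployment form.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: rancher/hello-world:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-np
spec:
  type: NodePort        # exposes the app outside the cluster
  selector:
    app: hello-world
  ports:
  - port: 80
    targetPort: 80      # the private container port
    # nodePort is omitted, so Kubernetes picks a high port (30000-32767)
```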
And you used the word cordon, which was an interesting word to use, because there is a command called cordon, which I can run on a node to tell it: you're taken out of the cluster, you don't play a role anymore, whatever you were doing. Because maybe I'm gonna do an upgrade on you; you're gonna upgrade the RAM in your box. I'm gonna shut you down, upgrade the RAM, and bring you back up, or something like that. Then you would cordon the node off so that the scheduler doesn't try to continue to schedule things onto it. So it's like maintenance mode? It's like a maintenance mode. Or you're doing forensics, or all kinds of things, but you're basically taking it out of commission for a little while. And it's called cordoning a node. And then when you're done, you uncordon. You can cordon nodes one, two, three, and it does all three of them; uncordon nodes one, two, three. Uh-huh. And some people will do a drain first, because I can cordon the node, take it out, but not have drained it first. Actually, I think cordon may drain the pods and take you out of commission. But no, it has to be both. The drain is kind of cool, so you can get the pods off of it onto another node. Based on what your deployment said: do you have something that said this has to be alive and running? Because we just say here, if you killed this, if you drained that, it's not coming back. Well, that's actually the deployment, because you said you wanted to have three running. But if you just said kubectl run nginx, and I spawned off an instance, and I didn't tell it that I wanted a replica set that always makes sure something's running, then if I were to drain it, it wouldn't go anywhere else, because the scheduler was never told to monitor that. And then uncordon.
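The cordon, drain, and uncordon workflow described here can be sketched as commands (node names are examples):

```shell
# Take node-1 out of scheduling before maintenance:
kubectl cordon node-1                      # mark unschedulable; existing pods keep running
kubectl drain node-1 --ignore-daemonsets   # evict pods so their controllers reschedule them elsewhere
# ...upgrade the RAM, patch the OS, do kernel updates, reboot...
kubectl uncordon node-1                    # put the node back into the scheduling pool
```

As the discussion notes, a bare pod with no controller behind it is simply evicted by the drain and never comes back; a Deployment's pods get recreated on other nodes.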
And in fact, inside your deployment manifest, you tell the system: hey, when you do a rollout, when I want to upgrade things like this, here's the percentage that must always be available. So if I've got seven nodes in my cluster, I can take down two at a time, because I've told it that I want to make sure I've always got, say, 33% of my resources available. I think it asks for a percentage; I don't remember exactly. Well, what's your question again? Which is the same? Apply the patch. But if you had to upgrade the operating system, you've got to do the same thing. Particularly if you've got to take it down and do a reboot, then you'd want to drain it, cordon it, apply your updates to the operating system, patch it, do your kernel updates, and reboot. When it comes back online, it would then be added back to the cluster. All through Kubernetes. Yeah. Look at your service. Is your service up and running? Hello-world, there on 80. 80, and not HTTPS; you clicked on the 443 one. The name of it is hello-world. For the request, we're gonna put that in, and we need to select the port and target that reference the hello-world service. That's from when we created hello-world: we also created a service, and that service is a node port. So in service discovery, ingresses, we click create, set the request host, which is the public IP of the host, and then just pick the target service, hello-world, on port 80. We'll then have that ingress, hello-world, serving on our host. And the beauty about that ingress, by the way, is we chose port 80, because it's an easy one, but you could have chosen 32767.
You could have created a whole bunch of things, but you would never want to tell somebody a port number to get to your website; it would never be a good idea. But the ingress, as you saw in that definition, told me which port to use and the name that I picked, and so it matched the two. And I said the path was slash, so I just gave the URL with a slash, and it did all the translation behind the scenes and took me to that location, so I don't have to tell people about port addresses. Well, we're doing the routing for you there, so you don't need to do any other routing; we're specifying it all right there. No, that doesn't make sense. What's the question about the service? Okay, so you specified blog right there. That's the web server right there, okay? You would create another rule and you would point it to the new location, like a different path on a web server. You're just trying to specify it on the same one. Well, I guess the question then is, it wouldn't matter to us: as long as your web server, whatever this is pointing to from the original one, knows how to handle the address, we're gonna send it to it. We could point it to the same one; you could point them all to the same URL. Well, that would be a problem. Well, I guess it depends on where that packet goes. Does it come to here first, or does it go somewhere else first? If it comes here first, then we're gonna follow the path here that's been defined. So the real question will be, where are you defining all these other paths, and would they come here first? And that's what we're doing here: we're specifying a path that isn't defined on your web server. It's just something that we're creating, so that when you type in the path and specify something that matches here, we do an automatic redirection to whatever you select in this dropdown: the target service. Sure, I guess so.
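A path-based rule like the one being discussed corresponds to an Ingress manifest along these lines; the host, the second path, and the second service name here are illustrative, not from the lab:

```yaml
# Illustrative Ingress: one host, two paths fanned out to different services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-world
spec:
  rules:
  - host: hello-world.example.com   # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello-world-np    # the NodePort service created earlier
            port:
              number: 80
      - path: /blog                 # hypothetical second rule, as in the Q&A
        pathType: Prefix
        backend:
          service:
            name: blog-svc          # hypothetical second service
            port:
              number: 80
```

The ingress controller, not the web server, matches the incoming path and forwards to the chosen target service, which is why the backend doesn't need to know about the /blog path itself.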
Yeah, and that's what I was saying to him: it all depends on what his web server has been told. Correct. And the reason we're doing this at all is because your choices are: if I have 27 web servers that I wanna provide access to, do I have 27 IP addresses that are unique to the world that they can all point to? Or do I wanna have one URL with 27 different slash paths pointing to each of the different ones? Or do I wanna have a load balancer that knows about 27 different public IP addresses that I'm going to point to? How am I gonna route between them? This approach requires only one IP address, whereas a load balancer says I've got to have a unique address for every single exposed service. Question back there? So one of the things he was talking about is how we rewrite that URI that comes in. If we go back to that CNCF chart of like 200 projects, guess how many ingress controllers there might be? Way more than one. And so the capabilities that you're gonna have will be different based on which ingress controller. Here we're using NGINX, but you could use a whole host of others, and they would have different capabilities. You can see here, we went from clusters to projects and namespaces, and then we clicked on create project and named it stateless-wordpress. There are also really cool benefits to this. You can actually assign users to projects. So, for instance, if you've got a project for team A, you can give users access to just that project inside of that cluster, so it allows them to work in it without a whole lot of other RBAC across the individual namespaces. So if we click create, we will now have a project listed: stateless-wordpress. If you click create namespace, we're also creating a stateless-wordpress namespace. You can do some things like container resource limits; you can actually limit namespaces to a certain amount of resources, but we're just gonna leave everything as default and click create.
So where we should be is in the projects and namespaces of your downstream cluster, your custom cluster, and you have a project that should say stateless-wordpress, and within it you could have a namespace called stateless-wordpress. This is specific to Rancher, not part of upstream Kubernetes. Yeah, projects are a concept that we added, but upstream Kubernetes, like if you were to spin up a vanilla Kubernetes cluster with kubeadm, they have hierarchical namespaces now, but yeah, this portion is specific to Rancher; they essentially took that idea and made it their own as well. So we're now going to add a repository. In that catalog, there's a whole bunch of applications that were listed. So if we come into Rancher, the first thing we'll do is go into that left menu and click on apps again. So apps, and then repositories. With RKE2, when you create a downstream cluster, these are the repositories that automatically get pulled in: you have the Rancher charts and then the partner charts for apps. And if you click create, we're going to name this rodeo, and then we're going to come back to HobbyFarm and copy some stuff. So we have this index URL. I'm going to select that, come back to Rancher, paste it into the index URL field, and make sure that we have the exact URL, with the index right there. And that's all we need to do; let's double-check. Yep, and then step four, we'll click create. So now we have another repository in there, and what we should be able to do, let's make sure that this is the step we want to go to, is go to apps in Rancher, then charts, and we should actually get some more options. This is where I like to specify: come into this filter, and you'll see the all selection; deselect that, and then go down to, if you named that repo rodeo like we just added, select that, and then we actually have a whole bunch of charts that are in that repository.
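Outside the UI, adding a chart repository is the same idea as a plain Helm repo add; the repo name and URL below are placeholders standing in for the HobbyFarm-provided index URL:

```shell
# CLI analogue of the apps > repositories > create step.
# "rodeo" and the URL are placeholders for the lab-provided values.
helm repo add rodeo https://<charts-host>/
helm repo update
helm search repo rodeo   # lists the same charts the Rancher catalog filter shows
```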
So now, if you have your own repository that has Helm charts in it, you can reference it and have them easily deployable from here. So for instance, you know, you can deploy Tetris; I've had people deploy Quake into their clusters while we've been doing these, and they've been playing games. These are all applications, and they all have Helm charts behind them. So make sure that you have this rodeo filter on, if that's what you named the repo, because we're gonna be using this version of WordPress. If we select that, we should see that the README page doesn't really have any chart information, because this is all managed through us. And so if we click on install, we have some information. So step one: we went to charts, and then we selected the WordPress app under the rodeo repo, in the install wizard. So then we should be able to select installation into the stateless-wordpress namespace. In step one, you can select what namespace to deploy this into. There's a whole bunch of namespaces, but we have this stateless-wordpress namespace, which we created previously. Click next; we're now on a form that has some information we need to set. Let's make sure that we set that. So in the WordPress settings, we're gonna set a user and create a password. And then in the service and load balancing settings, we're gonna set a host name. This is coming from the Helm chart values, and the installation wizard is something that we maintain for these fields. Otherwise, if it's just from another repository, it would just look like the raw values. So this is what it would look like otherwise, but yep. But because we have the fields set up: WordPress user, WordPress password. You never know when you need an extra one every now and then. That makes sense. So now we have the WordPress settings selected.
We've given a password to this WordPress application. And then in services and load balancing, we're gonna change the host name. The host name is going to be wordpress dot the public IP address of the node. Once it deploys, it creates that WordPress, serving on that host. Cool. So if you click create, or click install. What's the default WordPress user? It would just be user; that's the default one, just user. But if you come back into the chart details, it will show you what that looks like. We've gotta wait a second for it to deploy anyway. So we set the WordPress password, and we changed the host name to that WordPress IP address. We have this shell popped up; we can actually close that, go through apps, into installed apps, and we can see the WordPress currently being installed. Another thing we can note: we can come over to the workloads, and if we filter projects by stateless-wordpress, we now have one deployment, one stateful set, and two pods. That stateful set is the MariaDB, and then the deployment is the WordPress piece that we actually created. So as you can see here, the instructions are set: go to service discovery, then ingresses, and you click on the URL. So here's that target, that's the ingress. Step six, you can log in as that user on the host. We've got the Hello World post; it was published. But let's make a new one real quick. We've talked about pods; they're meant to be ephemeral. And we didn't actually attach this, I should say, to any kind of persistent storage. So if we were to redeploy the database that holds those posts, holds all of the application state, we'd lose them. So if we come into Rancher and we actually go into workloads, redeploy this, and then we come back to this.
We thought we had this beautiful, wonderful blog post, and once this comes back up, we would actually get a WordPress application with no information on it. You'd have to start from scratch. You'd lose all your settings, your user settings; it basically becomes a fresh installation of WordPress. So to avoid that, what we're gonna do now is create a storage class that we can use to dynamically provision persistent storage. We're gonna use NFS, and we're gonna do that by going into the apps catalog again. Also, are there any other questions? Do we need to take a break? I've been up here for a while, I forgot to mention that. Five minutes, right? Yeah, six. Cool, so in five minutes, you can get some water and come back. So now we can leave this all as default. We click install, and everything here is actually setting the NFS provisioner as the default storage class. We previously did not have a storage class. If you install any other kind of container storage interface, all you'd be doing is creating more storage classes. And with storage classes, you know, you can add things like: do we want the volumes to be encrypted? So yeah, there's a whole bunch of different storage class options when doing that. But we should now have the NFS provisioner. So that means we should be able to look under storage, storage classes. So storage, storage classes, and we have NFS. Check, check. All right, that's much better. Okay, cool. So as we wait for that to install, let's also go look in the service discovery and ingresses, and I'm going to close out of the shell, because that's just giving us some information. And we have a stateful WordPress application. We've seen this before; we should all be familiar. It's just waiting to come up. I might have to refresh the page, or let's go check in on those apps.
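What a default storage class buys you is that a bare PersistentVolumeClaim gets a volume provisioned and bound automatically; a minimal sketch, assuming the NFS provisioner registered a default class and using an illustrative claim name and size:

```yaml
# With a default StorageClass in place, a claim like this is enough for a
# volume to be dynamically provisioned and bound. Name and size are examples.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-data
  namespace: stateful-wordpress   # namespace name assumed from the walkthrough
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  # storageClassName is omitted, so the cluster's default class is used
```

This is exactly why the stateful WordPress chart works without any storage configuration: its claims fall through to the default class.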
So if you come into apps, installed apps, we're waiting for that WordPress application to be installed in the stateful-wordpress namespace. Does anyone have any questions? Is everyone keeping up fairly well? Cool beans. Oh, one thing I would like to note, I forgot to mention it, because I think this is really cool: if you go into the pods section under workloads, you can view the logs, I showed you that earlier, but you can also open a shell to run commands right in the pod's container. So you're in the container, actually running commands. Pretty cool, and, you know, just another way to have access to all of your infrastructure. So, just waiting. This is taking a second. Container's not ready. Oh, we should also note that there are going to be persistent volumes created within the persistent volumes section of storage. So if we go to storage, persistent volumes, we now have these PVCs for WordPress. And it's using, again, that NFS. And this was all created because we simply had a default storage class that this referenced upon creation. Yeah, exactly: we deployed the NFS server into Kubernetes. Obviously, that's not best practice; you'd want it outside the Kubernetes cluster, but you could reference it via settings and set it up so it's external to the cluster and so forth. Cool. So we've got the persistent volume that was spun up, and we've got claims that go with it. So let's go ahead and keep going. Let's log into the WordPress and create a new blog post. We're going to select this stateful-wordpress section, or, sorry, the stateful WordPress IP address. We're going to log in as user, cool. And we're going to create a new post, stateful, and publish. So now, if we were to delete the WordPress MariaDB pod, or click redeploy, we should see a different interaction. We'll view this post here. So we have this up and running right there. I'm going to go now to the workloads.
I'm going to go to Pods. And if we're in the stateful namespace, we should see this pod here that's referencing that database. So if we redeploy this — actually, let's do it this way. That is a StatefulSet, right? Yeah, it is. Sick. So what we'll do is delete it. That's going to go down, but because it's a StatefulSet, it always wants to be up. I just killed it — you know, somebody accidentally went in and killed it — but there are controllers monitoring that via the Kubernetes API, and they say this actually needs to exist. So it immediately spun it up again. What we should see is this go down briefly, but when it comes back up, it still has that reference to the persistent storage. I deleted not the volume, but the pod that interacts with the volume. Does that make sense? I didn't actually delete the persistent volume; I basically redeployed the portion of the WordPress app that takes care of that data storage. Well then, yeah, then you'd lose your data. That's gone. Good question — the question was, if you deleted the persistent volume, what would happen? You'd lose the data. You'd have to have some sort of backup, and there are tools for that. For instance, you could install Longhorn — another open source tool we offer — and through its backup volumes feature you'd be able to back it up to an offsite storage data store. We try to offer pretty much all the tools, all the nuts and bolts, for a Kubernetes cluster just within this page. So, yes — if we go back to this, it says database error, then it's back up, and we actually still have that persistent storage. Oh, question. Yeah, so the thing is, you'd likely want it to be either on a different node — you could do it through etcd — or you could have a server spun up outside of your cluster that it references. Like S3 buckets or something, anything along those lines.
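That recreate-on-delete behavior plus stable storage comes from the StatefulSet spec. A hedged sketch of the relevant shape — names and sizes here are illustrative, not the chart's actual manifest:

```yaml
# Illustrative StatefulSet fragment: volumeClaimTemplates give each
# replica its own PVC, which survives pod deletion and is re-attached
# when the controller recreates the pod.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: wordpress-mariadb
spec:
  serviceName: wordpress-mariadb
  replicas: 1
  selector:
    matchLabels:
      app: wordpress-mariadb
  template:
    metadata:
      labels:
        app: wordpress-mariadb
    spec:
      containers:
        - name: mariadb
          image: mariadb:10.11
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 8Gi
```

Deleting the pod leaves the PVC in place; deleting the PVC (or the PV behind it) is what loses the data.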
There are ways to set that up. But yeah, for a highly available environment, if you had, say, three nodes taking care of your storage, with all of those referenced there, that'd be highly available — because if one went down, it could just come back up, and you could have replicas of the persistent volumes across those nodes, so they could be referenced from any of the other nodes, not just the one. Does that make sense? And all your major storage providers have drivers or plugins — CSIs — that plug into Kubernetes. So if you've got Pure Storage or something like that, some SAN somewhere, they have drivers that plug in and give you your persistent volumes, so you can just make claims as you need and it automatically selects, builds, creates, and goes. Yeah, as you can see here, you have Rook, CubeFS — these are all CSIs in the cloud-native landscape. So, you know, name-dropping, but Longhorn's right there. A lot of people have some already built. Yeah, pretty much — these are the data pieces. So, that's cloud-native storage, or there's networking — again, I'd reference containerd — and as you can see, these are the big projects, the graduated ones. Cool. I have one more question. Yeah, so if you wanted to kill the StatefulSet — there's that. I'm trying to think: when Rancher deploys something via Helm, it usually brings it back. Yeah, that's how — we just killed it. There are some charts referenced in manifest files of the Kubernetes cluster that will just never be allowed to come down — you're not allowed to delete them, because it would break the cluster. But I'm just killing those StatefulSets right now, and this should never come up again. We'd have to go reprovision the app, or redeploy the app, and yeah.
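The replicated-storage idea can be expressed as a storage class parameter. With Longhorn, for example, a class along these lines asks for three replicas of each volume spread across nodes — the parameter names are Longhorn's, but treat the exact values as an assumption:

```yaml
# Sketch of a Longhorn StorageClass keeping three replicas per volume,
# so a volume stays available if one storage node goes down.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-replicated
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"      # one copy on each of three nodes
  staleReplicaTimeout: "30"  # minutes before a down replica is rebuilt
```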
But you could, in theory, redeploy the app and then reference that volume, because that volume still exists, right? Yeah. I mean, best practices in regards to deleting your application? Well, you have to delete all of it. So you delete the StatefulSets, or the ReplicaSets, or whatever your deployment created. Or, if the Deployment created it all, you go in and delete the Deployment — it knows all of its resources to delete, to get rid of. So, get rid of your deployment. Let's continue on. We've demonstrated that we can get that storage back up, and pretty much the last bit is we're gonna upgrade the Kubernetes cluster. What that looks like is: you come into this hamburger menu, click on Cluster Management, then go into this ellipsis — or kebab — menu and click Edit Config. Just to reference, oops — just to reference this, you can do things like take snapshots of the cluster in its current state; you can restore from a previous snapshot; and like I said, the rotation of certificates is right here. But what we're doing here is an upgrade. We're just gonna edit the config, change the Kubernetes version from 1.23 to 1.24, and click Save. We should see this go from current — it's not available right now, it's updating — but we'll just wait for that to come up. So I'm gonna click back into Cluster Management and wait for this update. It's already stating that it's on 1.24, but it's just gonna give us some of those provisioning logs. I didn't take a snapshot — I should have, because then it would be in one of these resources here, let me see. And note that the snapshot is only of the etcd database; it is not backing up the WordPress database or the storage, it is only etcd.
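Editing the config in the UI boils down to changing one field in the cluster spec. Roughly, for an RKE2-provisioned cluster — the exact version suffix is an assumption:

```yaml
# Fragment of a Rancher provisioning cluster spec; the upgrade is just
# bumping this field and saving. The patch/revision suffix shown is
# illustrative, not the exact version from the workshop.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: my-cluster
spec:
  kubernetesVersion: v1.24.17+rke2r1   # was a v1.23.x+rke2r1 version
```

Rancher's controllers notice the changed field and roll the nodes to the new version.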
Once this comes up, I'll show you — it's upgrading. Yeah, because there's gonna be a list of resources; the snapshot would be stored in one of these resources. I wonder if we can go into the local one and check. I'll have to double check, but let's go into Cluster Management and wait for this to come back up. Cool — are there any other questions? I would note that there are actually ways to do this programmatically in newer versions of Rancher using things like Cluster API, which is an upstream tool. We've created our own version of it called Rancher Turtles — because turtles all the way down, because Kubernetes all the way down. That's available as of 2.8.2. So again, this Rancher Rodeo environment hasn't been kept up to date — we're on 2.7.3. But did I just click Update again or something? Mm-hmm. Yeah, we do have that ability with Terraform, and then also Cluster API — CAPI. So that's something to look into. Let's see — "Turtles all the way down." Funny what shows up first in the search. Nope — Turtles, there it is. With Turtles right here, you can go in, follow along with the documentation and the README, and actually do it programmatically. Yep — it teaches you how to set up the Rancher server itself, installs an operator, and so forth. So if we come back into here — did I just accidentally kill this? We'll wait for this to come up. Are your clusters still configuring, or is it just mine? It was ready for a second and then I accidentally pressed something, so I think I messed it up. Still updating, cool. And now mine is running and active, cool. So this is what the snapshots look like in the UI. I just clicked Create Snapshot, and if we refresh, we should have a snapshot we can reference. There it is — this one was just created. We can restore to it, et cetera. So that's where those snapshots are located.
And then, yeah, are there any other questions? That's pretty much it — we've reached the end. If you go to this next page, don't click Finish, because it will tear down everything. And if you already have — I'm sorry. But this will stay available; you can still mess with it, go in and install Quake or Tetris, and we can just mess around, have some free-form Q&A. That's kind of what we planned for the rest of it. So yeah, and we can come hang out with you down there. Thanks for listening to me talk for three hours. Thanks. And Brian too, obviously. If it gets stuck in reconciling? Ooh, you could probably just restore it to a backup. If you go into the local cluster, there are some extra tools just for the local cluster in that apps catalog. If you go into Charts and look in here, there will be Rancher Backup. Let me see — yep, Rancher Backups. This is a way to back up the entire Rancher instance. So — oh wait, this isn't for Rancher going down. If the cluster was stuck reconciling, you would basically tear it down, and you should be able to follow the snapshot documentation to bring up that cluster referencing that snapshot. Does that make sense? And like it says here in the docs, you can do it locally on the etcd node, which Brian mentioned earlier, or you can go to a backup store that's S3-compatible. There are also ways to set up recurring snapshots, but yep, cool. Any other questions? Yeah, yeah, we do. So as it stands, you have to use Rancher to create a Harvester cluster: you essentially boot that ISO on some nodes, and then you can import that cluster into your Rancher server, so let me — yeah, you would just be adding another layer, yeah. Yeah, say that last bit again? Yeah, exactly.
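Recurring snapshots with an S3-compatible target live in the same cluster spec we edited for the upgrade. A hedged sketch — endpoint, bucket, and credential names are placeholders:

```yaml
# Sketch of recurring etcd snapshots shipped to an S3-compatible store,
# as part of an RKE2 cluster spec. All values here are placeholders.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: my-cluster
spec:
  rkeConfig:
    etcd:
      snapshotScheduleCron: "0 */6 * * *"   # every six hours
      snapshotRetention: 10                 # keep the last ten
      s3:
        endpoint: s3.example.com
        bucket: rancher-etcd-snapshots
        region: us-east-1
        cloudCredentialName: my-s3-credential
```

As noted above, this protects the etcd state only — application data like the WordPress volumes needs its own backup tool (Longhorn, for instance).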
So if you create a Harvester cluster, you'd have those bare-metal nodes, and you'd import it into — what was it — Virtualization Management, right here. As you can see, I don't actually have one; it's just gonna say run this command in the Harvester cluster, but you'd be able to create VMs, et cetera. The cool part is you can then create more clusters from that Harvester cluster. So it's like using Kubernetes to make more Kubernetes — similar to vSphere in that sense, but yeah. You can use that Cluster API to bring up bare-metal clusters. Yeah, so you can use it alongside Rancher. And this is our Terraform provider — it just had a release last week. So, with the production stuff, you essentially would just have more resiliency. Let me unplug this briefly, and I'll show you what a more production-ready Rancher would look like. For instance, the local cluster here has three nodes, and best practice for Rancher specifically is a three-node cluster with all roles — that's what we prefer. There are exceptions: I was just on a customer site, and they had to deploy two extra worker nodes in their Rancher cluster to manage the Grafana application, because they have so much data aggregating through it. So that's more production-ready. You'd have things like a Vault for secrets. You could have a registry like Harbor for your own container images, so you wouldn't be pulling from a public repository. But the big thing is, if you wanted to create a downstream cluster — so if I went into this, let's see, and wanted to create, say, a downstream cluster in Amazon that would actually handle the workloads for production.
Let me see — what creds are these? So this is referencing my cloud credentials in AWS. Huh? Yeah, I was — thank you. Sorry, apparently it doesn't automatically mirror; it just wants to goof me. I'm just sitting here talking. Oh — you know, thank you. So, for instance — I wanted to first show you the local cluster. We have three nodes here with all roles; that's best practice. And then you'd obviously want a DNS, because you don't want to expose the external IP address like we are with the HobbyFarm. But let's go back, and I can show you what deploying a cluster looks like. That was just the Rancher deployment, and best practice there is three nodes with all roles — but if you want a production-ready downstream Kubernetes cluster, I can show you what that looks like. You can spin it up in moments using RKE2, and this is going to use EC2 instances. So I'm going to deselect that, and I'm not going to make it 31 nodes — that'd be crazy; your bill would be insane. I'm going to leave the Rancher VPC defaults there, name this "test cluster", and add a pool. So now this is going to have three separate pools. I'm going to create a pool for control plane nodes — and see over here how it has these little notifications? That's saying: hey, this isn't highly available, because the control plane doesn't meet that standard with just two. But if you add another pool — oh, I need to select a VPC; I'll name this one too, copy that — and then we'll make our data pool. You could actually combine this with the control plane, which technically would meet that standard, and just click Create. It will sit there for a second. So this is version 2.8.2, and we have extensions in here. I'm going to let that spin up, but if you come in, we have extensions like Elemental installed.
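The pools the wizard builds correspond to machine pools in the cluster spec. A rough sketch of the shape — three nodes per role for quorum and availability; the per-pool EC2 machine config reference is omitted here, and all names are illustrative:

```yaml
# Rough sketch of the node pools behind an RKE2 cluster on EC2:
# separate pools per role, three etcd nodes so leader election
# can find quorum if one goes down.
apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  name: test-cluster
spec:
  kubernetesVersion: v1.24.17+rke2r1
  rkeConfig:
    machinePools:
      - name: control-plane
        quantity: 3
        controlPlaneRole: true
      - name: etcd
        quantity: 3
        etcdRole: true
      - name: worker
        quantity: 3
        workerRole: true
```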
So you could go manage the OS — you can create your own operating system that's been containerized specifically for Kubernetes, right there. You could use Harvester there. You could also install the CAPI operator into this cluster. So it's pretty sweet. But let's go back to home. We obviously have a lot pending right here — this is the cluster I just barely started — and we'll see how long it takes to come up, but it should be fairly quick. That will be a representation of what a production-ready cluster would be. And then, yeah — because it's leader-election based, finding quorum. Yeah, cool. And then for day-two operations, it would be things like installing monitoring, and an ingress like NGINX. No, that's literally how you would do it in a production cluster. Yeah, other than there are other things you can do inside of Grafana — like I said, you could federate it and have a whole bunch of clusters all feeding into one monitoring application. But pretty much, just following along: you'd install Longhorn, so you have a CSI and a default storage class; you'd install OPA Gatekeeper to manage policies, so people have the right access to the cluster, et cetera. Things like that. So, first things — you can always reference our documentation if you want information on — are you asking about getting a production-ready Kubernetes cluster going? Anything from kubernetes.io; Reddit's Kubernetes threads are always good; we do blog posts at SUSE on what kinds of tooling we use. It's all over, man. Yeah, so. Oh, and if we go over here, we should have a cluster spinning up — unless my credentials are wrong, which they might be. It's been a second; I was just managing my IAM on AWS.
So I might have broken this. But yeah, cool. I mean, again, docs links are right here. You'd use things like container security — all this stuff is essentially more day-two operations. The big thing I like to reference is continuous delivery with Fleet. Let's go here — if we come into this global tool, you can create Git repos, and the moment you add a label to a cluster, it'll automatically take these repos and deploy them to that cluster. And that helps with getting up to speed on day-two operations specifically, because you can just create a whole bunch of Helm charts, and the moment you update one, it will be picked up by Fleet. On top of that, the moment you add a label to a cluster in Continuous Delivery, it will deploy all of the applications within these repos to it. So — oh, and if you're looking for an NVIDIA GPU operator, we actually do support that on RKE2. So you can spin up your AI/ML workloads on RKE2 clusters if you have GPUs in those nodes. And that operator works with the toolkit — CUDA and everything — and gets it going. Yeah, you'd have to install the operator, but yeah, exactly. So, yep — yes, yes, in the HobbyFarm, yep. Yeah, yep. But again, if your flavor is Ubuntu or something — we like to be interoperable — you can go ahead and deploy on there. There's a support matrix for which ones are supported: for this version of Rancher, it says you can deploy on these versions of Kubernetes, these distros. Then you'd look at those Kubernetes distributions — so for instance, if you wanted to deploy RKE2, these are all the Linux distributions (not repositories) that we support.
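The label-driven deployment Fleet does can be sketched as a GitRepo resource — repo URL, paths, and labels here are placeholders, not the workshop's actual repos:

```yaml
# Hedged sketch of a Fleet GitRepo: any cluster labeled env=dev
# automatically gets everything under the listed paths deployed to it.
apiVersion: fleet.cattle.io/v1alpha1
kind: GitRepo
metadata:
  name: workshop-apps
  namespace: fleet-default
spec:
  repo: https://github.com/example/fleet-examples
  branch: main
  paths:
    - charts/
  targets:
    - clusterSelector:
        matchLabels:
          env: dev
```

Push a change to the repo and Fleet reconciles every matching cluster; add the `env: dev` label to a new cluster and it picks up the whole set of apps.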
So you can deploy on Ubuntu, Rocky Linux, RHEL, CentOS, Oracle Linux. Cool, yeah, you're welcome. Oh, and that cluster we just brought up — yep, this one is all in the cloud, all in AWS. And one cool thing about that is you can just come in here and scale nodes up and down, instead of having to go add one manually and then run a command — mm-hmm, yeah. Yep, and this one's received its IP address. If you provision the nodes through Rancher, you can actually SSH into them and start running commands just from within Rancher. Again, you can also go to the cluster itself at the top; once this is deployed, you can go into that Explore button and run things in the kubectl shell. But, yeah, you're welcome. To use predetermined groups in Azure? So we use — yeah, what? Why? Right now? Yeah, it's on the HobbyFarm. Oh, no. We could just try signing in and registering again, and then you could technically have two VMs provisioned. Does that make sense? Yeah — because then you could run that curl command on both of those and add them to the downstream cluster. So, all of these users are referenced in OpenLDAP. And a downstream cluster can have cluster and project members. So you can add people as individual users, and you can also add them in groups — I just can't remember what the names of the groups are right now. So when those people sign in, you could make their permissions so scaled down that they could only see that one cluster, or you could have your Kubernetes admin that runs everything see all of them. I run into folks who manage all of it and see all of it — the local cluster and the downstream clusters.
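Under the hood, Rancher's cluster and project membership maps down to Kubernetes RBAC. In plain-Kubernetes terms, scoping a group to a single namespace looks roughly like this — group and namespace names are placeholders:

```yaml
# Plain-Kubernetes sketch of the scoped-access idea: members of an
# auth-provider group get read-only access to one namespace only,
# rather than cluster-wide rights.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-view
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-devs          # group name as surfaced by OpenLDAP/auth provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view                   # built-in read-only role
  apiGroup: rbac.authorization.k8s.io
```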
And then what they do is, for some other teams that want to deploy their applications, they give them access to just one cluster, or even just a namespace within a cluster. I wish I remembered what these groups are all called — LAT, AMS? Ah, see — so this is a group. You could add anyone inside that group in OpenLDAP to this cluster, and now they'd have access to that cluster. And if you look, they should already have a bunch of those groups. I thought they were automatic — in RKE1 we used to, but in RKE2 and K3s it's all automatic. Hey man, have a good one. Also, way to catch that — he deserves a chameleon, do we have any? You do? Yeah. No, yeah, I am. Yeah. Well, that, and he also corrected me — I almost messed up and had them run a command on the Rancher node. Yeah, that guy's going places. That guy's going places. Yeah, so the control plane does have — it does. Did you guys know where I came from? Dude, Monday morning, fly out from Salt Lake. I went and hung out at Rivian with Jeremy — was there until Tuesday night. Went there at like 8 a.m., didn't stop hanging out with the Rivian folks until like 9 p.m. Got out of my hotel at 4:30 a.m. on Wednesday, flew to Salt Lake, back home, went and fed my cat, got back on a flight at 9 p.m. in Salt Lake, got here at one, and then just scrambled to get here. Went to sleep, and woke up late — I was trying to get up early. Cameron was messaging me like, hey man, can we talk? I'm like, I'm not functioning. And now I'm here. Rivian — you made it. Yeah, dude. Rivian is sick; I test drove there. Yeah, I was just gonna ask you guys — let's get to the rest.