Good morning, everyone. We've got a couple minutes till we get started, but before we do, feel free to enjoy some of the coffee in the back, brought to you by Databricks. I believe they wanted me to call it Data Brew. If you're feeling especially caffeinated after it, feel free to tweet at Databricks. So thank you for the coffee this morning. I don't believe they're in the building with us today, but they're big fans of SCaLE and the local community here, and they wanted to help out. So thank you. So good morning, everyone. We'll get started in just a moment here. Again, if you're in the room and would like some coffee, courtesy of Databricks in the back of the room, you can grab yourself some of that Data Brew. But also just a couple quick reminders. One is, while we don't necessarily require speakers to wear masks on stage, because we want you to be able to see their faces and their smiles and their gestures, we do ask that if you're not actively drinking coffee, you please wear a mask. Just help keep us all safe and healthy. Hopefully by next SCaLE this won't be a thing anymore, but for this SCaLE, it's great if you could. Awesome. So thanks again to everybody that came out this year. It's been the longest break we've ever had between two SCaLE conferences in the 20 years we've been running them. We started SCaLE, I think it was November 2nd, at the USC campus 20 years ago, in, sorry, 2002. Sorry, I should have had some more of that Data Brew. But anyways, in 2002, we started SCaLE as an opportunity to bring the local community and the LA area together. LA had something like 20 LUGs all over the place, and often we didn't get to see each other. This was our one time a year to get together in one place and learn together. We also thought it was an opportunity for us to bring in all the folks that we normally don't get to see, the amazing open source developers we only hear about on the internet, who didn't make it down to the area. And we thought, if we create a conference, they'll have to come. It doesn't work that way. It turns out it's a lot more work than just saying, we have a conference, please show up. But it seemed to work out, and we've been having a good time for about 20 years. We did it every year, up until and including the start of the pandemic. That was the last time we were together, and now we're back two and a half years later. So thank you for coming back for the reignition, or the re-kickoff, of SCaLE. We're super excited to be back here, and we're glad to see, well, I don't get to see your smiling faces because of the masks that you're all graciously wearing, but I'm sure the smiles are behind them. So thank you. Now I have some great news for you. You don't have to wait two and a half years for the next SCaLE. Well, I guess I don't know anything for certain, but I do know that we have signed a contract for a venue in six months, back at our home in Pasadena at the Pasadena Convention Center. We are very appreciative to the LAX Hilton, who, when we called them in a panic back in March saying, Omicron is coming up and taking over, can we please move our conference to July, made the space for us and welcomed us back after probably about seven years that we had not been here. So it's a little bit of a throwback to be here, but we're glad to be here. So March 9th through the 12th at the Pasadena Convention Center; you can mark it in your calendars right now.
I think as soon as we are not tired from this event, the CFP and registration will open. My guess is probably the next week or two. Yes. So the other thing I want to point out is that everything that you see around here is done by volunteers. Nobody does this as a day job. Usually it's a, you know, midnight to 4 a.m. job. And so if you see something that you don't like here and you'd like to make it better, much like in your favorite open source project, please come make a PR, come be part of the conversation. If you see somebody that looks tired and you want to help them out, please come volunteer to help them out. This is a conference that's put on by the community. It's not something that's put on by vendors. So again, thank you for being part of our open source family. It's really good to see you all at our annual reunion here again. And yeah, we're excited to kick off here. So just a couple administrative things. One is, remember, we have game night tonight in the Plaza Ballroom. That's a great opportunity. A lot of conferences run big parties with lots of booze and drinks and other things. We want to be family friendly; usually there are lots of kids running around SCaLE, and not everybody's here to just hang out at the bar. So we love lots of games. We'll have a casino night, we'll have a bunch of escape rooms, and there's a hotel-wide escape room slash scavenger hunt that I am very curious to see how it works out. I say that because I think Luis will do an amazing job. Game night is the one thing about SCaLE that I don't get involved in planning, and it's a surprise to me every time, and it's always done amazingly. So I encourage you to join for that. We've got some great sponsors behind that. There's also a live show by MC Frontalot. If you want to hear somebody rap about lots of geeky, nerdy, Linuxy things, he'll be down there as well doing a show. And then, of course, today we've got an amazing keynote in just a minute, but tomorrow we'll have two keynotes, which we don't get to have often. In the morning, we'll have Demetris Cheatham from GitHub to share a little bit about how we build more inclusive communities. That's, I think, a very important topic as we try to regrow and bring the SCaLE community back, but also, you know, the open source communities that you're all a part of at home. And then we'll close with Vint Cerf in the afternoon as a closing keynote. One of my favorite parts about SCaLE, beyond seeing all of you, my friends and family, is that I get to introduce some of my open source heroes on stage every year as we bring them on. And who's more of a hero than the guy that created TCP/IP? None of us would be here without that. But with that in mind, I'm stealing much of Aeva's time, so we'll go ahead and get started here. So a few weeks ago, Aeva and I connected on the internets, and we had a conversation, and I said, you know, we've had an amazing year of just, like, insane headlines about open source. Which is, you know, 20 years ago, when we started SCaLE, open source was like the rebel in the room. Like, what do you mean you want to use this? I talked to my university about it, and they'd be like, no, go use Windows in the lab or something, this is not a thing.
Now, it's the default. But it means that we're under the spotlight. And so we started talking, and I thought, is there something that we as a community should be doing differently to make sure that we're not making the headlines all the time, and that we're making better headlines? Yes. And Aeva graciously agreed to come share with us some of their experiences and some of their thoughts on how we can improve. So with that, without further ado, I'd like to welcome Aeva to SCaLE and to the SCaLE family. They'll be sharing with us a little bit about their experiences in open source security. So thank you, Aeva. Thanks. Good morning, folks. So there are a couple spots in here where I'll ask questions. Feel free to just shout out answers. Keynotes aren't usually interactive, but I do love an interactive talk, so you're welcome to. With that, I'm going to jump right in and test if you were paying attention with my first question. Here's a screenshot of a headline. What year do you think this was published? Shout them out. '99? I heard '99 over there. Can I get another answer? 2006. Awesome. How about 1993? Wired magazine. I remember this. I was learning how to rebuild computers and tinker with stuff back then. It's talking about the cypherpunk mailing list and this movement of people who were all interested in freedom and encryption. I thought that was really cool math, because I was a math nerd. Next test. How about this headline, a press release from the White House talking about cyber and open source? What year do you think that was from? I heard a whole lot of answers all at once there. Love it. '96 over here. I heard 2010 over there, I think. No, a few months ago. So how did we go from the cypherpunks and crypto rebels, people talking about this little counterculture movement of free software geeks, to the White House caring enough about this to call press conference after press conference, and hearings in front of Congress talking about open source supply chain security? Yes, there have been several of those just this year. Both groups of people are asking the same question: am I safe? Ultimately, I think all of us ask this question, even implicitly. If I pick up my cell phone and I want to, you know, do a transaction in my bank to pay a bill, I need to know, am I safe to do this? Is that app safe? Is my phone safe? When I get in a car, and I was driving an automatic, almost self-driving electric car earlier this week, I'm like, well, there's a ton of software in here, probably a lot of it's open source. Am I safe in that car? We're all asking this question. And how do we answer this question when open source is everywhere is what I'd like to talk about today. Open source is even on Mars, in the Ingenuity rover. There was a headline like, Linux is on another planet. That's so cool. Yes, it is. But it belies the banal reality that open source has permeated our world on Earth. It is all around us. Studies published in the past 12 months say that 85% of all smartphones are running Android, which is open source, or based on open source, or has parts of it that are; that 70% of homes have a smart home device in them, and those generally all have open source in them; and lots of our cars, our light bulbs, our washing machines have open source software built into their stacks. This has necessitated government involvement, because it is part of our national supply chain, and not just for the US, for every country on Earth right now.
And that necessitates a change in our approach as community leaders, as contributors, as companies invested in this. So I'm going to lay out a little bit of a timeline. I could put a million different things on this arrow of time; I've chosen a set of things to give you a picture of how these two events, 30-ish years apart, the cypherpunks and that Wired magazine article and Executive Order 14028 last year on software supply chain security, are connected, at least in my mind. And the rest of this talk will sort of attach to elements of this timeline. So I have to go a little bit further back in history to the creation of two foundations you may have heard of, the Free Software Foundation and the Electronic Frontier Foundation, the EFF. I'm not going to go into those too much, just to put them in temporal context; that's when they happened. Blinky text on a web page was invented 32 years ago. It's easy to forget that 32 years is not all that long, right? The Internet isn't that old. And at that time, we didn't have encryption. It was just blinky text and plain text on the wire. Does anybody in here remember the Bernstein v. DOJ case in '95? I see a couple hands. Heck yeah. I'm glad I'm not the only one. So this was a test of an individual's right to publish their own creative work. And at that time, the government said, well, if you're publishing encryption, you can't do that; we control all the encryption. I'm summarizing, basically. And it eventually went up to the Supreme Court, where the court ruled that no, it's a First Amendment right of free speech to publish your creative work, on a piece of paper or on a disk or on a website. And that gave birth to the ability for us to do open source without government oversight. Now, in the middle of that court case, two other foundations were created, the Apache Software Foundation and the Open Source Initiative. And the connection between these events is somewhere hidden in the histories of time and Supreme Court archives, but the Apache Software Foundation itself, and one of its founders, actually submitted an amicus brief in this case. That same person is briefing Congress today on the same topics. So these are actually connected by the same people doing the same work. For context, '99 is also when I dropped out of college for the second time and got a job in a dot-com startup. Because, you know, encryption and deniable file sharing and the internet and the dot-com boom, I didn't need a degree. It was fun to build tech. But a little bit later, the bubble burst. Oh yeah, the Supreme Court case finally closed that year; I was tracking it because of the work I was doing. Then the dot-com bubble burst, and for about eight years we didn't have a lot of press around open source or technology, but it was slowly building. In '08, Sun bought MySQL for a billion dollars, the first time that any open source company had reached that level of financial interest and scale. And so venture capitalists and the big tech industry really took notice. So VCs and big companies took notice of open source and began calling it a business model. In the same year, two more things happened related to open source that are pivotal to our sort of history here: Bitcoin and Android. So '08 was pivotal to this acceleration of the use of open source in our world.
A couple years later, with all of the renewed interest in open source, now being thought of as a business model, highly permissive licenses, Apache and MIT, were tried as a vehicle for enabling collaboration and economic growth, first in the OpenStack Foundation, and then in the Kubernetes project and the CNCF. Each of these created a flywheel, if you will, of startup companies rapidly building new technology, funded by VCs or big companies. Rapid innovation happened with a new funding model that we had not seen in open source until then, and Kubernetes, you know, I think we all know that's kind of exploded, and cloud native, and just the use of clouds in general. And now, with machine learning and IoT, this has gone even further, even faster. We have just this rapid acceleration of the creation and publication of new technology through open source, seen as a vehicle for commercial growth. I dare say capitalism's focus on commercial growth isn't always good for the planet. So what else happened along this time? We had an evolution of our tools for encryption, for keeping our data safe and our persons safe online. SSL in its original form had lots of issues; we now have TLS 1.2, and there's more work happening, even stuff happening now to keep our data safe while it's in memory, confidential computing or enclaves, not on this graph. But as we moved faster and faster towards commercial growth, we paid less and less attention to how communities are built and how software is secured. And in 2013, Snowden reminded us of the risks of not having encryption. Bernstein was all about our right to create encryption to protect our data; Snowden reminded us of how important that is. And soon thereafter, we had a series of major, internet-shutting-down breaches: bugs and vulnerabilities at all different scales of software, from a single-maintainer project, to a piece of core infrastructure, to a chip architecture in use by the two largest chip companies in the world. This is why governments are paying attention. It's become ubiquitous, and it breaks. So what changed in that window of roughly 30 years? We succeeded. Open source succeeded as an idea. It is a meta-tool by which we collaborate to build new tools faster, to accelerate science, to accelerate society. It's powered by two generations of growth in those areas. But with that success come new problems. Most people never notice how much open source software is around them, in their firmware, in their devices. You just buy it, you use it, you don't have to care. And any technology where you don't know how it works is essentially magic. So guess what? You're all magicians. We are all magicians here, right? We're building amazing tech that other people don't have to understand to benefit from, and we get to do it together. But as we are casting our spells, we have created a problem. We have failed to teach the rich language of our culture, so people misunderstand us when we talk about open source. Is it a verb? Do I open source software? Is it a noun? Is this thing open source? Is it an adjective? Is it a business model? Is open source a counterculture movement or just a license classification? We need to develop more precise terminology to educate policymakers, business leaders, and the next generation of contributors. We are magicians, but we are also all apprentices, downloading other people's spells and casting them without deeply understanding them. Let me tell a quick little story about hard drive firmware.
How many of you know the SATA interface API spec by heart? Right, nobody put a hand up there. That is a very old specification. It includes an API for a privileged user, root, on the host to modify the firmware of the hard drive. You might need to update that firmware to fix a bug. Cool. There's no out-of-band mechanism to verify that. So if a malicious person modifies the firmware, there's no way to ever remove that. It's a real issue; there have been proof-of-concepts of this almost a decade old now. All through our stacks, we have these kinds of things, where ancient spells lie in wait for people to find them and resurrect them and use them for good or nefarious ends. So we magicians have not passed on the safety practices necessary, and every new generation of contributors or computer users is thereby an apprentice, not able to cast the spells with full knowledge. We have to change that. The growth of our community, in number of contributors, number of projects, the amount of usage, has so rapidly outpaced our transmission of cultural norms of how to do this safely that our success is curving back on itself, like this graph. I want to give a shout-out to someone up here in the front row: IEEE's open source software product governance working group is trying to tackle some of this and create a set of definitions around open source product governance that can be adopted by everybody. So at least we have common terminology: if I jump from the OpenInfra Foundation to the CNCF to the Eclipse Foundation to anywhere else to spin up a new product of my own, we are using the same terminology to describe how we do this and how we develop trustworthy software. That will help address some of our cultural issues. In the early days of open source, you might have heard, I guess the quote came from then, but we all still kind of say it, that given enough eyeballs, all bugs are shallow. It seemed reasonable when there wasn't that much code. It's completely unreasonable today. It's about as realistic as saying I can move the Earth with a large enough lever. Yeah, but I need a fulcrum, and no one has a lever that large. It's nonsensical. We have too much code, and both too many eyes and not enough eyes. Fun stat, right? The mean number of contributors per project is one. So we have a problem of scales: not scalability of our software, but scale of our community. We call it all open source software, or an open source community, and yet we have dozens of projects with thousands of contributors and millions of projects with just one, two, or three contributors. And yet any one of these can be considered critical software, in the sense that it could break the internet. And they're all open source. But if you are part of an organization deciding what software to consume into a product, how do you pick? How do you even know? We don't have a standard way of doing that right now, of modeling trustworthiness. So yeah, we know it's going to break. We have to accept that software will break, regardless of how many contributors it has. Now, you're probably familiar with this meme; it's a lovely xkcd comic. You might know I'm about to mention Log4j. You're right. So if Log4j is that little tiny block being developed by three people, and I don't think they're actually in Nebraska, then that big block is Kubernetes. Do you know what they both have in common? If either one fails, all modern digital infrastructure collapses, right? Both are part of our dependency stack. So we really have a problem of different scales here.
Now, speaking of scalability, what hasn't changed in 30 years? Human scalability. Back in the '90s, there was an anthropologist, Robin Dunbar, who studied primate social group sizes and dynamics. He studied how primates develop trust and used that to model how humans develop trust in groups. He posited a maximum social group size; you might have heard of Dunbar's number. He actually posited three numbers, not one; we only really talk about the first one most of the time. You might have heard 150, 220, it's in that kind of range. This is a community or village. You probably know everyone's name, maybe one or two details about them, like whether they have a cat or their favorite type of food. You have a little bit of trust because they're not a stranger. So our primate brains go, ah, that person's not a threat; I can look at their PR and I'm not going to assume it has mal-intent. The next smaller group, this is your platoon size, your two-pizza-box rule. These are folks where you're working towards a common shared goal, maybe on the same release team or on the same feature team. You work with them pretty regularly. And the smallest group, in military terms, that would be your SEAL team. In open source it might be your project leadership, your PTLs, or your steering committee. These are folks who work together day after day and hopefully develop, from that collaboration, a deep, abiding, implicit trust. These numbers are roughly fixed in our primate brain. We can't increase the size of those groups. But to maintain them, we have to perform the rituals of social grooming, like seeing each other in person. Now, Robin Dunbar's work was done in the '90s. He wondered, has the internet age, has social media, changed our ability to form connections and bonds of trust with each other? And so he studied that, and in 2016 published a follow-on study. It turns out the internet has not changed this. In practical terms, this reflects the fact that real, as opposed to casual, relationships require at least occasional face-to-face interaction, opening the door for someone, sharing a meal, shaking hands. And this is incredibly relevant to open source communities. Back when I was in OpenStack, we experimented with the pacing between our conferences. And we learned from folks in Debian and Ubuntu before us, who had also done the same experiments, that after about nine months of not having face-to-face interaction with your peers in an open source community, people tended towards less trust, less willingness to assume good intent in social interactions, and more readiness to assume harm, to get angry and grumpy or snarky. And as we fluctuated how often our conferences were held, we could see that in sort of the metadata and the social vibes of the projects. Well, the pandemic has thrown all that to the wind, right? We've seen, all across open source communities, and really all across the entire world, a huge degradation in our ability to trust people who aren't exactly like us or part of our tight-knit network. I'm super glad that we're having conferences again. Because this, our ability to collaborate, is built on trust, and we have to meet in person to be able to build and maintain those trust relationships. I need to talk for a few minutes about trust, because, like the words open source, this word means different things to different people.
And once again I'm going to reach back in time and pull forward a definition from 29 or 30 years ago, from Dorothy Denning, writing about the military's approach to purchasing systems, trustworthy systems. She said trust is a declaration made by an observer, not a property of the observed. So when we're talking about software supply chain security, it is antifactual to say "this is trustworthy software." The correct statement is: I have measured it, and I have chosen to trust it, and I am publishing an attestation of that fact, my measurement of its trustworthiness, or my assertion of its trustworthiness, for a given purpose at a given time. Those two caveats are critical in how we talk about trust. Nothing is trustworthy in every sense, not even my best friend. I might trust them to watch my house when I'm out of town, but maybe not, I don't know, fly an airplane with me in it, because they don't have a pilot's license. Trust is contextual; trust is time-bounded. And when we talk about trust in a thing, like my phone or my bank or a piece of software, that is a proxy for saying I am trusting the people responsible for maintaining it. Let's not forget that. So how do we measure the trustworthiness of an open source project? We measure the trustworthiness of the people involved in building it, and the processes that they used to build it and protect it from other actors. Security vulnerabilities are going to happen in projects large and small; that's a fact of life. Given that, what kind of systems might we use to gauge trustworthiness at both ends of the spectrum, the big and the small? For this, I want to connect two more concepts: the four opens and the four freedoms. The four opens were formulated in the OpenStack community about seven years ago now, though the seeds came from previous communities, and they're used in a couple of others. These are, basically: that the software itself must be open source, so the code is available under an OSI-approved license; that the design process and the documentation must also be open, so people can learn what the software does and how it works, not just by reading the source code but also by reading design docs, and can participate in evolving that; that the development process of creating and evolving the software is also open, public, and transparent, so people can become part of shaping the growth of that software, and it's not just someone else maintaining it opaquely and throwing it over a wall; and lastly, that the community itself is open, that its governance structure is documented, is measurable, is public. These build trust. The four freedoms reach back in time, originally from the '90s; I think it was two and then finally four freedoms in the early aughts: the freedom to run software however and wherever you want to, the freedom to share it with your friends, to study it, read the source code, and change it, and to recompile it and then share your innovations. That's the foundation of free software and open source. These go hand in hand. The four opens describe how we build it together to make it trustworthy, and the four freedoms define what you as an individual can do with it. I think every one of us should have the right to fix our hardware, to modify our software, to have that sense of ownership and autonomy over the tools, digital and physical, in our world. That itself is an incredibly powerful innovation.
However, when many individuals collaborate with the goal of building trustworthy software, and that output becomes part of public infrastructure, there is a hefty responsibility that comes with it. I snuck these in last night; let's talk through a couple of examples real quick. Does everyone remember Heartbleed? Cool. Yeah. SSL, a foundational tool for encryption, with massive holes in it. We fixed them; hopefully there aren't any more. Everybody remember these two cute little icons? Spectre, I didn't hear as many answers there, but I'll talk about it in a minute: Spectre and Meltdown. This was a chip vulnerability that affected most CPUs at the time, and it broke the isolation between applications. Less important maybe on your PC at home, but still fairly important if you don't want, say, an ad in your Facebook tab reaching over to a tab with your bank and copying data, breaking that isolation. There's a lot more going on there. More interesting for open source is this one: left-pad, right? Tiny little project that was just a useful tool. Didn't do anything really special by itself, just kind of useful. Ended up being used everywhere, and then one developer, one maintainer, decided one day he was angry and deleted it. And it broke the internet. A similar event happened on PyPI just a couple of weeks ago. The maintainer, not so much out of anger, was like, I don't want to deal with your requirements for being a maintainer of a critical project, so I'm going to reset my counter by deleting and re-uploading. It broke everybody again. So we have this individual responsibility that is, you know, still a hard conversation to have, especially in the context of protestware, which has really come to the forefront of conversations since the Russian invasion of Ukraine. People have realized that we've always had this ability as maintainers, that we have power. And yet in doing this, these actions also affect all of us. It reflects on the entire open source community when people do that. So what can you do, regardless of your role in projects or companies? We all need to adapt. We all need to build new tools. Today, a little over half of all organizations realize they don't have a security policy for open source. So if you're an open source maintainer, maybe you can help your company build one of those policies. And of those, about a third don't have anybody responsible for securing open source or responding when there's a vulnerability. So yes, companies need to do better, but open source communities do too. There are a bunch of new tools being developed. I'm not going to talk about the tools specifically in this talk; I hope there's lots of other content this weekend that goes into them. Things like the SLSA project, or Notary and Sigstore for signing packages and signing Docker images. The GitBOM project is developing supply chain artifact resolution, to trace a component from your source code file through every build step, linking, and packaging, all the way out to the end, to be able to build that traceability without any effort from developers. There's a bunch of work happening on reproducible builds, which is key to being able to verify that when someone gives you a package, it hasn't been tampered with in the build or shipping process, if you can rebuild it. There are a bunch of efforts going on to try and solve some of these challenges at the sort of foundational layers of open source communities and how we build software.
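As one concrete illustration of the signing tools just mentioned, here is a minimal, hypothetical CI sketch that uses Sigstore's cosign to keylessly sign and then verify a container image. The workflow name, image path, and identity pattern are placeholder assumptions, not something from the talk, and a real pipeline would also build, push, and log in to the registry first.

```yaml
# Hypothetical GitHub Actions workflow: sign a pushed image with Sigstore
# cosign (keyless, via the job's OIDC identity), then verify it. The image
# is assumed to already be built, pushed, and the registry logged in to.
name: sign-and-verify
on:
  push:
    tags: ["v*"]
jobs:
  sign:
    runs-on: ubuntu-latest
    permissions:
      id-token: write      # lets cosign obtain a short-lived signing certificate
      packages: write      # allows pushing the signature to the registry (GHCR here)
    steps:
      - uses: sigstore/cosign-installer@v3
      - name: Sign the image
        run: cosign sign --yes ghcr.io/example-org/example-app:${{ github.ref_name }}
      - name: Verify the signature
        run: |
          cosign verify \
            --certificate-identity-regexp 'https://github.com/example-org/.*' \
            --certificate-oidc-issuer https://token.actions.githubusercontent.com \
            ghcr.io/example-org/example-app:${{ github.ref_name }}
```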
I think the most important aspect of these efforts is to try and do it in a way that has zero cost for open source projects, because it is really inappropriate for a company to show up, as we've probably all seen on Twitter, and be like, hey, I think there's a vulnerability in your product, fix it for me. That's just rude. So what I'd like to see is companies making better tools for open source, to make those issues transparent and easily fixable or reachable, and companies coming up and saying, I think I found a vulnerability, here's a fix for it, here's some money to hire someone to fix it. More advocacy towards that is needed. As individuals, if your project becomes super popular, or used in something really popular, then you have a responsibility to act with the knowledge that your actions affect others and reflect on all of us. Don't be the weak link, right? Protect yourself, protect your code, protect your computer. Please do use 2FA; it helps. SMS is not a very good two-factor auth method; it's pretty easy to spoof that. Get a token; they're cheap, and some projects and companies are giving out two-factor auth tokens pretty regularly. And lastly, try not to work alone. I know lots of us, myself included, are single maintainers of open source projects. It's kind of cool. It's fun. But if that project gets picked up by a critical project, a big project, then you have a responsibility to not be the weak link, the only person. I'm actually quite serious about this. You are, sorry to say, a potential vector for hostile actors to compromise large software packages, indirectly, through the small dependencies that you maintain. So your responsibility is to help secure those, even if you're the only maintainer. This is ultimately about balancing the rights of the individual, going all the way back to the nineties and Bernstein, the rights of the individual to publish our creative works and to use free software, against the responsibilities of a community promising to create trustworthy software for others that is now part of global infrastructure. It's a balance. There isn't a single answer and a single solution, but be mindful of that. Because we have all felt the risk of single-maintainer projects. And the risk is not just coming from the lottery problem. We have politically motivated attacks, we have profit-motivated attacks coming through open source, compromising individual developers as a vehicle to greater compromise. Even if only for a day, something like left-pad could be a massive vector for malware if the account is hacked for just one day. And so any general supply chain security solution has to be adoptable at both ends of that spectrum, the super big projects and the small individual ones. This is some of the big feedback I keep giving to folks, both in government and policymaking and in companies: yeah, that's some cool software you're building and product you're selling, but that's too heavy a lift for the long tail of projects. We have to be mindful of that as we're building new stuff. So the industry has a ton of responsibility here too. Companies need to recognize that the choice to use open source is a choice to externalize their risk budget. We need to start measuring the cost of open source and including that in our budgeting, because that cost comes in the form of staff to maintain it, to patch it, to install it, to learn it, and to handle security issues when they come, say, from a Log4j.
If you couldn't afford that cost for commercial software, maybe don't use the open source version and ignore it, right? Use it if you can afford it. But for your investment, you'll get a better ROI if you put some of it upstream, either in the form of your staff working upstream with a project, or throwing money at a company or a community doing that work already. You are benefiting downstream; take responsibility for it. Also, the industry should really align our incentives to the long term. Don't build open source as a quick sales cycle and then abandon it once it's adopted. That is damaging to all of us in the world right now. No one wants a bridge if it's going to fall over in five years. And consider the ethical implications of things before you publish them. Today, from government on down to social media, companies are faced with the challenge of deepfakes, and are rapidly trying to build new tools to detect and counteract them. Deepfakes are a result of publishing machine learning as open source. It wasn't the original intent of image classifiers or NLP, natural language processing, but it is a side effect of giving the world this tool for free and people making changes to it. And that is now a tool for phishing attacks, for fake news and propaganda, that is going to be very hard for society to deal with. So when you build the next really amazing, cool invention in open source, consider the impact of publishing it. What will someone else do with it that might not be what you intended? I'm going to leave you with a couple of closing thoughts, and I think we'll have some time for questions, maybe a little bit. Yeah. Demographics are changing in the US and in many other parts of the world. The diversity in the next generation is so much higher, right? Gender and racial diversity. So if you want your open source community to continue growing and to have new developers, younger generations, coming in, you need to make it a space that is safe. You can do this by not just writing a code of conduct and posting it on GitHub, but actually developing a practice to handle consent incidents, sexual harassment, bullying, cyberbullying within your community. Build that muscle in the team, in your culture, because it happens, and it's far too often swept under the rug and people just leave. If your project is not welcoming, is not a safe space, the next generation of contributors is going to go somewhere else. You don't want that. You want to be welcoming. And lastly, I'm going to borrow a phrase from one of my favorite magazines, Aeon: humans are the ape that automates. We are primates, but we build tools, and we teach those tools to other people. That stores energy, like muscle energy, going back to how to create a fire, how to use a bow, how to use tools to dig and farm, up to open source today. The printing press was a tool that enabled the rapid iteration of ideas and rapid sharing at a scale never before seen. The creation of the printing press laid the foundation, I believe, for public education and for massive societal change, governmental and organizational change. Now open source lets us iterate on a different kind of tool, a tool that is not just ideas but that directly reshapes our world. Any one of us can write an algorithm or a piece of software and have it running on millions of devices the same day. We can design a new component, the model for a 3D print, and people on the other side of the world can print it tonight.
All of this has happened through open source software and our culture of knowledge sharing. That is reshaping society today, and governments are responding. They have to. There are bills in front of Congress right now to require software bills of materials, to require regulations around use of software in different ways. There are bills trying to change how we encrypt and transmit data online. Government is responding. We all have a responsibility to move forward together on this, respectful of how society will change as a result of our actions. And it's incredible work we do. We are magicians, and that comes with responsibility. Thank you. Thank you, Aeva, for joining us and for the insightful presentation. As Aeva mentioned, we have a bit of time for some Q&A. So if you have a question or a thought about some of what we heard today, I'm happy to bring a mic by, and we'll see if we can answer it. Cool. Start off with George. Can you give us a little bit of color on the economic disparity? I'll give you an example: the PyPI situation, where it's like, hey, we should probably, you know, try to secure critical projects, but then as open source maintainers we're talking about burnout, volunteer time. But if you look at the economic dependencies of just PyPI alone, the economics just don't make sense. Can you help make sense of that? So, I wish I could. You're right, there's a huge disparity. I don't think the PyPI maintainers are comfortable with me sharing their operational cost, but simply running PyPI itself is incredibly expensive. It is funded through a donation of cloud resources, compute, storage, network bandwidth, that is immense. It's public infrastructure, essentially, in my view, and so there are discussions before Congress right now of how to classify certain types of open source work as public infrastructure and thus fund it by the government. That might be a way to balance this. I don't know yet; we're trying to figure all that out. But you're totally right, there's a huge disparity between a bunch of volunteers who build cool stuff, and then companies build amazing products with it, but the revenue doesn't flow back, and that disparity is part of what's created the situation. I saw some hands back there. You mentioned earlier that the old belief that open source means there are more eyes on it, and that automatically makes it more secure, is basically a myth. But I'm just wondering, do we know, is there good data to show, all else being equal, is there a difference in terms of software quality between open source and closed source? Can we still say there's an advantage, even if that old idea is a myth? I think there are two clear advantages. One, when issues or bugs or vulnerabilities are found, people are empowered to fix them. That cannot be overstated. And the second is that some open source, and this goes to the point that there's a huge spectrum of things we call open source, I can't say all open source is more secure than all closed source, that axiom doesn't work at this scale, but some open source projects have implemented community-led policies that result in much more secure software. For example, requiring multi-party review for every commit, and not just two people, but two or three people from different companies.
So there's a check and balance there to prevent any singular actor from sneaking in something bad, or having gated CI systems with essentially production-quality testing before merging code, so the main line is always operable. There's a lot more as well; if a community follows those best practices, it can result in very robust software. That doesn't mean it's going to be free of bugs, but the quality bar is very high. And of course, there's lots of open source on the other end of the spectrum, where projects don't follow that, and who knows how good they are. Hello, thank you for your talk. I'm curious what your thoughts are on how we communicate the way we trust to organizations, I'm thinking specifically government. The reason being, I just recently dealt with a government entity, and they were like, we need these builds to be done in the US, and I've got a long-term contributor I trust, who's in France, who does those builds, and so I trust them, but the government entity of course didn't, because they want someone in the US. So how do we communicate our trust model to those types of organizations? Excellent question. I suspect the answer is you don't. It's the government; you do what they're asking. If they're saying this must be built in the US, on US soil, on hardware that is sourced from US manufacturers only, operated by US citizens only, as a hypothetical example, those are just requirements from the government. They're not going to listen to you saying, oh, trust the thing built over there. Can't do that. There are requirements in federal procurement as well around this. So, yeah. Things can only move so quickly. So, you basically detailed how it took about 40 years for us to get to a point where we started to really face the things that we could see as inevitable just a few years into a lot of this. It seems like we're in the same situation when it comes to the open source nature of research in the biological, you know, pathogen world. Do we have 40 years to figure out all these same processes in that space? One of the things that keeps me up at night is knowing how much open source biology is out there now, and how easy it is for lots of people to walk into a lab and print brand new, invented, deadly pathogens because of open source bioprinting technology. No, we don't have 40 years to figure out how to solve this. If we have time for one more, we'll do one last question, and then we've got to probably wrap it up. So, you talked a little bit about what Congress is doing, which I didn't know about, but fundamentally this seems like a worldwide, international issue. Do you know of any efforts addressed at that level? Yes. The European Commission and OpenForum Europe are looking at similar legislation around open source and software supply chain security. The UK is looking at things as well, in terms of cybersecurity in general and privacy online. Both India and Japan have recently passed bills in that space; I think they're trailing the US and Europe a little bit. I can't say what the UN specifically is doing on this right now, but I won't be surprised if they start. Certainly the Atlantic Council, sort of a cross-Atlantic political advisory and advocacy body, has picked this thread up and is certain to advise policymakers on both sides of the Atlantic about it. Cool. Well, thank you very much, Aeva, for joining us and for sharing.
I think all of this will be super impactful, not just for us as users and operators of open source in our companies and with our employers, but for how we develop and release and participate in it. So thank you very much for joining us and for joining the SCaLE family. I don't have the jersey for you today, supply chains are a thing, but our tradition is that keynote speakers walk away with a SCaLE jersey now that you're part of our team, and we'll be mailing one to you shortly. But with that, thank you again, and we'll proceed with the rest of SCaLE today. Again, things that you might want to check out as you're wrapping up today: we've got the expo floor open downstairs. If there were a fireman's pole through the center of this room, I believe you would land right in it. Please go by, say hi to our sponsors and our community booths, and thank them for their support and their time; SCaLE wouldn't be possible without them. If you see somebody in a SCaLE jersey or a safety-orange SCaLE t-shirt, thank them for the network and for all the other infrastructure around the show; that's, again, entirely volunteer driven. You too can get a Star Trek AV uniform; you just have to join the AV team. And then, of course, game night tonight, about seven p.m. down in the Plaza Ballroom. Tomorrow morning, we'll be back in here to learn about how to build diverse communities, and tomorrow afternoon, we'll be here to learn about how open source built the Internet. So looking forward to seeing you all, and thank you for joining us again for SCaLE. Again, March 9th through the 12th, Pasadena Convention Center. See you there. Thank you.
Hi, welcome to the cloud native track as we start the day at SCaLE on July 30th, 2022. Our opening session is a great and interesting one, on what's been learned from a hundred-plus Kubernetes postmortems. The speakers unfortunately couldn't be with us today, but they will be present for a live Q&A over a remote Zoom link. However, I'm going to start this as a recorded presentation, because we wanted to minimize the exposure of feeding video over conference Wi-Fi from a remote site overseas. I'm going to kick off the video in just a minute. One thing I'll advise you of is that this video was recorded at another meetup, a Flux meetup, so if you see a few captions on there that say Flux meetup, you're still in the right place. Don't worry about it. So with that said, let me introduce, via recording, Noaa Barki and Shimon Tolt. Hi, everybody. Thank you so much for being with me today. Thank you so much for coming. Today we're going to talk about one of my favorite topics, which is what we've learned from 100-plus Kubernetes failure stories. But the story behind this talk actually began a long, long time ago. One day, my dear friend Neem, who is also a colleague and a DevOps engineer at the company that I work for, invited me to join him at a meetup. And I said, yeah, sure, sounds fun. What's the meetup? And he said, I don't remember, something about Flux. Definitely GitOps. And, well, first of all, he had me at GitOps. I mean, come on. But there still was a small problem, though: I don't consider myself a DevOps engineer. I'm a full-stack developer at heart. I mean, I do a lot of DevOps tasks and I work a lot with DevOps technologies, but I'm not really a DevOps engineer. So I told Neem, I really don't think I should be there. I'm not, like, real DevOps, you know? And he completely got upset with me. He told me that that's complete nonsense, that he would send me the address, and to be there tomorrow, 8 a.m. He told me, Noa, I will be waiting. Don't disappoint me. And I said, okay, let's do it. And we went. And ladies and gentlemen, it was the best meetup that I have ever been to. Why? Two reasons. First of all, it was the first time that I realized how much I love DevOps. And the second reason was that it was the first time that I ever said out loud that I believe that every developer should practice DevOps. And now that I've said it, now we can really start talking. So, hi, my name is Noaa Barki. I am a developer advocate and have been a full-stack developer for about seven years. I am also a tech writer and one of the leaders of the GitHub Israel community, which is the largest GitHub community in the whole universe.
And I work at an amazing company called Datree, where we help developers and DevOps engineers prevent Kubernetes misconfigurations from ever reaching production. Now, why am I telling you all this? Because before we launched Datree, we wanted to learn as much as possible about the common misconfigurations and pitfalls in the Kubernetes area. So what we did was read more than 100 Kubernetes failure stories and watch webinars and videos and lots of other stuff. And this is exactly what we're going to talk about today: the lessons that we've learned and how you can prevent it from ever happening to you. But let's go back to the meetup story. So we went to the meetup, and obviously I was the only developer there. I remember that everybody seemed to be, like, so cool, like they'd figured out everything in life. And they started with a 40-minute session about Flux, which was very interesting. And then the organizer said, pizza's in the back, guys, let's take a short break, and then we'll do the panel. And I looked at them and I was like, panel? What are they talking about? And so they told me that apparently they wanted to have a cloud native experts panel where people could ask them whatever they want. And as it usually goes with panels, people are too embarrassed to ask anything, right? So after three, four minutes of awkward silence, one guy raised his hand and said that he's very frustrated with the developers in his organization, because they just started to use a JFrog registry, and not only do the developers not know how to use it, they always get so mad because their builds keep failing. He didn't know what to do with those developers, and he asked what the best practices are for shifting left. And then another guy raised his hand and said that he had the same problem, but with security issues, and they started to talk about it. And after a while, another guy raised his hand and said that he didn't know where to put all the Kubernetes misconfiguration resources, because if you put them in the application repository, then all the developers might mess them up, and he didn't know what to do. And they kept talking about the developers and those developers, and what to do with the developers who don't know and the developers who don't understand. And I was sitting there and kept hearing about those developers who don't know and those developers who don't understand anything and can ruin everything. I was like, how can they say that? But you know, I was too embarrassed to ask anything, right? Because I'm one of those developers. But after a while, I decided that, no, no, no. It's not a bug that I am the only developer here. It's a feature. So I raised my hand and I said, hi, hi, my name is Noa. I am a developer advocate and a full-stack developer, and I hear that you talk a lot about the developers in your organization, and I'd like to speak in the name of my people. No, no, no, I didn't say that. I just asked for regular permission to speak, and I remember everybody turned around to see whose voice was speaking, and permission was granted, and everybody looked at me, and I hesitated. But then I said, well, you say that we don't understand. You say that we don't care, and you say that we might ruin everything. And first of all, you're right. You're right. But give me some credit here. I mean, you forget the most important thing. You forget that we're different personas. We work on different tools.
We even have different goals. I mean, I wake up every day to be the best feature machine that this world has ever seen. I have code to write, tests to run, tons of pull requests to review. I have the next feature to plan. I also need to worry about best practices, security, stability, how my code is maintained, and how readable my code is. Why should I suddenly care about those YAML files that you put in my repository? What the heck is Terraform, and why is the memory limit in Kubernetes so important? How do you expect us to work together on the same pipeline, on the same technology, when I don't even understand it? And then one guy turned around and asked me the question that I fear the most. He said, so what do you suggest? And I looked at him and I smiled. I was terrified, but I smiled. And I was like, okay, okay, I told myself, okay, let's do it, Noa, let's do it. I was like, you know what? Let's talk about what I think you should do. You should delegate the knowledge. So first of all, you need to delegate the knowledge. You need to learn what the best practices are, but also to teach others in your organization. Now, this doesn't mean that suddenly everybody needs to learn everything about Kubernetes. No, this is not what I mean. Do it wisely, choose your champions. Pick those developers who are most interested in infrastructure as code, in DevOps technology, and delegate your knowledge to them. Think about front-end and back-end developers. Every organization has those front-end developers who will never do back-end development, because that's only about, I don't know, controllers and repositories and RESTful APIs. But on the other hand, you have those back-end developers who will never do front-end development, because that's only about styling and CSS and we don't want it. But every once in a while, you have those full-stack developers who do both. This is the kind of developer that you look for: the developers who want to do both, because they are sort of like the ambassadors between front-end and back-end development. And if you delegate your DevOps knowledge to them, they can become your ambassadors in all the development teams. So pick those developers who are most interested in DevOps technology and delegate your knowledge to them. Then they will be your ambassadors, and they will delegate the knowledge to the rest of the developers in the team. Now, there are many ways to share knowledge. You can have company meetups, you can share newsletters, white papers, articles, you can send emails, but the best way to learn about best practices and to share knowledge is to learn from other people's mistakes and to learn from other companies' failure stories. This is why I would like to welcome you to my very own private show: the What's the Mistake game show! Are you ready? Let's do it! I'm super excited. Okay, so the game goes like this. I'm going to show you two Kubernetes manifests. Each time, I'm going to point to a specific key which is configured differently in each manifest. You will have to look very carefully and tell me which one you would deploy. Left or right? Are you ready? Let's do it. Okay, this is a CronJob configuration. Pay attention to the concurrencyPolicy. Which one would you deploy, left or right? I really need to add music this time. In my mind, I was like, t-t-t-t-t-t. Okay, I'll let you think. And the right answer is: right.
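Since the slides aren't reproduced here, a minimal sketch of the kind of manifest the "right" answer describes might look like the following; the name, schedule, and image are illustrative assumptions, not from the talk:

```yaml
# Hypothetical CronJob manifest: concurrencyPolicy is set to Forbid so a
# stuck or failing run cannot pile up overlapping pods (the risky default
# is Allow). Name, schedule, and image are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid      # Replace is the other safe choice discussed below
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: report
              image: registry.example.com/report-job:1.0
```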
We always want to make sure that we set the concurrency policy either to Forbid or to Replace. The reason is that when we set it to Allow, a failing CronJob keeps spawning new runs alongside the old ones instead of replacing them. And this is something that happened to Target. They had one failing CronJob that created thousands of pods that were constantly restarting. Not only did it take their cluster down, it also cost them a lot, a lot of money.

But let's move to the next failure story. This is another CronJob configuration, and once again, pay attention to the concurrency policy. Which one would you deploy, left or right? And the right answer is right again. You see here on the left side: Zalando is an online fashion company with over 6,000 employees. It's a big company, guys. And what they did is that they actually used the correct configuration, but they placed it incorrectly in their YAML. So not only did it take their API server down, it obviously cost them a lot of money because they had downtime. A very, very sad story.

Let's move on to the last question. This is a very innocent pod, and pay attention to the containers. Which one would you deploy, left or right? And the right answer is right again. You see here on the left side, we don't have any memory limit. And this is something that happened to Blue Matador. They had one pod that served a third-party application, a Sumo Logic application if I remember correctly. And because they didn't add a memory limit, nothing stopped those pods from taking up all the memory. Those containers were memory hogs, so they basically took all the memory on the node, which took their API server down with out-of-memory issues. A very sad story. Blue Matador was a small startup then, and I believe they're much larger now, but it was very painful for them.

And you see, Target, Zalando, Blue Matador, they aren't the only companies who suffered from these pretty innocent mistakes, if you think about it. I'm talking about big companies: Google, Spotify, Airbnb, Datadog, Toyota, Prezi, and lots of other companies that have shared their own Kubernetes failure stories. I highly encourage you to learn from these companies' failure stories, because it will inspire you to think about use cases and edge cases that you probably haven't experienced yourself, and it will also force you to ask yourself the ultimate question: how can I make sure this will never happen to me? This question is very important, because it forces you to think about which workloads you have in your organization, what the requirements for those workloads are, what security and stability you want for your production, and how you can achieve that. And the answer is best practices enforcement, policy enforcement. But before we deep dive into how to start with policy enforcement, I want to talk about the most important thing, which is to start small. I see a lot of companies that tend to forget to take it in small steps. Don't drop gigantic policy restrictions on everybody in one day. Don't drop policy enforcement on everybody in one day. Do it gradually. Pick one team and take it as your pilot team. Have a meeting, make sure that everybody understands why you're doing what you're doing, why now, and what the scope of this enforcement is.
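To make the two game-show fixes concrete, here is a minimal sketch of a CronJob that sets both of them: a non-Allow concurrency policy and a container memory limit. The name, schedule, deadline, image, and sizes are illustrative, not taken from the talk.

```yaml
# Minimal sketch only; the job name, schedule, image, and sizes are illustrative.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: report-generator
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid      # never start a new run while the previous one is still going
  startingDeadlineSeconds: 120   # skip the run entirely if it cannot start in time
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: report
              image: registry.example.com/report-generator:1.4.2
              resources:
                requests:
                  memory: 128Mi
                limits:
                  memory: 256Mi   # a memory hog gets OOM-killed instead of starving the node
```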
From there, gradually add more policies to the process, add more teams to the enforcement, and with small steps roll policy enforcement out to the entire organization.

Now, I believe in two things. I believe in shift left and I believe in GitOps. I believe that every Kubernetes manifest should be handled exactly the same as your source code, and that the sooner you identify a mistake, the less likely it is to take your production down. The way I see it, we should automatically validate our config files on every code change in the CI. Furthermore, if you use tools that can also be used as a local testing library, it can really help you nurture the DevOps culture in your organization. Why? Because developers are used to local testing; it's part of their routine. Every developer runs their local tests before he or she submits a pull request, and guess what, they expect at least those tests to run again in the CI. So letting the developers use tools they know, tools that work pretty much like a local testing library, allows you to delegate more knowledge to the developers, and that liberates you from the constant need to fence every Kubernetes resource off from every possible misconfiguration.

Now, from our research there are basically three types of misconfigurations. First there are the syntax errors, which cover all the mistakes that happen because we accidentally submitted an invalid YAML file or a Kubernetes resource with an invalid schema. I know it may sound very basic, but you'd be surprised how many companies have shared failure stories about accidentally submitting an invalid Kubernetes resource. Zalando is one story. Another of my favorite stories is about Skyscanner: they accidentally deleted two curly braces from their Helm charts, and as a result they deleted all their services, ended up with one corrupted namespace, and had five hours of production downtime. So it happens.

The next type of misconfiguration is what I like to call knowledge gaps. The thing about Kubernetes is that it kind of combines everybody in the R&D department. You have the developers, you have DevOps, you have security, you may have machine learning engineers, you have IT. You have a lot of different personas that suddenly need to work on one technology, Kubernetes. And the absurd part is that usually only the DevOps engineers know the best practices and how to actually use Kubernetes. Kubernetes is like a cockpit: you have a lot of buttons and a lot of default configurations that in most cases are not suitable for you, and you need to learn which configuration is the most suitable for you and whether or not to use the default. We usually lack that knowledge. So most of the misconfigurations and failure stories that I see are related to knowledge gaps, because we have a lot of different personas that don't know how to use Kubernetes or what the best practices are.

And the next type of misconfiguration is what I like to call team alignment, because learning about the industry's best practices is not enough. You also need to make sure that you are aligned with your teammates, that you are aligned with the internal best practices of your organization.
And I'm talking about things like using a private registry for your images, or setting specific limits for your workloads, jobs, and tasks, and so on. It covers all the internal best practices that you have in your organization, and it's really important to make sure that you and your teammates are aligned on the same best practices.

Now that we have these types of misconfigurations, let's review them one by one and see how we can make sure that none of them happens to us. Starting with the syntax errors, we want to make sure that our resources are valid. The first thing we want to do is verify our file format. Whether we use JSON, YAML, XML, or a Dockerfile, we can make sure that the file format is valid. One of the tools that I really recommend is yq, which is a portable command-line YAML processor. It's very easy to use, it's open source of course, and I highly recommend you try it. Once we've verified that our file format is correct, we also want to make sure that whatever is written in that file is correct too. So we want to verify the resource syntax, that our Kubernetes schema is also correct. The built-in option, which is the one I usually prefer, is to use kubectl with the dry-run flag, which basically tells the API server not to apply the resource but only to validate it. Now, with the dry-run flag there are a couple of challenges, because you need to pick one of two strategies. You can use the client strategy, which is not very helpful because it only prints out the resource that would be submitted. And there's the server strategy, which is exactly what we want: it only validates the resource and doesn't apply it, but it requires a cluster connection. And usually, not only do the developers in the organization not have kubectl access, we certainly don't want to give all developers a cluster connection either. So as an alternative you can use kubeconform, which lets you do the same thing. It basically pulls all the schemas into a Git repository and queries that repository, so you can validate your resources without any cluster connection. We use it in the Datree open source ourselves. It's very, very easy to use, highly recommended.

Now let's talk about how to start enforcing best practices. The first thing you need to do is define the policies that you want to enforce. Maybe you want to make sure that every container has a memory limit, or that every CronJob has a deadline, or that you set liveness, readiness, and startup probes for your containers. Maybe you want to make sure that you use your private registry for your Docker images, or that you reference images by digest. There are many best practices that you probably want to enforce, but you need to define all the policies and rules that you want to enforce in your organization. And most importantly, you need to think about where in the pipeline you want to distribute and enforce your policies. This is a very important question, because where you decide to do that might affect your entire organization. For instance, if you distribute your policies in the CI, it can affect all the developers in your organization. So really think about the most suitable place for you to start enforcing your policies.
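If the CI is where you start, the schema-validation step can be a single job that runs on every code change. Here is a minimal sketch using GitHub Actions syntax; the CI system, the manifest path, and the assumption that kubeconform is already available on the runner are all mine, not the speaker's.

```yaml
# Sketch of a CI job that validates Kubernetes schemas without any cluster connection.
# Assumes kubeconform is installed on the runner and manifests live under k8s/.
name: validate-k8s-manifests
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate Kubernetes schemas
        run: kubeconform -strict -summary k8s/*.yaml
```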
But let's say you delegated the knowledge, you even have developer champions that you trust, you defined the policies you want to enforce, and everything is perfect and rainbows and stars. The bad news is that this is only the beginning, because you also need to manage your policies. By that I mean, first of all, having an easy, intuitive environment to dynamically adjust your policies. I see a lot of people who tend to think that Git is the right place for this, but I think they're wrong. Git is great for version control and for collaboration, but not for management, because it won't provide anything you need here. Git won't give you a way to grant permissions over who can delete or create a policy. Git won't give you a way to see which policies are actually being used across your organization, across all your CI/CD pipelines, across all your services. And when you have dozens of repositories, when you work with GitOps, you now have another nightmare to worry about: keeping everybody on the same policy when you modify it. Not only do you want everybody to use the same Kubernetes version, you now also need to worry about everybody using the same version of your policies. So you also need to think about how you monitor, review, and control your policies across the whole organization.

Another thing that is very important is to provide guidelines along with your policies about how to fix a violation. You want to make sure that no developer feels frustrated because they don't know how to add a liveness or readiness probe. You want to make sure that after you've put so much effort into defining, writing, and enforcing the policies, people actually know what to do when one of the policies fails. So it's really important to explain why the policy failed and how to fix it. And I promise you, if you do that, no developer will make the same mistake twice. Nobody wants to take down production, not the DevOps engineers and not the developers. Nobody.

And this is how we realized that having centralized policy management is the right solution. The good news is that there are a lot of tools you can use for policy enforcement. You can use Gatekeeper, you can use Conftest, you can use Trivy, and you can use Kyverno. They're all open source, so you can use them today. But the tool that I want to talk about is Datree, and I want to show you why we believe Datree is the right way to enforce your policies and prevent misconfigurations from reaching your production. So what is Datree? Datree prevents misconfigurations in Kubernetes: missing liveness and readiness probes, missing memory limits, image digest rules, security best practices, and lots of other things. And Datree is integrated into all the CI/CD pipelines in your organization. If you think about the CI/CD pipeline, on the one hand you usually have the DevOps engineer, the persona who actually knows Kubernetes, knows the best practices and the right standards for production to be stable and secure. On the other hand, you have the rest of the engineers in your organization, who usually need to work on the Kubernetes resources but don't really understand much about Kubernetes. Datree sits right there in the middle.
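As a quick taste of policy-as-code before the Datree specifics, here is a minimal sketch of one such rule written for Kyverno, one of the open-source tools listed above. The particular rule shown, requiring a memory limit on every container, is my example rather than something shown in the talk.

```yaml
# Sketch of a Kyverno ClusterPolicy that rejects any Pod whose containers lack a memory limit.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-memory-limits
spec:
  validationFailureAction: Enforce   # deny the resource instead of only warning
  rules:
    - name: containers-must-set-memory-limit
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Every container must set resources.limits.memory."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
```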
Datree provides an intuitive environment for the DevOps engineer to implement policies and enforce them across all the CI/CD pipelines. On the other side, it provides a CLI that the developers can use to scan their resources as they code, on the local machine, in the CI, or as a pre-commit hook, and to validate their resources on every code change, automatically, in the CI. And because Datree is a centralized policy management solution, if you change or modify a policy, Datree automatically propagates the change to all the CI/CD pipelines, according to your settings.

Now, how do we do it? We have an open source CLI with thousands of stars and lots of contributors, and I highly encourage you to join our community and submit a pull request. The way the CLI works is that for every file, every resource that exists in a given path, Datree runs automatic checks that basically combine everything we just talked about: it validates that the YAML is valid, that the schema is correct, and that the resource is compliant with the industry's best practices. Because we know how hard it is and how much effort you need to put into defining policies, we've got your back: we already provide built-in rules and best practices for your Kubernetes resources, like memory limits, liveness and readiness probes, making sure you don't use the latest tag, and a lot of other things. They're built into the CLI, so you can basically plug and play. To use the CLI, you just install it on your machine with a one-line command, and then scan your resources with datree test and the path of the files and resources you want to test. And that's basically it.

I'm also very excited to share that we just released our webhook, so we're integrated in the cluster as well. Every resource that you apply to the cluster, Datree can scan, and if it's not compliant with the best practices, Datree can deny it from being applied to the cluster. To use the webhook, you just install it on your cluster. It's an admission webhook, integrated with Kubernetes through the admission controller API. And all the policies and rules can be dynamically adjusted in our dashboard application; they're all written in JSON Schema. To sum up, I highly encourage you to start thinking about which policies you want to enforce in your organization and how you want to do shift left in your organization. Thank you so much for listening. Bye bye.

And then Noa is going to join us remotely over Zoom for live Q&A, so give me 30 seconds while I set that up. Okay, sorry about that. So, are we good? Good. Okay, awesome. So Noa, can you hear me? Yeah, can you hear me? Unmute yourself, now we should be able to hear you. And now? Oh, want to turn up your volume? Check the volume on that. Hear me? There we go. All right, well, okay, we did it. Okay, so: questions.
You have to use the mic. All right, just raise your hand. "What was the first Kubernetes linting tool you mentioned? The one for the YAML?" yq? "No, the one after that." Kubeconform? "Kubeconform, thank you." Yeah, it's really, really good. Okay, more questions? "I don't have a question, but I just want to say your presentation was amazing." Oh, wow, thank you. Thank you so much. "I feel like I understand Kubernetes better now." Oh, thank you very much. Okay. I can't hear anything, let me turn mine off. Okay. "You mentioned the Datree CLI does linting and schema checking; does it work with Helm charts?" Yes. We have a Helm plugin that you can install and use with your Helm charts. "Related to that: does the linting work with an operator?" No, we don't have an operator yet, but we're working on it. It will take us a little more than a month, I think, until we release it. For now we only have the Helm plugin, and that's it, basically. One more question. "It wasn't mentioned, but is Datree open source and readily available for people to use for free?" The CLI is completely open source and free. You can use it even without any internet connection; it works completely offline. The webhook is connected to our SaaS application. It's also open source, all the code is out there in the Git repo, and it's completely free for small teams with up to two nodes on the cluster. If you have more than two nodes, then you'll have to pay, and you can see all the pricing on our website. Okay, well, thank you very much for participating in the first ever remote presentation. Thank you very much, everybody, for joining me. It was a pleasure. Thank you. Bye-bye.

Yep. Hello. Yay. Hello, everyone. Welcome to the Cloud Native track. We now have JJ Asghar, and he's going to be talking about migrating a monolith to cloud native and the stumbling blocks that you don't know about. I'm personally excited about this, because I've had to deal with multiple migrations from monoliths to some sort of cloud native stack, so I'm curious whether you've gone through what I have as well. Take it away.

Thank you. Well, that's about what I was expecting it to be. Hey, everybody. My name is JJ Asghar. I'm a developer advocate for IBM Cloud and IBM. I don't know what that means, that's fine, it doesn't matter anymore. I really do have the email address awesome@ibm.com, and if you know anything about IBM, that's pretty impressive; IBM is famous for atrocious email addresses. So when I figured out how to do that... I can tell that story later if you want, but I'm proud of it. So I have an hour to tell a story here. Normally I target about 45 minutes for this talk, but I've been told that I'm a, quote unquote, whiteboard presenter. I have spent many a year going to conferences, and I've spent many thousands of dollars going to conferences, and I have fallen asleep at presentations. So my goal for this talk is for y'all (by the way, I'm a Texan, so I say y'all) to gain something, learn something, ask questions. We are all here to learn something, because either you're moving towards cloud native, you're on your cloud native journey, or hell, you don't even know what you're doing. And honestly, you're here to learn as much as you can. So never hesitate to raise your hand.
Never hesitate to just shoot out a question. I also stream, so I'm used to being interrupted with questions when I'm deep in playing around with Python or whatever; I'm good at pivoting. But let's actually get to the talk.

So your company has finally decided to move to the cloud native ecosystem. You've landed on containerization as your first step. Seems pretty reasonable. You've heard all you need to do is containerize your first app and then push it to Kubernetes or OpenShift or Nomad, one of those things, and the cost savings will just come. You've done this, and, well, honestly, things just haven't gone as planned. Some of the tech didn't do what you were expecting. And wait, what do you mean our OPEX just went up? Simply said, the promise of containerization, or of migrating to the cloud native ecosystem, can be a lie if you don't do your homework. Sadly, most companies just don't. In this talk I'll explain a few gotchas that a few enterprises, under the guise of Asgard Labs, hit moving towards this cloud native world, and hopefully you'll learn from their mistakes, so when you take your trip down this path, you'll either be more comfortable or at least closer to that promise.

Everyone knows what Docker is, right? If I start saying Docker, who doesn't know what Docker is? How about you raise your hand. There we go. So what is Asgard Labs? Asgard Labs is a multinational tech conglomerate that bought hard into the VM ecosystem. They make ones and zeros like no other company. But they noticed they were falling behind. Maybe that sounds familiar. In all seriousness, Asgard Labs is just a collection of different companies that I've had exposure to while working at IBM, where I've heard the same stories and the same issues over and over and over again. And of course, Asgard Labs is a fake company, and no, they're not hiring. Thank you for laughing, that was supposed to be a joke. The best jokes are the ones where you tell people to laugh.

So what did Asgard Labs think they had to do? And hopefully you hear echoes of this in your own stories. They thought they could take their migration from the physical data center, or co-location, to the VM ecosystem, and use the exact same technique to move from the monolithic application to cloud native. How many of you went through that path, where you had some data center and you're like, hey, we bought this thing called VMware, that's pretty cool, we can shove more VMs in these fucking things? Well, they thought bare metal to VMs was the same as VMs to containers, right? Did anyone nod their head? There we go, I see some people shaking their heads. Good.

So where do I come into these conversations, normally? We've set the stage with the typical company out there that's just like, it worked in the past, it should work again today. Well, I come into the conversations with Asgard Labs when they've had a few successful migrations. Honestly, I'd beg to differ that these were successful, but they called them successful. They're nowhere near where they thought they could be, and they've had to put in so much time and effort and money that the ROI just isn't there. I come in as that cloud native person, you know, the person who actually understands Kubernetes, or how to spell Kubernetes, first of all. And I ask some simple but very, very tough questions. And believe it or not, I actually start every single one of these conversations with these exact same questions.
And it really does drive people crazy, but you find out so much information. And if you've ever been in a fifth-grade English class, it'll seem very familiar very quickly. So let's ask those straightforward questions that are deceptively hard to answer.

Who containerized your app? Was it the developers or the operations team? Was there more than a couple of status meetings between the project teams, and who shipped it? Did you actually have communication back and forth? Do we know that they actually sat down together? Maybe they were using Podman and you're using Kubernetes in the production environment. If you don't know who did it, is it even the same stack across the board? There are so many different conversations there that knowing who containerized your app is one of the most important questions to ask.

Why did you containerize your app? Frankly speaking, Asgard Labs containerized because they thought they were told to. Someone on high came down and said, we need to containerize our application to move forward. Not only because the execs needed to say that their core software stack was now next-gen, but because some CIO read a magazine on an airplane once. I know this from the cloud space: when I moved from a co-located data center into VMs and we were using OpenStack (that's a different conversation for a different time), we were forced to move to OpenStack because the CEO of the company I was at at the time wanted to say they could hire the new talent, because they were now cloudy. That is such a weird and consistent mirror now for other companies moving from VMs to OpenShift or Kubernetes or Nomad or the cloud native ecosystem, because they want to pull in the new talent, and what the new talent knows, or wants to learn, is cloud native. They want to say they're next-gen. If that is the reason you are doing it, there is a much larger conversation to have.

Where did you deploy, or where are you planning on deploying, your containerized application? The first thing from that conversation: is the cloud you are using there by choice, or because of an ELA? Who does not know what an ELA is? Enterprise license agreement. There is a certain company out there that might have a word processor and a spreadsheet system that we might all know (and I have seen some of their computers here at a Linux conference, which is kind of weird, different conversation for a different time), and they give you free credits to their cloud when you buy their software to help you write documents or spreadsheets. I have seen many companies where some CIO or CTO sees that they have a buttload of credits on this cloud, and they're like, well, we have all these free credits, why don't we just put our software in this cloud? It's the whole "first one is free" conversation. They don't recognize that that cloud might not be the right one for their actual product, if you do a lot of ML, for instance. As soon as I say ML, you're probably thinking of a search company out there that does some pretty good stuff with ML. Not that cloud; the company that does the word processing software. Obviously I'm not naming the companies, but we all know what I'm talking about. It's amazing how many companies have to deal with that situation, because if you're using a subpar cloud for what you're trying to do, you're going to have downstream problems before you recognize it. Is this the best one for our company, or is it the one we're forced to use? This happens so many times.
Know where you are planning on deploying it. Because trust me, as someone who knows that IBM has a cloud... first of all, hey, nice. Okay, it was because I said I was a developer advocate for IBM Cloud, wasn't it? No? Okay. Again, trying to joke, but you may well not actually realize we have a cloud. Anyway, what I'm trying to get at is that you need to know where you're planning on going.

My favorite one. This one happens so often, and I love sitting in that WebEx slash Zoom slash whatever and asking this question: what did we actually containerize? This comes from the actual conversation about the architecture of the containerized app. One of my best examples, it might be a car company, it might be an airline, it might be a bank, I'm not going to say which: they took something called a war file. Who knows what a war file is? Who knows what an ear file is? Yeah. Well, it could actually have been an ear file, but I'm just going to say war; war is going to be my catch-all, right? I'm not a Java developer, but everyone knows what it is. They took the Java Docker container and the war file, just shoved it in there, and called it a day. They were like, fuck it, I'm out. I containerized our app, we're done. This opened up so many conversations, because if you know anything about Java development, a war file is, in essence, already a container. So you put a containerized system on top of a containerized system, shove it out to your platform, and you start seeing some weirdness.

Let's take a quick aside, because I work at IBM and we might do a little bit of Java here and there. That was supposed to be a joke. Let's talk about some actual paths. Here you go. Of course, because I work at IBM, I have to have WebSphere somewhere in my presentation. There you go. Thank you for laughing. Exactly. You see, a man after my own heart right here. Let's talk about re-platforming first. That's what they were doing: they had some legacy application, they shoved it into Open Liberty, and they migrated to Kubernetes or OpenShift. You should be using OpenShift. Not Kubernetes, OpenShift. But that is it. That is basically taking your application, turning it into a Docker container, and pushing it out. Seems pretty reasonable. It's amazing how many companies just stop right here. Back to that story about shoving the ear file in there.

There's a next step you have to take, which is repackaging into microservices. Now, I realize microservices is a hot topic, and the idea of shipping small little changes is important. But in the cloud native space, you have to get to the point where you have some type of microservice. You take that legacy ear app, you split it into two, you have two different war files, two different microservices, and then you have a shared... of course, because I work at IBM, you're using IBM MQ, right? No? Does anyone even know what that is? Okay. Thank you. But you get my point, right? You start breaking your app up into smaller bits and pieces. So instead of having one big ear file with a bunch of applications inside of it, now you have two different Docker containers and a shared Docker container with the message queue. MQ, message queue. It's probably going to be Rabbit, but you get my point. And as you can see, we're still on OpenShift and Kubernetes. So we're getting closer, right? We have re-platformed.
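For a sense of what that re-platforming step actually produces, here is a minimal sketch: the whole legacy app in one container behind a single Deployment. The image name, port, and sizes are made up, and Open Liberty appears only because it's the example above.

```yaml
# Sketch of a re-platformed monolith: the entire legacy app in one container, one Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-monolith
spec:
  replicas: 2
  selector:
    matchLabels:
      app: legacy-monolith
  template:
    metadata:
      labels:
        app: legacy-monolith
    spec:
      containers:
        - name: app
          image: registry.example.com/legacy-app-openliberty:1.0   # hypothetical image: the old .war on Open Liberty
          ports:
            - containerPort: 9080
          resources:
            limits:
              memory: 1Gi
```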
So now we have split it out into multiple things, and that allows for some advantages I'll talk about in a moment. As you get deeper and deeper (and again, this is a path, you're going to have to continually walk it, it doesn't happen overnight) you come to something called the strangler pattern. Who does not know what the strangler pattern is? Oh, okay, cool. Well, it sounds exactly like what you'd assume: you find something, go for the choke, and strangle it. No, I'm kidding, I'm kidding. Again, my jokes are just not landing today. The idea of the strangler pattern is to slowly but surely take those microservices and make them even smaller. You start shaving off functions and systems around them, so you get smaller and smaller entities running your application. As you see here with the web app and microservices, you've broken out your database, of course, and now you're starting to get even smaller and smaller microservices.

A horrible example, but the best one I can come up with: we all know about buying stuff on a web store, right? We have, what is it, Black Friday here in the U.S. I'm assuming most of us are from the U.S., so I'm going to use some U.S. idioms; I've made that mistake in other countries and it's amazing how many stoic faces I see, because they're like, what the fuck are you talking about, JJ? Anyway. Point being, everybody gets a receipt after making a purchase online, right? Well, if you need to do a return on that, you need to do some type of OCR. Optical character recognition, thank you; it's amazing how many times I tell this story and still blank on the name. The point is, you've got to do some level of that. And on Black Friday you're going to have a bunch of these coming in. The idea is that if you have the OCR system already built inside your application, you spin that little bit off as its own little service, its own little function or whatever, and run it side by side, so you can have smaller and smaller services. This could be a website, this could be whatever. The idea is to get smaller and smaller, so you strangle things down to the smallest entities they can be. Then all of a sudden everything is in micro-microservices: instead of one big war file, now you have a bunch of different little services. Does that make sense? Have I lost anyone? No, it doesn't make sense? It doesn't? Okay, good, thank you. You were like, no, and I was like, okay, start over. So you get the point: the idea is to refactor and break it back down. I didn't even mention the word refactor until now, but you're going to refactor your application. What is the strangler pattern but breaking up your application into smaller and smaller bits? You have to refactor. That scares a lot of people, but that's the way it is.

So, I didn't really touch on this as much as I should have, but let's actually talk about the architectural advantages and disadvantages of going down this path. The first thing, simply said: velocity. If you go down the cloud native path, you will get higher velocity for your business. The ability for teams to focus on their own user stories, and scoping your clusters for what you actually need to do, will only benefit you and your business. You'd be amazed how many times I've walked into conversations with rather large Fortune 50 companies.
And they've literally said, JJ, we need to use automation to spin up a new cluster, because we have this batch job that kicks off every Sunday and we need a cluster already there so we can run it. I'm like, okay, cool, it sounds like you know what you're doing, so why are you going down this path? Well, we have a three-node cluster and it's doing the thing; we see all this work spike on one machine and the two other nodes are doing nothing, so we need another cluster we can spin up for more compute. Well, who knows what a pod is? Can you split a pod across nodes? No, you can't. I found out through conversation that what they had done is put a container, maybe a Java container, into one pod (one container, one pod), shoved it on this cluster, and they were expecting it to level-load across all the nodes. So what they were going to do was spin up another cluster to run the exact same job on that cluster for more compute. They wanted six times the amount of compute power for jobs on two nodes. Does that make sense? Have I lost you? Now, at a small company that's a lot of money; at a Fortune 50 company, not so much. But the point is that when you don't pay attention to the actual application and how it's built and run in the cloud native space, you start just hemorrhaging money.

The challenge, though, is that going down the cloud native path requires such a high level of cooperation that you need to learn to build out more integration tests, and, along those lines, most likely a completely different deployment policy system. All of a sudden, when you start iterating through new changes and see how quickly they happen, the way that you deploy stuff is going to be completely different. So, I didn't actually know when I wrote this talk that people didn't know what a CCB was. I might have worked at a small company that starts with a D and ends with an L in Austin, Texas; I was supposed to make another joke there. There was a concept inside that company called the change control board. Does everyone know what I mean when I say change control board? It makes sense now, right? We called it the CCB. At this time I was on the operations team, and we had a CCB that literally every Thursday afternoon at 4 p.m. would meet in an office or a huddle room or whatever you want to call it, with the director of QA, the director of dev, the director of operations, and the senior director of the org, and they'd go through every single change that was going to be released on the weekend. Now, I was the operations guy. So when I saw that thumbs up (and by the way, they did an actual thumbs up, they were like, yes, let's release patch blah), that meant I was waking up at 3 in the morning on a Saturday night or Sunday morning and releasing code. So I was always sad to see that thumbs up. But the point I'm trying to get at is that there are still companies out there that do these things, where there is this little council of elders that says yes or no to releasing software. I've experienced it firsthand, and in the cloud native ecosystem this doesn't work. When you start changing containers, when you start changing applications at a microservice level, you could be making changes multiple times a day. Now, if you have a bunch of senior directors sitting in a room once a week, imagine how much money that was costing the company.
Now take that a little farther: every time you've got to release software, which could be multiple times a day, they'd have to sit there and do a thumbs up. That's your full-time job. That's ridiculous. But unfortunately, companies don't recognize this. When they move into the cloud native space, their whole policy for releasing software, the whole policy they've built over years and years of doing this, can no longer work. I realize this is very enterprise-y, but it exists, and it's very important to recognize. I still hate that thumbs up, I'm sorry. I didn't quite say that, but I have conversations off-mic and away from cameras about ITIL.

Of course, as you're walking down this path into the cloud native ecosystem, you're going to need to audit and verify your work. You're going to need to sit down (just give me a second, I don't want to cough into the mic) and audit and verify that you aren't doubling up on technology or work. A great example is that issue I was talking about earlier, with the war file and how they were trying to level-load the cluster: they were leveraging a scheduler inside the Java stack, and because it was all in one pod, when the job would spin out, it all stayed inside the same pod, in essence a VM. I know it's not a VM, but imagine it is. That's the reason it didn't level-load across the cluster. If they had leveraged the Kubernetes scheduler and actually spun the different jobs out as different pods, the work would have level-loaded across all the nodes. Leveraging the actual Kubernetes scheduler would have benefited them significantly, but because they didn't audit the work and understand how the technology was being run, they were trying to spend thousands upon thousands more dollars throwing more hardware at the problem.

Also, if you didn't know, inside Kubernetes or OpenShift (again, I've got a cough, the life of a professional speaker, right?) there's a load balancer built in. So as you step farther and farther into this space, you're going to have to take a moment and recognize that that thing that might start with an F and end with a five, that load balancer you've been using for however long, the big IP with a bunch of really smart networking people around it, you might not need anymore, because OpenShift just does it with a route, or you might have an ingress policy inside your Kubernetes cluster that does all the work for you. You've got to sit there and really audit your work. It's really important to spend the time and make sure you're not doubling up on things.

So, I might have worked at a small company that did a lot of automation. Is automation good here? And why is everything so complicated now? Well, first of all, very much so, of course. With all the moving parts, you're going to need to leverage automation and computers to take the error-prone human parts out of the equation. You're going to have to trust the bots to do the work for you. And honestly, your app has probably always been this complicated; you've just never been able to see how complicated it was.
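To make that level-loading point from the Sunday batch-job story concrete: if the batch work is expressed as a Kubernetes Job rather than driven by an in-pod Java scheduler, the Kubernetes scheduler places the worker pods across the existing nodes. A minimal sketch, with all names and numbers invented for illustration:

```yaml
# Sketch: letting the Kubernetes scheduler spread batch work across the cluster.
apiVersion: batch/v1
kind: Job
metadata:
  name: sunday-batch
spec:
  parallelism: 6        # six worker pods the scheduler can place on any node
  completions: 6
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: registry.example.com/batch-worker:2.3   # hypothetical image
          resources:
            requests:
              cpu: "1"
              memory: 512Mi
```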
Once you can visualize the complexity, you understand that those two enterprise architects who've been there for 20 years, who run that little application, or who basically are that application because they understand everything, are the reason you're paying so much money: they hold the complexity in their heads. But when you move to the cloud native space, now that you're truly looking at those pipelines, at the bottlenecks and whatnot, and visualizing what's going on, everybody can understand what's happening. You can focus on the bottlenecks and optimizations. And when you truly get to microservices, you'll be amazed at what information you can get out of it. A great friend of mine said this to me when I was walking him through this presentation: JJ, you had an ordinary bullmastiff (everyone knows what a bullmastiff is? it's a big-ass dog), and now you have 13 yipping chihuahuas. Thanks, Rob. And it's true, right? With microservices in the cloud native space, you don't have one big dog anymore. You've still got to take them for walks, you've still got to feed them, you've still got to spend time with them, but now you have 13 of them, and that's important to recognize.

And to take it one step farther, another great friend of mine, Ken, gave me this one: if something goes wrong, it really does become a murder mystery. Ken stole that quote from someone, and I stole it from him, but it really hits home. Because if you haven't spent the time and effort to create some centralized way of putting this information together, it's going to be a nightmare. Take a moment to think about this: unless you have set up some aggregated way of doing logging or log monitoring, every log file is going to be unique for every single one of those services, and as you walk through each of these processes, you'll have to figure out what did what, when. What did what when, that's a weird saying. More importantly, you need to create some type of standardized logging, so when something does go pear-shaped, you can actually figure out what happened. That also requires a higher level of cooperation between teams, right? It all goes back to working together, because you no longer have one catalina.out file; now you have a bunch of different files that could be who knows where. But that's getting into more detail than I should.

So let's ask some questions about the cultural shift, because what have I implied this whole time? It's not a technology problem; it's a human problem, moving into the cloud native space. So you're going to need to make some cultural changes. I mentioned that CCB earlier, and at Asgard Labs the CCB became something almost like what the Phoenix Project had. Who's read the Phoenix Project? We've all read it, because it's now required reading for our industry. At the very beginning of the book, when they had that little grouping that came together, people stopped showing up at one point, because it was overhead. Nobody cared, nobody spent time there. And that happens. That happens at multiple companies. How many times do people cancel that meeting about making a change or whatever? It wasn't engaging, and it became a burden. Moving into cloud native, you naturally start allowing self-orchestration to happen. You have rollouts and updates. You lean more and more on that pipeline and on collaboration to get the different widgets and features out at the right time. And what did I just say?
You've got to build that pipeline. Who does not have a CI/CD pipeline, or at least a CI pipeline? Who has a CD pipeline? You're all lying. Nobody has a CD pipeline. What are you talking about, continuous delivery? No, come on. So... joke? Nothing? Okay, fine. With the cultural shifts you'll see happen, you'll need CI and CD pipelines. You'll need some level of continuous integration or continuous delivery, and you need to pair those pipelines with linting so you can always make sure your code follows some standard. Now, the simplest thing to do is create some level of CI pipeline that includes some type of linting. I was an old operations guy (person, sorry) who would be woken up at three in the morning when something went wrong, and I had to read some code, and we did not have a linter. Something I love about Go specifically is that the go format command comes out of the box, so every single bit of Go code out there runs through gofmt and you know where that little curly brace is, you know how the code is laid out. Rust has got some stuff for this too, but you get my point. That means at three in the morning I don't have to argue in my head about where that little curly brace is, because we've standardized on gofmt, or PEP 8, it doesn't matter. The point is that you have some level of that, and for an operations person at three in the morning, that is super important. So if you have some level of pipeline, you need to have some type of linting in it, so that when the downstream product goes out the door in a microservices architecture, everybody has agreed on what things should look like. Okay, I know there are some developers out there who will argue with you until they turn blue, but they're wrong, because it doesn't matter what the software looks like: if it doesn't run, the company doesn't make money, right? The overhead of reading code should be as low as possible, so at three in the morning, when things go wrong, you can pay attention to what's happening in the logic instead of the syntax. Does that make sense? Thumbs up? Perfect. See, you are paying attention. Thank you for that.

So you've got to learn to collaborate with other teams. The hardest thing I saw Asgard Labs deal with was the actual collaboration between teams. They had some great propaganda around having scrum teams and tribes and shit like that, but nobody actually wanted to do it that way; they just still did it the old way. And collaboration isn't just status meetings. It's more than that. It's declaring shared contracts for jobs and responsibilities, with constant communication between teams. Jira tickets can only get you so far. I realize how scary that statement was, but it's true. Sorry, PMP friends. One of the most successful things I ever saw at an Asgard Labs company (which I want to name so badly, but they won't let me) was that every sprint, they switched out one person from one tribe to another portion of that global app. Every two weeks, you had new blood on the team you were on, and it would rotate; a different person would move around each time. Okay, for a handful of weeks that was a problem. But before you recognized it, with new blood every two weeks, your documentation gets better, because people have to become useful within a two-week sprint. You slow down significantly at first, but before you know it, you hire a new engineer, and because you're in this experiment, they just read the documentation for their sprint, and now they're useful.
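Circling back to the linting gate mentioned above, here is one sketch of what it can look like. GitHub Actions syntax and gofmt are assumed only because Go is the example in the talk, so treat the action names and versions as illustrative rather than prescribed.

```yaml
# Sketch of a CI formatting gate: fail the build if gofmt would change any file.
name: lint
on: [push, pull_request]
jobs:
  gofmt:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
      - name: Check formatting
        run: test -z "$(gofmt -l .)"   # gofmt -l lists unformatted files; empty output means clean
```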
A new hire is no longer six months out from being useful, or whatever. It just works. Granted, you need buy-in from executive leadership to make this happen, and there are a lot of social things going on, but the company that did it was able to get features out so quickly, because the ramp-up time was negligible. And before you know it, it turns out Jane Doe over here loves APIs; hey, two weeks ago I was working with her on this thing, turns out she's really good at it, maybe I can just ask her directly. It shares the tribal knowledge: it's no longer in a small little tribe, it's now in the community, and it spreads farther and farther, and it builds bridges and builds cooperation. It's such a simple experiment that if you're willing to risk slowing down a little bit, your velocity will just skyrocket in the long term. I'm really bad at telling that story because there's a lot going on. Did it make sense? Perfect, thank you. Yes, you did. I love it.

So of course, at every conference you need to talk about visibility and monitoring, because what do all nerds love? Graphs. I don't have a picture of a graph, I'm sorry. One of the Asgard Labs subsidiaries thought they could just buy one product and call it a day. I know there are some companies downstairs (downstairs, yes) that say they have a single pane of glass that will give you everything you need to know. I'm here to tell you that is marketing speak. You cannot just buy one product and call it a day. As much as people want you to do that, it is not possible; I've never seen a single company be able to do that. And yes, unfortunately, old sysadmins, Nagios is not going to cut it anymore. And they learned this the hard way. Sometimes you have to have multiple monitoring applications, with visibility into only the portions each team cares about. You have to recognize that different monitoring and visibility solutions have different focal points, so you're no longer going to be an expert in just one thing; you're going to have to figure out what you actually need to look at and leverage it. As badly as Asgard Labs wanted to buy that single pane of glass, it's just unrealistic. There are so many moving parts in the microservices ecosystem, or the cloud native ecosystem for that matter. This was a huge cultural shift for one of the Asgard Labs subsidiaries, and it's still something they're dealing with, because they can go to a conference, go down to the expo hall, and get sold something. I'm not saying don't. I'm not saying don't, but recognize that you need more than just one thing. For the love of God, please. Anyway.

So I had to put this slide in here, because as much as we're probably engineers in this room, or developers or whatever you want to call us (operations people, I don't know), something you've got to recognize is that the software you're running is controlled by someone in accounting, right? Your company needs to spend money to make money. So you need to spend a moment and recognize that the economics of running cloud native is different from what you've done in the past. In the cloud native space, OPEX rules all, right? Operational expenses. It's that simple. You can pay for everything with a credit card now, as we all know. I mean, that's a whole argument about shadow IT, right? Well, most of it. But you get the point. CFOs go back and forth on this constantly.
Of course, as a downstream engineer or developer, you probably don't have to pay a lot of attention to this, but you've got to make sure your management chain is. Because CFOs go back and forth: some love it, and, assuming you keep hold of your expenses and your spend, you can actually see a decrease in spend over time, which is really great. On the flip side, if you don't, you can see things skyrocket very quickly. Again, I've got a cough. Also, another thing most people don't talk about when coming from the data center or co-location space all the way to cloud native: CFOs used to use this accounting term called depreciation. Who knows what depreciation is? Yeah, unfortunately not a lot of hands, but that's fine. Depreciation is something that lets CFOs push money down the line, so they don't have to worry about it so much up front; that's what's traditionally called capital expenses. The cloud does not do that very well, if at all, and that causes some problems. Hopefully your accountants are talking to your management at some level, so they at least walk through the changes. Because, again, we're talking about cultural changes moving into the cloud native ecosystem, and you probably hadn't thought about the spend, or the challenges for accounting. It's not just technology; there are human problems here. The point I'm trying to get across is: you've got to make friends with your CFO.

What do you mean our support is now on Stack Overflow? Well, frankly speaking, yeah, it is. I'm not going to lie: you're going to spend some time inside the community again. We're at SCALE, so of course y'all are already involved in the community; you wouldn't be at SCALE if you weren't. But it's important to recognize that when you're moving towards, or are already in, the cloud native ecosystem, you're going to find yourself in other places like Stack Overflow. There are companies that leverage open source and build on top of it, and yes, when I mention stuff like OpenShift or Kubernetes, you're going to have to find out where these people meet, because things are going to go wrong and you're going to need to work with them. Of course, you can throw money at companies to make it easier, but if you're involved in the communities, things are just going to get a lot better, a lot quicker. And of course, you can run Kubernetes and OpenShift on IBM Cloud.

So let's talk about some tangible things you can start with, to hopefully become more successful. There's a ton of technology to help you get going. The best thing you can do is really take a moment and figure out: when you containerized your app, did you really containerize it, or did you just wrap it in a pod and wash your hands of it? Back to that VM conversation: you just did what you did with VMs, shoved it onto Kubernetes, and called it a day. You have to have a larger conversation about why you did this. Was it because you didn't want to be left behind, you just wanted to be cool and go after the shiny? Or was it because you thought you could leverage this software to get better value for your customers? Believe it or not, that's why we're doing this, right? We're trying to bring value to our customers. Hopefully, masking all these corporations as I have with Asgard Labs has helped highlight some of the consistent issues that I've seen. And the best thing you can do is first ask yourself: do you really need to?
I mean, I've talked to more companies about not migrating to the cloud than about migrating. That's the irony of this talk: as much as you might want to migrate to the cloud native space, if you're already making money with the product you have, if it's already in the black, don't touch it. Build your next product in the cloud native space so you don't have to migrate this one. It's amazing how much time and engineering effort you'll lose moving from one to the other. Of course, sometimes you get mandated to do it, but it's a real conversation you have to have. If you're already committed, well, then you probably need to take a beat and look for optimizations instead of features. This will drive your teams crazy, it will drive your executive teams crazy, because all of a sudden you're not building new stuff; you're building resiliency, and resiliency is important. The more you pay up front, the more you do your homework and use the correct tool for the job, the better off you'll be. As a good friend of mine, Thomas Kate, said: you wouldn't use a saw when you needed a hammer, or a hammer when you needed a saw, right? Take a moment and just imagine that in your head. You can use a saw to hammer in a nail; you'll look like an idiot, but you can do it. Or you can take that hammer and just beat on that piece of wood you need to cut in half; you'll look like an idiot, but it'll work. So think about it. Thank you.

We've got some time for Q&A. Anyone have questions? I'll bring the mic right over to you. I'm a Pisces, I'm more of a Netflix kind of guy, I can talk about TV shows all day if you want. "In the instances where you have some piece of software that needs a complete rewrite anyway, would you recommend rewriting it as a monolith first, or would you transition straight to microservices in that case?" That's actually where the strangler pattern fits in perfectly. Because you already have an application that exists (did I understand that correctly? you already have an application that exists), now figure out some feature inside that application, build a microservice of that thing, and then remove it from the monolith, so you run them side by side. Back to the bad example of the OCR stuff: if you already have OCR built inside your monolithic app, you build an OCR service, run it side by side with the monolithic app, and as soon as you're comfortable with it, you remove it from the monolith, and now you have a microservice that does that thing. Does that make sense? Yeah. So you just piecemeal it out. That's the idea of the strangler pattern: you make it smaller and smaller, shaving off little bits and pieces, features and functions, into their own little entities. Does that make sense? "Yeah, that makes sense. Thank you." Any other questions? It also allows you to write each piece in whatever language you want. Because they're their own entities, maybe it turns out that writing that thing in Python, where you can just import whatever, is better than rewriting it all in Node or whatever. Each one is its own entity, its own container. Sure. All right, let's give another round of applause for JJ. Thank you so much. I can put the jokes back up if you want. We've got a little bit of a break, and then next up will be Rob Richardson, at the helm of Kubernetes. That is such a bad pun. I love it. Cool. Where's my cell phone? Right here. Michael. Michael. It's Michael here. Mike.
This concludes the mic check. Cool. Hello, everyone. Thank you. This is the Cloud Native track. Next up, we have Rob Richardson at the helm of Kubernetes. Does anyone here already use Helm? Oh, you've got an audience. This will be fun. Go for it. I'm about to learn something cool, then. So we're going to build a Helm chart live today, where we package up content and deploy it to our cluster. And here's the part where I tell you I will definitely post the slides on my site tonight. If you've ever found a speaker who does that, I would invite you to stop them and make them post their slides right now. I actually chased a speaker once for about six months, and he said, I don't even do that talk anymore. And I said, and you never posted them? Which is why you can go to robrich.org. Let's head there right now. I'll click on presentations here at the top, and here is At the Helm of Kubernetes. The slides and the code, up on GitHub, are online right now. Achievement unlocked. So here is the source code of the application that we will build — the details of that application are less important. Here is the Helm chart; that's the done scenario when we get to it. And here is the Kubernetes YAML; that's where we'll begin. So I definitely started in the middle. Let's jump back here, and while we're on robrich.org, we'll click on About Me and learn about some of the things I've done recently. I'm a Jetpack developer advocate, so if you're struggling with deploying to Kubernetes, I would love to learn with you. I'm a Docker Captain and a Friend of Redgate, and Microsoft has given me some awards as well. AZ GiveCamp is really fun. AZ GiveCamp brings volunteer developers together with charities to build free software. We start building software Friday after work; Sunday afternoon we deliver that completed software back to the charities. Sleep is optional, caffeine provided. If you're in Phoenix, come join us for the next AZ GiveCamp. Or if you'd like a GiveCamp here in LA or wherever you traveled from, find me here at the conference or hit me up on email or Twitter, and let's get a GiveCamp in your neighborhood, too. Some of the other things that I've done: Pro Microservices in .NET 6 — that book just came out; that was a lot of fun. And one of the things I'm particularly proud of: I replied to a .NET Rocks podcast episode, they read my comments on the air, and they sent me a mug. And if you'd like a .NET — I'm just kidding. So let's dig into Helm. Helm is a package manager for Kubernetes. With Helm, you can template YAML files, you can build installable packages, and you can share packages with others. Hey, wait a minute. Isn't that what Docker does to containerize things? Don't I have a package manager? Don't I docker pull and docker push? What's this Helm thing, and why would I need it in addition? Or does it replace Docker? Docker packages up our application. We still need a Dockerfile. We still need to push our image to a Docker registry. We'll still use that Docker tool chain or another OCI-compliant container infrastructure. Then we end up with the Kubernetes YAML soup. We'll probably have a deployment that specifies pods. We may have a service in front of that. Maybe we have an ingress. Maybe we have roles or persistent volumes or cron jobs or all of the other things that wrap our service into this installable system. Helm is about packaging that stuff — the Kubernetes YAML pieces.
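For orientation, here's roughly the shape of the chart the talk builds toward; the folder and file names follow what the talk uses (chart/, templates/), and the annotations are mine.

```
chart/
  Chart.yaml        # chart metadata: name, description, type, version, appVersion
  values.yaml       # just the handful of values we care about
  templates/
    deployment.yaml # the Kubernetes YAML, with {{ .Values.* }} placeholders
    service.yaml
    ingress.yaml
    NOTES.txt       # printed after a helm install
```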
Let's focus in on the values that we care about and kind of avoid all of the YAML soup. So Helm doesn't replace Docker. Helm uses Docker to be able to install the system as well. So Helm demos. And the majority of our talk is here. This is really cool. So I popped open our GitHub repository here. Here's our GitHub repository. And I cloned it and popped it open in VS code. Now I've deleted the chart folder so that we can build that ourselves. And we have this Kubernetes YAML file. Now I've already built this image, this at the Helm image. So if we say Docker image list, there's at the Helm version 0.1.0. Cool. Our Docker file is right there. So if you want to comment on my version choices, then I would love to chat with you about that. So let's build a Helm chart. Now there's a lot of YAML here. And so if I were to pick at the Helm, and we can just see how many times that appears in this YAML. There's a lot of duplication here. Now it makes sense. Kubernetes needs to specify all of the details about Kubernetes. But I don't want to deal with it at that level. I want to deal with it with just the values that I care about. So let's create a new folder. I'm going to create a, that's a file. That's not a folder. Let's create a new folder. And I will call this folder chart. And inside that chart folder, I will create a templates folder, new folder templates. And then inside the chart folder, I'm going to create a new file. And I'll call this values.yaml. Now this is outside the templates folder, just inside the chart folder. Now I could have called that chart anything, but by default Helm packages are called charts. So that seems like a good default. Now here in this values.yaml, let's identify the values in this Kubernetes YAML file that we care about. Now this is definitely dependent on your business logic and your opinions. So your opinions may differ from mine, and that is totally fine. But let's start out with name at the helm. I want to focus in on that service type of node port. That sounds like a good one to capture here. Ports, I have a whole lot of port 80 in here. So I'll set that in place. Replicas. Now maybe in our company, we only have one replica or we require you to have three. So maybe we don't want to include this. But in this case I do. So I'll copy replicas in here too. Image label right here, the version. But I want to call it image label. So I'll rename it to image label. Now I'll probably want to swap that out at build time. So I'll just leave this value as image label. I could also set its value to blank if I really wanted to. I've got the image label. Oh, I've got this registry URL right here. So let's include this registry. There's that. I'll probably swap that out at build time as well. Oh, we've got some resource limits. So let's copy in the resource limits here. And let's get this indented the way YAML likes it to be. There we go. And I've got an ingress. Now depending on the service that I'm building, maybe I'll have an ingress, maybe I won't. And so let's create a new value here called ingress enabled. And I'm going to set its value to true for now. We've also got some ingress annotations here. And these are probably dependent on my cloud. And so in this case, I've turned on HTTP application routing, but I may choose to specify an nginx ingress or whatever other details I need about my ingress. So I've captured that. That's pretty cool. And then let's go for the ingress host. There we've got the ingress host. Cool. 
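A sketch of what that values.yaml plausibly looks like at this point; the exact key names and defaults are my guesses from the narration rather than a transcript of the screen, and the annotation shown is the Azure HTTP application routing one he mentions.

```yaml
name: at-the-helm
type: NodePort
port: 80
replicas: 2
imageLabel: imageLabel       # swapped out at build time
registry: registryUrl        # swapped out at build time
resources:
  limits:
    cpu: 100m                # placeholder limits
    memory: 128Mi
ingressEnabled: true
ingressAnnotations:
  kubernetes.io/ingress.class: addon-http-application-routing
ingressHost: at-the-helm.example.com
```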
So out of this 56 lines of YAML, I really only care about 14 lines. All the rest is just kind of fluff. Now I got to pick the values that I care about and hone in on those. So if you wanted to pick different values, maybe you really don't care about the port or it's always service type of node port, then you could definitely pick more or less details to be able to hone in on the pieces that you want. Next, let's grab a chart.yaml. So I'm going to dig into my stash because, of course, great developers go steal from the last time we did this. Let's pull in chart.yaml. And here's chart.yaml. Here in chart.yaml, API version and name, that sounds a little familiar. In Kubernetes, we have API version and kind, so that works. Now I've named this name at the helm. Oh, cool. So I don't need this one. Let's delete that one. We have a description which is kind of nice. Here's a sample of Kubernetes using helm for Kubernetes. We have a type. Now there's two different types. There's library and application. A library is a suite of functions associated with helm charts that I might use to consume in other helm charts. Now in this case, I have an application. I have an end user thing that I'm going to install. So am I installing an application or a library? It's that thing that I'm installing. So I'll choose application in this case. Then I have two versions, the version of my chart and the version of my application. Have you ever gone into your package manager and the person maintaining the package is different than the person who authored the software? So there was some bug in the package, and so they had to increment the version. So now the application version and the package version are different. I can't publish the same version, but the application didn't change, but I needed to change the packaging. So in this case, we do have two separate versions. One to represent the version of the software that we're including in this package, and one to represent the package itself. So if we need to repackage our software, we can increment one without incrementing the other. And it's really easy to understand what's installed here. So we've got our version and our app version. That's excellent. And we have our 13 now values that we care about. Next, let's focus in on trimming this Kubernetes YAML file apart into its pieces. So let's pop open the Kubernetes YAML file again. And let's create here in our templates folder, we'll create a new, not folder, we'll create a new file, and we'll call this service.yaml. And here in service.yaml, we'll go grab all of this content and we'll set it there. And we'll create a new one. Let's call this deployment.yaml. And we'll go grab this stuff. And we'll put it here. And we'll go create another new one ingress.yaml. And we'll go grab this piece. And we'll set it there. Now I'm already liking this because I can split up my pieces into different files. Now I could have probably done that before and then done a kubectl apply-f and just pass in the folder. But now I kind of get that nice separation of concerns, which I like. So let's start off at the service. And let's see if we can replace the details from this values.yaml. So first thing we have this at the helm. Now we had dot chart dot name coming out of the chart. So let's grab that. In fact, we can replace that here and here. We have here the service type. So dot values dot type. Excellent. We have a port dot values dot port. And in fact, we have that specified twice. So let's just put that there. 
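Put together, the Chart.yaml and the templated service probably look something like this; the apiVersion and field layout are standard Helm 3 conventions rather than a literal copy of the demo.

```yaml
# chart/Chart.yaml
apiVersion: v2
name: at-the-helm
description: A sample of using Helm for Kubernetes
type: application
version: 0.1.0        # version of the chart (the packaging)
appVersion: "0.1.0"   # version of the app inside the package

# chart/templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ .Chart.Name }}
spec:
  type: {{ .Values.type }}
  selector:
    app: {{ .Chart.Name }}
  ports:
    - port: {{ .Values.port }}
      targetPort: {{ .Values.port }}
```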
Hey, the majority of our service was just kind of, you know, fluff. That's kind of cool. We've just abstracted all of it out of the way so that we can focus in on just those values that we care about. Let's do that again inside of deployment.yaml. So I'm going to go look for at the helm. Let's replace it there and there and there and there. And that's so cool. I don't need to focus in on all of those details. It just kind of melts away replicas. Let's do dot values dot replicas. Pardon me. I've got dot values dot image label here. And in fact, I also have the image label there. And let's put in the dot values dot registry there. And I'll put quotes around it just so that my editor gets that this is one string, rather than trying to figure out, you know, how do these things link together? Okay, we've got one more dot values dot port. And we'll set the resources aside for a moment. Cool. We were able to focus in on just those values that we cared about. And these are the values that I chose to focus on. But you may choose to include more or less. I was actually working with someone and the only variability that they had in their Kubernetes YAML was the deployment server name. You know, they were just setting environment variables. So their values dot YAML included one line. It was really cool. Line 18. Line 18, I still have at the helm. Thank you. That would have been embarrassing. Excellent. Good catch. Next up, let's head to ingress.yaml. I've got a few more at the helms. This one actually is dot values dot ingress host. And here's another at the helm. Let's put in dot values dot port there. Glad I didn't need to restate the port 27 times. I misspelled values. Cool. Now let's take a look and see how we're doing. I'm going to come in here and I'm going to say helm lint. And what's the folder that I choose to lint? My folder is named chart. So I'll lint the chart folder. Oh, cool. One chart linted. No charts failed. Now it really wants me to put an icon inside of my chart inside of my chart dot YAML file. Okay. So that was the easy stuff. We just replaced this and it looks a whole lot like when we're building, I don't know, our HTML together with our data. We have an HTML template and we have some data coming in from our application and we kind of mirror, marry those together in an interesting way so that we can build up this template. That's exactly what we're doing here. It's just that we're using this, you know, handlebar syntax and we're marrying our data coming out of values dot YAML and chart dot YAML with our YAML. That's pretty cool. Now the interesting thing here is that these expressions are actually just go templates. So if we come to the helm documentation, we can take a look at the various functions that we have and, you know, start to leverage those. But they are just go templates. So we could also just, I don't know, flip over to the go helm, the go template documentation and start to understand how we might use interesting things. Now we definitely could go really intense here and build up this whole logic tree. But in this case, let's just go simple. But I do like this if and else thing. So here in our ingress, if we don't have an ingress, let's say something like if values dot ingress enabled. Now one other trick here is I'm going to put the dash here. Now it's mad because it wants the dash here as well. But I only want a dash here. The dash says space trim. So because I've chosen a space trim here, if ingress is enabled, I'll not start this on the next line. 
I won't have a blank line; I'll just push this straight up here. And then here, let's do an and, and I'm going to space-trim there too. Cool. Okay. Yeah, I'm going to put the space trim everywhere, because that makes my editor a little happier. Fine. Cool. So, a little bit of logic here on how we can start to glue these things together in interesting ways. Now we've got this if here. Next, let's take on these annotations. I do want to dig into the annotations, but the annotations are, well, a bunch of YAML. So I can't just go grab the value; I may have more than one, and especially in this resources section I have some content embedded in it that I want to capture. So instead, here I'm going to say space trim, toYaml, dot values dot ingress annotations. Annotations. Did I spell that wrong? As long as I spell it wrong consistently. Okay. So I've pulled in my annotations, and now I need to make sure I indent it four spaces. I'm indenting it four spaces because, well, that's how this is indented — it's indented four spaces. So it's going to grab that whole block of YAML, set it in place right here, and move it in four spaces. Cool. So we've got that one in place. Let's also come into the deployment here and grab these resources. In this case I'll do the same thing — toYaml, dot values dot resources — and I'm going to indent it 10 spaces. Nice. So we could continue adding more logic depending on the needs of our system, but let's go lint it again. And, oh nice, no typos so far. I'm going to pull in one more file, this NOTES.txt, just to save us some typing. This NOTES.txt just shows some interesting information about our app as we deploy it: thanks for installing chart.name, here's how you use it, here's some more stuff about it. That's kind of cool. So now we've got our thing. I'm going to lint it one more time, and everything looks good. Cool. So now let's grab the content of this template and see if we can run it in interesting ways. Now, in this case we only templated a deployment, a service, and an ingress, but we could also template persistent volumes and persistent volume claims, config maps, cron jobs, or the YAML that you're sticking inside of your application. It doesn't need to be a Kubernetes thing; it only needs to be a YAML thing. So if you have a particularly gnarly config file where you have to set this into it and set that into it, well, Helm might be the right way to take more control over that YAML file and focus in on the values that you're looking for. So let's say helm template, and I'll give it the folder name, which is chart, and out pops the YAML. Cool. We've got the port 80 in place. We've got the at-the-helm in place. We've got some stuff here; it looks like we're doing pretty good. Let's do this helm template chart, and I'm going to redirect it to dist.yaml. So we've dumped it into this file here. And if we take a look, we can see dist.yaml. I'm going to grab k8s.yaml, and I will compare them. So we've got k8s.yaml here on the left — that's the file that we started with — and dist.yaml here on the right — that's the file that we just built. Now we do have some comments specifying what file each piece came from. That's cool. We also have the quotes that we chose to put around the data here, which is pretty cool. And other than that, it looks like we have achieved the file that we started with. So no bugs so far. Nice. Oh, it looks like I can upgrade this app too.
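Pieced together, the ingress template he's describing looks roughly like this; the conditional, the trim markers, and the toYaml | indent 4 are the bits from the talk, while the key names and the spec skeleton are my reconstruction.

```yaml
# chart/templates/ingress.yaml (sketch)
{{- if .Values.ingressEnabled }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ .Chart.Name }}
  annotations:
{{ toYaml .Values.ingressAnnotations | indent 4 }}
spec:
  rules:
    - host: {{ .Values.ingressHost }}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{ .Chart.Name }}
                port:
                  number: {{ .Values.port }}
{{- end }}
```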
So now that we're able to template this out, we were able to focus only on those 13 values that we cared about, and we could build up our YAML. So now let's start to set variables. helm template chart: I might just choose to set imageLabel equals v0.1.0 and ingressEnabled equals false. Ooh, I didn't get an ingress this time. And instead of spitting out imageLabel in all the places, it spit out the version that I set for it. So let's imagine a use case where we just finished the build. We want to grab some variables associated with the build context and just dump them into our Kubernetes cluster right away. So I can do a similar thing. Let's say helm template chart, I'm going to set imageLabel equal to v0.1.0, and ingressHost equals quote quote, and registry equals quote quote. I'm going to pipe that to kubectl apply dash f dash, and we'll be able to blow that straight into our cluster. Now, that's cool. We were able to capture the build context and focus in on just those variables that we cared about. kubectl get all, and we can see everything is running except for — ooh, we've got some problems here. Let's take a look at that. kubectl describe this pod: successfully assigned the thing, failed to — ooh, what is the name of my image? The name of the image is slash at-the-helm. That isn't quite right. Let's pop open our template again and take a look. Okay, so we have our registry, slash, the chart name. That looks good if we have a registry. If we don't have a registry, that's probably not okay, because, well, we don't want that slash. Okay, so let's do this: if dot values dot registry, space trim, then do the registry and do that slash, space trim, end, space trim. Cool. So now that we've been able to conditionally put in that slash, let's try this again. Let's template this chart, pipe it to kubectl apply. The service is unchanged, the ingress is configured, and the deployment is configured. kubectl get all and ingress. There is no — oh, I did create an ingress. I just created it with no host. Nice. And now my app is running. Cool. We just took the details from the build and stuffed them into our Kubernetes cluster. If that's the need that you have for this templating system, then you can definitely stop here. This is a great use case for Helm. I just focused in on those values that I cared about and I blew those into my Kubernetes cluster. And just as easily as we created them, let's go delete them: kubectl get all and ingress, and we've got nothing running. Cool. Well, that was a lot of fun. Let's level up a little bit and talk about how we might package our Helm charts into something that we can then install into multiple environments. Now, again, we don't need to go here. If your need is just to blow the results of a build into a cluster, then you may not need to go here at all. But let's do this. Let's say helm package chart. Cool. So it built this thing, this at-the-helm 0.1.0 .tgz file. And if we pop open this file — here's that file. Let's 7-Zip it: un-gzip it to there, untar that to there, and let's take a look at what we've got in this folder. Oh, it's our Helm chart. In fact, let's diff this folder — let's compare it with our chart folder right here. It basically just packaged up all the things. Now, I'm choosing not to show identical files; we can see that all the black ones are identical, so let's just focus in on the differences. And it just removed whitespace and sorted things. That's fine. Cool.
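In command form, that stretch of the demo is approximately the following; the flags are standard helm/kubectl, the value names are the ones guessed earlier, and the conditional image line is one way to express the slash fix, e.g. image: "{{ if .Values.registry }}{{ .Values.registry }}/{{ end }}{{ .Chart.Name }}:{{ .Values.imageLabel }}".

```bash
helm template chart --set imageLabel=v0.1.0 --set ingressEnabled=false

# "just finished the build" flow: render with build-time values, apply straight to the cluster
helm template chart \
  --set imageLabel=v0.1.0 --set ingressHost="" --set registry="" \
  | kubectl apply -f -

kubectl get all
kubectl describe pod <pod-name>   # spot the bad "/at-the-helm" image reference

# package the chart into at-the-helm-0.1.0.tgz
helm package chart
```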
So we're able to package that, and that's pretty cool. Let's delete this one and delete this one, and let's package it with a little bit more detail: helm package chart, and I'm going to set the app version to 0.1.0, the version of my chart to 0.1.0, and the destination folder to the dist folder. Cool. So now I have this at-the-helm, version 0.1.0, inside the dist folder. Okay, so let's install it. Helm — oh, so that was cool, I got to set my app version and my version. What if I want to add those variables associated with my build context? I wanted to override that image label. So let's do that exact same thing, except now I'm going to say dash dash set, and I'm going to set imageLabel to be 0.1.0, and registry to quote quote, and ingressHost equal quote quote. And I left off a quote right here, because terminals are wonderfully weird. Oh — unknown flag: set. It turns out that the package command doesn't have the set flag on it. But there is a really cool Helm plugin called helm pack, which has exactly the same syntax as the package command except that it also takes set. And so, because I've got that helm pack plugin, I can come in here and say helm pack — not that helm pack — and now I've got that at-the-helm 0.1.0 that includes these variables set. Now, I'm not going to untar this tarball, but trust that inside values.yaml it just has those things set the way I needed them to be. So now that we've got this file here inside our dist folder, this at-the-helm, let's install that chart. Let's say helm install my-app — I'll just give it a name — and I'll point it at where my Helm chart tarball is. And now I can choose to set some environment-specific variables. So let's say ingressEnabled is false and replicas is two. Cool. So that was awesome. As I packaged my chart, I got to set some build variables — maybe I'm setting the git hash or the branch name, other details associated with the build context. And then when I installed it, I got to set the install context — maybe I'm setting environment variables or the environment name. And now that I've got my content in place, we get that notes file that explains nice things. Oh — helm status. helm status my-app, which by the way gets us that page that we just saw. Then helm get all my-app, and it spits out the Kubernetes YAML that was actually applied to the cluster. That's pretty cool. Or kubectl get all, and we can see the resources that got deployed. Now, we chose to deploy two replicas and we chose not to deploy an ingress, so we have no ingress running. And if we now go to this particular port — let's just head out there and go to localhost and that port — we can see our app running. Cool. So if I now do this helm list, we can see the Helm charts that we have installed. Now, the Helm chart is just that package of stuff, and if we had just done a helm template piped to kubectl apply, we wouldn't get this helm list. But we can kind of see: here's the chart tarball that we used; here's the revision, which is just an auto-incrementing thing — so we've installed it once and we haven't upgraded it; and here's the app version that we had installed. And that's pretty cool. Now let's imagine that we needed to upgrade this. Maybe there was a bug in our chart, and so we wanted to upgrade our app. So let's say helm upgrade my-app — that's my install name.
And I'm going to point it at my tarball, which in this case is the exact same tarball that I used previously. It will now upgrade it, altering whatever resources it needs to get to the current state. And I can now helm list, and we are on revision two. Now, in this case I didn't change anything, so it didn't replace any Kubernetes resources, but it did nicely show how we might upgrade an application. So that's pretty cool. We've got the Helm chart built, and we're able to apply it into our cluster. But I noticed something that kind of bothers me here. kubectl get all. What if I want to install two copies of my application? Right now I'm using at-the-helm right here as the name, and so if I were to install it a second time, it would tell me that that name is already in use and that I can't install this second copy. Hmm. Well, that's a bummer. Let's come back to our chart and see if we can fix that. Now, in all of the places here, we're saying the name of the thing is the chart.name. But here in notes, we have the release name. That's kind of interesting. That was the name that the user chose as they installed our Helm chart. So let's use that here in our resources. Let's rename our service to chart name dash release name. In our selector, we're also going to say here's our release. We don't need to change the port or the target port, and so now we've got our service configured. Similarly, let's come to our deployment. Our deployment is named chart.name dash release.name. We don't need to change the replicas. Let's change the selector to release. We'll also change our labels here — release, there. We don't need to change our image name, because that is the image name baked into the pod. So if we wanted to, we could, but I'll just leave it alone. We don't need to change our container port or our resources. Let's flip over to the ingress. Let's change the name of our ingress, and we also need to change the name of the service that we're pointing to. Our service name is not just chart.name anymore, but rather chart.name dash release.name. Cool. Now we have a release-specific deployment. So in this case, let's do a helm lint chart and validate that we didn't make any typos. That looks good. Excellent. helm package chart. Our app version didn't change — it's still 0.1.0 — but our chart did change, so this is version 0.1.1. Our chart incremented versions; our application did not. That's pretty cool. The destination will still be the dist folder. We'll set some values like imageLabel equals v0.1.0, and our registry is quote quote. Now, I'm noticing that I keep setting the registry to blank. If I'm going to do this all the time, I probably ought to pop open my values.yaml and set the registry there to quote quote. But I'm going to be dumb and leave it as a replacement expression. That'll let me know if I haven't set it correctly — but maybe in my organization it's a business rule that it's always the same, so maybe I don't need it to be one of the values that I care about here; I can just leave it in my YAML. Okay. So I've set my registry to nothing, and I'm going to set my ingress host to blank as well. And now we've packaged this into 0.1.1, a new tarball. Cool. So let's install this: helm install my-app-2. We'll grab dist slash at-the-helm, and we'll make sure that we grab 0.1.1. We'll set ingressEnabled equals false and replicas equals one. And that looks good. So now we have two copies of our application running. helm list.
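Before reading that helm list output, a sketch of the release-scoped naming just described — the pattern is {{ .Chart.Name }}-{{ .Release.Name }}; the label keys (app/release) are my guess at what's on screen.

```yaml
# chart/templates/service.yaml (fragment)
metadata:
  name: {{ .Chart.Name }}-{{ .Release.Name }}
spec:
  selector:
    app: {{ .Chart.Name }}
    release: {{ .Release.Name }}
# deployment.yaml and ingress.yaml get the same rename, and the ingress backend
# now points at the service as {{ .Chart.Name }}-{{ .Release.Name }}.
```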
We have my-app, which is app version 0.1.0, and my-app-2, which is also 0.1.0, but we can see that they're using different tarballs: the first one is using 0.1.0 and the second one is 0.1.1, which at this level of wrapping is really hard to see. So let's kubectl get all and take a look at the resources that we've got. Cool. We have a deployment and a deployment-dash-release-name. That's pretty cool. We have a service and a service-dash-release-name. We have some pods and some pods-dash-release-name. So let's upgrade our original one: helm upgrade my-app, pointing at dist slash at-the-helm 0.1.1. kubectl get all, and now we can see that because we've upgraded our first install as well, we also have those release-specific names in our cluster for it. That was kind of cool. We got to focus in on only those values that we cared about. We didn't need to worry about all the other YAML pieces. We got to choose those 13 values that were important to us, and if your business needs are different, then you can adjust that and maybe only focus in on 5 or 12, or maybe you need 25. And just as easily as we installed them: helm uninstall my-app, helm uninstall my-app-2, kubectl get all, and all of that content is gone. Did you grab the right YAML file? Do I need to set the values on the way through to get it to delete correctly? All of that just kind of melts away, and Helm was able to take care of that for us. Cool. So we got to run a template straight into Kubernetes, we got to package that in an interesting way, and we got to leverage that package to reinstall our app in the cluster based on our environment. What if we want to share these packages with others, or even consume packages created by others? Let's come to Artifact Hub. Now, Artifact Hub is the registry of Helm charts. Similar to Docker Hub, where we're looking at particular Docker images, here we're looking at particular Helm charts. So let's pick one. I'm going to say nginx, and let's search for an nginx Helm chart. Here's a good nginx Helm chart, and it tells me about it. Oh, here's the install instruction. That's nice: helm repo add. That's kind of cool. Artifact Hub is not a registry of Helm charts; Artifact Hub is a registry of Helm chart registries. Now, the difference is subtle, but it's really cool. You know how Docker Hub is really struggling with "you're consuming all our bandwidth, will you please quit it"? Artifact Hub doesn't have that concern, because all it says is: the server's over there. So let's do that. Let's helm repo add bitnami, and we'll give it the URL to Bitnami. Yep, now we've got that repo in place. And our next step is to grab this and install it. Now, what Docker image do I need? What configuration values do I need to set? I don't know. I don't care. I'm just focused on trying out this product, and it gives me some cool instructions about how I might get at it. I can get the service port and the service IP and then, you know, echo that nicely. I'm just going to go the hard way and look at that particular port. That's cool: my app is running. So now let me come over here to localhost and that port number, and we can see that we've got nginx up. What are the settings? What are the details? I don't care. I just grabbed this Helm chart and I quickly got this application running. That's cool. Now, if we dig into this, we can see that there are some configuration values that we could choose to set.
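For reference, the Artifact Hub flow he just ran is roughly this; the Bitnami repo URL is the one the chart's page usually gives, and the release name is arbitrary.

```bash
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-release bitnami/nginx
helm list
kubectl get svc            # find the port the nginx service is listening on
# then browse to http://localhost:<that-port>
```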
We could say dash dash set or we could set these inside of a values.yaml file and we could choose to apply those as we start this. And these are all the things that the author of this helm chart deemed important. So I don't know. That's kind of a lot. Not all of those are important to me but some of them are and some of them are important to you and those are probably different. So it's kind of nice that they gave us a lot of configuration options so that we could really configure this any way we needed to. But I also assumed that some of the assumptions baked into this chart, some of those opinions, may not apply to me. So maybe this helm chart isn't the ideal way to deploy this software if my opinions differ from the opinions of the helm chart author. That's easy enough. Let me go grab the source of their helm chart and pick different values.yaml, adjusting the template to match. I could definitely do that if I wanted to. Or sometimes I just want to understand what docker image they used and I'll just start up the docker image myself. So we were able to spin up this helm chart, helm list. There's our thing, my release. That's what I happen to call it as I copied and pasted it. So helm uninstall my release and poof, it's gone. No registry rot, no DLL hell, no all of the things that we might associate with trying out software. I just quickly grabbed the helm chart, I spun it up, it did the thing, I got to play with it and now I understand more about the capabilities of the software. I might now choose to go mess with those configuration values to get it tuned to exactly what I need or maybe I discovered this isn't the piece for me and let me go find the next software. That's cool. Now we could definitely choose to publish this helm chart to artifact hub or we could choose to publish this helm chart to an internal helm registry. A lot of docker registries, especially cloud-based registries now also support publishing helm charts so we could choose to put in helm details there too. Cool. So that was a lot of fun. We got to see how we built up a helm chart template, just putting in the values that we cared about so that we could focus in on just those 13 things that were important to us. At the end of our build, maybe we just want to take the build context and blow that into place and that is an excellent use of helm. We also then upgraded to be able to package this into a tar ball and we saw how we could then set the build context variables as we built that tar ball and also set environment context as we push that tar ball into place in each environment and then we upgraded again where we started talking about sharing helm charts, pushing them into a helm chart registry much like artifact hub. That was fun. The code for all of these is up on GitHub and so that chart that we just built is right here and you can see there's those 13 values that we cared about. So it was cool. We chose to scaffold a deployment and a service and ingress. You may also choose to scaffold PVs and PVCs, cluster roles and cluster role bindings, config maps and secrets or you may choose to scaffold a particularly gnarly YAML file in your application. Focus in on just those pieces that you care about and let helm melt the rest of that YAML away. Helm is great for being able to install things. We saw how we could install them and upgrade them, but it can't monitor, scale or heal these things. Now we could definitely bake those pieces into our YAML. 
We could get it to talk about how we might auto scale things, how we might self heal things, but helm itself doesn't really give us any of those details. Helm was great for installation. We got a little bit in upgrades and helm definitely helped us there. But if we want to take more control over this process, that's when we reach for Kubernetes operators. Now I totally get that this kind of mirrors AngularJS. As soon as you get to custom directive, the learning curve just goes straight up. Operators are much more difficult than the helm charts that we built. But if you want to take more control over that, if you want to get more into the lifecycle insights or even autopilot where you can get this operator to start to control the pieces, then that's when you'll reach not for helm, but for the operator SDK. But for just taking care of installation and upgrades, helm worked out really well and it's a really simple tool to get started with. Helm was great for removing all that duplication. We saw even in service.yaml how we could focus in on just those two values that we cared about, the name and the port. All the rest of that YAML just kind of melted away. It was cool. We used the three different pieces where we could just template it and blow that into place or where we could package it and start to install this package in interesting places. And I'm really excited that we got to set some build-specific variables on the way through, but also set environment-specific variables as we installed this tarball. So let's get the build version, maybe the git hash, the branch name, and bake that into the tarball. And then let's also set the environment name, any environment secrets that we need to to be able to get that installed configured for that last mile. And then if we wanted to level up again, we could start talking about sharing our packages with others. Artifact Hub is a great example, but you can also get private helm registries for each of the clouds. That was a lot of fun to be able to show you helm from start to end. If you're watching this on demand, hit me up on Twitter at rob underscore rich. If you find a question tomorrow or next week, hit me up on Twitter as well. But for those of us here live at the event, what are your thoughts? What are your questions? What should I do next time to expand this talk to better fit your needs? Does anyone have questions? Sweet. So yeah, the first question is how does it manage dependencies? Would it automatically pull the dependencies it needs for like let's say engine X or something like that? Good question. How does it manage dependencies? The short version is it doesn't. But the medium version is as we look here in this helm, we had here in chart.yaml the application type application or library. We could create a helm library and have that pull into helm charts as we build them. But that's still not quite dependency resolution. I imagine what you're looking for is when you install this helm chart, I also want that helm chart run to and helm kind of doesn't really do that. On the other hand, you could create a super helm chart that included both of those. Probably not the level of separation you're looking for. Great question. Thank you. Any other questions? Cool. That's awesome that the first question came from here and the second question came from there. Somebody over here get the next question because that'll be fun. Oh yeah. I was wondering if you had any thoughts on helm file and whether that would be an extension to your talk. 
Good question: Helmfile. A Dockerfile is to Docker Compose as a Helm chart is to Helmfile. Helmfile is kind of a YAML file that describes all of these Helm charts together. It is really elegant. That is a great idea for extending the talk, and it may also help with that dependency need: in your Helmfile you can specify multiple charts that need to kick off at the same time. Great idea. Any other questions? Of course — I was kidding. Are there any alternatives to Helm? Yes. Dozer was a good alternative, although that one's kind of fallen out of favor. Some people will do this with Terraform or with Pulumi, and that's definitely possible. Some people actually automate Helm with Terraform or Pulumi, which is really intriguing. And I guess the biggest competitor to Helm is plain Kubernetes YAML. If your YAML is simple enough, you don't need this level of complexity; you don't need a Helm chart just to deploy one Kubernetes resource. Great question. Thank you. Speaking of alternatives — I'll try to make it a question and not a rant — there are many other alternatives, right? Like KPT, and Kustomize, which solves a little bit of a different problem, and Jsonnet and CUE and Dhall and cdk8s and so on. What would you say to the argument that you should not be using Helm, that it is actively harmful? Oh, good question. So there are indeed lots of alternatives, and you mentioned many more than I did. That's awesome. What about the argument that Helm is actively harmful? I agree, in a sense. You're focusing in on just those values that you care about and abstracting out all of the rest. Now, it is definitely a leaky abstraction, because, well, what if the bug is over there? If you don't get how Kubernetes works and you're learning Helm straight away, that's probably not going to help you very much, because as soon as your Helm chart breaks, you're stuck. Learn Kubernetes and how Kubernetes works. When you get tired of that Kubernetes YAML duplication, then pick up Helm or another tool that works in a similar way. But if you just have a very simple YAML file, Helm is definitely overkill and can be harmful. I have a question. What are the options for doing continuous integration with Helm? Like, if you are depending on somebody else's Helm chart and you just want a hands-off approach where, if a new update comes in, you don't have to touch it — just apply it again and keep your cluster up to date. Good idea. So, what are the options for integrating Helm with a DevOps pipeline, a continuous integration pipeline? That is a great question, and we poked at that a little bit here. Let's imagine that I've got this Helm chart that I'm building, and I also want to depend on the latest version of somebody else's Helm chart, and that's kind of all I have here in my DevOps build. So I might have two lines toward the end of my Docker build step. The first one is helm install, or maybe helm upgrade — maybe I want to do a helm list, pipe it to sed, and figure out if it's installed already — and at that point I'm just installing the latest version of their Helm chart. And then the next line is helm template chart, whatever folder name it is, set some environment variables, and pipe that to kubectl apply. And that's a great way, in my DevOps pipeline, to integrate Helm into that process and ensure that once the build has succeeded, that latest version is installed.
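One way to wire that answer into a pipeline; helm upgrade --install collapses his "install or maybe upgrade" check into one idempotent command, and the repo URL, chart names, and the $BUILD_TAG variable are placeholders.

```bash
# install (or upgrade) the chart we depend on, at its latest version
helm repo add their-repo https://example.com/charts
helm repo update
helm upgrade --install their-dependency their-repo/their-chart

# render our own chart with build-context values and push it into the cluster
helm template chart --set imageLabel="$BUILD_TAG" --set registry="" \
  | kubectl apply -f -
```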
If you want to take it a step further, helm ls and grep that for your current version — the version of the Helm chart that you just deployed — to validate that that's the thing that's actually running. Maybe even a kubectl, piped to grep for, you know, Running. Or, even more low-tech — and I both love and hate this — curl the public URL of your application and make sure you get back a 200. If you don't, fail the build. Yeah, that's what a health check does. You're right. Any other questions? All right. Everyone give it up for Rob. This was a lot of fun. Thanks for letting me play. That's just what I was looking for. All right. Next up, at — what is that, 4:30? — we've got Daniel Walsh on Podman's new features. Ooh, that one looks like fun. It does. So see everyone back in the Cloud Native track in about a half hour. Thank you. Testing, testing. Is that good? All right. Welcome to the Cloud Native track at scale. Our next speaker is going to talk about Podman, which is a full-featured container runtime. And he's Dan Walsh, who I would contend is the world's foremost expert on that subject. He's spoken at scale before, and he's a great speaker. So with that said, take it away, Dan. Thank you. Okay, so basically my team built a whole series of container tools, and we're going to be talking about Podman. How many people here have heard of it? Everybody's heard of it. How many have never used Podman before? Okay. So the way I like to start is, I like to get a little anticipation going. So everybody stand up, please. Okay: please read aloud all text that is written in red. Now try to get this together. All right, everybody together. Okay. So this is my — I like to think of myself as Dan Coyote here. You know, this might be a dated slide. How many people don't know who the guy on top is, the Mr. McGill? I figured some people under the age of 40 probably don't know who he is. So there'll be a lot of dated slides in here that we'll have to explain to the young ones in the crowd — and under 40 is now the young ones. So Podman came out a few years ago, and this talk is really an update on what Podman is. I'll explain what Podman is in a minute for those that don't really know. But first of all, I want to talk about the health of the Podman project. Here you can see our GitHub statistics — we just pulled these last week — and there are about 1,500 forks of Podman at this point, and we're approaching 15,000 likes, or stars, on the project. And here's another dated one: for anybody that doesn't know Sally Struthers — they like us, they really like us. Sally Struthers, for those that don't know, won the Academy Award and came running out saying, I can't believe they like me. To give you a little more statistics on the Podman project compared to the Moby project, which is the upstream of Docker: in the last six months, since January 1st, 122 people have contributed to Podman versus 52 contributing to Moby. There have been over 1,000 commits versus a little under 700, and closed issues and merged PRs follow the same pattern — 1,010 and 1,110 — so basically everything is two or three to one in the statistics between them. Podman and Docker are made up of multiple other sub-projects down the line. CRI-O, for example, shares sub-projects with Podman, and CRI-O and containerd are themselves container engines. Podman, by the way, is not a container runtime; it is a container engine. People often make that mistake.
But there are two other container engines, containerd and CRI-O, and their primary function is to run Kubernetes workloads. Podman and Docker really aren't used for that. People have packaged together kind and other tools to run Kubernetes on Docker and Kubernetes on Podman, but for the most part containerd and CRI-O are what's optimized for running Kubernetes. There's a mailing list for Podman; there are about 500 people on it at this point, with a couple of messages a week — really just announcements, like when we have release candidates. Next, you can see the growth in likes of Podman over time. It started in about 2017, and you can see a sort of steady state of likes, and then something happened last September. Anybody know what happened in September that caused that huge spike in people looking for Podman? Docker rate limits? No — close. Docker Desktop being proprietary. Okay, basically Docker closed-sourced, in my opinion — or at least basically said you can't run Docker Desktop if you have more than 100 employees in your company without paying Docker money. So the biggest question online literally became: what is an alternative to Docker? And the answer was Podman. So for those that don't know what Podman is, here's a talk that I gave in the last couple of years. It's called Replacing Docker with Podman — and everybody should have said "Podman," because it's in red, but you're not focused. So this is how you replace Docker with Podman: you do a dnf install to get Podman, and then you say alias docker equals podman. And then I go to questions. Okay. So when we were building Podman, the goal was — we knew that if you Googled how to do something, it's going to give you a Docker command line. So — am I fading in and out? Sorry about that. Am I back? Excellent. Now I'm loud. Okay, so basically we knew that when you looked up how to do things with containers, you would always get a Docker line from Google; it was common knowledge how to run containers. And one of the great things Docker invented — can you turn it down a little bit? I feel like I'm loud — one of the great things they did is they simplified running containers, right? And everybody sort of learned how to run containers if you played with Docker at all. So Podman copied their CLI. Basically, for the first couple of versions of Podman, we implemented the entire Docker CLI. And anytime we get something that says, "I run this Docker command line and it works, and I run the same command line with Podman and it doesn't work," we consider that a bug — okay, with a few exceptions: usually when I think it's a bug in Docker, I don't implement it. But anyways, basically all you have to change is DOCKER to PODMAN. Podman, by the way — I didn't mention — stands for pod manager. Okay? It has nothing to do with sexuality. Okay. I'm glad you guys asked. That was good. So when we replaced the Docker command line, people said, that's great — we wanted an alternative tool for running containers — but there's a whole bunch of tooling that's been built into CI/CD systems, and different tools have been developed over time, that talk to the Docker daemon, to the Docker socket. And, you know, Podman didn't do that originally.
So over the last year — my screen looks fine; okay, did I do something again? I'm going to go an hour over now. Is it the thing that converts it to USB-C, if you want to convert that? I haven't even gotten to the demo part of this; this is just going to be a disaster. Okay, so over the last year we added an API to Podman. We run Podman as a service. Now, I've been ranting and raving for years that you should not have a daemon for running your containers; you don't need a root-running daemon. So I call this a socket-activated service. What happens is, basically, you connect to Podman with the help of systemd — we'll explain this a little later — it can be socket activated. So when the API connects to a Podman socket that's provided by systemd, systemd then activates the Podman service, and the Podman service will create a container on your behalf. And then when it's done — when your container goes away, or when you stop talking to the service — within five seconds the Podman service shuts down. So you don't have to have a constantly running service on the system. Because we've added an API, we also wanted to take advantage of advanced features of Podman. Podman not only manages containers; as its name says, it manages pods. And for those that don't know, pods are one or more containers working in the same environment. It's a concept invented by Kubernetes. So Podman can interact with Kubernetes-type workloads, and actually we believe that that's sort of the future. podman-py is the Python bindings to talk to Podman's systemd-activated service. So you can write Python code to talk to the service — but we also support docker-py. So if you've built any APIs, any CI/CD systems, any type of services using docker-py, you can actually point them at the Podman service. You don't need a Docker daemon running on your system. Matter of fact, you can run docker-compose — full docker-compose support against the Podman service. You don't have to have the Docker daemon running. One of the big advantages of Podman over Docker is that Podman by default runs in rootless mode. So you can run all of your containers without having to be root on the system: a non-privileged user, no special setuid binaries, standard permissions on all Linux distributions. So unlike Docker, which runs a rootful daemon to run containers and has to have multiple daemons up and running in order to run containers, Podman can run as a simple command line and fork/exec a container underneath it, in totally rootless mode. So now you can run compose — docker-compose talking to a rootless, socket-activated Podman service — and run your entire compose suite without ever having to be root on the system. And we just recently got full support — I think as of the 14th of last month — from GitLab. GitLab worked with Docker, and now they've added — not replaced, but added — full support for Podman as well for running services using this API server. Oh, I guess at this point we go into a live demo. What could go wrong? Okay, so what we're going to do right now is just show you that I actually have Docker installed on here, but you'll see the error at the bottom that's basically saying the Docker daemon is not running on the system. So I am running the Docker client — the standard Docker client — against the Docker daemon, which is not running.
And to prove it to you, I run a systemctl — sorry, I shouldn't hit it twice — and you see the red line there showing that the Docker daemon is not active on the system. Now I'm running the Podman command. So I'm running sudo podman --remote version, showing you that the Podman client — which I haven't covered yet in this talk — can talk to the Podman service, and you see that both versions of Podman on the local system are, obviously, the same version. Now I'll show you that the Podman socket is actually activated on the system and it's listening. So this is not a running service; it's systemd listening on the Podman socket. And what I'm doing right now — the bottom line of the screen shows I'm setting an environment variable, DOCKER_HOST, which the Docker client will respect, and instead of pointing at the Docker socket, it's going to point at the Podman socket. So this is the Docker version command — the Docker command — running against the Podman service on the system. You can see that Docker responded with the Podman engine here, and that's all information about Podman. To show you again: on this system I happen to have no containers, but if I had run containers on the system, you would have seen the Podman containers. Again, this is the Docker client talking to the Podman service, which shows that we've implemented pretty much everything the Docker client needs to be able to work against the service. And so now what I'm going to do is run — so this is the Docker client actually going to run a container. Right now it's pulling it down, live, on the system. So I pulled down a UBI micro image from registry.access.redhat.com using the Docker client against the Podman service; Podman pulled it down, and you see inside the running container that the environment variable is set to podman. So that shows you that Docker is running a Podman container on the system. So now what I'm going to do is switch — everything I just ran was rootful; now I'm going to run in rootless mode. Notice I'm not running sudo in front of the commands, and I'm showing you that the Podman service is up and running on the system. Now I'm going to run the Docker client again against the — this just ran. So now I ran the Docker client against a rootless Podman service: I ran a rootless container using the Docker client against Podman. And that should show you a dash-a there. So now what I'm going to do is actually run docker-compose. I just showed you that there are no containers running on my system at this point. So I run a docker-compose script — and it fired off a container. So docker-compose, running against the Podman service, fired off a container in Podman. And now I'm going to do docker-compose down and wait 10 seconds, because the application I'm running inside of it doesn't receive the signal properly, and that shuts down the service — and I'll jump to the next demonstration. But trust me, there are no containers running on the system at this point. Okay, so I showed you podman remote. So Podman, at this point, can talk to the Podman service, and as we showed in the demonstration, the Docker client talks to the Podman service and launches containers.
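For reference, the rootless half of that demo roughly corresponds to these commands; the podman.socket unit and the /run/user/$UID/... path are the standard ones Podman ships, and the compose file is whatever project you point it at.

```bash
systemctl --user enable --now podman.socket   # systemd listens; no daemon runs until needed
podman --remote version                       # Podman client over the socket

export DOCKER_HOST=unix:///run/user/$UID/podman/podman.sock
docker version                                # stock Docker client, answered by Podman
docker run --rm registry.access.redhat.com/ubi8/ubi-minimal printenv container
                                              # prints "podman" (ubi-minimal used here so printenv exists)
docker-compose up -d                          # compose against the Podman socket
docker-compose down
```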
You can set up Podman to work across SSH connections. One of the things I didn't like about Docker is that Docker has the ability to listen on TCP sockets, and people use that to set up TCP connections to it. Podman can do the same thing, but I really don't like that. What I prefer everybody do is use SSH connections: you talk to the Podman service through an SSH tunnel. So docker-py, docker-compose — you can talk over SSH connections, and Podman has the ability to create these SSH connections with a command. One of the reasons I'm talking about this is, well, Podman can create local containers, so why would I want Podman to talk remotely to a Podman service? It might be interesting for that, but here's what I want to highlight to everybody in this room. Does everybody understand that containers are Linux, and Linux is containers? All right, there are some Windows containers, but for the most part, when people say containers, they're talking about Linux containers. Almost everything on docker.io and quay.io and Artifactory — they're Linux containers. So when I'm sitting on a Mac or a Windows box running containers, I'm running Linux containers, and in order to run Linux containers, I have to have a Linux kernel running. So on a Mac and on Windows, what's happening is there's a virtual machine running, and Podman or Docker is talking to it — because it's a client-server operation. The Docker client is talking to the Docker daemon running inside of a VM when you're running Docker Desktop. Similarly, we've had a project called podman machine. podman machine will set up and pull down a virtual machine onto your box — your Mac or Windows box. On a Mac, it's pulling down Fedora CoreOS and running Podman inside of it, using commands similar to what we just showed with podman remote. It can run in both rootless and rootful mode. So when you start up your service, you can say which socket you want to use; you can actually switch back and forth between rootless and rootful mode on a Mac and on a Windows box. So on a Mac, if you want to get Podman working, you do basically these three things: brew install podman, then podman machine init, which will pull down the virtual machine containing Fedora CoreOS and launch it, and then you just run your container like you normally would. The podman that we ship on a Mac — the one that brew is installing — is actually podman-remote. It's a remote version hard-coded to always talk to a remote socket on the system, and the podman machine command will set up the connections over SSH to the virtual machine running on your system. There is a native installer being developed right now, and it should be out this fall, so you'll be able to just go to a website, double-click on Podman on your Mac, and it'll kick off an install onto your machine so you don't have to use brew. Podman on Windows takes advantage of WSL 2.
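Before the Windows side, the Mac workflow he just described boils down to roughly this; I've added podman machine start, since on some versions init alone doesn't boot the VM, and the image in the last line is just an example.

```bash
brew install podman
podman machine init      # pulls down the Fedora CoreOS VM image
podman machine start     # boots the VM that the (remote) podman client talks to over SSH
podman run --rm registry.access.redhat.com/ubi8/ubi-minimal echo "hello from the VM"
```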
So Windows developed a really nice integrated system for running virtual machines, basically running Linux kernels on Windows machines, and it's very simple to install Podman there. In fact, there is a Windows installer: if you go to podman.io and look for the latest release, you'll see a Windows installer; click on it and it will set up and install WSL 2 on your Windows machine and then start running Podman locally. There's also a very large effort going on right now in the upstream project to create a GUI for Podman. It hasn't had a formal release yet, but there are daily builds of what's called Podman Desktop. This is going to be fully open and fully free, free as in liberty and free as in beer, so there will never be any charge for this tool to run on as many Macs as you want, because we believe in fully open source. It's available, or going to be available, on Windows, Mac, and Linux. I'd prefer you all use Linux with Fedora on it, but most people don't listen to me, so Podman loves people who like, you know, machines with the fruit on the back, or Windows, or Linux. Podman really does love Docker Compose; a lot of people in the world want to run Docker Compose and have lots of compose files. But we really, really love Kubernetes. We believe that Kubernetes won the orchestration battle, and I believe that going forward, if you're going to write a YAML file that defines how you want to run multiple containers in an application, or multiple containers in your environment on your local machine, you should do that with Kubernetes YAML files as opposed to Docker Compose YAML files. That way you can run the same thing everywhere. I envision at some point people running in their, hopefully OpenShift, but Kubernetes or OpenShift environments; you build a Kubernetes YAML file that defines how your application runs under Kubernetes, you run it through the entire test suite in that environment, you wire up your CI/CD system, and you have massive cloud power. And now you decide you want to run it on a single node. Well, you might not want to install all of Kubernetes to run that application on a single node; think edge devices. So why not use Podman to take the Kubernetes YAML and run it locally on the system? It's not an alternative to Kubernetes, but it understands the Kubernetes YAML file in order to launch containers. podman play kube is the way we do that. podman play kube and podman generate kube are two commands Podman supports. If you build your containers the traditional way, podman run, podman create, you can do podman generate kube and it will take the containers you created and translate them into a Kubernetes YAML file. So if you only know how to build containers using a Podman or Docker workflow, we can generate the YAML for you. Similarly, if you have a Kubernetes YAML file, you can just do podman play kube; actually there's a new command coming, it's going to be podman kube play, and they'll be aliases of each other. It runs that environment locally, which makes it easier to migrate to and from Kubernetes with those two commands I just talked about.
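Here is a minimal sketch of that round trip; the container name and file name are just illustrative:

    # build something the "traditional" way
    podman create --name web -p 8080:80 docker.io/library/nginx

    # translate it into Kubernetes YAML
    podman generate kube web > web.yaml

    # ...and replay that YAML on a single node or edge device
    podman play kube web.yaml        # newer releases also accept: podman kube play web.yaml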
But we also understood that there are certain commands that Docker Compose sort of invented that people really, really like, such as being able to build an image from a Dockerfile or Containerfile. You can put a Containerfile into a directory, say docker-compose up, and it will read the Containerfile and create an image for you without having to pull it from a container registry. podman play kube now has build support, so you can do the same thing with Kubernetes YAML files. Not only that, but we have the down flag: in Docker Compose, as I showed in the demo, you can do up and down to bring up and tear down a whole series of containers. Docker Compose only runs containers, but with a Kubernetes YAML file you can bring up multiple pods and multiple containers in your local environment and have them come up and down with a simple Podman command. Now, several years ago, I think this was at the first DockerCon, Docker had one of their engineers, I won't name the person, who was proud to wear a badge that said they rejected all systemd PRs. Most of those systemd PRs were coming from me. But I'm not spiteful like that, so Podman is proud to say that we fully support and embrace the power of systemd. systemd is the default init system for almost all Linux systems; I know there are a few out there that still use something else, but for the vast majority of Linux systems systemd is the init system you use, so containers should support it. Podman treats systemd as a first-class partner. systemd as PID 1 in a container is supported by default: if you want to run a complex service inside a container using standard systemd unit files, you just do it. When Podman sees /sbin/init or systemd as the command, it sets the container up in systemd mode, the mode systemd expects, by default, and runs the container. You can also generate systemd unit files directly from pods and containers. Remember I talked about podman generate kube generating Kubernetes YAML from your running containers; Podman also knows how to generate best-practices unit files. So if you design an application running on your system and you want a unit file you could take to a thousand different machines, you can generate it with podman generate systemd and then put that unit file on a thousand machines. When you start the service, it runs Podman on that machine to pull down the image from the container registry and launch the container, so it's fully integrated. Podman also supports socket activation, which is something you could never do with Docker, because Docker is a client-server operation. With socket activation, systemd holds the socket, and when it receives a connection it hands that open socket to the process running inside the unit file. So the Docker client would get that server socket, but the Docker client is not the thing that launches the container, and it has no way to hand the socket over to the Docker daemon. Since Podman is a fork/exec model, Podman can take that open socket and pass it all the way down to the running container, so you can use socket activation fully with Podman, where you could never do it with Docker.
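As I understand the flags, the build and teardown support looks roughly like this; check podman-play-kube(1) on your version, since the exact spelling has shifted between releases:

    # replay a Kubernetes YAML file, building images from Containerfiles
    # that sit in directories next to it
    podman play kube --build my-app.yaml

    # tear the same pods and containers back down
    podman play kube --down my-app.yaml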
SD_NOTIFY is a concept in systemd that says: if you have a service that takes ten seconds to start, and you have other services that rely on it, say a database that takes a while to start up, systemd defines a protocol where the actual service, your database server, sends a signal to systemd saying "I'm up and ready to receive connections." That doesn't work when you run it inside a Docker container; if you run it inside a Podman container, it's fully supported. Lastly, and probably most important, Podman has the concept of auto-update built in. Imagine you're on an edge device, running a service that has to run for years and years, just chugging away, and all of a sudden you have a CVE or some kind of vulnerability. Wouldn't it be nice if you could just fix your container image, push it to a container registry, and you have 10,000 of these boxes all running Podman inside a systemd unit file? What podman auto-update does is fire up at some point in the day using systemd timers, go out to the registry, see that there's a new version of the image the container is based on, pull that image down, and recreate the container on top of the new image. So you get updates to 10,000 machines, not instantaneously, but over time, with Podman doing the work. Podman also supports health checks, so a health check will fire and make sure the updated containerized service is running properly. If it is not, if the health check fails, Podman rolls back to the previously running container on the system, and you can later fix your service, figure out what went wrong, and update it again. So with podman auto-update you start to have a service that can be self-updating in your environment, with rollback support. So let's do another live demo. Let's create a systemd service to run a container image. What I'm doing here, I just ran the Red Hat ubi8-init image, which runs systemd inside a container, and here you see systemd running on RHEL 8.6 on my Fedora box. There's nothing special in that command; all I did was run podman run ubi8-init. Podman saw that systemd was the init and ran the container on top of it. Now, this is when I should put my glasses on, because I'm an old man... oh, you can't see it there. What you see here is I just did podman stop -l. This is a kind of hacky thing we added to Podman. Anybody who has ever played with Docker knows that if you want to stop a Docker container, you have to figure out what the ID is or what name you gave your container. Being lazy engineers, we added -l, which basically says stop the last container I started on the system. Originally I thought that was kind of hacky and stupid, but we use it all the time, so we love hacks. You can see that the service stopped running on the system; that was systemd running inside a container.
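A sketch of that systemd-in-a-container demo; the image path is the Red Hat UBI 8 init image as published in their catalog:

    # Podman sees the systemd/init entrypoint and sets the container up
    # the way systemd expects (/run, /sys/fs/cgroup, signal handling)
    podman run -d --name init-demo registry.access.redhat.com/ubi8/ubi8-init

    # lazy-engineer shortcut: act on the last container created
    podman stop --latest        # same as: podman stop -l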
Now I'm going to do podman generate systemd. These are all the fields you can add if you want additional fields in the unit file we're going to generate from running containers. So I'm going to run a container on my system with the top command, basically just running top, and then I'm generating a unit file. This is the unit file that I just generated, and we have studied and worked heavily with the systemd team on the best practices for running Podman in a systemd unit file, so we would heavily recommend you use this generated unit file; it's based on this service we called the top service. Now, I created a new unit file, so I have to do a daemon-reload to tell systemd that the new unit file is available. You see on the system that there are no services running and no containers running, and now I start that service, kick it off, and you see a container running. So that's full integration with systemd for launching services, and if I want to stop the service, I do that too. And by the way, this is all happening in rootless mode, so there's no root running systemctl and you don't need any special container. You can see I stopped the container; it's that simple to run Podman as a service on your system. All right, I mentioned a couple of times that I've been leading the container team at Red Hat for probably about ten years now, and I recently switched over to, I wish it were called Podman on the Edge, but it's the RHEL for Edge team at Red Hat. We're working on edge devices, basically small computers to run at the edge, small computers to run inside automobiles, and I'm working to get containerized applications into those small working environments as we move to edge devices. I believe Podman is the perfect tool for running containerized applications on edge devices. I kind of jumped the gun on this earlier: you can define your containerized application inside Kubernetes YAML files, run it inside OpenShift or Kubernetes, and then deploy it on RHEL CoreOS for Edge, which doesn't quite exist yet; this is what I'm pushing for, and Red Hat is probably not thrilled I'm putting this slide up. Once we have that environment, it's a combination of Kubernetes YAML files and systemd unit files. We now have, and this blog talks about it, the ability to take a Kubernetes YAML file and, with a simple systemd command, tell systemd to run it. Underneath the covers, systemd launches Podman to interpret it, so you don't even have to generate the unit file; you can just hand systemd the YAML file, and it will run Podman under the covers to run your services. One of the beauties of running Podman at the edge versus running Docker is that when you run Docker, you always have at least two daemons running: the Docker daemon and a containerd daemon. When you run Podman, Podman is only running while it's setting the containers up; if the container runs in detach mode, which most service containers do, Podman exits from the system. So Podman isn't using up any system resources while your application on the edge device is running; the only time Podman fires up is if, say, systemd launches it, or a human being launches it to go change the way the environment is running. Podman gives you no overhead when you're running at the edge: you let systemd manage the lifecycle of the application, and as we talked about, auto-updates, rollbacks, health checks, all of that is powered by systemd.
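Putting the unit-file generation and the auto-update pieces together, a rootless sketch might look like this; the container name is made up, and the label and timer usage follow the podman-auto-update documentation as I understand it:

    # run a service and label it for registry-based auto-updates
    podman run -d --name top-demo \
        --label "io.containers.autoupdate=registry" \
        registry.access.redhat.com/ubi8/ubi8 top

    # emit a best-practices unit file and install it as a user service
    podman generate systemd --new --files --name top-demo
    mkdir -p ~/.config/systemd/user
    mv container-top-demo.service ~/.config/systemd/user/
    systemctl --user daemon-reload
    systemctl --user start container-top-demo.service

    # let the shipped timer poll the registry and roll containers forward
    systemctl --user enable --now podman-auto-update.timer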
One of the other things we look at with Podman, and these are things I constantly push for, since I'm known as a security guy, I did SELinux for many, many years, is security: I'm always pushing to make Podman the most secure way of running containers. Podman supports multiple registries. One of the things Docker did was hard-code docker.io as basically the only true place to get container images from, but we wanted full support for you to choose where you get your container images, and you can specify multiple registries for your containers. That potentially opens up a risk when you use what we call a short name: anybody that does docker pull alpine expects that it comes from docker.io, but if you do podman pull ubi8, we would want you to get it from Red Hat. So in command-line mode we will prompt you, across all the registries you have registered, when you use a short name. But we also have the ability to pre-configure these: at github.com/containers/shortnames we have the ability to register short names for the environment, so that if you use a short name, say ubi8, there is a table that says this short name maps exactly to that long name. That means you, as a user of the tool, can also drop in your own short-name-to-long-name translation table and guarantee that you always get the container image you expect from the appropriate registry. The shortnames repository on GitHub only allows names that we know are correct, so nobody can grab a random MariaDB and claim that it always comes from some one site: CentOS you get from the CentOS registry, Alpine you get from the Docker registry, and the people accepting contributions to that repository are always watching for that. Another thing on security: many years ago, when Docker was first starting up, they added the concept of Linux capabilities. Linux capabilities allow you to run root with fewer privileges on your system, and there were 14 capabilities chosen to be allowed by default as Docker was developed. Three of those capabilities I think were really stupid to allow in Docker, but it's hard to change because Docker has been running this way for ten years. The three capabilities we drop in Podman versus Docker automatically are: AUDIT_WRITE, which by default allows a container process to write to the auditing subsystem; we think the auditing subsystem should be protected from containers and not allowed to be modified. NET_RAW allows you to create arbitrary network packets on the system; the only reason it's allowed by default is so you can ping, but Linux has allowed ping without requiring NET_RAW for many, many years, so Podman supports ping but does not allow NET_RAW, and NET_RAW has been shown to enable vulnerabilities that let people escape the network namespace and do bad, evil things.
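On disk, a short-name alias is just a TOML table entry; the path below is the convention the Fedora shortnames package uses, so treat it as an assumption on other distributions:

    # /etc/containers/registries.conf.d/000-shortnames.conf
    [aliases]
      "ubi8"   = "registry.access.redhat.com/ubi8"
      "fedora" = "registry.fedoraproject.org/fedora"

And if a workload genuinely needs one of those dropped capabilities back, you can add it explicitly, for example podman run --cap-add=NET_RAW, rather than running the container privileged.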
The last one is really strange to have allowed by default, and that's MKNOD. Device nodes are one way you communicate with the kernel. Rootless users have never been allowed to use mknod, and almost no one in the world ever creates device nodes inside a container; in a Podman container we create the device nodes you are likely to need. So why is CAP_MKNOD allowed by default in Docker containers? I don't know, but I think it's because docker build at one point built an application that tried to create a device node, it blew up, someone said "hey, I need CAP_MKNOD," someone said "okay, you got it," and that was the end of it. That was 13 years ago, and we've been running Podman for five or six years now without that access, and no one has really noticed. Podman also has the ability to mask over lots of the kernel filesystems. By default, a lot of /proc and /sys is mounted into your containers, but there is some information that could leak through, and the way Docker and Podman get around this is by mounting either /dev/null or a tmpfs on top of those paths. You cannot unmount them, so you can't see what's underneath. But Podman also allows you to unmask them: if you need one specific path unmasked, you can do just that one, whereas on Docker you have to run the container privileged. A lot of people, the way they learn to run containers is: I run a container, I get permission denied, okay, I'll run it privileged; or I run a command rootless, oh, I got permission denied, I'll run it as root. That's basically what happens. We want to help people figure out how to lessen the security just enough to get their application to run, without turning security off altogether. You can also decide there are certain things we don't currently mask that you want to prevent your containers from seeing, so we allow you to add additional masks. The last thing I'm going to talk about right now is user namespaces. One of the cool things Podman does is run in a user namespace; that's how we do rootless containers. On your Linux boxes right now you'll see the files /etc/subuid and /etc/subgid. The useradd command by default allocates a range of UIDs for each user on the system, and Podman uses those UIDs to map you into a user namespace when you're running a container. That means your container can see a range of UIDs from UID 0 up to around UID 65,000, mapped onto the range defined in /etc/subuid, and the kernel does the actual substitution. So inside the container you'll see your process running as UID 0; outside the container, that same process is probably running as your user ID on the system. That's how Podman takes advantage of user namespaces. But Podman can also run rootful and rootless containers in different user namespaces. Imagine you're running ten containers on your system, and none of them is running as root, even if they need root inside the container, and each one runs in a different user namespace, each allocated, say, 10,000 UIDs, all running independently. podman --userns=auto figures out the minimum number of UIDs your container needs and launches each container in its own user namespace.
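Concretely, the mapping comes from those files, and --userns=auto does the per-container allocation; the user name and ranges shown are placeholders:

    $ grep dwalsh /etc/subuid /etc/subgid
    /etc/subuid:dwalsh:100000:65536
    /etc/subgid:dwalsh:100000:65536

    # rootful, but each container gets its own automatically sized user namespace
    sudo podman run -d --userns=auto registry.access.redhat.com/ubi8/ubi8 sleep 1000
    sudo podman top --latest huser user    # host UID vs. UID inside the container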
So in the future, I believe that if you're running an edge device or a service that's just running Podman containers, you should be using --userns=auto, even if you're running rootful containers, and it will run them all in different user namespaces. The reason I don't recommend it all the time for rootless mode is that we only allocate about 65,000 UIDs per user, so if you're running multiple containers in rootless mode you'd need to allocate a whole bunch more UIDs. But anyway, it's a really cool feature. User namespaces have been around a long time; Docker has never supported them fully. Docker supports one user namespace but doesn't let you run your containers in different user namespaces. Kubernetes and CRI-O have just recently started to get user namespace capabilities, so going forward Kubernetes will let you run containers each in a different user namespace, which will be very powerful from a security point of view. A couple of last commands, capabilities Podman has added over the last year. podman container checkpoint and restore let you take a container, using CRIU and the kernel support behind it, and basically suspend it, and then you can restore it later. When people think about checkpoint and restore, it's usually "I'm going to suspend it on one machine and copy it over to another machine," and you can do that with this. But I think a far more interesting use is taking your container down before a reboot: say you have a container that takes ten minutes to start, a huge Oracle database; you can suspend it, reboot the machine, do an upgrade and everything else, and when the machine comes back, restore it. So you can use suspend and resume for individual containers on the system. Another tool we've added is podman image scp, which lets you copy container images between systems without going through a container registry. If I create a container image on my machine right now and I want to run it on a different machine, I normally have to push it to a container registry, then get to the other machine and pull it from the registry into the Podman running there. Podman can circumvent that: it connects to a Podman system, over that socket-activated server or SSH connection, and pushes the image directly. You can also push from rootless mode to rootful: say you create an image and then realize, oh shoot, I meant to build that as root; you can run podman image scp and copy it over to root on your own machine, in which case it copies it over to sudo podman. Podman secret support has been added, and it works with Kubernetes secrets, so we can put secrets into containers but not have them saved to the image; that's full secret support. And the last interesting one we've added is podman image mount. With Podman you can mount containers on your system, so you can examine a container without it running, and you can examine images without running a container at all. If you want to run scanning tools, usually those tools have to pull the image down, you'd use Docker to pull the container, and then, in order to get the container mounted up, they basically have to run the container and then enter it.
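Two quick sketches of those commands; checkpoint/restore assumes CRIU is installed and a rootful container, and the scp destinations are placeholders for users or machines you already have SSH access to:

    # freeze a slow-starting service before a reboot, thaw it afterwards
    sudo podman container checkpoint db
    sudo podman container restore db

    # copy an image to root on this machine, or to another box, with no registry
    podman image scp registry.access.redhat.com/ubi8/ubi8 root@localhost::
    podman image scp ubi8 core@edge-box.example.com::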
Well, we think that running containers or processes on top of an image puts the image at risk, so we just let you mount up the image, get a mount point, and then run tools on the system to examine what's in the image. So, here is my grandson Everett, and he's asking "what's next, Granddad?" He doesn't call me that yet, but I love this picture, so I have to stick it in my presentations. The Podman future. One of the things we're looking at right now: remember I said that Podman is not a runtime, it is an engine. The difference is that engines are things that go to container registries, pull down images, set up storage, and then generate an OCI runtime specification, and at that point Podman, Docker, containerd, or CRI-O will launch a runtime. Runtimes are things like crun, runc, Kata, gVisor: little tools that read an OCI runtime spec and translate it into something in the kernel that actually runs the container. Podman by default is moving away from runc to the tool called crun. runc was written in Go, and almost all containers in the world run with runc at this point, but crun is a rewrite of that Go code in C. It's much more efficient, much better for running under Kubernetes, and also, because it uses standard C libraries, we can easily link additional libraries into it. So we've added a Wasm engine: we're going to have crun-wasm, we're going to have crun-krun, and we'll be able to run different workloads under different types of OCI runtimes. The Wasm one pulls the entire Wasm stack into the container engine, so you don't have to embed your Wasm runtime inside your image; you can directly run your Wasm-based applications on the system. Anybody that went to Nathaniel's talk, it was all about confidential computing, and we're working heavily to get confidential computing into Podman, so that you can run containerized virtual machines under OCI runtimes. These are really lightweight virtual machines, basically just using KVM to run your container on your system, but imagine you could run that in a confidential environment, which means that root on your host cannot see what's going on inside the container. Nathaniel's talk covered confidential computing in depth, but the real goal here is that you can run an encrypted workload. Going back to edge devices: imagine an edge device sitting at a windmill or on a train track somewhere, and some random hacker comes up, rips the thing off, gets root on the system, and then looks at what applications are running. Those applications are running in encrypted environments that even the person ripping apart the machine cannot see. Or imagine you're sitting in your car with a computerized system, and hackers break into the car's operating system and get root; if you're running confidential computing, they wouldn't be able to look at those workloads. So I think I just covered all of this. And finally, a shameless plug, because I've written a book. Right now you can get it in early access from Manning Publications; the name of the book is Podman in Action. If you just google "Podman in Action" it'll come right up, and you can buy it and put about four dollars in my pocket, which will be very much appreciated.
So at this point I'll open it up to questions, if I have time. Oh, I didn't run out of time; I usually like to run out of time. Any questions? Yes, let me get to you with the mic so everybody can hear. "Does kube play support Kustomize?" Does it support Kustomize... this is when I reveal that I really don't know Kubernetes all that well, so I have no idea what Kustomize is. What I would do is open up an issue with Podman and say it should support the -k flag. That's what we're doing: we're looking for people who want to add Kubernetes features to Podman. One thing I've been told is I can't make Podman compete against OpenShift, so we will not make Podman into a Kubernetes competitor. "What is the status of GPU acceleration in Podman? Because that's the one feature in Docker that prevented me from fully moving to Podman." GPU acceleration, like an NVIDIA GPU, for inference? It should fully work, but I don't know for sure. The problem with GPUs is that the world of GPUs is probably the most hacked-up universe there is. What's probably happening is you're comparing Podman rootless mode to Docker rootful mode, and I will guarantee you that NVIDIA, or whoever built the GPU tooling, did not take user namespaces into account. I think if you ran Podman in rootful mode it would work perfectly. Why don't we talk afterwards, because I could spend fifteen minutes on the insanity of GPUs and the way the vendors handle them. Let's move on; other questions? Yes, shout it out. "Does podman machine also install a hypervisor, or do you expect a hypervisor to be there?" Right now Podman uses QEMU, so when you do brew install podman, Podman requires QEMU and QEMU comes down with it. Apple is being very hostile to using QEMU; in fact, on the M1 they've added really nice support in their native hypervisor to run x86 applications, so we think Apple is going to force Podman to move toward supporting their native hypervisor. So right now, yes, it uses QEMU; in the future I'd say it's very likely we will also support Apple's hypervisor, just because that's the way Apple is; they're not very friendly to open source. Any other questions? It keeps making them run back and forth across the room. "Any plans on supporting other init systems?" Other init systems: Podman will work on any init system. Whether it will automatically generate unit files the way another init program wants to be run is a different question. It's a fully open source project, so if you had other kinds of generators you wanted from containers and you were nice enough to open a pull request, we would not block it. But right now we've only done the generator for the init system that pretty much everybody uses. Any other questions? Same side. "Does Podman still support inheriting CPU weight and memory limits from a systemd unit?" Yes: if you've set up constraints in your unit file, Podman will run under those constraints. Podman has no special powers, so if the unit file says Podman has to run underneath this cgroup, and you're using cgroups v2, then Podman will use a subset of it. If you put it into a cgroup that can only use CPU 2 on the system, then when Podman runs, it cannot put the container on CPU 1; it would be locked onto CPU 2.
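To make that concrete, here's a deliberately simplified system unit, not the full template podman generate systemd produces, assuming cgroups v2; the unit name and limits are made up:

    # /etc/systemd/system/pinned-top.service (illustrative)
    [Unit]
    Description=Podman container constrained by its unit's cgroup

    [Service]
    MemoryMax=256M
    AllowedCPUs=2
    ExecStart=/usr/bin/podman run --rm --cgroups=split --name pinned-top \
        registry.access.redhat.com/ubi8/ubi8 top

    [Install]
    WantedBy=multi-user.target

Because Podman here is just a fork/exec child of the unit (and --cgroups=split keeps the container inside the unit's own cgroup), the container inherits the memory cap and the CPU pinning from the unit file.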
So yeah, we will follow all the rules the unit file tells us, and that's just because Podman doesn't have special powers; it's not magical. Last question. Oh, by the way, I do have Podman stickers if anybody wants some afterwards. "If you don't mind, since you have extra time, I had three questions. The first one: when you use kube play, I'm wondering what the support is for persistent volumes." So it will translate: if you have a volume defined in your Kubernetes YAML file, it will create a local volume based on that name, so it will automatically create a persistent volume on the system. But it's a local volume; it's not going to go out to volume drivers. Well, docker volume and podman volume do support volume drivers, so you can create other types of volumes, but by default all we create is a local one. "Another quickie: you mentioned Windows and Mac support; how about BSD, since this is SCALE?" It's funny you should ask. Again, since this is a fully open source project, over the last month we've had BSD people coming in and opening all sorts of pull requests for us to support Podman on BSD using jails. BSD machines don't have the concept of a container, but they do have the concept of jails, so Podman will be able to pull down an OCI container image for BSD and run it there. I don't think he's completed the effort yet, but it's pretty close; he's got it pretty much into all of the underlying libraries. "And the final one from me: I was curious about the use case for podman image scp. Is that related to edge in some way, or was it something else?" See, the number one goal of engineers is to be lazy, and we hated pushing to a registry; it just came up all the time. People would do something in rootful mode and then in rootless mode, and they wanted to copy the image from rootful to rootless without having to do a podman build again. The other use case is that it works for edge devices: you can imagine scp-ing images while plugging this all into, say, Ansible scripts, where you pre-load and pre-install all the machines out there with images ready to run. It was just a handy tool, and since we have SSH connections, it made sense to use them; that's why it's called scp, because it's using SSH under the covers. Okay, anybody else? I'll be around; I'll probably go downstairs, but I can sit up here for about ten minutes if anybody wants to walk up and ask me something. And I do not have the lovely Podman shirts, sorry; they're not available online. But okay, let's thank Dan, that was a great presentation. We're going to take a 30-minute break on the container track, but since we got into edge, the next speaker is actually going to be talking about edge in relation to Kubernetes, so you might want to stick around for that. "You want me to speak a bit? This works? Yes, thank you." I've got six o'clock, so let's get started. Oh, I see, I'll let somebody get seated here. Welcome to the last talk of Saturday in the cloud native track at SCALE here in Los Angeles. I've known our speaker today, Frédéric Desbiens; he comes all the way to talk to us from Ottawa, Canada. I've personally seen a lot of his presentations and been in numerous meetings with him.
He's associated with the Eclipse Foundation as a shepherd of a number of edge-related projects there, and he's going to talk to us today about whether Kubernetes might make sense for you in edge applications. So with that said, I'll let you take it away, Frédéric. All right, thank you, Steve, for the introduction; that's certainly the best hundred bucks I ever spent. So, I'm the only obstacle between you and the party tonight, game night, so thank you so much for being here; we'll try to make this entertaining. First, let's start with a show of hands: who in this room hates the Eclipse IDE? Okay, great. Who in this room knew that the Eclipse Foundation is much more than the Eclipse IDE? Okay, excellent. Well, if you didn't know, now you know, so now it's your job to let others know. We have been very bad at marketing, and many people conflate the Eclipse Foundation with development tools, but we do much more than that. In fact, last year we celebrated the 10th anniversary of our IoT working group, and in 2019 we started the Edge Native working group and started getting more serious about the edge. My job, as the program manager for IoT and edge computing, is really to keep an eye on the 50-plus open source projects we have in that space, IoT and edge combined, and those involve over 400 committers at this point, so this is no small community, and it's a privilege to be the voice of that community in front of you today. Our presentation today is really asking a fundamental question, and since this is game night, I'm being very kind to you: the very first slide gives you the unconditional, uncontested answer to that question. So if there's one thing you should remember, it's on the next slide, and then you can go, no problem. So, should you bring Kubernetes on this edge road trip? And the answer, unconditionally, I'm sure you will agree, is: maybe. The rest of the presentation explains why. Really, when you think about it, building an edge computing project is literally like a road trip, and if you've had kids, or maybe an annoying significant other or ex-significant other (no, I love my wife; honey, if you're watching this, I love you), being stuck in a car for hours with somebody asking "are we there yet?" or giving conflicting navigation advice is very painful. So if you're building IoT or edge infrastructure and you make the wrong choice, your experience on that project will be as bad as a bad road trip with that little lady on the slide who's not in the mood. My goal today is to give you rule-of-thumb advice, coming not from my personal experience, unfortunately, because I don't have the time to be in the field building stuff, but from what I've heard from our community members and their partners and their customers. Believe me when I report that, but I'm more than happy to put you in touch with them directly if you don't believe me. The structure of the presentation is simple. First, we'll take a step back and take the time to agree on what edge computing is and how it is different from cloud computing; I think that's really important. Then we'll have a look at the kind of workloads you run at the edge, and this is really critical, because a container is a container is a container... yeah, maybe.
But whatever you put in it, or what you want to run outside of it, might be important, so we'll take the time to really look at the edge landscape. Then we'll dig into options to run Kubernetes at the edge: you can do it, sometimes it's a good option, sometimes not, so we'll look at what's available, not exhaustively, I won't list every option, but a high-level look at possible ways to achieve that. And finally, we've been working on a number of open source projects at the Eclipse Foundation, so I'll tell you about EdgeOps, the way we see edge computing, and how those projects, very briefly, fit into that, and whether you should consider them with Kubernetes or instead of Kubernetes in some cases. So, we talk about edge, but edge is really something that lives in parallel with the cloud. When we talk about cloud computing, this is an environment where you have on-demand availability of resources; if we try to give the simplest definition of it, you demand and you get resources. Those resources are typically homogeneous; of course you have different instance types, but when you request a c4 or m2 instance you know what you're getting, and yes, you can configure it, but it's fairly easy to have thousands of VMs with all the same properties. The thing is, this is a large-scale environment, so yes, you can request thousands of those, but at the same time we forget that it is not limitless. I have a friend in Germany (hello Hans, if you're watching; I won't name him or his employer), but the guy literally exhausted a specific instance type in Europe, in a specific geographical area, so you couldn't create even one more of that type, because he had tens of thousands of that instance type running. Yes, that's crazy; it can happen. So that's large scale, but not undefined or unlimited. And then, of course, this is centralized, in the sense that you connect to this handy-dandy console; yes, you have availability regions and servers all over the globe, but you manage it in a centralized way, and that's the whole point: you don't want to connect to the APAC console or the US console separately, you want centralized management, and that's really important. The edge is the complete opposite of all of that. It's completely distributed; the whole point is that you are outside the cloud, outside the corporate data center, and you're trying to deploy something there, so it's a whole different environment. And you'll tell me, "But Fred, you can build a field data center and put servers in there." Yeah, that could be mightily expensive, so sometimes you'll deal with whoever is doing that for you. But when we talk about the edge, this is about putting a little edge device at the bottom of every sequoia in California to make sure they're okay; that's the type of project we're talking about. I exaggerate a bit, maybe you could consolidate and have one box for ten trees, whatever, but the point is you are in the wild, literally, which means you cannot go there easily, which means it wouldn't be economically feasible to go there frequently, and there are many constraints there: temperature, humidity, forest fires, whatever. So it's a whole different ballgame. Yes, you can build your own little field data center, but it wouldn't be economical to put one throughout the mountains of California, for example. And we're operating at a small scale.
Not in the sense that you cannot have tens of thousands of edge nodes, yes you can, but each of those nodes has limited resources and limited elasticity. If I'm deploying little boxes in the wild, maybe I don't have the budget for terabytes of memory and petabytes of disk and unlimited, or at least large-scale, resources like in the cloud. So you need to think small, so to speak, when you're at the edge, even if you may have powerful processors and relatively lots of memory. Why? Because you don't want to go back five years from now to replace the boxes because you deployed so much stuff on them that they don't deliver performance anymore. You try to optimize early for size and a smaller footprint. Then there's the fact that it's heterogeneous. You're doing things involving the real world, more often than not a real-time, mission-critical type of application, so you want hardware that delivers the best performance possible at the smallest cost possible, typically with lower power consumption, which means you may not end up with a uniform environment. You'll have those little Arm or RISC-V processors that you leverage to get the best bang for the buck, you'll use AI accelerators for specific types of applications, and all of that means, if you work with containers, you'll have to tailor container images for all of those possible combinations. That can be painful, believe me, but it's something you need to take into account. Ultimately, if you want to agree on a kind of generic definition for the edge, we could say that it provides compute, storage, and networking capabilities closer to the edge of the network, and optionally, that's the second half, you maintain some level of elasticity and a consumption-based pricing model if you're dealing with someone who is an edge compute resource provider for you. Of course, if you deploy it yourself, maybe you maintain some form of internal accounting, that kind of stuff, and then this still applies, but that part is optional; the core is that you bring all of those resources closer to the action. Now, when we say we run in the cloud, typically what we're running are cloud native applications. When we say cloud native, this is about a few things, and maybe you could add a few points, but generally speaking we're talking about working in a DevOps fashion, using agile methodologies, microservices, and containers. And that's my Pivotal side showing there: repair, repave, rotate. The whole point is that those containers are expendable; they're stateless, and you want to scale by taking advantage of that, with a certain type of application. Well, edge native is a bit different, as we will see. When you do edge computing, there are a number of challenges you're trying to address, and this slide synthesizes them well. Why would you put devices in the field rather than going directly to the cloud? I mean, you can do that, but there are many situations where it's not wise, and you see them on this slide. First is the problem of latency. You heard about the Pig War? No? 1859, if I remember well. Essentially, when they established the border in the west between Canada and the US, they decided it would follow the 49th parallel out to the ocean, and then the border would run through the channel between Vancouver Island and the mainland. But in the channel, that's a problem.
You know, there's a set of smaller islands between Vancouver Island and the mainland, so the border could be on one side, in the middle, or on the other side, and because of that uncertainty there were big problems in the area. And then there's this genius US general, a hardliner with a passion for, let's say, violent action, who decides, "Hey, I will solve the problem: we'll occupy those islands militarily." People had been coexisting peacefully, waiting for the border to be established, and things were going well. Anyway, the pig of a British settler eats the potatoes of a US farmer on the island, and they can't agree on compensation: one was asking $100, the other was offering less than that. It became a whole mess; the general sees that, sends troops, tension escalates, and now you're this close to a war between the British and the US because a pig ate the potatoes of some farmer. And the guy running the show, the general, sends a letter to Washington reporting about it and asking for instructions, but he does his own thing for a full month before an answer comes back; they send someone else to keep a lid on the potential conflict, and finally that war didn't happen. But the thing is, latency is a big problem. If you have this oil and gas refinery and you need to close this valve in ten seconds or the whole complex blows up, you don't have the time to go all the way to the cloud and back before getting an answer. Edge is good at solving that particular problem, of course, because you bring the decision-making logic closer to the action, and you don't put into play the variable latencies of the public internet and your cloud provider, where things can go badly. Then there's the problem of bandwidth. Video analytics is a very trendy, very sexy type of application, but if you do video analytics on, say, a single HD feed, that's roughly three gigabytes an hour of bandwidth. If you stream that in real time when nothing is happening, hour after hour, AT&T and Verizon and whatnot will love you, and your boss will hate you. So you want to do that locally, and maybe start streaming live to the cloud when there's an incident, when you detect a robbery or whatever it is you need to detect with the analytics. So edge helps you save on bandwidth, for sure. Then there's the whole aspect of resiliency: what happens if the network goes down? Maybe you heard that in Canada, two or three weeks ago, we had a massive outage; one of the national wireless providers went down, and it took them the better part of a day to recover. All of that was because they deployed a configuration update to the core routers in the network. It went badly, and they couldn't even access their own data centers, because the card readers were on the same network as everything else, the same core routers, so they had to force open those doors, and you can imagine the mess. So, resiliency: if you have something mission critical, your network will go down at some point, and you need to continue to operate, because in the case of mission-critical, real-time things there are often lives at stake; think air traffic control, think autonomous vehicles. You can't afford to have an outage and then say, "Too bad, we'll be back in eight hours."
And finally, data sovereignty. That's really important: because of regulations, or even personal preference, you may want data to never leave a specific physical location, whether that is your home, or maybe a state, or a region such as North America or the European Union. This is especially important when you consider health care: patient data is not supposed to leave that particular hospital, and even if my doctor is flying to Japan, he's not supposed to access my patient data while he's outside the country, things like that. So it's really important, and edge computing helps deliver on that. All right, so what makes edge native different from cloud native? First, you have to assume that the network will degrade or fail. Even if you have redundant networks, people don't test for redundancy most of the time: "Oh yeah, we've got a backup cellular connection there, no problem." Did you test it? Even the cell network can go down, as Canada has shown. Optimize for size and power: that's really important. Power consumption is critical, and you'll tell me, "Hey, I don't run on battery power, no problem, I don't need to optimize for power." Yes, you do, because components that overheat because you put too much of a workload on them will have a reduced lifespan, you'll need more cooling, and they will throttle. Ask the owners of the new M2 MacBook Air, for example; that's a perfect case study for why you should optimize for power and, of course, heat dissipation. And finally, zero trust. When you put devices in the wild like that, someone at some point will steal one or more, which means that by default you shouldn't trust any device, and then you need strategies to quarantine, or at least flag, data that comes from devices that could have been compromised, that kind of stuff. That's really important. Those aren't necessarily things we think about when we build web apps that we deploy in the cloud, but they are really, really important at the edge, because we're outside the cloud, in the wild. Now let's look at those edge computing workloads: what are we running at the edge, how do we package it, and how does it influence the kind of platform we pick? First, when you talk about edge computing, people will tell you, "Oh, there's the telco edge," there are two edges, three edges, whatever, and depending on what they're trying to sell you, the answer will vary. Our vision at the Eclipse Foundation, at least in our Edge Native working group, is this: the space between the edge and the cloud is really a continuum, and whatever you're trying to deploy, the components of the solution will live in various physical places, all the way up to and including the cloud. That's true for all three planes, the management plane, the control plane, and the data plane, depending on what you're trying to build; your components will literally be all over the place. So the edge, so to speak, is anywhere and everywhere, and that's a really important consideration when we say the edge is really distributed; you see it right there. For people less familiar with the planes, just to remind you what this is about: the data plane is the actual software components, the control plane is about controlling the applications themselves and the real-time monitoring, and finally the management plane manages the rest and handles device configuration.
Every year at the Eclipse Foundation we run an annual IoT and Edge Developer Survey, and we have a parallel commercial adoption survey for business types. These are the numbers from the 2021 edition, but we'll publish the 2022 developer results in early October and the commercial adoption one in January. By the way, all of those surveys are on our website and they're all licensed under Creative Commons, so feel free to take a slide or a number and put it in your own deck; that's something we do on behalf of the open source community, as a public service, as part of our mission. So we asked people in that 2021 edition: what are you doing at the edge, what are you deploying? In that particular version of the survey we distinguished between edge gateways and edge servers; we've since removed that, and the numbers are pretty equivalent on both sides, as you'll see. The interesting thing is the variety: yes, AI is at the top, but you have control logic for industrial automation and the like, driving actual robots assembling a car, then data analytics and sensor fusion, the fact that you assemble and aggregate all of that data to push something, maybe in real time, maybe at regular intervals, somewhere else, and you do some of that mashing or preprocessing or merging on site, closer to the data. Then we asked people how they deploy stuff, and you'd think, hey, it's containers all the way. No. Yes, containers are part of the picture, but there are native binaries, especially for smaller microcontrollers where you have a premium on space and compute power, so you keep things simple; script files (we didn't ask exactly which technologies people were using, but you can imagine a number of answers there); serverless functions, which some people use, and it's certainly a good model there; and even virtual machines or virtual images on edge servers play a role. Of course you want to maximize the utilization of the hardware you deploy, and of course you want the security benefits of isolation and things like that, but there are many ways to achieve that, and depending on the workload you don't always do it with containers. So already, when I asked whether we need Kubernetes on that road trip: well, not always, because there are more efficient ways to run some of those workloads without Kubernetes involved. And I keep saying "Kubernetès"; that's my French showing up, I apologize, it's a bad tic I'm trying to correct, so please alias that to Kubernetes. Another important consideration: you're at the edge, and people keep forgetting a whole half of the universe, in the sense that we've been using Linux, BSD, macOS, Windows, whatever, and all of those go back to the roots of the Unix time-sharing system. Time-sharing operating systems essentially slice your processor time thinly and distribute it across all of the processes on the machine. That's a great model, especially to maximize utilization of the resources you have. The problem is that, yes, you maximize performance and resource utilization, but the latencies are very variable. And the problem, once again, is that if you have a very mission-critical, life-threatening type of application, like, say, the ABS brakes in your car, you don't want time-sharing.
You need some form of real-time system or operating system, where you guarantee the latencies for every operation. That's a completely different mindset, and that's really important. When I was coding, and I'm still coding a bit, unfortunately not as much as I want, we were really focused on the types of applications that take advantage of time-sharing, the fact that you scale in a certain way and all of that; but real time is a completely different game. I'm through and through an IT guy, information technology, but if you go on the factory floor, if you go to an oil field, if you go out into the wild where you deploy those edge solutions, you'll meet operational technology people, the folks who support industrial processes, process automation, industrial automation, things like that, and their mindset is completely different. You crave the cutting edge, right? The presentation right before this one: new stuff in Podman and the like. Those folks are obsessed with stability, because if this particular robot in the factory has downtime, you'll lose 16 million in production value this afternoon. That means you're very, very careful about what you do, because it's directly tied to the production capacity of an organization, or even, once again, to lives: if a pipeline fails, if an aqueduct for water distribution fails, the consequences are massive and you'll be in the news quite fast. It can happen in IT as well, but the mindset is different. The people in operational technology are developers like you and me; I was busy building web apps in my coding days, like you probably are, but their mindset is completely different, and this is why you need to take that into account when you do edge computing: you are much closer to the gap between OT and IT, and you want to bridge it, which means you sometimes need to slow down on cutting edge and listen to their specific requirements in order to deliver something that makes sense. All right, so: Kubernetes, who's got it at the edge? There are a number of options; as I said, I won't be exhaustive, I'll name a few. When you contrast the approaches, there are essentially three ways to run Kubernetes at the edge. The first one is to run plain Kubernetes at the edge: you just deploy the open source version, or the open source edition of OpenShift, they changed the name, whatever it's called now; you take that, put in a number of servers, and then you fail. That's not a good approach. So you have alternative approaches that have emerged in the last few years, like K3s from Rancher, MicroK8s from Canonical, and other alternatives; I just picked one there, but there are also Minishift, MicroShift, Minikube, plenty of stuff to love and like. The approach is always the same: you try to reduce the number of dependencies, and in the case of many of those lightweight Kubernetes distributions you remove etcd; you don't want the thing to blow up, and I hope there's no big etcd fan in here. All of that to say that the approach is typically to reduce the footprint and reduce dependencies, and in the case of K3s, they put everything into a single executable.
All right, so Kubernetes: who's got it at the edge? There are a number of options. As I said, I won't be exhaustive, I will just name some. When you try to contrast the approaches, there are essentially three ways to run Kubernetes at the edge. The first one is to run plain Kubernetes at the edge: you just deploy the open source version (or, they changed the name, the open source edition of OpenShift, whatever the name is for that), you take that, put it on a number of servers, and then you fail. That's not a good approach. So alternative approaches have emerged in the last few years, for example K3s from Rancher, MicroK8s from Canonical, and other alternatives. I just picked a couple, but there are also Minishift, MicroShift, Minikube, plenty of stuff to love and like. The approach is always the same: you try to reduce the number of dependencies, and in the case of many of those lightweight Kubernetes distributions you remove etcd; you don't want the thing to blow up, and I hope there's no big etcd fan in the room. All of that to say that the approach is typically to reduce the footprint and the dependencies, and in the case of K3s they put everything in a single executable that takes roughly 40 megabytes to run. So quite slim, but not that slim in the grand scheme of things, because many microcontrollers have fewer resources than that.

And then there's the KubeEdge approach, which essentially requires a Kubernetes control plane somewhere but then has a set of different components, the cloud core and the edge core, that you deploy in edge locations, and those have a smaller footprint. If I remember well, four megabytes would be enough to run the edge components in KubeEdge. So that's better: certainly a much smaller footprint than 40 megabytes, so we are getting there. And all of that is fine. Those projects, and products even (because there's a bunch of commercial products as well that I won't mention), all of that works well; people are doing actual deployments, no problem there. But there's one thing: you are doing containers at the edge, and there are inherent limitations to what you can do with that.

Now, there are other projects that I didn't mention, with a few Eclipse projects in the mix, and all of them have a different focus. Most of them are cloud managed; some of them can run just in edge mode without a cloud connection, so for air-gapped environments in defense, for example, that's a good choice; some of them integrate with Kubernetes, some of them don't. Anyway, that's a good reference, and it's adapted from a little paper. You have the link to the original paper, which I didn't write; community members in our ecosystem wrote it as a scientific paper and compared some of those things. You will have access to the slides, and if not, reach out to me and I will give you access to the deck, and you can read the full paper from there.

Now let's have a look at specific use cases, really to understand where Kubernetes could fit or not. The first one is the connected car. We don't realize it, but the modern car, the car of today, is really a data center. This is not just about "oh, there's an onboard computer." No, there are several computers, and they are connected to high-speed buses that are very specific to that industry, and they're trying to standardize and move towards a more generic model there. Some people are even pushing for Kubernetes in the car, and I suppose for some workloads that's perfectly fine. But when you think about it, which of those connected-car features are mission critical, things that shouldn't fail whatever happens, versus "it's okay if I have to reboot the infotainment system"? It happens even in my car that doesn't have Kubernetes; I had to learn this weird Ctrl-Alt-Delete kind of maneuver for my car because sometimes it freezes. Yeah, thank you, Ford. All of that to say that when you think about driver assistance and safety, you need to have predictable latencies. You cannot afford, say, the little algorithm driving the brakes to be killed because "oh, we just restarted the container" or something. And this shows that there are some types of applications that really could be a fit for containerization inside the car (you have this little cluster), and maybe there's a need for a different approach for some of your other components. That's really important.

The modern factory is a data center as well, when you think about it: all the levels of automation, robots not just for making things but for transporting things, autonomous forklifts and things like that.
So once again, should you run Kubernetes in that particular context? The slide is too small and it's ugly, I know; it comes from an Eclipse project called BaSyx, which is in the space of industrial automation, but what it illustrates is really the many layers that you have in the modern connected factory: the fact that you have intelligence in the programmable logic controllers that drive the robots, for example, and then you have full subsystems. Let's say you are in a pie factory: you have a full system that bakes the pies, then this huge conveyor that brings them to the cooling tower, sensors to watch all of that, and the packaging system at the other end. Many of those things are not only machines but full subsystems, with various levels of components, and all of that is automated by operational technology people. They have their own networks, their own set of standards, protocols, connectivity technologies, all of that, and the level of complexity is staggering. Once again, some of those things can fail, no problem; say, the system with the little screens that remind employees that they should smile because they have a job, that can fail. But whatever is at the core of production shouldn't fail, because you have lots of potential financial losses at play. You can ruin an $80,000 car very easily with a robot that doesn't complete its welding at the very precise moment it's supposed to. So once again, you need more robustness than a typical container-based system will provide, because you have so many levels of indirection in there.

And we spoke about AI. AI is not about one specific industry, but there are certainly things that are very specific to AI-type applications at the edge that you need to take into account. For example, when you deploy AI at the edge, you operate locally on a fragment of the full data set. That's really important, because you need some way to bring that data back to the cloud at some point in a way that makes sense, but you will make local decisions that will be a bit biased by this localization. Then you always have this concept of an error function: if you have data that doesn't make sense, or you need to validate something, you will call up to the cloud and try to figure things out there. But as we said, you have to assume that the network will go down, so once again you need to design around that fact. And then the optimization of that particular model and data set, of those particular algorithms, will depend on local conditions. Say you analyze humidity in the air and there's a broken water main above your little sensor and it drips on it: you have completely falsified whatever comes out of that particular node. You cannot prevent such incidents, you will detect them after the fact. So how do you flag potentially suspicious data that's not in line with the trends? And when you are at the edge, you need to be able to do it autonomously, without a connection to the cloud, because you never know when the network will go down. That's really an important design consideration there.
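As a purely illustrative aside (mine, not from the talk), here is a minimal sketch of what flagging suspicious data autonomously on an edge node can look like: a rolling median with a deviation threshold, so readings that drift far from the local trend get tagged before they ever reach the cloud. The window size, the threshold, and the choice to tag rather than drop are all assumptions made for the example.

```python
from collections import deque
from statistics import median

WINDOW = 60        # number of recent readings that define the local trend (assumption)
THRESHOLD = 3.0    # deviations beyond this many spreads are tagged as suspicious (assumption)

class LocalValidator:
    """Tags readings that deviate strongly from the recent local trend."""

    def __init__(self) -> None:
        self.recent = deque(maxlen=WINDOW)

    def check(self, value: float) -> dict:
        suspicious = False
        if len(self.recent) >= WINDOW // 2:          # only judge once some history exists
            center = median(self.recent)
            spread = median(abs(v - center) for v in self.recent) or 1e-9
            suspicious = abs(value - center) / spread > THRESHOLD
        self.recent.append(value)
        # Keep the reading either way; the cloud can decide later what to do with tagged data.
        return {"value": value, "suspicious": suspicious}

validator = LocalValidator()
for v in [20.0, 20.2, 19.9, 20.1] * 10:   # build up a stable humidity trend
    validator.check(v)
print(validator.check(20.1))   # in line with the trend, so not flagged
print(validator.check(85.0))   # dripping water main, so flagged as suspicious
```

Nothing here needs the cloud: the node can keep producing data, tag the dubious readings, and let the backend sort them out whenever connectivity comes back.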
So in terms of architecture, what kind of questions should you ask yourself when deciding whether to pick Kubernetes or not? First, there's the element of latency and how predictable it is. You're running a nuclear power plant, you're driving a very complex industrial process with precise timings, you're literally deciding when to brake in a connected car: once again, those are things that cannot be interrupted and should operate on a predictable time frame.

Then there's the question of whether you can afford to lose your data or not. In many applications, say a connected building or a smart hotel where you put sensors all over the place to measure temperature, humidity, and room occupancy, can you drop 15 seconds' worth of data? Probably; it won't kill anyone, we can maintain the temperature in this room without any problem. Of course, if that's the International Space Station, you cannot afford to drop a microsecond of data, because at any time you may need to correct altitude and things like that. That's a whole different ballgame.

Then there's the unicity of your instances. We said that in a stateless model, with containers, it's like in a big family: you can lose a kid, no problem, you don't hesitate. Okay, maybe not that example. All of that to say that in the case of typical Kubernetes types of applications, any given container doesn't have any particular significance; you try to be as stateless as possible. Well, in the case of industrial automation, in many processes like that, or even real-time trading on Wall Street, you cannot afford to lose anything, and you cannot afford to lose whatever state a specific container would be holding at that point in time. That's really an important consideration.

Then, how constrained are your hardware and your overall infrastructure? That's really important, and as I said, even if you have headroom you have to be conservative about the way you use those resources, because you never know when someone at head office will have a bright idea and ask you, "oh, you have those nodes, put that on them too." You need to be conservative about whatever you are using, especially to prolong the lifespan of the equipment: if you put too many things on it, it will overheat, and if it overheats, the equipment will fail faster. And then, of course, you have little to no elasticity. You may over-provision up to a point, but since you are spreading those nodes around, thousands of them probably, at some point you cannot afford to over-provision that much. We're not in the cloud.

And finally, how far should your control plane be from the actual edge devices? Once again, the farther away you are, the more unpredictable latency you are introducing into the whole system. In some applications you can afford it, in some others you can't. Those are really important considerations in designing the whole solution and deciding whether you want Kubernetes in the mix or not.
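To make the question of whether you can afford to lose data a bit more concrete, here is a small sketch of my own (not from the slides) of the decision it forces on an edge node: a bounded local buffer that quietly drops the oldest readings while the uplink is down, versus one that must hold on to everything. The send_to_cloud function and the buffer sizes are hypothetical placeholders.

```python
from collections import deque

def send_to_cloud(batch) -> None:
    """Hypothetical uplink; raises ConnectionError while the network is down."""
    raise ConnectionError("uplink unavailable")

class SmartHotelBuffer:
    """Smart-hotel style: losing a few readings is acceptable, so the buffer is capped."""
    def __init__(self, max_readings: int = 1000) -> None:
        self.pending = deque(maxlen=max_readings)   # oldest readings are silently dropped

    def record(self, reading) -> None:
        self.pending.append(reading)

    def flush(self) -> None:
        try:
            send_to_cloud(list(self.pending))
            self.pending.clear()
        except ConnectionError:
            pass   # fine: retry later and tolerate losing the oldest data

class MissionCriticalBuffer:
    """Space-station style: nothing may be dropped, so the buffer is unbounded and
    would have to spill to durable local storage before memory runs out."""
    def __init__(self) -> None:
        self.pending = []

    def record(self, reading) -> None:
        self.pending.append(reading)

    def flush(self) -> None:
        send_to_cloud(self.pending)   # let the failure propagate: losing data is not an option
        self.pending.clear()
```

The tolerance for loss is a design input, not something you bolt on later, and it directly shapes how much local storage and headroom those constrained nodes need.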
And yeah, one important thing: people have been doing, or trying to do, stateful containers on Kubernetes. I'm ex-Pivotal; we were trying to run databases there. I wasn't involved in those efforts, but really, you are either the guy walking away from the explosion or the one stuck in it, and most people who tried it are stuck in it. And really, this makes sense when you think about it: Kubernetes, from the very beginning, has been engineered as a way to scale stateless workloads. That's the religion, the mindset you should have. You can try to shoehorn in something stateful, but that's not necessarily the best model, especially at scale. So that's one important thing: if you have state, and most real-time, mission-critical types of applications have local state, then you should be careful about this. You don't want to be the guy stuck inside that fiery explosion.

So if we try to synthesize, where is Kubernetes a bad fit? Real-time, mission-critical, constrained devices: even if K3s, say, is only 40 megabytes, you can go much lower with other types of solutions. And finally there's heterogeneity. Yes, people are starting to say, "yeah, we mix and match x86 and Arm in our clusters," and then you look at the documentation for that and it's scary. It's not mature; we're not there yet. Maybe five years from now it will be mature enough to run something mission critical on a hybrid cluster, we'll see, but don't do it. Don't do it.

All of that to say that, overall, edge is not about one particular technology but about picking the right platforms and tools from the extensive open source toolkit that exists. This is from our white paper on EdgeOps: the logos in color are from Eclipse, the ones grayed out are from other provenances, typically the Linux Foundation and so forth. Not that I don't like them; we were just trying to illustrate our own footprint here. And you may argue with me to no end about the placement of one specific icon, a logo for a particular project you like; the whole point is just to illustrate them in relation to each other, and of course this is not a deployment diagram you should reproduce as-is. Okay, we are among ourselves here, but don't do that. The whole point is to show that even though we are still known for the desktop IDE and all of that, the footprint we have at Eclipse is really spread all over the edge-to-cloud continuum, the DevOps continuum, and other dimensions, and what we don't have, we integrate with. So of course OpenStack, of course Kubernetes, of course things like Podman, projects like Fledge and EdgeX Foundry: all of that is highly relevant and interesting depending on what you're trying to build. There's no one-stop shop in edge, and it's even truer in IoT and edge, because people will tell you, "hey, I can provide you everything from the sensor to the data center," yeah, through their network of partners and all of that. No one makes all of those things and tests them together in advance, so it's more often than not an adventure, and in building your open source architecture for any particular solution you have to take that into account. So please don't limit yourself; explore what's available. There are plenty of mature, well-maintained open source projects in the space, at the Eclipse Foundation, at the Linux Foundation, and elsewhere, and don't forget OpenStack and the Open Infrastructure Foundation.

All right, now, to conclude this talk, I will focus on the way we see things at the Eclipse Foundation, and when I say "we," I'm just the voice; I'm speaking on behalf of our community. We may be right, we may be wrong, but that's what we agreed on, and you are more than welcome to join and show us the right direction if you feel we are wrong. The full story (this could have been a three-hour talk) is in this handy-dandy white paper, so you have the QR code or the link to download it, and what I will do is
synthesize it a bit in the context of our conversation today. But really, this is something that we came up with together: representatives of our major members and I sat down and wrote it together. That was painful; 30-plus pages of collective writing is something I don't want to relive. Anyway, what is EdgeOps exactly? EdgeOps is the recognition that DevOps is fantastic, but when you are at the edge you don't do naive DevOps. Think about it: connected cars, everyone is stuck on the interstate, 4:30 p.m., horns blaring... "let's patch those cars!" Oh yes. Right, you don't do that. You don't do that. So yes, you want to break the division between dev and operations, that's good. Yes, continuous integration, but not continuous deployment; you have to be a bit more intentional about that, and sometimes more incremental, because you spread the updates over a subset of the nodes so that you have leftover capacity if something fails, for example. And of course microservices, infrastructure as code, all of that is good. But really, I covered the challenges that come with edge computing, so EdgeOps is really DevOps tweaked with those changes in mind, with platforms that out of the box will help you with latency, reduced bandwidth, and things like that.

There are also the characteristics of edge projects, and IoT ones too, in the sense that a typical edge or IoT project has a longer lifespan. You replace your servers maybe every two or three years, maybe even less than that, right? Well, in the case of edge, you're placing those nodes for 10 years on every locomotive in Germany, for example. That's a different ball game. Heterogeneity, I talked a lot about that. The physical constraints that you have: vibration, dust, water ingress, humidity, and so on. There are so many things, and you will tell me, "well, I will use ruggedized hardware." No problem there; the problem is that's not enough. Yes, it's an important consideration, but your data will be falsified by those vibrations, for example, and you need some level of correction, so it's not just putting some code in there, you have to be intentional about the way you write it. And then there's connectivity, the fact that the network will fail. My brother works for a major network equipment company that rhymes with Crisco, and he hates me when I say this, but you only notice the network when it fails, and when it fails, you notice it. So you have to assume that; it's really a core assumption. And of course we covered the workloads at the edge and the types of artifacts that people are deploying: you're getting more than just containers, you have VMs, straight binaries, scripts, serverless, whatever, and this diversity will grow over time. Technologies don't die, they just become niches that pay richly the old people maintaining them. I'm getting to that point in my career, I guess.

All of that to say that EdgeOps is not only a philosophy but also platforms that implement those things in a way that's different from the typical DevOps-supporting platforms. All of that action, at the Eclipse Foundation, happens in what we call the Edge Native Working Group, and there we are really about three things. We are code first: yes, we care about blueprints, but blueprints that come from our code. We write the blueprint or the architecture after we've implemented some stuff and felt real pain with real customers. The goal is really to simplify and streamline production deployments.
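Before getting to the projects, a small aside of my own to make the "continuous integration, but not continuous deployment" point above concrete: a sketch of rolling an update out in waves over an edge fleet, so that untouched nodes remain as leftover capacity if a wave goes bad. The fleet inventory, deploy_to, and is_healthy helpers are hypothetical placeholders, and the wave size and soak period are assumptions.

```python
import time

FLEET = [f"edge-node-{i:03d}" for i in range(60)]   # hypothetical fleet inventory
WAVE_SIZE = 10                                      # assumption: update 10 nodes at a time

def deploy_to(node: str, version: str) -> None:
    """Placeholder: push the new artifact to one node."""
    print(f"deploying {version} to {node}")

def is_healthy(node: str) -> bool:
    """Placeholder: check the node's workload after the update."""
    return True

def staged_rollout(version: str) -> bool:
    for start in range(0, len(FLEET), WAVE_SIZE):
        wave = FLEET[start:start + WAVE_SIZE]
        for node in wave:
            deploy_to(node, version)
        time.sleep(1)                      # stand-in for a real soak period
        if not all(is_healthy(node) for node in wave):
            remaining = len(FLEET) - start - len(wave)
            print(f"wave at offset {start} unhealthy; stopping with {remaining} nodes untouched")
            return False                   # the untouched nodes are your leftover capacity
    return True

staged_rollout("factory-app-2.4.1")
```

Integration stays continuous; the deployment decision stays deliberate, wave by wave.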
And really, EdgeOps is this north star that our members, whatever projects they are involved in, are trying to implement in one way or another. And when I say implement, that's the four projects that you see there at this point: Eclipse ioFog, Eclipse fog05 (it's written "fog zero five" but pronounced "fogos"; that's what happens when you let engineers name a project and try to be clever), Eclipse Kanto, and Eclipse Zenoh.

So ioFog is container orchestration at the edge. It will use an OCI runtime to execute the containers, but really this is about the fact that you have a central controller and edge nodes spread all over the world, and there's a built-in layer-7 proxy in there, which comes from a Red Hat project, Skupper, but has been fully integrated. So without VPNs or complex network configuration, you have the central controller, which can be in the cloud or completely at the edge, and you control the nodes wherever they are. That's really one of the tools, and it has been around for a while; it's so old that it's got "fog" in the name. If you've been in that space for a while, you know that's a bit outdated as a name, but it's got recognition, so they kept it.

The same for fog05: it's been around for a while as well, but it's a completely decentralized take, so there's no central controller. The principle is to have this unified compute fabric spanning microcontrollers to edge nodes to cloud, and it supports virtual machines, containers, straight binaries, and things like that. So a different take on it, and it's a really cool technology.

Then there's Eclipse Kanto, which is a take on the edge more oriented towards the software-defined vehicle and things like that. It's an edge platform that runs containers, and it's got specific facilities to upload and download files and things like that, so it's a bit more use-case oriented in the case of that particular project.

And finally there's Zenoh. Zenoh is a spin-off of fog05: essentially they wrote a custom protocol to run fog05 and then realized, hey, this stands on its own. So Zenoh is a pub/sub protocol, a next-generation one, and it melds what DDS could do with what something like MQTT could do. It's really powerful in the sense that it's built from the ground up for the edge. You start a few nodes, they will auto-discover, connect, and form a clique or a mesh or whatever, but then you have a routing node that you can deploy in order to route over the public internet from one location to another. The routing node is also the point where you can declare distributed storage: you have a storage plugin that will support NoSQL or SQL databases, or even the file system, or even in-memory for caching and things like that. And the routing node is the place where you have this REST API that you can access in order to create those storages. When you publish and subscribe data, you can say, let's say I publish to /us/robot/telemetry, and that particular database will record all of that; but then you could have, say, localized databases that record data by state and keep the data inside that specific state, if you deploy nodes in the corresponding physical locations. So it's a very powerful concept, and it includes support for evals and queries and things like that. This is not just pub/sub; this is literally a way to manage historical data, built into the protocol at the same time.
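To give a feel for the key-expression idea just described, here is a minimal publish/subscribe sketch using the Eclipse Zenoh Python bindings. Treat it as a sketch: exact API names shift a bit between Zenoh releases, the key expressions are adapted from the hypothetical one in the talk (recent Zenoh versions drop the leading slash), and the geo-scoped storages themselves would be declared on a Zenoh router rather than in this code.

```python
import time
import zenoh  # pip install eclipse-zenoh

def on_sample(sample):
    # Any service can subscribe this way; a router-side storage configured
    # for the key expression "us/**" would likewise capture this data.
    print(f"received: {sample}")

session = zenoh.open(zenoh.Config())

# Subscribe to all robot telemetry under the "us" scope (wildcard key expression).
subscriber = session.declare_subscriber("us/**/robot/telemetry", on_sample)

# Publish a reading under a state-scoped key; a storage limited to "us/ca/**"
# would keep Californian data local, while one on "us/**" would capture everything.
session.put("us/ca/robot/telemetry", "humidity=42.0")

time.sleep(1)   # give the subscriber a moment to receive the sample
session.close()
```

The routing, the optional storages, and the query side all hang off those key expressions, which is what makes the geo-partitioning described above feel so natural.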
All of that particular work is done by our members. You see LF Edge there; we exchanged memberships, because we support any and every open source project or community that really shares the open source way. But really, there's always room for more; I hope that particular slide will be full of your respective logos two or three years from now. So my invitation to you: please try out the technology, it's all open source, and if you're not sure what to pick, I can certainly help out with that. Follow us, etc. But really, thank you for being here till the end; I'm impressed, right? Game night is certainly a big draw. You can follow me on Twitter or DM me over there; the working group also has an alias. So, well, thank you for listening to me all that time. Thank you very much. And by chance, does anybody have any questions? We have maybe one or two minutes for one or two questions. If not: there was a lot of work in this presentation, and he had a rough slot at the end of the day when people go and try to catch dinner, but I think he deserves a round of applause, and maybe even, if you think he deserves it, stand up and give him a standing ovation. Thank you, Steve.