We're going to be talking about deploying to Kubernetes thousands of times per day. Clearly, it's a popular topic. My name is Dan Garfield. I'm a full-stack engineer, but I also run marketing at a company called CodeFresh. And I'm William Denniss, product manager on Google Kubernetes Engine. So we're going to be talking to you today about deploying thousands of times per day. And that's really all about high velocity. So the first thing: we're both engineers, but the reason that we really want to talk about this today is because we work with a lot of really high-velocity teams. From CodeFresh, I work with companies like UNICEF, Steelcase, and Giphy, if you like cat GIFs. And I've worked with eBay and Niantic. Yeah, so these are all teams that are doing high-velocity deployment at scale. They're delivering containers into production every day. And so we've been working with these teams, and we've gleaned a number of lessons from them, the commonalities, I would say, that we think are useful to the community. Now, some of the things that we talk about do highlight some Google Cloud stuff, some CodeFresh stuff. But really, these are all things that you can take and use and implement in your own processes, regardless of where you're hosting. Yeah, so why do people choose Kubernetes in the first place? There are definitely a lot of reasons. One might be scalability, the ability to take an application and make it sort of planetary scale; avoiding downtime through the operator pattern, not getting that 3 a.m. wake-up call and needing to add or reboot a server; reducing costs through bin packing; increasing developer velocity through releasing more frequently; and, of course, infrastructure abstraction, not being tied to one particular cloud or one particular vendor. So it was interesting to me when we had a technical advisory board a few months back. We had about 40 people in the room representing some fairly big companies.
And we did just an informal poll of what was the most important feature, the most important reason that you picked Kubernetes. And to my surprise, actually, about 80% of people put their hand up for developer velocity. So I think this is a really important topic. It's an important point of why people are actually choosing Kubernetes to begin with. Yeah, it's funny, because we talk so much about scalability, so much about redundancy. But at the end of the day, what everybody really wants is to be an effective engineering organization. They want to have higher velocity, to be able to make changes very quickly. And so it does stand out to me as a surprise that 80% of the people picked developer velocity as their reason for adopting Kubernetes. And by the way, these slides are things that you can take if you're trying to sell Kubernetes to your organization. We'll provide these; you can download them, take them, go use them like a club and beat your team and say, we're going to adopt Kubernetes, that's what we're going to do. There are kind of three big reasons why you want to adopt high velocity. Right, so one important thing is reducing risk. We think that if you're releasing your software very frequently, even up to thousands of times per day, as long as you're releasing at a high cadence, your releases are more atomic. You're not batching everything up into a massive quarterly release. You're not tying up various different things, like maybe a new feature or a time-dependent feature, with a security release. And it means that you can push these out, you can observe the effects, and if you have to roll back, you're only rolling back a little bit. You're not faced with that horrible situation where you had to deliver a feature by a particular date, and then there's a risk that you may have to roll it back because of some unrelated problem.
So we think reducing risk by deploying very frequently is a very important strategy. Yeah, my dad used to say, small cuts make small mistakes, right? It's the same thing with software development: you reduce the risk. And those big giant commits, those quarterly releases, are very cost inefficient. So high velocity is also about cost efficiency. Think about it this way. When you're making changes to code, the reason that you're making those changes is because they're going to give you business value, right? You're delivering something that's gonna make your company more successful, make more money. Any moment after those changes are complete that they're not delivered into production, you're basically paying money to store them. So your CFO is gonna love this point. Basically you wanna get those things out faster because they're valuable for your company. And the third, which a lot of people don't think about: a lot of people think high velocity, lots of changes, sounds like a security risk. Well, it's actually the opposite. Being high velocity means that you're gonna be more adaptable to change. And security isn't a game where, hey, I did everything right, I don't have to worry about it anymore. No, security is a game that changes all the time, right? SSL had a bug for 10 years that no one knew about. Heartbleed happened, the internet blew up, everybody lost their minds. If you're high velocity and you walk in and there's a zero day, you're like, oh, I guess we have one extra release to do today, no big deal; this is baked into our process, we do this every day, so it's not a problem for us. So being high velocity actually also makes you more secure. Save that one for all your financial industry folks; take that back to the team, that's a super valuable reason to adopt high velocity. So how do you become high velocity?
There's a lot of common or highly repeated stuff that you'll hear: agile planning is super important. We're not gonna beat the agile drum today; I think you guys have probably got it. We're not gonna beat the microservice architecture drum; you guys have probably heard a lot of that, hopefully everybody has drunk the Kool-Aid. And obviously everybody knows that you could always use better testing. We're gonna talk about five core lessons that we've learned from working with all these different companies. So here is number one: high-velocity teams center on images. I'm gonna explain what this means. I have a quote here from Daniel Stone. He says, our entire develop, test, stage, deploy cycle is Docker native. This reduces the complexity of each step, allowing us to build with a smaller team. So what does this mean? What does it mean to center on images? Well, images are the star of the show in your release process, in your development process. An image is immutable, right? So when an image is created, it should be ready to launch. I talked to someone at a meetup recently, we were talking about this point, and they said, you mean when I push my image to production, it shouldn't download a WAR file? No, don't do that. Oh my gosh. He said, well, it's pretty big. It's like 500 megs. And I was like, God, put it in your image. What are you talking about? Yeah, so your image should be ready to go, right? And by doing that, your image becomes a point of validation. Previous to working with containers, you might validate a change set, but change sets are ephemeral. They don't have dependencies associated with them. They don't have the underlying OS associated with them. They don't know anything about infrastructure. An image doesn't have that problem. You've baked everything in, so you can move an image, and we're gonna talk about that in a second.
Now, having an image as the star of the show means that it's also connected to the entire process. So if you have a JIRA ticket or you have test steps, all those things should be associated with an image, right? So just like you have an issue associated with a branch or a commit, you should do the same thing with an image. That includes things like deployment status, and you should have a history, which we'll talk a little bit more about. Yeah, and another benefit of images: I was talking to a Japanese startup recently, they were in the process of adopting Kubernetes, and they were most excited about this idea that you can take that same image that the developer built on their machine and take it all the way through to production, through staging and tests. And it also means that if you actually encounter a bug in production, you can potentially pull that image down, connect it to your staging environment and debug that way, and you have a lot more confidence that you're actually running the same exact dependencies and the same exact code. Yeah, reproducing bugs becomes a lot easier, because you're like, oh, there's a bug in this image, I'll pull it down. I'm not trying to pull down a change set and then make sure my environment matches, right? Right, but of course, if you are pulling that image, you actually need to know what code was running in it as well. And I think that's an important next point here, which is that you need to carry that history with you. Yeah, so a lot of us think about the latest tag. You should probably forget that. If you're deploying 1,000 times a day, latest is just a random image. You don't even know what's in latest, right? So that means you can't really rely on that as your strategy. Instead, you need to have that history living on the image and be aware of it. And you'll see what I mean here in a little bit.
One of the things you should be able to support is what I call time travel. What is time travel? Well, time travel is the ability to say, I need to go back to what was running at 9:15 a.m. last Monday. Now, all of you sitting in this room, think: how fast can you answer that question? Because if it's within 30 seconds, then you're in a good spot, right? So you wanna be able to support time travel, and I'll show you what that looks like in a few minutes. You also wanna have it closely coupled with Git. So if you're looking at an image, you should be able to find out the diff, the code changes that went into that image, very quickly, right? So if something's broken and you need to debug it, you've got an image; what do you know about it? An image should not be a black box. Right, and I think the classic use case is that you're working on a bug, you're trying to fix something, and you have a patch. And the patch maybe fixed it or maybe didn't. And you have a report that the issue is still occurring. So one of the most important things, of course, is that you need to know: did the patch actually reach production? Or are you getting these bug reports on the unpatched version? If you don't know what code is actually running in that image, it's really hard to answer that question. And so as long as your image is tied to the Git commit, you can go back, look at the history, see whether or not that patch made it, and then decide, okay, well, the patch actually didn't work, or maybe it just wasn't deployed. And I think that's an important distinction. Yeah, and if you're driven by Git, then you're gonna also be using automated builds, right? Right, yeah, and I think that's a really important point, because we're just saying that you need to know what's actually running in production. If you're building those images locally, it's possible that you may have uncommitted changes in them, which, of course, blows up that traceability.
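One minimal way to sketch this Git coupling (the registry and app names here are hypothetical, not from the talk): tag every image with the commit SHA it was built from, so any image running in production maps back to exactly one commit and its diff.

```shell
# Hypothetical names: registry.example.com and myapp are illustrative only.
# In CI the SHA would come from the commit that triggered the build,
# e.g. GIT_SHA=$(git rev-parse --short HEAD).
GIT_SHA="abc1234"
IMAGE="registry.example.com/myapp:${GIT_SHA}"
echo "$IMAGE"
# Given any deployed tag, the code changes behind it are then just:
#   git diff <previous-sha>..abc1234
```

With tags like this, answering "what was running at 9:15 a.m. last Monday" reduces to reading the tag out of your deployment history, rather than guessing what `latest` contained.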
So as long as you're using an automated build server, you can feel confident that it's actually reflecting everything that's currently in Git. And if you're not doing that, it makes the previous point a lot harder. Yeah, I would actually enforce that as a security process. Well, yeah, because it didn't go through code review, right? Yeah, exactly. Building locally basically just bypasses the process. So that's what it is to center on images. The second thing that we've learned from high-velocity teams is that high-velocity teams shift left. So what does it mean to shift left? I'll give you a quote here and then we'll talk a little bit more about it. This quote is from Damon Zerchler. He says, when our engineers commit code, CodeFresh (and you could build this yourself as well) runs all the testing we need and spins up an environment just for the feature they worked on. Our QA and design teams can access this unique environment and do a level of testing that just wasn't possible before. Our test cycle went from three days to three hours. So hopefully you get the idea that this is a big benefit of shifting left. What does it mean to shift left? Well, you're all probably familiar with this diagram. This is a typical dev release process. You make a feature branch, you make a commit, and on every commit you run unit tests. And then, when you're ready, you issue a pull request. Someone reviews the code, and then you push it into staging, where you do your deeper level of testing: integration, performance, security testing, licensing, scanning, all those kinds of things. Now, typically you can only do that once you reach staging, because that's the only place where your application can access all of its associated services, its backend, basically the full application stack. The result of this is that your staging step becomes a huge bottleneck, and it becomes a cost sink.
So if you think about it, what's the most expensive part of that process that we just looked at? It's code review. Code review is very expensive, because your time is the most expensive time. Paying for a few extra cycles on compute? Yeah, compute is not a big deal. So what we can do is eliminate this bottleneck and fix it by shifting left. So this is what a shifted-left pipeline looks like. Every time you make a commit, instead of only running unit tests, you should be able to access the service you're changing along with all of its dependent services, to run not just unit tests but also integration tests, performance tests, security tests, even user acceptance testing, before you do the pull request. So what does that mean? Well, it means that if you're a developer and you're working, you're gonna get instant feedback on everything about the change that you've made, not just whether it passed unit tests; you're gonna know everything. And so you can make all the changes. So when you do issue a pull request, not only is it gonna be higher quality, but the person reviewing that pull request already knows that it works, right? We've already seen that it works. It already went up in an environment. It's already been validated. They have all the proof points they need. And then when they're reviewing that pull request, all that they're really looking for is, let's make sure that we have style here, let's make sure that we're following best practices, security, those kinds of things. So that's what it really means to shift left. It basically means taking all of that testing, your full-stack testing, and moving it so that it happens on the commit, at the branch level, for every branch that you work on. The third thing that we have seen from these teams, and actually, I think, William, you were talking about this. Well, to achieve what you're saying, you need to have a portable application.
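As a rough sketch of what a shifted-left pipeline definition might look like (the step and service names here are hypothetical illustrations, not actual CodeFresh syntax): every commit on every branch builds the image, brings up the dependent services, and runs the full test battery before the pull request is ever opened.

```yaml
# Hypothetical pipeline sketch: runs on every commit to every branch,
# not just on merge to staging
steps:
  build_image:
    description: bake the commit into an immutable image
    tag: ${GIT_SHA}
  launch_environment:
    description: ephemeral copy of the full stack for this branch
    services: [myapp, api, database, cache]
  unit_tests: {}
  integration_tests: {}
  performance_tests: {}
  security_scan: {}
  # only after all of the above passes does a human review the pull request
```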
So that feature branch is effectively bringing up an ephemeral environment, right? So it kind of looks like your staging environment, but instead of just having the one, you have as many as you have features. Yeah, you have N staging environments. Right, so that's really hard to do if your application isn't defined well. So you need to make sure that you have a very clear definition of your environment. And a good way to test this is to look at your mean time to recovery. Having a low mean time to recovery is generally a good thing anyway, right? But it's also a good metric to see how quickly you can actually bring this environment up, how quickly you can create that ephemeral branch environment. If you can't do that, it probably means that you're not doing this very well. So something we do at Google is this disaster recovery test, and we do it, I think, about twice a year. And we'll use an example like, a meteor hit a data center, or some kind of crazy example, aliens have invaded or something, just to sort of set the scene. But it's important to do that disaster recovery. And when you do that, does it always go smoothly? Well, not always, but that's kind of the point, right? That's the point, right? Yeah. So if you don't have a plan to do that, you definitely should, just from a disaster recovery point of view. And so I think, CodeFresh, you have an opinion about this with Helm, right? Yeah, so at CodeFresh we've fully embraced Helm charts to do this job for us. If you're not familiar with Helm charts, basically they allow you to define your full application stack with all of the associated services: which images are running, which versions they are, how the networking works. And so with a Helm chart, if CodeFresh was destroyed in some sort of disaster, we could redeploy it within two minutes.
But it also makes it so that if we want to spin up an ephemeral version of CodeFresh for testing, we can do that, because we have it well defined. So you're gonna get both benefits for the same amount of work. Exactly, both benefits. And, this is just kind of a side note, but CodeFresh this week launched public support for Helm charts. We've been dogfooding this aggressively internally for about the last six months with some of our customers, and we've found that it's been really, really effective. It's also our most highly requested feature. So we announced that right here at KubeCon. Thank you. Wow, a real Jony Ive moment, yes. And so that's actually Helm charts even for a bespoke application, right? A lot of people have used Helm charts for, you know, releasing software like WordPress that you expect thousands of people to run. But you're also saying that people are actually using them just to represent the full state of their application. Yeah, represent the full state. And it solves so many problems, not only from an automated testing standpoint, but also like, hey, I've got a new engineer on the team, how do I get their environment up and running? Right. And at a lot of companies that can be like a four-day process. Oh, like a 20-page document, yeah. Yeah, or a 20-page document. So this makes it really, really easy to do. Now, going into that, a lot of configuration should live outside of your images, right? Yeah, so we've talked a lot about having these ephemeral instances or multiple different environments. We also talked about the benefits of being able to take an image that's running in production and actually pull that down locally, connect it to a staging server and debug it. Of course, that doesn't work if you've baked configuration into your images. I really can't make this point enough, I think.
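A minimal sketch of the ephemeral-environment idea with Helm (the chart, release, and namespace names are hypothetical, this assumes a running cluster, and the flags are Helm v2 syntax, which was current at the time of this talk): each feature branch gets its own namespaced install of the same chart that defines production.

```shell
# One ephemeral install of the same chart that defines production,
# namespaced per feature branch
helm install ./myapp-chart \
  --name myapp-pr-123 \
  --namespace pr-123 \
  --set image.tag="${GIT_SHA}"

# Tear the environment down once the branch is merged or abandoned
helm delete --purge myapp-pr-123
```

The same chart then doubles as the disaster-recovery plan and the new-engineer onboarding script: one install command instead of a 20-page document.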
The test here is that you need to be able to run that same image in test and production. If you can't do that, it generally indicates that you have something baked in. So obviously, ConfigMaps and Secrets in Kubernetes are the way to do that. But yeah, make sure that you're passing that test, right? Can you take that image and just connect it to any random environment? If you can, then you're in a good place, I think, for a high-velocity environment. Yeah, absolutely. And it guarantees you can use the same image. Right. Yeah, so we just talked about application portability. Number four: we found that really high-velocity teams outsource cluster management. Right, and what do we mean by this? I'm not saying that everyone should just jump on the cloud, although that would be fine if you did. But we're saying your application developers should be focusing on building the application, right? So Kubernetes is fantastic in the sense that you can run it in multiple environments. You can run it on-prem, you can run it in the cloud, you can run it on multiple clouds, and a whole combination of all of the above, right? But I think the point we're trying to make here is that you probably should have a dedicated team focused on actually providing that service to your organization. It shouldn't be the app developers manually creating nodes and, like, using kubeadm and things like that. You wanna have either a central team managing that or you wanna be using a cloud-hosted offering. Yeah, and think of it like this: how many of you, raise your hand, how many of you are managing the server racks and replacing broken machines? Okay, the three of you on the ops team probably should be, right? Nobody else is, right? Kubernetes management is the same story. If you're getting lost in that rabbit hole of maintaining Kubernetes, you're not gonna be effective as a team. You need to have it outsourced.
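To make the ConfigMap point concrete, here is a minimal hypothetical example of keeping environment-specific configuration outside the image, so the same image runs unchanged in test and production:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config            # hypothetical name
data:
  DATABASE_HOST: db.staging.internal
---
# In the pod spec, the container picks the config up at runtime
# instead of baking it into the image:
#   containers:
#   - name: myapp
#     envFrom:
#     - configMapRef:
#         name: myapp-config
```

Swap the ConfigMap (or a Secret, for credentials) per environment, and the "can I connect this image to any random environment" test passes by construction.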
Either outsourced within your own company or outsourced to a cloud provider, right? And of course, when you're choosing where to do this, always consider scalability and redundancy as well. So even if you're doing a fairly small application and you're not sure how successful it's gonna be, probably one of the reasons you picked Kubernetes, as well as the high-velocity attributes, is the fact that you actually can scale it if you're successful. So make sure you're positioning yourself for that success by choosing a solution that can actually scale when you need it to. And so that brings us to open standards. And I think one of the benefits of Kubernetes is that it is open, that you can connect up multiple different clouds. You might have seen at the keynote this morning, I think Dan announced like 40 different vendors now that are certified on Kubernetes. So you can actually look for this logo, Certified Kubernetes, and you have a reasonable degree of confidence that your workload is portable. But it's definitely something to keep in mind, because you lose that openness benefit if you're choosing locked-in solutions. And just to emphasize the point, these are all the different companies that are already in the conformance program. And I made this slide two weeks ago, and I think it's increased by 30% since then. So you're definitely not short of options if you wanna retain that compatibility, right? That's great. So now, number five: high-velocity teams have to connect all the dots. All the stuff we've talked about, backing up to a high level, really boils down to empowering application developers to make changes, right? And if people can access these clusters, they can spin up the environments, you have portability, you're making changes left and right. Maybe you have dozens of changes going out every day, or thousands. If you don't connect all the dots, that's gonna look like chaos.
And so you need to have this all managed in a place where you can have everything visible. I'm gonna show you how we do it, but you really need to have that order available to you in order to make all those things happen. So I'll show you how we do it at CodeFresh, and these things are equally applicable outside. So I think everybody can see this okay. Hopefully it's not too small. Do I need to zoom in? Everybody sees okay? Mild indifference, okay. So this is a view inside of CodeFresh, and right now I'm looking at what's running on my Kubernetes cluster. And you can see I have a bunch of different services here. Luckily, they're all green. Green means good. Red means oh no, yellow means maybe. So they're all green. I can see for all my different namespaces everything is in the green right now. And if I wanted to understand what's running in production, well, first of all, I can jump over to my application where the endpoint is sitting. I can see that it's up and running, that's good. But I can also see what image is actually running with each service. So I can click on this and see immediately what image is running. And not only that, but I can see everything that's happened in the entire history. So here you can see all the different changes in the timeline, who's contributed to it. I can see that it has unit test coverage. I can see the performance latency. I can see there's the JIRA ticket that's associated with it, so I can jump straight in to see what changes were being made. I can also load up a performance report; here I'm using BlazeMeter. And I have a quality check that I've added here. Now, I also have the commit SHA. So if I jump over to the commit SHA, let's say I'm running this in production and I wanna know what was changed, I can actually jump straight to the commit inside of GitHub. So this gives you a chance to see exactly what changed. Let's say that I'm trying to figure out, maybe I pushed a security change.
And you wanna know if that change is actually in the current image. Right, so I could jump over here and then look at the commit history associated with it, so I could see those changes. I don't think any of us have ever experienced the process of having a security change and then having it be overwritten by another change a day later. No one in this room is familiar with that. There's nothing in the news about that. Shouldn't stand out to you at all. But this really makes it powerful, right? If you're trying to understand a change, you're trying to see what went wrong, you're trying to fix something, you're gonna have all that information right here at your fingertips, and it's available within 30 seconds. And it's a pretty common use case, right? Because you maybe don't know if that fix actually fixed it, but first you need to know if the fix was actually there, right? Because it means something completely different whether the fix just didn't work or it wasn't there at all. And I think that's why it's so important to be able to connect that image all the way through to the source code there. Absolutely. I can also access all the logs that were associated with this, so I can see all the steps that happened, how long they took, I can do some digging there; I can also see all the layers associated with that image. So having this available is super powerful, kind of a superpower that you can have. Now, if I wanted to commit a change, and what we'll do actually, I debated whether or not we were gonna do this because it takes a few minutes, but we'll kick it off and see if it finishes by the end. I've got my builds view here. So let's make a change to this really quick. It's always fun to do changes, right? Live demos never go wrong. We're gonna change this so it says, let's chat KubeCon. Hurrah, we'll save that. Let's commit it, git commit for that. Everybody can see this okay? Ask me about my terminal later.
Total nerd about it. Is that nano? Yeah. All right. You're not making a commit in nano, are you? It's all right, Emacs bro. You win. All right, so I'm gonna go ahead and push this. This is gonna kick off my commit in that pipeline that we just saw. Now, we'll see within, I don't know, 10 seconds that this build will kick off. There it is. Say hi to KubeCon. Hi, KubeCon. So that's the earlier best practice, right, of not building locally. Yeah, not building locally. What this is gonna do is create a one-off environment in which all the building is gonna happen. If you saw our talk yesterday on building scalable architecture, you actually got a little bit of detail about how the backend of all this works. It's gonna clone all the code. It's gonna build the image. You can see we've actually cached all the layers that were associated with the image, so it's only making changes where changes are needed. So that means my build time is gonna be roughly 24 seconds, 23 seconds. Now, when it runs these unit tests, it's actually gonna spin up the whole application stack, and then it's gonna run those performance tests. Now, performance tests are gonna take about four minutes. So we'll just stand here awkwardly for the next four minutes and wait for it to happen. No, just kidding, we'll move on, and we'll come back and check and see that it's completed in a bit. The other thing I wanted to show you, as we kind of walk around and look at this, is how rollbacks would work. So I mentioned that we've adopted Helm charts pretty aggressively. So I'm gonna show you kind of a sneak peek behind the curtain of some CodeFresh infrastructure here, live on the stage. If I jump over to Kubernetes, I can see all of the clusters I have associated. Now, luckily, this one that's yellow, that's okay; that one's in development. All my production stuff is green. So this is the actual CodeFresh?
This is the actual CodeFresh project, sorry. We're not gonna touch stuff too much. You can see staging, where there's some stuff going on over there, that's exciting. But if I jump over to releases, you can actually see that I have my production here. I can see that it's been deployed, and I can click and open this and actually see every upgrade that's happened: each one that was superseded, when it happened, what the version was, and I can jump immediately to one and roll it back. So this gives us really, really great control over managing releases of our product, and you can see from the timestamps on this, we do it all the time, right? So this is kind of that time travel thing I was talking about; this gives us that ability immediately. Yeah, we were talking about the history before, and I think this is a very important point: whoever you choose for your CI/CD, you need to make sure they actually have features like this, because the Kubernetes Deployment object doesn't necessarily contain all that history and the ability to link it back to the actual code that you're running. So I think that's definitely something to look for. And one thing that I'd mention, and we'll have a few minutes for questions here at the end, is that in CodeFresh, the ability to release a Helm chart is something that we've baked into a pipeline image step, and we've made it a plugin that you can take and use in your own projects. So that's available: if you search for CodeFresh plugins, there's a GitHub repository. We have a whole bunch of them: we have tools for deploying to Kubernetes and to ECS, we have tools for deploying Helm charts, we have stuff for working with JIRA. Basically, they're images that take arguments and can be put into a pipeline, even with Jenkins or other tooling that you might use. So that's free for everybody; enjoy it, use it, contribute, we'd love to see that program grow.
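Outside of any particular UI, the same time-travel view is available from the Helm CLI itself; a sketch, with a hypothetical release name, Helm v2 syntax, and a running cluster assumed:

```shell
# List every revision of the release, with timestamps and chart versions
helm history myapp-production

# Roll straight back to the revision that was live at 9:15 a.m. last Monday
# (42 stands in for whatever revision number the history shows)
helm rollback myapp-production 42
```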
So this gives me my view of all my Helm chart stuff, which is awesome, and I can manage that whole process. Let's jump back and look; I don't think it's been five minutes, so it probably won't be completed yet, but we can jump back anyway and see just where it's at with that build. And so what's the benefit of running the performance tests? Well, yeah, this is a really good question, because for your performance tests, even if you launch your whole stack, you're not necessarily gonna have like a million nodes dedicated to each of these. The benefit is basically that you can track performance changes over time in a sort of small-scale environment. So this is a smaller environment than what's gonna run in production, but you can actually watch the delta of latency happen, and this gives you the ability to see spikes and trends, and so over time you can track how your changes are affecting performance and which changes had the biggest impact on performance. So that's actually a really valuable feature. Now, when this finishes, and it's gonna be another minute and a half, so we'll probably take some questions in the meantime, when this finishes, it'll finish annotating the JIRA ticket, it'll finish adding all this information to the image, and then we'll actually do a rolling update into Kubernetes. So all of this stuff, why do we do all this stuff? Why do we talk about high velocity? Why do we have all these principles? Well, really, for us, we want you to be successful. As engineers, as CodeFresh, as Google Cloud, we're successful when you're successful. When you're able to deploy a lot, when you're able to be really effective, that's when we win. So that's what we want to contribute to everybody today: all these principles that lead to high velocity. To help you do that, I really recommend Google Cloud Platform as the best place to create and manage clusters.
I think that they're probably two years ahead of everybody else in terms of features, manageability, and portability. Just an incredible platform, really cost-effective, dedicated fiber; talk to me about load balancers later if you're into load balancers. I know a lot of you are, so chat with me about load balancers. That's why we really love Google Cloud Platform, it's why we chose it for our back end, and we recommend it to all of our customers. And of course, CodeFresh is an excellent way to run your CI/CD pipelines. Particularly, I think you did the deployment really well; a lot of CI systems kind of skip the deployment bit, or it's like the last bit. Right. Yeah, so we hope that all these principles are broadly applicable, whoever you use. Of course, this is us, so we're a little bit biased, but we hope it's useful in any kind of environment where you're running Kubernetes and CI/CD. Yeah, and to help you try everything we've talked about, you can actually try all this stuff for free. Stop by our booths for details. We have some special KubeCon codes that we're giving away, good for both Google Cloud Platform credits and CodeFresh credits. We're just downstairs. And then we can look back and see that this deployment step didn't quite finish, so we'll skip out on that. Right. Any questions? Question? The question is about a monolithic app, where you'd deploy the whole environment, versus this whole thing being hundreds of microservices: what's the relation between the charts, who owns those charts, and when you build them, what gets deployed into the environment? Yeah, so when you're deploying a Helm chart update, it doesn't have to redeploy images that already exist.
You can decide; Helm charts basically let you specify and say, I want you to always pull the images, or you can say, I only want you to update the ones that have changed. So it's really up to you which one you want to go with. In the demo that I just did, I was actually deploying a single service, so I wasn't redeploying the whole stack, but it's a very similar-looking process. Cool, question? So the question is, what happens if the image is bad or the code doesn't run? Yeah, if the image is bad, you can basically build into your pipeline a flag that says, A, I don't want you to continue, right? So don't go put this somewhere, stop where you're at; B, send me a notification in Slack; C, make a note on the JIRA ticket, all those kinds of things. Oh, so your question is, what happens if it goes to my Kubernetes cluster and Kubernetes for some reason can't pull the image? Maybe I didn't set up my image pull secret correctly, or something like that. Right, yeah, it's crashing. So you definitely want the liveness and readiness probes in Kubernetes. In particular, the liveness probe will actually detect that crash state, and if you're doing a rolling update, it'll replace like one of them, see the crash, and basically pause the update at that point. So you won't get left in that bad state. I highly, highly recommend readiness and liveness probes. Yeah, and you can also use canary deployments to answer that question, which are both things you can do with CodeFresh and Google Cloud. Yes, CodeFresh is cloud-agnostic. Obviously I'm biased towards Google Cloud, but you can use it with any conformant Kubernetes cluster. So if it's DigitalOcean, if it's AWS, if it's IBM Bluemix, if it's the one sitting on that dude's watch yesterday; I don't know how conformant that was, but if it's Kubernetes, we can connect to it.
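The probe setup described above is a standard Kubernetes container spec. A minimal sketch, where the `/healthz` and `/ready` endpoints and port 8080 are assumptions about your app, might look like:

```yaml
# Container spec fragment: the liveness probe restarts a crashed or hung
# container; the readiness probe gates traffic and pauses a rolling
# update if the new version never becomes ready.
containers:
  - name: my-service
    image: gcr.io/my-project/my-service:1.2.3
    livenessProbe:
      httpGet:
        path: /healthz        # assumed health endpoint
        port: 8080
      initialDelaySeconds: 15 # give the app time to start first
      periodSeconds: 10
    readinessProbe:
      httpGet:
        path: /ready          # assumed readiness endpoint
        port: 8080
      periodSeconds: 5
```

With these in place, a rolling update replaces one pod, sees it never pass its probes, and stops there instead of taking the whole deployment into a bad state.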
Yeah, and that's the benefit of Kubernetes, right? When we're pitching the integration of Kubernetes itself to you, it's like, well, you can do this for Google, but you're also getting everybody else that supports Kubernetes along for free, you know? Yeah, you can deploy to both at the same time. I can show you if I look over at... That's a really good point. I think a lot of people are actually doing multi-cloud, right, as a redundancy. It's so important. Yeah. Yeah, don't apologize. That's... Yeah? Yeah, if I can show you. We're all about hybrid and multi-cloud, and that's the great thing about Kubernetes, right: you can have that one control plane, that one way to deploy your images, and just put it wherever you want to run it. So you can see, right now in this account, I have an IBM Bluemix cluster set up. I have two clusters on Google Cloud Platform. I could add them for AWS, or I can add any custom cluster, and then you can deploy to as many clusters as you want, or you could do multi-cloud clusters; that's a whole other scenario, but yeah, definitely. The question was, what kind of image metadata are we using so we can see the change history and do rollbacks? Yeah, so basically, each step will record what happened, and then CodeFresh has a built-in registry where you can annotate the information associated with it. We actually use a standard called Grafeas; there's a booth for it downstairs. We just announced our support for it this week, so you can Google it; if you look on our blog, it'll be like the second post back, we blogged about it this morning. Basically, it's a standard for annotating images. It's very new; it's backed by Google Cloud, and I think JFrog and CodeFresh are sort of leading the way on it. The question was, how do you deploy and roll back configuration and secrets? Yeah, that's a good question.
Actually, I have a talk on this tomorrow. Yeah, so one option is using Git, obviously, for the configuration, not the secrets. I don't want to spoil the talk, but there's a cool project, I forget the name of it actually, where you can set up a private and a public key, and you share the public key with all your developers while the private key remains in the cluster. So you have a single secret in the cluster, and then you can actually encrypt all of the secrets and just put them in Git, because no one can decrypt them unless they have the private key. So it basically boils down to: use Git for everything. Of course, you don't want to put raw secrets in Git, so you encrypt them. And we talked about mean time to recovery and bringing up a new environment; it's kind of a pain if you have like 50 secrets you have to provision, so what you can do is provision that one secret, which bootstraps all the rest. We also provide, inside of CodeFresh, the ability to export secrets and move them into another pipeline, so if you want to set up a new environment, it's actually really easy to duplicate the pipeline, set a new target, make your change, and test it. No, you don't need to rebuild the image; you can basically have a configuration-specific pipeline, and then have your build pipeline kick off the configuration pipeline, or you can run them separately, it's up to you. Yeah, so search for GitOps tomorrow, that'd be a keyword. GitOps, go ahead. That's a good question. The question was, you guys are using Helm charts; what are the special advantages of using Helm charts, and what are maybe some disadvantages? Yeah. You're probably more familiar with the disadvantages. Yeah, I've got a couple. That's where you're the best.
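The encrypt-secrets-into-Git pattern can be sketched with plain `openssl`; this is just an illustration of the idea, not the specific project mentioned. The private key lives only in the cluster, developers get the public key, and only the ciphertext is committed:

```shell
# One time: generate the keypair. private.pem stays in the cluster;
# public.pem is distributed to all developers.
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out public.pem

# Developer side: encrypt a secret with the public key; the resulting
# ciphertext is safe to commit to Git.
printf '%s' 's3cr3t-db-password' |
  openssl pkeyutl -encrypt -pubin -inkey public.pem -out db-password.enc

# Cluster side: only the holder of private.pem can decrypt.
openssl pkeyutl -decrypt -inkey private.pem -in db-password.enc
```

This is also how the one-secret bootstrap works: provisioning that single private key into a new environment unlocks all 50 encrypted secrets at once.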
Well, one of the things is, Helm has a certain kind of authorization model, so if you've really locked down your cluster, you have to think about Tiller in Helm, because what you don't want is a brilliantly configured cluster, all nice and secure, and then you just open this sort of door into it. I mean, it's okay, you just have to be aware of what you're doing and set it up correctly; that's a common pitfall. I would say that Helm charts are by no means perfect; they're under very active development right now. I mean, this is Kubernetes: we launched 1.8 a month ago, 1.9 is about to drop, so there's a lot of active development, and I'd encourage you to jump into that working group. But what we've seen is that of all the different standard ways of defining an application for Kubernetes, Helm charts are the most comprehensive, and they're also the easiest to use. They have some really nice features in how they handle variables and configuration construction. So in our minds, we weighed this; we started looking at this a year and a half ago, trying to decide what the standard was gonna be, because we had done Docker Compose and Docker Swarm support in a similar way, and we said, well, what's it gonna be for Kubernetes? When we really sat down and went through everything, we played around with starting our own standard and talked to some people about that. We worked with Bitnami a little bit, and basically we all kind of came together, and Bitnami has now launched Helm apps. So I think the large players that are really interested in application definition are settling on Helm charts now. I think it's gonna be the standard, always subject to change, but I think that's the future.
Yeah, just to add to that, it's definitely under active development, so it does look like Helm's winning, but there are a lot of different options, so watch that space, I would suggest. And there are alternatives; you can define your application perfectly fine just by having YAMLs for everything as well. If you're doing that, what I recommend is having either a separate namespace or a separate cluster. We just made cluster management free, and the trend seems to be free clusters for clouds anyway. So whether it's a namespace or a whole cluster, the benefit is that you don't have to change the variables, right? You can just repeat the whole thing. So you can have a bunch of YAML files, maybe with a configuration YAML that's different for each environment. Some people even still use Compose, right? So that's, you know. I'd say Helm is probably a safe two-year bet at a minimum. We'll see what new standards come out, and we'll support those when they do, but I think that's the standard for now, and as things develop, we'll make changes. If you're interested in that topic, Brian Grant, a tech lead on my team, has a lot of thoughts on the matter, so follow him, I guess. I think he has a talk at KubeCon, yeah. What was your question? Yeah, so there are two different places where you think about secrets. One is inside of a cluster: I can actually work with my config maps here, so I can create a config map for a specific namespace inside of CodeFresh and manage those, and I can encrypt them. And then there are secrets associated with images; I can do the same thing there. So if I'm looking at, you know, a pipeline, I can go down to environment variables, and I can create and encrypt those on the fly, and then store them and make them portable.
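The plain-YAML-per-environment approach described above can be as simple as keeping one set of manifests and varying only the namespace plus one environment-specific ConfigMap. Names and values here are illustrative:

```yaml
# Everything except this ConfigMap (and the namespace you apply into)
# is identical across environments, e.g.:
#   kubectl apply -n staging -f manifests/
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: staging            # the only per-environment difference
data:
  API_URL: "https://staging.example.com"
  LOG_LEVEL: "debug"
```

Because the namespace isolates everything, the rest of the YAML can be repeated verbatim rather than templated with variables.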
Yes, well, yes, yeah. For Kubernetes, yes, the config maps will be in the cluster. Any other questions? Yeah, so I think this really encourages microservices, because if you have microservices, you can basically have a small team working on one microservice, so there aren't like a million changes to a single microservice happening every day; that's actually where you're more likely to run into problems. Your services should maintain interoperability, right? That's the whole point of doing it. So you're saying they don't have to constantly be up to date, they can actually lag a little bit? Yeah, they should be able to lag a little bit, and it shouldn't be a huge problem. But again, before it deploys, it's actually gonna rerun it against whatever is currently the latest thing and then deploy it. So you'll always get that smoke test, and you'll always be testing against whatever latest is, and then if you run into issues, you can pull down and figure out what went wrong. So where's a good place to make and test those changes? I think one of my favorite features with CodeFresh is actually the ephemeral feature-branch environments. So while Minikube is great, you can also just use that as your own little environment. Yeah, Minikube is nice, but it does some things that, I would say, don't pass the Kubernetes conformance spec. Well, actually it isn't certified; it probably should be, but no one's certified it yet. I don't know if it does or not, but no one's bothered to run it. It has some limitations, yeah. The other thing is, depending on how many microservices you have, performance can be a problem too. So the nice thing about the ephemeral environments is that they have all the resources they need. Yep, you scale up for what you need. I had this funny experience; I was at a very large company, I won't name them, and we were getting together for lunch.
So walking through the engineering floor, there was this engineer that had two computers on his desk. And I was like, why does this dude have two computers? He's that good? He can just do both hands at once? And they're like, no, no, he makes a change, he commits, it builds, and then he goes to the other computer. I was like, oh my gosh, this is a big company. This is a company you all know. So it was really shocking. Not my company. Not your company. But I wouldn't say their builds are actually super fast either. Cool, question? Yeah, of course. So we have a customer named Steelcase; Damon Zirkler was the guy I quoted in this presentation. What they actually do is use live data. When they do their tests, they basically pull down live data and load it into images that they put as part of their test pipeline. It's kind of an interesting idea, because you think of a database as being a stateful thing. Depending on the changes, on what you're testing, you could connect the thing you're testing to the real database, if really you're just worried about reading or something. But you could also take that data and bake it into an image that is in essence stateless. You're gonna make changes to it, but you don't really care if it dies, unless that flags a problem for you. So you don't necessarily care, when you're testing, whether that data's gonna be around forever, so having a persistent data layer may not be as important. A lot of times we'll see people bake the data into images and then use those as part of their test infrastructure. I see, right. So there's a new effort called Service Catalog which is meant to solve that. Service Catalog allows you to specify like a SaaS resource: hey, I just want a MySQL database or something like that. It's not quite ready yet, it's still in development, but that's the goal there.
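The bake-test-data-into-an-image idea can be sketched with a short Dockerfile; the snapshot filename and password here are assumptions, and this relies on the official `mysql` image's documented behavior of running any `.sql` files in `/docker-entrypoint-initdb.d/` at first startup:

```dockerfile
# Throwaway, effectively stateless test database: a sanitized snapshot
# of live data is baked into the image, so every test run starts from
# the same known state and the container can be discarded afterwards.
FROM mysql:5.7
COPY snapshot.sql /docker-entrypoint-initdb.d/
ENV MYSQL_ROOT_PASSWORD=test-only-password
```

Rebuilding this image on a schedule keeps the test data fresh without ever making the test pipeline depend on a shared, persistent database.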
So I think for now the solution would be to actually have a representation of that which you could deploy in the cluster, but we definitely wanna solve that problem. Yeah, you basically have two options. One, if it's like MySQL, then you're like, okay, I can create an image with MySQL data and use that. If it's like RDS or something, you're basically like, well, I wanna duplicate the RDS and have it sitting available for lots of different test things to hit, and depending on what you're testing, you may be okay with it being a shared resource. Right, but I definitely get the point where you might want the actual thing. Yeah, totally, 100%. But that's always gonna be an issue with something like Salesforce. You have to test against the Salesforce API; we're not ever gonna get a Salesforce image, it's not gonna happen, right? So you kind of have to work with these sandbox environments that you duplicate externally too. But yeah, Service Catalog would be the one to watch. Yeah. Any other questions? Will the slides be uploaded? Yes, they're gonna be uploaded onto the schedule, and if you watch CodeFresh on Twitter, I'll also tweet them out. The question was: different teams create these images, there are so many brilliant teams creating these images; how do you ensure that your cluster has the capacity to run all of them? Yeah, so are you running the cluster on your own infrastructure, or is it cloud infrastructure? So node scaling is something that Google Cloud does really well. If it was GKE, you would look at the cluster autoscaler. There are two types of autoscaling in Kubernetes: there's pod autoscaling, which is scaling up the deployment based on demand, and then there's cluster autoscaling. That takes care of your problem, which is that someone just scheduled a whole bunch of containers and there aren't enough nodes.
The cluster autoscaler says, hey, it looks like you need a bunch of extra nodes, and it will provision them within about 45 seconds. Yeah. And vice versa, by the way; it can actually scale it back down, you know. Google Cloud is very good at this. There are also solutions for doing it on Amazon. But basically, when you're looking at where you're going to host your Kubernetes, node scaling is a super critical feature to have if you want to support that scalability. And CodeFresh's infrastructure is entirely elastic, right? We have 15,000 users, and we don't get advance notice of when they're going to be heavily using the platform. So we basically have node scaling enabled, and when Kubernetes isn't able to schedule a new pod, GKE says, hey, I noticed that this pod won't schedule because there aren't enough resources; let me add some nodes for you. Boom, everything happens. And then when the high times have died down, it's smart enough to say, looks like you're not using these nodes, we'll pull them down so you're not paying for them. This is actually a thing I love about Kubernetes, because I think you can save a lot of money doing it too. In the past, you'd have to provision for your high watermark, right? And the great thing about cloud in general, and Kubernetes, is that you can just pay for what you're using. Google Cloud actually has per-second pricing, and has had it for like two years. I think Amazon just added it. We had per-minute. Per-minute pricing, sorry. But now we have per-second, yeah. Oh, now it's per-second? Yeah, as of like a month or two ago. So now you're ahead of Amazon. No, they actually went from per-hour to per-second. They're going to go to nanosecond, bro. Don't even worry. We thought about it. What was your question? So one approach is you can have two repositories, one for all the config and one for the code.
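The two autoscaling layers compose: a HorizontalPodAutoscaler adds pods under load, and the cluster autoscaler adds nodes when those pods no longer fit. A minimal HPA using the 2017-era `autoscaling/v1` API, with illustrative names and thresholds:

```yaml
# Pod-level autoscaling: grow the Deployment when average CPU
# utilization across its pods exceeds the target.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  minReplicas: 2
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
```

When the scheduler can't place the extra pods, the cluster autoscaler (enabled per node pool on GKE) provisions nodes, and it removes them again once utilization drops, which is exactly the pay-for-what-you-use effect described above.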
That way you're not doing an image rebuild every time you change the config. You can also then treat config changes as if they're code and put them through a full review process. I've also seen some of our users define that config as a secret, or basically as an environment variable, so that they can just change it on the pipeline they're working on and get what they need. So that's another way to approach it. But I'll plug my talk tomorrow again, which is GitOps, which is relevant, and I'm pretty sure you can apply everything in that to CodeFresh too. Last question, and then we're going to shut down and go. Yeah, we work with companies that have hundreds of microservices. You can do a couple of things. One, you can use chained pipelines: you can have a pipeline and then make the next pipeline available to it. So let's say you have a pipeline that's just the performance steps, and all that pipeline needs is to have an environment passed to it. It's almost like an API, right? You say, this pipeline is available, you can plug into it, it will do all its work and pass back the information wherever you need; you just need to pass the variables into that pipeline. So you can use centralized pipelines like that. The other thing I'd mention is that Helm charts have dependency management, which matters for microservices. So when you have a service, you can say, these are the services this microservice depends on, and then you're not necessarily taking the whole stack; you're only taking the portion you need. This is something that Google's actually really good at. Google uses a single Git repo for everything. Oh, not Git, but yeah. Sorry, Google uses a single repo for everything: Chromebooks, Google Cloud, Gmail, everything.
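Helm's dependency management (Helm 2's requirements.yaml) lets each microservice's chart declare just the sub-charts it needs. The internal chart name and repository URL here are illustrative:

```yaml
# requirements.yaml for a microservice's chart (Helm 2): deploying this
# chart pulls in only the declared dependencies, not the whole stack.
dependencies:
  - name: redis
    version: "~1.1.0"
    repository: "https://kubernetes-charts.storage.googleapis.com"
  - name: billing-api                  # illustrative internal chart
    version: "~0.3.0"
    repository: "https://charts.example.com/internal"
```

Running `helm dependency update` then fetches exactly those charts, which is how a team can release its slice without dragging along the other hundreds of services.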
It's all in one giant repo. The way they do that is they have really good dependency management, so if I'm working on a change, I can pull down just the stuff I need to work with, and not the 47 terabytes, or whatever it is, that make up all of Google's code. Probably more than that. Cool. Well, thanks, guys. This was really fun to have you hang out, chat with us, and ask questions. If you want to talk more, hang out at our booth. Thanks.