Hi, this is your host, Swapnil Bhartiya, and welcome to a brand new episode of our series, TFiR Topic of the Month, aka T3M. This month's topic is "Platform Engineering: Is DevOps Dead?" Our next guest is Zach Butcher, founding engineer at Tetrate. Zach, it's good to have you on the show.
Yeah, thanks for having me. I'm excited to be here.
Today's discussion, as you know, is about platform engineering. We also hear some people say DevOps is dead, so we'll also discuss whether it really is. But before we get into this discussion, since this is the first time you're joining us, I would love to hear quickly a bit about the company.
Tetrate is all about helping deliver applications safely and securely, using the service mesh as our leverage point. A lot of the original team came from the original group of engineers that created Istio at Google, IBM, Lyft and a few other companies together. We saw a really clear need in the enterprise space for helping facilitate how applications communicate across clouds and on-prem. As we're doing this kind of cloud data modernization, that's maybe the crux of DevOps and platform engineering today. So we play right in that space, helping with application connectivity, operations and security.
Excellent. Thank you for talking a bit about the company and giving the background, which sets up a perfect discussion today. As you said, your team comes from companies that were born in this cloud native era, so you folks fully understand the whole evolution. What kind of cultural or organizational restructuring are you seeing? We started out talking about DevOps, and as things evolved we talked about SREs, then DevSecOps, and now we're also talking about platform engineering. So talk a bit about what kind of evolution you are seeing in the ecosystem and what customers are really doing there.
Yeah, exactly. I think we're seeing a couple of different things, and there's a whole bunch of names that people are trying to put on things, right? People name stuff for a bunch of different reasons, sometimes just for marketing. Sometimes it is good to have these nuances. I think a lot of the names are getting confusing, in this space in particular, but fundamentally what people are trying to do is just a few activities. The primary thing we're trying to do is deliver our software to give users value. So how can we facilitate that? How can we build an infrastructure that bakes in security, observability and operational concerns, and that helps offload work from the application developers so they can focus on the innovative piece, the features? So that's one of the activities that's happening: building this infrastructure to facilitate applications. A lot of times the label that we put on the group that's delivering that in the organization is platform engineering, or the platform team. The second piece is that we need to then deliver applications successfully on that infrastructure. That's the CI/CD, the build pipelines, the testing infrastructure, all of the stuff that it takes to go from source code on the developer's laptop to an actual running thing. A lot of times today we're starting to call that DevOps. We'll come back to the topic of developer experience, maybe, and that's where a lot of that lives. And then we need to actually operate our applications successfully on that infrastructure. That includes runtime things like the metering, the logging and the actual response to an emergency. But it can also include going back into the application itself and saying, hey, we're not meeting business objectives within our application because of some fundamental application architecture, let's say.
So let's go rip that out and change it so it meets business needs. I come from Google, and that's the kind of thing that SREs at Google would do. However we label those activities, delivering the infrastructure, getting applications into the infrastructure successfully, and then operating and improving them continuously so that they meet customer needs, we have to do them. Whether we want to label that platform and DevOps and SRE, or mishmash those titles across those roles a little bit, in my mind we need to deliver those things. So how are we seeing people do it? We're seeing people adopt these cloud native stacks. Kubernetes is the base for our compute orchestration, but we increasingly see people use things like functions as a service, or Lambda, as well to help facilitate that. Because at the end of the day, again, the infrastructure is all about offloading from the app dev so they can focus on that innovative piece. So those are two of the big streams that we see occurring. Then we get into how we actually facilitate the connectivity between these things, and how we make them talk. One of the worlds that I'm dealing with is cloud, and it's highly dynamic. But, taking the role of somebody in, for example, a large financial institution, which I work with a lot, my on-prem is very static and very tightly controlled. How do I bridge the gap between these two? That's another big struggle that I see, and it falls squarely onto this platform and operations side as well. So the big trend is this move into Kubernetes and application modernization to get agility, obviously, and then facilitating that.
And the tools and techniques that we're building become the focus of the platform team. That's where I see a lot of mesh adoption as one of the general trends. That's where we see people leveraging things like network policy to get agility over some of the more traditional network-oriented controls that they have on-prem. There's a bunch of stuff we can talk about in that space. But it's that general trend of moving towards this cloud native architecture, because we want to facilitate more rapid application development and application agility.
If you make a Venn diagram and put in DevOps, SRE and platform engineering, do they overlap, or are they their own separate sets? Or, before we go there, how would you define platform engineering, how different is it from DevOps or SRE, and do they overlap?
Yeah, that's a great question. Again, I bucket it into those three big buckets. Platform engineers are typically an engineering-oriented team that's building a product, and that product is the infrastructure that the organization consumes. A lot of what I do I would call platform engineering, and it's important that we think about it as a product, because a lot of times you don't have a hammer to get applications to move onto this platform. You need to offer features that are enticing for applications. That gets, again, to that whole idea of what we can offload to the infra to help applications go faster. So that's, roughly speaking, what I would call platform engineering. Then, again, DevOps is really about how we get these applications into the infra. There had better be an overlap with the platform team there, or we're silos and we're not communicating, right?
These two should be working together to facilitate things like safe and rapid delivery, so that we can get into a world where we're releasing at a higher cadence. You have to have integration between the platform and the operations folks, as well as the application team and how they're getting stuff into production. All of that has to meld together. So there had better be some overlap, certainly in how they're talking to each other, if not in the actual duties on the ground. And then that third pillar is operations at runtime, and feeding that back to development so that we can make meaningful improvements. That's traditionally what I view as the role of the SRE. Again, that's going to interact heavily with how an application is delivered: the CI, the testing, the integration testing and the delivery of that application. But it's also going to depend heavily on the platform. Hopefully the platform is doing things like providing metrics, logging and operational insight for me as well. So really, the synthesis of those two together is that operational runtime piece. I see them as three distinct roles, but in a healthy organization there had better be some overlap between them, because they need to work cooperatively to be successful. And again, I see those titles as being less meaningful. Those activities that I described have to happen to deliver software, and whether we call them platform or SRE is, I think, a little bit less important.
When I was listening to you, I was thinking about last year at KubeCon. We often hear people say DevOps is dead, that now we're moving on. But people used to say Unix is dead, mainframe is dead. No, that's a core part of the modern economy. Every time we make a transaction, it goes through a mainframe somewhere.
Exactly.
So if I ask you, when we hear, hey, DevOps is dead, is it really dead? Can it really die?
No, exactly. Maybe the specific idea of this one engineer, cooped up, who only builds build pipelines, that I think will go away. I don't think it's even a healthy divide to have somebody who does only that. You definitely need an engineering team that maintains the infrastructure around pipelines. But I don't think it's good to have folks who are solely in that silo, because it quickly becomes only the ops part of DevOps, and it becomes kind of the operations team 2.0, right? I don't think that divide between building the application and operating the application is particularly healthy for producing teams that have customer empathy, that are able to iterate more rapidly and operate more confidently. You need a tighter feedback loop. So that idea in particular I think we're maybe seeing eroded. But that fundamental activity of facilitating how we get apps into production, integrating with the platform and the operations side, is still going to exist. Whether that's devolved into individual app teams and the platform team, or maybe we still have a little bit of a glue team, hopefully that team is either cheaper or smaller because we can offload some of the work to the platform to help onboard more easily. And maybe we offload part of it into the application team as well, so that the glue team gets leaner but focuses on the cross-cutting concerns. Like, let's make sure everybody has a secure pipeline that's actually up to date, and that we're doing SBOMs. Those are things that every individual team needs but shouldn't have to do itself. So focus them in this cross-cutting team acting on behalf of the organization, rather than one per application team.
Earlier you mentioned developer experience, and we are talking a lot about that. How would you define developer experience in today's cloud native, cloud-centric world, and what are companies doing, or what can companies do, to improve it?
In my mind, DevX is everything about consuming what we just described. As a developer, I want to take source code and put it into a machine, I want testing to occur, and I want assurance that the change I made is probably going to be good. Then I want to go ahead and roll it out. Rolling it out could mean going into test, it could mean going directly into production, or it could be a pipeline that goes through those stages. Whatever it is, it doesn't matter. At the end of the day, I want to make a change, I want to see that go out to a set of users and eventually maybe all my users, and I want feedback that it's doing what it should do and not what it shouldn't do, and that it's actually solving the problem. How we do each of those stages is the developer experience. So what are we seeing? We're seeing a proliferation of companies that focus on different elements of it. For example, GitLab and GitHub both focus on source control and on facilitating a workflow on top of that. We can talk about GitOps in just a second, which is a big part of developer experience today for folks. So we see companies like that facilitating workflows on top of source control as one of the ways that you enforce this kind of developer experience, of how people do things in your organization. We see other folks come in on different parts of it as well. We see projects like Argo and Flux, which cover a very important part: integrating with our infrastructure, seeing that it's actually working correctly, and then using things like Argo CD and Flagger to roll out safely, right?
Each of these is a different tool that addresses part of this pipeline: allow me to change my source code, produce an artifact, deploy that artifact safely, and then close the loop on whether the thing that I did worked or didn't work. I think the part that has been hard previously is fully closing that loop. We've had each of those pieces in isolation, but it's sometimes hard to get that total view. So that's key. We're seeing better tooling that helps integrate end to end to provide that feedback loop of, hey, this is the change that rolled out, this was the end result in production, and yes, that's probably a thing you want, or do you want to roll it back? Because fundamentally, that tight loop is where we want to get developers. One of the important things to note is that, in general, with the adoption of cloud native we've seen a step backwards in a lot of developer experience, because we're going from tight, curated platforms like Spring Cloud, for example, or even the J2EE world before that, where I actually had this really tightly integrated development experience, with really good local testing and a high degree of confidence that a change I make locally will work, because of the tooling and the ecosystem. So I think one of the reasons DevX is so in vogue right now is that, as an ecosystem, we're finally getting into the meat of things moving into cloud. We've been in the early tail, and we're finally getting into a lot of applications. Now we're getting to those large enterprise J2EE apps and similar that we're modernizing, and developers are looking at the state of what has been built to date and asking, how do I test on Kubernetes? Well, do you use kind or minikube, or do you deploy to a Kube cluster? There's a bunch of options, and the state is not nearly so well defined.
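Editor's sketch: the loop described in this answer (roll a change out to a growing set of users, check the result at each stage, then either promote it or roll it back) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not how Argo CD or Flagger actually work. The names `progressive_rollout`, `RolloutResult` and `error_rate_at`, the traffic steps and the 1% error threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class RolloutResult:
    promoted: bool       # True if the change reached the final traffic step
    final_weight: int    # percentage of traffic on the new version at the end
    history: List[str]   # human-readable record of each step (the feedback)


def progressive_rollout(
    error_rate_at: Callable[[int], float],
    steps: Sequence[int] = (5, 25, 50, 100),
    max_error_rate: float = 0.01,
) -> RolloutResult:
    """Shift traffic to a new version step by step, checking feedback each time.

    error_rate_at(weight) is a hypothetical probe: it returns the error rate
    observed for the new version while it receives `weight` percent of traffic.
    """
    history: List[str] = []
    for weight in steps:
        observed = error_rate_at(weight)
        if observed > max_error_rate:
            # Close the loop: the change did not work, so send all traffic back.
            history.append(f"{weight}%: error rate {observed:.3f} too high, rolling back")
            return RolloutResult(promoted=False, final_weight=0, history=history)
        history.append(f"{weight}%: error rate {observed:.3f} ok")
    # Every step passed its check, so the change is fully rolled out.
    return RolloutResult(promoted=True, final_weight=steps[-1], history=history)


if __name__ == "__main__":
    healthy = progressive_rollout(lambda weight: 0.002)
    print(healthy.promoted, healthy.final_weight)

    # A change that only misbehaves once it sees half of the traffic.
    broken = progressive_rollout(lambda weight: 0.002 if weight < 50 else 0.08)
    print(broken.promoted, broken.history[-1])
```

The `history` list stands in for the "total view" mentioned above: one record that ties the change that rolled out to the result observed at each stage.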
So I think one of the biggest thrusts in the next two or three years, and one of the things that will be most important, is facilitating how developers consume this cool tech that we're building. Because if we're just playing around with Kubernetes and building these cloud native stacks, we as a platform engineering community are not solving the problem. So how do you actually take it and make it easy to use? That's the crux. What I see is that every engineering organization is solving that in isolation, because we want to encode the business process for our business and we don't want to use general tools. When I say we don't want to use general tools, what I mean is that we want to use the general open source stuff as the bottom layer, but we want to assemble it in a way that's very opinionated for our organization. That's good, but I do think the next evolution will be starting to commoditize even some of the things at that layer, where today open source is commoditizing the layer below.
As you were just saying, there are new labels, and not just labels: there are a lot of new technologies emerging today as well. It's hard to keep track. Just look at the CNCF landscape, it's massive, which means that a lot of organizations get intimidated. Hey, we have to move to Kubernetes because everybody's moving to Kubernetes now. Hey, we need to embrace platform engineering. What advice do you have for organizations so that they embrace the right practices, what they actually need versus what everybody else is doing?
We have a big go-with-the-herd problem. I actually joke pretty regularly these days that in a lot of respects cloud native has become the new IBM, which is to say, nobody gets fired for buying it. Nobody gets fired for building a CNCF architecture.
But again, I think if we do it in an ivory tower, we're not solving anything. So the guiding principle for the adoption of any new technology has to be: how does it help me achieve, either faster, more cheaply or more safely, that end goal of delivering more value to my end customers? So if we're looking at, do we adopt Kubernetes, do we migrate, or do we remain on VMs, for example? Well, everybody's doing Kubernetes, and that's cool. However, for the past 25 or 30 years we've been running things on VMs at scale very successfully. Do you need something like the dynamism and autoscaling that Kubernetes brings to solve an end-user problem of, I can't click the app because it's too slow, because it's overloaded? Then that's a good reason to go and pursue something like better platform automation via Kubernetes. By the same token with the service mesh: especially a few years ago, when the mesh was a lot younger, I saw a lot of people who were excited about the technology and wanted to bring it into the organization. But you shouldn't unless you have a clear business need for the mesh. The very earliest use case that motivated it was, we've got a mandate that we have to do encryption in transit because of compliance. That was one of the big use cases for adopting service mesh, and that's a good reason. But adopting it because it's an interesting technology is not a good reason. That's the litmus test that we need to use for all of this. Too often, especially on the platform engineering side and on the DevOps and SRE side, where we are engineers who are building things to be consumed by other engineers, we lose sight of the end goal, which is to focus on the value that the business is delivering to users, and how you facilitate that.
That's something I try to instill across the board in our engineering at Tetrate. It doesn't matter if you're our platform team, who inside our organization manage our infrastructure and help with our testing and a variety of different things. They still have an end-user focus as well. I think as long as we can do that, we'll make a lot better decisions around the technology that we adopt.
Zach, thank you so much for taking time out today to discuss this topic with me. I would love to have you back on the show. Thank you.
Yeah, thank you. It's been a pleasure. I always love to talk about this, and I'd love to be back anytime.