I'm live right now. Oh wait, yep, looks like we're live right now. Sorry for that infinity video. I'm gonna make my Zoom full screen. Awesome. Hi everybody, I'm Priyanka. I am Director of Cloud Native Alliances at GitLab. I contribute heavily to the Cloud Native Computing Foundation and I'm on the governing board now. It's a recent thing. I'm very excited about it. So we're gonna kick off 2019 with a conversation with Chris Dutra here, who is from Schireson. I wanted to chat with him because we met at KubeCon and he was at a panel I conducted, which was on navigating the weeds of Cloud Native tooling. In that conversation, I had T-Mobile, CVS, Lyft and Delta Airlines come on stage and share their thoughts on Cloud Native, how they see it impacting their organizations and, finally, the challenges they face when looking at the myriad tools that are available. So today we're continuing that conversation and I'm so excited to have you here, Chris. Welcome. Totally. All right, so to kick off, I'll let you introduce yourself. Awesome, very cool. So you were at the panel. I'd love to hear your brief takeaways so that the audience can learn as well. Sure. So CVS, for example, is a traditional brick and mortar type of store that is obviously in big competition now with what we call the Amazon economy. Everything's online, everything's ordered. So I think for them to be able to migrate to, or start, or continue their Cloud Native journey, that's super important for the survivability of the business. And that's kind of something that you see a lot. You see, for example, Sears, right? They failed to innovate and as a result, they're having a lot of financial difficulties. Unfortunately, yeah. And I think for someone like a CVS, that's a super important part, because folks don't need to walk into a store anymore to get what they need. They may go online to refill their prescriptions or something like that.
Also, when you think of CVS, they're partnering with Target now, and Target is pretty much known as being pretty far along on their Cloud Native journey. So, especially from a partnership level, that's super important. Now, T-Mobile in a sense is kind of along those lines, but I think they're a little different because not only are they kind of a brick and mortar store, but they also sell a service, right? So I think what's important there is bridging the gap: they want to go to 5G, and it doesn't make sense for them to be on legacy technology. So it's important for them to have that, at least from a marketing standpoint. Lyft, I think, had probably the easiest Cloud Native journey because they're a relatively young company, and from that standpoint, they're not dealing with a lot of different types of legacy assets. That makes it a lot easier for them to innovate. Also, having competition such as Uber, they're in a position where they need to continue to innovate. Yeah, innovate or die. Yeah. Yeah, totally. It was interesting because, do you remember how Matt was like, everybody else was talking about their Kubernetes plan and how they were on it, and Matt's like, oh, we're actually not on Kubernetes right now? And I know they're planning on it, but it was so interesting because they're relatively young. They were able to build a lot of stuff themselves in the beginning, and Envoy is an example that they contributed to open source. Yeah. And I think, lastly, Delta was interesting to me because when you think of the criticality of the services that they provide, right? A Cloud Native journey for them is a little different than a lot of other companies because of what happens if their stuff goes down.
We're talking about massive delays at the airports, we're talking about, you know, maybe planes can't get off the ground because of scheduling conflicts or something like that. The criticality of their systems is super important. So they probably have to take additional steps with regards to security, with regards to fault tolerance and things like that. That's super important, but it's great to see that a company like Delta is able to actually get going on their Cloud Native journey. Yeah, I think that makes total sense. By the way, I'm going to do a quick gut check to make sure our YouTube live stream is working fine. I've hit record here just as a precaution on Zoom, but let me do a double check real quick. Sorry for anyone watching live. Okay, let's see here. Yep, that looks like it's working. Every time I go there, it gets, you know, the infinity spinner, so that's what scares me. Oops, never mind. I'm pretty confident it's working, and worst case, the recording is happening. So we're good. All right, cool. So, no, I absolutely agree with you. I think it was a great panel and everybody had different problems, which is interesting, you know? And so speaking of different problems, I'd love to hear where you're at with Schireson. Sure, and ours is an interesting one. It's not unlike what those folks were going through, because traditionally, when you think of data scientists, they're going to have huge desktops with graphics cards and all of that to run different types of machine learning and artificial intelligence. But then you come up with this great software solution, and how do you scale that out so multiple people can use it? You can scale humans, but that's probably not efficient. So when we think of scaling software, the cloud really comes in handy.
Now, my background is that I've been using Kubernetes for about two, three years now. So bringing it in was a pretty natural fit for a lot of the backend services and front-end services. The challenge really was migrating those data science workloads onto Kubernetes. A lot of the challenge there was that the resources are not deterministic. So you could have spikes in CPU and memory, and you had to be careful to make sure you didn't bring down your nodes, for example. Right. There have been plenty of times we've run into that. So that's why we had to get innovative around how Kubernetes interacted with our data science workloads, but also important to that is how Kubernetes managed them as well. We didn't want to get to a position where we were spinning up a hundred nodes and keeping them forever, because that's not cost-effective. But also, if a client is running a workload and it's taking too long, they're picking up the phone and calling us. So between the two, we had to find that happy medium and give true service. Yeah, I think that makes sense. So why did you use Kubernetes in the first place? What was the biggest motivating factor? I think the biggest motivating factor was that, A, you know, we have about 30 engineers and 25 data scientists or so. Gotcha. And so the supportability, you know, I came in as SRE number one. So having to manually orchestrate and support all of that is not going to fly. Kubernetes does a lot of that, like being able to restart pods and do all those different things. And when you're comparing it to solutions like Docker Swarm or something, the richness of what you get out of the Kubernetes API is far superior to Docker Swarm, in my opinion.
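For readers following along: the standard Kubernetes guardrail for the spiky, non-deterministic workloads Chris describes is per-container resource requests and limits, so one job's CPU or memory spike can't take out a whole node. A minimal sketch of that manifest shape, built as a plain Python dict; the names and numbers here are hypothetical, not Schireson's actual configuration:

```python
import json

# Hypothetical pod spec for a spiky data science job. "requests" tell the
# scheduler what the pod normally needs; "limits" cap a CPU/memory spike
# before it can starve everything else on the node.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "feature-training-job"},
    "spec": {
        "containers": [
            {
                "name": "trainer",
                "image": "example.com/trainer:latest",
                "resources": {
                    "requests": {"cpu": "2", "memory": "4Gi"},
                    "limits": {"cpu": "4", "memory": "8Gi"},
                },
            }
        ],
        "restartPolicy": "Never",
    },
}

print(json.dumps(pod, indent=2))
```

With a memory limit set, a runaway job gets OOM-killed by itself instead of bringing the node down with it, which is exactly the failure mode described above.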
So ultimately, and plus too, the community around it made it a lot easier for us, if we had issues or anything like that, to get up and running and keep running, which was really important. So kudos to the community for that. But in turn, when you look at the CNCF ecosystem, right, there's a lot of great tooling being built, and some of it is specifically for Kubernetes. We initially went down the path of looking at Envoy and ended up using Ambassador. That's actually being used for our API gateway. So essentially we have Envoy at the edge, but we're using it in a way that, you know, it's basically a nice API gateway that all of our services are connected to now. A lot of people have been talking about Ambassador as a product out there. It's really neat. You essentially get Envoy at the edge, and what's nice about that is it's really super simple to set up, but also, if you're used to working with Envoy metrics and observability, you actually get all of that. You can hook all of that into Prometheus and you have it all ready to go. Oh, that's sweet, yeah. The only thing, and they're working on it now, is actually getting Ambassador to start using the V2 APIs from Envoy. I think once they have that, you're gonna have a lot more feature-rich things, but the Datawire team did a fantastic job with it. Yeah, no, I heard similar feedback at KubeCon. And as you know, at GitLab we're a believer in a single application for the entire DevOps lifecycle, and we wanna embed the observability experience into the rest of your workflow. It sounds like Ambassador is making progress in that direction as well, at least from the service mesh, API gateway, and observability perspective, which is super cool. Obviously the world is moving in a direction that we agree with and have seen work really well. So how did you find Ambassador? I'm curious.
It all started, actually, when we were going through this and I took a couple of the folks, because the developer team here didn't have a ton of Kubernetes experience, to a meetup at HBO, which is across the street. And some of the folks from Envoy started showing off Envoy at the meetup. Where is this, by the way? I forgot to ask where you're based. I'm in New York, yeah. Okay, I got it, I got it. HBO, that makes sense, that's why it's across the street. I was gonna say, it's definitely not San Francisco. Yeah, yeah, no. So we were... Sorry, you broke up for a second. Can you hear me now? Yes. So we were there and they were talking about Envoy, and we saw a lot of the traffic shaping and different things, and that was great, and we really wanted to go deeper into Envoy, but we had deadlines, of course. So when we were looking at Envoy, there was Ambassador, which was kind of used as the Kubernetes layer on top of Envoy. We were able to get Ambassador up and running, and it was honestly probably two or three days of work to get the configurations the way we wanted them, and we had it good to go in production. Nice, that's a big deal. Yeah, it really was, because at that point we were probably 10 engineers and I was the only SRE. We've grown a lot since, but to be able to deliver software as quickly as we needed to, that really came in handy. Absolutely. And did you... So you know how on the panel, Jasmine from Delta had the AEIOU framework for deciding whether a tool is ready, and I think the E was enterprise readiness? How did you know that? So I think a lot of it, you know... I think when you combine something like that with the kind of enterprise-grade readiness of a lot of components in AWS, we felt pretty confident about it. Hooking it up as a traditional Kubernetes service, it just tied into an Elastic Load Balancer for us.
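For context on that last point: in Ambassador versions of that era, you deployed it as an ordinary Kubernetes Service of type LoadBalancer, which on AWS is what provisions the Elastic Load Balancer, and you declared routes as Mapping YAML inside a getambassador.io/config annotation. A hedged sketch of that shape as a Python dict; every name here is illustrative, not Schireson's setup:

```python
import json

# Illustrative Ambassador edge service. "type: LoadBalancer" is what makes
# Kubernetes on AWS wire it to an ELB, so no Ingress resource is involved.
ambassador_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "ambassador",
        "annotations": {
            # Classic annotation-style Ambassador config: a Mapping that
            # routes /api/ traffic to a (hypothetical) backend service.
            "getambassador.io/config": (
                "---\n"
                "apiVersion: ambassador/v0\n"
                "kind: Mapping\n"
                "name: api_mapping\n"
                "prefix: /api/\n"
                "service: example-backend\n"
            )
        },
    },
    "spec": {
        "type": "LoadBalancer",
        "ports": [{"port": 80, "targetPort": 8080}],
        "selector": {"service": "ambassador"},
    },
}

print(json.dumps(ambassador_service, indent=2))
```

Exposing new routes then becomes a matter of adding another Mapping, which is the "deploying a service and plugging the other deployments into it" workflow described next.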
So it wasn't like we were sitting there thinking, okay, we have to set up Ingress or anything like that. Ingress wasn't involved in this at all. That's amazing. Yeah, so from that standpoint, for us it was like deploying a service, essentially, and then plugging each of the other deployments into that. Yeah, that's super cool. And so clearly you guys have adopted Envoy through Ambassador, and it sounds like that's really going well for you, in particular because observability is now baked into your tooling there. So how do you connect this piece of tooling, the metrics around things, with the rest? For example, your CI/CD pipeline, your unit tests, all of that. Sure. So I think we keep it simple for now. One thing we are looking at, because we're actually building out kind of enterprise-grade continuous delivery now, is exactly that. I think this plays in nicely to where we want to go with A/B testing, canary releases, traffic shadowing, those types of things. So we're looking into that. For us it was kind of, let's keep it simple and then we can expand on it. So we would have traditional rollouts of updates. We would update deployments in real time and they would just be kind of rolling updates. Gotcha. I'll put in a mandatory plug: since you're looking at CD solutions, you should definitely check out GitLab. I'm just gonna say that. But okay, awesome. So tell me this. One of the things we discussed in the panel, right, was that there are so many tools out there for so many specific problems, for a specific kind of system. What's been the experience in your team when it comes to, okay, we decided to go cloud native and, okay, now we have 500 tools? Or did that not happen? What's been the experience like on that point? So that has happened to a certain degree.
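A quick aside on the rolling updates Chris mentions: a Kubernetes Deployment's default RollingUpdate strategy swaps old replicas for new ones a few at a time, so serving capacity never drops below replicas minus maxUnavailable. A toy simulation of that idea in plain Python, purely illustrative and not the real Deployment controller:

```python
# Toy model of Kubernetes' RollingUpdate strategy: replace old ("v1")
# replicas in batches of at most max_unavailable, recording the pod set
# after each batch. Capacity never dips below replicas - max_unavailable.
def rolling_update(replicas, max_unavailable):
    pods = ["v1"] * replicas
    history = []
    while "v1" in pods:
        # take down at most max_unavailable old pods at once
        batch = min(max_unavailable, pods.count("v1"))
        for _ in range(batch):
            pods[pods.index("v1")] = "v2"  # swap one old pod for a new one
        history.append(list(pods))
    return history

steps = rolling_update(replicas=4, max_unavailable=1)
for step in steps:
    print(step)
```

With max_unavailable=1 and four replicas, the rollout takes four batches, and at every point at least three pods are serving, which is why this "keep it simple" approach is a safe default before moving on to canaries and traffic shadowing.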
And I think the key thing, I kind of have two philosophies with it. One is right tools for the job. One thing that we try not to do here, just because we're not Netflix, we're not Facebook, we don't have a million users per second or something like that: for us, as long as it's reasonably satisfactory, I think we can get by with certain tools. So there wasn't ever a need for us to say, hey, we should really revisit native Envoy at the edge when Ambassador is working just fine. So one thing is the right tools for the right job, but also, if it's not broken, you might not want to fix it. I think that's also something to keep in mind and consider. When it came to certain things, and I have another example of this, thinking about data pipelines and ETL and so forth, we had a bake-off. There were certain teams that were super interested in using their particular tool. Sure. So we gave them a week, and whoever came up with the best solution is who we went with. Gotcha. And what's nice about that is you get a nice little hackathon thing. Right. Culturally, it's great for the engineering team. Yeah. Also, you see which tools shine. For us, Airflow is amazing. That's very well-regarded by people, yeah, for sure. It ended up being used and we're using it everywhere now. Oh wow, that's nice. So just to understand: different people, different teams had different tools of choice because they thought it was the right tool for the job. However, it sounds like the mandate was, we've got to consolidate, because you don't want to have like 52, right? Exactly. I mean, we're a small team too.
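For anyone who hasn't used Airflow: the reason it keeps coming up for ETL is that it models a pipeline as a DAG of tasks and only runs a task once its upstream dependencies have succeeded. The core scheduling idea can be sketched in a few lines of plain Python using the standard library's topological sorter; the task names here are made up, and this uses no actual Airflow API:

```python
from graphlib import TopologicalSorter

# Toy ETL pipeline: one extract feeds two transforms, which both feed a
# load step. Airflow-style scheduling means executing tasks in an order
# that respects these dependencies.
deps = {
    "extract": set(),
    "clean": {"extract"},
    "enrich": {"extract"},
    "load": {"clean", "enrich"},
}

order = list(TopologicalSorter(deps).static_order())
print(order)
```

Real Airflow adds retries, scheduling intervals, backfills, and operators on top, but the dependency graph above is the mental model that made it a natural fit for the data pipelines discussed here.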
So from the SRE side, there are only certain things that we can support, but we were open to it, and that's why we encourage the bake-offs: people are motivated to show off what works for them, and also we get an understanding of what the value proposition is for these tools. It's the best way to learn, too, not just by reading, but by seeing it, and then it actually sticks with you. Yeah, no, that's really smart. I think if anyone's listening, this is a good idea: do a bake-off. Because then it's less about emotional opinions, which have their place too, and more just like, okay, the proof is in the pudding. And it's interesting to me. So you said you're not a huge team, and still there were a bunch of teams that wanted to participate in the bake-offs. I'm curious how many people or teams you have and how many were in the running? We had, I think, three tools in the running. We're kind of like one large team. I mean, we have functional aspects to each team and what they work on, but at that time we were one big team, and there were pockets of folks that had opinions on things. And how many people was that? I'm just curious to see, like, was it usually two tools in the running or something like that, I don't know. I think it depends, right? In some instances, for example, the consensus largely here was that Kubernetes won out; there wasn't an actual bake-off or anything like that. With certain things like ETL, we went through that. I think we had a handful, but I don't think we've ever had more than two or three. It was mostly people who had certain opinions about one or two, mostly based on past experience. Yeah, and like you said, how big is that team? Sorry.
So our team is about, we have about 30 engineers now and about 25 data scientists. Gotcha. Okay, so about 55 people, and you support both. Yeah, yeah, yeah. I know everyone's unique, but I think that's a reasonable bucket to support as, like, one team. Cool, this is super helpful. Thank you for sharing. So when you think of, for example, Airflow, right? From that AEIOU, do you think about the I, which is the integration piece? How much do you guys care about how integrated the pieces of your puzzle are? It's important to us, because one thing that we always try to achieve here is that when these pieces aren't integrated, there's probably some manual step in the process that we don't want to support, or also, maybe if an edge case hits and something breaks, then the pagers are gonna start flying. So from that standpoint, the integration piece is super important. Airflow working into what we're doing with Kubernetes and our managed databases is really, really helpful. Nice. What do you use for managed databases, if you use any? We're all AWS, so it's Redshift and RDS. Gotcha, awesome. Awesome, very cool. So since we're talking about cloud, let's switch topics a little bit. Tell me, are you an all-AWS shop? We are. Gotcha. For the most part, yeah. Okay, and so, for the most part, does that mean you're multi-cloud, or do you mean AWS plus bare metal? It's AWS, plus some multi-cloud, yeah. In some instances we have clients that will do stuff on their cloud. But for the most part, the media optimization piece is all AWS. Gotcha, gotcha. Okay, and given that, let's say you're an AWS shop, would you say the tooling you pick is impacted by the cloud you use, and do you look at portability between clouds? Even though, I mean, you are an AWS shop, so this may not be that big a deal. I think, so, it's a fair point.
I think for us right now, we're small enough that multi-cloud is probably a little overkill. Yeah, I hear that. Because in that sense, you know, we don't need five nines of availability or anything like that. So from that standpoint, we probably have a little more lax requirements, where a single cloud provider is fine. But for the folks on the panel who have to look at multi-cloud, I think their situation's a little bit different. Like, having us-east-1 go down is a big risk for them. It's a really big problem. Yeah, exactly. Yeah, Delta probably doesn't want that. Exactly, exactly. Right, right, no, that makes sense. So tell me, Chris, where do you see your cloud native journey going? You've just adopted Envoy through Ambassador, you're obviously all in on Kubernetes. I wanna hear two things. One, what have the results been so far? And two, where are you going next? Sure, so the results so far have been great. We have, knock on wood, not really had any major cascading outages in production. And I think a lot of that has to do with the resiliency of how we've set up Kubernetes there. So that's worked out awesome. I think for us, continuing on our journey, the big thing is we want to standardize on continuous delivery. That's a big thing for us. That way we can tidy up and be a little bit more deterministic about where things are getting deployed. But also, we have other smaller projects that we wanna onboard and have kind of a single pane of glass to support them. So that's huge for us. In addition to that, we're actually evaluating some EKS stuff now, so managed Kubernetes. We're traditionally running on EC2 at this point. Gotcha. I still have some concerns with EKS and the maturity of the product right now.
But that said, in talking with them at KubeCon and so forth, I think they're on the right path. I think in probably three to six months they'll be in a good spot. But obviously the value for us there is that if we don't have to upgrade a cluster, that's... That would be amazing. I actually did, at AWS re:Invent, a live demo of deploying Kubernetes clusters to EKS. I used GitLab for the process, but it was interesting. It was nice that with EKS we were able to do that. I had help setting up the clusters and everything from teammates as well. But yeah, if you wanna check that out, I'll send it; it might get some interesting ideas in your head. It was so nerve-wracking, honestly. You know, when you kick off a pipeline on stage, you're just like, well, I hope this goes through. I had tried it many times and it did, but you never know. I told the audience, okay, I kicked off the pipeline, let's all sit and pray and hope that something doesn't go wrong. It worked out, so it was really good. It was fun. But yeah, I'll send you that video, it might be useful. Cool, okay. So in terms of results, I hear it's been really useful, failovers and all, and in terms of next steps, you're saying standardizing on CD is the big one. Awesome, awesome, that's really cool. Thank you for sharing your journey so far. Before we wrap up, I'd love to know if you have any advice you'd give anyone watching. Yeah, sure. I think, especially if you're new to Kubernetes, this is probably one of the best communities I've ever worked with, ever, with any open source product. So do not be afraid to jump on Slack, or go to a meetup in your town or city, wherever you're at. What's really cool is that not only is the virtual community great, but the in-person community has been fantastic as well. I agree. That is gonna help the learning curve.
Even just looking at it, it's easy to kind of fall into all these different distros and vendors and so forth, but understanding Kubernetes, how it works from an architecture perspective, is super critical, because when you go off and you're trying to support and fix something, having that base knowledge is critical. Yeah, no, I 100% agree with that. Thank you so much, Chris. Well, this was really fun. I'm glad we had this conversation. I'm going to be doing more of these, so if anyone who's watching is interested in speaking, please send me a note. You can tweet at me, P-R-I-T-I-A-N-K-A, or I think you can use the YouTube chat, something like that. Just provide your thoughts on Cloud Native and where you're at in the journey, and we'll take it from there. Well, thanks again, Chris. This was really fun. I'm gonna turn off recording and then turn off the YouTube Hangouts on Air, and then we'll wrap up offline in one second. All right, thank you. Thank you. Stop recording. Yes, exit full screen. This is my first time, so I'm sorry, everybody. Okay, all right, let's see here. Google Hangouts. Stop screen share. Thank you, everybody, we're going to stop the broadcast now. Thanks so much, bye.