All right, everybody, I think we're going to go ahead and get started. The time has come. My name is Colby Johnston, and this is Brett Mayer behind me. We are both from Comcast and work on a team that is responsible for various cloud platforms. We've been running Cloud Foundry for many years. As we now call it, CFAR, the application runtime, has been a great PaaS environment for our cloud-native apps. It's done very well, and our customers have been very, very happy with it.

Today we hope to share with you a gap that we encountered as we worked with customers over time. Many customers came to us with modern applications that they wanted to put in the cloud, but they fell short of the 12-factor app requirements that are really required for CFAR. As a team, we recognized this. We wanted to help them, but we really didn't have a solution or a platform to offer them so that they could enjoy the benefits of being in the cloud. So today we hope to share what we did for these customers and how we helped them onboard onto the cloud.

Now, our focus as a team over the years has really been to help developers develop faster, build faster, and deploy faster. We do this by removing friction. I just want to take a moment and describe what friction is. Friction can be a lot of different things, but one of the big ones is customer support. We don't believe in opening tickets. If customers need support, they simply ping us on our Slack channel, and we respond immediately and give them the support they need. Other friction examples might be a DNS request, IP allocation, or storage allocation. All these things slow developers down; they have to wait, and there's delay. What we have done as part of our platform service is include what we call add-ons to the platform that automate a lot of that pain and delay away. This has created a lot of happy developers, because it's fast and it's efficient. Developers do not have to wait for these services, and they can continue to move faster. We do our best to take the friction away and let developers do what they do best, which is write code.

We've been doing CFAR for over five years. We have over 40,000 application instances (AIs) and over 19,000 apps running. It's a very large environment, and it continues to grow at a very rapid rate. One of our biggest challenges really is simply keeping up with demand for this platform.

About five years ago, as I said before, we started out with CFAR, which is one of the three pillars, as I'll call them, of the cloud platforms that we offer as a group. Shortly thereafter, we added Concourse to provide a CI/CD service. Between the two, this combination allowed developers to move at speeds I don't think they had seen before, and they were very, very happy with it. Now recently, we added a third leg to the stool: CFCR, the container runtime. At this point, we feel that we offer complementary platform services that accommodate a wide range of application requirements. You know, it may seem a little odd for the same group to support, deploy, and maintain two different platforms, AR versus CR, that really do have a lot of overlap in terms of features, functions, and traits.
But what we have found over the last year or so, as we've worked with them both, is that they are much more complementary to each other than in competition. I'll turn the time over to Brett.

So in order to set the stage, I'm going to briefly touch on the cloud-native approach, just a couple of bullet points that tie into what we're doing. The first is time to market. The cloud-native approach allows businesses to deploy their applications and services much faster by reducing the amount of time developers have to spend dealing with infrastructure: like Colby was mentioning, tickets, DNS requests, provisioning this, that, and the other. In addition, by using CI/CD practices and automating things when and where it makes sense, these organizations can get their applications deployed quickly, securely, and predictably.

The second point is resilience and portability. Cloud-native also empowers organizations to build, run, and scale loosely coupled applications that are resilient, easier to manage, and able to run on any cloud provider. This in itself has a number of benefits, such as reducing application complexity by decomposing a traditional monolithic application into a set of smaller services. These smaller services can be developed faster, deployed more easily, and maintained more easily. And again, they run on any provider and they're portable.

With regards to the group that Colby and I are in, like he said, most of the success we've had has been related to the application runtime. Demand keeps growing for this platform, and it's resulted in us creating a platform that's capable of running many of Comcast's most critical application workloads. Despite this, we've had a number of customers come to us interested in moving into a cloud environment that, for one reason or another, we've had to turn away. One reason might be that their application is not fully decomposed; maybe it's got a few microservices but is still primarily monolithic. Perhaps they have a very resource-intensive application that requires a lot of memory or CPU, such that it might not fit into the application runtime environment. More likely, though, the reason is going to be statefulness, or persistent storage. Any of these requirements fall outside the realm of 12-factor app principles and therefore aren't really suited to the application runtime. And because of this, these customers are stuck in traditional infrastructure, again: VMs, physicals, what have you. And that's not good for the developers, the application, or the business. Again, we're stuck with infrastructure. So over the last 12 months or so, we've been thinking about this problem, and we think we've come up with the solution, which is CFCR.

Thank you, Brett. Now, Brett mentioned this gap, and we filled it with CFCR. CFCR has some distinct traits that allow it to do very well at a few things that CFAR really does not. But let's first look at the left-hand column there and talk about what they have in common. For CFAR and CR, the list you see there shows their common traits. We could probably argue about which platform is better at what, but I don't think there's really any argument that they're both good at those things.

Now, in the right-hand column we see CFCR, and those are the distinct traits that really separate it and make it distinct from AR. If an application has stateful workloads, it's probably got to go somewhere other than CFAR. If it requires maintaining state, if it requires storing data, we would counsel that customer: you know what, AR is probably not the best place for you, but we do have a place for you, and it's called CFCR. And I will say this: AR abstracts at a higher level than CFCR does. It abstracts more at the application level, whereas CR abstracts more at the pod level. And at the pod level, you do need to worry about things like CPU, memory, and especially storage. So unfortunately, even today we still have to worry about CPU, memory, and storage; we simply can't forget about them. But if we spend a little more time thinking and planning, CFCR has a great way to accommodate the apps that require all the things you see in that right-hand column. It does really well at stateful workloads and computing, and if your app has high compute needs, it's a great place to put it.

So, statefulness and CFCR: why is it so good at accommodating stateful apps? The one thing that separates it most from AR, I think, is the first item in the list you see there: persistent volumes and persistent volume claims. We're just going to call it a PVC from here on out, because it's easier to say. A PVC has the ability to store data, and this is a very standard Kubernetes feature. Let's talk about the lifecycle of the PVC for a second. The lifecycle of a PVC is independent of a deployment or a pod. You can scale your pod up, or scale your deployment up. You can even delete the pods in that deployment, and you're going to keep your data, because your data is sitting on a PVC, which is completely independent of any pod. Now, the great thing about taking this little bit of extra time to define your storage on Kubernetes is that you do it one time. Once you do, Kubernetes takes care of the rest. It knows where your pod is running, and it's going to take that volume and mount it to the pod. The pod could be running on any host or VM in that cluster; Kubernetes will find it and mount your storage there. It's a great thing: you do it once and forget about it.
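Just to make that concrete (this example is mine, not from the slides; the names, image, and 10Gi size are placeholders), a standalone PVC and a pod that mounts it look roughly like this:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: app-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi                # placeholder size
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: app
    spec:
      containers:
      - name: app
        image: nginx:1.25              # placeholder image
        volumeMounts:
        - name: data
          mountPath: /var/lib/app      # writes here land on the PVC
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: app-data          # binds the pod to the claim above

Delete the pod, or let it get rescheduled to another node, and the claim and its data stay; the next pod that references app-data mounts the same volume.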
But it actually gets a little bit better than this, if you combine two ideas: the idea of a StatefulSet and a PVC. Before we describe the benefits, let's step back and talk about what a StatefulSet is. A StatefulSet is a deployment of a set of pods; I like to think of it as an application cluster deployed in Kubernetes. If you add the concept of a PVC to a cluster inside Kubernetes, you get something very powerful: a cluster that you can scale on demand and that can store data as well. That is a great thing.

Now, over the time we've been using CFCR, we've observed some common apps that take advantage of this combination of StatefulSets and PVCs. Examples are databases like Mongo and Cockroach, or messaging systems like Kafka. These are all running in our environment right now.

I did want to take a moment here, and we do have a snippet of a deployment manifest for a StatefulSet. I did mention it's a little harder; it takes a little more time and thought, but it really isn't that bad.
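The slide's snippet isn't reproduced in this transcript, but a minimal sketch of the kind of manifest being described would look something like this (the replica count, the /var/lib/kafka mount path, and the 300Gi size come from the talk; the image tag, labels, and service name are assumptions):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: kafka
    spec:
      serviceName: kafka                 # assumes a headless Service named kafka
      replicas: 5                        # five broker pods
      selector:
        matchLabels:
          app: kafka
      template:
        metadata:
          labels:
            app: kafka
        spec:
          containers:
          - name: kafka
            image: confluentinc/cp-kafka:5.2.1    # image and tag assumed
            volumeMounts:
            - name: data
              mountPath: /var/lib/kafka           # where the broker stores its data
      volumeClaimTemplates:              # each replica gets its own PVC from this
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 300Gi             # one 300Gi volume per pod

Scaling it later is a single command, something like kubectl scale statefulset kafka --replicas=7, and Kubernetes stamps out additional PVCs from the template for the new pods.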
If we start at the top here, we see this StatefulSet has five replicas. All that means is it's going to deploy five pods. It's going to deploy a Kafka image; this is going to be a Kafka cluster. If we go down further, we see that there is a volume mount. That /var/lib/kafka path is where we're going to keep and store our data. Below that we see the volume claim template; that's really where we define the PVC. You see it's 300 gig in size. Each one of those five pods, or replicas, will get its own unique storage volume attached to it. If down the road it turns out we really needed seven replicas, we scale it up: it creates two more volumes and attaches them to those two additional pods. Things like that are really easy in Kubernetes. There's a lot of value there, and it's definitely worth taking the time to figure these things out. Like I said, it's not that bad.

Okay, cool. We just discussed the gap that CFCR fills. I'm going to talk a little bit about the tool set that we use to deploy our CFCR clusters, the brawn and the brains here. We consider BOSH to be the brawn and Concourse to be the brains of this tool set. BOSH really does the heavy lifting. It manages the VM lifecycle, the VMs that make up the base of a CFCR cluster, the masters and the workers. It also manages the software releases that in this case make up CFCR, so that's going to be Kubernetes, etcd, Docker, things like that. And besides that, BOSH also ties in very nicely with CI/CD, in our case Concourse.

Now, speaking of Concourse, that's the brains. It brings all these tools together: BOSH, Vault, and GitHub. By bringing these together, we define our deployments, put that into a pipeline, and we're able to run this pipeline and deploy our clusters with repetition. There's going to be no variation in deploying them and very little human intervention, so it's less prone to fat-finger mistakes and things like that. Vault is our centralized key management; it stores our certificates, tokens, super-secret passwords, all that stuff, the things you don't want exposed in GitHub or scripts or manifests. And GitHub is used extensively by Concourse: any repositories that are defined in our deployments, Concourse is going to pull in, whether it's source code or manifests, and it's going to use all these tools and build that deployment.

With regards to the deployments, we keep those as vanilla as we possibly can. Even across cloud providers, whether on-prem or off-prem, there's very little difference in what our deployment manifests look like. As platform engineers, this makes things a lot easier for us: the configs are smaller, easier to maintain, and easier to understand. In addition, we aim to keep all our CFCR clusters running at the current release. The current release cycle of CFCR is about every three to four weeks, and in order to maintain that, we need a stable and repeatable process. Using these tools, specifically Concourse, allows us to do that. We just make a small update to our deployment with the CFCR version we want, check that into GitHub, and run the pipeline. Regardless of the environment or cloud provider, Concourse will run it and manage the deployment.

In addition to just deploying clusters and upgrades, we've also been able to leverage these tools to build our own patches. For instance, last year there was a Kubernetes vulnerability. We were able to build a patch and deploy it to our clusters 24 hours before the official release was published on GitHub. So the flexibility these tools give us, in addition to the platform itself, really allows us to move fast, stay secure, and keep everything stable and cookie-cutter.

This next slide is just a quick snip of what the deployment looks like from a BOSH perspective. It's really just one line there. You can see we've parameterized everything; nothing is hard-coded. We try to keep as few ops files as we can, and again, keep everything simple.
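That one line isn't reproduced in the transcript either; it would look roughly like the following, where the environment alias, deployment name, and ops/vars file names are illustrative (manifests/cfcr.yml is the manifest that ships in the kubo-deployment repo):

    # illustrative; in practice every value here comes from pipeline parameters
    bosh -e cfcr-prod -d cfcr-cluster-01 \
        deploy kubo-deployment/manifests/cfcr.yml \
        -o ops-files/vm-sizes.yml \
        -l vars/cluster-vars.yml

Everything that varies per cluster lives in the vars file, so the command itself stays identical across environments.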
This is just a snip from the pipeline itself, and the reason I included it is that it shows how all three tools, BOSH, Git, and Vault, are used in the pipeline. We're referencing two repositories up there. The cfcr-deploy repository is a private repository that we manage, containing our manifest files and deployment-specific information. The kubo-deployment repository is the official CFCR release; CFCR used to be referred to as Kubo. Under the params list, the first three are parameters used by BOSH: the name of the cluster, the alias, and the environment it's in. The last three are Vault parameters, and this is a good example of how Concourse is able to dip into Vault, get whatever secrets it needs, and populate them into the deployment, so none of this secret information is exposed in plain text.
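The actual pipeline isn't reproduced here, but a sketch along the lines being described might look like this; the repository URIs, task file, parameter names, and secret paths are all assumptions, and the ((...)) syntax is how Concourse pulls values from a credential manager such as Vault at runtime:

    resources:
    - name: cfcr-deploy                # our private repo: manifests, deployment config
      type: git
      source:
        uri: git@github.example.com:cloud-platform/cfcr-deploy.git
        branch: master
    - name: kubo-deployment            # the official CFCR release repository
      type: git
      source:
        uri: https://github.com/cloudfoundry-incubator/kubo-deployment.git

    jobs:
    - name: deploy-cfcr
      plan:
      - get: cfcr-deploy
      - get: kubo-deployment
      - task: bosh-deploy
        file: cfcr-deploy/tasks/bosh-deploy.yml
        params:
          BOSH_DEPLOYMENT: cfcr-cluster-01        # BOSH: the cluster name
          BOSH_ALIAS: cfcr-prod                   # BOSH: the director alias
          BOSH_ENVIRONMENT: 10.20.30.40           # BOSH: the environment
          VAULT_ROLE_ID: ((vault-role-id))        # Vault: resolved at runtime,
          VAULT_SECRET_ID: ((vault-secret-id))    # never stored in Git
          BOSH_CLIENT_SECRET: ((bosh-client-secret))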
So where are we at right now with CFCR? Well, we are still in POC, but we do plan to go to production very soon. We have deployed it on-premises within vSphere, and we've also deployed it into AWS, which is our primary cloud provider at Comcast. One of the advantages of CFCR that our customers just love is that the user experience for our developers is the same regardless of which cloud provider it's in. If our developers need a load balancer, or storage, or an IP address, they define it exactly the same way, whether it's an on-prem or an off-prem deployment. It simply doesn't matter to them. It looks, it feels, it smells the same. They use exactly the same commands and exactly the same YAML definitions in their code. You could take what they used on-prem, deploy that same thing off-prem in AWS, and guarantee that it will work, because all of that is abstracted away by CFCR. Now, this goes back to the concept of reducing friction. This is big: our developers can concentrate on their code and their application rather than trying to figure out all the individual differences and intricacies of the various cloud providers out there. It really has been a big win in that sense, for us as well as for them.

Up until this point, we've talked primarily about the benefits of CFCR to our customers, the developers. But I did want to take a moment and talk about the effect that choosing CFCR has had on our team, and what the benefit has been to the platform team. When we started out in container orchestration, we started with a different platform. That platform required us to manage bare metal, various operating systems, and tons of Ansible roles and playbooks. Almost all of the tools were separate and distinct; they had nothing in common with the tool set the other half of our team was using for CFAR. And really, what that did was divide our team in two: you either did Kubernetes or you did CFAR, but not really both.

By switching to CFCR, we now have a lot in common with CFAR. The tool sets are very similar, if not the same in most cases. For instance, we leverage BOSH for configuration management and VM lifecycle management. Using this common tool has created synergy: it's allowed us to share common practices and architectures, and the skills we've gained can be leveraged across both CFAR and CR.

But it really hasn't been without its challenges, and this slide here depicts that challenge. I've got to say this slide is a little graphic; there's a lot of death and destruction going on there. But it is a really good depiction of the learning curve for BOSH if you haven't had experience with it before, which I had not. With a little time and effort, though, I think for the most part we've been able to get up over the hump. We're all up on top of the mountain there, in that D9 Cat, clearing a path for higher team efficiency. So it's been a good thing for our team, and I think we're better for it.

So we've gone through and described what CFCR is, the gaps it fills, and how we deploy it. All of that is fine and great, but we didn't want to stop there. We wanted to provide more than just Kubernetes as a service; we really wanted to give the developer a ready-to-use, turnkey environment. Again, reducing friction. So we took it upon ourselves to bake in some common features and services that most developers are going to need: things like certificate management, DNS management, ingress and egress networking, as well as built-in integrations with some Comcast infrastructure, such as logging as a service and monitoring as a service. So again, we reduce friction, hurdles, tickets, et cetera. That's great for the developer, and it also helps us as platform engineers, because it preserves the cookie-cutter approach. We're not installing things post-deployment for one developer on this cluster but not on that one; everything is the same. That leads to easier troubleshooting in some instances and less customization, and overall it gives us more time to keep developing new services and features for the developers, streamline our approach, and ultimately deliver an improved customer experience.
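The talk doesn't name the tooling behind these baked-in services, but from the developer's side this kind of integration often reduces to a couple of annotations on a standard resource. A hypothetical sketch, assuming cert-manager and ExternalDNS style add-ons (neither is confirmed by the talk), where a TLS certificate and a DNS record are provisioned automatically:

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: my-app
      annotations:
        # hypothetical issuer; cert-manager would issue the cert into my-app-tls
        cert-manager.io/cluster-issuer: platform-issuer
        # ExternalDNS would create the DNS record for this hostname
        external-dns.alpha.kubernetes.io/hostname: my-app.example.com
    spec:
      tls:
      - hosts:
        - my-app.example.com
        secretName: my-app-tls
      rules:
      - host: my-app.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 8080

The idea being that the same YAML works on-prem and in AWS, with the platform's controllers doing the provider-specific provisioning, so no tickets or DNS requests are needed.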
So we want to shift gears just a little bit here and finish by emphasizing Comcast's commitment to open source. We at Comcast are big believers in open source. We're not only users of it; we also like to contribute back to it. Here on the slide we have a couple of examples of recent contributions back to the community, Apache Traffic Control and Kuberhealthy, and there on the bottom is a URL you can go to if you'd like to see them, or even clone them, try them out, and give them a spin. We'd also like to invite you to stop by our booth and get some free swag. And we are hiring and always looking for good people; we value diversity and inclusion, and we think it makes us a stronger and better company. With that said, we've come to the end of our presentation, and I think there are at least a couple of minutes left, so we'll open the time up for any questions you might have. Thank you.

You mentioned how you create CFCR itself, but what about those services, and the Day 2 story for those Kubernetes workloads?

You know, we are the platform team, and we do work very closely with those who are deploying those services, but in a lot of cases we don't really dictate to them how they're going to do it. Just by the nature of putting those into Kubernetes as Docker images, though, upgrades are very easy. Customers do not have to worry about OS patching, for sure. And when we patch our platform, we can do that in a rolling fashion that has zero impact on their application, which is great. But I don't know that I have a full answer for you on what maintaining Kafka on Day 2 looks like. I think we could probably find you somebody closer to it who could speak to that a little better. Did you have anything to add to that, Brett? Yeah, we have a few developers on our team who are more geared toward recommendations on how to configure things like Kafka, so we could certainly get you in touch with those folks if you've got any deeper questions on that.

Is there another question? Yes, sir. Yeah, we're going to dictate the release and patch cycle of the platform, and we will work with them and say, hey, we're going to upgrade, and you need to make sure your application is compliant with what we're doing. But we're not going to dictate a particular release or patch for an application, generally speaking. Anything else? Okay, one more. That's part of CFCR. A new CFCR release probably won't include the latest and greatest Kubernetes; it might be a minor release back. But what version of Kubernetes you get is all dictated by the CFCR release. Does that answer your question? Yeah, all right. We could easily roll back if we had to, but we don't make it a habit. I think we're at the end of our time, so if there are any further questions, we're going to be here for a while; please feel free to come on up and talk to us.