in this presentation. My name is Konstantin Semenov. I'm a principal software engineer at Pivotal, based in Dublin, and today I'll be talking about the Application Runtime and the Container Runtime, now both part of the Cloud Foundry Foundation. I'm pretty sure you've already seen this slide a number of times today, but I have to repeat it: please note the nearest exit sign and make sure that, in the event of a fire alarm or other emergency, you calmly exit to the public concourse area. Emergency exits leading to the outside of this facility are located along the public concourse, and for your safety in an emergency, please follow the directions of the public safety staff.

So, a tale of two runtimes. Today we're going to briefly touch on software deployment and delivery history, the inception of Cloud Foundry and the power of BOSH, and the use cases of Kubernetes. At the end of the talk, I will give a demo showing how you can use the Cloud Foundry Application Runtime together with workloads deployed to the Container Runtime, and we'll leave a little space for questions.

So, a brief history. What is the main goal of these platforms, of the technologies we're working on? Businesses are getting very, very competitive, and they have to deliver, deliver fast, and deliver reliably. So what was going on in the industry in that quest for speed and reliability?

Back in the nineties, you had to physically provision a server to deploy your workloads. If you worked in a corporation, you would make a request, wait for it to get cleared, signed off, and delivered, and then it was handed off to the operations department, which was completely separate. Those people would install the server in the rack, configure it with the operating system and everything else, then receive the package from you and try to run it, and maybe fail; then you would provide them with an update, and whenever the queue reached your update, they would attempt another run, and another one, and so on and so forth. Needless to say, that doesn't feel very fast or reliable. Automation was fairly complicated, and you had to have physical access if you wanted to change the configuration of a machine.

Fast forward to the 2000s, when VMware successfully virtualized commodity hardware, such as Intel servers. Virtualization technology had been used by IBM on mainframes since the sixties, but never before on commodity hardware. That spawned a big boom in the operations industry. The operator's life got significantly simpler: they still had to know the ins and outs of all the hardware, but their job was just to plug it into the hypervisor, and then anybody could operate it remotely. Hardware maintenance could be seamless thanks to the various technologies for migrating VMs, so you could migrate workloads away before turning off a machine for maintenance. Automation became easier and, slowly but surely, spun off the whole DevOps movement. But the boot time for every machine was still slow, because you have to boot a whole operating system every time you turn a machine on.

In the meantime, Google was experimenting with different technologies that worked on bare metal. That was Borg. They launched Borg around 2005. It handled massively huge workloads: a single Borg cluster could easily handle 10,000 machines or even more. And it acted as a learning ground for the evolution of cluster orchestration and scheduling.
It's really amazing: it still evolves, and it's still being used by Google today. But let's get back to virtual machines and zoom forward. By 2010, thanks to Docker, containers were becoming more and more popular. Similar container technologies had existed before, but they were hard to create, run, and maintain. Docker simplified that greatly, and in addition, Docker Hub, a sharing ground for Docker images that you can build on top of, gave the movement a huge boost. The boot time for a container is just a few seconds, because you're sharing the kernel, you're sharing the operating system resources; all you have to start is the software that you run on top. It's one more layer of abstraction, making it even easier for developers to create the containers they run their workloads in. By default, the kernel that is shared between containers is locked down securely, but there are always flags, dangerous ones such as Docker's --privileged, that you can use to override this. And the focus of every container is typically a single application.

So where did that lead? Around the end of 2009, VMware started working on something called the VMware Cloud Application Platform, bringing the application platforms available on public IaaSes on-premise. It was initially a proprietary VMware project, but it was eventually handed over to Pivotal, and Pivotal published it as open source and started the Cloud Foundry Foundation.

When we talk about Cloud Foundry, or about platforms in general: a platform is not something that is visible to the end user. If a platform is perfect, the end user will never actually notice it. So the interesting thing that I'm going to touch on today is what is running the platform; it's the second level of invisibility. A platform, especially Cloud Foundry, is pretty complicated. It contains a large number of different modules configured to run together to provide this experience to the end user. At the beginning, Chef was used for deploying Cloud Foundry, and it worked at first, but when we attempted to scale up, use it in large deployments, and run maintenance tasks, it turned out that Chef posed a number of problems: it was hard to scale, it didn't have any health checks, and updating machines was very difficult and unreliable.

That's where BOSH enters the scene. BOSH is a virtual machine orchestrator, you could call it that, and it does a number of things. It separates the operating system from the deployed services, so each of those can be updated independently: every time you get an update for the operating system, you update the operating system and keep running all of your software on top, and likewise, when your software updates, it can be updated independently of the operating system. It supports multiple IaaSes, so you can deploy the same software over and over again using largely the same configuration, or a similar one, regardless of whether you're deploying on GCP or Amazon or Azure. It provides several levels of high availability, which means that BOSH babysits every process it has been entrusted with, making sure that the process works; if it doesn't, BOSH will report it and try to relaunch it, and the same goes for VMs: every time a VM goes down, BOSH can be configured to bring it back to life. And it supports automation for day-two operations such as scaling and updates.
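To make those ideas concrete, here is a minimal sketch of a BOSH v2 deployment manifest and the command that deploys it. All the names here (my-env, my-deployment, my-release, my-job) are hypothetical, but the overall shape shows how BOSH separates the layers: stemcells describe the operating system, releases describe the software on top, and the update block drives rolling day-two changes.

```sh
# A minimal, hypothetical BOSH v2 deployment manifest; names are illustrative.
cat > my-deployment.yml <<'EOF'
name: my-deployment

releases:            # the software to run, versioned separately from the OS
- name: my-release
  version: latest

stemcells:           # the operating system image; updating it swaps the OS
- alias: default     # under every VM without touching the releases on top
  os: ubuntu-xenial
  version: latest

instance_groups:     # groups of identical VMs running release jobs; BOSH
- name: worker       # monitors each job's processes and restarts failures
  instances: 3
  azs: [z1, z2, z3]
  jobs:
  - name: my-job
    release: my-release
  vm_type: default
  stemcell: default
  networks:
  - name: default

update:              # rolling-update policy for day-two operations
  canaries: 1
  max_in_flight: 1
  canary_watch_time: 30000-60000
  update_watch_time: 30000-60000
EOF

# Deploy against a configured IaaS; the same manifest shape travels
# across GCP, AWS, Azure, and so on.
bosh -e my-env -d my-deployment deploy my-deployment.yml
```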
BOSH also uses an internal DNS for service discovery, and that internal DNS can be combined with health checks, so that the addresses returned are only those of instances that are alive. An additional piece of software that runs in tandem with BOSH is CredHub. It's a store for credentials and configuration information, and it simplifies internal credential management, external credential management, and configuration between components. More often than not, the operators don't even need to know what the certificates, secrets, and internal passwords of a deployment are; they just need the components to talk to each other. If you don't expose those secrets, they're less likely to leak, and even if they do leak, it's fairly simple to rotate them across your deployments through CredHub. And if you're really paranoid, or have special requirements, there is hardware encryption support.

So this is the Cloud Foundry side of things that we're more or less familiar with. Let's jump over to Kubernetes. What's this beast? It's an open-source container orchestrator developed by Google. It is largely influenced by Borg; it takes a lot of concepts from there and evolves them. It supports provisioning, scaling, and high availability for containers, and it integrates with the most-used IaaS platforms. The integration, though, is a little limited in terms of the provided resources: the IaaS integration can provide storage to be mounted into the containers and, optionally, if the underlying platform supports it, load balancing to expose services. But it does not provide any health checks for either the control plane or the machines that are running the containers.

So let me show you an example. Say you're running Kubernetes somewhere in the enterprise, and you have a bunch of developers, and they tell the Kubernetes control plane to spin up some containers. Nice. They all get scheduled and provisioned on different VMs according to the policy. But what happens if, all of a sudden, one of those VMs disappears? Somebody tripped over a wire or something. Well, the good news is that Kubernetes will reschedule those containers onto the existing VMs, but it will leave a gaping hole where the VM was, because it doesn't know how to create one. Enter BOSH and CFCR. With the help of CFCR, BOSH will recreate the worker VM, or the control plane VM, so that the cluster won't be unnecessarily overstressed and will keep the capacity available for the workload.

So, a quick quiz: what is CFCR? Is it a Container-Focused Cloud RAM? A Certain Factor of Complicated Risk? Or maybe a Cute Fluffy Canine Runtime? I'd love to say it's the last one, but really, CFCR is the Cloud Foundry Container Runtime, also known as Kubo; that's the guy on the left. It's a BOSH release for plain vanilla Kubernetes, which means that updates to Kubernetes can be easily integrated: CFCR promises to integrate patch releases within a week after they are published and minor versions within a month. In addition, the Kubernetes integrated into CFCR is meant to be on par with GKE. It's an open-source project developed jointly by Pivotal, Google, and VMware, and it provides automatic recovery for the Kubernetes control plane and the worker nodes whenever possible. What does "whenever possible" mean? Well, basically, if the data store backing your cluster has been destroyed beyond repair, unfortunately CFCR won't help you there; it won't magically recreate lost data, but it can recreate all the infrastructure.
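To picture that failure scenario at the command line, here is a hedged sketch of how you might watch both systems react when a worker VM disappears. The BOSH environment alias my-env and the deployment name cfcr are assumptions, not fixed names.

```sh
# Kubernetes' view: after a delay, the node backing the dead VM turns
# NotReady, and its pods are rescheduled onto the surviving workers.
kubectl get nodes
kubectl get pods --all-namespaces -o wide

# BOSH's view: the worker instance shows up as failing, and the BOSH
# Resurrector recreates the VM, restoring the cluster's original capacity.
bosh -e my-env -d cfcr instances
```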
Cluster settings in CFCR are secure by default, much like the Docker defaults I showed you previously, but we also have some knobs to expose the dangerous ones.

So, the question that I hear asked time and time again: why? Why would the Cloud Foundry Foundation have these two runtimes? Should you stick to one or the other? And the answer is: both, depending on your needs. The Cloud Foundry Application Runtime is being evolved further and further, and there are a lot of exciting changes down the road. It has fantastic support for stateless twelve-factor apps and microservice meshes, especially with the coming support for Istio and Envoy; there was a talk about that earlier. And the storage needs of most applications can be provided by external services. But that still doesn't cover 100% of the workloads you're going to run in an enterprise. You can use CFCR to run data services. You can use it to port large legacy applications, and to run third-party Docker-packaged or Kubernetes-packaged applications, where you're not the developer, you've just been handed a package: "please run it". There you go: you have a way of quickly standing up a cluster and running that workload on it. And you can use it for workloads that require customized infrastructure, things like GPUs, special sorts of networking, exposing various sets of ports, or other esoteric stuff.

That being said, let's jump into a demo, which will be fairly quick, I hope. Let me switch. There we go. So I have a simple application here that is running on the Cloud Foundry Application Runtime. Can everybody read the text? It's not hugely important, but it's nice if you can. It's basically a simple application that talks to a MongoDB, and the MongoDB is deployed on the Cloud Foundry Container Runtime. So what I'm going to do now, just to show you a little bit, is query the database for a bunch of records. We have four records with some strings in them. And we can put in another record with an extra string; that would be a random pirate message. It gets recorded and assigned an ID, and if we ask the database for it, here we have our message. Neat.

So what are we going to do next? Next, we're going to destroy the virtual machine that is running MongoDB. Here I have some output that I will comment on a bit later, because virtual machine destruction takes a little while on GCP. What matters is that this is the entity that is running our database, and this is the name of the virtual machine it runs on, something ending in 05204. Here's the machine, and we'll go ahead and delete it along with the boot disk. Goodbye, cruel world.

In the meantime, I can explain. Can you see the text in the terminal? Is it too small? Okay, great. I have a bunch of things displayed here. As I already said, this is the entry that represents the container; oops, this line here, the one that represents the container running the MongoDB. This is output from BOSH that shows you the VMs involved in the cluster: we have three master nodes and three worker nodes. The master nodes are the control plane, which contains an etcd database, a bunch of REST APIs, and some daemons, and the workers are the workhorses that actually run the containers. Now, here at the bottom, we have a load balancer, a service that is exposing the MongoDB.
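As a hedged illustration of that last piece, the database would typically be exposed through a Kubernetes Service of type LoadBalancer; the names and labels below are hypothetical, not the exact manifest from the demo.

```sh
# Hypothetical sketch of exposing MongoDB through the IaaS load balancer.
cat > mongodb-service.yml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: mongodb
spec:
  type: LoadBalancer   # asks the underlying IaaS (GCP here) for a public LB
  selector:
    app: mongodb       # routes traffic to pods labelled app=mongodb
  ports:
  - port: 27017        # MongoDB's default port
    targetPort: 27017
EOF
kubectl apply -f mongodb-service.yml

# Shows the external IP and port that clients, such as the CF-hosted
# application, use to reach the database.
kubectl get service mongodb
```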
So this is the actual public IP address and port that the database is accessible on, or is supposed to be. And finally, the last section shows you Kubernetes' own view of the nodes it has available. Right now it believes it has three nodes; it hasn't picked up yet that one of them is about to die. It will eventually. But if we try querying the database... oh, it's still working. Let's wait for it to go away completely. In the meantime, does anybody have any questions while we wait? Okay. Good. Everybody got a little tired, I guess.

Let's try again. Now we're not getting any data. This is because the database has gone down. BOSH has picked up that it's gone, but Kubernetes still believes it's up and running: here, this line still says Ready. Eventually it'll figure it out, in about a minute, I think; it may depend on various things. So it's not really... oh, here you go: NotReady. It figured out that the node has gone down. There's a lot of different machinery in place, so for example it still thinks your database is running, although it's actually dead. It's good to know that you have this latency when things go down. So it finally detected that the database has gone down, and now it's creating a new container for it, and within a minute we'll have it up and running.

This is not really a setup recommended for production; the only reason I brought it up in this manner is to show you that this is the actual database we're talking to, that it's actually dead and inaccessible, and that when it comes back up, all the data is still intact. Typically, in a production environment, you would have a number of replicas, so if one of them goes down, the rest are still serving traffic and the data stays safe and maintained.

So now Kubernetes sees that there are only two worker nodes. BOSH also sees two worker nodes and will eventually create the third one. Yes? [Audience question.] No, I haven't, to be honest. We've been experimenting with different failure modes, but we haven't worked on Azure yet.

So now we see that the container is running, and this "1/1" here says that the readiness probe is passing. So we can go back to our application, and although the worker node hasn't come up yet and is not yet registered with Kubernetes, our database is already up and running. Let's get our list of products. Here we go. And that concludes the demo. Eventually the worker will come back up, but it's too long to wait.
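For reference, that "1/1" readiness signal in the pod listing comes from a probe declared on the container. Here is a minimal, hypothetical sketch of a MongoDB pod with such a probe; the image and names are illustrative, not the demo's actual spec.

```sh
# Hypothetical MongoDB pod spec with a readiness probe; the pod shows
# READY 1/1 once the probe starts succeeding.
cat > mongodb-pod.yml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: mongodb
  labels:
    app: mongodb
spec:
  containers:
  - name: mongodb
    image: mongo:3.6          # illustrative image and version
    ports:
    - containerPort: 27017
    readinessProbe:
      tcpSocket:
        port: 27017           # "ready" means the port accepts connections
      initialDelaySeconds: 5
      periodSeconds: 10
EOF
kubectl apply -f mongodb-pod.yml
```

A TCP probe is the simplest choice here: the pod is reported ready only once the database port actually accepts connections.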
Okay. Moving forward. What directions can CFCR take in the future? This is not a promise, just a vision. Support for Windows workloads is one of the most requested features. It will take a lot of time: there has been a spike where we tried to integrate with Windows, and the networking seems to be very tricky. Eventually we'll overcome it, but it won't happen right away. Integration with BOSH Backup and Restore: Kubernetes is backed by the etcd database, so we need to figure out a way of backing it up and restoring it safely, at regular intervals, without disrupting the cluster. And interaction with the underlying IaaS via BOSH, as opposed to directly via the platform. Why is that important? Well, when Kubernetes creates workloads, it provisions disks and load balancers on the various platforms, but if you kill the cluster, there is no easy way of cleaning up those resources. If all the interaction goes through BOSH, then you can do that easily: you can identify those resources, and you can throw them away, keep them, or reuse them; there is an easy way of locating them.

Service broker integration, and that's an interesting one. The OSBAPI team is working on a unified, standardized way of providing services, so it would be really, really great if you could do a "cf create-service" and have that create something on your Kubernetes cluster, if you could feed in a YAML manifest of a deployment or something like that. That would be fantastic.

We welcome all contributors. These are all the bits and bobs that CFCR consists of: the CFCR BOSH release itself, the etcd BOSH release as a separate release, and the Docker BOSH release. We also have a documentation website that lives in a separate GitHub repo, so if you want to contribute to the documentation, you're welcome to do so as well.

With that, we can jump into questions, if there are any left. [Audience question.] Great, yes. At this point we haven't even tried doing it yet, but we anticipate that locking up the etcd database in order to make a consistent backup would stop the API from registering any changes, so we would want to somehow avoid downtime, by streaming to a slave or something like that. I don't know yet.

[Audience question.] Okay, so the question is: there were known issues with etcd in CFCR before, and have we overcome them? In other words, do we have the same problems the Application Runtime had with etcd? And the answer is no. The etcd release for CFCR is a separate etcd release, the reason being that the use cases are completely different. The CF Application Runtime used etcd as a cache, meaning that if you had a problem, you could just nuke the data store, bring up a new one, and start caching again. You can't do that with CFCR, you can't do that with Kubernetes, because there etcd is your source of truth; you can't throw it away. So we had to create a different release to reflect that difference.

[Audience question.] Yes, I can delegate that answer to the PKS team.

Finally, upcoming Kubernetes events: there is a panel in the experiments and extensions track next door, right after this talk. Tomorrow there will be CFCR office hours with Colin Humphreys of Pivotal. And after lunch, there will be three Kubernetes-related talks, from Jonathan Birken and Morgan Bauer, from IBM, Pivotal, and SUSE, discussing the open service brokers I have just mentioned, and on UAA authentication for Kubernetes from Altoros. And on that note, thank you very much. Please leave any feedback you may have for the talk; I would be very grateful. Thank you very much.