OK. Hello, everyone. I am so delighted to be here with the Cloud Foundry community. My name is Giorgi, and I work for SAP as a developer. My primary interest as a developer has always been hacking on the back end, and that interest led me toward working on SAP's Cloud Platform from the very beginning, when SAP made the move to the cloud. Today I'm going to be talking about Docker and Cloud Foundry.

At the beginning of this year, something very interesting happened to me. My colleague Christo, who is unfortunately not here at the summit, and I flew to San Francisco to join Pivotal's dojo program. The dojo program, for those of you who may not know, is a deep-dive learning program: you get to join one of Pivotal's teams and work together with the engineers. Each morning we drove to the office, had breakfast, joined the stand-up, and then paired up with the Diego team we were part of, basically working on Diego's backlog. The idea behind all of this is to learn the code base, get to know the community, the process, the culture, and it is a terrific experience, I can tell you. When we got back home, Christo and I had our own topic to contribute back to Diego, Cloud Foundry, and the community, and that's what I'm going to be talking about today.

First, a little bit of context. I don't know how many of you are comfortable with Diego; maybe you saw Onsi's talk yesterday. So, a little bit of context on Diego. What is Diego? Who is Diego? Well, Diego is the next-generation runtime for Cloud Foundry, and it's intended to completely replace the DEA. It's a self-healing system, which means that it can detect app failures and bring the broken instances back up for you. Another very important aspect of Diego is that it's an eventually consistent system. This is best explained if I tell you that Diego actually perceives two worlds: one world is what the user desires, and the other is the actual state of the system. You should know that in real life, in distributed systems, these two worlds tend to diverge quite often. In a nutshell, Diego's job is to take the necessary measures to make these two worlds eventually consistent.

Very briefly, a high-level overview of how Diego operates. We have the Cloud Controller that we know from Cloud Foundry, and we have Diego. The Cloud Controller knows everything about apps; it works in the domain of apps. Diego works in another domain: the domain of tasks and LRPs. A task is a simple one-off job that you just kick off and want to know whether it succeeded or failed. An LRP is a long-running process; that's basically the app you're running. It has to run continuously and be managed by Diego. Each time you start an app, it has to be staged, and for that we have this bridge component in the middle. That's very important, because Diego is all about loose coupling: we have the domain of apps, we have the domain of Diego, and we have this bridge doing all the translation between the two domains. If you were at Onsi's talk yesterday, that's exactly what makes it possible to reuse Diego outside of Cloud Foundry, which is what Lattice does.

During staging, the Cloud Controller will fire a stage-app request that hits the stager, which lives on the bridge, and the stager will translate that request into a task that Diego is going to run. If this succeeds, then it's a very similar story: the start-app request will hit the nsync, the nsync will translate it into an LRP that represents your application, and that LRP will be managed by Diego.
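To make the two domains a bit more tangible, here is a rough sketch of the kind of request the bridge sends Diego to desire an LRP. This is an illustrative approximation only: the host, port, and field names are assumptions modeled on Diego's receptor API of that era, not something shown in the talk.

    # Hypothetical sketch: the bridge asking Diego to desire an LRP.
    # Host, port, and field names are illustrative assumptions.
    curl -X POST http://receptor.diego.internal:8887/v1/desired_lrps -d '{
      "process_guid": "my-app-guid",
      "domain": "cf-apps",
      "instances": 2,
      "rootfs": "docker:///busybox",
      "action": {"run": {"path": "/app/start"}}
    }'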
OK, so that is Diego. Let's now look at the topic of integrating Docker into Diego. First of all, are you comfortable with Docker at all? Yeah, I expected to see so many hands. So I can maybe get away with saying that Docker is a container engine that makes it easy to pack, ship, and run your software inside containers.

As you may know, Cloud Foundry has been employing containers for a long time to run apps inside. That's done for the simple reason of optimization: we have a runner VM, and we don't want to waste the whole runner VM to run just one instance of an app. That's not optimal, so we want to run multiple instances on the same runner, be it the cell or the DEA. That's where containers help, because if you want to run several things next to each other, you want to make sure they won't step on each other's toes. That's why we need containers, and Cloud Foundry has been successfully employing its Garden technology, its own solution, its own engine, to run these containers. Those of you who were at Mr. Julian Friedman's talk yesterday will know about Garden.

So if Garden has been successfully running these containers for Cloud Foundry for quite a while, why do we need Docker at all? Well, as I already mentioned, Docker provides a very convenient model for distributing your containers. It defines something called an image, which you can push to a central hub, the Docker Hub, and this way you can distribute your application: you create an image, push it to the Docker Hub, and then the whole community can take advantage of it and reuse it. So if Docker is that good at distributing containers, and Cloud Foundry with Diego is that good at running them, why not combine the two and take this win-win situation?

Now that we have seen the merits of the Docker plus Cloud Foundry combination, let's look at what the community has already done toward this integration, because this is by no means a new idea. Docker has been around for about two years now, and the community has already attempted to bring the two together. What follows is a brief overview of some open-source projects that attempted Docker integration.

First of all, you may have heard of cf-docker; it was part of last year's summit. Maybe the most logical thing, if we were about to run a new technology on top of Cloud Foundry, would be to build a new buildpack, because buildpacks are those bundles of scripts that know how to run your app on top of the generic polyglot platform. That's exactly what cf-docker did: it provided a buildpack that can run your Docker images on top of Cloud Foundry. If you want to use the project, you do something like the following: you do a cf push, point it to the buildpack, which is cf-docker, and give it a directory containing your Dockerfile.

Another interesting project is Decker, a project by the company CloudCredo. It goes one step further by providing not only a buildpack but also a separate stack for running the Docker containers, because if you want to run Docker, you've got to be running the Docker daemon, and in order to do that, you need to provide a new stack. A stack is something like an OS template for the VMs that run your apps. So that's Decker: it defines this new stack, and it's used as follows. You do a cf push, you point to the decker stack, and again you give it a directory containing your app.
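To make the two flows concrete, the commands looked roughly like this. The buildpack URL and directory path are illustrative assumptions, not values from the talk:

    # cf-docker buildpack approach: push a directory with a Dockerfile,
    # pointing cf push at the cf-docker buildpack (URL assumed)
    cf push my-app -b https://github.com/example/cf-docker -p ./my-docker-app

    # Decker approach: push the same directory against the decker stack
    cf push my-app -s decker -p ./my-docker-app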
So these projects were presented last year; some of you may have heard about them. What has changed since then? You may know if you were at Onsi's talk yesterday: since last year, Diego has matured. It has grown up, and now it can natively run your Docker images without you having to install anything. Well, basically you just have to install the new Diego CLI, which makes pushing these images easier. The Diego CLI is installed as a plugin to the CF CLI, and it has this docker-push command. So if you want to run your container using Diego's native support, you do a cf docker-push, provide some app name, and then provide a reference to a publicly available image on the Docker Hub, for example busybox; that's an image that's available on Docker Hub. The key thing here is that Diego doesn't actually use the Docker daemon to run this image. Instead, it consumes the Docker image format with its own Garden tooling. This seems to work fine, and Mr. Julian Friedman talked a little bit about this yesterday. So that is the native Docker support.
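As a quick sketch, the native flow from the talk is just one command; busybox is the public image used as the example:

    # Native Diego support: run a public Docker Hub image by reference.
    # No Docker daemon is involved; Diego uses its own Garden tooling.
    cf docker-push my-app busybox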
Let's now dig a little deeper: is this production ready? Can you go and use it in production tomorrow? The quick answer is no, because it suffers from some problems, so let's look at what these problems are. Actually, all of the problems I'm going to talk about stem from the simple fact of how Diego currently operates: on each request, no matter whether you're starting or just scaling, Diego will reach out to the internet, pull all the bits of the Docker image, and then fire up the container. This leads to several problems.

The first is something we call unpredictable scaling. Imagine that several weeks ago you started two instances of some publicly available image that you found on the hub, and those instances have been running healthily ever since. Now you decide to scale them up, so you most probably go to the console and do a cf scale -i 4, basically telling Diego: please start two new instances for me. Diego will go to the internet, pull the latest version, and bring up your requested instances. But if in the meantime the app provider has released a new version and published it to the Docker Hub, you will end up with two instances running the old version and two instances running the new one, which is a problem. That's unpredictable scaling.

Another, more obvious problem with how it currently works is performance: it is clearly suboptimal to go and download the whole image on each and every start or scale operation. We know that images tend to get bigger, so doing this is clearly not optimal.

Last but not least, Diego currently lacks private image support. Private images are much like public images; they are available on the Docker Hub, but they are protected by credentials. So if you want to support private images, you need to deal with credentials, and that's not easy. That's tricky. You always have this trade-off between what is convenient and what is secure: you either prompt the user all the time, give me your credentials, give me your credentials, or you store the credentials in your database, which is hard to implement securely. That's the reason why, for the time being, Diego's native support has opted out of private images. And this can be a problem for you if you are an organization that wants to release a proprietary app on the Docker Hub.

So Christo and I and the Diego core team were scratching our heads over how to address these problems and make Docker support production ready, and we came up with the following idea. We have Diego, and we have the Docker Hub, and we decided to plug another component in between: a private Docker registry. This is very much like the Docker Hub; it's a registry for Docker images, only it runs in the private Diego network. It's inaccessible to any human: no developer, no user, no one can push or pull images from this registry. It's there solely to be accessed by Diego, which uses it as a cache.

If you start your Docker app with this caching support, here's what happens. During staging, Diego reaches out and, as usual, pulls all the bits of the Docker image; then, just before it proceeds to start the image, it caches it in the private registry so that it's available for reuse later, when you decide to scale the app. This means you no longer depend on what's available on the Docker Hub during scaling, and that's how you solve the unpredictable scaling problem: you can freely scale up and down all the time without worrying about what content might be available on the Docker Hub. Even if the Docker Hub goes down, you won't be impacted.

It's kind of obvious that performance with this cache is a bit better too, because instead of downloading the whole image over the internet every time, you use something available on the local network, which is better.

What is not obvious is how this registry helps with private image support. It doesn't exactly solve the problem, because during staging you still need to pull the private image from the Docker Hub, right? But it relaxes the constraints considerably, because now you need to authenticate the user against the hub only during staging. Once that's done, the user can freely scale up and down using the cached version. So it becomes possible to just prompt the user for credentials during staging and then throw them away; you don't have to store them, because you have the locally cached version. That's how the private registry makes it a little easier to introduce this support.
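To make the scaling scenario concrete, this is the operation in question. Without the cache, each new instance triggers a fresh pull from the Docker Hub; with caching enabled, the new instances are started from the privately cached copy:

    # Scale from 2 to 4 instances. Uncached: Diego pulls the latest image
    # from the Docker Hub again; cached: the private registry serves it.
    cf scale my-app -i 4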
What follows now is a brief demo of how all of this works together. Let me open my console; I hope you can read this. We are going to start a Docker image on top of Diego, on a system running on Amazon. As I already showed on the slides, we'll do a cf docker-push. We'll give the app some fancy name, let's say summit, and I'm going to use this image; it's our showcase image for Diego. And one more thing: I have to pass in this no-start flag, because I just want to create the app without starting it. That's because this registry thing is still experimental; it's not enabled by default. So I want to just create the app, then enable caching, and then start the app. That's why I passed the no-start flag.

So how do I enable caching? It's simple: I just do a cf set-env for the summit app and set our flag, which is called DIEGO_DOCKER_CACHE, to true. This enables caching; if I didn't set it, things would work the old way. OK, so now I'm ready to start the app: cf start summit. Let's keep our fingers crossed that this works; I hope the demo gods are with me.

OK, what you're seeing right now is Diego pulling all the filesystem layers from the internet; that's during staging. And right now it's pushing: it's doing the caching. I will scroll up when it finishes. OK, let's scroll up. Here we can see "Docker image will be cached as", and we see a local address that's in the private network of Diego, plus a unique ID; that is the cache ID of the image. So we can see that it's being cached. For the sake of completeness, let's hit this URL in the browser. OK, so that's the app. And if I go on and scale the app, say I want two instances, this should be way faster. OK, you see, we now have two instances, and if I refresh this, we should get round-robin between the two. OK, let's go back to the slides. So that was the quick demo. It worked.
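For reference, the demo boils down to this sequence of commands. The image name is an assumption, since the talk only calls it "our showcase image for Diego":

    # Create the app without starting it (the registry cache is opt-in)
    cf docker-push summit cloudfoundry/diego-docker-app --no-start
    # Opt in to the experimental caching support
    cf set-env summit DIEGO_DOCKER_CACHE true
    # Start the app: staging pulls the layers, then caches them
    # in the private registry
    cf start summit
    # Scale using the cached image; no dependency on the Docker Hub
    cf scale summit -i 2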
And now a little look into the future. What's our timeline? Currently we're here: that's MVP0. That is something that's afloat, but by no means unsinkable. We have the image caching support, and it's opt-in, because we're still experimental; it has to be explicitly enabled. Next on our plate, we intend to further secure the registry by restricting the network, so that you cannot do bad things with the images, and by putting it behind HTTPS. Another thing we plan on doing is the private image support that I talked about. Then S3 storage, because right now we're using the local disk, and it does not scale. It simply does not scale. Next we'll introduce disk quotas, which means that if you try to start a Docker image and you don't have the disk resources, the start is going to fail. Then Riak CS storage, if you don't like using public Amazon blobstores. High availability, because one instance won't scale if you issue a lot of scale-up and scale-down requests, a lot of starts. And eventually we plan on adopting the next version of the registry and the next version of Docker, when they come out and are sufficiently stable.

To summarize what the takeaway of this session should be: please try this at home. We have it in the incubator; go to cloudfoundry-incubator/docker-registry-release. There's a readme with some easy steps to follow to install this with Diego. Please try it, send us feedback, feel free to open issues, write us emails; by all means, help us build this together.

Having said that, I'd like to say a big thanks to the core Diego team. These are the people that made all of this possible; they were a great help to us. Many thanks to Onsi, the PM, and Diego's core team. And thanks to all of you for your time. I'm now open for your questions, if you have any. Thank you. Questions?

Q: (hard to hear) Is the caching at the app level or across apps?
A: It is at the app level; every app gets pushed under a unique identifier. But because you're running a Docker registry, if you have the same apps, you won't be duplicating the bits, because the graph filesystem helps with this. It won't duplicate the bits; it will only duplicate the metadata.

Q: Do you have plans to support private registries? Like an internal registry?
A: I didn't quite get that. This is a private registry that serves as a cache?
Q: No, no, an internal private registry instead of the Docker registry.
A: You mean if you don't want to work with the hub?
Q: Correct.
A: OK, so the purpose of this registry is solely to be used as a cache, and that's for security reasons. If you want, you can maybe reuse the BOSH release for the registry: you can spin up your own and still use it, with or without the caching. But that's a different problem; even today it's possible to spin up your own private registry and start from there. So it is possible, but it's not in the scope of this caching work.

Q: How much of the Docker remote API are you planning to support, things like environment variables, volumes, and so on? Is that all on the roadmap, and where are you expecting to get to?
A: Yes, it is on the roadmap. Currently we are not supporting much of Docker's metadata; it's still an early stage. That may be a question better suited for the Garden team, but we have it on our plate. We plan to soon add support for users and for custom ports, because right now we're just using 8080. And we certainly plan to extend on this.

OK, there are no more questions then. Thank you for your time, and anyone who's interested may come talk to me. Thank you.