All right, good morning, everyone. Thanks for attending. I'm Eric Malm, the product manager for the Diego team, one of the core projects in the Cloud Foundry application runtime. Today I'd like to give you an update on some of the things we've been working on since the last North American Summit. First, some administrative business: if there's a fire, remain calm and exit according to the signs to the outside.

All right, so first, let me give you a brief introduction to what Diego is, in case you haven't heard of it before. Diego is the container runtime that's tailor-made for the CF application runtime. It's actually a fairly old project at this point. We initially started development to replace the previous system, the DEAs, back in January of 2014. About a year and a half later, we reached a point of parity and stability where we declared it generally available for use. About a year after that, we demonstrated that we could stably run a quarter million containers in a full CF environment, and that was our milestone for declaring 1.0 on the release. That started the deprecation clock on the DEAs, which were removed six months later. And then we had another major version release a couple of months ago to clear out and simplify some of the configuration in the release, while still preserving the ability to upgrade across major releases, as long as you move between adjacent ones.

So today, I'd like to give you a brief overview of how the Diego components fit into the Cloud Foundry application runtime, and then a few selected updates about what we've been doing. One thing I'd like to tell you about is how we've been improving the stability of performing active health checks against application containers, since we've seen a few issues arise there. Then, switching gears a little, I'll tell you about some security improvements we've been making around instance-specific credentials that we supply to containers, and how we're improving the platform on top of that. If you're responsible for operating Cloud Foundry, you may be interested in some of the improvements we've made to our operator tooling that lets you inspect the internals of the system more conveniently. And then finally, I'll tell you a little bit about some of the next steps that we're thinking about in the months ahead.

So first: if you've been using Cloud Foundry, then you're likely more familiar with the external-facing components, such as the Cloud Controller or the Gorouters that handle HTTP traffic. And you've almost certainly experienced the magic of running cf push and very quickly observing your buildpack or your Docker image or now even your Windows applications running on the cloud. So let me tell you a little bit about what's happening under the hood to get those application instances running. All of these instances are running as containers inside of Cloud Foundry's container engine, which we call Garden. Garden is great at its job, which is to execute containers: create them and run processes in them. But it's also very intentionally a local system. It knows how to run containers on its local host, but it doesn't know about the rest of the distributed system that constitutes Cloud Foundry. And that's where Diego comes in.
Diego is responsible for orchestrating the placement and execution of those containers across dozens or hundreds or thousands of container hosts, and for ensuring that they're all running according to the desired state that all the application developers have pushed.

In order to do that, the core of Diego has some dependencies. In particular, it relies on a consistent data store to coordinate a lot of this activity, and nowadays that's a SQL database: MySQL or Postgres. The components themselves also need some degree of coordination so that they behave correctly amongst themselves. In the past, we've used Consul for that, and we've been introducing our own replacement for it because of some issues we've seen around using Consul and BOSH together. And then finally, the Diego release includes some ancillary capabilities, most notably the system that enables SSH access into application containers.

So I'd like to do a deeper dive into how these components are arranged in a typical CF deployment, such as one that you might get from the cf-deployment repository on GitHub. If you've been using that to deploy Cloud Foundry, you're most likely familiar with Diego from the Diego cell instances that it deploys. As you might guess, that's where Garden is running, on each of those hosts, to do its job of creating containers and running application processes in them. But as I mentioned, Garden is a local system, and so Diego starts to come into the picture here to connect it to the rest of the Cloud Foundry architecture.

The first core component of Diego that lives on the cells is what we call the cell rep, or representative. It's responsible for registering the presence of the cell with the rest of the distributed Cloud Foundry system, and for managing that local Garden to do its job of making containers. It also does some accounting for the memory and disk allocations that it's making for those containers. And finally, it's responsible for downloading some of the assets, like the droplets and buildpacks, that go into those containers to do various types of tasks, and it caches some of those depending on the data that comes in. But the reps themselves are not receiving that work directly, and that's where the other core components of Diego come in.

The next instance group that you will likely have seen from cf-deployment is called the Diego API, and that's where the BBS, the next main Diego component, lives. It's responsible for providing the public API to the rest of the Cloud Foundry subsystems so that they can submit work and inspect how it's going after they've submitted it. The BBS also understands the particular types of work that Diego is capable of running, namely long-running processes, or LRPs, and one-off tasks, and it enforces the various lifecycle rules and state transitions for those as it manages them across the system.

The one thing the BBS doesn't do, though, is make direct decisions about where that work gets placed. Instead it delegates that to the third core Diego service, the auctioneer. Nowadays, in a cf-deployment-based deployment (a little redundant, I know), that lives on a VM, or instance group, called the scheduler. The auctioneer is responsible for receiving work to be placed from the BBS, communicating with all the cells that it can find, and deciding optimal placement based on that up-to-date set of criteria.
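To make that placement step concrete, here's a minimal Go sketch of the kind of decision the auctioneer makes from the cells' advertised state. It's illustrative only: the real auction also weighs disk, container counts, and spreading instances across availability zones, and none of these names come from the actual codebase.

```go
package main

import "fmt"

// A sketch of auction-style placement: given the latest state advertised by
// each cell, pick the one with the most free memory that can still fit the
// instance. Types and scoring here are hypothetical simplifications.

type cell struct {
	ID           string
	FreeMemoryMB int
}

func place(cells []cell, requiredMB int) (string, error) {
	best := -1
	for i, c := range cells {
		if c.FreeMemoryMB < requiredMB {
			continue // this cell can't fit the instance at all
		}
		if best == -1 || c.FreeMemoryMB > cells[best].FreeMemoryMB {
			best = i // prefer the least-loaded cell seen so far
		}
	}
	if best == -1 {
		return "", fmt.Errorf("no cell can fit %d MB", requiredMB)
	}
	return cells[best].ID, nil
}

func main() {
	cells := []cell{{"cell-1", 512}, {"cell-2", 2048}, {"cell-3", 1024}}
	id, err := place(cells, 256)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("placing instance on", id) // cell-2, the least loaded
}
```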
Lastly, I mentioned this Locket component that we've been introducing. It's responsible for coordinating many of the distributed locks and the cell presences that the rest of the control plane needs in order to find the active instances of these services. In particular, you'll notice on this diagram that we have many, many instances of each one of these instance groups, but for the BBS and the auctioneer, only one of them is active at a time, and Locket helps coordinate that.

Finally, the BBS has one other crucial responsibility in the system. It's also responsible for periodically assessing both the desired state that clients such as Cloud Controller have expressed and the actual state coming from all of the Diego cells in the deployment, and reconciling any differences. So if there are extra instances that shouldn't be running anymore, it tries to stop them, and if there are missing ones, it resubmits them for placement to get them running again.
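Here's a minimal sketch of that reconciliation pass, again with illustrative types rather than the real BBS internals:

```go
package main

import "fmt"

// A sketch of the convergence the BBS performs periodically: compare desired
// instance counts against the actual instances reported by the cells, stop
// extras, and resubmit missing ones for placement. Names are illustrative.

func converge(desired, actual map[string]int) {
	for guid, want := range desired {
		have := actual[guid]
		switch {
		case have > want:
			fmt.Printf("%s: stopping %d extra instance(s)\n", guid, have-want)
		case have < want:
			fmt.Printf("%s: resubmitting %d instance(s) for placement\n", guid, want-have)
		}
	}
	// Anything running with no desired record at all is an orphan to stop.
	for guid, have := range actual {
		if _, ok := desired[guid]; !ok {
			fmt.Printf("%s: stopping all %d orphaned instance(s)\n", guid, have)
		}
	}
}

func main() {
	converge(
		map[string]int{"app-a": 3, "app-b": 1},
		map[string]int{"app-a": 2, "app-c": 1},
	)
}
```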
Anyway, that's an overview of the core components of Diego and how they help manage this responsibility inside of the Cloud Foundry application runtime. I'd like to change gears now and give you some updates about what we've been doing over the past year. One of these has to do with how Diego, on the Diego cells, performs active health checking of application instances.

Our previous way to express this through the BBS API has been through something called the monitor action. This is a fairly generic action to perform periodically to assess the health of an individual app instance. It could involve running a binary that does a TCP port check or an HTTP request against the application, but in general it's fairly broad: you could run basically any kind of process inside of there. Through the Cloud Controller, though, we've only really expressed these network-based health checks.

Just to illustrate, let's say we've got the Diego rep running on a VM, represented with our little logo here, and we'll look at what happens over time as it starts to run an application instance. Let's say it gets assigned an instance of a Ruby buildpack-based app to run. It first talks to Garden and creates a container for it, and then it invokes the application start command inside of that container. Of course, that process is not going to be immediately ready to receive traffic; it's going to have a warm-up period of at least a few seconds before it starts listening on its port and is capable of servicing requests. So let's say that it eventually becomes healthy, but not immediately.

Here's how Diego detects that, so it knows when to start sending traffic to the instance and to register it with the routing control plane. A couple of seconds after it invokes that start command, it begins monitoring, by invoking whatever was specified in the monitor action inside of the container as a separate process. That check process itself is going to have some amount of warm-up time: it potentially needs to load files from disk into memory to get executing. At some point it's ready to go, and it does its check against the application instance. The instance isn't ready yet, so the check fails. The check process then winds down and reports that through the Garden API back to the cell rep. At that point, the Diego rep knows that this application instance is not yet ready to receive traffic, but it hasn't been so long that it's ready to declare the instance failed.

So it just waits another couple of seconds and tries again. This time, it catches the application instance right after it started listening on its port. So it says, great, this instance is ready to receive traffic, and it emits route registrations and updates the BBS with the networking information that applies to this particular instance so it can start receiving traffic.

But that's not the end of the story in terms of how the cell continues to monitor this process. On a less frequent time scale now, it continues to invoke that same monitor action to probe the correct behavior of the application. So 30 seconds later, it runs another one of these health checks. Everything's working fine at that point, so it reports that the instance is still healthy and keeps it registered with the routing control plane.

Well, we observed that as convenient as this process was, especially to express via the Diego APIs, it did occasionally suffer from some problems, especially when there was unexpected resource contention on the cell. If the disk or the CPU really slowed down for some reason, then you could end up in a scenario like the following. Thirty seconds later, the rep invokes that monitor action again inside the container. But now, because of that resource contention, things are taking a lot longer to start up. Maybe it's taking Garden a longer amount of time to get that check process actually running in the containerized context, because as you may have seen from some of the Garden talks, that's actually a complicated process. Or maybe it's just taking longer for that process to start inside of the container. So in this case, maybe the health check is still able to successfully probe the application process inside the container, but it's taking a much longer period of time to finish. And the Diego rep is told not to wait forever on this monitor action: at some point, it's going to decide, this has taken too long, I haven't heard back, something has probably gone wrong, and it marks the instance as failed. So even though the check process eventually returns through the Garden API and says, I succeeded, exit code 0, the Diego cell has already decided the instance is no longer functional. It stops it, reports that as a crash through the control plane, and gets it rescheduled elsewhere.

We've seen this occasionally lead to cases where instances are actually healthy, but the health check is timing out in a way that makes us think they aren't anymore. And in the worst-case scenario, if you have a lot of contention on your infrastructure, this can unfortunately lead to cascading failures that are difficult to arrest. So we thought, OK, we really want to improve this to improve the resilience of the platform.
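To see why the short-running checks were fragile, here's a small Go sketch of the timing problem just described, with illustrative durations: the rep bounds each monitor invocation with a timeout, so under contention a check that would eventually succeed can still get the instance declared as crashed.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// A sketch of the failure mode: the rep bounds each monitor invocation, so
// under resource contention a check that would eventually exit 0 is still
// declared failed. The 10s/15s durations are illustrative, not real defaults.

func runMonitorAction(ctx context.Context, checkDuration time.Duration) error {
	done := make(chan struct{})
	go func() {
		time.Sleep(checkDuration) // stands in for spawning the check process in the container
		close(done)
	}()
	select {
	case <-done:
		return nil // check process exited 0 in time
	case <-ctx.Done():
		return fmt.Errorf("monitor action timed out: %w", ctx.Err())
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	// Contention makes the check take 15s; it would succeed, but the rep
	// gives up at 10s and reports the instance as crashed anyway.
	if err := runMonitorAction(ctx, 15*time.Second); err != nil {
		fmt.Println("instance marked failed:", err)
	}
}
```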
So here's what we settled on. We introduced a new API on the desired LRP specification called a check definition. It provides a much more narrowly scoped interface for defining the kinds of health checks that we've found important in Cloud Foundry: TCP port checks and HTTP requests. Diego is then free to reimplement those as a longer-running health check inside the container, one that takes much less system overhead to orchestrate and is much less sensitive to things like resource contention on the cell.

So let's look at the same scenario with a long-running health check in place. Again, we have the Diego cell, and it's been assigned an instance of this buildpack application to run. It creates the container and starts the process, which is eventually going to become healthy. The health check assessment starts out just as before: a couple of seconds after launching the process, the cell starts the health check, and the check initially finds that the process is not healthy. But instead of exiting and reporting right back through the Garden API to the Diego cell, it just keeps running. It knows that there is a certain timeout for the startup of this instance, and the rep knows that too. So it hangs out, and a couple of seconds later, checks again. At this point the application is healthy, just as it was with the short-running health checks. So the check process exits and reports back to the cell that everything is functional and good to go.

The same long-running procedure then happens for the ongoing health check that assesses the liveness of the container. The Diego cell invokes the health check in a long-running mode, it performs a check, everything succeeds, and it just waits another 30 seconds to do another check inside the container. And that all works.

Let's also see how this behaves when it detects a failure of the application instance. Say that at some future point in time, the instance actually does go bad and closes its port or stops accepting HTTP requests. The health check, on its next round of probing, detects that, and at that point it exits with a non-zero exit code. That's the signal for Diego to know that the health check has actually failed inside the container, and to rightfully kill the malfunctioning application instance and get it rescheduled elsewhere.

We've definitely seen this reduce the amount of system overhead required to monitor all the application instances, especially as your container densities on the cells go up. We actually just realized on the Diego team that we've considered this stable for a few months now: we'd needed a little bit of help from Garden to correctly arrange the resource limits on these health checks, and we'd simply forgotten to declare it stable and no longer experimental. So hopefully sometime next week this will just become the default deployment option inside of cf-deployment, but you can easily opt into it now with an ops file.
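As a concrete picture, here's a minimal Go sketch of a long-running TCP check like the one described: it retries quietly during startup until a deadline, then settles into a slower liveness loop and exits non-zero the moment a probe fails. For brevity it collapses the startup and liveness phases, which the cell actually runs as separate invocations, into one process; the address, intervals, and deadline are illustrative values.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// A sketch of a long-running TCP health check: retry during startup until a
// deadline, then probe periodically and exit non-zero on the first failure.

func portOpen(addr string) bool {
	conn, err := net.DialTimeout("tcp", addr, time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	addr := "127.0.0.1:8080" // illustrative app port

	// Startup phase: keep trying every 2s until the app listens or 60s pass.
	deadline := time.Now().Add(60 * time.Second)
	for !portOpen(addr) {
		if time.Now().After(deadline) {
			fmt.Fprintln(os.Stderr, "startup timeout exceeded")
			os.Exit(1)
		}
		time.Sleep(2 * time.Second)
	}
	fmt.Println("instance healthy") // the rep can now register routes

	// Liveness phase: probe every 30s; a single failure is fatal.
	for range time.Tick(30 * time.Second) {
		if !portOpen(addr) {
			fmt.Fprintln(os.Stderr, "liveness check failed")
			os.Exit(1)
		}
	}
}
```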
OK, moving on, I'd like to tell you about some of the security primitives that we've been incorporating into the platform and that we've found really useful to build on for other system enhancements. We've been calling these instance identity credentials, because they're a short-lived, per-instance set of credentials that we issue to every container running on the platform.

To illustrate, let's get another Diego cell up on the screen, and it's going to start running an instance of an application. Associated with that application container are several different unique identifiers. If you're using container networking, that container has an IP address that's unique across the deployment. If you're not, if you're still using the Garden built-in networking, it'll just be local to the host; but container networking is now the default in cf-deployment, so the unique address is now typical. There's also an application GUID associated with this container, which matches the identifier that Cloud Controller uses to track the application. And finally, each instance of the application has its own unique instance identifier, which is also the same as the container handle and the host name inside of the container. So these are all identifiers for this unit of work that have different meanings in different contexts.

What the Diego cell does is encode those identifiers into a certificate and key pair that it then supplies inside the container, so application processes can make use of it to identify themselves as a particular unit of work on the platform. And it does that before it even invokes the application start command, so that the credentials are ready to go when your Ruby or your Java process starts up; the process can just find them at a location that's advertised by environment variables.

Let's take a look at the contents of the certificate and see how it encodes that information. Because this is an X.509 certificate, of course it has a subject, and that's where we put some of these logical identifiers for this unit of work. The application GUID comes in as an organizational unit in that subject, and the instance identifier is present as its common name, which we thought made sense because it matches the host name inside of the container. Similarly, we have that information present as subject alternative names inside the certificate: that's a great place to put the container's specific IP address on the network, as well as to repeat the instance GUID as a DNS SAN.

These certificates are also deliberately very short-lived: by default, only a day. We make one when the container is first created, but it's only valid for a day after that. So if something happens to compromise that application instance, then on a relatively short time scale that certificate is no longer valid for anything. This is also something operators can configure; in the BOSH release, you can turn it down to a minimum of one hour, so that the credentials rotate very quickly. And of course, there's a unique serial number in the certificate to identify it.

Now, you may be wondering what happens when that certificate expires, since it's only valid for a day, or maybe only an hour. Well, the rep knows when it issued the certificate, and it knows the validity period. So before the previous certificate expires, it creates a new one with the same identifying information but a different, overlapping validity period, and it moves that into the container in place of the existing files. So if you're an application process, you can either poll that location periodically and pick up the new credentials, or you can even set up a file system watch on those locations and receive a notification through that watch when the files have changed.
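Here's a brief sketch of how an application process might take advantage of these credentials. CF_INSTANCE_CERT and CF_INSTANCE_KEY are the documented environment variables that point at the PEM files inside the container; the rest is an illustrative pattern, not platform code. Re-reading the files on each TLS handshake is a simple alternative to a file system watch that still picks up rotation:

```go
package main

import (
	"crypto/tls"
	"log"
	"net/http"
	"os"
)

// A sketch of serving TLS with the instance identity credentials, reloading
// the PEM files lazily so the rep's rotation takes effect automatically.

func main() {
	certFile := os.Getenv("CF_INSTANCE_CERT")
	keyFile := os.Getenv("CF_INSTANCE_KEY")

	cfg := &tls.Config{
		GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
			// Loaded per handshake: rotated files apply to new connections
			// while existing connections keep streaming until they close.
			cert, err := tls.LoadX509KeyPair(certFile, keyFile)
			if err != nil {
				return nil, err
			}
			return &cert, nil
		},
	}

	srv := &http.Server{Addr: ":8443", TLSConfig: cfg}
	log.Fatal(srv.ListenAndServeTLS("", "")) // empty paths: cfg supplies the cert
}
```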
This is, I think, a really interesting security primitive on its own, but it's also allowed us to make some increasingly sophisticated improvements to how we run application instances and route traffic to them. The most exciting of those is something we're calling route integrity, and it makes use of Envoy, one of the new class of very versatile, dynamically configurable network proxies that are becoming increasingly popular for coordinating things like microservices.

So let's see how this works on a Diego cell, and because we're going to be talking about routing, let's also get a Gorouter into the picture, so we can send some HTTP traffic to the application instance. As before, the Diego cell starts running an instance of an application, and it instructs that application to listen on a particular port, conventionally 8080, although there are ways to configure that through the CC API. In order to get network traffic to the application process listening on that port, the Diego cell, through Garden, allocates an external port on the host that directs traffic into the container on that container-side port. When the application is ready to receive traffic, the Diego cell is responsible for registering it with the routing control plane, which it does as follows: it knows its own IP address, so it takes that along with the external port and tells the Gorouters about it. Let's say this application is mapped to a route for example.com. Eventually the Gorouter receives a route registration that says: if you want to talk to example.com, then talk to this particular external IP address and this port. Then, when it fields a request for that domain, it connects to that IP address and port on the cell, and the traffic is forwarded into the application transparently, so the app can handle it.

This has worked great. It's been the route registration and routing model on Cloud Foundry for many years, but it does suffer from a few deficiencies. In particular, we've been very diligent on Cloud Foundry about telling applications to listen only on plain HTTP: don't handle any of the TLS yourself for secure communication; if you want that, rely on the Gorouter or the load balancers in front of it to terminate TLS. That of course means that in this model the Gorouter-to-container link is always plain HTTP, which some people have been concerned about as a security risk. And in the worst-case scenario, it means that if the Gorouter doesn't receive an update to its set of registrations, but the containers are still moving underneath it, it could eventually end up misrouting a request to a different container on the platform. To defend against that, the policy of the Gorouters has been that if they haven't received an update about a particular route within two minutes, they drop it out of the routing table: they prefer to lose availability rather than lose consistency in what they're routing to. So we said, well, this is not an ideal situation, because if there's some sort of network outage in the control plane, we don't want that to affect the data plane and how messages are getting to applications. And we figured out that with these instance identity certificates and this new breed of proxy, we could resolve that.
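Before getting to the new model, it may help to see the shape of the classic registration just described. This Go sketch emits the kind of JSON message the Gorouters consume; the field names follow the commonly documented router.register format, but treat the details as illustrative:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// A sketch of a classic route registration: the cell advertises its own IP,
// the host port mapped into the container, and the routes for the app.

type RouteRegistration struct {
	Host string   `json:"host"` // the cell's external IP
	Port int      `json:"port"` // host port mapped into the container
	URIs []string `json:"uris"` // routes mapped to the app
}

func main() {
	reg := RouteRegistration{
		Host: "10.0.16.5",
		Port: 61001,
		URIs: []string{"example.com"},
	}
	b, _ := json.Marshal(reg)
	fmt.Println(string(b)) // what the Gorouter would receive for this instance
}
```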
Here's the new model. When the cell is opted into this configuration, it runs an Envoy proxy inside the container as a separate process that it knows how to manage, and it configures that proxy dynamically as follows: it tells it to listen on a separate port from the application instance and, on that port, to terminate TLS using the instance identity credentials. Once Envoy terminates the TLS, it forwards the traffic to the application port itself.

So now the Diego cell, once it's determined that that's all up and running, allocates a separate port on the host directed to the port that the Envoy proxy is listening on. And when it registers this instance with the Gorouter, it instead includes an alternate registration with that port and the cell's external IP address, along with a couple of other pieces of metadata that allow the Gorouters to do the right thing. First, it says that this is a TLS route instead of a plain HTTP one, as has been the default. And it includes the specific identifier that's baked into that instance identity certificate.

Then, when the Gorouter gets a request for example.com, it prefers connecting to that TLS-capable port over TLS. It starts the handshake, and at that point it verifies that the subject name presented in the host certificate matches the identifier of the instance it was told about. In this case that works, so it finishes the handshake successfully and then sends the request, which is transparently proxied to the application. The application behavior doesn't need to change at all; we're just using Envoy to terminate TLS inside of the application container and proxy the traffic. Envoy's dynamic configuration is also great here, because when we rotate the credentials, we just dynamically reconfigure Envoy with the new TLS context to terminate. So it's all seamless: existing connections continue transmitting data until they're closed, and new connections get the new set of credentials for their handshakes.

This is again an optional feature inside of cf-deployment at the moment; you can use an ops file to opt into it. We're also doing a little bit of fine-tuning in how we account for the memory usage of these Envoy proxies, because of course that's not free, and we're continuing to make that accounting more sophisticated. But if you're willing to give your containers a little bit of extra room to allocate memory, this can be a great way to improve the security and the resilience of the routing tier in your CF deployments.
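Here's a minimal Go sketch of that verification step. Because the instance GUID appears as a DNS SAN in the instance identity certificate, a router can set it as the expected server name and let the standard TLS handshake enforce that it reached the instance it was told about. The CA file path, backend address, and GUID are illustrative values:

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"
)

// A sketch of a router-side check: trust the instance identity CA, and
// require the backend's certificate to carry the expected instance GUID
// (present as a DNS SAN), by setting it as the TLS ServerName.

func dialBackend(addr, instanceGUID string, caPEM []byte) (*tls.Conn, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, fmt.Errorf("bad CA PEM")
	}
	return tls.Dial("tcp", addr, &tls.Config{
		RootCAs:    pool,         // trust the platform's instance identity CA
		ServerName: instanceGUID, // must match the cert's DNS SAN
	})
}

func main() {
	caPEM, err := os.ReadFile("instance-identity-ca.pem") // hypothetical path
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	conn, err := dialBackend("10.0.16.5:61002", "instance-guid-from-registration", caPEM)
	if err != nil {
		fmt.Fprintln(os.Stderr, "refusing to route:", err) // a mismatch means a stale route
		os.Exit(1)
	}
	defer conn.Close()
	fmt.Println("handshake verified; safe to forward the request")
}
```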
Finally, I'd like to give you some updates on what we've been doing with our operator tooling. In particular, we have a tool called cfdot, the CF Diego Operator Toolkit, which we think of as a command-line tool for the Diego APIs. Most of those APIs are binary-encoded with protobufs, so they're not very friendly for use with curl. Instead, cfdot has a set of commands that interact with those APIs and translate inputs and outputs into streams of JSON. In particular, you get a stream of JSON objects on standard out, which is great for piping into commands like jq for further processing, or even into your line-oriented Unix utilities to do some ad hoc data analysis on your deployment. We even have a BOSH job that compiles cfdot, deploys it with jq, and puts them both on the path for you, so you can just drop into one of the Diego instances in your deployment and use this tooling to inspect things very quickly.

When I talked about this last time, I think we had just reached parity with all of the BBS APIs in terms of having cfdot commands for them. Since then, we've added a command that gives you a running stream of all of the events about tasks happening in your environment. You may have been using the corresponding LRP events command to monitor activity about your desired and actual LRPs, your app instances; this lets you do the same kind of thing for tasks. We also added commands to interact with the API that Locket presents, so if you want to inspect locks and cell presences, or manipulate them, you can do that with cfdot. And finally, we've added some commands that let you inspect the state of individual cells, or even all the cells in the deployment. So if you want a snapshot of what their available or total resources are, or how many instances they're individually running, this lets you do that as well.

I'd like to give you a couple of examples of how you can use these new commands to do some ad hoc querying in your environment. Let's say you want a stream of how task IDs are getting assigned to cells as they start running. You can take that cfdot task-events command, pipe it into jq, look for the particular criteria that indicate that a task has started running on a particular cell, and then report just the task identifier and the cell ID. So maybe you start this running and then run a staging task on Cloud Foundry, and you'll see a line of output saying this task ID was assigned to this cell. And maybe a couple of minutes later, you run an application task and see that task eventually get assigned to a cell as well.

Likewise, suppose you want a snapshot of the total amount of memory available across your cells at a point in time. This is emitted via component metrics, but you might not have aggregation of those hooked up, or you might want an up-to-date snapshot across the entire deployment. That's where the cell-states command comes in handy: you can use it to sum up the available memory that all the cells are advertising. I was running this in one of our testing environments, which has, I think, a total of three cells, and it came back with about 74 gigs of available memory at that point in time.
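As a variation on that jq pipeline, here's a small Go filter that does the same summing when you pipe cfdot cell-states into it. The field names are an assumption about the cell-state schema, so check them against your deployment's actual output first:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Reads the stream of JSON cell-state objects that `cfdot cell-states`
// prints on stdout and sums the advertised available memory.
// Usage (illustrative): cfdot cell-states | go run sum.go

type cellState struct {
	AvailableResources struct {
		MemoryMB int `json:"MemoryMB"` // assumed field name; verify locally
	} `json:"AvailableResources"`
}

func main() {
	dec := json.NewDecoder(os.Stdin)
	total := 0
	for dec.More() { // handles a stream of concatenated JSON objects
		var cs cellState
		if err := dec.Decode(&cs); err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		total += cs.AvailableResources.MemoryMB
	}
	fmt.Printf("total available memory: %d MB\n", total)
}
```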
All right, so lastly, I'd like to tell you about some of the things we're considering working on in the next few months. If you were at the keynote on Thursday, you likely saw Zach Robinson's presentation about how we're thinking about doing rolling app updates. We've been considering this for a while, and I think we finally have the right approach in terms of understanding what the developer experience is going to be like and using that to drive the internal system improvements. So we're very much looking forward to collaborating with the CAPI team, and especially the routing team, on coordinating that so that it's a safe and reliable rolling update mechanism that keeps your application sufficiently routable during the update.

Also, these Envoys that we've put inside the containers are now very tantalizing in terms of other capabilities; we're really just scratching the surface of what we could do with them. The container networking team is even now working on a track of work to explore transparently proxying container-to-container traffic through them, to start taking advantage of some of the client-side load balancing features that Envoy provides.

We're also always looking for areas to improve how the cells manage their memory and disk capacity, and how they put CPU limits on individual app instances. I think we've even seen some increased interest on the mailing list in controlling that, or getting better telemetry and understanding around it. So we definitely want to hear from you, the community, about what's important in terms of developer and operator controls there, because it's been a fairly flexible model, and that can occasionally cause problems for application performance.

Finally, along the lines of improving reliability and stability in the platform, I think as a community we've been thinking more about how we can do more focused monitoring of the availability of CF installations. I'm very interested in thinking about how we can make that even more informative for operators as they take responsibility for operating subsystems like Diego inside of a CF deployment. We'd like to use that as raw data, both for the development teams and for operators, about what features to enable and where to improve reliability. And along those lines, we're of course always very interested in improving the general stability and security of the platform, so when any kind of security issue arises, or if there's a severe stability issue, that's a top priority for us.

So I think we're about out of time, but maybe we could field a question or two if there are any. Yeah.

Yeah, so the question is about SPIFFE as an emerging standard for metadata in app certificates. I think we're very open to that. We don't have a current plan to adopt it, but I know some of the people involved with identity management on the platform as a whole, such as the UAA team, have a lot of ideas coming out of the activity around that kind of identity management. So as we see that standard emerge in the community, we'd be very interested in integrating with it to provide a potentially more federated model for identity across Cloud Foundry, Kubernetes, and anything else that adopts SPIFFE as a standard. Yeah.

Oh, right, so the question is: we have these instance identity certificates; how would you integrate them with some external service? At this point, those certificates are all derived from a root CA that the platform knows about, and the Diego cells get their own certificate authority to issue those certificates. So it would merely be a matter of getting access to the CA certificate for that root authority and having your external service trust it. If you're operating in a BOSH context, you're likely to have access to that as one of the variables that goes into the BOSH manifest. But if there are other ways we can expose it more conveniently, maybe through the v2 info endpoint on Cloud Controller or something, so that some other service could discover it from the platform and decide how to trust application identities, that'd be very interesting, and I think a relatively straightforward capability to build out. But we'd also certainly want to understand the acceptable workflows, because we realize there's a sensitivity to ensuring that you have the correct trust bootstrapping model for that. We want to make sure you're not getting a bogus certificate and then somehow accidentally trusting bad traffic to those services.

Oh, the question is about whether we can run a more generic command as part of the health check. So I think that's a potential capability.
We don't have a current plan to do that, but if it's something we could figure out how to express appropriately to developers through the CC API, it'd be straightforward to extend that existing check definition to accept it as a type of health check. And again, we'd be able to arrange it as this kind of long-running implementation inside of the container, to ideally reduce the amount of resource usage required for it. Yeah, one more question. Hey, Tim.

Yeah, so the question is about the actual resource usage of these Envoy proxies. We did some initial benchmarking of their memory footprint in test environments, and we found that when they were quiescent, it was pretty safe to assume they were taking about five or six megs of memory as their footprint. When they handle traffic, though, their memory footprint does increase, and at least in the short term it looks like it remains fairly stable and might be roughly proportional to the number of connections being handled. We actually have a story ongoing right now on the team to do more detailed benchmarking of that memory consumption, so that operators can start accounting for it appropriately. As part of enabling this configuration on the cells, we do let you add an additional static amount of memory allocation to those containers, just to give Envoy that additional headroom that it uses. But that's a fairly static amount, and if we can get a better understanding of the dynamic behavior of the proxy, then hopefully we can do something more sophisticated there. Because we certainly want to make sure that this is a stable option for app instances, and we don't want to be OOM-killing everything all over the place just because the Envoy proxies are running into the walls of the container.

All right, I think that's all we have time for. I'm happy to handle questions after this, or at the Diego office hours later today. So thanks very much for attending, and enjoy the rest of the summit.