Hi, I'm Dieu Cao. I work for Pivotal Software, and I'm also the Runtime PMC Lead in the Cloud Foundry Foundation, so you might know me from that. As you can see from the title of the talk, I'll be giving an update on Cloud Foundry isolation segments, formerly known as Elastic Clusters. So let's get started.

I'd like to frame the problem we're trying to solve. A lot of you out there might start with one Cloud Foundry, and that's great. It might look a lot like this, although this diagram doesn't show the data stores or all the other components required for coordination. You may have even striped it across three AZs for availability, and as more apps come onto the platform, you can just scale the appropriate tier, whether it's the routers or the cells or the logging tier. And that's great.

But then this happens: you end up with more deployments of Cloud Foundry, for one reason or another. Or maybe even this happens; I know of an organization that has 15 deployments of Cloud Foundry. There are a lot of different reasons for that, perhaps disaster recovery or HA or dev versus prod. But there are operational concerns to maintaining so many Cloud Foundries, like keeping roles and permissions in sync across all of them. There's the base cost of running a Cloud Foundry; I think we're at somewhere around 20 VMs of different sizes for a single Cloud Foundry. As you accumulate more VMs and Cloud Foundries, the deployment complexity and maintenance costs grow.

So then it's fair to ask: can we reduce the overhead? For each of those additional deployments that you have or wish to have, we can start by asking two questions. First, is it okay to have a single shared management tier across those CF deployments? That means one set of admins for the whole deployment and one set of data stores, the Cloud Controller database and the blobstore; you might have some concerns there. If that's okay, great. Second, is the latency low between those deployments? We ask that right now because this feature set is not specifically solving for the instability introduced between the shared components as latency increases, although that may be something that could be addressed later. If the answer to both is yes, then I think isolation segments could help reduce some of your overhead.

So what's an isolation segment? I'll just read this here: it's a group of Cloud Foundry resources, compute, network, and/or logging, to which applications can be directed for deployment.

I'll digress here for a moment to talk about naming. What's in a name? We've renamed and reworked this proposal, feature set, and architecture a few times and run into a few walls. It's been called placement pools, isolation groups, and elastic clusters. We've now settled on isolation segments, thanks to a suggestion from Sandy Cash of IBM. It derives from a discussion of isolation and segmentation requirements as they pertain to PCI DSS, which a fair number of you care about. I'm also fond of the word "segment" because it's not as overloaded as some of the other terms, like groups, clusters, or zones.
So let's get back to it. You've got four Cloud Foundries pictured here. Your users have to target four different endpoints and hopefully push their app to the correct one. Your admins may have difficulty keeping the permissions correct: given an org and a space, should a user have the same permissions across all of them, or does it vary, because maybe one's prod and one's dev or test? The operators have to think about patch levels; you might end up not patching one of them for a really long time, and it just gets really out of sync. So let's assume that for these deployments, an isolation segment might help to reduce this overhead.

Let's take a look at what it might look like to create an isolation segment. What I'm showing here is really the simplest type: it involves deploying an additional set of cells. You've got shared management, routing, and logging, but dedicated compute resources. In this architecture you could associate different VM types with different isolation segments. Perhaps the blue segment has lots of CPU and solid-state drives. Perhaps you have an organization with quality-of-service guarantees, such as CPU availability. Or perhaps you're offering Cloud Foundry as a service and you'd like to charge an organization for dedicated usage of a set of cells, a kind of premium tier. Additionally, you could use this as a way to add extra runtime capabilities to a smaller set of cells, such as an NFS driver on the cells in one segment. Thanks to the recent persistence work, which you can learn more about in a talk tomorrow, "NFS and Shared File Systems as a Service," you could then enable a specific service broker with plans just for that org or space; really just adding premium capabilities for particular deployments. Another possibility might be creating segments that are known to spin down at night or on weekends. Other possibilities may come to you as you think this through.

So let's take a look at a possible UX for this. I have not run this by the CLI team, but this is the general intention; I'll show a rough sketch of the commands in a moment. As a Cloud Controller admin, you can create an isolation segment; let's name it blue here. Then you can bind it to a space in a particular organization. And your space developers, all they have to do is cf push; they don't have to know or think about any of this in this scenario. In a future milestone, and I believe some work is already underway to introduce this UX, we propose that an admin can associate multiple isolation segments with an org, and then an org manager could self-service, associating particular segments with particular spaces.

The next type of isolation segment I'd like to talk about is routers and cells. From what I've heard, this is perhaps the most compelling use case for a fair number of customers and the most likely to help reduce overhead. With this, you can isolate application traffic for a particular set of apps in a segment. You'd need to configure DNS and your edge load balancer to direct routes to the correct set of gorouters, and the gorouters for that segment would need to be configured to only forward traffic for routes within that segment. So from the edge load balancer, through the gorouter, and onto the cells, the entire application request is contained within the segment, which is quite nice if you have certain segmentation requirements.
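Going back to that CLI UX for a moment, here's a rough sketch of what those commands might look like. The command names are purely illustrative, since again, none of this has been reviewed by the CLI team:

    # proposed admin workflow (command names illustrative, not final)
    cf create-isolation-segment blue
    cf bind-isolation-segment blue my-org my-space

    # the developer workflow is unchanged
    cf push my-app

The point of the UX is that all the placement decisions live with the admin; developers keep pushing exactly as they do today.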
With each of these segments, operators would need to decide how strict the networking separation is for each segment. In the initial milestones, we do have a specific requirement for Consul to be able to talk across segments on a specific set of ports and IPs, but otherwise the only required communication is between the individual segment and the management tier above it. We're also hopeful that in the future the requirement for Consul communication across segments will go away, once Consul allows for a hub-and-spoke model.

There's an additional nuance about how to deal with domains and routes in a self-service way in this model, which may or may not apply to your organization. Perhaps there's a shared subdomain for each isolation segment that requires routing, and the CF CLI could be enhanced to highlight which shared domains are dedicated to a particular segment. There's an additional proposal planned for how to elegantly handle domains and routing, but if you have ideas or suggestions in this area, I would welcome your feedback; you can comment on the existing proposal right now.

The third type of isolation segment I'm showing here adds logging to the mix. Perhaps you have an organization with compliance or security requirements that application logs from a set of apps deployed to one segment are never commingled with logs from another segment. If you have this particular use case, I'm very much interested in talking to you so I can better understand which aspects of logging need to be isolated. There are actually many components in the logging system, so there are many choices we could make there.

There's one more related aspect I'd like to touch on briefly before wrapping up, and that's trust between components. Could we add authentication and authorization between components, ensuring that workloads intended for blue components cannot be accidentally sent to or received by components in the red segment, which may have a lower trust classification? Maybe those logs go into a public domain somewhere. Now, I'm not a security expert, and we'd need to run this by someone with more security background, but one possibility we've discussed is using a set of certs for each component: as each component communicates with the management tier, the management tier could do additional verification on the OUs of the cert, where each OU maps to a segment. If this trust mode is enabled for a segment, only components bearing the appropriate certificate, issued by the trusted CA from the management tier, could receive that segment's workload.

So, to wrap up: for those of you with multiple Cloud Foundry deployments, or planned growth of Cloud Foundry, I hope isolation segments can help reduce the total number of deployments under management. Your feedback is really valuable and needed on the proposal, which can be found at this URL. I'm really optimistic that we'll reach the first milestone by the end of the year, which includes compute isolation and the CLI commands associated with it. Right now the initial work for this first milestone is done in Diego, the CRUD for it in the Cloud Controller is done, and we just need to flow that information through and then implement the CLI commands.
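If you want to poke at that Cloud Controller CRUD yourself, here's a minimal sketch using cf curl, assuming the v3 resource is named isolation_segments; the exact paths and payloads are illustrative and may differ from what finally ships:

    # create an isolation segment via the v3 API (paths/payloads illustrative)
    cf curl /v3/isolation_segments -X POST -d '{"name": "blue"}'

    # list the segments that exist
    cf curl /v3/isolation_segments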
You can find progress updates and links to relevant trackers at the bottom of that proposal, and additional proposals for some of the future milestones, as we add more capabilities, will be posted, so be on the lookout for those; I would love your feedback on them as well. Great. Questions? Johannes?

Yes, there is the stack hack. It's not great; the buildpacks don't like the stack hack, for example. Well, for one, you're hacking: it's not supported, so it lives in a fork. Two, as you can see from the UX, if you're doing the stack hack, you actually have to think about what stack you're pushing to and whether that's appropriate. Another issue with the stack hack, depending on whether you've handled it correctly, is that the buildpacks look for a particular stack to determine which binaries are the right ones to use when staging your app; a buildpack checks the CF_STACK environment variable during staging to say, oh, you're on cflinuxfs2, we'll fetch those binaries. I haven't looked too deeply into the stack hack, so I don't know how well it addresses that issue.

Yes, the last question. [Audience question, partially inaudible, about stretching isolation segments across different infrastructures.] Yes, we think this would also be possible. You might need a separate BOSH director right now and share the appropriate things in the manifest, because right now a single BOSH director can't target multiple CPIs, and you'd also have to consider the latency. If you're wanting to burst to Amazon, maybe that would be okay. Maybe it would work, but I don't know; you'd have to test it to see whether the latency issues are large enough to cause problems. That's something I'm hoping one of the teams can explore: how much latency this architecture is tolerant of. I don't believe it has to be the same deployment; it can be a separate BOSH deployment, that's true. I haven't actually looked at what the manifest changes look like that Eric has done on the Diego team; if it's possible, he's done it as a separate job. Jim, you would know: the isolation segment, does it have to be a separate deployment? If it's a property on the cell, then you could deploy just cells and then associate the property, right? And then you could choose.

We're looking to get this out as quickly as possible, so the simplest thing is: as an admin, I can bind it. Again, we're hoping to introduce some self-service UX where an admin can map particular isolation segments to an org, and then an org manager could self-service those isolation segments to the spaces he's managing. Does that make sense? Yes.

Sorry, the question Guillaume asked is one he had also raised on the proposal: at the gorouters, could you associate multiple isolation segments with them? Again, this depends on what type of isolation you need at that level. If you have five segments and you're okay sharing a set of gorouters across, say, two of them, then that gorouter can just hold the routing table for those two segments and direct traffic to the appropriate one. And again, the sticking point is how you've set up your network: if you've put the segments in separate networks, does it allow communication from this set of routers across to the other set of cells?
But it's possible, depending on how you've configured your deployment, that you could have a shared set of routers across multiple compute segments.

[Audience question about whether the same applies to logging.] I think it's possible for the logging system, but the logging system is quite a bit more complicated than the other components, so there's still some design to be done there, in terms of how you'd discover the appropriate traffic controller to hit to get at the logs for your app. There's some amount of coordination: some logs from the central management tier need to get to the right place for you to be able to read them out.

All right, if there are no other questions, thank you very much.