So welcome to How to Turbocharge the Cloud Foundry API. I wanted to talk about a problem that I feel like a lot of organizations have, not just with Cloud Foundry, but dealing with large numbers of applications in general, wherever they may be deployed. And that is, how do you understand meta information about those applications? Not just what's deployed, but the properties, the interesting facts about the application. How can you ask the system what's true about some set of applications? And what can we do about that in Cloud Foundry? So hi, I'm John Fominela. I'm an advisor for Pivotal. I work mostly with financial services and insurance clients, mostly in the Americas. But we have a pretty broad portfolio of folks that I talk to. And it's those folks who I think have run into this problem the most. They have very large portfolios. They have very diverse portfolios: some applications that are really new, some that are really old, some that are Java, some that are .NET, some that are Node, totally different levels of sensitivity, and so on. So they are interested in understanding at any given moment what's actually running in Cloud Foundry, and in being able to ask questions of Cloud Foundry: tell me how many of these applications are sensitive, or tell me what billing code I should use for this particular application, and so on. And this is a problem that may not be apparent at first glance. I want to show you the way that I think about it, to maybe reveal what kinds of solutions might be appropriate to it. And then we'll see what I did with a couple of case studies for those clients, and hopefully you can apply them to your own Cloud Foundry foundations. So like I said, the fundamental problem is one of metadata.
What I mean by that is, if we have some application, we're interested in being able to tell the foundation or foundations properties that are true about that application but aren't part of the application itself. So for example, you might have a contact person for an application, and you might want to tell Cloud Foundry, hey, the contact person for this application is Bob or Alice. You might have a billing code for the application that describes how the foundation should charge back money to whichever line of business is deploying that application. So that's another fact about the application you'd like to store close to Cloud Foundry, ideally. And then you may also have things like details about the sensitivity of the application. Does it contain classified information, or personally identifiable information that might be subject to higher levels of regulatory, auditing, and compliance scrutiny? So all of these things are examples of facts that we would like to record about the application, with the application or at least very close to it. So we have that application. Now, it's not good enough just to store those facts, because later we want to be able to ask a question like, who's the billing contact for this particular application? Or maybe you want to ask a question not just of one application at a time, but of a set of applications, maybe all the applications running on the whole foundation. Which of these applications contain sensitive information? I need to make sure that they're on this isolation segment. Or which of these applications have this billing code? Because we're going through a financial restructuring and we have to send that charge code to somebody else. So there are lots of kinds of questions you might be interested in asking, not just of one application, but of many different applications. And you might have many different kinds of questions you want to ask your foundation.
And ideally we have some mechanism by which we can verify that the facts we stated before about the application are or are not true for a specific application. So that works for one or two or three applications at a time, but what happens if you have a lot of applications? Right now we're maybe going to rely on something like the APIs that are provided by the foundation to answer those questions. But there are some problems with that. So let's say that we have a collection, not just one application, but many applications running on our foundation, right? And let's say that some of those applications have a property that we're interested in. Maybe it's an application with a specific billing code, for example. And we'd like to find all the applications in our foundation that have the property of interest. So all the applications with such and such billing code, or all the applications with Alice as the contact, and so on. So we again have some way of identifying those applications. We are querying something, we don't know what it is yet, but we're querying something to get us the answer to that question. And then ideally we get some set of query results that is in fact the set of applications that match. So the kinds of questions, like I said before, that we might be interested in are things like chargeback: who gets billed for this application? Or contact info: who's the responsible party? Or the security contact: who gets notified for this application if there's a breach? Maybe you have a global security contact like security at mycompany.com for vulnerabilities, but maybe you have a specific contact for that application that needs to be notified.
Maybe you have questions about sensitivity: which applications use personally identifiable information? Or, in the US, which applications are subject to HIPAA regulations, regulations about how we store healthcare data and private information about people getting medical treatment? Or maybe, even more technically, which applications are using a particular dependency? When I say dependency, I don't mean something like the rootfs, but rather something like, among all of my Java applications or all of my Python applications, which ones use this specific library. It might be interesting to track which of those applications have those dependencies, because if later we get a CVE about that specific library, I'll be able to tell all the owners of those applications, hey, you have a vulnerability in this, and we know that you're using this library. So it sounds like, I think, a useful ability to have, so a natural question might be, can we get this in Cloud Foundry? So the way that metadata works in Cloud Foundry right now is that it doesn't. The closest thing that you can really get is the manifest, and it seems like maybe a natural place to put this information. We already describe properties of the application in the manifest, so maybe that's a reasonable place to put some of this metadata. And so you might have something like, okay, maybe there's a metadata section that we add to the manifest. So how might something like that work? Well, when you cf push with that manifest, of course we're gonna zip up the directory and everything that you're pushing from, then we're also gonna include all the manifest data, and together that will form a payload that we'll send to the Cloud Controller API. And so that payload is going to go over HTTP to the Cloud Controller API. But what happens when it gets there is that the manifest isn't actually stored anywhere.
Instead, the manifest is read for properties that match one of a set of named keys. And anything that's not in that set of named keys is just dropped. So you can have anything you want in the manifest, even if Cloud Foundry doesn't know what to do with it, and that won't be a problem. But those properties will not be stored anywhere. So the Cloud Controller API is not remembering anything that we put in the manifest. So this doesn't really work. Even though that seems like it would be a great place to put it, it's not really gonna be useful for our purposes. So if we can't use the manifest, what are the strategies we could use if we wanted this flavor of capability? Now, whatever strategy we come up with, we'd want it to basically have three properties, I think. We don't want it to be invasive to a user's workflows. We don't want the user to have to do something weird or different about pushing applications just because we've turned this feature on or enabled it in some way in our foundation. If people have to totally change how they push apps, we're probably not making their lives better, right? We're probably complicating things for the sake of adding this feature. We also want the experience to be seamless, in that ideally there should be API compatibility. If there's some other downstream toolchain that doesn't know about the fact that we've got this capability turned on, what we don't want is a situation where, now that we've turned that feature on, we've broken some tool like Concourse that depends on a specific response from the API. Whatever that format is or whatever that structure is, we don't wanna break that in our future responses. And we also don't want querying to require enumeration. So when we ask questions, we don't wanna have to ask the question of each application in turn. If we have 5,000 applications, we don't wanna have to ask 5,000 questions. We wanna ask one question of all the applications simultaneously.
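As a rough illustration of the manifest behavior described above, the sketch below filters a manifest against a set of recognized keys and discards everything else. The key set and the `metadata` section are assumptions made up for the example; this is not the Cloud Controller's actual list or code.

```python
# Hypothetical subset of recognized manifest keys (an assumption for
# illustration, not the Cloud Controller's real list).
RECOGNIZED_KEYS = {"name", "memory", "instances", "env"}


def keep_recognized(app_manifest: dict) -> dict:
    """Return only the entries whose key is recognized; the rest are dropped."""
    return {k: v for k, v in app_manifest.items() if k in RECOGNIZED_KEYS}


manifest = {
    "name": "billing-ui",
    "memory": "512M",
    # A made-up section that nothing on the receiving side knows about.
    "metadata": {"contact": "alice", "billing_code": "12345"},
}

stored = keep_recognized(manifest)
print(stored)
```

The push still succeeds with the extra section present, but nothing from `metadata` survives into what is stored, which is why the manifest alone cannot serve as the metadata store.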
So we probably wouldn't be satisfied with an approach where we wanna know, like, show me all the applications that have Alice as a contact, and that requires asking each of the 5,000 applications one at a time. That probably wouldn't be a good experience. We wanna send one API request and get one API response back with, ideally, all of the applications that match our criteria. So I helped write a proposal that has been sent to CF Dev; the link is in the slides, which I've uploaded, and you can read more about that later. But essentially the proposal is to make it such that if you have an API resource, whether that's an organization, an application, a space, et cetera (I'm using resource here in the RESTful sense of the word), whenever you have a resource that the Cloud Controller API tracks or is aware of, ideally we should be able to put metadata on that. And that metadata could start out really simple. It's just a label or tagging sort of system of just this key, this value. That's maybe the first place to start. That's not here yet, though. So if you wanted this, what would you do? The first thing you might try is something like environment variables. That seems pretty close to having a key-value store. And in fact, because that's part of the recognized structure of the manifest, we will get that stored in the Cloud Controller, or through the Cloud Controller API. So it won't be discarded, it will actually be stored. So you might say, okay, well, that's gonna work, because obviously you can set environment variables and obviously you can use that as a key-value store. Now what are the problems with that, though? One problem is that you've added information about the application, facts about the application, that require that application to restart, right?
So you've said, oh, the billing charge code for this application has changed. It's now X instead of Y. Well, in order for that to actually take hold, you have to restart the application. But nothing about what we're telling the application here has anything to do with the state that would be needed by the application itself. It's not like we're wiring up a service or we're changing some property the application depends on. In fact, there are probably no references to this environment variable within the application. So if that's true, then it seems like kind of a bad idea to make applications restart just because we added some metadata. That's also bad because, if I'm an operator, ideally I'd like to be able to set metadata that you don't have to care about, right? Like maybe you as a developer don't get to decide what the billing charge code is. Maybe that's something that the organization decides. And maybe you shouldn't even be allowed to change the billing charge code, right? You shouldn't be able to change your bill to get paid by somebody else. If that's true, then you probably don't want to use environment variables for that, because if you can set one environment variable, you can set any environment variable. So that's probably not the right place to put this. It would technically work, but it's not ideal. So another approach would be, what if we wrote a cf CLI plugin that shadowed push? In other words, what if you installed a plugin that somehow overwrote the push command with a new version that would siphon off the part that has the metadata and store it somewhere? So what if you could say, for example, cf push and then some extra variables at the end that would be interpreted as metadata in some form? You could write that plugin, but you wouldn't be able to actually install it, because you can't shadow a cf CLI command.
So you can't overwrite push or bind-service or anything that's a cf CLI command. So that doesn't work. And maybe the next best thing is, what if we write a plugin but don't shadow anything? Then you have to change your workflow, right? You're not typing cf push anymore. You're typing cf metadata push or something like that. And that could work, but again, now we're asking people to change their workflow. Every place that uses push would have to be updated. So my continuous integration pipeline that was pushing applications to Cloud Foundry, I'd now have to tell it something like, yeah, please use cf metadata push instead of push. So it would technically work, but again, not the greatest idea, because, first of all, you have to convince everybody to install this plugin, right? I mean, now you have 5,000 developers who all have to do that correctly and keep their plugins up to date. So you kind of have a different sort of problem. But again, it technically would work. So all of these approaches that I've described that technically would work, they kind of fail that third test though, which is that we don't want querying to require enumeration, right? So you can get the data into Cloud Foundry, but when you want to get it out again, there's no API that says show me all the applications that have this property. You can't reach in and monkey with the internals. I mean, you can, I guess, if you wanted to recompile it yourself, but a typical platform operator doesn't do that. They're taking the package as is and they're running Cloud Foundry as is. So they're not going to have the ability to do this third part. So is there a way to do that third part? And the answer is yes, there is, but it requires a little bit more elbow grease than the approaches we've seen before.
And this is the one that we used in the case study that I'm going to describe later. So the way this works is that everything's the same from the developer experience. Nobody changes anything about that, but the Cloud Foundry API endpoint is proxied through an HTTP proxy before you actually get to talk to it. So it's almost like there's a route service interposed in the way. Whenever somebody says cf push or cf anything, they're really making an HTTP request to the Cloud Controller API endpoint, right? So that means that if you're hitting an HTTP endpoint, you can also proxy that HTTP. And what this proxy does is look at any information in the manifest that you've sent over; it will read the manifest first, before it gets to the actual Cloud Foundry API. So there's a step in the middle that looks at your manifest and decides if there's metadata in there that you're updating or changing or whatever. And this step is owned by the continuous integration pipeline. People aren't typically pushing their applications directly to some production foundation. They're maybe updating source control somewhere, and that triggers the continuous integration pipeline, which then triggers a push to the production foundation. But importantly, this pipeline stores the metadata that's been extracted from the manifest in its own separate data storage. It has nothing to do with Cloud Foundry. It's separate: wherever you would store the rest of the information about your application in the continuous integration pipeline, you're also storing some additional metadata there. And this is, I would say, a fairly natural fit for most organizations, because they often use things like Artifactory or some other repository of stuff that represents the inventory of all their applications.
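The proxy step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `metadata` manifest section, the `MetadataStore` class, and the `fake_cloud_controller` callback are all hypothetical names invented here, and the real proxy works on HTTP requests rather than plain dictionaries.

```python
from typing import Callable, Dict, Tuple


class MetadataStore:
    """In-memory stand-in for the pipeline's own data store (hypothetical)."""

    def __init__(self) -> None:
        self._by_app: Dict[str, Dict[str, str]] = {}

    def put(self, app: str, metadata: Dict[str, str]) -> None:
        # Merge new pairs into whatever is already recorded for the app.
        self._by_app.setdefault(app, {}).update(metadata)

    def get(self, app: str) -> Dict[str, str]:
        return dict(self._by_app.get(app, {}))


def split_manifest(manifest: dict, section: str = "metadata") -> Tuple[dict, dict]:
    """Separate the assumed metadata section from the rest of the manifest."""
    forwarded = {k: v for k, v in manifest.items() if k != section}
    return forwarded, dict(manifest.get(section, {}))


def handle_push(manifest: dict, store: MetadataStore,
                forward: Callable[[dict], str]) -> str:
    """Record metadata on the side, then relay the remainder upstream."""
    forwarded, metadata = split_manifest(manifest)
    if metadata:
        store.put(forwarded["name"], metadata)
    return forward(forwarded)


store = MetadataStore()
seen_upstream = []


def fake_cloud_controller(payload: dict) -> str:
    # Placeholder for relaying the request to the real API endpoint.
    seen_upstream.append(payload)
    return "accepted"


result = handle_push(
    {"name": "billing-ui", "memory": "512M",
     "metadata": {"contact": "alice", "billing_code": "12345"}},
    store,
    fake_cloud_controller,
)
print(result, store.get("billing-ui"))
```

The point of the design is that the caller's side is unchanged: the upstream API receives an ordinary manifest, and the metadata lands in storage that the pipeline controls.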
So it's usually a fairly simple matter to say, oh, yeah, well, I'm also gonna store some extra key-value pairs in my Artifactory or in some entirely new data store. So that takes care of the storage part. How do we get the querying part? There are a couple of ways to do that. In the case study, we wrote a small API shim, and all it does is provide a few Boolean-type operators for asking questions of this data store. But depending on what that data store is, you could very easily just use that data store's native operators directly. So for example, if that's MongoDB, there's a fairly expressive set of operators you could already use, or if it's Redis, there's a fairly expressive set of operators you could already use. But we wanted to put something in front of it so that, no matter what you chose as the storage engine, as long as it was a compatible storage engine, you'd have some way of querying the data store where the metadata was. So we just wrote this very small API shim. It's about 400 lines of Go. So if you want to ask a question in the future, you just use that API shim. Later, after you've pushed the metadata, after you've updated something, now you want to say show me all the applications that have Alice as a contact, or show me all the things that have such and such person as a security contact, or this billing chargeback code, et cetera; then you're gonna use the API shim to do that. So how did this work? What I did was propose this idea to folks who were interested in the CF Dev proposal but wanted it now; basically, they didn't want to wait for the proposal to be implemented. So I said, well, okay, the proposal isn't implemented yet.
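To give a feel for what "a few Boolean-type operators" in front of a key-value store might look like, here is a minimal sketch. It is not the shim from the case study (which the talk describes as about 400 lines of Go); the function names `has`, `all_of`, `any_of`, `negate`, and `query`, and the sample data, are hypothetical.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, str]
Predicate = Callable[[Metadata], bool]


def has(key: str, value: str) -> Predicate:
    """True when the application's metadata has exactly this key/value pair."""
    return lambda md: md.get(key) == value


def all_of(*preds: Predicate) -> Predicate:
    """Boolean AND over predicates."""
    return lambda md: all(p(md) for p in preds)


def any_of(*preds: Predicate) -> Predicate:
    """Boolean OR over predicates."""
    return lambda md: any(p(md) for p in preds)


def negate(pred: Predicate) -> Predicate:
    """Boolean NOT of a predicate."""
    return lambda md: not pred(md)


def query(store: Dict[str, Metadata], pred: Predicate) -> List[str]:
    """One question against the whole store; returns matching app names."""
    return sorted(app for app, md in store.items() if pred(md))


# Made-up sample data standing in for whatever the data store holds.
store = {
    "billing-ui": {"contact": "alice", "billing_code": "12345"},
    "ledger": {"contact": "bob", "billing_code": "12345"},
    "intake": {"contact": "alice", "billing_code": "67890"},
}

alice_apps = query(store, has("contact", "alice"))
alice_on_12345 = query(
    store, all_of(has("contact", "alice"), has("billing_code", "12345")))
not_alice = query(store, negate(has("contact", "alice")))
print(alice_apps, alice_on_12345, not_alice)
```

Because the operators only depend on a mapping from application name to key-value pairs, the same query layer could sit in front of different storage engines, which matches the motivation given for the shim.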
Here are some strategies we could try. And we basically went through the list of the stuff in section two that I told you about, and I rejected all of them except the proxy approach. For our trial methodology, there were three firms that expressed this interest, and I asked them for one non-production foundation and one production foundation each. So let's try it in non-production and let's try it in production, under the hypothesis that the kinds of questions you might be interested in asking about non-production applications are different from the kinds of questions you might be asking about production applications. We wanted to make sure that if they were doing these kinds of queries, we covered our bases on both fronts. There's a fairly broad range in the number of instances covered by each firm. So although they were each doing one non-production and one production foundation, so the number of foundations was the same, the number of application instances covered by those foundations was quite different: about a 2.5x to 3x difference between the smallest and the largest. They also chose different data stores. One chose etcd, which is a fairly simple key-value store, with Raft consensus, clustering, all that stuff. And the other two chose Redis. Again, a relatively straightforward key-value store, with much better querying and, I would say, a higher level of sophistication in terms of the operators you can use than you get with etcd. And so what actually happened? They basically all followed the structure I showed you before. Everybody used exactly the same API shim, and everybody had their own separate continuous integration pipeline, but they were each choosing which data store to use. And for the trial we basically said you can have either etcd or Redis. So they had to choose one of those two. So how did this work out?
So the mean query time, the time it took to ask questions and get responses, was compared to a baseline of recording everything as environment variables, which I thought was the closest thing worth comparing to in terms of what you can do right now in Cloud Foundry with no work. It was about 1,000 times faster than using the API, because you don't have to enumerate all of the different applications to get to an answer. You can ask one question and get the answer. Otherwise, if you have 450 applications, you have to ask each one of those 450 applications, do you have this environment variable? Do you have this environment variable? Do you have that environment variable? And so on. There's no way to query environment variables en masse as a key-value store. And you certainly can't do things like show me which applications lack an environment variable, or show me which applications have an environment variable whose value partially matches a substring, that sort of thing. So we did this for about eight weeks. And at the end of the eight weeks we said, how's it going? How's the strategy working out for you? And the result was about 94% approval: almost all people said that they would want to continue using it, and all three of the firms said that they are rolling it out to all of the rest of their production foundations. There's nothing that really changes on the foundation side. All you do is add more things to the list of endpoints that are being proxied and then sent to this data store. So these firms are really excited about this approach, because I think it gives them a capability that they don't have today, which is really valuable. If you look at Kubernetes or CFCR, you see people take advantage of the fact that Kubernetes has tagging and labels on their applications all the time.
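The enumeration cost discussed above can be made concrete with a toy model. The sketch below counts requests only; it says nothing about real latencies and is not the trial's measurement. `FakeFoundation` and `MetadataIndex` are hypothetical stand-ins, not the Cloud Foundry API or any real data store client.

```python
from typing import Callable, Dict, List

Env = Dict[str, str]


class FakeFoundation:
    """Toy per-application API that counts every request made to it."""

    def __init__(self, env_by_app: Dict[str, Env]) -> None:
        self.env_by_app = env_by_app
        self.requests = 0

    def list_apps(self) -> List[str]:
        self.requests += 1
        return list(self.env_by_app)

    def get_env(self, app: str) -> Env:
        self.requests += 1
        return self.env_by_app[app]


def find_by_enumeration(foundation: FakeFoundation, key: str, value: str) -> List[str]:
    """Ask every application in turn whether it has the variable."""
    return [app for app in foundation.list_apps()
            if foundation.get_env(app).get(key) == value]


class MetadataIndex:
    """Toy separate store that answers a whole question in one request."""

    def __init__(self, metadata_by_app: Dict[str, Env]) -> None:
        self.metadata_by_app = metadata_by_app
        self.requests = 0

    def search(self, predicate: Callable[[Env], bool]) -> List[str]:
        self.requests += 1
        return [app for app, md in self.metadata_by_app.items() if predicate(md)]


def sample_entry(i: int) -> Env:
    # Made-up data: every 50th app has billing code 12345,
    # and only even-numbered apps record a contact.
    entry = {"BILLING_CODE": "12345" if i % 50 == 0 else "99999"}
    if i % 2 == 0:
        entry["CONTACT"] = "alice@example.com"
    return entry


data = {f"app-{i}": sample_entry(i) for i in range(450)}

foundation = FakeFoundation(data)
slow = find_by_enumeration(foundation, "BILLING_CODE", "12345")

index = MetadataIndex(data)
fast = index.search(lambda md: md.get("BILLING_CODE") == "12345")
no_contact = index.search(lambda md: "CONTACT" not in md)
partial = index.search(lambda md: "alice" in md.get("CONTACT", ""))

print(foundation.requests, index.requests)
```

With 450 applications, the enumerating version makes 451 calls against the fake foundation (one listing plus one per application), while each question to the separate store is a single call, including the "lacks a key" and "partially matches" questions that per-application environment variables cannot answer directly.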
It's a very heavily used feature. Once you get above the number of pods or number of instances that you can keep in your head at the same time, which is everybody running a production foundation of any reasonable size, you start needing to be able to ask questions like, which of these applications has this property? And so that's why you see people in CFCR and Kubernetes using that capability. So the fact that we don't have it in CFAR, or in the Cloud Controller API specifically, I think represents an opportunity, and I think we're realizing that by implementing the CF Dev proposal I talked about before. So I think there's a reason that all of these firms are excited by that, because it's a capability that operators need in order to provide this as a service to their users. So thanks very much for listening. If you have questions, I think we have about four minutes to take them. I know there's a microphone that might be floating around, but if not, I will listen to your questions and then repeat them for the benefit of the camera. Any questions? Going once. Do you have a sample of your implementation available on GitHub or something? So the API shim is available. The proxy implementation is specific to the three firms, and I'm working on extracting that out of their specific implementations. So you can see that on GitHub at github.com slash FJ slash CF API shim. I will update my slides to include a link to that on this slide. And maybe one further question. You talked about the attributes being provided by the manifest and extracted by your API shim. How do you ensure that, for instance, the cost center you're referring to is not manipulated by the developers? I'm not sure I understood the question. How does it ensure that?
You talked about there being properties which are set by operators, in contrast to those set by developers, like the cost center you're charging to, and when you're extracting information from the manifest, it's provided by developers. Oh, so a couple of things. The API can be used by developers through the manifest, but operators also get to use it. And because it's controlled by the continuous integration pipeline, they get to act last. So in other words, if you try to set a property, you can do that in the manifest, but then it will get overwritten by the continuous integration pipeline's idea of what the correct value of that property is. You can say my billing charge code is one, two, three, four, five in the manifest, but the authoritative source isn't your manifest; it's the continuous integration pipeline's understanding of what the correct value is. Certainly the operator could make a mistake; nothing would stop you from doing that, and there's no role-based access control or anything. It's predicated on the notion that if you're a developer pushing, you're gonna be using the continuous integration pipeline, and that you don't have direct access to the foundation as a result. Yes. I had a question, because these metadata features have been asked for by users for a long time, and I'm surprised, now that we're trying to cross-pollinate with things like Kubernetes, where, as you said, tags are a native feature, that your solution does not seem to be in core Cloud Foundry. We'd expect a tag to be a first-class citizen inside core Cloud Foundry, and to be able to tag things, and the solution you propose seems to work around the core of Cloud Foundry. Do you have any insight into why we don't tackle this inside the API and the data model of Cloud Foundry? I'm sorry. I think I understood the last part of your question, but could you repeat that in one question for me? I'm not sure I understood.
I wanted to understand why we work around the issue, which is real. It's a lack in Cloud Foundry, and we're managing it with proxies and things like this. Yes. Whereas we'd expect that metadata feature to be core in Cloud Foundry, inside the API, inside the database, so we have something consistent. I think. I'm sorry, I still couldn't quite understand the rest of that. I'm still recovering from a bit of a cold, but I think you're asking what the relationship is, essentially, between the proxy and the developer? Like, why do we use it that way to get the metadata? Is that what I'm understanding? No, no, I was asking why we need a proxy. Why do you need a proxy? Why not just change the API and the data model of Cloud Foundry? Oh, why don't you just send the data to the API directly? Because that would require developers to change their workflow. Right, because if you did cf push but you had to do it in a special way to get the metadata on there, well, then you have to change your workflow. So I wanted all three of those properties: you don't change your workflow, it's seamless, and querying doesn't require enumeration. So that would break the first rule. If I didn't have the proxy there, then developers would have to do something different to get their data into the data stores. That's why the proxy exists at all. Why isn't that just part of the official API? Oh, I think that's the CF Dev proposal. That's what that would be. Okay, so that's the way I understood it as well. At the moment it's this proxy, right? Right, at the moment it's a proxy. The CF Dev proposal would make it so that you don't need the proxy, because it would be part of the API. But because the CF Dev proposal is not yet here and because Fortune 500 firms are impatient, they said, how can we get this now? So that's the reason for the proxy.
That would all go away if and when the CF Dev proposal that I mentioned is implemented. You're absolutely right, you don't need the proxy at all if it's part of the API. But we don't have the API yet. So this is the next best thing. Yeah, right, yeah. Sorry, I can't hear you at all without the microphone. Do you have an idea when this will be available in the API? Oh boy, I don't know if I can comment on future schedules, and I don't know the answer at all. I do know that Zach Robinson, who is leading part of that effort, is here at CF Summit. So he would be a good person to talk to. What's the state of this proposal? Is it still in discussion? Can we contribute anything to it, or is it already closed? Sorry, could you repeat the question one more time? The proposal you mentioned, for changing the Cloud Controller API. In what state is it? Is it still open for discussion, or is it finished and waiting for review? So I believe it was published in June or May. Don't quote me on that, but I think it was published earlier this summer. I am sure it's still open for discussion, but my understanding is it's being actively worked on at this point. That's my understanding. Thank you. So if you follow the link in that slide to the proposal, you'll be able to see the actual Google document that has people's comments and stuff like that. I did not include a link to the mailing list post where it was discussed, but there's also a small thread on that as well. All right, I think we're a little over time. So if you have more questions, I'd be happy to stay after this and talk to you, but otherwise I hope you enjoy the rest of CF Summit.