All right, can you guys hear me, guys in the back row? All right. My name is Rajesh. I work with Pivotal, and he's Brian. So, a quick introduction. What we're going to talk about today is something that we've been building for well over a couple of months now. It started as a testing framework, but there is a lot of other stuff that we built on top of it.

Quickly: I'm based out of Michigan and I'm a platform architect. My day job is working with my folks at Pivotal for Ford Motor Company. That's my customer, and I help them adopt Cloud Foundry. For folks who don't know, Ford is one of the biggest accounts for Pivotal. And Brian is based in Minneapolis.

Yep. I've been working with quite a few companies in the Midwest area, and I've been able to demo and show people a lot of really cool things about Cloud Foundry, which is what we're here talking about today, along with Node.js and all this stuff.

All right, cool. So what we're going to introduce today, we call this project X. Yellow. It's just an acronym we came up with last week. It's a cross-cloud monitoring and testing dashboard. There are multiple components in this dashboard that we're going to talk about, and we'll do a lot of demo of that dashboard today.

Going to the next slide: what is the problem we're trying to solve here? Just a history of this project: it started, I would say, around six months back. I was working with some teams at Ford, and Ford, as you know, is a fairly big installation of Cloud Foundry, with multiple foundations. With multiple foundations we have multiple challenges, like: how do we make sure that the same app is always running across all platforms
if you update the underlying buildpacks, if you update the stemcell, and all the services get updated? Those are the kinds of challenges that you come across when you have large-scale installations. And then, how do you provide visibility to the engineering team, the ops team, and the developers across all these foundations?

With that premise, we said, okay, maybe we need to step back and ask: can we create a cross-platform, cross-foundation dashboard? At the same time, can we provide some kind of a testing framework, so that we can push the same kind of apps across all these multiple foundations and have them running constantly, to give the health of each of these foundations? Those were the two areas we started exploring.

With that premise we again asked: is this a problem that just Ford has, or is this a problem that other customers have? When I started talking to other customers, most of the bigger customers we came across had similar problems; they had all started doing similar things. So Brian and I paired together and said, okay, we should do something different, and that's the genesis of this project. With that, I'll hand it over to Brian. He'll walk through a story and give a demo.

All right. So, like every business: some business owner comes up with some idea, and then you hire some developer to start off the project. You generally just get a feel for things and get some stuff set up. Then you start hiring more developers (and why aren't the transitions building out?). You start hiring a few more people as you go along, the project gets bigger and bigger, and pretty soon you have a whole bunch of developers. Then you start building up a bunch of different apps, which at first seems pretty simple: you have five apps to deploy and manage. All right.
No big deal. Then you start introducing Cloud Foundry, which makes running those microservices a lot easier for you. You have a lot of resiliency built in there, but you still have the one foundation, and you have to monitor your apps and make sure everything's up and running. So you get everything deployed there, and great: five apps, still real easy. You no longer have to worry about running these on a local laptop or something like that. You've got the platform to manage everything for you.

But wait. You probably have, say, a dev space, a test space, staging, prod, maybe a few other spaces for that one big application. So now you have four spaces and five apps, and now you have 20 apps to monitor, which, depending on who's monitoring, might get a little more interesting, because they have to go into the various spaces and make sure every app is up and running.

And then once you start worrying about redundancy, maybe you want to have a couple of Cloud Foundry foundations, two, three, four, five foundations spread across the globe. Then all of a sudden you basically have microservice sprawl. Now you have X many foundations times X many spaces times X many apps, which gets to be a lot of apps to monitor: in this case, say, 40 apps. You have to log into, in this case, two different foundations across a few different spaces. That's like a full-time job for someone, just to go through and make sure everything is up every day.

So some of the questions we had going into this were: How do I keep track of all my apps? Are they running?
Should they be running? Maybe you have a DR zone that's supposed to be cold, where you want to make sure apps aren't running. Along with just keeping track of all your apps and knowing where they're running: are my apps actually healthy or not? Maybe they're flapping, going up and down, or maybe the health checks are actually failing behind the scenes but the app's still up. Maybe they can't connect to a database or something. That kind of stuff is great for operations to know, and right now it's a little bit hard, especially when you start talking about multiple Cloud Foundry foundations.

So with that, we'll go to a quick demo of what we've actually been working on. Right now I'm logged into one foundation. This is a quick little app that we've been working on to pull apps from every space within the foundries that you have configured. In this case, I have one foundry configured. I can click on one of these apps and it'll go out and grab some more information about it: how many instances of the application are supposed to be up and running, what its memory limits are set to, what services are bound, when it was last updated, and such. That way, as an administrator, you can easily see at a quick glance what's happening with that application. Right now we're checking just a /health endpoint to see if the application is actually up and running, and if it is, we display the status right here. And if an app is in a stopped state, it'll show up as a little red tile here.

So right now I'm logged into a single foundation. If I go and log into a couple more foundations... These right now are just configured in a JSON file. We're eventually going to do something else.
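The /health check described in the demo can be sketched roughly as follows. This is a minimal Python illustration, not the dashboard's actual Node.js code. It assumes the endpoint returns a Spring Boot Actuator style JSON body with a top-level `status` field; the function name `health_status` and the `UNKNOWN` fallback are hypothetical.

```python
import json


def health_status(body: str) -> str:
    """Map a /health response body to a status string for a tile.

    Assumes an Actuator-style payload such as {"status": "UP"}.
    Anything unparseable, or without a string status, becomes UNKNOWN.
    """
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all (or no body), so we cannot say anything about health.
        return "UNKNOWN"
    if not isinstance(payload, dict):
        return "UNKNOWN"
    status = payload.get("status")
    if not isinstance(status, str):
        return "UNKNOWN"
    return status.upper()
```

Keeping the parsing separate from the HTTP request means the same function can be reused for custom apps that add their own fields next to `status`.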
We're not sure yet. But if I go back to apps now, this will go and pull apps across all the foundations that I'm logged into. You can see in this case that this application is currently stopped. I'm not going to go into too much detail on that. If we have a couple of different apps that are stopped, we can also see what foundation they're a part of, and what org and what space.

Along with that, we're also pulling in some Concourse pipelines that we're using for deploying some of these applications. In this case, we'll use one that we have set up for Ford, since we have our Ford guys here. If we go in here, we'll see that we have three different pipelines. If there is something currently going on in a pipeline, then, similar to how Concourse has a little rectangular thing around whatever is currently happening, that'll show up right here in this little tile. And if there was something wrong, there would be a little red slice in there as well.

We just started getting into the Concourse part. We're still trying to flesh out some of the details, and we're very much new to how this is going to work. We've been constantly changing a lot of features, even in the last week or so. We were working on this even today, right before this, so I was happy that everything was staying working.

So we can go look at the apps again and see that everything is green again.
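Pulling apps across every configured foundation, as shown in the demo, could look roughly like this. It is a Python sketch under assumptions: the layout of the foundations JSON file, the field names, the example URLs, and the `fetch_apps` callable are all hypothetical stand-ins, since the talk only says that foundations are configured in a JSON file. The `STARTED` and `STOPPED` values follow the app states Cloud Foundry reports.

```python
import json
from typing import Callable, Dict, List

# Hypothetical config layout, for illustration only.
FOUNDATIONS_JSON = """
[
  {"name": "foundation-a", "api": "https://api.a.example.com"},
  {"name": "foundation-b", "api": "https://api.b.example.com"}
]
"""

App = Dict[str, str]


def collect_apps(foundations: List[Dict[str, str]],
                 fetch_apps: Callable[[Dict[str, str]], List[App]]) -> List[App]:
    """Fetch apps from each foundation and tag each one with its foundation."""
    tiles: List[App] = []
    for foundation in foundations:
        for app in fetch_apps(foundation):
            tiles.append({**app, "foundation": foundation["name"]})
    return tiles


def stopped(tiles: List[App]) -> List[App]:
    """Return the tiles that would be drawn red (apps in the STOPPED state)."""
    return [tile for tile in tiles if tile.get("state") == "STOPPED"]


def fake_fetch(foundation: Dict[str, str]) -> List[App]:
    """Stand-in for a real API call so the sketch runs offline."""
    state = "STOPPED" if foundation["name"] == "foundation-b" else "STARTED"
    return [{"name": "canary-java", "org": "demo", "space": "dev", "state": state}]


if __name__ == "__main__":
    all_tiles = collect_apps(json.loads(FOUNDATIONS_JSON), fake_fetch)
    for tile in stopped(all_tiles):
        print(tile["foundation"], tile["org"], tile["space"], tile["name"])
```

Passing the fetcher in as a parameter keeps the aggregation independent of how each foundation is actually queried, which is also what would let other back ends plug in later.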
So this is very useful. We think this concept is going to be very useful for monitoring across multiple foundations. Along with that, we have a search. If we want to look at just the canary apps that we have deployed, you can easily search for that. I can also turn off other foundations and see just one foundation's worth. We started thinking of other things that a person might want to filter on, such as buildpacks, or whether apps are started or stopped, stuff like that, but we're still working on all the various permutations of cool things that we can add to this.

Let me just take a pause here. Can you hear me? Okay. So, to add to what Brian was saying: the basic UI, I mean, we're trying to build a minimalistic UI to get started. Things that we thought were important were like a search: I should be able to search for the same app across, let's say, four different foundations. Or I should be able to filter: give me all apps in a specific foundation, or give me all apps which are, let's say, Node apps or Java apps, or maybe apps that are using a Redis service or a RabbitMQ service. Those kinds of filters are available.

What we would also like to see as an extension is what additional things you guys, who are managing multiple foundries, think are important. Is it filtering, is it sorting, searching? Is it going against specific kinds of profiles? Any additional metrics?

What we're also doing is, for every app that we're deploying, if it has a /health endpoint, we query that /health endpoint. So if you go to canary Java and look at the details section, there is a status: UP. That's basically if it's a Spring Boot Actuator app.
There's a /health endpoint that is automatically added to it. If it's not Spring Boot and someone is deploying custom apps, then, again in the case of Ford, we're telling all app developers to have a /health endpoint. They can write some custom data on that endpoint, and we can show that custom data, like whether the app is really up or not. Those are the ideas that we have, but we would like this to be kind of a community project: what do you guys think is important to see, either on your panel or on the details page?

Continuing on that theme, we're thinking about Kubernetes, like Project Kubo, announced this week. Should we look at Kubernetes clusters? Should we bring in Kubernetes? As a cloud operator who is tasked with managing multiple cloud installations, not just PCF or Cloud Foundry, you might have Docker Swarm, you might have Kubernetes. You might even want the services marketplace: what are my common services and plans across all my foundries? Do I have, let's say, a gold plan for Redis, with some kind of quota and usage or size, across all foundries? Those are the ideas we're thinking about. Again, most of this feedback is coming to us from our customers, so we would like to hear from you guys what you want to see in this community dashboard.

The other thing we're looking at is reports. Again, working with Ford: Ford does a lot of elaborate analysis on the usage of the platform. How many containers are deployed? How many are active containers, or active AIs, and how many are inactive? How many of them are using, let's say, one gig or more of memory, and what is the gigabyte-hour usage?
Those kinds of reports are then used for a chargeback model. So we're thinking: let's pull all this data across all foundries by organization, because most of the customers we have seen look at an org as a business unit. That might be a useful mechanism to pull that data and give reports around gigabyte-hour usage per container. And then also capacity planning: what additional IaaS capacity should there be? How many Diego cells am I running, or how many nodes, or how many etcds or Gorouters or whatever? Those metrics we can pull, and give some additional capacity metrics.

Again, the idea behind this dashboard is solving a problem of scale. If you have one foundry, yes, you can go to Apps Manager or Ops Manager and get details about it. But if you have multiple foundries and a distributed team, how can you bring all this data together? This is what we are trying to do. What we want is feedback, Q&A, questions. So this is our demo and talk track.

The dashboard is the radiator dashboard, which Brian showed, and there's a test harness, which is basically running all the canary tests. There are multiple components underneath; we haven't talked about the architecture yet. The dashboard is one microservice, built in Node.js and React. Then we have proxies for Cloud Foundry, proxies for CI servers, different proxy back-end servers. And then we have the Concourse pipelines, which are actually running all these different apps, under a separate project. So there are multiple components in this. If you go to the Git repo that we put on the first page, you will see all those different projects.

Yeah. Correct.
So that's a good question. The card in the middle can be down, right? That basically means the app itself is down, in the stopped state. We're looking at the state of the app: stopped, started, or, I don't know what the other states are. But then, if you're looking at the /health endpoint, the app might be running but not be able to connect to a back-end service. In that case the /health endpoint would show you that it's not able to connect to a database, or to Redis, or whatever it could be. So there are two points of failure: one is the app state itself, and the other is the /health endpoint.

Yeah, right. Right now we're not color coding the tile differently if the application is started but unhealthy because it can't connect to a database. That's a future thing. Only so much time in the world.

All right. And it's the same with the pipelines. One of the things that, working with Ford, they wanted to know is this: a version of the app might be running, but then we're pushing a new version of the app and that pipeline might have failed. What we want to show on that screen is... in fact, if you log out and log into, let's say, another team. I can't do it right now. Oh, you can. So if you log into another team, or we can have multiple servers, so if you log into another server, you can see that one of the pipelines is down. That basically means the earlier version of the app might be running, but the pipeline pushing the newer version failed, right?
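The failure points discussed in this answer (the app state, the /health endpoint, and the deploy pipeline) could be folded into a single tile status along these lines. This is a hedged Python sketch: the `TileStatus` names, the `classify` function, and the order in which the checks are applied are assumptions for illustration. As the speakers say, the dashboard does not yet color code apps that are started but unhealthy.

```python
from enum import Enum


class TileStatus(Enum):
    STOPPED = "stopped"              # the app itself is not started
    UNHEALTHY = "unhealthy"          # started, but /health does not report UP
    DEPLOY_FAILED = "deploy_failed"  # old version is fine, latest pipeline run failed
    HEALTHY = "healthy"


def classify(app_state: str, health: str, pipeline_ok: bool) -> TileStatus:
    """Combine the three signals, checking the most severe failure first."""
    if app_state != "STARTED":
        return TileStatus.STOPPED
    if health != "UP":
        return TileStatus.UNHEALTHY
    if not pipeline_ok:
        return TileStatus.DEPLOY_FAILED
    return TileStatus.HEALTHY
```

For example, `classify("STARTED", "UP", False)` describes the case from the demo where an earlier version keeps running while the pipeline pushing the newer version has failed.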
So you can have multiple failure points: a new app cannot be deployed, an existing app might be stopped, or an app might be running but the underlying service might be down.

Yeah, so as I said, the genesis was synthetic monitoring: you're writing custom apps and you're doing some synthetic monitoring against them. The Ford teams have multiple Cloud Foundries, and one of the things they were tasked with is: when you deploy any new service, or you change any underlying component, make sure that all the apps running on them are always running, always hot. So how do you solve that problem? That's the reason we said, okay, we should come up with a list of 10 or 20 use cases which are kind of smoke tests, bundle them, and deploy them in all foundries, so that we are consistently testing the same apps in all these different foundries. That was the primary use case driving this dashboard. App teams took that and said, okay, this looks really nice; we should extend this use case. Can we write our own custom apps and have /health endpoints? Then we can use the same thing and we don't need a synthetic monitoring tool. Any other questions?

Yeah, right now we're doing it on click, just because there's some latency involved with doing that for every single one of them. We just haven't quite got to that point yet.
It was easier, both time-wise for us and in request time, to just go and click on it right now. But eventually we want to have a better summary within the tile, and then be able to click on it to get a more expanded summary, or have the tile expand when you click on it, or something like that. We're still trying to figure out exactly how the UI should look for some of these things. We're trying to make it so that you can just look and see what's happening, especially being able to pull back maybe an application version or something like that, and having it in the tile or a little card as well. That way you can search for, say, canary Java, and see all the instances of it, and if you're returning, say, an application version or a commit hash, you can easily see that something is different between these two or three, four, five, however many you have.

That's a good question. All these apps that we have built are stateless. We are purposely not storing any data, because if you want data you can always go to Apps Manager or Ops Manager and get it, right? If we start storing time-series data, it's going to be too much data, and we'd be replicating that data. So I think we'll solve that problem by saying: if you need historical data, then you launch from this app into a specific console for that app, maybe Apps Manager or something, rather than trying to store and replicate data here. Up to this point we are consciously not storing data, just pulling data and keeping it stateless. If at some point we need to store some data, then we'll start thinking about what the best solution is to start with.
Yeah, we figured: start simple, then go from there. We also don't want to be another source of truth for data. We figure that most of the data we really need we can pull from the Cloud Controller API pretty easily. Along with that, if we start pulling apps from, say, Kubernetes or Docker Swarm, we're trying to have a more uniform interface for that. That's kind of where we're trying to go. If we could avoid having to store state, that would be great. Obviously at some point we're going to want to store some little bits and pieces, but we just haven't quite figured out what scope of data we want to store yet.

All right, any other questions? A couple of things we want to point out before we wrap this up. We want feedback, right? There's a feedback form. If you guys who are using Cloud Foundry can go and submit some feedback and tell us what's important for you, that would be very helpful for us to prioritize things. And if you need code, we'll send out the source code of this repository, and how to install it and start using it.

We're also posting the slides up on the schedule site as well, so look for that after this presentation; it should have the updated slides. We were modifying slides pretty much up until we got to the presentation, so the ones that are up there right now are a little bit stale. They should be updated soon after this.

All right, thank you.