Hi everyone, my name is Saurabh Hirani and I work as a DevOps engineer at Bluejeans, and I'm here to keep you awake for the next 20 minutes in case you're feeling sleepy. The talk that I'm going to be presenting is about Inframer. It's an open source tool that we developed at Bluejeans Network to solve a very interesting problem.

So let's get on with the problem statement. The problem we are trying to address is: how do I correlate information in my distributed infrastructure? Now that's quite a mouthful, but I'll expand on it. What I'm trying to say is that in today's world, when we are talking about a node, we are talking about that node's information being present in multiple databases. And by database, I don't mean MySQL or Postgres; I mean any place where your infra information resides. That means it may be in Chef, it may be in VMware, and so on.

So picture a scenario: you have been told to set up a web server. What do you do? First, you decide what is going to go on it. Do you want Apache? Do you want Nginx, or something else? Once that is decided, you decide the config. And once you have all of that sorted out, you have to set it up. But you can't just set it up manually; you have to do it in a way that is reusable by others too. You should be able to change the configs without logging on to the server and doing ten things by hand. So for that, you use a config management system like Chef, Puppet, or something else. The information about your software — what configuration is going to be there, what things you are going to install — that goes inside Chef.

Now you have got your machine up and running. Next you have to tackle the problem of monitoring: you have to see to it that you put in the right disk space alerts, the right CPU alerts, and so on. That information goes in Icinga, Nagios, Zabbix, what have you. And this machine can either be an on-premise physical machine or a virtual machine. If it's a virtual machine, you have to decide which host it will go on, which cluster it will go in, which datastore it will connect to, if you are talking about VMware. It can be a local hypervisor too, but I'm just taking one scenario here. And if it's on the cloud, then you have to look at AWS, DigitalOcean, and ask the same questions there. And for those of you who are not familiar with Device42: Device42 is basically an inventory management system which provides you APIs that tell you where your node is, how it is connected to your networks, and all those things.

So we are talking about just one node, a web server, and you can see that the information about that node is distributed across Chef, Icinga, VMware, AWS, Device42 — and this is just one part; it can go beyond this too. So if we revisit the problem statement now — how do I correlate information in my distributed infra? — this is what I'm trying to answer. This is my distributed infra. Now, if you have distributed infrastructure, there are very good parts about it. Each of these databases does one thing and does it well. Chef does not mess with Icinga, Icinga does not mess with VMware.
They all do one thing and they do it really well. And all of these guys provide rich REST APIs for you to work with. I'm sure that most of you, when you have to provision a virtual machine, don't go to the AWS console to do things. You have your Boto scripts, you have your libraries which you can use; you just fire them from the command line and your work is done. And obviously, in the open source community, there are really good projects like Boto and PyChef, which are libraries built over these APIs to give you a uniform, easy way to do things.

These are the good parts. But there are some pain points associated with this. All of the knowledge about a particular node is distributed across all of these databases, so your knowledge is trapped in silos. Their APIs don't talk to each other — and they shouldn't, because each database is supposed to do one thing and do it right. But at this point, you have to ask yourself: don't you wish these APIs could talk to each other? Don't you wish you could correlate that information? Don't you wish you had a tool which answers questions like: give me an IP, and I will give you information about that node in Chef, Icinga, AWS, Device42? You shouldn't have to fire ten APIs to get all of this information; you should be able to do it from a single point.

Until now we were just talking about the node. Let us move up a level, to the databases. You should also be able to answer questions like: give me all my AWS nodes in this region which are running but not yet monitored, or which are running and monitored but not yet in Chef. And these are just some of the permutations; there can be n number of them. And in the end, it validates your assumptions about your data consistency. What do I mean by that? Let's face it: no matter how much automation you have in place, you will always have that one VM that someone created for a customer demo six months ago and it's still lying there. How do you find such orphan nodes?

So if you are with me till now, these are the questions we are trying to answer: information about your infrastructure from a single point of view, rather than collating the information in your own one-off scripts. This is where Inframer comes into the picture. It's a tool that we created to scratch our own itch. It was built using Python, Flask, and Redis — very cool open source projects in themselves — over a period of two hackathons, in 48 hours. But soon we realized that this is not just a hackathon project; it has many more applications. It was incorporated in our DevOps roadmap and we have open sourced it today. If I have to give you a one-line description of this tool and how it tries to answer these problems, the simplest view I can give you is: collect, store, and analyze information. In the architecture, collectors dump into stores, and a REST API sits on top.
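[Editor's sketch] To make "collect" concrete before the architecture walkthrough, here is a minimal sketch of what an AWS collector might boil down to. This is an illustration, not the code from the Inframer repo: boto3 as the AWS library and the Redis key layout are assumptions.

    import json

    import boto3   # assumption: any AWS SDK works; the idea is the point
    import redis

    def collect_aws(region="us-east-1"):
        """Pull EC2 instance info and blindly dump it into the store as JSON."""
        ec2 = boto3.client("ec2", region_name=region)
        store = redis.StrictRedis(host="localhost", port=6379, db=0)
        for reservation in ec2.describe_instances()["Reservations"]:
            for inst in reservation["Instances"]:
                # Hypothetical key layout: inframer:db:aws:<region>:<instance-id>
                key = "inframer:db:aws:%s:%s" % (region, inst["InstanceId"])
                store.set(key, json.dumps(inst, default=str))

    if __name__ == "__main__":
        collect_aws()

No analysis happens here; the collector's only job is to fetch and dump, which is exactly why you can write it once and forget about it.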
So if you look at the architecture diagram: when I say collect, I mean collectors. The VMware collector, the Chef collector, the AWS collector — these are just standalone programs that you've written in your favorite programming language (Python, in our case), which capture all of the information and expose it. These are generic, everyday scripts that you write to talk to VMware, to talk to AWS, to talk to Chef. You write these scripts and then you forget about them. These scripts dump their information into a store. The store is like a central repo wherein the collectors just blindly dump the information without doing any analysis: you get a big JSON, you dump it, that's it. And over these stores, we build a REST API which exposes the information present in the stores in a consumable way. Once you have the information ready to be consumed, there are no limits on what you can do with it: you can write tools over it, you can build dashboards over it, and a lot more.

So let's look at each of these components: collectors, stores, analyzers. Collectors, as the name suggests, collect information from each database and return a JSON. And each of these databases has multiple views. For example, Chef looks at your information from the viewpoint of hosts, from the viewpoint of environments, and so on. VMware views your data center from the viewpoint of VMs, of clusters, of datastores, and so on. And obviously these are extensible: you can write your own collectors in any way that you want.

Collectors dump their information into stores. The current implementation of the store is Redis, but the collectors are totally decoupled from the stores and you can have your own implementation — if time permits, I'll talk about that. And once you have the collectors and stores in place, you can have a REST API over them, which says: all of the information that I've dumped, show it to me in a consumable way. If you look at this sample URI, it is pretty simple: inframer/api/v1 — in all the databases, give me the AWS database; in the AWS database, give me the us-east-1 region; and in that region, give me a sample instance. This gives you all the information about that instance in one shot. It will be very clear when I show you a demo.

If you look at this dump, it looks familiar if you've used Boto or any other library that talks to AWS. What we are doing is simply connecting to AWS, getting the instance information, and dumping it as JSON — all of this you've seen before. Now, what if out of all this information I just wanted the state of this instance? I take the same URL and add a filter to it: key=state. And I get back: the state is stopped. What if I wanted to apply this key to all of my region's instances? I move up the URL, use the same key, and say target_key=state, and I get the running VMs, the stopped VMs, all of them. And out of all of these VMs, if I just want the running ones, I add one more filter: target_key=state and target_value=running. So I get just the running VMs.
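[Editor's sketch] In code, that whole filter walkthrough is just four GET requests. A sketch using the requests library — the host, instance ID, and exact path layout are placeholders, but the key/target_key/target_value parameters are the ones described above:

    import requests

    BASE = "http://inframer.example.com/inframer/api/v1/db"  # hypothetical host

    # Everything about one instance, in one shot:
    info = requests.get(BASE + "/aws/us-east-1/i-0abc1234").json()

    # Same URL, filtered down to a single key -- just the state:
    state = requests.get(BASE + "/aws/us-east-1/i-0abc1234",
                         params={"key": "state"}).json()

    # Move one level up the URL and apply the key across the whole region:
    states = requests.get(BASE + "/aws/us-east-1",
                          params={"target_key": "state"}).json()

    # Filter by value as well, so only the running VMs come back:
    running = requests.get(BASE + "/aws/us-east-1",
                           params={"target_key": "state",
                                   "target_value": "running"}).json()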
Now at this stage, you are probably thinking: OK, this guy talked to AWS, got the information, dumped it into Redis, exposes it through a REST API, and all he's doing is showing me the running VMs — I could probably do that in a ten-line Boto script, and I don't need all of this for that. But this is not for that. This is useful because if you look at the private IP of this guy, 10.1.1 — and this is just dummy data — the information about this node is captured not only from an AWS perspective, but also from a Chef and a Nagios perspective. You can query the Chef database and get all the Chef-related information about this node. You can query the Nagios database and get all the Nagios-related information about this node. What does this mean? It means that all the information about your one single node is now aggregated in a central store for you to consume and make sense of.

This is where it gets very interesting. This is where you write analyzers, and this is where it all comes together. Because without analysis, your data is totally lame, and without data, your analysis is totally blind. Analyzers are command line tools that you write on top of these REST APIs. They query the individual databases and perform any operation that you want. As an example, I have taken set operations. Remember the question we talked about earlier: give me all the AWS nodes in the us-west-2 region which are running and monitored but not yet in Chef. How do you answer that?

So here is a very simple — frankly, very dirty — script that I wrote. If you look at the URI we covered earlier: in this AWS region, for the key state, give me the instances whose value is running. All the script does is get that JSON, extract the IPs out of it, and dump them into a file, us-west-2-running. So you've got all the IPs in us-west-2 which are in a running state — that's number one. Similarly, you can get all of your Nagios nodes, and all of your Chef nodes. At this stage you have three files: all the IPs in us-west-2 which are running, all the nodes in Nagios, and all of your Chef nodes. You have three data sets, and when you have three data sets, you can perform operations on them. You can take the data set of us-west-2 running and the data set of Nagios and do an intersection, which means: give me all the running nodes which are monitored. Then you can take the output of that — us-west-2 running and monitored — and intersect it with all the Chef nodes, which means: give me all the nodes which are running, monitored, and Chef'd. And you can replace the intersection operation with anything else. By the way, this is written in Python using its set type, so any operation that you can do with a set, you can do here. If you do a difference, you get the nodes which are running and monitored but not Chef'd. There are no limits to what you can do with this; the information is in your hands.

And it is extensible, obviously — it is customizable for you. Think about it again: revisit the part where you queried AWS instances, got the information, did some crunching, and spat out the output, all through one script. All I'm trying to say is: break that into manageable parts.
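[Editor's sketch] Here, roughly, is what that set-operations analyzer could look like end to end. The talk's version went through intermediate files; this sketch keeps the sets in memory, and the endpoint paths and the shape of the returned JSON are assumptions — the set algebra is the point:

    import requests

    BASE = "http://inframer.example.com/inframer/api/v1/db"  # hypothetical host

    def ips(path, **params):
        """Query one database view and pull the IPs out of the JSON.
        The response shape is assumed; adapt the extraction to the real API."""
        nodes = requests.get(BASE + path, params=params).json()
        return {node["private_ip_address"] for node in nodes.values()}

    running   = ips("/aws/us-west-2", target_key="state", target_value="running")
    monitored = ips("/nagios/hosts")
    chefed    = ips("/chef/nodes")

    print(running & monitored & chefed)    # running, monitored, and Chef'd
    print((running & monitored) - chefed)  # running and monitored, not Chef'd
    print(running - monitored)             # running but unmonitored: orphans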
I'll write a collector and expose it to the community; you write an analyzer and put it out there. I can use your analyzer, you can use my collectors. And the secret ingredient in the end is you. I know this sounds very corny and cheesy, but this is not a self-help talk: there are no two infra layouts in the world that will be alike. You will have more databases — in that case, you write your own collectors. You will have more information and your own view on your analytics; you might want to go beyond the usual differences and intersections — so you write your own analyzers. You fit the tool to your needs and not the other way around. And obviously this was done through a hackathon, and hackathons are great places where these ideas flourish. So participate in hackathons.

To conclude my talk: there is a growing need to correlate all of this information, and it has to be done in a generic way. I'll take my own example. I'm a Python guy, as you might have guessed by now, so I write Boto scripts, I get the information, I dump it out, and that's it. What do I do with it? Do I tell a newbie who is new to AWS: here is a generic collector; point it at AWS and it will just dump out the data, and you don't have to care how? And there would be a data analytics guy who doesn't care either — why should he write Boto scripts to gather the information when all he needs is data to work with? He can use your collector and have his own analyzer. He works in his area of strength, you work in yours. We are collating information collected through all of these APIs and trying to make sense of it. And if nothing else, this also serves as a central source of examples: Chef, AWS, VMware — any database, any DevOps tool that you're using which exposes an API — there should be a collector for it and there should be an analyzer for it. The code stays constant: your collectors are generic, your analyzers are generic, but the output they produce is specific to your organization, and that stays with you. The code can always be shared with others.

That was about it, and I hope I could communicate my ideas in the limited time I had. We had great fun writing this tool, and we have open sourced it in our GitHub repo, BlueJeans Network. Catch me or my colleagues to talk about it anytime. Thank you. Questions?

Audience: Hi. Just an idea, not a question — you could probably dump all this data into Elasticsearch, use Lucene search, and get all kinds of metrics out of it using aggregations and such.

Great. I have never had good timing in my life, but for the first time I have: that's literally my next point — better search capabilities, full text, fuzzy values. We dumped the information in Redis because, as a starting point, that was the easiest thing to do, but we do want stores like Elasticsearch. And if you look at the architecture image we have here, it may seem that the collectors are dumping the information directly into the stores, but that is a simplified view. What actually happens is that the collectors talk to a central runner, which talks to the stores — Redis, Elasticsearch, or whatever it is — and dumps the information there. So the stores are totally pluggable: you can have Redis and Elasticsearch running together.
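[Editor's sketch] To show what "pluggable" could mean here, a minimal sketch of a store interface with two backends. This is an illustration of the idea, not the runner from the repo; the Elasticsearch call is sketched from the modern elasticsearch-py client, and the index name is made up:

    import json

    import redis
    from elasticsearch import Elasticsearch

    class RedisStore:
        def __init__(self, **kwargs):
            self.conn = redis.StrictRedis(**kwargs)

        def dump(self, key, doc):
            self.conn.set(key, json.dumps(doc, default=str))

    class ESStore:
        def __init__(self, **kwargs):
            self.conn = Elasticsearch(**kwargs)

        def dump(self, key, doc):
            # Full-text and fuzzy search come for free once the doc is indexed.
            self.conn.index(index="inframer", id=key, document=doc)

    # The runner fans each collector's JSON out to every configured store,
    # so Redis and Elasticsearch can run side by side:
    def store_all(stores, key, doc):
        for store in stores:
            store.dump(key, doc)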
And Inframer is an open source project, so you can directly write your Elasticsearch store, dump the information there, expose it through the API, and you're good to go. Thank you for asking that. Any other questions? All right. Thank you very much.