All right. Hello, good afternoon. Today we're here to talk about OpenStack Searchlight. This is a project that had its first release in the Liberty cycle, and we'll be continuing it in the Mitaka cycle. I'm Travis Tripp, and with me I have Steve McClellan and Lakshmi Sampas.

So let's talk a little bit about what Searchlight is. Have you ever tried to find everything you've recently created or updated in your cloud? Let's say you're looking for a specific server, or maybe an image, or you want to find anything in your cloud that has a certain keyword. I want to find everything that has the word Mitaka in it, for example. Or you misspelled that word while typing, because if you're anything like me you probably can't type very well, you misspell a lot of things, and then you don't get the search results you want. And have you ever wanted a UI for OpenStack that could prompt you with search suggestions to help you find what you're looking for, rather than you having to know everything before you even get started?

With OpenStack you couldn't do that, because a lot of the APIs are really inconsistent. Each service is inconsistent about which fields you can search on and whether you can use wildcards, and the behavior differs again from one service to the next. So maybe I can use a wildcard in a certain field in Nova, but I can't do that in Glance. And they usually don't let you search on every field. If you want to do a full-text search on the description field of some of your images, you're not going to be able to do that. It doesn't work because of denial-of-service concerns with the actual database behind it. And if you want to find your resources across all your types, you have to search every single service individually. You're going to have to go in there and, say, hit Nova.
You're going to have to go and hit Glance. You're going to have to hit them all individually and then aggregate the results on your side of the equation. And they're often slow. And if you want to get a change into them, it's very difficult. It can sometimes take multiple cycles, and sometimes you can't get it done at all. For me, that kind of makes me want to bang my head on a brick wall. But that's why we came up with Searchlight.

Our mission is to provide advanced, extensible, and scalable indexing and search across your multi-tenant cloud resources. What we're doing is bringing the full power of Elasticsearch to all of OpenStack. And it's not an admin-only service. This is for all users, which means we're taking the RBAC concerns into account. If you go searching for something, you're going to see the things that you're allowed to see, not everything across the cloud. What this gives you is a consistent search API across all of your OpenStack resources. You're getting full-text search on any OpenStack resource. Search term discovery, meaning it can tell you what you can search for. Autocompletion. And then one of my favorites is fuzzy search: if you mistype something and put in the wrong letters, you'll still get results for what you're looking for. And then we can get into some more interesting aspects such as geospatial search, so if you put in latitude and longitude coordinates you can search on those.

So let's take a peek. What I'm going to show here is the Searchlight Horizon panel; there's a patch up for it. It works with Liberty. We'll create an offshoot of that specific to Liberty, but we're going to continue developing it in Mitaka.
We're going to start off by taking a look at the existing Nova instances table in Horizon. You see you have various instances out here with different names. If you want to search, you do have some fields you can search on. You have flavors, you have names, and if I type demo you're going to get some results back. This is going against the standard Nova API. You can see we have things like demo Mitaka or demo Liberty. Now say I want to limit that down and start adding some more advanced wildcarding: demo*Mitaka*, send off a filter request. It doesn't give us results. And I'm not sure why, because the Nova API is supposed to support some wildcarding on those kinds of fields. But we didn't get a result and we don't know why. If I want a particular instance in a particular status, I'll have to take a guess about what the possible values in my database even are. Maybe I'm looking for stopped ones. That didn't give me anything. Shut down. Okay, it actually turns out it's shut off. Let's see if we get results now, and you're going to get your results. So I had to do a full search against the Nova API just to figure out what I'm looking for.

We go over and look at images, and you're going to see some other filter options. Again, this is going against the regular Glance API. I do a demo search. No results. I try demo*. Again, no results. So you get inconsistent results depending on the API, and it's kind of frustrating. Even though I have a couple of options to search on, it's actually kind of confusing.

So here we're going to look at the Searchlight panel, which we've started. This is going against the Searchlight API. The first thing you're going to notice is that we're getting cross-resource search results. So I've got DNS records.
I've got images. I've got servers in there. And if we take a look, the full data that you might expect coming out of the Nova API is available to you. It's not just a subset of it. We go look at images, and again all the common fields you might expect are coming out of the Searchlight API. And here we are looking at the DNS records that came out.

So now we want to do that very same search. We're going to search for demo. We'll do demo*. There we go. Near-instantaneous results across your whole cloud and your different resource types. And if you want to do some wildcard searching, so demo*Mitaka, type it in. There are your results. Very consistent. You're not faced with that question of what you're going to get on a certain API or a certain kind of server. And we can continue to limit it down. So then I say, oh, well, I actually want to search images. That's very simple. We say, just search images, and narrow it down very easily and very quickly. It's quite a powerful UI.

And now here's one of the cooler parts. We have the ability to easily limit results in different ways. So let's find everything that was updated in a certain time range, say the past day. Just click that, and it gives us back everything that's been updated in that time range, whether it's your servers, images, or whatever. Now we can do discovery of the different terms that you can search on. So let's limit this to servers. Now we're down to just servers. And the Searchlight API is now dynamically returning the different facets that you can search on, so it's listing them out there for you. If you click on one, it's going to give you the options within it. So it's saying you have 12 servers that have an availability zone of Japan. Click on that, grab it, bring it down. You see it's now limited down to 12 items.
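As a rough illustration of what facet discovery amounts to, the sketch below counts the distinct values of each field over the documents visible to one project. It is an in-memory toy, not Searchlight's implementation; the function name `facet_counts`, the field names, and the sample documents are all made up for the example.

```python
from collections import Counter, defaultdict


def facet_counts(documents, project_id, fields):
    """Count the values of each field over documents owned by project_id."""
    counts = defaultdict(Counter)
    for doc in documents:
        if doc.get("project_id") != project_id:
            continue  # facets are limited to the caller's own project
        for field in fields:
            value = doc.get(field)
            if value is not None:
                counts[field][value] += 1
    return {field: dict(counter) for field, counter in counts.items()}


# Hypothetical sample data: two servers in "demo", one in another project.
servers = [
    {"name": "demo-1", "project_id": "demo",
     "availability_zone": "japan", "status": "ACTIVE"},
    {"name": "demo-2", "project_id": "demo",
     "availability_zone": "japan", "status": "SHUTOFF"},
    {"name": "other-1", "project_id": "admin",
     "availability_zone": "us", "status": "ACTIVE"},
]

facets = facet_counts(servers, "demo", ["availability_zone", "status"])
print(facets)
```

Each facet value comes back with a count, which is what lets a UI show "12 servers in this availability zone" before the filter is applied.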
And we can keep on building that out. You'll see additional facets being dynamically discovered and shown with the exact options that you have. So here, we'll maybe look at security groups. There you can see how many security groups there are. And these facets and options, again, are limited according to the project that you're logged into. So this is not across your whole cloud. I'm logged into my demo project, and these are the actual values and choices I can make from within that project. You can see it just keeps narrowing. Here are the various networks that are available. And you can keep pulling it down until you get your final results. It's very simple and easy to use, and it takes away all the guesswork you might have when you're using the classic search APIs on your various services. And it's consistent. We go over to images and you'll see the same thing: your facets dynamically discovered, as well as their various options. So here you can see there's one image that has an AKI format.

Beyond the basic discovery of those, we also take advantage of the ability to use fuzzy search. So here I'm going to type in my name field. Instead of openstack, I type openstuck, with a U in there. And you see, it still finds your results. So not only are you getting the nice search capabilities, but for people like me who can't type, it'll still find your search results for you. Here's Fedora. Instead of Fedora, I transpose some characters and I get Fedroa. And you'll see it still finds my images that are Fedora. It's a very powerful search that you really couldn't expect every single API in OpenStack to implement and handle on its own. Instead, we're getting this consistent result.
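To make the fuzzy matching above concrete, here is a minimal sketch in two parts. `fuzzy_name_query` builds an Elasticsearch `match` query with the `fuzziness` option; whether the Horizon panel sends exactly this query type is an assumption. `osa_distance` is an illustrative edit-distance helper (not part of Searchlight or Elasticsearch) showing why both typos from the demo are within one edit of the intended word when adjacent transpositions count as a single edit.

```python
import json


def fuzzy_name_query(text, fuzziness="AUTO"):
    """Build an Elasticsearch match query on `name` that tolerates typos."""
    return {"query": {"match": {"name": {"query": text,
                                         "fuzziness": fuzziness}}}}


def osa_distance(a, b):
    """Edit distance allowing insert, delete, substitute, and
    transposition of two adjacent characters (optimal string alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


body = fuzzy_name_query("fedroa")
print(json.dumps(body, indent=2))
print(osa_distance("fedroa", "fedora"))
print(osa_distance("openstuck", "openstack"))
```

The point of the design is that the typo tolerance lives in the search engine, so no individual OpenStack service API has to implement it.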
And now finally, if you really want to get powerful, there's the query language support. This is now doing full-text search. I type database, and you see it finds everything tagged as a database. I've got Oracle in here and Postgres. And if I want to say, now let's limit this to open source, it's taking advantage of the tags. And you'll see it has now brought back more results. And here's what's interesting: I typed database, so why am I seeing Nginx and Apache? Well, that's because the query language support is actually very rich. You can do ANDing and ORing. You can do grouping. So let's make that an AND query instead. And now we've limited it down to just the results that match with that AND.

Then let's get a little bit more fun and do some range queries and some faceting on a specific field. So I say, now find me everything that has a minimum RAM that's less than 2048. And just like that, it's popping those up. Jumping down to 1024. We can go to 512. Or if you want to search on a specific range, not just everything less than a value, we can say, give me everything from 1024 to 2048. And you get your instantaneous results. When we dig into that, you're going to see that it's really looking at the real field on those images and giving you that query ability. So there you go: min_ram is within that range.

So that's the UI that we're creating for Horizon. It's going to give it cross-resource searching, and you can try it out with Liberty. It's available up there as a patch you can pull down.

But let's take a little peek on the inside. We did a little bit of performance testing with it. We tested with just four compute hosts and 250 instances, so nothing huge, but a reasonably sized sample set.
Comparing a wildcard search and a list-all in Nova against the same requests in Searchlight, you're seeing anywhere from a 4x to 8x performance improvement in your search request. So it works as an effective caching layer as well.

So let's talk a little bit about how this actually works. Down here on your bottom right, you're going to see your cloud services. Say Nova, Glance, Designate. What happens is you index those into Searchlight via a set of plugins. So we have a plugin, say, for Nova servers. They can be indexed on demand, so I can simply say go and index all my servers, and it'll index those into Searchlight, which is actually putting that into Elasticsearch. We also consume notifications, so we're going to be pulling in your incremental updates as they happen. When you're doing your listing and querying requests, you go against Searchlight. Do my querying, do my searching. But if you want to do an action request, I go, hey, Nova, create my server. Nova's going to create the server and it's going to emit a notification event. That notification event gets consumed by Searchlight, which will then index it into Elasticsearch, making it available and up to date. We're going to go a little bit more in depth on this now, and I'm going to turn it over to Steve McClellan to talk through that.

Thank you, Travis. Yeah, as Travis mentioned, I'm going to go into a little more detail on some of the innards of Searchlight. In particular, I'm going to talk about, first of all, how data gets into Searchlight and Elasticsearch, and also a bit more about the query language that you can use to get results out of it. As Travis mentioned, there are two ways that we can get data in. We call this indexing, which is a term we've borrowed from Elasticsearch. We have on-demand indexing, which you might use when you initially set up Searchlight or when you initially set up your cloud.
And then we have notification-based indexing, while the cloud's running. I'm also going to give a brief demo. I was dissuaded from doing it live for reasons we all understand. And with me, you get console windows, not flashy web interfaces.

The top window here is running a search every second against Searchlight, asking for all Nova servers that are indexed at the moment. Right now, there's nothing indexed, although I can run a nova list command and I'll get a list of the servers that are running. This is just a small DevStack instance, so there's not a lot going on, but enough to show what we're talking about. If I were to install Searchlight and get it up and running with this existing cloud, I'd use the management command that we get with Searchlight. One of the commands it provides is to synchronize the initial index. When I run that, Searchlight looks at its config files. As Travis mentioned, we have a plugin system. Searchlight goes off and looks at those plugins, and lists out what it knows about and the Elasticsearch indices that they're configured against. It's going to ask me if I really want to do this, because it's going to delete existing data so that we don't get left with stale data. This is a fresh install. There is an option to disable this if you know that you just want to update existing indexed resources. I'm actually going to say no to this because I just want to show Nova instances for the sake of this demo. So I'm going to hit no here.

We have a type parameter. I'm going to pass it OS::Nova::Server, which is the canonical OpenStack name for servers. Once I run this, Searchlight is going to go off to Nova and list all of the running instances in the cloud. It's got administrative credentials, so it's able to get everything. In this instance, there are so few that it will do it in one pass, but if you've got a big cloud, it will page through the results. This is the same process for any resource type.
It differs slightly based on the resource, but it's essentially the same process. The window at the top, as I said, is running a search every second. It's set to highlight changes between searches, so if something changes on that screen from the previous second, we'll see it highlighted in gray just to show what's happening. I'm going to set this off. We'll see some log output from the management process, and then the search results will come in at the top. We see those highlighted, as I said. It all happens pretty quickly. Search results start coming back almost instantly. So that's how we do the initial indexing of a fresh cloud or of a fresh Searchlight install.

As Travis mentioned, we also have notification-based indexing. So I'm going to boot up a Nova instance with a little CirrOS image, and I'm going to call it high open stack. When I run this, the Nova API is going to send some notifications saying that it's been asked to create a new instance, and as the scheduler picks that up, it's going to send some notifications too. So I'm going to run that now. We'll see a couple of updates to the search results. The first one will be that we get a server in the build state, and we see that at the top there. A few seconds later, the scheduler is going to create that server, and it goes to the active state. So that new server has now been indexed by Searchlight. I've got a little script in the bottom window that runs a search against Searchlight with the name of the server, just so you can see what the indexed data looks like. It's pretty similar to what we get back from the Nova API, so if you've already got systems that are used to consuming that data, it's going to look very familiar. It's a JSON document containing all the information we can get out of the Nova API. As I mentioned, update operations also send out notifications, so I can rename my server.
I'll call it High Mitaka, and we'll get another notification from the Nova API that that's happened. We see the results change up there. Finally, I'm going to tidy up after myself and delete this server. The scheduler again will send a notification once it finishes doing that, and we'll see my new server disappear from the results. If you look at the tenant ID column up there, this search is running as an administrator, so it has access to all of the resources that are currently in Nova. Just as a demonstration of the access control Travis was mentioning earlier, I'll run exactly the same search as the demo user, and we see that we just get results back that are owned by the current user. So we're able to limit results based on what role you have in the cloud.

The other thing I want to talk about briefly is the query language that we're using. If you're familiar with Elasticsearch, this should look familiar to you. In my scenario, which is similar to what Travis was talking about earlier, I'm looking for images that have MySQL preloaded on them. My naive query is just to run a search with MySQL image. This is the kind of query that Travis's UI is sending off, so this is kind of the back end of that. We see that I get a load of results back. Some of them look relevant. There's a Glance image in there. Some of them maybe not so much. The metadata I maybe don't care about. If I've got a bit more knowledge, I know that I'm looking for Glance images, so I can tell Searchlight to limit the results to Glance images. I also know that our administrator tags images with the software that's running on them, because she's a nice guy. So I can say that I'm looking for images with a tag matching MySQL. This time I get far fewer results back. The max score indicates that the results match my search terms much better, and in this case, probably both of those results are relevant to what I'm looking for.
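The two searches just described might look roughly like the request bodies below. This is a sketch under assumptions: the queries themselves are standard Elasticsearch (`query_string` and `match`), but the top-level `type` key used to restrict the resource type and the `tags` field name are assumptions about the request shape, and the helper function names are made up for illustration.

```python
import json


def naive_search(text):
    """Free-text search across every indexed resource type."""
    return {"query": {"query_string": {"query": text}}}


def tagged_image_search(tag, resource_type="OS::Glance::Image"):
    """Restrict results to one resource type and match on a tag.

    The top-level "type" key and the "tags" field are assumed names.
    """
    return {
        "type": resource_type,
        "query": {"match": {"tags": tag}},
    }


naive = naive_search("MySQL image")
refined = tagged_image_search("MySQL")
print(json.dumps(naive, indent=2))
print(json.dumps(refined, indent=2))
```

The first body matches the words anywhere in any document, which is why unrelated metadata shows up; the second narrows both the resource type and the field being matched.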
And as Travis went into, you can get pretty complex if you know what you're doing. In this case, we have a query where we're looking for Nova servers. We want to match all three of those terms in the must clause at the top, and we want at least two of the terms in the should section at the bottom. Elasticsearch itself supports a huge number of operators, and Searchlight lets you use any of those as you want. The takeaway is that we're exposing the Elasticsearch API as well as we can whilst adding access control and results filtering. If you just want to send simple queries, you'll get decent results. If you really know what you're doing, you can send very specific queries and really tune those results. And as Travis mentioned, Searchlight's been an open project from the beginning. It was in Glance very briefly. Now it's an OpenStack project, and it's also extensible. We're planning to provide support for as many OpenStack resources as we can, but if you had your own that you wanted to provide in a deployment, that would be something you can do. I'm going to hand over to Lakshmi now, who's going to talk in a little more detail about how that plug-in system works. After you.

All right, thanks, Steve. So, Searchlight plug-ins. We have seen Travis show how easy it is to search across resources, and we have seen how it works. The heart of all this is not any specific code within Searchlight; it's all done through the plug-ins. Anything specific to a service is in its plug-in, and Searchlight provides a platform with a simple interface that you can interact with. So let's look at the Searchlight plug-ins. They're based on stevedore plug-ins, so if you've already worked on OpenStack, it's pretty easy and straightforward, and even if you haven't, it's very easy to write a new plug-in for Searchlight. So let's look at it. These are the different components of what you would write in a new plug-in.
What we have right now, as of Liberty, is plug-ins for Glance, Nova, and Designate, and in the upcoming release I know we are looking at several more plug-ins. But if there is something that's not already there in Searchlight, it's very easy to write. It just takes a few hundred lines of code.

The first main part is the search API, the one that you saw in the UI, so that's the simple interface. What a plug-in implements there is a pre-query filter and a post-query filter. We'll talk a little bit about them. The common thing across OpenStack is RBAC, and every service has its own unique RBAC. With the pre-query filter, you can specify your own RBAC mechanism as an Elasticsearch query. Before any query is executed, Searchlight will inject that particular RBAC mechanism, so that you get exactly the same behavior from Searchlight as from the service itself. With the post-query filter, maybe you want to strip off some data and make the result look exactly the way your service would send it back. Take Glance, for example: it has property protections. Based on who the user is, you want to remove some properties. That's what you would write in a post-query filter.

We have seen the demo of the indexing. That's what the bulk index handler does; that's your initial indexing. What you do is define the mapping, similar to a database mapping. You use the Elasticsearch DSL and specify the mapping that is specific to your service. And then you specify how you fetch the data and index it in Elasticsearch.

And third is the notification handler. Once you have your data inside Elasticsearch, you want to keep it up to date; you want to keep it synchronized. As you've seen Steve show, when you create a new Nova instance, you want to get that data into Elasticsearch.
So what the notification handler does is listen to all the events coming from the different services. The plug-in developer writes it to say, these are the events I want to subscribe to, and once you get those events, you enrich the data, map it according to your service, and then put it back into Elasticsearch. That's how you keep it up to date and synchronized.

So let's take a quick peek at what it takes to write a new plugin. We have a base API that defines all the hooks. For the input and output filters we just talked about, what you see is the get RBAC filter method. That's where you would give an Elasticsearch filter. Once we're done with this, we'll take a look at a sample on the next slide. And the filter result method is where you specify what data needs to be stripped off, or maybe you want to add more, based on how your service's output looks.

The second set of methods that you're looking at is the index handler methods. Here you're looking at a sample for OS::Glance::Image. That's what declares that this plugin is for this particular type. A plugin can have more than one type, so Glance can have an image resource or you could have Glance metadata. A single plugin can support multiple types. The key is get mapping. This is where you specify the data that you have. So for a Nova instance, you would say, okay, this is the name that I want to get in, and it has to be a string or an integer, or you can go really complex on it. But for most use cases, it ends up being 10 or 15 fields and very straightforward. Then serialize: once you have the mapping defined, you want to get the data in from your service. So you would read the data from your service. Maybe you do a REST API lookup or any other mechanism.
So if it's Glance, make a REST API lookup to get the data in, and then convert it into a format that you can put into the index as per the mapping. You would do that in your serialize. The third set of methods that we see here is the notification handler. That's what specifies which events you support. And then, similar to what you did in serialize, for any new events that come in, you update the data, modify it, and put it back into the index so that it stays in sync.

So let's take a quick look at it. The pre-query RBAC injection is what we were talking about, and this RBAC mechanism should be similar to the way the service does it. Take Nova or Glance, for instance. Take the filter on the other side. That's the query that you give as part of that method. You're saying that I want to restrict results to this particular project or this particular tenant, so that if a non-admin comes in, he only sees the images that he's supposed to see, not everything. It's similar for the admin. And this is actually a subset of what we have for Glance. It's just three lines, or three sets of filters, that we have for Glance. So it's pretty straightforward and simple.

Going down, you look at the post-query filter. Some fields you don't want to show, so you can remove them. For non-admin users, you don't want to show some protected properties, so you can strip them out here.

Next, the bulk index handler and get mapping. This is similar to a DB API. All you specify is the type, and if you want, you can say not analyzed or analyzed. But most of the time, it's simple and straightforward. What you see is a subset of the Glance mapping that we have working right now in Liberty. We just stripped the fields down a little bit for the demo here. Then serialize. That's where you would probably use a Python client to go and get the data from your service and extract it.
And for the initial load, you get all the images in this case and then you index them in Elasticsearch. I think most other services would be pretty much the same. Get the data, serialize it, and you're done.

Coming to the handler, it declares which events it's going to support. You don't have to support everything that comes out of a notification. All you care about is maybe the create, update, and delete events, right? So you get those events, you get the payload, and then you serialize it into Elasticsearch. It's similar to the bulk handler, just that it is specific to a particular event, like a Nova instance creation or a Glance image metadata update, something like that.

So let's see. How do you deploy Searchlight? You can have a Searchlight service deployed in each region. If you have Searchlight in your region, then from Horizon or any other client you can use Searchlight, get the data back, and display the results. So what if you don't have Searchlight in that particular deployment? You can still take the alternative route, the way you do it right now. Go list all the instances from Nova, or go list all the images from Glance. So we have an option. If you don't have it, take the existing route. Otherwise, you're in great shape: you can use all the advanced capabilities that Searchlight provides.

So how do you scale it? There are two main services within Searchlight. One is the API service, which gives you the search results back. The other is the data enrichment or listener service. You can deploy them separately. You can have as many API services as you want, and there's no context or anything like that. You could put them behind a load balancer, or you can go directly against the Searchlight API service. If it's a small environment, you may just want one API service.
Now for the listener services: they're based on Oslo messaging, so they listen to the events there. You can have as many listener services as you want, separate from the API services. If you have a lot of events coming in, maybe you want to have more listener services deployed in your environment. And even within each listener service, you can have more workers if you want to really scale it up. And the Elasticsearch cluster could even be the existing Elasticsearch that's already in your environment. You could just use it. Searchlight protects it by having its own index, so it only gives you results from what has been indexed through Searchlight. It doesn't let you access anything else that's already in Elasticsearch. So if you want to use an existing one, yeah, go ahead and do it. So that brings us back to the overview, and I'll give it back to Travis.

Okay. Is this on? There we go. So where are we now and where are we going? You saw what we have in Searchlight. We have released it. We're calling it a technical preview release, although you can put it into play. We're actually asking people to take it and play with it and give us feedback. It gives you all the things we just showed. We have plugins for Nova server instances, Glance images and metadata definitions, and Designate DNS domains and records. For Liberty, we do want to give a shout-out to everybody who helped contribute in terms of reviews. Those are different people from different companies who helped contribute. For commits, we had quite a few different people commit. So fanboys here, it helps. Thank you very much.

For Mitaka, our number one priority is really about getting more content. We have various plugins that are going to be under development. We would encourage anybody to go and build your own plugin for your own stuff.
So if you're on an OpenStack service, or even for your own private service, go ahead and build a plugin and try it out. We're looking at more cross-region and cross-project searching concepts. We have a design session on that tomorrow, so feel free to come to that. We're looking at doing zero-downtime re-indexing. If you have to completely re-initialize the entire index, there's a mechanism in Elasticsearch that we're going to build on top of to make it as easy as a command-line button push, with zero downtime whatsoever. We're looking more at API versioning, with Nova and their microversioning, to make it more configurable which API version you get. And we want to work on improving the notification data from all the various services that we integrate with.

So we do need your help. We'd love it if you could take Liberty or just take master for a spin, try out the Horizon plugin, and give us feedback, feature requests, contributions, anything. We definitely will take any input or help that we can get. You can join us under Searchlight on the OpenStack dev mailing list, in the OpenStack Searchlight channel on Freenode, of course, and we have a weekly meeting listed on the wiki. So please join in. So that's it. Thank you very much. Do we have any questions? Mic? Do we have a mic?

Relax me. Thanks, guys. Thanks, Brad. Good presentation. I was wondering if you guys have any performance numbers at this point. For example, doing a Nova list with Searchlight as compared to the APIs.

Yeah, we did do some. So if I run all the way back here, the red there is Nova, in milliseconds, showing how long it took to do a wildcard search. The green is the Searchlight response time on the exact same cloud. So for the wildcard search, I believe we were somewhere around a 60 millisecond response time and Nova was up closer to 450. And then for just doing a complete list-all of what we had there, there are your results.
So that was the basic testing. We intend to do more with it. So is that what you're looking for?

Yeah, exactly. Sorry, I missed that previously.

No worries. Anything else?

Hey, you're talking about how you process update notifications, right? When something changes. Can you search on the historic data?

We do not do time series data. It's current state only right now. For time series, you're going to be looking more at what Ceilometer or maybe Monasca is collecting.

Yeah, I was thinking more like, can I find all the instances that were called something, right? You've made a name change or something like that.

We haven't done that concept. I mean, there are ways that we could do that potentially. That's not our primary initial goal.

Sorry, I'm late to the call. Sorry, late to the presentation. But did you cover how you are going to divide it up into tenants? Can tenants use it? And if tenants can use it, how are you going to divide up which VMs belong to which tenants when you're searching?

I think you asked about multi-tenancy. That's actually one of the primary value adds that Searchlight provides. In addition to doing the indexing and having the common API, we do per-tenant searches. It's in the demo video, and I'll be happy to show you afterwards since we're running out of time here. If you're logged in as an admin, by default you're going to get the project you're on, and as an admin you can say, give me all data. If you're not an admin, or you're just a specific tenant, you will get the data for the project that you're scoped to. And we do pre-filtering and post-filtering of the data to ensure that. So this is definitely not intended as just an operator service. It's intended to be used by both project users and admins.

Okay. All right, well, thank you very much. We appreciate you coming. Thank you.