OK, I'll take it away. Basically, I'm the Ceilometer PTL, as Allison said, and I'm going to provide an update on what we're planning to do over the Kilo release cycle. So, next slide please. We'll start with just a little reminder. On slide two, I've got the mission statement that we agreed with the Technical Committee over the previous cycle, and this provides a fairly terse summary of what we're all about in Ceilometer. I've divided it up on the slide into the three phases of what we do. The purpose really is to surface insight into what's going on in your cloud. In order to do that, we need to collect measurements of the physical and virtual resources that comprise your deployed cloud: how they're being used, how they're performing, and who's actually using them. Once collected, in order for these data to be put to use later on down the line, we need to persist them so that they can be subsequently retrieved and analyzed. And we also need to be able to trigger actions when some criteria are met. A really good example of that is Heat autoscaling: a mechanism whereby the membership of a group of instances can be dynamically adjusted according to the trend in usage observed for those instances. That's an example of an action being triggered by data collection that Ceilometer is doing, and the actual triggering of the action is driven by a feature of Ceilometer called alarming. So, moving on to slide three: that's basically our mission, and the next thing is how we're applying that mission over this release cycle. Well, here are the things on slide three.
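To make the alarming idea concrete, here's a minimal sketch of how a threshold alarm driving autoscaling might be evaluated. This is purely illustrative (the function and its names are hypothetical, not Ceilometer's actual alarm evaluator):

```python
# Illustrative sketch of threshold-alarm evaluation (hypothetical names,
# not the actual Ceilometer alarming code).

def evaluate_alarm(samples, threshold, comparison='gt'):
    """Return 'alarm' or 'ok' based on the average of recent samples."""
    if not samples:
        return 'insufficient data'
    avg = sum(samples) / len(samples)
    exceeded = avg > threshold if comparison == 'gt' else avg < threshold
    return 'alarm' if exceeded else 'ok'

# e.g. average CPU utilisation over the last three periods vs. a 70% threshold
state = evaluate_alarm([82.0, 75.5, 91.2], threshold=70.0)
# an 'alarm' state would then fire an action such as the webhook
# that Heat uses to scale the group out
```

In the Heat autoscaling case, the alarm transitioning into the alarm state is what triggers the scaling action on the group.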
I've given a kind of high-level laundry list of the things that we spent our time in Paris talking about. When we weren't doing touristic stuff (and we weren't travelling around looking at the Eiffel Tower much), most of our time was spent in deep conversation at the design summit, where we discussed our plans for the upcoming Kilo release cycle and came out with the set of prioritized themes that we're going to attack over the next six months. The highest-priority thing, the thing we're really focusing on, is completion of this new time-series-data-as-a-service project, which we're calling Gnocchi, and the migration of Ceilometer to use that as its metric storage layer. That's really going to consume a lot of our attention, and it's been a multi-cycle effort. I'm going to talk in detail about what it comprises, how we're doing it, what the status is, and what the remaining work looks like over Kilo. In addition to that, we've got other parallel efforts going on, and I'm going to provide a little bit of detail on each of these in this update. Now, one thing that's quite important, happening across the board in OpenStack over Kilo, is the idea of taking a lot of the test coverage that was previously provided by Tempest, which is an integration test suite that spins up a lot of services and tests the interactions between those services, and moving much of that coverage so that it now lives within the individual trees of the individual services, along with some other planned testing improvements that fall under that category of providing the right kind of test coverage. I'll give a bit more detail on that in a few minutes. The other thing we want to do is to make the segregation between tenants, and the role-based access control that enforces it, richer and more flexible in its configuration in Ceilometer.
We want to reach a point at which the notifications that the other services emit (services such as Nova, Glance, Neutron, Keystone, etc.), which are currently consumed by Ceilometer, become as contract-based and as stable as a true API. That will require some work on the consumer side, in consumers such as Ceilometer and related projects such as StackTach, but also a lot of change on the emitter side, so it's a very significant piece of work. It's something that's been mooted many times, oft-discussed but never actually delivered upon. What we're really trying to do over Kilo is finally put that issue to bed and promote notification-based interactions to first-class citizens in the OpenStack world. We also have a number of improvements on our roster around deployment flexibility: ways in which the complexity of actually deploying the set of Ceilometer agents over a non-trivial deployment of OpenStack (not just a single-node DevStack or something small like that) can be reduced and the flexibility improved. We also have a few improvements around how we process events. Currently Ceilometer has two different ways in which data can be acquired. One is active, going out and grabbing the data: a polling model. The other is more passive, sitting back and allowing the services to tell us what's going on via notifications delivered over the oslo.messaging bus. And we've got a number of ways of improving how we actually process those notifications. Lastly, I've got a line item that's pretty much assigned to me, and that's collaboration with a very interesting related project in the OpenStack ecosystem called Monasca. Monasca is all about monitoring at very large scale.
It's a project that grew out of an effort within HP and now involves contributions from several different large OpenStack companies, including IBM and Rackspace. It's somewhat complementary to what Ceilometer does, but there's some commonality there as well, so we're going to explore how we can collaborate with the Monasca folks and, where our interests align, achieve things together. So let's drill down in a bit more detail into each of these themes. Firstly, this time-series data as a service, this Gnocchi project: what's it all about? To recap (and this is actually very similar to a slide that I used in the PTL update for Juno), the goal here is to provide efficient metric storage. You may well ask: Ceilometer is obviously involved in storing metrics currently, so is that inherently inefficient? Well, there are a couple of drawbacks to the approach that Ceilometer currently takes. One of those is the fact that we snapshot resource metadata alongside each individual data point that we store. That provides an awful lot of flexibility, and it also provides a very good record of the evolution of the resource state, the timeline of that evolution. But the downside is that much of this metadata is either static or very rarely changing, so there are much more efficient ways of storing this near-static information than continually snapshotting it alongside each individual data point. The other thing we want to change, the approach we want to upend, is that classic Ceilometer (to coin a phrase) is predicated on the idea of all aggregation being done on demand. If you issue a statistics query with a certain bucketization, for example per hour, that aggregation is done on demand, on the fly, for each individual query. And if the same query is issued a second later, it's done all over again.
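As a toy illustration of the metadata-snapshotting cost described above (this is not the real Ceilometer schema, just a sketch of the two layouts), compare repeating the metadata per sample with storing the resource record once:

```python
# Toy comparison (not the real Ceilometer schema): the classic layout repeats
# the resource metadata with every sample, while a normalized layout stores
# it once and keeps each data point as a bare (timestamp, value) pair.

metadata = {'flavor': 'm1.small', 'image': 'cirros', 'host': 'compute-01'}

# classic style: metadata snapshotted alongside every data point
classic = [
    {'timestamp': t, 'value': v, 'resource_metadata': dict(metadata)}
    for t, v in [(0, 0.42), (60, 0.55), (120, 0.61)]
]

# normalized style: resource stored once, points stored bare
resource = {'id': 'instance-1', 'metadata': metadata}
points = [(0, 0.42), (60, 0.61), (120, 0.61)]

# the repeated metadata dominates the per-point cost in the classic layout,
# and the gap widens with every additional sample
classic_fields = sum(len(p['resource_metadata']) + 2 for p in classic)
normalized_fields = len(resource['metadata']) + 2 * len(points)
```

With only three metadata keys and three samples the saving is already visible; with tens of keys and millions of samples per resource, it is the difference that matters for scalability.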
So what we want to do is shift to a model whereby we eagerly pre-aggregate and roll up these data as they're being ingested. Those are the key differences, really, between classic Ceilometer and how we view this time-series-data-as-a-service approach panning out. It's been implemented initially at arm's length from Ceilometer, via a project that was spun up on StackForge by my predecessor as PTL of the Ceilometer project, Julien Danjou. It was envisaged from the get-go as a multi-cycle effort that would play out over Juno and Kilo, so that at the conclusion of Kilo we'd be ready to migrate Ceilometer core to using Gnocchi as its primary metric storage layer. Currently there's a canonical analytics and storage engine provided by Gnocchi, based on the Pandas analytics library, a commonly used data analytics library in Python, using Swift as the storage backend. But our intent is also to provide alternative storage drivers, following the standard OpenStack pluggable model. The idea is that we'll have alternative drivers based on other specialized metrics-oriented databases such as InfluxDB and OpenTSDB; those are the two we're currently actively working on. So that's a recap of where Gnocchi came from. Where is it at currently? What's the status? Moving on to slide five, I've listed a laundry list of what we've achieved so far and what's currently in flight, in terms of fairly coarse-grained functional areas of Gnocchi. Completed, we've got the core metrics and resources API: a REST API in the typical OpenStack style, which is going to form the kernel of the V3 Ceilometer API. We also have a single storage driver that's completely built out, based on Pandas and Swift as I said. We've also got the aggregation model: a policy-based model that determines what level of aggregation and roll-up occurs for each individual metric.
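The eager pre-aggregation idea can be sketched very simply (a hypothetical toy, not Gnocchi's implementation): as each point is ingested, the running aggregates for its time bucket are updated, so a later statistics query reads pre-computed values instead of scanning raw samples:

```python
# Toy sketch of eager roll-up on ingestion (not Gnocchi's actual code):
# points are folded into fixed-granularity buckets as they arrive.
from collections import defaultdict

GRANULARITY = 3600  # roll up into one-hour buckets

buckets = defaultdict(lambda: {'count': 0, 'sum': 0.0, 'min': None, 'max': None})

def ingest(timestamp, value):
    """Update the running aggregates for the bucket this point falls into."""
    b = buckets[timestamp - timestamp % GRANULARITY]
    b['count'] += 1
    b['sum'] += value
    b['min'] = value if b['min'] is None else min(b['min'], value)
    b['max'] = value if b['max'] is None else max(b['max'], value)

def mean(bucket_start):
    """Answer a statistics query from pre-computed aggregates: no raw scan."""
    b = buckets[bucket_start]
    return b['sum'] / b['count']

for ts, v in [(10, 1.0), (1800, 3.0), (3700, 10.0)]:
    ingest(ts, v)
```

Repeating the same query is then cheap, because the aggregation work was paid once at ingestion time rather than on every query.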
It's a fairly rich model, because you can attach different policies to different metrics and have different levels of expiry and time-to-live for different metrics. We've got a dispatch mechanism in place within the Ceilometer collector service, and this is an API-based model: by that I mean that the dispatcher uses the REST API to push the metric data points up to the time-series-data-as-a-service API. The intent was for that to be the proof of concept, the proof that Ceilometer data could be stored in and retrieved from this new Gnocchi service. But as I'll mention later, we've got an alternative, more efficient dispatch model in mind, where the Gnocchi storage driver effectively runs in-process within the Ceilometer collector service. We also have the Keystone integration piece all done: the typical call-out to the authentication layer, and also the out-of-the-box role-based access control policies. In progress, we've got our two alternative storage drivers. One is based on OpenTSDB, a moderately widely used metrics-oriented database built on HBase, so it's the type of thing one would run over a Hadoop cluster. The other is InfluxDB, a very interesting database implemented in Go that is, again, specialized for metric storage, provides native features around downsampling of data points, and is highly optimized for this particular type of storage problem. We've also got some logic around custom aggregation, whereby pluggable components can provide additional aggregation logic. Clearly we're going to have our basic aggregation functions defined: things like maximums, averages, sums and counts, and even more exotic things like standard deviations. But sometimes you want to go even further than that.
So we've got a custom aggregation layer in progress, with a number of example applications of this, including moving averages and also Holt-Winters forecasting. Another thing that's in progress, which is quite interesting and will be crucial for our alarming integration, is the ability to do cross-entity aggregation. By that I mean the ability to aggregate data points that have originated from different resources, so you can say either "give me the CPU utilization for this individual instance" or else "give me the average CPU utilization over a group of related instances", for example the instances that make up an autoscaling group as defined by Heat. And lastly, as I mentioned earlier, we've got in progress an effort towards providing an alternative dispatch model from the Ceilometer collector service that's in-process as opposed to API-based. In that case the collector would load the Gnocchi storage driver and call out directly to InfluxDB or Pandas-plus-Swift or OpenTSDB or whatever it is, as opposed to making an invocation on the REST API. So that's where we're at as far as Gnocchi is concerned. What additional work have we planned for Kilo? The crucial things that need to be achieved before we can declare victory: we need to recast the V2 query API, which allows you to slice and dice Ceilometer data in many different ways, over Gnocchi semantics, so that it takes account of the restrictions imposed by this model of not snapshotting resource metadata alongside each individual sample data point. And remember, having that restriction in place is key to making the Ceilometer metric storage strategy massively scalable. We also need to rebase the Ceilometer alarm evaluation logic on the V3 metrics API, which will effectively be the retrieval API that Gnocchi supports.
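Cross-entity aggregation as described above could be sketched like this (illustrative only; in reality the grouping of instances would come from the Heat autoscaling group definition, and the data from the storage driver):

```python
# Toy sketch of cross-entity aggregation (hypothetical data layout):
# the same series store answers per-instance or group-wide queries.

series = {
    'instance-1': [40.0, 50.0, 60.0],   # CPU utilisation samples (%)
    'instance-2': [20.0, 30.0, 40.0],
    'instance-3': [90.0, 80.0, 70.0],
}

def avg(values):
    return sum(values) / len(values)

def instance_avg(instance_id):
    """Average CPU utilisation for a single instance."""
    return avg(series[instance_id])

def group_avg(instance_ids):
    """Average CPU utilisation across a group of related instances,
    e.g. the members of a Heat autoscaling group."""
    return avg([v for i in instance_ids for v in series[i]])
```

It's the group-wide form that alarming needs for autoscaling: the alarm is evaluated against the aggregate over the whole scaling group, not against any single instance.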
We also need to provide logic to allow customers to migrate pre-existing measurement stores. So for a customer or user who has had a deployment of Ceilometer running for some time using the classic Ceilometer storage layer, we need some way of capturing that data, extracting and distilling it, and presenting it as equivalent data in the pre-aggregated, rolled-up form that Gnocchi actually stores. And lastly, in order to make clear the benefits of using this alternative storage strategy, we've already done quite a lot of profiling, which has identified a lot of potential optimizations that we're going to attack over Kilo. Finally, we're going to publish all of these results and indicate the potential benefits, so that people can be very sure before they switch from classic Ceilometer to using the Gnocchi storage layer as an alternative. Now of course, the classic Ceilometer way of doing things will be maintained on a deprecation path, and the standard in OpenStack would be for something like this to be maintained for at least two cycles going forward. So it's not something where we're going to pull the rug out from under existing Ceilometer users: there will be a long transition phase, and people will be given plenty of notice, but of course it's always good to have a very solid case for early adopters to move forward, and that's what we intend this detailed performance profiling effort to provide. In addition to that, we've got a bunch of other parallel efforts that are unrelated to the Gnocchi project. I spoke about these in fairly broad terms; I'll give a little bit more detail now. In terms of testing, what do we want to move to? Well, we want to follow the trend that's happening right across OpenStack: moving a lot of our coverage out from the global Tempest test suite into functional tests that are more directly associated with our project repo.
We're calling these in-tree functional tests, in the sense that they live alongside the actual code being tested, in the same Git repository. This will allow us to be less restricted in how these integration tests are constructed. A good Tempest test is a pure black-box test: it doesn't assume anything about the underpinning implementation. Whereas we want to be able to test things in more of a grey-box fashion: we want to be able to make assumptions about the internal implementation. We want to be able to write tests that do things like accelerate the metrics-gathering cadence that Ceilometer uses, so that the test can complete in a reasonable time. For example, we want to change some configuration option that causes a Ceilometer agent to gather data points at a much faster rate than it normally does, and then make some assertions about the metric data points that have been gathered, all within a reasonable time. We also want to produce a set of API tests that are much more declarative in form and do dual duty: you can read the tests unencumbered by a lot of complex Python code, so that as well as providing test coverage, they provide an almost organic form of documentation of how the API is intended to be used, and of the form of requests and responses, in a very high-level and digestible form. And lastly, we want to use the Rally test framework to introduce scenario tests. These are tests that induce some workload over Ceilometer and then measure the results, in effect. That will allow us to measure performance improvements in a very convenient way, and also to watch out for performance degradations where they occur and catch those early. Another area that we're going to dig into is this whole configuration of segregation between tenants.
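To give a flavour of what "declarative" API tests mean here, the sketch below expresses each test case as pure data, with a tiny generic runner. This is a hypothetical illustration (the endpoints, fields, and stub client are invented for the example, not the real test framework):

```python
# Hypothetical sketch: each API test case is pure data, so the suite
# doubles as readable documentation of expected requests and responses.

api_tests = [
    {'name': 'list meters',  'method': 'GET',  'path': '/v2/meters',
     'expect_status': 200},
    {'name': 'create alarm', 'method': 'POST', 'path': '/v2/alarms',
     'body': {'name': 'high-cpu', 'threshold': 70.0},
     'expect_status': 201},
]

def run(tests, client):
    """Run each declarative case against a client callable; record pass/fail."""
    results = {}
    for case in tests:
        status = client(case['method'], case['path'], case.get('body'))
        results[case['name']] = (status == case['expect_status'])
    return results

# a stub client standing in for a real HTTP client, for demonstration
def fake_client(method, path, body=None):
    return 201 if method == 'POST' else 200

outcome = run(api_tests, fake_client)
```

The point is that a reader can understand the intended request/response contract from the data alone, without wading through imperative test code.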
Up to now there's been an all-or-nothing model in Ceilometer, between users who have the admin role and users who are non-admin. Admin users can see all; they're omniscient, and basically everything is visible to them as far as Ceilometer is concerned. Whereas normal users can only see, reason over, and alarm on the data associated with the resources that they themselves own. So while that's useful in terms of tenant segregation, it's very absolute. We want a much more nuanced model, using the role-based access control mechanism that OpenStack provides, and we want to start leveraging the more forward-looking features of Keystone, including this idea of domains: the notion that the administrative role isn't necessarily something global, but something that can be partitioned between different related groups of users via domains. There are also a number of different improvements around deployment flexibility. We're going to merge the very similar central and compute agents into a single polling agent that can be run in a mode similar to the current central agent or compute agent, or can do dual duty; that will make small deployments simpler. We want to be able to centralize the storage of the pipeline configuration, which is currently stored in a flat file that needs to be rolled out to each individual node: the pipeline.yaml that existing Ceilometer users will know and love. We want to allow that to be centrally stored so that it can be changed in a more global fashion across large deployments. Another thing that will make deployment simpler is the idea of allowing the metrics gathered over SNMP to be driven purely by configuration, so that you can decide which SNMP metrics you're interested in and change those on the fly very easily.
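The kind of richer, domain-aware access check described above might look like this toy model (a sketch of the idea only, not the real OpenStack policy engine): "admin" grants visibility within the caller's own domain rather than globally.

```python
# Toy model of domain-scoped RBAC (not the real policy engine): admin
# visibility is confined to the caller's domain, instead of the old
# all-or-nothing model where any admin sees everything.

def can_view(caller, resource):
    """A caller sees a resource if they own it, or if they hold the
    admin role within the same domain as the resource."""
    if caller['user_id'] == resource['owner_id']:
        return True
    return 'admin' in caller['roles'] and caller['domain'] == resource['domain']

alice = {'user_id': 'alice', 'roles': ['admin'], 'domain': 'dept-a'}
bob = {'user_id': 'bob', 'roles': ['admin'], 'domain': 'dept-b'}
vm = {'owner_id': 'carol', 'domain': 'dept-a'}
```

Here alice, as a dept-a admin, can see carol's VM in dept-a, but bob, despite being an admin, cannot, because his administrative scope is a different domain.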
On to the smaller parallel efforts. We want the notifications emitted by services such as Nova, Neutron, Glance, etc. to be schematized: currently these are essentially just free-form dictionaries, with no contract involved and no stability. So we want to go ahead and schematize all of those, and we're doing that as a joint effort with some folks from the StackTach project, a related project in the OpenStack ecosystem that is, alongside Ceilometer, probably the primary consumer of notifications. We're going to improve our events pipeline so that the events database is completely split off from the metrics database: you can decide to store your events, for example, in MongoDB, and store your data points in Gnocchi or in HBase or any of the other storage drivers supported by Ceilometer. We're also going to use the same model of coordination between scaled-out notification agents (this is the agent in Ceilometer responsible for consuming events) as we currently use for the central agent and the compute agent. There we want to provide a mutually exclusive partitioning of work between individual agents, such that they don't step on each other's toes and don't duplicate work, but no work falls through the cracks either, and in a way that takes account of the fact that the pool of currently running agents can change over time. And lastly, we've got a couple of line items around collaboration with the related Monasca project. I'm currently working on figuring out whether we can reuse or somehow leverage their anomaly detection engine, which is a very interesting component in Monasca that we don't have a direct analogue of in Ceilometer. We also share a common concern around InfluxDB: they're using InfluxDB as one of their storage options, and we intend to do so for Gnocchi as well, so we have common interests, for example, around getting InfluxDB into the continuous integration gate upstream.
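The mutually exclusive partitioning of work between agents can be illustrated with a simple deterministic hash-based assignment (a sketch of the general idea under assumed names, not Ceilometer's actual coordination code, which relies on a group-membership layer):

```python
# Sketch of deterministic work partitioning across a pool of agents.
# Every agent computes the same assignment independently, so each
# resource is handled by exactly one agent: no duplication, no gaps.
import hashlib

def owner(resource_id, agents):
    """Map a resource to exactly one agent in the current live pool."""
    digest = hashlib.md5(resource_id.encode('utf-8')).hexdigest()
    return sorted(agents)[int(digest, 16) % len(agents)]

agents = ['agent-1', 'agent-2', 'agent-3']
resources = ['vm-%d' % i for i in range(100)]
assignment = {r: owner(r, agents) for r in resources}

# if an agent leaves the pool, recomputing reassigns its share of the
# work to the survivors, so nothing falls through the cracks
survivors = ['agent-1', 'agent-2']
reassigned = {r: owner(r, survivors) for r in resources}
```

The key property is that the partitioning is exhaustive and mutually exclusive by construction, and it adapts when the pool of running agents changes.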
And lastly, Monasca are using Apache Kafka as their high-throughput internal messaging bus. That's something that Ceilometer could learn from: we use oslo.messaging for both external and internal messaging, and Apache Kafka provides several benefits around throughput and scale, etc. So, my final slide here is just emphasizing that we're available: anyone who has any further questions, or wants to go into more depth on any of these topics, can reach out on IRC on our #openstack-ceilometer channel, or you can ping me directly over IRC as eglynn on Freenode. My home time zone is GMT, so take that into account if you want to chat over IRC. We have our weekly meeting at 1500 UTC on Thursdays, so you can join that if you want to go deeper on any of these discussions, and of course there's always the openstack-dev mailing list. So that concludes my update.