I guess we should. So, I am not Nick Weaver. You might have seen in the program that Nick was going to come and talk about telemetry, various aspects of gathering telemetry, and how to scale that gathering. Pretty late last week it turned out Nick couldn't make it, so he asked me if I would come and talk about something to do with telemetry, and I said yes, but it's not going to be quite what Nick was going to bring.

To introduce myself: I work in Intel Labs. We have a group of researchers based primarily here in Europe, and we're focused on infrastructure management and orchestration, so quite a bit of our work is in the telemetry area. We consider telemetry to be foundational to how orchestration can evolve, to making the best use of infrastructure characteristics and delivering services at scale.

To start with, some obligatory charts with lots of X's, lots of things increasing; I think there are eight X's there in total, so lots of stuff happening, more and more and more. We've all seen these pictures.

This slide is actually from Nick; it's his version of a business view of orchestration and what happens. It follows somewhat of a life cycle: you've got capacity planning, and through your capacity planning you send some configuration through to how you do workload placement and how you do rebalancing, you somehow cross-pollinate what your observations are there, and then post facto, or during, you do your billing and showback. The point is that monitoring is at the center of all of this; you're flying blind unless you have appropriate monitoring. From a Labs perspective, from our view, we have been trying to increase both the precision and the fidelity of what we monitor, but also not to create some kind of explosion of unusable data, so we're working towards metrics that are actually meaningful and make sense.

Our work is very applied, by the way. I'll talk in a moment about the balance of medium-term research against the more practical proofs of concept we do, but everything we do is driven by some kind of problem statement, and you can see what some of these are here: they're really about maximizing revenue, pushing down operating costs, and getting the best use of your assets. Cost-performance up is what it amounts to. So there are various use cases that underpin the research projects we do, as well as underpinning the trials and POCs that we do, both collaboratively and internally, directly with the business units.
So, to the question of how you do that automatically, and at that scale: of course you've got to step back, and what you see is a lot going on. This is just a small, very bounded snapshot of a small number of services running, actually from our testbed; I might be able to show a demo of some work we're doing here in a little while. The point is that when you have overbooking on the infrastructure, when you have long-lived and short-lived services coexisting, and different types of stacks set up for different kinds of services, you get quite a bit of complexity and indirection and multiple levels of abstraction. So if you're trying to see what's going on low in the stack, your PCM-type counters, and as you step up to stuff like SAR, and then up to the service-level metrics, figuring out the relationship between those is not trivial. But if you can figure out that relationship, you land on some very powerful insights. This is basically what's backing the bulk of the research we're doing; this is what we're trying to chase: how do we get those nuggets, how do we do it through automation, how do we do it at that scale. That's what we're after.

Stepping back into a more business-oriented perspective, the messaging that's coming from our partners in the business unit, the Data Center Group, on orchestration is expressed in this kind of language: you watch, you decide, and you act. Watcher, decider and actor are the functional elements that Nick would use. You're collecting data, looking right up and down the stack from the physical platform through the various virtualization layers and up to the app and the service itself, and over time you try to learn some models: establishing heuristics, establishing correlations, teasing out the true correlations and figuring out where the causalities actually are, and then you can tune your rules and your policies. Over the past few years we've been going through this process manually: gathering the relevant data, figuring out what sort of statistical analysis to do over it, and how we can infer what impacts quality of service and what impacts efficiency. More recently, now that we've got these workflows stood up, we're asking how we can start automating them. That's where we're at right now, and we're stepping through it problem statement by problem statement and workload by workload, using real workloads.
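Purely as a toy illustration of the kind of correlation-teasing being described, not the actual pipeline, here is how lining up a low-level counter against a service-level metric can already give a crude signal; the metric names and numbers below are made up:

```python
import numpy as np

# Hypothetical, aligned time series over the same window: a platform-level
# counter (e.g. LLC misses per second) and a service-level metric
# (e.g. 95th-percentile request latency in ms).
llc_misses = np.array([1.2e6, 1.9e6, 2.4e6, 3.1e6, 3.0e6, 3.8e6, 4.1e6])
p95_latency_ms = np.array([11.0, 14.5, 16.2, 21.0, 20.3, 25.9, 27.5])

# Pearson correlation: a quick, crude indication that the two series co-move.
r = np.corrcoef(llc_misses, p95_latency_ms)[0, 1]
print(f"correlation(LLC misses, p95 latency) = {r:.2f}")

# Correlation is not causation: in practice you would test this across many
# counters, services and time windows before promoting anything to a rule.
```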
So, to introduce our work in our lab, and a couple of my colleagues are in the room here by the way: this year we took what we had established over a number of years of collaborative research, much of which is through the FP7 mechanism. For anybody in the room who's involved in research in the European research area, you have these collaboration mechanisms that help universities and industry come together and align advancement of the state of the art with dealing with actual, real issues. We've been quite active in this space; over the course of the seventh framework we'll have executed, I think, 12 or 13 projects in our lab, and we're in the process of setting up some H2020 work right now. That's the medium-term stuff, research with roughly a five-year horizon. What we started doing this year was to take what we've been doing to date, plus some of our active work, and drive some integration. We have a name for it: we call it Apex Lake.

What Apex Lake actually is, is a framework and an actual functional environment for conducting PoCs and research. It integrates orchestration research that we've been involved in across several of these focus areas, and it's also a reference point for collaboration. Much of our work is collaborative: we've got many ways to collaborate with organizations, either through grant-funded consortia, or direct bilateral work, or MOUs with universities. So we've got all these things in place and we're really keen to learn from perspectives beyond our organization: what's the perspective of a software developer, or an operator, or a telco, or a PaaS provider, for example? It's not a product; I'm going to say that about five times at least, maybe even twenty, I've been asked to emphasize this. Apex Lake is not a product and never will be. It's basically a Labs initiative, and we do actually work quite closely with the product groups, jointly.

So basically it steps through this cycle of monitoring, and then figuring out what the workloads look like: can we express what's interesting about a workload parametrically, in such a way that a rules-driven system can make sense of it? What do we capture? Currently it's through constructs like what ITIL has for a CMDB; there are ways of expressing these landscapes, and we've been looking at those and adding some dimensions to them. In particular, context: there's not much out there that effectively captures the context of the infrastructure and the services, and we've been working towards that; I'll explain that more in a moment. Then, how do you reason over that, in terms of initial placement of a workload or even adjustment over time? Can you draw an inference, based on what's observable low in the stack and what's observable at the service itself, as to what might be going wrong, or where an efficiency opportunity might be untapped? And then finally we do some actuation: we trigger it back into an orchestration system and something happens. So that's what we do, and that's what I'm going to be talking about; I'm going to step my way around some of these functions. I'm happy to take questions at any time, by the way, and I'm going to try to move fast deliberately so we do have time for questions, because hopefully I'll be asking questions too; I'm here to learn. I only found out on Friday evening, by the way, that I was speaking here, hence why my profile is a bit slim; this was an unexpected change of plan.

So, start with the watcher. Lots of people have scripts in place; any system will have lots of scripts that traverse the system and do lots of SAR and top and all that kind of stuff, and that's fine. But really, you're probably not instrumenting apps you didn't write, right? So you don't know what the code is doing inside, you don't know what it's expecting to see, and you don't know what else is coexisting with it: if you've got three VMs sharing a node, you don't necessarily know what's going on in the other VMs or how they're getting on. There's a whole bunch of information that you are not looking at, and probably for good reason, because it's probably too much information. So we're digging in there, trying to extract what's actually meaningful. One piece of work that we've had ongoing for a few years now, which actually has its origins in an FP7 research project called IOLanes where we were chasing bottlenecks in a virtualized stack, is our instrumentation, which we call Merlin.
Merlin is an approach to doing fairly fine-grained, tens or hundreds or thousands of hertz, time-series capture right across the whole stack: from the application through all the layers right down to the physical infrastructure, through, for example, PCM counters. It's capturing core and uncore counters, where core is stuff like processor utilization and uncore is stuff like LLC hits and misses, TLBs and so on. When we gather all this information, and usually you gather hundreds of thousands of data points for a short five-minute experiment, we apply statistical analysis. Over the years we've identified a series of configurations of tools like PCM, together with dimensionality reduction, covariance analysis and various box and tail plots, and we've established a workflow that very quickly guides us to what's interesting about a particular service configuration. At that point we can very quickly tell where the bottleneck is and check whether it's in the right place, because sometimes it is; per the theory of constraints you should subordinate everything to your constraint, so the bottleneck has to be somewhere. But if it's in the wrong place, you're wasting money.

An example of some work on that from last year is a proof of concept we did looking at the effective performance of a pretty heavyweight analytics database on one large machine: a 40-core, one-terabyte machine with a fully in-memory database. SAR was saying it's okay, everything's at 100%, these expensive cores are all being well utilized. When we dug in, things started to get interesting. Specifically, we found that as we increased the parallelism of the queries, the latency of the queries rose linearly with it. Then we looked more closely and saw that the IPCs were quite low but stable, and we were getting interesting patterns of LLC hits and misses. What we found out was that 40% of the cycles were being spent servicing TLB misses; essentially the memory access patterns were 100% random across a four-region NUMA system, and the effective utilization was only around 60%. So the message back to the coders was: try to figure out something around memory locality or NUMA awareness and you'll get a pretty substantial throughput increase, because if you're running these databases in memory, that's pretty expensive business. RAM is expensive, a terabyte per node is pretty expensive, and it's expensive in terms of energy density and all the operational stuff that goes with it. This was new insight from the point of view of the application itself, because previously it had been hand-carved with VTune and then checked with SAR; but when you look at everything in between, the full I/O subsystem, and you can check that the bottleneck is not quite where you thought it was, you basically learn something.

That's just one example, and we've done others. We've looked at network-type workloads, CDNs, and found some very interesting, non-obvious stuff: configuration sweet spots that were actually contrary to what the system vendor was saying. We proved that out, and some of that work has gone to the ETSI ISG MANO group that was mentioned two sessions ago in here by Adrian and the guys when they were talking about NFV. So that's the metrics side.
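Before moving on from the metrics: just to make the "sub-second time-series capture" idea concrete, here is a minimal sketch of sampling one platform counter at a high rate on Linux. This is not Merlin itself, just the general shape of a high-rate collector; the 100 Hz rate and the /proc/stat source are illustrative choices:

```python
import time

def read_cpu_jiffies():
    """Total and idle jiffies from the aggregate 'cpu' line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    values = list(map(int, fields))
    return sum(values), values[3]  # total, idle

def sample_cpu_util(rate_hz=100, duration_s=1.0):
    """Collect a CPU-utilisation time series at roughly rate_hz samples/s."""
    interval = 1.0 / rate_hz
    series = []
    prev_total, prev_idle = read_cpu_jiffies()
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        time.sleep(interval)
        total, idle = read_cpu_jiffies()
        d_total, d_idle = total - prev_total, idle - prev_idle
        if d_total == 0:
            continue  # counters did not advance this interval
        series.append((time.monotonic(), 1.0 - d_idle / d_total))
        prev_total, prev_idle = total, idle
    return series

if __name__ == "__main__":
    samples = sample_cpu_util()
    mean_util = sum(u for _, u in samples) / len(samples)
    print(f"{len(samples)} samples, mean utilisation = {mean_util:.2%}")
```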
Where do we go with the metrics then? We have this info core; I mentioned this thing that started out like a CMDB, and we had a term for it, we called it a "CMDB on steroids". What we're doing is trying to capture the full stack, to capture the relationships across the various layers: across the virtualization plane, for example, and then through the layers. What are the relationships of dependency, and of noisiness, or good-neighbourliness, whatever the term is, as you traverse up and down the stack? The reason this is really important is that the level of abstraction is actually increasing. If you look at things like the Open Compute Project and disaggregation in data centers, the point comes where you could basically stand up a logical appliance based on, say, an OVF, and you don't necessarily know where all of the resource elements are coming from. So unless you have a global view, you're probably not optimized straight out of the box. We wanted to create an information representation that would allow us to extend the abstractions pretty much arbitrarily in either direction, so we can capture what the SLA is saying and roll all the way through to, at some point in the future, what's going on with a particular SSD in a particular sled in a particular tray of a fully disaggregated data center, and capture everything in between.

Yeah, in ways, yes; in ways, yes, but actually for convenience we pre-populate with a lot of stuff from the Nova database, because this is research, by the way. We don't necessarily want to just do the best we can integrating with what's here in 2014; we're trying to rethink what would be the best possible view, and then we run some experiments. One of the reasons I got quite interested in coming here today is that we're now at the point of thinking about, practically, with what's there, what's a path to making use of something like this. We don't want to come up with some other big God database that claims to know everything and says everything else has to go away; that's not the idea. What we're trying to do here is to ask: well, what if you did have full visibility? I'll talk in a moment about why we keep this somewhat outside of what already exists, but the discussion we're interested in having is: at what point do we start bringing some of these concepts towards projects that are in flight? The key point is that we capture this context, and I'll try to do a demo; let's see how the network plays out, hopefully I can give you a demo of what this actually looks like.

Then the near-final piece, which is where we start moving from decide into act, is how you reason over data like this. This is the point where we start trying to reconcile all our observations across the landscape, across how the services are behaving, and reference those against a KPI or some policy element and say, well, are we optimal or are we not? For now, what we've created is a framework that is pluggable, so we can plug in analytics workbooks; we use the term workbooks, and they can be simple Python scripts that pull in external algorithms and look for something that's interesting. The example here was on placement optimization, and what was observable in this particular case was a suboptimal situation that you can automatically detect by reconciling statistics, and you can then use that to trigger a reoptimization.
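To make that "decide" step a bit more concrete: a workbook in this sense can be as small as a script that reconciles an observed service metric against a KPI from the policy and decides whether to ask the orchestrator for a re-placement. The service names, thresholds and structure below are purely hypothetical, a sketch of the shape rather than the actual code:

```python
# Hypothetical observations, e.g. pulled from the time-series store.
observations = {
    "svc-video-transcode": {"p95_latency_ms": 480, "host": "node-07"},
    "svc-analytics-db":    {"p95_latency_ms": 95,  "host": "node-03"},
}

# Hypothetical policy: per-service KPI targets.
policy = {
    "svc-video-transcode": {"p95_latency_ms_max": 300},
    "svc-analytics-db":    {"p95_latency_ms_max": 200},
}

def evaluate(observations, policy):
    """Return the services whose KPIs are currently being missed."""
    breaches = []
    for svc, obs in observations.items():
        limit = policy.get(svc, {}).get("p95_latency_ms_max")
        if limit is not None and obs["p95_latency_ms"] > limit:
            breaches.append((svc, obs["host"], obs["p95_latency_ms"], limit))
    return breaches

for svc, host, seen, limit in evaluate(observations, policy):
    # In the real flow this would trigger a re-optimization through the
    # orchestrator rather than just printing.
    print(f"{svc} on {host}: p95 {seen}ms > target {limit}ms, "
          f"candidate for re-placement")
```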
I think the next slide will help put that somewhat in context. This is kind of the flow: we do the monitoring, we do the fingerprinting. For now we're fingerprinting fairly basically; you can't see the text here, but I can, and it's basically reads and writes along with CPU, and I think something to do with network down here. We looked at three different configurations of a service and looked at where the max-out type situations are going on. So you can experiment with different configurations of a service and, parametrically, you can actually automatically derive a parametric representation of what those differences are, which is important because then you can have automation: you can have policies and reasoners that allow you to trigger a better result. That's how it works.

Finally we have the actuation, and for now it's kind of a hack, really: we're simply intercepting the scheduler, the filter scheduler, with hints, or else we're manipulating the Heat templates on the fly, so we're not really doing anything with OpenStack code; a toy sketch of that hinting idea is shown a bit further below. I think we're still on Havana and we haven't touched a line of that code; we haven't worried about that yet, there's other work going on there. What it really comes down to for now is the fingerprinting. We know intuitively that a database is different from a transcoder is different from something else, but that in itself doesn't tell you a whole lot, because within transcoding, or within databases, there's a whole bunch of other potential complexity. So what we're doing here is looking up and down the stack at everything and asking: where's the bottleneck, is it in the right place, and is the context of this particular running service, relative to everything else that's running, the full service landscape, in a reasonable state of optimization, or should you move something out to another node, knowing that utilization might drop but you might gain a bit in terms of service performance? That's where we're at right now. There is some separate work going on on classification that this feeds into, but we're not so interested in that particular detail right now.

The last thing I had in terms of slides was this one, I think, almost. The key point here is that we've been running a series of tests all year: basic experiments on bench workloads, we've done transcoders and databases, and then we've worked with some industry partners on some network-type workloads and database-type workloads, and this would be a fairly typical configuration for something that would use something like OpenStack. The key point is that we're progressing our work, the Apex Lake work, because we want to see what happens when you can finally get visibility into everything and you have a way of automatically turning that visibility into insights that are actionable; that's what we're chasing. Each of these pieces is currently set up to coexist with and complement what can happen with an orchestration system or cloud OS like OpenStack. As I said at the start, this is not a product. Some pieces of it are interesting to the product guys, but what's interesting to us, and the conversation we'd like to start having, is: are there things going on here that could be interesting to evolving scheduler-type capabilities in OpenStack, and are there things going on here that could help move on current or future OpenStack projects, for example?
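Here is the toy sketch of the hinting idea mentioned above: given a parametric fingerprint of a workload and some per-host headroom figures, pick a host and express the decision as a hint. All names, numbers and the hint key are hypothetical; in our setup the equivalent decision is injected by intercepting the filter scheduler or by rewriting the Heat template before it is submitted, not by this code:

```python
# Hypothetical fingerprint: how strongly the workload stresses each dimension.
fingerprint = {"cpu_bound": 0.2, "memory_bw_bound": 0.7, "io_bound": 0.1}

# Hypothetical per-host headroom, derived from platform telemetry.
host_headroom = {
    "node-01": {"cpu": 0.5, "memory_bw": 0.1, "io": 0.6},
    "node-02": {"cpu": 0.3, "memory_bw": 0.8, "io": 0.4},
}

def score(fp, headroom):
    """Higher is better: headroom on the dimensions this workload stresses."""
    return (fp["cpu_bound"] * headroom["cpu"]
            + fp["memory_bw_bound"] * headroom["memory_bw"]
            + fp["io_bound"] * headroom["io"])

best_host = max(host_headroom, key=lambda h: score(fingerprint, host_headroom[h]))

# The outcome, expressed as a hint a scheduler could honour.
# "target_host" is a made-up key, just to show the shape of the output.
scheduler_hints = {"target_host": best_host}
print(scheduler_hints)
```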
So, this stuff: we're not tied to this being on a path to any particular product. We're perfectly happy if the community is interested in this way of approaching things, and then we're very interested in seeing if we've got something in common in terms of motivation and views. So that's the slides.

I could take a risk here now and try a little bit of a demo, just to quickly step through what this stuff tends to look like; the resolution is not going to be my friend here, I think. Sure enough. So this here is some time-series data from our testbed; this is live, by the way, and it's basically showing bandwidth reads and bandwidth writes. The UI is pretty primitive; this is not shiny stuff, for sure. Basically you go in and you say: for one day ago, every 10 seconds, give me system metrics of type either bandwidth read or bandwidth write, and you can see here the pattern of services that have been set up and torn down over about a 24-hour period. This is not going to be easy, but if I can find the other tab, there's some... yeah, that's harder to read, that's some utilization-level stuff, I'll skip over that.

The next one here, this one is shiny; we do have somebody in the group who's a bit of a graphics aficionado. Here's a representation of our testbed, I think it's 12 or 13 physical nodes in this particular corner, and you can see here what the dependencies are, from the service level through the virtualization down to the physical allocation. This is new, by the way; this version just got put in place about 15 minutes before we got in here, and I'm not sure exactly what the movement-type stuff denotes, that wasn't on the version that was there an hour ago, and the guys are playing around with some information visualization theory here as well. What we can do with this is go in, step in and see what's going on. I'm trying to find one that would have a suitable pattern to talk about the ways we're moving into traversing this... I give up, I'll just look at the whole thing. Right now, some work we're doing over the coming weeks is writing scripts that traverse the subgraphs, the service subgraphs here. If, for example, you have a physical node that is shared across multiple resources, from a system point of view it's interesting to know how the service-level performance is patterning out over multiple tenants, and whether that information, in context with the metrics of the node, can tell you whether you've got appropriate multi-tenancy going on or not. This is the type of contextual information we can then feed back into the info core, and over time you can really refine your policies to get the best sharing or placement across the infrastructure.
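As a rough illustration of that subgraph idea, and nothing more: the landscape can be treated as a graph from services through VMs down to physical nodes, and a traversal then tells you which services are co-tenants on a shared node and should be examined together. The node and service names here are made up:

```python
import networkx as nx

# A toy landscape graph: service -> VM -> physical node (edges point "down"
# the stack). The real info core captures many more layers and relationships.
g = nx.DiGraph()
g.add_edges_from([
    ("svc-A", "vm-A1"), ("svc-A", "vm-A2"),
    ("svc-B", "vm-B1"),
    ("vm-A1", "node-07"), ("vm-A2", "node-03"), ("vm-B1", "node-07"),
])

def co_tenants(graph, phys_node):
    """Services that have at least one VM placed on the given physical node."""
    vms = graph.predecessors(phys_node)
    return sorted({svc for vm in vms for svc in graph.predecessors(vm)})

# Which services are sharing node-07, and therefore worth looking at together
# when that node's platform metrics look saturated?
print(co_tenants(g, "node-07"))   # -> ['svc-A', 'svc-B']
```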
There's one more I might be able to show here; this is going to be hard. So again we're back in the ugly stuff. This is the analytics framework, and we've got a series of workbooks in place. I'll try to zoom in on one; it's kind of hard to do from here, I can't really see, but just to show you what some of these look like. Some are more easily interpreted than others; I'm going to try to find an easy one to interpret. This one, this is covariance, and there you see the script itself. If you're interested in knowing where there is covariance across full system metrics, a workbook can tell you that. Why might that be interesting? Well, in this particular case we set up a kind of toy workload on the bench that had a little smartphone app interacting with some Quark-type systems at the network edge, and then a basically canned application running on the back end. What the covariance tells you, and this is no surprise obviously, but it can infer it by itself with no guidance, is that as you see an increase in cell phone users accessing the app, your VM starts to need more disk space. That's not rocket science, any human knows this, but the point of this work is to take the stuff that any human could figure out and look at millions of those scenarios in real time, and be able to trigger adjustments preemptively. So that's an example of how a workbook can be executed; we've got many others, and we've made use of external machine learning algorithms as well.

So, back in PowerPoint land: where are we going next with the integration work? There's some more past research that's still on our shelf. We did quite a lot of work on SLAs some years ago: machine readability, SLA management stacks, service and SLA awareness right up and down the stack. That's kind of ready now to start popping in and integrating with this work. We'll probably move on to some classification work; as I mentioned, there is work happening elsewhere in the company on classification and we're working with those guys, so we may do some modeling around that. Then the whole area of service onboarding is interesting as well. We think there's some stuff that could be done there to say: okay, we've got the task, we've got the Heat template, we have the OVF spec, what next? What else could you be doing in the first ten minutes, the first hour, the first several hours of a service's lifetime to see what it's really doing? Because what we have found is that what it really does is not quite what it says it was going to do.

Longer-range stuff we're doing: we've got some consortium work going on on real-time automatic application adaptation, so an application can intercept and interpret telemetry information from the infrastructure itself, understand what's going on, and execute different code blocks. Heterogeneity is very interesting; we're just starting a new project right now that's going to look at pretty much unconstrained heterogeneity across FPGAs, GPGPUs and MIC-type co-processors in data centers, and how you would manage that. And then an area we're quite interested in as well is the whole DevOps approach to service delivery; we have some background work on visual analytics that's been applied in very high-volume manufacturing, and we think it would be interesting to bring that in there, in terms of future DSS-type tools for policy tuning in a DevOps environment. So that's kind of our default plan going into 2015 and a little bit beyond, and I guess we're interested in questions; we're interested in hearing what the OpenStack community thinks about this kind of approach, and whether there's some common ground. And I'm back on time.

Yes, actually, the heterogeneity project that I mentioned has an HPC element in there. Yeah, it absolutely could do, and in actual fact the Merlin instrumentation that I mentioned earlier can already feed into Ceilometer. The reality of our group is that we're a small team, so we're only constrained by our own bandwidth; so really, all this kind of stuff, yep, that sounds fantastic: who can we share the load with?
Yeah, another question. Well, unfortunately there's not much you can do about that. I mean, we can run at fairly high sampling rates, we have found, and look where you're starting from: Ceilometer is at about one hertz, and we're running at hundreds to thousands of hertz, so already we're getting richer data, and this is across something like a thousand parameters at a time. For now that's quite enough for us to be getting on with, and if we've exhausted this particular vein with the level of sampling we're doing, then maybe we'll go back and increase it again. But yes, for sure, sampling errors are a reality; we're just working at several orders of magnitude finer-grained data than what ordinarily exists today.

Yeah, so we just tried to assess it on its own, and actually the probes are unmeasurable on a server-class machine. We did run it on a Galileo board and I think it came out at 0.4 percent CPU, which is less than the top command, so even in scripted form it's very lightweight; but we haven't done comparisons beyond that. You can program it, so you can send a profile with the probe and change it over time: you can send down an initial configuration and then go back and change it, you can tell it to do some basic CEP on the client side, or to back off the sampling frequency. But the reality is that in a conventional data center this kind of data is not even close to noise compared with what's really going on in terms of the media-type content moving around and all the other control-plane type stuff; it's a nit, is what we find.

I don't know who was... there's too many, sorry, I think you were first. Bits and pieces of it are published: some of it is going into a current IEEE journal, lots of it is in deliverable reports from the FP7 projects that I've mentioned, but none of it is called Apex Lake; today is the first day we've mentioned Apex Lake outside the company. You could look for monitoring at Intel Labs Europe, you could look for SLAs and SLA management at Intel Labs Europe, and pretty soon you'll be seeing application adaptation; CloudWave is the project. I can give you a list of the projects for sure, current and future, and we're not the only people doing these projects. Leaving aside the universities, the industry partners would be IBM's research labs in Haifa, SAP, lots of operators, pretty much all the big operators and telcos around Europe; we're working with the OEMs and the TEMs, people like ALU, Ericsson and Nokia Siemens, all these guys. We're in collaborations with pretty much the who's who of cloud services and future-network type research; in the jargon of the framework programme we refer to objectives 1.1 and 1.2 respectively. That's kind of the game, that's the area we play in. Sorry, yes.

Yeah, so the whole provenancing thing is a topical one right now. In fact, if I bring the thing back up and try something I didn't show you, with the time slider... which tab is it on, is that the one? So here is the time slider, and I can... I mean, this UI, I don't think there's any scenario where a human, an operator, whatever, would use a UI like this other than to demonstrate it in a room like this. But most of the context that we're capturing is snapshotted; we're already snapshotting anyway. What we do not have, for example, are convincing queries that would give you a view of what happened to trigger a migration, what happened during it, and what happened as a consequence of it.
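As a very rough, hypothetical illustration of the kind of before-and-after question being described here, and not anything we have built: given the timestamp of a migration and a metric series from the time-series store, you could compare windows either side of the event. The data below is synthetic:

```python
import numpy as np

# Hypothetical: timestamps (s) and a metric sampled every 10 s around a
# migration event; in reality this would come out of the time-series store.
t = np.arange(0, 600, 10)
p95_latency_ms = np.where(t < 300,
                          40 + 2 * np.random.rand(len(t)),
                          25 + 2 * np.random.rand(len(t)))
migration_ts = 300   # when the migration was triggered
window = 120         # seconds of history to compare on either side

before = p95_latency_ms[(t >= migration_ts - window) & (t < migration_ts)]
after = p95_latency_ms[(t >= migration_ts) & (t < migration_ts + window)]

print(f"p95 latency before: {before.mean():.1f} ms, "
      f"after: {after.mean():.1f} ms, "
      f"delta: {after.mean() - before.mean():+.1f} ms")
```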
That would be some kind of characterization that we've still got to figure out. This is live work, by the way; it's not something we're done with and are now here to share, it's ongoing stuff in our group, we're far from done with it, and we're going to be on this stuff for the foreseeable future. As I said, it's a toy UI; our work is focused on how we get rid of that and become able to reason over a 10,000-node scenario.

Right now the agents basically publish to pretty much any messaging system, AMQP, RabbitMQ, and it's stored in a TSDB, a time-series database. We have not had the challenge of scale; we're actually talking about this right now: in our next steps, should we do the SLA integration and the stuff to do with heterogeneity, or should we go back and look at scale in particular? The reason we're not looking at scale is that Nick, who was supposed to be here, is on scale; if he were here he'd tell you his answers to that. He's actually working on a project that is specifically designed to do just the plumbing at scale and nothing else. After that, it's just another big data problem; in actual fact, technically it's a subgraph isomorphism problem, and usually I can say that without tripping over it. So that's kind of a roundabout answer to your question; is that what you were looking for? Okay.

Okay, I mean, peer-to-peer, hierarchical structures, we've looked at nearest-neighbour approaches, there's a whole bunch of things, but we're not looking at those seriously; it's not front and centre of what we think is interesting right now. We think we've a bit to go in terms of really proving out that we can move from something that's intuitive to a human to something that's parametric and natural to a machine reasoning process; that's where our focus is.

It can be that, but it can also be other stuff, like queue lengths; queue lengths are very interesting. The methodology behind our approach to performance analysis for some years has been USE, if you've heard of it: utilization, saturation and errors, Brendan Gregg's stuff. For the saturation side in particular we're looking at queue lengths and latencies. You can get a lot of insight from looking at that at multiple levels, right up through the application: what is the DBMS seeing, is it waiting on locks, all that kind of stuff.
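For reference, a minimal USE-style check per resource, in the spirit of Brendan Gregg's method, looks something like the following; the resources, figures and thresholds are illustrative only:

```python
# Utilization, saturation (here a queue length) and errors, per resource.
resources = {
    "cpu":  {"utilisation": 0.95, "saturation": 12, "errors": 0},   # run-queue length
    "disk": {"utilisation": 0.40, "saturation": 0.2, "errors": 3},  # avg queue depth
}

thresholds = {"utilisation": 0.85, "saturation": 1.0, "errors": 0}

def use_check(name, metrics):
    """Flag a resource that is highly utilised, saturated, or erroring."""
    findings = []
    if metrics["utilisation"] > thresholds["utilisation"]:
        findings.append("highly utilised")
    if metrics["saturation"] > thresholds["saturation"]:
        findings.append("saturated (work is queueing)")
    if metrics["errors"] > thresholds["errors"]:
        findings.append("reporting errors")
    return f"{name}: {', '.join(findings) if findings else 'looks healthy'}"

for name, metrics in resources.items():
    print(use_check(name, metrics))
```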
I'm not sure if it's a focus; I'll have to be honest, we use OpenStack, we have it deployed in our testbed, but we've never touched a line of code, so we're not as close as... okay, somebody's contradicting me; okay, so we do touch some code, right, okay. But like I said, our focus has been on moving as fast as we can to get the concept on the table, which means we've kept it separate; it complements, it coexists, for now. But if we had time, or if we had enough channels open with other people who are interested, then yeah, that would be great stuff to go and explore.

We just use a TSDB and it's back-ended onto Hadoop. This was very incremental work; it started in spreadsheets, so there was no grand plan for this from day one. The original problem statement on that stuff was: when you start to transition from spindles to SSDs, bottlenecks start moving around at an alarming rate, so how do you keep up with where they're moving? That's what started that work, way back; that was five or six years ago, that was the problem statement.

Any more questions? So, I don't know if the decks for these things get kept, but I did park some email addresses on there for the three of us from the group who are here in the room, so if anybody's interested, make contact. We're going to be here for the next few days, almost a week, and we'd be quite interested in...