I think we can get started. Thanks for coming. I want to talk a little bit about what we call Liota. I'll start with a little bit about my background, then talk about why a number of us think IoT gateways are actually pretty important in the IoT architecture, and we'll see if you agree. And then: VMware being in IoT, what's that all about? I'll try to clear that up — why we're here, why we're interested in IoT, and what our thinking about it is going forward. But then I'll concentrate the bulk of the talk on Liota: what it is. We had a hackathon yesterday, and that environment is still up, so if anybody is interested after this talk, come and see me. We'll give you one of the VMs and you can go through the course syllabus. You won't be able to do everything, because we had an actual robot there yesterday that drives part of the program, but you can do most of it if you want to learn a little bit about Liota. And then I'll show you some end-to-end open source examples, and also some examples using the virtual platforms on the cloud side. So this is where we see the architecture ending up in the next five years or so. There are going to be a number of devices that connect directly to the data center side, the cloud. I use data center, cloud, server to mean the other side — whether that's an on-prem solution, a private hosted cloud, something VPNed into, a corporate cloud, a cloud in your home, I don't differentiate between those. For the rest of the talk, that's just the cloud side. And of course, in our case, our customers really want the results of IoT to connect into their business apps, because modifying or changing how their business apps receive input and behave is where they get value for their customers. So in the enterprise world, the right-hand side of the cloud is why they want to do this.
And we're interested more from the cloud side out to the devices. I started looking at IoT back in the late 90s. My background is real-time scheduling theory. I worked first at IBM and then at Sun during the Real-Time Specification for Java, implementing that with the team. So I spent the better part of a decade hanging out in the embedded community, but I'm new to the open source community — Liota is the first project that I've open sourced, and that's part of why we're here; it'll be a learning experience for us. But I do understand the embedded space pretty well. And there are a number of things about the embedded space that are very different from the enterprise space. You can tell me if I'm preaching to the choir here, but because of my graduate work and the work around real-time Java, I've always sat professionally right at the intersection of those two worlds. On the one side there was enterprise Java — that's what the big companies use to implement their databases and web services, and Java's right there — and real-time Java was about the embedded side of the world. So in a single VM, we had this clash of worlds between enterprise and embedded. Even my graduate work is in this area. So I've had a long history — actually, my full professional career in computer science — sitting right on that boundary. IoT is a natural place for me to end up, because IoT is in some ways about the connection of those two different disciplines. And I look at IoT gateways as a decoupling point rather than a connector. A lot of people call them the connectors between the two worlds, but given my understanding of the differences between those worlds, I don't think we actually want to connect them. I think we want to keep them separate.
And the IoT gateway, I think, does that for us, because we can isolate all the differences of the industrial automation and thing world — the embedded side — on the left side of the IoT gateway, while the right side of the IoT gateway can behave like any of our systems connected to the internet, the enterprise, the web, whatever you have on that side. Another characteristic of the embedded space is the large heterogeneity of devices, communication protocols, security issues, and physical issues compared to our side of the world. We have more or less one layer-4 protocol for pretty much everything we do. Most of the stuff on their side doesn't even get up to layer 4 — much of it isn't even IP; a lot of it is serial, and certainly not wide-area. So that's the difference, and the gateway's job is really to keep them separate — to decouple the differences. One reason we want to do this is expected lifetime. That's a big thing. If you buy one of these devices in your refrigerator, you expect to keep it for a while. I don't expect to buy a new refrigerator as often as I buy one of these — I buy these too often — I want it to last a decade and a half, two decades. The question then is: if that thing is connected with a layer-4 connection through the public internet to something, is the manufacturer of the device going to keep the software on that endpoint at the level of best practices that exist on the internet for 20 years? That's a real question. Can we rely on that? If they don't, that layer-4 connection degrades and becomes vulnerable from anywhere in the world. So that's where the security aspect comes in. I work at VMware.
We're really lucky to be right next to the Stanford campus, and we work with a group there called the Secure Internet of Things Project. It's run by a bunch of professors who, like me, started back almost when the internet started, and who were firm believers in the peer-to-peer convention of the internet: two machines get to talk to one another over this thing we call the internet. That has degraded a bit — we got client-server, and then the big players came in, and over time that notion has drifted away. But fundamental in that notion is the idea that a middle box — what we call routers, gateways, switches — doesn't get to mess around with layer-4 data. Anything under layer 4, that's fine, but layer 4 and above was the endpoint-to-endpoint conversation. Well, those professors, our chief research officer at VMware, David Tennenhouse — who has a long history at pretty impressive institutions, including DARPA as a program manager — and myself have convinced ourselves that IoT gateways will be the first time that we have a middle box that gets to fool around with all of the layers, all the way up to seven. And that's how you discover something is an IoT gateway: is there a need for changing the conversation completely, terminating the endpoints at the gateway on either side? In some cases it's obvious. If you have an IoT gateway function in an automobile, you're not going to have the brake sensor, with its own TCP stack, communicating directly over cellular to some data center. It just doesn't make sense. So it talks CAN bus like it normally does, and there's an IoT gateway function on the vehicle to do the conversion of all the layers and talk to the data center. At the enterprise level, thinking about the security issue is very critical, and I'll talk a little bit more about that later.
So now, in our view of the world, I just want to differentiate between what we call infrastructure telemetry and content telemetry, and there's an example there with transportation. Content telemetry is the telemetry of the data that everybody in IoT is talking about, where most of the value is going to be derived, where all the excitement is. For corporations, most of the revenue is going to come from analyzing that data. That's not where VMware sits. We sit squarely on the infrastructure side. We want to collect telemetry from the IoT infrastructure — from the devices and the IoT gateways that make up that infrastructure — and keep track of their operational status. If you think about what VMware does in the data center, of course there's vSphere, vCenter, vSAN, NSX, but we also have a suite of management tools that manages all those pieces in the data center. We're extending that notion out to the IoT infrastructure. That's where we want to sit. And to do this, you need basically the same kinds of things you need in the data center: you need to get telemetry from all those things to inform you of their operational status, you need to be able to push control back out to them, you need some sort of alerting mechanism, and you have to be able to change the configuration and manage the application lifecycle on all the endpoints that you manage. This is where the trouble comes in at the enterprise level, and it's causing our customers to think about how they're going to approach IoT. Our customers are the CIOs and their staff of about 80% of the Fortune 1000 companies, and pretty much 80% of the data centers in the world run on our stack. So our customers come to us and say: look, there's a lot of this IoT stuff out there. It looks pretty interesting. Companies are trying to install these things they call IoT gateways all around our enterprise. They come from different manufacturers.
They come from manufacturers that we've never heard of. They want to stream data outside our enterprise without giving us any control or understanding of where that data is going. And we need help from you, who have been our strategic partners in this whole infrastructure thing, to get a handle on this complexity coming our way. So what we're trying to do is build a management suite — a management tool to manage all of your IoT infrastructure in the enterprise, independent of where the gateways come from, who does the content analytics, and where the gateways are placed. We want to give our customers the opportunity of managing all of that with a single tool. Because they're saying: look, if I deploy this gateway from company X, they have a management solution for their gateways, but company Y has its own management solution, and there are these small IoT companies with solutions that we think have some value, but they're using some hacked-together gateway we've never heard of before and they want to stick it on our walls. So here's our idea — and this is just a mock-up; the version of this tool we actually have running, and that we used in the hackathon yesterday, is not as pretty as this. Basically, the idea is: here are all the IoT gateways and edge systems in your enterprise, here's the number you're managing — 200 are edge systems, 800 are IoT devices attached to those. You can look at the alerting status, and then move to a part of the tool where you can control the application lifecycle on all those pieces. So, enough of that. That's our story and we're sticking to it. We'll get to Liota really quickly, but first I'll show you a little bit of what we did at VMworld. At VMworld we had seven different IoT solutions, each with a different kind of IoT gateway.
This first one was two industrial cranes with — this is kind of fuzzy — embedded PCs by a company called SK Solutions. SAP was doing the content-side management for those cranes; we were managing the gateways. Here's the robot we had at the hackathon yesterday: ThingWorx was doing the content analytics on a Dell IoT gateway, and we were managing that gateway. Here's a smart building — that's an Intel gateway, same story again. A video surveillance unit: Liota is running as our agent on that — Liota is running on all of these IoT gateways — and they're doing the content analytics for that, which is facial recognition and gunshot detection. There's a head unit from a car, also under our management. And a pill box from Deloitte, and a Coke machine. So the message at VMworld was: these are very different IoT solutions, but they're all managed by one project. This is how the UI looks today, and we can actually push fixes down to that red gateway, and then the gateway turns green because we fixed the bug. All right. So that's what VMware — our idea, my idea — of why IoT gateways are important and how VMware is looking to play. Now for the open source part: the Little IoT Agent, Liota. What we really wanted was a portable, modular, easy-to-deploy framework for IoT gateways that we could get onto any IoT gateway in the world — even if the vendor of the IoT gateway wasn't interested in working with us, maybe because they had their own tool and said, no, we don't want to use yours. So we wrote it in Python, and we're making it open source. We're at an early stage, just beginning to reach out to the community. We did the hackathon here, and we'll probably do the hackathon again in February in Portland — we were late getting on the schedule here, but hopefully we'll do it again, so look out for us.
It's a BSD 2-clause license, which is the most liberal license my lawyers tell me you can use in open source: do anything you want with it, just keep the headers there. You don't have to contribute anything back — put proprietary stuff in and ship it; we don't care. Like I said, we haven't found an IoT gateway on which this won't work. And really, why we're doing it, why we will take a community version into what we call Project Ice right now — this management tool — is exactly that: we want to be independent of content analytics, gateway vendor, and where this sits in the organization. Is it in the factory? Is it in real estate, in office buildings, in warehouses? Is it on vehicles or trucks? Okay, so the high-level design: we abstract the problem into six major abstract classes. Device represents some sort of data source connected to an IoT gateway or edge system — you'll hear me say IoT gateway and edge system; those are the same thing. A device could be on the edge system itself; in a bunch of our examples, we use RAM memory as a device — we just create a device representing RAM memory and flow metrics back about it — but it could be basically any data source. Device comms are an abstraction for the communication mechanism between the device and the gateway, and those come in all kinds of flavors: like I said, CAN bus, Modbus, Ethernet, EtherCAT, wireless I/O protocols, wireless protocols for the factory, real-time wireless protocols like WirelessHART — all over the map on that side of the world; for the home, it's ZigBee and Z-Wave. An edge system is the entity that represents the edge system itself. A metric is an entity representing a stream of (number, timestamp) tuples — that's the fundamental thing we do with Liota: create a pipeline to get this stream of tuples back to a data center component. And that's what we call our abstraction for whatever it is on the data center side that's ingesting this stream of numbers.
We call them DCCs, data center components. And a package manager: the package manager loads and unloads Liota packages, and I'll talk about the package manager in detail later. Before we get to that, though, here's where I spent some time manhandling PowerPoint into doing what I wanted yesterday. I really don't like PowerPoint, and doing an animation like this means I cared about doing something so you guys would understand it. So first, we have an abstract class, entity; a subclass of that, edge system; and then some concrete implementations of that, like the Dell Edge 5000, the DK300, the Intwine, and so forth. You instantiate one of those gateway objects, or build your own. You do the same thing for a device, and the same for a communication mechanism. You create a metric — metrics don't have flavors until you create one, and I'll go into how you do that a little later. And you pick a data center component that you want to work with, a communication mechanism for that, and some sort of cloud piece to talk to. These are more or less pluggable and somewhat dynamic. Now, if you've used Node-RED, you know you use Node-RED to wire things together — this isn't wiring, this is just PowerPoint, just explaining what you do in code. I've never been big on graphical coding; I always find it too limiting, so I'd just rather write code, but that's just me. Some comments on that: the device and DCC comms are not necessarily a pub-sub mechanism. The way people tend to like to do things like that is to have a message bus — Wind River's platform has MRAA and D-Bus, and you can also use MQTT and other messaging protocols between the devices and the edge system, and between the edge system and the data center. Some of that's okay, but as I mentioned, my history goes back to when I started learning about this stuff in the 90s.
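The six abstractions and the wiring just described might be sketched roughly like this — a simplified, illustrative sketch, not Liota's exact class names or API:

```python
# Illustrative sketch of the six Liota abstractions described above.
# All names here are simplified stand-ins, not the actual Liota API.

class Entity:
    """Base for anything that can register with a data center component."""
    def __init__(self, name):
        self.name = name

class EdgeSystem(Entity):
    """The IoT gateway / edge system itself (e.g. a Dell Edge 5000)."""

class Device(Entity):
    """A data source attached to (or residing on) the edge system."""

class DeviceComms:
    """Transport between device and gateway (CAN bus, Modbus, ZigBee, ...)."""
    def receive(self):
        raise NotImplementedError

class Metric(Entity):
    """A stream of (timestamp, value) tuples produced by a sampling method."""
    def __init__(self, name, interval_sec, aggregation_size, sampling_function):
        super().__init__(name)
        self.interval_sec = interval_sec
        self.aggregation_size = aggregation_size
        self.sampling_function = sampling_function

class RegisteredEntity:
    """Handle returned by registration; what participates in the framework."""
    def __init__(self, entity, dcc):
        self.entity, self.dcc = entity, dcc

class DataCenterComponent:
    """Whatever ingests the metric stream on the cloud side (a 'DCC')."""
    def register(self, entity):
        # Registration is where per-entity metadata can be sent once,
        # instead of flowing with every sample.
        return RegisteredEntity(entity, self)
    def publish(self, registered_metric, samples):
        print(registered_metric.entity.name, samples)

# Wiring it together, as in the slide animation:
edge = EdgeSystem("gateway-01")
dcc = DataCenterComponent()
reg_edge = dcc.register(edge)
cpu = Metric("cpu_util", interval_sec=10, aggregation_size=2,
             sampling_function=lambda: 42.0)
reg_cpu = dcc.register(cpu)
```

The point of the sketch is the shape of the pipeline: concrete gateway, device, and comms classes plug into the abstractions, and registration hands back a separate registered object.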
So I tend to like more tightly bound communication mechanisms — I like to open a socket to somewhere and then just keep it. I really don't like HTTP-based REST communications for data flow. It's okay for management, it's nice there, but for flowing data I don't like the whole idea of stateless. There's too much about a data flow that you should know about — called metadata — that you don't want to flow every time. A lot of the solutions I've seen that don't consider this, that just want to use REST-based protocols, have to flow that metadata every time, and that metadata can get pretty big. In the early days, we used a temperature sensor: we went to Mouser and got an LM35. It's just three wires and a small chunk of plastic — signal, power, and ground. And its data sheet — literally, not kidding — is 30 pages long. That data sheet specifies how this little chunk of plastic behaves in every conceivable situation. That data sheet is the metadata that belongs with the stream of temperatures that comes off that temperature sensor. If you want to do serious analytics on something where that temperature sensor is measuring the temperature, you may need to know a lot of the stuff that's in that 30-page data sheet — like, what are the error bounds under these particular conditions, at this source voltage? If you're Nest and you're measuring temperature in homes, you don't care. If that temperature sensor is in the reaction vessel of some big chemical plant, or in a rocket motor, and you're doing analytics on it, you'd better know what the error bounds are, because if you don't, you're not going to get the right answer. So for things like that, I feel we really need a way to take that metadata, move it to the data center component one time, and not keep flowing it.
So that's the idea: these entities register with the data center component in our model. Registration may or may not actually cause a network flow. Graphite doesn't use registration; our Project Ice does; ThingWorx does. I don't think IBM does now — you sort of pre-create the device in Bluemix and then go. But our model is that the gateway should do everything: you should never have to punch buttons on a GUI to get ready for a device to send. As for metrics, there's no problem sending the same metric to multiple DCCs. You can register it with multiple DCCs; you get back a registered object, and the registered object is what participates in the framework — we'll get to that in a minute. So again, devices and edge systems can be registered with multiple data center components. And by and large, device and edge system are placeholders for now; we don't have much in them. The hope is that over time, as we experiment with more of these and more of them get into the repo, we can do periodic refactorings, move commonalities up into the abstractions, and put the idiosyncrasies into concrete representations. Now, how do we get data? This is a little controversial. Given my history in the embedded space, I didn't think I could do a credible job of abstracting every device in the IoT space with some simple API that my team and I would dream up overnight. So for now, we punted on that and left it completely open. The way we do this is with things called user-defined methods, or UDMs, and I'll go into detail about how those work later. So again, registration: you create an entity and register it with a data center component. It's at that time that the data center component can associate metadata with that entity — whether that's a stream of numbers, the gateway, or a device. Now, the package manager is a lot like OSGi bundles. In fact, frighteningly so.
The intern who developed it with me last summer and I were thinking, well, this sounds kind of cool — do you think we should patent it? And in the back of my mind, I had these OSGi bundles. So we went through it and said, no, actually, we just rewrote OSGi bundles for Python; that's what we did. So that's the idea. A Liota package has a list of names of other packages that should be loaded before it — a dependency list. It has a run method that gets called when you load it, and a method that's called when you unload it. And it has a way to get access to other objects that have been created that it needs: a registry. A registry is passed into the run method; other packages put representations of themselves or other objects into that registry, and any package can pull out whatever it needs to operate. Now, user-defined methods. This is the way we get to devices right now. If we want to flow the CPU utilization of one of these edge systems, the way we've done it in all of our sample code is this: we define a method, and then we create a metric — give it a name (there are no SI units for percent), an interval of 10 seconds, and an aggregation size of two, meaning execute this UDM every 10 seconds, and when you've collected two samples, flow those samples to the registered data center component. And that UDM can go out and do anything it wants to do. It can be blocking or polling. It could access a dozen different devices, get values from each, do some computation on those, and return a single value. It's really up to the UDM — it's just code at that point. The only restriction is that it returns either a scalar (a single number) or a list of (timestamp, value) tuples. So now I just wanted to walk through the repo a bit and take a look at some of the examples we have there.
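Put together, a package along the lines just described — dependency list, run method, registry, and a UDM-backed metric — might look like this sketch. The names (`PackageClass`, `registry.get`, the `Metric` signature) are illustrative stand-ins, not the exact classes in the Liota repo:

```python
# Sketch of a Liota-style package: a dependency list, a run() called on load,
# a clean_up() called on unload, and a registry for sharing objects.
# All names here are illustrative, not Liota's actual API.
import random

# Packages that should be loaded before this one.
dependencies = ["edge_system_package", "graphite_dcc_package"]


def cpu_utilization_udm():
    """User-defined method: any code that returns a scalar
    or a list of (timestamp, value) tuples."""
    return random.uniform(0.0, 100.0)  # stand-in for a real CPU reading


class Metric:
    """Simplified metric: name, collection interval, aggregation size, UDM."""
    def __init__(self, name, interval_sec, aggregation_size, sampling_function):
        self.name = name
        self.interval_sec = interval_sec          # run the UDM every N seconds
        self.aggregation_size = aggregation_size  # flush after N samples
        self.sampling_function = sampling_function


class PackageClass:
    def run(self, registry):
        # Pull objects that earlier packages placed in the registry.
        dcc = registry.get("graphite_dcc")
        edge = registry.get("edge_system")
        self.reg_edge = dcc.register(edge)
        # "Execute this UDM every 10 seconds; after 2 samples, flow them."
        self.metric = Metric("cpu_util_percent", interval_sec=10,
                             aggregation_size=2,
                             sampling_function=cpu_utilization_udm)
        self.reg_metric = dcc.register(self.metric)
        # Make the registered metric available to other packages.
        registry.register("cpu_metric", self.reg_metric)

    def clean_up(self):
        pass  # stop collection, close connections on unload
```

The registry is what makes this OSGi-like: packages never import each other directly; they share live objects through it, so pieces can be loaded and unloaded independently.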
This is just a convenience to get some values out of config files for our customers. Here's the UDM we're going to use in this example: CPU processes, utilization, disk usage, network bytes, memory free. We used to be called IoT Control Center; we're now called Project Ice. We were known as Helix at one time, and Rialto at another, and we'll have another official name when we beta later this year. So, Project Ice for now — also known as Control Center. We got tired of changing the name in the code, so we're just sticking with this now. Here we're creating an edge system — the instance type is a Dell 5K edge system. We get that back and register it with the IoT Control Center, then set some properties on it from the properties list; properties in this case are an arbitrary key-value store. Here's where we create that metric, and register that metric with the data center side. This is optional, but we create a relationship between the metric and the gateway, because the gateway is what this metric talks about. And then we start collecting — we'll go over how these are collected. We do the same thing for the other metrics. Here's a simulated device: we're using RAM as a device, but we could just as well have used a thermistor attached to this, or some sort of network-connected device. You basically go through the same process: create the entity, register it, create a relationship, create a metric on it, and start collecting the metric. Now I want to show you an example for the package manager. I said to my intern this summer: hey, there's this SI units thing — we'd really like to have units usable in the code. I don't know if you know SI units, or Pint, the realization of that in Python. So he said, okay, what am I going to use it for?
I said: first write a simulation of a stationary bike, so it produces some realistic numbers with physics, and then do some physics calculations on them using units — make sure you don't get anything wrong — and produce a value. So he did that. Here's where you check units to make sure that what you're doing is correct. He does a bunch of his UDMs here — here's his get-the-power-from-the-bike, so he's computing with units — and here's the beginning of the run method for this package. Does he have a dependency list up here? I think he did. In the run method, then, this is how he creates the metrics and pushes them out to Graphite, the open source time-series graphing tool. Now I wanted to go over the metric handling: once you create a metric, create its metadata, register it with the data center, and start collecting, what happens to it? We have three queues: an event queue, a collection queue, and a send queue. Once you start collecting a metric, we look at the next absolute time that this metric should be collected and put it in the event queue. The event queue is sorted by priority, with the next ready event at the front of the queue. When the event thread wakes up, it looks at the front of the queue and asks: are any of these metrics ready to collect? Yes — it grabs them, throws them into the collect queue, and keeps doing that until the one at the head of the queue isn't ready to collect, and then it goes to sleep until the next time it should look. Those in the collect queue get grabbed by a collection thread — there's a thread pool — they're assigned to a metric, and then the UDM is executed. The UDM can be polling or event-driven: the UDM could go sit and block on a communication endpoint and wait for something to show up there, which is perfectly fine; or, if you're accessing some sort of register and there's always a value in it, the UDM can return immediately with that — a polling kind of mechanism.
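A greatly simplified sketch of that three-queue cycle, run single-threaded in virtual time (the real framework uses an event thread, a collector thread pool, and a send thread; names here are assumptions, not Liota's implementation):

```python
# Minimal single-threaded sketch of the event/collect/send queue mechanics
# described above, driven in virtual time rather than by real threads.
import heapq


class Metric:
    def __init__(self, name, interval_sec, aggregation_size, udm):
        self.name = name
        self.interval_sec = interval_sec
        self.aggregation_size = aggregation_size
        self.udm = udm
        self.samples = []


def simulate(metrics, until):
    """Run the collection cycle up to virtual time `until`;
    return the batches that got 'sent' to the data center component."""
    sent = []
    event_queue = []  # priority queue keyed on next collection time
    for m in metrics:
        heapq.heappush(event_queue, (m.interval_sec, id(m), m))
    while event_queue and event_queue[0][0] <= until:
        now, _, metric = heapq.heappop(event_queue)  # event thread: ready?
        metric.samples.append((now, metric.udm()))   # collect thread: run UDM
        if len(metric.samples) >= metric.aggregation_size:
            sent.append((metric.name, list(metric.samples)))  # send thread
            metric.samples.clear()
        # reschedule at the proper place for the next collection cycle
        heapq.heappush(event_queue,
                       (now + metric.interval_sec, id(metric), metric))
    return sent


# Mirrors the slide walkthrough: one metric every 5s with aggregation size 1,
# one every 8s with aggregation size 2 (sent only once two samples exist).
m5 = Metric("m5", 5, 1, lambda: 1.0)
m8 = Metric("m8", 8, 2, lambda: 2.0)
print(simulate([m5, m8], until=16))
# → [('m5', [(5, 1.0)]), ('m5', [(10, 1.0)]), ('m5', [(15, 1.0)]),
#    ('m8', [(8, 2.0), (16, 2.0)])]
```

The 8-second metric is collected at times 8 and 16 but only sent at 16, once its aggregation size of two is reached, which is exactly the behavior the slide simulation walks through next.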
At the end of that, if you've accumulated enough values, a clone of that metric gets put on the send queue for the data center component, and the original goes back into the event queue at the proper place for its next collection cycle; the send thread then sends it to the data center component with which it was registered. And here's a little simulation to walk us through this. This is a single metric in the event queue; its next collection time is 5, and we're at time 0. At time 5 it moves to the collect queue. Its aggregation size is 1, so when it's collected by the collect thread, its clone goes to the send queue, and it goes back in to be collected at time 10. A more complex case: two metrics in there, at 5 and 8. The same thing happens at 5, but now its aggregation count is 2, so it goes back into the priority queue — the event queue — without sending. We don't send at time 8 either: 8 goes through that round, gets collected, and is rescheduled for 16. At time 10 this one goes through and gets sent, and now we're at 15 and 16 in the event queue. So, back to this: you can use Liota end-to-end using open source stuff, and we're happy to have that happen. One of the problems yesterday at the hackathon was to write a data center component for Bluemix, and one of the people there said, I don't like Bluemix because it's from IBM, so I'm going to do my own. He did a DCC for Piwik, which is a web statistics tool, and he did it in about three hours. So now he can use Liota completely to send time-series data to Piwik. And of course you can use it for sending to any of those: we now have Bluemix DCCs, AWS works, and people I know are working on one for Azure, so that's possible as well. As I mentioned earlier, we did the hackathon yesterday with Cosima, our ABB YuMi robot, and the simulation environment is still up. If you're interested, you can go to that URL — the syllabus has detailed instructions on how to install and work with Liota and see it come up in Ice. Let us know and we'll give you the login creds for one of these Ubuntu VMs, and you can go play around with it. If you're interested, those are the hashtags that our marketing people are anxiously watching for — so do us a favor, even if you just tweet with those and say you didn't like it, because they don't read the tweets, they just count them. And there's a survey if you want to do that. So, any questions? I think we're about out of time, but maybe a few questions. Okay, cool. Thanks, guys.