Hey, welcome everybody. Jeff Frick here with theCUBE. We are on the ground at the Cosmopolitan Hotel in Las Vegas for the first-ever GE Predix Transform 2016. It's a developer conference, about 2,000 people here to learn about IoT, the Predix Cloud, and how they can get involved. It's different from Minds and Machines, which has more of a business focus. I'm the only guy with a tie, I think, in all of the Cosmo. I've been walking around the keynotes, and we're excited to be here. And of course, when we come to these things we get the smartest people we can find and share the knowledge. So we're really happy to be joined by Harel Kodesh, the CTO of GE Digital. Welcome.

How are you, Jeff?

Awesome. So a lot of buzz in the keynote. You just got off the keynote about an hour ago. What was your impression of the crowd?

Look, they've come to learn how to build this new ecosystem called the Industrial Internet. In the beginning you compare it to the consumer internet or the enterprise internet, much more mature ecosystems. This is brand new. So people feel that there is history in the making. And as I was telling them, five years from now they'll look back at Transform 2016 and be able to tell themselves, this is where it all began.

Yeah, and actually we were there at the very beginning. Beth and Bill invited theCUBE to the initial launch at the Jewish History Museum in 2013, the original Industrial Internet launch. So it's amazing how far it's come. But most recently it's the cloud, the Predix Cloud. You've been in the industry a long time. You were at EMC, at VMware, so you have a cloud perspective. What makes the Predix Cloud special and different from what people are used to in a traditional business cloud application?

I would say there are about four areas that differentiate it from anything else. First, every cloud has its own key abstractions, what you're dealing with.
You can deal with all kinds of things, but if you think about a document-based cloud, along the lines of what Microsoft is building, it's for documents; documents are its abstractions. You can do other things too, but we are focused on what we call assets. You have clouds or systems that are all about ERP, systems of record. In our case, it's a system of assets. The assets can be big dirty rotating machinery, can be small things like a pump or a heat exchanger, can be a blade in a jet engine. But the idea is that the cloud allows us to aggregate all those pieces and parts into sub-assemblies, assemblies, whole devices, and then fleets of whole devices. You can model a manufacturing plant, you can model a full airline with Predix. So this is one thing.

The second thing is that the systems we're building are dealing with critical infrastructure, as deemed by the government or the operators. If you're gonna go and deal with a railway system and their whole system is down, that's critical infrastructure at the security level, at the national level. So many companies and many countries require us to keep the information and the data in the same jurisdiction where it was generated. Compare that to, let's say, a repository where you store a bunch of pictures: if you're a 16-year-old girl in Tokyo or Tel Aviv or New York, you really don't care where that information is stored. Our operators do, so we have to right-scale the support to make sure that we build data centers at the right level, not too big, not too small, in the right jurisdictions where those pieces of data are gonna be created and analyzed. So that's number two.

Number three is that the core execution unit we have is all about machine learning. We ingest the data, store the data, and derive insights based on how we analyze the data. And so when you look at that, we have to build machine learning stacks that are very low on false positives.
In other words, they identify very accurately when you have a problem; and very low on false negatives, which is where you identify very accurately that you don't have a problem. So think about us identifying an issue with a jet engine. You send the crew, you send a bunch of winches, you loan the airline a new engine to run while you take this one to the shop, and then you take it to the shop and realize there's nothing wrong with it. That's hundreds of thousands of dollars right there, before you even start dealing with a problem. So our customers require that we be very accurate in identifying when there's a problem, but also not cry wolf every time we see the shadow of a problem. That means our machine learning algorithms will have to be of a different breed than what you have for consumers. Consumer networks are much less sensitive to that kind of problem.

Right, right. So you touched on all kinds of things. I want to start with the IT-versus-OT thing, because you mentioned you've got a long IT background. OT is all this machinery and gear that's been running forever. As the marriage of those two things comes together, what's been the biggest hurdle? What's the biggest opportunity in bringing those two systems together?

So IT systems for the most part really don't care where the information is generated. And when they do care, they see that it's really generated from knowledge-ingestion systems, I would say: iPads, smartphones, laptops, et cetera. We ingest information from devices that don't look like computing devices. They are connected to compute devices; this is what we call the edge devices.

Right, right.

So you go to a wind turbine, you see a small controller next to it. That controller can compute, but we have to take the information from there. 40% of the data that the industrial internet generates is spurious.
The sensor's about to get fried, the readings of the sensor are off the charts; it's simply wrong information. And you have to decide what is the right information and what is the wrong information, and you have to filter that in such a way that you don't clog the data arteries into the cloud by dumping exabytes on a cloud system, without even the ability to ingest it, just to figure out that, again, 4 bits out of 10 are not bits you should have even taken in the first place. So that level of dealing with real-world problems, and not the knowledge enterprise that most of my career was spent on, is really the big difference between the OT and the IT side.

And the other thing you mentioned, and I think it came up in the keynote, looking at some of the notes, is that most machinery works pretty well. It's been working pretty well for a long time. So the data flow is huge relative to the problems, relative to the things you actually have to pay attention to. And the other thing that's so different from an IT environment is that there you can control the physical environment, the circumstances. Data centers are completely controlled ecosystems. Not so much the case if you're out in a wind farm, you're out on a ship, you're on a jet engine flying over the desert. So that's a significant difference. So how do you figure out how much compute, how much storage, how much data to process locally, versus the limitations of how much you can send upstream because of connectivity and the amount of data? How does that get parsed out?

So that's an excellent question that goes to the heart of the problem we're dealing with. At the end of the day, as you articulated very precisely, we have to make sure that developers have the freedom to decide what to put in the cloud and what to put close to the edge. If you build a blowout preventer, which is the thing that didn't work in the Gulf oil spill, and you'll-

It's called a blowout preventer?
Blowout preventer.

Is that a technical term?

Yeah, it is a technical term in the oil and gas business. It's a set of clamps that shuts an oil hose down when you realize there's a pressure problem or there's a tear, some bad condition, right? So the idea is that nobody in their right mind would want to run a blowout preventer in the cloud. They would want to run it very close to the drill. At the same time-

It's a little thing called latency.

Exactly. Well, latency, and in inclement conditions the cloud, God forbid, may not even be available. If you have a subsea rig, the connectivity is done through satellite, and a solar flare can actually take the communication down. So there are all kinds of reasons why the cloud is not going to be available, which is why operators would like to do it very close to the edge device. That being said, a blowout preventer is now a very sophisticated piece of software. You don't just look at the pressure; you look at temperature, you try to correlate it with seismic conditions, and so on and so forth. You can prepare for something. And so the development and test of that can be done in the cloud. The developers should be able to dev and test the whole thing in the cloud, and then start provisioning, pretty much in a drag-and-drop style almost, moving the right workloads to where they should execute. It may execute in a gateway, which is a little rack or box that sits on the rig. It may execute in situ, very close to the hose itself or the pipe itself: obviously reduced latency and a higher cost of compute, but absolute availability. So on the spectrum between availability on one axis and cost on the other axis, you have multiple sweet spots where you can decide where to run your workloads. And that's exactly what Predix does: it gives that freedom to developers.

Right.
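As a rough illustration of that availability-versus-cost spectrum, a placement policy might look something like the sketch below. This is not anything from Predix itself; the tier names, latencies, and thresholds are all invented for illustration.

```python
# Hypothetical sketch of the edge-vs-cloud placement tradeoff described
# above: safety-critical, latency-sensitive work runs in situ even though
# edge compute is more expensive, while heavy analytics run in the cloud.
# Tiers, latencies, and the offline flag are invented for illustration.

TIERS = [
    # (tier name, typical round-trip latency in ms, usable when connectivity is down?)
    ("in-situ controller", 1, True),
    ("rig gateway", 10, True),
    ("regional data center", 100, False),
    ("cloud", 500, False),
]

def place_workload(max_latency_ms, must_run_offline):
    """Pick the cheapest tier that satisfies latency and availability needs.

    Tiers are ordered from most expensive compute (in situ) to cheapest
    (cloud), so we scan from the cloud end backwards and stop at the first
    tier that meets both constraints.
    """
    for name, latency, offline_ok in reversed(TIERS):
        if latency <= max_latency_ms and (offline_ok or not must_run_offline):
            return name
    # Nothing cheaper satisfies the constraints: fall back to the edge.
    return TIERS[0][0]

# A blowout preventer must react instantly even if the satellite link is down.
print(place_workload(max_latency_ms=5, must_run_offline=True))       # in-situ controller
# Building the predictive model can tolerate both latency and outages.
print(place_workload(max_latency_ms=10_000, must_run_offline=False))  # cloud
```

The point of the sketch is the shape of the decision, not the numbers: each workload lands at the cheapest tier whose latency and availability still meet its requirements.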
So I want to drill down on that, because I think that's a great concept a lot of people just don't get: that you have multiple sweet spots based on the objectives you're trying to achieve. And as you said before we went on air, it's not just because the car manual says take your car in at 5,000 miles; depending on the economic situation and a whole host of factors, you may choose a different optimum based on completely different factors.

Right. So look, I think the good news is that the state of the art in internet computing has moved to the point where we see some real assets we can use. I'll give you an example in the context of what we just talked about. It used to be that where you would execute those workloads was really based on how much space and how much compute resource you could take. So naturally, if you had a big job, it was impossible to move it all the way to the smaller devices at the edge, because they didn't have the underpinnings that would make them look like a cloud. You cannot have a micro-cloud in a small controller. But with the introduction of containers, all of a sudden there's a much lower-cost, lightweight way to aggregate the compute resources, the storage resources, and the code that you want. And as long as you have enough resources at each one of those nodes, you can move things to do the right work at the right place. So think about the wind farm: every wind farm has multiple turbines. Every turbine has a controller. The whole wind farm has its own gateway. And then multiple wind farms may have a supervisory gateway, which is actually where you control everything. And then you have the cloud. So if it's really a big machine learning job, where you have to build a very sophisticated model, you'll do it there. If it's something that allows you to manage multiple wind farms, you'll do it at the supervisory control.
If it's something that allows you to optimize the operation of a turbine, where you change the angle of attack as a function of the wind, and you don't just do it in a reactive way but forecast how the wind is gonna change based on all kinds of meteorological models, then you can do it very close to the edge. The model will still be built in the cloud, but the actual real-time operation is gonna go to the edge. And that's how we believe things can move very seamlessly between the various pieces.

Right. And it's the sophistication of those types of decisions. It's got to be machine learning, right? It's not a human that can make those decisions in real time. Way too many data points, way too many algorithms. So they're really taking the data in kind of a supervisory role.

Right. So even in machine learning, you have multiple levels. Let's step back. You have a rule-based system, right? If you have a set of pieces of knowledge that got accumulated over decades by experts, expert technicians, expert operators, you'd like to capture those rules and put them in one stack of machine learning. Then you have what we call supervised learning: you tell the system the basic things it has to look for, and you let it figure the rest out itself. And on the extreme side, you have unsupervised learning, where you define for the machine what the winning function is gonna be, what it needs to do, and you let it figure it out. It will fail in the beginning, and the more data it has, the better it's gonna get.

Right, right.

And what we would like to do is not make bets. We would like to aggregate all those machine learning domains and put them all together.
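The three levels described here, rule-based, supervised, and unsupervised, could be aggregated roughly as in the toy sketch below. Everything in it, the rules, the stand-in models, the thresholds, and the two-of-three vote, is hypothetical and only meant to show the shape of combining the stacks.

```python
# Toy sketch of aggregating three machine-learning "stacks" into one
# anomaly verdict. All rules, models, and thresholds are invented.

def rule_based(reading):
    # Stack 1: decades of technician knowledge captured as explicit rules.
    return reading["temp_c"] > 120 or reading["vibration_g"] > 3.0

def supervised(reading):
    # Stack 2: stand-in for a trained classifier (a fixed linear score here).
    score = 0.02 * reading["temp_c"] + 0.3 * reading["vibration_g"]
    return score > 3.0

def unsupervised(reading, history):
    # Stack 3: stand-in for anomaly detection; flag readings far from
    # the historical mean.
    mean = sum(history) / len(history)
    return abs(reading["vibration_g"] - mean) > 2.0

def is_anomalous(reading, history):
    # Require agreement from at least two of the three stacks, which
    # tends to lower false positives relative to any single detector.
    votes = [rule_based(reading), supervised(reading),
             unsupervised(reading, history)]
    return sum(votes) >= 2

healthy = {"temp_c": 80, "vibration_g": 0.5}
failing = {"temp_c": 130, "vibration_g": 3.5}
history = [0.4, 0.5, 0.6, 0.5]
print(is_anomalous(healthy, history))  # False
print(is_anomalous(failing, history))  # True
```

Requiring agreement between independent detectors is one simple way to trade a little sensitivity for the very low false-positive rate industrial operators demand.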
And because our devices are much more expensive, we can actually spend more money on creating a machine learning algorithm that is optimized, that is very strong at minimizing false positives and minimizing false negatives, even if it's gonna cost a little bit more than, let's say, a consumer internet algorithm, where usually the consumer is not willing to pay anything for it.

Right, right. You're not willing to pay anything for an Amazon recommendation. It's nice to have, and if it doesn't hit you as the right recommendation for you, you move on; no harm, no foul.

Right. Back to our example: if we take an engine off the wing, we have to identify what the problem is. Otherwise we're not doing our job well, and we're gonna get fired.

So a slightly different topic, but very relevant, especially coming from IT to OT, and that's data regulation, where the data sits. As you're talking, I'm just thinking of that jet engine or the locomotive that crosses international borders.

Absolutely.

So how does that work in the industrial internet space, when planes by their basic definition are usually crossing international borders? Where's that data? How do you deal with those types of regulations and rules?

So, as you can appreciate, this is a very complicated dance. First, it's...

Data sovereignty, that's the word.

Data sovereignty, that's the word I was looking for. Data sovereignty and data residency. What's happening is that countries would like to have different regulations and rules based on what they think is best for their economy, for the governance of their countries, and so on and so forth. We have to live with that. We think that in the next five years we're gonna see a significant number of countries, 30% and up, enacting data sovereignty regulations, and we'll have to play with that.
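The residency pattern being described, raw data pinned to the jurisdiction where it was generated, with only high-level aggregates allowed to cross borders, might be sketched like this. The data-center names, record fields, and the "insight" being exported are all hypothetical, not anything from an actual Predix service.

```python
# Hypothetical sketch of a data-residency split: raw telemetry stays in
# the jurisdiction where it was generated, while only aggregated insights
# leave for the operator's global view. All names and fields are invented.

DATA_CENTERS = {"CN": "china-telecom-dc", "EU": "frankfurt-dc", "US": "chicago-dc"}

def route(flight_records):
    local_stores = {}   # data center -> raw records (these never leave)
    global_view = []    # aggregates only (these may cross borders)
    for rec in flight_records:
        dc = DATA_CENTERS[rec["jurisdiction"]]
        local_stores.setdefault(dc, []).append(rec)
        # Only a high-level summary is exported, not the raw sensor data.
        global_view.append({
            "flight": rec["flight"],
            "engine_health": "ok" if rec["max_vibration_g"] < 2.0 else "inspect",
        })
    return local_stores, global_view

records = [
    {"flight": "UA888", "jurisdiction": "CN", "max_vibration_g": 1.2},
    {"flight": "UA901", "jurisdiction": "US", "max_vibration_g": 2.5},
]
local, summary = route(records)
print(sorted(local))                          # ['chicago-dc', 'china-telecom-dc']
print([s["engine_health"] for s in summary])  # ['ok', 'inspect']
```

The operator's headquarters sees the `engine_health` summaries for the whole fleet, while the raw vibration traces remain in the country where each flight landed.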
At the same time, we're working with multinational operators that would like to see pretty much a high-level view of the whole operation. They really don't care that some of the data has to be in China and some has to be in Europe. So we're trying to serve both masters. The way we deal with that is that the data itself is gonna be in the appropriate data center. If the plane lands in Beijing or Shanghai and the data has to stay in China, so be it. We're working with China Telecom to build a data center in China that is gonna be sanctioned by the government; they can put the data there. Now the question is what can leave the jurisdiction, and in many cases it's really high-level views of the averages, or the economic levers that the operators would like to see. So if it's a United Airlines flight that lands in Beijing, the actual information may stay in Beijing, but the insights, and as a result the outcomes, can still move back to the cloud, where United Airlines in Chicago will be able to see all of that. So we have to work with each one of the customers to understand what's important for them to keep in the jurisdiction, and what we can extract as a high-level, aggregated point of view and give back to them wherever they are.

So Harel, I love this conversation, but we're running out of time, so I want to give you the last word. Again, you've been in the IT business for a long time. You've been in the enterprise space. You've made the jump over to GE specifically, and to IoT and the Industrial Internet generally. What are you excited about? Why did you make that move? As you look down the road, I won't say five years, that's forever in our world, but a couple of years, two or three years out, what's waking you up in the morning?

So what I'm excited about is the fact that Predix is going to be the only cloud that is pure industrial internet.
Other companies and other cloud operators, and we have about five or six at the mega-scale level, would argue that you can do it there, that a cloud is a cloud is a cloud. I would beg to differ. I think that as you follow the bit from the edge device all the way to the cloud, you see it behaves differently; it gets a different articulation, a different analysis. What keeps me awake at night is just the fact that we have to grow this thing, and we have to get enough developers. The consumer world has benefited tremendously from the tribe of developers descending on a problem, be it shared rides or auctions, and really making life easier for all of us. We can't even remember the time when you had to go to a real store and buy the stuff; look at Amazon or eBay today. We intend to do the same thing in a much more regulated, much more disciplined environment. But the important thing is just to grow the tribe, to have enough developers knowledgeable about Predix and the Predix-related abstractions, and this is what we're all about. This is what this conference is all about.

A heck of a start, I tell you. We go to a lot of conferences, a lot of developer conferences, and I don't know that I've ever been to an inaugural one that had 2,000 people. So you're clearly on the right track.

Well, thank you very much.

Absolutely. Well, thank you for stopping by. I appreciate it, and best of luck. We'll see you maybe at Minds and Machines.

Absolutely, looking forward to it.

All right, Harel Kodesh, the CTO of GE Digital. I'm Jeff Frick. You're watching theCUBE. Thanks for watching.