Hey, welcome back everybody. Jeff Frick here with theCUBE. We're in our Palo Alto studios having a CUBE conversation, a little break in the action of the conference season before things heat up, before we kind of come to a close of 2018; it's been quite a year. But it's nice to be back in the studio, things are a little bit less crazy, and we're excited to talk about one of the really hot topics right now, which is edge computing, fog computing, cloud computing. What do all these things mean? How do they all intersect? And we've got with us today David King. He's the CEO of FogHorn Systems. David, first off, welcome. Thank you, Jeff. So FogHorn Systems, I guess by the name, you guys are all about the fog. And for those that don't know, fog is kind of this intersection between cloud and on-prem. So first off, give us a little bit of background on the company, and then let's jump into what this fog thing is all about. Sure, and it actually all dovetails together. So yeah, you're right, the FogHorn name itself came from fog computing, a term Cisco coined almost a decade ago. It connoted this idea of computing at the edge, but it didn't really have a lot of definition early on. And so FogHorn was started by a Palo Alto incubator just nearby here that had the idea that, hey, we ought to put some real meaning and some real meat on the bones of fog computing. And what FogHorn has done over the last three and a half years, since I joined and we took it out of the incubator, is put some real purpose, meaning and value into that term. And so it's more than just edge computing, right? Edge computing is a related term. In the industrial world, people have said, hey, I've had edge computing for 30, 40, 50 years with my production line controllers and my distributed control systems. I've got hardware to do the compute. They run what they call industrial PCs in the factory; that's edge compute.
The IT world's come along and said, no, no, fog compute is a more advanced form of it. Well, the real purpose of fog computing, of edge computing, in our view of the modern world, is to apply what has traditionally been thought of as cloud computing functions, big, big data, but running in an industrial environment or running on a machine. So what we call it is really big data operating in the world's smallest footprint, okay? And the real point of this for industrial customers, which is our primary focus, industrial IoT, is to deliver as much analytic, machine learning, deep learning, AI capability on live streaming sensor data as possible, okay? What that means is, rather than persisting a lot of data on-prem and then sending it to the cloud, or trying to stream all of it to the cloud to make sense of terabytes or petabytes a day per machine sometimes (think about a jet engine, a petabyte every flight), you want to do the compute as close to the source as possible, and if possible on the live streaming data, not after you've persisted it on a big storage system, right? So that's the idea. You touch on all kinds of stuff there, so let's break it down, let's unpack it. So first off is just the OT-IT thing, and I think that's really important. We talked, before we turned the cameras on, about Dr. Tom from HPE; he loves to make a big symbolic handshake of operational technology and IT and the marriage of these two things. Where before, as you said, the OT guys, the guys that have been running factories, they've been doing this for a long time, and now suddenly the IT folks are butting in and want to get access to that data to provide more control. So as you see the marriage of those two things coming together, what are the biggest points of friction, and really, what's the biggest opportunity? Yeah, great set of questions. So quite right, the OT folks are inherently suspicious of IT, right?
I mean, you may not know the history, but 40-plus years ago there was a fork in the road in factory operations: were they going to embrace things like Ethernet and internet-connected systems? In fact, they purposely air-gapped or islanded those systems, because it was all about machine control, real time for safety, productivity and uptime of the machine. You can't use standard Ethernet; it has to be industrial Ethernet, right? It has to be time-bound and deterministic, it can't be a retry kind of system, right? So a different MAC layer for Ethernet, for example. What does the physical wiring look like? It's also different cabling, because you can't have cuts or splices in the cable, right? And so it's a different environment entirely that OT grew up in. And so FogHorn is trying to bring the value of what people are delivering for AI into that environment in a way that's non-threatening to, is supplemental to, and adds value in the OT world. So Dr. Tom is right: this idea of bringing IT and OT together is inherently challenging, because these were forks in the road that created islanded networks, if you will, with different systems, different nomenclature, different protocols, right? And so there's a real education curve that IT companies are going through. And the idea of taking all this OT data that's already being produced in tremendous volumes, before you even add new kinds of sensing, and sending it across a LAN it has never talked to before, then across a WAN, to go to a cloud to get some insight, doesn't make any sense, right? So you want to leverage the cloud, you want to leverage data centers, you want to leverage the LAN, you want to leverage 5G, you want to leverage all the new IT technologies, but you have to do it in a way that makes sense for, and adds value in, the OT context. Right. Well, I'm just curious, you talked about air-gapping the two systems, which means they're not connected, right?
No, they're connected to themselves industrially. Right, but before, the OT system was air-gapped from the IT system, so think about security and those types of threats. Now, if those things are connected, that security measure has gone away. So what is the excitement, adoption, scare, when now suddenly these things that were separate are brought together, especially in the age of breaches that we know happen all the time? Well, in fact, there have been cyber breaches in the OT context. Think about Stuxnet; think about things that have happened; think about the utilities back east that were found to have malware implanted in them. Right. And so this idea of industrial IoT is very exciting, the ability to get real-time, kind of game-changing insights about your production. A huge amount of the economic activity in the world could be dramatically improved; you can talk about trillions of dollars of value, which is what McKinsey and BCG and Bain talk about, by bringing AI and ML into the plant environment. Right. But the inherent problem is that by connecting the systems, you introduce security problems. You're talking about a huge amount of cost to move this data around, persist it, and then add value. And it's not real time, right? So it's not that cloud is not relevant, it's not that it's not used; it's that you want to do the compute where it makes sense. And for industrial, the more industrialized the environment, the more high-frequency, high-volume data, the closer to the system you can do the compute, the better. Right. And again, it's multiple layers of compute. You probably have something on the machine, something in the plant, and something in the cloud, right. Right.
But rather than send raw OT data to the cloud, you're going to send intelligent metadata insights that have already been derived at the edge to update what they call the fleet-wide digital twin, the digital twin for the whole fleet of assets that lives in the cloud. But the digital twin of the specific asset should probably be on the asset. Right. So let's break that down a little bit; there's so much good stuff here. So we talked about OT and IT and that marriage. Next I just want to touch on cloud, because a lot of people know cloud. It's very hot right now, and the ultimate promise of cloud, right, is you have infinite capacity available on demand, you have infinite compute, and hopefully you have some big fat pipes to get your stuff in and out. But the OT challenges, and as you said, the device challenges, are very, very different. They've got proprietary operating systems. They've been running for a very, very long time. As you said, they put off boatloads and boatloads of data that was never really designed to feed a machine learning algorithm or an artificial intelligence algorithm; when these things were designed, that wasn't really part of the equation. And we talk all the time about: do you move the compute to the data? Do you move the data to the compute? And really what you're talking about in this fog computing world is kind of a hybrid, if you will: trying to figure out which data you want to process locally, and which data, based on time, relevance and other factors, you just go ahead and pump upstream. Right, that's a great way to describe it. Actually, we're trying to move as much of the compute as possible to the data. That's really the point: fog computing is a nebulous term about edge compute, and it doesn't have any value until you actually decide what you're trying to do with it.
And what we're trying to do is take as much of the harder compute challenges, like analytics, machine learning, deep learning, AI, and bring them down to the source, as close to the source as you can, because you can essentially streamline or make more efficient every layer of the stack. Your models will get much better, right? You might have built them in the cloud initially. Think about a deep learning model that may only be 60, 70% accurate. How do you improve the model to get it closer to perfect? I can't send all the data up to keep trying to improve it. Well, here's what typically happens: I downsample the data, I average it, I send it up, and I don't see any changes in the averaged data. Guess what? What you should do is inference all the time on all the data, running in our stack, and then send the metadata up, and then have the cloud look across all the assets of a similar type and say, oh, the global fleet-wide model needs to be updated, and then push it down. So with Google, just about a month ago in Barcelona at the IoT show, what we demonstrated was the world's first instance of AI for industrial, which is closed-loop machine learning. We were taking a TensorFlow model, trained in the cloud in a data center, bringing it into our stack, running 100% inferencing on all the live data, pushing the insights back up into Google Cloud, and then automatically updating the model without a human data scientist having to look at it, because essentially it's ML on ML. And to us, ML on ML is the foundation of AI for industrial. I just love that; something that comes up all the time, right? We used to make decisions based on a sampling of historical data, after the fact. That's right. That's how the world's been doing it. Now the promise of streaming is you can make them based on all the data. All the time. All the time, in real time. It's a very different thing.
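That closed-loop pattern (inference on every sample at the edge, only compact metadata going upstream, a model update triggered when drift shows up) can be sketched in a few lines of Python. This is a toy illustration, not FogHorn's actual stack or API; the class name, fields and thresholds are all invented:

```python
import statistics
from collections import deque

class EdgeInference:
    """Toy closed-loop-ML edge node: score every live sample, watch for
    drift between prediction and reality, and emit small metadata insights
    instead of shipping raw data upstream. Illustrative only."""

    def __init__(self, model, drift_threshold=2.0, window=100):
        self.model = model                  # callable: sensor reading -> prediction
        self.drift_threshold = drift_threshold
        self.errors = deque(maxlen=window)  # rolling prediction error
        self.insights = []                  # metadata destined for the cloud

    def process(self, reading, actual):
        predicted = self.model(reading)
        self.errors.append(abs(predicted - actual))
        if len(self.errors) == self.errors.maxlen:
            mean_err = statistics.mean(self.errors)
            if mean_err > self.drift_threshold:
                # Send only a compact insight record upstream, plus a raw
                # sample from the drift window, so the fleet-wide model can
                # be retrained in the cloud and pushed back down.
                self.insights.append({"event": "model_drift",
                                      "mean_error": round(mean_err, 3),
                                      "raw_sample": reading})
                self.errors.clear()
        return predicted
```

In a real deployment the `insights` list would be published to a cloud endpoint (Google IoT Core in the demo described above) and the retrained model pushed back down to replace `self.model`, closing the loop.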
So, but as you talk about running complex models and running ML and retraining these things: when you think of edge, you think of some little hockey puck that's out on the edge of a field with limited power, limited connectivity. So what is the reality of how much power you have at some of these more remote edges? We always talk about the field of turbines, oil platforms; how much power do you need, and how much compute, for it to actually start to be meaningful as a platform for the software? Right. There are definitely use cases like the smart meters in the home. The older generation of those meters may have had very limited compute, right? Like a single megabyte of memory, maybe less, right? Kilobytes of memory. Very hard to run a stack in that kind of footprint. The latest generation of smart meters has about 250 megabytes of memory. A Raspberry Pi today is anywhere from half a gig to a gig of memory, and we're fundamentally memory-bound, and obviously CPU-bound if you're trying to do really fast compute like vibration analysis or acoustic or video. But if you're just trying to take digital sensing data, like temperature, pressure, velocity, torque, humidity, we can take all of that and, believe it or not, run literally dozens and dozens of models, even train the models, on something as small as a Raspberry Pi or a low-end x86, right? So our stack can run on any hardware and is completely OS-independent. It's a full software layer, and the whole stack is about 100 megabytes of memory with all the components, including Docker containerization, right? Which compares to about 10 gigs for running a stream processing stack like Spark in the cloud. So it's that order of magnitude of footprint reduction and speed-of-execution improvement. As I said, we're the world's smallest, fastest compute engine.
You need to do that if you're going to talk about something like a wind turbine; it's generating data every millisecond, right? So you have high-frequency data, like turbine pitch, and you have other contextual data you're trying to bring in, like wind conditions, reference information about how the turbine is supposed to operate. You're bringing in a torrential amount of data to do this computation on the fly. Right. And so the challenge for a lot of the companies that have really started to move into the space, the cloud companies like our partners Google and Amazon and Microsoft, is that they have great cloud capabilities for AI and ML, and they're trying to move down to the edge by just transporting the whole stack there. So in a plant environment, okay, that might work if you have a massive data center that can run it, but I still have to stream all the data from all my assets to that central point. What we're trying to do is come at it the opposite way: by having the world's smallest, fastest engine, we can run in very limited compute on the asset or near the asset, or you can run us on big compute and we can take on lots and lots of use cases or models simultaneously. Right. I'm just curious about the small compute case. And again, you want all the data to analyze it, you want to infer something, right? Does it eventually go back, or are there a lot of cases where you can get the information you need off the stream and you don't necessarily have to save or send it upstream? Yeah, so fundamentally, today in the OT world, the data usually gets handled in the PLC, the programmable logic controller. It has simple KPIs: if temperature goes to X or pressure goes to Y, do this. Beyond those simple KPIs, if nothing is executed, the data gets dumped into a local protocol server, and then about every 30, 60, 90 days it gets written over. Nobody ever looks at it, right? That's what they say.
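For contrast, the simple PLC-style KPI checks just described ("if temperature goes to X or pressure goes to Y, do this") amount to little more than fixed thresholds on individual tags. A minimal sketch, with invented tag names and limits:

```python
# Toy version of simple PLC-style KPI rules: fixed thresholds fire an
# action, and any sample that fires nothing is effectively discarded.
# Tag names, limits and action names here are invented for illustration.

RULES = [
    ("temperature", lambda v: v > 90.0, "open_cooling_valve"),
    ("pressure",    lambda v: v > 5.5,  "trigger_relief_alarm"),
]

def evaluate(sample):
    """Return the list of actions fired for one sensor sample (a dict)."""
    actions = []
    for tag, predicate, action in RULES:
        if tag in sample and predicate(sample[tag]):
            actions.append(action)
    return actions
```

Everything these rules don't catch is exactly the data that gets dumped and overwritten; the point of richer edge analytics is to evaluate learned models and multi-sensor expressions on that same live stream instead of letting it go unexamined.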
There's a lot of it; you know, 99% of the brownfield data in OT has never really been mined. Almost like a security camera. It's never been mined for insight. Right, it runs and runs and runs. Exactly. And so if you're doing real-time inferencing and real-time decision-making, real-time action, with our stack, what you would then persist is metadata insights: here is an event, or here's an outcome. And by the way, if you're doing deep learning and machine learning and you're seeing deviation or drift from the model's prediction, you probably want to keep that, right? Along with some of the raw data packets from that moment in time. Right, right. And send that to the cloud or data center to say, oh, our fleet-wide model may not be accurate, or it may be drifting, right? Right. And so, again, different horses for different courses: use our stack to do the lion's share of the heavy-duty real-time compute, and produce metadata that you can send to either a data center or a cloud environment for further learning. Right. So your piece is really the gathering and the ML, and then if it needs to go back up for more heavy lifting, you'll send it back up? Or do you have the cloud application as well, if that's needed? Yeah, so we've built connectors to Google Cloud Platform and Google IoT Core, to AWS S3, to Microsoft Azure, to virtually anything: Kafka, Hadoop. We can send the data wherever you want, including on-plant, right back into the existing control systems. We can send it to OSIsoft PI, which is a great time-series database that a lot of process industries use. Right. You can also send it to any public cloud, or to a Hadoop data-lake private cloud. You can send the data wherever you want, right? Now, one of our components is also a time-series database.
You can also persist in memory in our stack, just for buffering, or if you have high-value data where you want to take a measurement or value from a previous calculation and bring it into another calculation you're doing later, right? So it's a very flexible system. Yeah, we were at OSIsoft PI World earlier this year. Fascinating stories came out of that. A 30-year company. The building maintenance and all kinds of stuff. So I'm just curious about some of the easy-to-understand applications that you've seen in the field, and maybe some of the ones that were a surprise on the OT side. I mean, obviously, preventive maintenance is towards the top of the list. Yeah, I call it the layer cake, right? Especially when you get to remote assets that are either not monitored or lightly monitored. I call it drive-by monitoring: somebody shows up, listens to or looks at a valve or a gauge, and leaves. So the first layer is condition-based monitoring, right? That is actually a big breakthrough for some; think about fracking sites or remote oil fields or mining sites. The second layer is predictive maintenance, and the next generation of that is prescriptive, even preventive, maintenance, right? You're making predictions, or you're helping to avoid downtime. The third layer, which is really where our stack is fairly unique today, is asset performance optimization. How do I increase throughput? How do I reduce scrap? How do I improve worker safety? How do I get better processing of the data that my PLC can't give me, so I can actually improve the performance of the machine? Now, ultimately, what we're finding is a couple of things. One is you can look at individual asset optimization, process optimization, but there's another layer, so often we're deployed at two layers on-premise: there's also plant-wide optimization. So we talked about wind farms before, off-camera.
So you've got the wind turbines; you can do a lot of things around turbine health, right? The blade pitch, the condition of the blades. You can do things on the battery, all the systems on the turbine. But you also need a stack like ours running at the concentration point where those 200-plus turbines come together, because for the optimization of the whole farm, every turbine affects every other turbine. So a single turbine alone can't tell you the speed, the rotation, the things that need to change if you want to adjust the speed of one turbine versus the one next to it. So there's also kind of a plant-wide optimization. I've thought, talking about autonomous driving, there are going to be five layers of compute, right? You're going to have what's called the ECU level, the individual subsystems in the car: the engine, how it's performing. You're going to have the gateway in the car to talk about things that are happening across systems in the car. You're going to have the peer-to-peer connection over 5G to talk about optimization between vehicles. You're going to have the base station algorithms looking at a microcell or macrocell within a geographic area. And of course you'll have the ultimate cloud, because you want to have the data on all the assets, right? But you don't want to send all that data to the cloud; you want to send the right metadata to the cloud. That's why they'll have big trunks full of compute. By the way, you mentioned one thing that I should really touch on, which is that we've talked a lot about what I call traditional brownfield automation-and-control type analytics and machine learning. And that's kind of where we started, in discrete manufacturing, a few years ago.
What we found in that domain, and in oil and gas, and in mining, agriculture and transportation, in all those places, is that the most exciting new development this year, I think, is the movement towards video, 3D imaging and audio sensing, because those sensors are now becoming very economical. And people have never thought about, well, if I put a camera in and apply it to a certain application, what can I learn? What can I do that I never did before? And often they even have cameras today and they're not making use of any of the data, right? There's a very large customer of ours who has literally video inspection data for every product they produce, every day, around the world, and this is in hundreds of plants. And that data never gets looked at, right? Other than training operators: hey, you missed the defect that day, right? As you said, they just write over that data after 30 days. Well, guess what? You can apply deep learning TensorFlow algorithms to build a convolutional neural network model and essentially do the human vision task, right? Now, rather than an operator staring at a camera, or trying to look at training tapes 30 days later, I'm doing inferencing on the video image on the fly. Right. So do your systems close the loop back to the control systems now, or is it more of a tuning mechanism for someone to go back and act on later? Great question. I just got asked that this morning by a large oil and gas supermajor that Intel just introduced us to. The short answer is our stack can absolutely go right back into the control loop. In fact, I should mention our investors: our Series A was with GE, Bosch, Yokogawa and Dell EMC, and our Series B a year ago was Intel, Saudi Aramco and Honeywell. So we have one foot in tech, one foot in industrial, and really, as you said, what you're trying to bring together is IT and OT.
The short answer is you can do that, but typically in the industrial environment there's a conservatism: hey, I don't want to touch the machine until I've proven it out. So initially people tend to start with alerting, right? We send an automatic alert back into the control system to say, hey, the machine needs to be retuned, right? Very quickly, though, certainly for things that are not so time-sensitive, they will just let us act. Now, Yokogawa, one of our investors, as I pointed out, is actually putting us in PLCs, right? So rather than sending the data off the PLC to another gateway running our stack, like an x86 or ARM gateway, those PLCs now have Raspberry Pi-plus capabilities, right? A lot of them are based on... What type of mechanism? Well, typically, right now they're doing the I/O and the control of the machine, but they have enough compute now that you can run us in a separate module, like a little brain sitting right next to the controller, and do the AI on the fly, and there you actually don't even need to send the data off the PLC; we just reprogram the actuator. So that's where it's heading, right? It could take years before people get comfortable doing this automatically, but what you'll see is that what AI represents for industrial is the self-healing machine, the self-improving process, and this is where it starts, right? Well, the other thing I think is so interesting is: what are you optimizing for? And there is no one right answer. It could be that you're optimizing for, like you said, a machine; you could be optimizing for the field; you could be optimizing for maintenance. But if there's a spike in pricing, you may say, eh, we're not optimizing for maintenance right now, we're actually optimizing for output, because we have this temporary condition and it's worth the trade-off.
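That "what are you optimizing for?" point can be made concrete with a trivial sketch: the objective handed to the edge analytics can itself be a function of business conditions, such as a spot-price spike. The function name, prices and spike factor below are invented for illustration:

```python
# Toy objective selector: during a price spike it can pay to favour
# output over maintenance, as described above. All numbers are invented.

def pick_objective(spot_price, normal_price=50.0, spike_factor=1.5):
    """Choose which objective the edge analytics should serve right now."""
    if spot_price >= normal_price * spike_factor:
        return "maximize_output"          # temporary condition, worth the trade-off
    return "minimize_maintenance_risk"    # the default, steady-state objective
```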
So, I mean, there are so many ways you can skin the cat when you have a lot more information and a lot more data. No, that's right, and what we typically like to do is start with: what's the business value, right? We don't want to go do a science project. Oh, I can make that machine work 50% better. Yeah, but if it doesn't make any difference to your business operation, so what? So we always start the investigation with: what is a high-value business problem where you have sufficient data, and where applying this kind of AI-at-the-edge concept will actually make a difference? That's the kind of proof of concept we like to start with. Yeah. So again, just to come full circle: what's the craziest thing an OT guy said? Oh my goodness, you IT guys actually brought some value here that I didn't expect? Well, I touched on video, right? So without going into the whole details of the story, one of our big investors is a very large oil and gas company who said, look, you guys have done some great work with what I call software-defined SCADA. SCADA is the network environment for OT, right? The PLCs and DCSes are connected over these SCADA networks; that's the control and automation world. And this investor said, look, you've already shown us (that's why they invested) that you can go into brownfield SCADA environments, do deep mining of the existing data, and show value by reducing scrap, improving output, improving worker safety, all the great business outcomes for industrial. But if you come into our operation, our plant people are going to say, no, you're not touching my PLC, you're not touching my SCADA network. So come in and do something that's non-invasive to that world. And that's where we actually got started with video, about 18 months ago. They said, hey, we've got all these video cameras and we're not doing anything with them.
You just have human operators writing down, oh, I had a bad event; it's a totally non-automated system. So we went in and did a video use case around what we call flare monitoring: hundreds of stacks burning off oil and gas in a production plant, with operators, 24 by 7, just staring at them, writing down, oh, I think I had a bad flare. It's a very interesting overall process. So by automating that, giving them an AI dashboard, essentially, you've got a permanent record of exactly how high the flare was, how smoky it was, what the angle was, right? And then you can fuse that data back into plant data (what caused that?) and also OSIsoft data (what was the gas composition?). Was it in fact a safety violation? Was it in fact an environmental violation, right? So by starting with video and doing that use case, we've now got dozens of use cases all around video. Oh, I could put a camera on this. I could put a camera on the rig. I could put a camera downhole. I could put a camera on the pipeline, on a drone. There are just a million places that video can show up, and the same goes for audio sensing, right, acoustic. Video is great if you can see the event, right? Like if I'm flying over the pipe, I can see corrosion. But sometimes, like with a burner in an oven, I can't look inside the oven with a camera; there's no camera that could survive 600 degrees. So what do you do? Well, there you're probably going to do something like vibration or acoustic sensing. Inside the pipe, you've got to go with sound; outside the pipe, you go with video. But these are the kinds of things people never had before. Traditionally, how do they inspect pipe? Drive-by. Right, right. Yeah, it's a fascinating story, Dave. And again, I think at the end of the day, you can make real decisions based on all the data in real time, versus some of the data after the fact. That's right.
All right, well, great conversation, and we look forward to watching the continued success of FogHorn. Thank you very much. All right, he's David King, I'm Jeff Frick. You're watching theCUBE. We're having a CUBE conversation at our Palo Alto studio. Thanks for watching. We'll see you next time.