From around the globe, it's theCUBE, presenting theCUBE on Cloud, brought to you by SiliconANGLE.

Hi, I'm Stu Miniman, and welcome back to theCUBE on Cloud. We're talking about really important topics: how developers are changing the way they build their applications, and where those applications live. Of course, that's a discussion we've had for a number of years. How do things change in hybrid environments? We've been talking for years about public cloud and private cloud, and I'm really excited that in this session we're going to talk about how edge environments and AI impact all of that. So, happy to welcome back one of our Cube alumni: Simon Crosby is currently the Chief Technology Officer at Swim. He's got plenty of viewpoints on AI and the edge, and knows the developer world well. Simon, welcome back. Thanks so much for joining us.

Thank you, Stu, for having me.

All right, so let's start with developers. For years we talked about the level of abstraction we get. Do I put it on bare metal? Do I virtualize it? Do I containerize it? Do I make it serverless? Those are all things the app developer doesn't want to have to think about, but location matters a whole lot when we're talking about things like AI. Where do I have all my data so I can do my training? Where do I actually have to do the processing? And of course, edge changes things like latency and where data lives by orders of magnitude. So with that as a setup, I'd love to get your framework for what you're hearing from developers, and then we'll get into some of the solutions you and your team are building to help them do their jobs.

Well, you're absolutely right, Stu, the data onslaught is very real. Companies that I deal with are facing more and more real-time data, from products, from their infrastructure, from their partners, whatever it happens to be, and they need to make decisions rapidly. The problem they're facing is that traditional ways of processing that data are too slow. The big data approach, which by now is a bit long in the tooth, where you store data and then analyze it later, is problematic. First of all, data streams are boundless, so you don't really know when to analyze. But second, you can't store it all. So the store-and-analyze approach has to change, and Swim is trying to do something about this by analyzing on the fly. As data is generated, as you receive events, you don't bother to store them; you analyze them. Then, if you have to, you store the data. But you need to analyze as you receive data and react immediately, to be able to generate reasonable insights or predictions that can drive commerce and decisions in the real world.
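To make the analyze-on-the-fly idea concrete, here is a minimal, hypothetical Java sketch. It is not Swim code; the names (`OnlineAnalyzer`, `onEvent`) are illustrative. The point is that each source keeps only a tiny running summary, updated as events arrive, so a boundless stream never has to be stored.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Analyze-on-the-fly: keep a small running summary per source instead of
// storing the raw event stream. Welford's online algorithm updates the
// mean and variance in O(1) per event, so state stays constant-sized no
// matter how long the stream runs. (Assumes one thread per source.)
public class OnlineAnalyzer {
    static final class RunningStats {
        long n;
        double mean;
        double m2; // sum of squared deviations from the mean

        void update(double x) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }

        double variance() { return n > 1 ? m2 / (n - 1) : 0.0; }
    }

    private final Map<String, RunningStats> statsBySource = new ConcurrentHashMap<>();

    // Called for each event as it arrives; the raw event is analyzed and dropped.
    public void onEvent(String sourceId, double value) {
        RunningStats s = statsBySource.computeIfAbsent(sourceId, id -> new RunningStats());
        s.update(value);
        // React immediately, e.g. flag readings more than three sigma from the mean.
        double sigma = Math.sqrt(s.variance());
        if (s.n > 30 && Math.abs(value - s.mean) > 3 * sigma) {
            System.out.println(sourceId + ": anomalous reading " + value);
        }
    }
}
```

The contrast with store-then-analyze is that nothing here ever touches a disk or a database; the insight (the anomaly flag) is produced in the same instant the event arrives.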
Yeah, absolutely. I remember back in the early days of big data, real-time got thrown around a little, but it usually meant reacting fast enough to make sure we didn't lose the customer. The model was to gather all the data and move compute to the data. Today, as you say, real-time streams are so important. We've been talking about observability for the last couple of years: really understanding the systems and the outputs, rather than looking back historically and waiting for alerts. So could you give us some examples of those streams? What is so important about being able to interact with and leverage that data when you need it, rather than having to store it and think about it later? Obviously there are some benefits there.

Every product nowadays has a CPU, right? And so there's more and more data. Let me give you an example. Swim processes real-time data from more than 100 million mobile devices, in real time, for a mobile operator. What we're doing there is optimizing connection quality between the devices and the network. Now, that volume of data is more than four petabytes per day, okay? There is simply no way you can ever store that and analyze it later. The interesting thing is that if you adopt an architecture of analyze first, and then store only if you really have to, you get to take advantage of Moore's Law. You're running at CPU and memory speeds instead of at disk speed, and that gives you a million-fold speedup. It also means you don't have the latency problem of reaching out to a remote storage database or whatever, and so it reduces cost. We can do it on about 10% of the infrastructure they previously had for a Hadoop-style implementation.

So maybe it would help if we just explain. When we say edge, people think of a lot of different things. Is it an IoT device sitting out at the edge? Are we talking about the telecom edge? We've been watching AWS for years spider out their services into various environments. So for the type of solutions you're doing and what your customers have, is it the telecom edge? Is it the actual device edge? Where does processing happen, and where do the services that work on it live?

I think the right way to think about edge is: where can you reasonably process the data? It obviously makes sense to process data at the first opportunity you have, but much data is encrypted between the original device, say, and the application. So edge as a place doesn't make as much sense as edge as an opportunity to decrypt and analyze data in the clear. Edge computing is not so much a place, in my view, as the first opportunity you have to process data in the clear and make sense of it. And then edge makes sense in terms of latency: by locating compute as close as possible to the sources of data, you reduce latency and maximize your ability to get insights and return them to users quickly. So edge, for me, is often the cloud.

Excellent. One of the other things I think about, from the big data days or even earlier, is how long it took to get from raw data, to processing that data, to getting some insight, and then to taking action. It sure sounds like we're trying to collapse that completely. How do we do that? Can we actually build the system so that, in that real-time continuous model you talk about, we take care of it and move on?

One of the wonderful things about cloud computing is that two major abstractions have really served us, and those are REST, which is stateless computing, and databases. REST means any old server can do the job for me, and then the database is just an API call away. The problem with that is that it's desperately slow. When I say desperately slow, I mean it has probably thrown away the last 10 years of Moore's Law. Just think about it this way.
Your CPU runs at gigahertz and the network runs at milliseconds. So by definition, every time you reach out to a data store, you're going a million times slower than your CPU. That's terrible. It's absolutely tragic, okay? A model which is much more effective is an in-memory computing architecture in which you engage in stateful computation. Instead of reaching out to a database on every event, storing something and then fetching it again a few moments later when the next event arrives, you keep state in memory and you compute on the fly as data arrives. That way you get a million-times speedup. You also end up with a tremendous cost reduction, because you don't need as many instances to do the computing.

Let me give you a quick example. If you go to traffic.swim.ai, you can see the real-time state of the traffic infrastructure in Palo Alto, and each one of those intersections is predicting its own future. Now, the volume of data from just a few hundred lights in Palo Alto is about four terabytes a day. And sure, you can deal with that in AWS Lambda; there are lots and lots of servers up there. But the problem is that the end-to-end, per-event latency is about 100 milliseconds, and if I'm dealing with 30,000 events a second, that's just too much. Solving that problem with a stateless architecture is extraordinarily expensive, more than $5,000 a month, whereas the stateful architecture, which you could think of as an evolution of something reactive, or of the actor model, gets you there at something like a tenth of the cost. So cloud is fabulous for things that need to scale wide, but a stateful model is required for dealing with things which update you rapidly or regularly about their changes in state.

Yeah, absolutely. I think about the AI training models I mentioned before. If you look at something like autonomous vehicles, the massive amount of data that needs to be processed for training has to happen in the public cloud, but the result then gets pushed back down to the end device, in this case a car, because it needs to react in real time, and it gets fed the new trained models at regular updates. What are you seeing?

I have one reservation about this training approach, and about data science in general, and that is that there aren't enough data scientists or smart people to train these algorithms, deploy them to the edge, and so on. So there is an alternative worldview, a much simpler one, and that is that relatively simple algorithms, deployed at scale to stateful representatives, sort of digital twins of things, can deliver enormous improvements in behavior as things learn for themselves. The way I think this edge world gets smarter, at least, is that relatively simple models of things will learn for themselves, predict their own futures based on what they can see, and then react. So this idea that we have lots and lots of data scientists dealing with vast amounts of information in the cloud is suitable for certain algorithms, but it doesn't work for the vast majority of applications.
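The million-times figure is just the ratio of a nanosecond-scale CPU cycle (a few gigahertz) to a millisecond-scale network round trip: 10^-3 s divided by 10^-9 s is 10^6. Below is a hypothetical Java sketch of the pattern Simon describes: one long-lived, stateful object per intersection that holds a simple model in memory, learns from each event, and emits a prediction, with no database on the hot path. The class and field names are illustrative, not Swim's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One stateful, in-memory object per real-world thing. Each intersection
// keeps a simple model (an exponentially weighted moving average) and
// updates it at memory speed on every event; there is no database round
// trip, so each event costs nanoseconds rather than milliseconds.
public class TrafficModel {
    static final class Intersection {
        final String id;
        double avgWaitSeconds;           // the "simple model" that learns
        static final double ALPHA = 0.1; // smoothing factor (assumed)

        Intersection(String id) { this.id = id; }

        // Learn from the observation and return a prediction of the
        // expected wait; the prediction is streamed out instead of the
        // raw data being stored.
        double onVehicleDetected(double observedWaitSeconds) {
            avgWaitSeconds = ALPHA * observedWaitSeconds + (1 - ALPHA) * avgWaitSeconds;
            return avgWaitSeconds;
        }
    }

    private final Map<String, Intersection> intersections = new ConcurrentHashMap<>();

    public double onEvent(String intersectionId, double observedWaitSeconds) {
        return intersections
                .computeIfAbsent(intersectionId, Intersection::new)
                .onVehicleDetected(observedWaitSeconds);
    }
}
```

A stateless version of the same logic would read, modify, and write `avgWaitSeconds` in a remote store on every one of the 30,000 events per second, paying the millisecond-scale network latency each time.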
So where are we with what developers need to think about? You mentioned that there's compute in most devices, and that's true, but do they need some special Nvidia chipset out there? Are there certain programming languages that you're seeing as more prevalent? What about interoperability? Give us some tips and tricks for those developing.

Super. So number one, a stateful architecture is fundamental, and sure, reactive is well known: there's Akka, for example, and Erlang; Swim is another. I'm going to use Swim's language here, and I would encourage you to look at SwimOS.org and go play there. A stateful architecture which allows actors, small concurrent objects, to statefully evolve their own state based on updates from the real world is fundamental. By the way, in Swim, we use the data to build these models. These little agents for things, we call them web agents because the object ID is a URI, statefully evolve by processing their own real-world data and statefully representing it, and then they do this wonderful thing, which is to build a model on the fly. They build the model by linking to the things they're related to. So an intersection would link to all of its sensors, but it would also link to all of its neighbors, because linking is like a subscription in pub/sub, and it allows that web agent to continually analyze, learn, and predict on the fly. Every one of these concurrent objects is doing this job of analyzing its own raw data, predicting from that, and streaming the result. So in Swim, raw data streams in and predictions stream out: predictions about the future state of the infrastructure. That's a very powerful stateful approach which can run entirely in memory, no storage required. By the way, it's still persistent, so if you lose a node, you can just come back up and carry on, but there's no need to store huge amounts of raw data if you don't need it. And let me just be clear, the volumes of raw data from the real world are staggering: four terabytes a day from Palo Alto, about 60 terabytes a day from the traffic lights in Las Vegas, and tens of petabytes per day from more than a hundred million mobile devices, which is just too much to store.

Simon, you mentioned that we have a shortage of data scientists and of the people who can be involved in those things. How about on the developer side? Do most enterprises that you're talking to have the skill set? Is the ecosystem mature enough for companies to get involved? What do we need to do, looking forward, to help companies take advantage of this opportunity?

Yeah, so there is a huge challenge in terms of cloud-native skills, and it's exacerbated the more you get out into what you could think of as traditional kinds of companies, all of whom have tons and tons of data sources. So we need to make it easy, and Swim tries to do this by using skills that people already have, Java or JavaScript, and giving them easy ways to develop, deploy, and then run applications without thinking about them. Instead of binding developers to notions of place, and where databases are, and all that sort of stuff, if they can write simple object-oriented programs about things like intersections and push buttons and pedestrian lights and in-road loops and so on, and simply relate the basic objects in their world to each other, then we let the data build the model, by essentially creating one of these little concurrent objects for each thing, and the objects then link to each other and solve the problem. We end up solving a huge problem for developers too, which is that they don't need to acquire complicated cloud-native skill sets to get to work.

Well, absolutely, Simon. That's something we've been trying to do for a long time: truly simplify things.
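A framework-agnostic sketch of the web-agent idea may help here. This is not the SwimOS API (see SwimOS.org for the real thing); it only illustrates the shape of the pattern: each agent is addressed by a URI, statefully evolves from its own events, and links to related agents so that their streamed state feeds its own model, pub/sub style. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleConsumer;

// Illustrative-only sketch of web agents: each agent has a URI, evolves
// its own state from raw events, and "links" to related agents, which
// behaves like a subscription in pub/sub. Not the SwimOS API.
public class WebAgentSketch {
    static final class Agent {
        final String uri;                  // e.g. "/intersection/1"
        double state;                      // the agent's evolving state
        final List<DoubleConsumer> links = new ArrayList<>();

        Agent(String uri) { this.uri = uri; }

        // Another agent links to this one to receive its state stream.
        void link(DoubleConsumer onUpdate) { links.add(onUpdate); }

        // Evolve state from a raw event, then stream the new state to linkers.
        void onEvent(double value) {
            state = 0.9 * state + 0.1 * value; // toy on-the-fly model
            links.forEach(l -> l.accept(state));
        }
    }

    public static void main(String[] args) {
        Map<String, Agent> mesh = new HashMap<>();
        Agent a = mesh.computeIfAbsent("/intersection/1", Agent::new);
        Agent b = mesh.computeIfAbsent("/intersection/2", Agent::new);

        // Intersection 2 links to its neighbor, so the neighbor's streamed
        // state becomes input to intersection 2's own model: the links are
        // what build the model of the whole infrastructure.
        a.link(b::onEvent);

        a.onEvent(42.0); // raw data in; predictions stream out along links
    }
}
```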
I want to let you have the final word. If you look out there at the opportunity and the challenge in this space, what final takeaways would you give our audience?

Very simple. If you adopt a stateful computing architecture like Swim, you get to go a million times faster. The applications always have an answer. They analyze, learn, and predict on the fly. And they use 10% of the infrastructure of a store-then-analyze approach. It's the way of the future.

Simon Crosby, thanks so much for sharing. Great having you on the program.

Thank you, Stu.

And thank you for joining us. I'm Stu Miniman. Thank you, as always, for watching theCUBE.