I'm Peter Burris and welcome to another Cube Conversation. We're broadcasting from our beautiful Palo Alto studios, and this time we've got a couple of great guests from Swim. One of them is Chris Sachs, who's the founder and lead architect, and the other is Simon Crosby, who's the CTO. Welcome to theCUBE, guys. Great, thank you. So let's start. Tell us a little bit about yourselves. Simon or Chris, let's start with you. So my name's Chris Sachs. I'm a co-founder of Swim, and my background is in embedded and distributed systems and bringing those two worlds together. I've spent the last three years building software from first principles for edge computing. But embedded, very importantly: that's small devices, highly distributed, with a high degree of autonomy, and how they interact with each other. Right. You need both a small footprint and the ability to scale down and out. Got it. That's one thing that we say: people sort of get scaling up and out in the cloud, but for the edge you need to scale down and out. There are similarities to how clouds scale, and some very different principles. We're going to get into that. So Simon, CTO. Sure, my name is Simon Crosby. I came this way courtesy of being an academic a long time ago and then doing startups. This is startup number five for me. I was CTO and founder at XenSource, where we built the Xen hypervisor, and also at Bromium, where we did micro-virtualization. I'm privileged to be along for the ride with Chris. Excellent. So guys, the Swim promise is edge AI. I like that down and out. Tell us a little bit about it, Chris. So one of the key observations that we've made over the past half decade is that there are a whole lot of compute cycles being showered on planet Earth. Arm is shipping five billion chips a quarter, and there's a tremendous amount of compute generating a tremendous amount of data, and it's sort of trapped at the edge.
There are physics problems and economic problems with backhauling it all to the cloud, but you're capturing the functionality of the world on these chips. Yeah, we like to say that if software is going to eat the world, it's going to eat it at the edge. Is that kind of what you mean? Yeah, that's right. When you decide you want to sort of eat the edge, you run into problems very quickly with the traditional way of doing things. One example is: where does your database live? If you live on the edge, which telephone pole are you going to put your database node in? How big does it need to be? Right. There are a number of decisions that are very difficult to make, and so that's Swim's promise. Now, you have some advantages as well, in that billions of clock cycles go by on these chips in between network packets. And if you can figure out how to squeeze your software into these slop cycles between network packets, you actually have a supercomputer, a global supercomputer, on which you can do machine learning and try to predict how physical systems are going to play out. Hence your background in distributed systems, because the goal is to try to ensure that the network packets are as productive as possible. Exactly. Here's another way of looking at the problem. If you come top down, the vision is that in the future all sorts of things, which have computers and maybe some networking in them, will present to you a digital twin of themselves. Where does that thing come from? Now describe digital twin. We've done a lot of research on this, but it's still a relatively novel concept. GE talks about it, IBM talks about it. When we say digital twin, we're talking about the simulacrum, the digital representation of an actual thing, right? Of an actual thing. And so there are a couple of ways you can get there.
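To make the "slop cycles" idea concrete, here is a toy sketch (hypothetical, not Swim's actual scheduler): network packets always get drained first, and only genuinely idle slots are spent on model work.

```python
from collections import deque

def run_node(packets, idle_budget):
    """Drain pending packets first; spend empty slots on learning work."""
    inbox = deque(packets)
    processed, learning_steps = [], 0
    while inbox or learning_steps < idle_budget:
        if inbox:
            processed.append(inbox.popleft())  # network work always comes first
        else:
            learning_steps += 1                # spare cycles go to the model
    return processed, learning_steps

print(run_node(["p1", "p2"], idle_budget=3))  # → (['p1', 'p2'], 3)
```

The point of the sketch is only the priority ordering: the compute that would otherwise be idle between packet arrivals is where the learning happens.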
One way is: if you give me the detailed design of a thing and exactly how it works, I can give you all of that detail, and maybe an operator can use that to find a problem. The other way is to try to construct it automatically, and that's exactly what Swim does. It takes the thing and builds models around it. Well, so what do things do? Things give us data. And so the problem then becomes: how can I build a digital twin just given the data? Just given the observations of what this thing is seeing, what its sensors are bleating about, what things near it are saying, how can I build a digital twin which will analyze itself, tell you what its current state is, and predict the future just from the data? All right, so the bottom line is that you're providing a facility to help model real-world things that tend to operate in an analog way, and turning them into digital representations that can then be a full member, in fact perhaps even a superior member, in a highly distributed system of how things work together. Got that right? A few key points: these digital twins are in the loop with the real world, and they are in the loop with their neighbors. You start with digital twins that reflect the physical world, but they don't end there. You can have digital twins of concepts as well, and other sort of higher-order notions, and from the masses of data that you get from physical devices, you can actually infer the existence of twins where you don't even have a sensor. Let's make it real. So if you happen to be tracking all of the buses in downtown San Francisco, you can infer PM10 pollution as a virtual sensor on a bus. And then you can pretty quickly work out something which is of value to somebody who's trying to sell insurance, for example, okay?
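As a rough illustration of the second route, a digital twin built only from data can be thought of as an object that ingests raw observations, tracks its current state, and maintains a learned expectation of what comes next. This is a minimal sketch with invented names, not Swim's API; the exponential smoothing here stands in for whatever model is actually learned.

```python
class DigitalTwin:
    """Toy digital twin: learns about a thing purely from its data stream."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha      # smoothing factor for the learned estimate
        self.state = None       # last observed value (current state)
        self.estimate = None    # learned expectation of the next value

    def observe(self, value):
        """Ingest one sensor reading and update the learned model."""
        self.state = value
        if self.estimate is None:
            self.estimate = value
        else:
            # blend the new observation into the running estimate
            self.estimate += self.alpha * (value - self.estimate)

    def predict(self):
        """Best guess at the next reading, derived only from past data."""
        return self.estimate

twin = DigitalTwin()
for reading in [10.0, 12.0, 11.0, 13.0]:
    twin.observe(reading)
print(twin.state, round(twin.predict(), 3))  # → 13.0 11.404
```

Nothing about the physical thing was specified up front; everything the twin "knows" came from its observations, which is the essence of the data-driven approach described above.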
And that's not a real sensor on every bus, but you can compose these things given that you have these other digital twins that manifest themselves. So folks talk about the butterfly effect and things like chaos theory, where a butterfly affects the weather in China. But what we're talking about is that locality really matters. It matters in real systems and it matters in computers. And if you have something that's generating data, more than likely that thing is going to want its own data because of locality, but also the things near it are going to want to be able to infer or understand the behavior of that thing, because it's going to have a consequential impact on them. Correct, so I'll give you two examples of that. We've been working in an aircraft manufacturing facility. The virtual twin here is some widget which has an RFID tag on it. We don't know what that is. We just know there's a tag, and we can place it in three-space because it gets seen by multiple sensors. We triangulate. And then as these tags come together to make, say, an aircraft sub-assembly, that aircraft sub-assembly is kind of another thing, but it's the nearness, the locality, that gets you there, right? So I can say all these tags came together; let's track that as a superior object, or there's a containment notion there. And suddenly we're tracking wheel assemblies instead of widgets, right? And this is where the AI comes in, because now the AI is the basis for recognizing the patterns of these tags and being able to infer from the characteristics of these patterns that it's a sub-assembly. Have I got that right? Right, so there's a unique opportunity that is opened up in AI when you're watching things unfold live, in that you have this great unifying force to learn off of, which is causality. What does everything have in common? It's that it evolves through time.
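The "tags coming together" step can be sketched as a locality rule: once tags have been triangulated into 3-space positions, tags that sit within some radius of each other get grouped into a candidate higher-order object. This is an illustrative toy (greedy single-link clustering with invented data), not the manufacturer's or Swim's actual method.

```python
from math import dist  # Euclidean distance, Python 3.8+

def group_by_locality(tag_positions, radius=1.0):
    """Greedily cluster tags: a tag within `radius` of any member of an
    existing cluster joins it; otherwise it starts a new cluster."""
    clusters = []
    for tag, pos in tag_positions.items():
        placed = False
        for cluster in clusters:
            if any(dist(pos, tag_positions[t]) <= radius for t in cluster):
                cluster.append(tag)
                placed = True
                break
        if not placed:
            clusters.append([tag])
    return clusters

# Four triangulated tags: three close together (a candidate sub-assembly),
# one far away on the factory floor.
tags = {
    "t1": (0.0, 0.0, 0.0),
    "t2": (0.5, 0.0, 0.0),
    "t3": (0.0, 0.6, 0.0),
    "t4": (10.0, 10.0, 10.0),
}
print(group_by_locality(tags))  # → [['t1', 't2', 't3'], ['t4']]
```

A persistent cluster like `['t1', 't2', 't3']` is the kind of signal that would let a system start tracking a wheel assembly as its own superior object instead of three anonymous widgets.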
And what do you do when you have billions of clock cycles to spare between network packets? Well, you can make a guess about what your particular digital twin might see next. You can take a guess based on what your state is and what the sensors around you are saying, and just make a guess. Then you see what actually happens. You measure the error between what you predicted would happen and what actually happened, and you correct for that. And you can do that ad infinitum, trillions of times over the course of a year. You make small corrections for how you think your particular system will evolve, whether it's a traffic light trying to predict when it's going to change, when cars are going to show up, when pedestrians are going to push buttons, or it's a machine, you know, a conveyor belt or a motor in a factory, trying to predict when it might break down. You can learn from these precise systems, very specific models of how they're going to evolve, and you can play reality forward. You sort of learn a simulation, and you can predict your own future. And then a very cool thing shows up. Take a city and all of its lights. Instead of trying to gather all the data from the city and going to solve one big model, which is the big-data-and-cloud approach to doing this, essentially each one of these digital twins is solving its own problem of: how do I predict my own future? So instead of solving one big model, you'll have 200 different intersections all predicting their own futures, which is totally cool because it distributes well in this fabric of spare CPU cycles and can be computed very efficiently. And the consequence of that is that, again, you get these very rich patterns, and these things can then learn more, individually and as groups. There's an even cooler thing.
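The guess-observe-correct loop described above is classic online learning. Here is a minimal sketch of one such loop (an LMS-style learner chosen for illustration, not Swim's actual algorithm): predict the next reading from the current one, measure the error against what actually happened, and nudge the model slightly.

```python
def run(readings, lr=0.1, epochs=50):
    """Predict-observe-correct loop: learn `next ≈ w * current` online."""
    w = 0.0  # the learned model, refined by many small corrections
    for _ in range(epochs):
        for current, actual_next in zip(readings, readings[1:]):
            guess = w * current          # guess what the twin will see next
            error = actual_next - guess  # then see what actually happened
            w += lr * error * current    # small correction, repeated endlessly
    return w

# A signal that exactly doubles each step: the loop should learn w ≈ 2.
signal = [1.0, 2.0, 4.0, 8.0]
print(round(run(signal), 4))  # → 2.0
```

Each twin runs its own copy of a loop like this on its own data, which is why the work distributes so naturally across spare CPU cycles: there is no shared global model to synchronize.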
Imagine I sat you down by an intersection and I said, write me a program for how this thing is going to behave. First of all, you wouldn't know how to do it. Second, there aren't enough humans on planet Earth to do this. What we're saying is that we can construct this program from the data, from this thing as it develops through time. We'll construct the program, and it will merely be a learned model. And then you could ask it how it's going to behave in the future. You could say, well, what if I do this? What if a pedestrian pushes this button? What will the response be? So effectively you're learning a program, you're learning the digital twin, just from the data. All right, so how does Swim do this? Now we know what it is. And we know that it's using, you know, it's stealing cycles from CPUs that are mainly set up to sense things, package data up, and send it off somewhere else. But how does it actually work? What does the designer, the developer, the operator do with Swim that they couldn't do before? So Swim is a tiny, vertically integrated software stack that has all the capabilities you'd find in an open-source cloud platform. You have persistence, you have message dispatch, you have peer-to-peer routing, you have analytics, and a number of other capabilities. But Swim hides that; it takes care of it. Rather than thinking about where you place compute, in Swim you think: okay, what is my model, what is my digital twin, and what am I related to? And Swim dynamically maps these logical models to physical hardware at runtime, and dynamically moves these sort of encapsulated agents around as needed based on the loads and the demand in the network. And in the same way that- In the events. Yes, in the events. And in the same way that, if you're using Microsoft Word, what CPU core is that running on? Who knows and who cares? It's a solved problem.
We look at the edge from the ground up as just one big, massively multi-core computer, and there are similar principles to apply in terms of how you maintain consistency and how you efficiently route data, which you can abstract over and sort of eliminate as a problem that you have to be concerned about as a developer or a user who just wants to ingest some data and get insights. So I'm going to make sure I got that. If I look at the edge, which might have 200, might have 10,000 sensors associated with it, we can imagine, for example, a level of complexity like what happens on a drilling platform in an oil field. Probably there are 10,000 sensors on that thing, all those different things. Each of those sensors is doing something, and they're dispatching information. But what you're basically saying is: we can now look at those sensors. They can do their own thing, but we can also look at them as a cluster of processing capability. We'll put a little bit of software on there that will provide a degree of coordinated control so that models can be built up out of that. So first off, Swim itself builds a distributed fabric on whatever compute is available, and you can smear Swim between an embedded environment and a VM in the cloud. We just don't care. The point is, anything you point it at becomes part of this cluster. But the second level of this is when you start to discover the entities in the real world, and you're going to discover the entities from their data. So I'll get all this kind of gray stuff. I don't really know what it means, but I'm going to find these entities and what they're related to, and then for each entity instantiate one of these digital twins as, essentially, what you can think of as a microservice. It's a stateful microservice, which is then just going to consume its own real-world data, do its thing, and then present what it knows via an API or graphical UI components. So I'm an operator. I install.
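The "discover entities from their data" idea can be sketched as lazy materialization: the first time the data stream mentions an entity, a stateful agent (the digital twin) is created for it; after that, its data keeps flowing to the same agent. All names here are hypothetical, and this omits the distribution and routing that Swim is described as handling.

```python
class Agent:
    """Stateful microservice stand-in: one per discovered entity."""
    def __init__(self, uri):
        self.uri = uri
        self.history = []          # stateful: remembers its own data

    def on_event(self, value):
        self.history.append(value)

class Fabric:
    """Routes raw events to agents, creating agents on first sight."""
    def __init__(self):
        self.agents = {}

    def route(self, uri, value):
        # Lazily materialize a digital twin the first time an entity appears.
        agent = self.agents.setdefault(uri, Agent(uri))
        agent.on_event(value)

fabric = Fabric()
events = [("/light/1", "green"), ("/light/2", "red"), ("/light/1", "amber")]
for uri, value in events:
    fabric.route(uri, value)
print(sorted(fabric.agents), fabric.agents["/light/1"].history)
# → ['/light/1', '/light/2'] ['green', 'amber']
```

Nothing declared the two traffic lights up front; they exist as agents only because their data showed up, which mirrors the discovery step described above.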
What do I do to install? You start a process on whatever devices you have available. Okay. Swim is completely self-contained and has no external dependencies, so we can run as the init daemon on a Linux box, or even without an operating system. All right. So I basically target Swim at the device and it installs. Once it's installed, how am I then accessing it through software development? Oh, so cool. So ultimately, in this edge world, you've asked the key question, which is: how the hell do I get hold of this stuff, and how does it run? And I don't think the world knows the answer to all those questions. So for example, in the traffic use case, the answer is this: we publish an API. It happens to be in Azure, but who cares? People like Uber or UPS can show up and say, what's this traffic light going to do in the future? And then just hit that, right? So what they're doing is going for the insights of digital twins, in real time, as a service. That's kind of an interesting thing to do, right? But you might find this embedded in a widget, because it's small enough to be able to do that. You might find that a customer installs it on a couple of boxes and it just runs. We don't really care. It will be there, and it's trivial to run. So you're going to be moving it into people who are building these embedded systems. Sure, but the key point here is this: I know you, particularly on theCUBE, hear all these wonderful stories about DevOps and Kubernetes and all this guff up in the cloud. Fine, that's where you want those people to be. Oh god, guff. But at the edge, no Kubernetes, okay? There aren't enough humans to run this stuff, so it's got to be completely automatic. It's got to just wake up, run, find all the compute, run ceaselessly, distribute load, be resilient, be secure. All these things just have to happen. So Swim becomes a service that is shipped with an embedded system?
Possibly, or there is a potential outcome where it's delivered as software which runs on a box close to some widget, or rolled out as a software update with some existing manufacturers. In the particular case of traffic, we should be on 60,000 intersections by the end of this year. The traffic infrastructure vendor, the vendor that delivers the traffic management system, just rolls out an upgrade, and suddenly a whole bunch of new intersections appear in a cloud API, and Uber or Lyft or whoever is just hitting that thing and finding out what they are. Great, and so as developers, am I going into a Swim environment and doing anything? This is just the way that the data's being captured. So we take data. The patterns are being identified. Right, take data, turn it into digital twins with intelligent things to say, and expose that as APIs or as UI components. So that now the developers can go off and use whatever tools they want and just invoke the service through the API. Bingo, that's right. Developers, if they're doing something, just hit digital twins. All right, so we talked a little bit about the traffic example I mentioned, being in an oil field. What are some of the other big impacts? As this thing gets rolling, what kind of problems is it going to allow us to solve? Not just one; there's definitely going to be a network effect here, right? Sure, so the interesting thing about the edge world is that it's massively diverse. Even, you know, one cookie factory is different from another cookie factory, in that they might have the same equipment, but they're in different places on planet Earth and they have different operators and everything else. So the data will be different and everything else. And so the challenge in general with the edge environment has been that it's been very professional-services-centric: people bringing in blobs of open source and squads of people and trying to solve a local problem, and it's very expensive.
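From the consumer side, "just hit digital twins" looks like an ordinary API call. Here is a hypothetical sketch of what an Uber-style consumer of a traffic prediction API might do; the URL shape and response fields are invented for illustration, and the transport is stubbed so the example runs without a live service.

```python
import json

def next_phase(intersection_id, fetch):
    """Ask a (hypothetical) prediction API what an intersection will do next."""
    url = f"https://api.example.com/intersections/{intersection_id}/prediction"
    body = fetch(url)                 # in real use, an HTTP GET over the wire
    prediction = json.loads(body)
    return prediction["phase"], prediction["seconds_until_change"]

def fake_fetch(url):
    """Stubbed transport standing in for the live service."""
    return json.dumps({"phase": "red", "seconds_until_change": 12})

print(next_phase("pa-101", fake_fetch))  # → ('red', 12)
```

The developer never touches Swim's internals or the intersection hardware; the digital twin's insight arrives as a plain API response, which is the "insights as a service" model described above.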
Swim has this opportunity to basically just show up, consume this great data, and tell you real stuff without enormous amounts of a priori semantic knowledge, right? And so we have this ability to conquer this diversity problem, which is characteristic of the edge, and also come up with highly realistic and highly accurate models for each particular thing. To be very clear: the widget in chocolate factory A is exactly the same as the widget in chocolate factory B, but the models will be 100% different, and totally useless at the other place, because if the pipes go bang at six a.m. here, it's in the model. And Swim has the opportunity to reach the 99.9% of data that currently is generated and immediately forgotten. That's right. Because it's too expensive to store, it's too expensive to transport, and it's too expensive to build applications to use. We should talk about cost, because that's a great one. So if you want to solve the problem of predicting what the lights in Palo Alto are going to do for the next five minutes, that's heading towards $10,000 a month in AWS, okay? Swim will solve that problem for a tiny fraction of that, less than a hundredth, just on stranded CPU cycles lying around at the edge. And you save, say, bandwidth and a whole bunch of other things. Yeah, and that's a very important point, because the edge has been around for a while. Operational technology people have been doing this for a while, but not in a way that's naturally and easily programmable. You're bringing a technology that makes it easy to self-discover, simply by utilizing whatever cycles and whatever devices are there, and creating persistence, making it really simple for that to be accessed through an API. And ultimately, it creates a lot of options for what you can do with your devices in the future. So it makes existing assets more valuable, because you have options in what you're going to do with them.
If you look at the traffic example, the AWS scenario is $50 per month per intersection. No one's going to do that, okay? But if it's like a buck, I'm in. Okay, all right, and you can do things, because then it's worthwhile for Uber to hit that API. All right, so we've got to wrap this up. One way of thinking about it, and there are so many metaphors one could invoke, is that this is kind of like the software teeth that are going to eat the real world at the edge. So if I can leave you with one thought: Swim sort of loosely stems from "software in motion." And the idea is that you need to move the software to where the data is. You can't move the data to where the software is. The data is huge, it's immobile, and the quantities of data are staggering. You essentially have a world of spam bots out there, and it's intractable. But if you move the software to where the data is, then the world's your oyster. Yeah, one thing to note is that software is still data. It just happens to be extremely well-organized data. And so the choice is: do you move all the not-particularly-well-organized data somewhere it can be operated on, or do you move the really well-organized, compact data that is the software? That's right. And information theory says: move the most structured thing you possibly can, and that's the application, the software itself. Exactly. All right, Chris Sachs, founder and lead architect of Swim. Simon Crosby, CTO of Swim. Thank you very much for being on theCUBE. Great conversation. Thanks so much. Good luck. And once again, I'm Peter Burris, and thank you for participating in another CUBE Conversation with Swim. Talk to you again soon.