Hello, Austin. How is everyone? Great to see you all. My name is Michael Tannenbaum. I'm a co-founder at Mycelial, and while it would be my dream to be with you all in person, I'm speaking to you today remotely about AI/ML at the extreme edge with WebAssembly: a path forward. Today, we're going to talk about what WebAssembly is, why it's a good fit for AI/ML at the edge, what the struggles and challenges of delivering AI/ML at the extreme edge are that WebAssembly can really help us catapult forward, and what some of the peripheral challenges are that turn out to be not so peripheral when we're trying to deliver AI/ML solutions at the extreme edge.

Just by way of a little background: as I mentioned, I am one of the co-founders at Mycelial. Before Mycelial, I was a principal solutions engineer at Arrikto working on Kubeflow, an end-to-end AI/ML solution that runs on Kubernetes; it originally came out of Google and is now being adopted by a wider audience. I was privileged to work on a SIG for the on-premises community of the Kubeflow project. Before that, I was a principal solutions engineer and the AI/ML practice lead at Mesosphere, now named D2iQ.

Very quickly, to give you a little context about what Mycelial does, and to help you understand why we come into contact so often with projects relating to AI/ML at the extreme edge: we are the edge-native platform for distributed, local-first applications. What that really means is that we combine conflict-free replicated data types, which we'll talk about later because they have a lot of relevance for the extreme edge in many cases beyond AI/ML, with WebAssembly, to make it easy to create distributed applications in these highly constrained environments. Our goal is to make it just as easy as building a local application that you might run on your laptop or on a server in the cloud. Mycelial as a company was founded last year, and we raised about $3.8 million from Crane Venture Partners. Anil Akhani is in the audience today from Crane, and if you want a fantastic perspective on the market, particularly in these areas, I would encourage you to talk with him. We work with large enterprises that are looking to deliver scaled products in the field, so think US Department of Defense, or large retailers with a big physical footprint, and you'll see why we focus on these folks once you get a better sense of where we're coming from.

I mentioned that we're the edge-native platform for distributed, local-first applications, but that's just a bunch of buzzwords. What does it actually mean? What the heck is even Edge Native? Well, Cloud Native is a term most of us are familiar with, and Cloud Native is really defined by the benefits it provides rather than by any particular or specific architecture. With cloud-native tools, we were able to meet the challenges of web scale, starting at huge companies like Twitter, Facebook, Google, Airbnb, and so on. All cloud-native tools are designed around the goal of providing scale, agility, and velocity, hopefully without a lot of headache. The limitation, however, is that cloud-native tools focus, of course, on the cloud: racks of servers in a data center, either your own or borrowed from somebody else, like one of the big three cloud providers. And the applications that run there are cloud applications, so we can think of them as data center applications.
When we at Mycelial, and hopefully you by the end of this presentation, think about the edge, we think about anything that runs outside the data center. So for us, the goal of Edge Native is to provide that same scale, agility, and velocity, but for applications that run outside the data center: what we would call real-world applications. These can take a variety of forms. For us, something that runs in a browser tab is outside a data center and it's an application, so we would call it an edge application. Then there are the more traditional notions of the edge: the Internet of Things, mobile, smart meters, and other smart devices that exist in the real world. Anything outside the data center, for us, is the edge.

Edge Native is not a term that we invented, but it's one we are promoting and encouraging people to think about, given the change in circumstances and the different constraints that exist outside the data center versus inside it. After all, most people don't have racks of servers in their home; maybe in this audience I shouldn't say that. But most people don't have access to those tools on their phone, or on a smart thermostat, or on a robotic vacuum cleaner. The constraints are different, the challenges are different, and therefore we need to think differently about our tooling and about the challenges we're trying to overcome. So from our perspective, there is indeed a need to consider separate architectures and separate sets of tooling for the edge, because those constraints are different. But at the end of the day, Cloud Native and Edge Native are both defined by the goals they seek to deliver to the end user: scalability and velocity, and we could probably add a few other descriptors. Generally, the focus of today's conversation is everything that runs outside of the data center. And after all, the extreme edge is just an even more challenging version of the edge, as you were probably expecting.

So the question a lot of people ask when we first start considering AI/ML and the edge is, to paraphrase Jerry Seinfeld: what is the deal with AI/ML at the edge? Okay, that'll be the last time I attempt to impersonate Jerry Seinfeld, you're welcome. The way I like to put it is this: if you're interacting with a computer, whether you're a human being or even a machine, and you're not pointing and clicking, then in 2022 you're using AI/ML. What do I mean by that? If you're in your home and you say "hey Alexa," "hey Google," or "hey Siri," you're using AI/ML, and your interaction modality is speech. On our mobile phones or computers, we point and click; that's a fairly standard modality. But if we want to talk to our computer, if we want our computer (and I'm using "computer" very generally here) to recognize our face in order to authenticate us, if we want our car to view the world like a human driver and react, hopefully, even better than a human driver would, that's using AI/ML. In 2022, and I can't tell you what will happen in the future, AI/ML is how we use more human modalities like speech and vision to interact with a computer and accomplish whatever our goal happens to be.
So, because the interface between humans and computers outside of pointing and clicking requires AI/ML, and because humans live in the real world, we need to deliver AI/ML at the edge. Now you may be saying: well, Michael, why can't I just take all that data, centralize it to some server in a data center somewhere, and beam back the results? The reason is that for many safety-critical situations, really for any situation on the edge, you have to consider not the chance of disconnection but the reality of it. That's a very real constraint, and you'll see it throughout today's discussion.

If we take one step below the surface and look at the common pattern in the AI/ML-at-the-edge world, there's a wonderful talk, given back in 2017 I believe, by Peter Levine, a retired partner formerly of Andreessen Horowitz. It's publicly accessible if you search for the source below; it's called The End of Cloud Computing. In it, he imagines a world that is fast upon us, in which the quantity of data generated at the edge, and the need to make real-time decisions at the edge (again, anywhere outside the data center), force us to act without reliance on connectivity, and without suffering the challenges not just of latency but of inconsistent latency. Elon Musk talks about this quite a bit with Tesla; it's called jitter, and it means we can't even plan around the latency, because the latency itself is inconsistent.

Ultimately, all of these projects boil down to more or less four steps, and Peter Levine created this framework, which I find really compelling. The first thing we want to do in an AI/ML application at the edge is sense something: we want to determine what's going on in our world and whether it's something we need to evaluate and make a decision about, which brings us to the second step. We need to take that sensory data and plug it into an AI/ML model; we need to create an inference about what that data actually means. To give an example: if I'm a self-driving car and I have a lot of image data coming in, I need to be able to decide whether it's safe to continue driving, for example, or whether I need to come to a stop. After I've made that inference, I then need to act on it. So we have to flow data from the sensors, quite literally, into an inference engine (that's your AI/ML model), and then that model's inference needs to flow into a separate set of applications, connected perhaps to something physical, perhaps to an alerting mechanism, to take an action. And then lastly, and somewhat asynchronously from the first three steps, we want to learn: to use the data we're collecting at the edge in an intelligent way to make our models stronger, or at a minimum to update them as conditions shift over time. For example, a movie on Netflix may become very popular, and then everyone's seen it, so why keep recommending it? We need to learn from what users and the world are teaching us out in the real world, take that data, and make our models better, or at a minimum keep them up to date.
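To make that sense-infer-act-learn pattern a bit more concrete, here is a minimal sketch of the loop in Rust. To be clear, this is not code from the talk or from any library: every name in it is a hypothetical stand-in, and a real deployment would invoke an actual model (for example, one compiled to WebAssembly) rather than compare against a threshold.

```rust
// A toy sketch of the sense -> infer -> act -> learn loop. All names here
// are hypothetical stand-ins, not a real API.
struct Reading {
    vibration: f32, // pretend this came from a physical sensor
}

#[derive(Debug)]
enum Decision {
    Normal,
    Anomalous,
}

// 2. infer: decide what the sensory data means. A crude threshold stands
// in for a learned anomaly-detection model.
fn infer(reading: &Reading, threshold: f32) -> Decision {
    if reading.vibration > threshold {
        Decision::Anomalous
    } else {
        Decision::Normal
    }
}

fn main() {
    let mut threshold = 0.8; // stand-in for learned model state
    for raw in [0.2_f32, 0.5, 0.95] {
        let reading = Reading { vibration: raw };  // 1. sense
        let decision = infer(&reading, threshold); // 2. infer
        println!("acting on {decision:?}");        // 3. act: alert, stop the machine, ...
        // 4. learn (asynchronously, in a real system): fold the new
        // observation back into the model state, locally.
        threshold = 0.9 * threshold + 0.1 * reading.vibration;
    }
}
```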
That learn function actually presents a very interesting opportunity, especially at the edge and especially with WebAssembly, as it relates to privacy. Today, substantially all AI/ML training jobs run in a centralized location: we take all of your data, hoover it up to the cloud, plug it into some ETL process, prepare and pre-process it, run a training job on it, and then push the resulting models back down to the edge. But wouldn't it be wonderful for your privacy and mine if, instead, we could run a training job locally on a local set of data, and then send the learnings (not the actual raw data that contains your private information, but the learnings) back to a centralized source, where they could be amalgamated with learnings from other users, cars, trucks, planes, boats, trains, and so on out in the field, without needing to centralize all of that data? That's really the crux of Peter Levine's argument here. He's saying that the quantity of data generated at the edge far outpaces our ability, in pure bandwidth terms, to centralize it. And indeed, each individual high-fidelity data point generated at the edge doesn't need to be centralized for us to learn from it. So the opportunity to run sophisticated code at the edge, on smaller-footprint devices, and to extract the learnings from it is a huge advantage that WebAssembly does indeed open up for us, as we'll see.
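Here's a toy illustration of that "ship the learnings, not the data" idea, loosely in the spirit of federated averaging; it is not how any particular system in this talk works. Everything in it is made up for illustration: the "model" is just a weight vector, and the "learning" each device ships is a small weight delta.

```rust
// Train locally on data that never leaves the device. The "training" here
// is a fake one-step nudge of every weight toward the local data mean.
fn local_update(global: &[f32], private_data: &[f32]) -> Vec<f32> {
    let mean = private_data.iter().sum::<f32>() / private_data.len() as f32;
    global.iter().map(|w| 0.1 * (mean - w)).collect() // only this delta is shipped
}

// The central side averages the deltas; it never sees the raw data.
fn aggregate(global: &mut [f32], deltas: &[Vec<f32>]) {
    for delta in deltas {
        for (w, d) in global.iter_mut().zip(delta) {
            *w += d / deltas.len() as f32;
        }
    }
}

fn main() {
    let mut global = vec![0.5_f32; 4];
    // Two edge devices, each with a private local dataset.
    let d1 = local_update(&global, &[0.9, 1.1, 1.0]);
    let d2 = local_update(&global, &[0.1, 0.3, 0.2]);
    aggregate(&mut global, &[d1, d2]);
    println!("updated global model: {global:?}");
}
```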
Now, just as a moment to establish my credibility in this arena: I have attempted to put cloud-native tooling on a wide variety of edge locations, and I'll be honest with you all, those attempts all failed. I've tried to put Kubernetes on a submarine. I've tried to put Spark on a robotic arm. I've attempted to build centralized repositories for electric car charging stations, doing training at the edge as well as inference at the edge. A whole variety of locations where I've tried to put Kafka, Spark, Cassandra, Flink, all of these cloud-native tools that work great in the cloud. They're just really not designed for smaller devices (think Raspberry Pi), and in particular, they don't shine at the edge.

The reason is that the edge is not a data center. A data center, if you think about it, is a fairly contrived location. If you're in a data center, you know you're in a data center: there are racks and racks of servers, they all have tremendous connectivity, and nowadays they have huge amounts of CPU and memory. Cloud-native tooling, to its credit, takes advantage of that environment: it relies on there being plenty of memory, plenty of CPU, and near-constant connectivity. As a quick aside, we always like to say that when you're talking about the edge, disconnection is not an error state. If one of your Kubernetes nodes is disconnected, someone's going to get paged and will very likely need to come in and fix the situation. Whereas for some of the work that we and others do, for example an autonomous naval vessel that spends six months of the year under the water, disconnection is part of the game. That's not an error state; that's just regular operations. These tools are also quite ops-intensive: if we're going to run our own Cassandra, someone has to feed and care for that database, and the same goes for Kafka, Spark, Flink, Kubernetes, any of these cloud-native tools. And because these tools were designed with access to a ton of bandwidth, they rely on that bandwidth. If anyone's looked at the logs from a Kubernetes cluster, there's a mind-boggling amount of messaging between the applications and nodes in those clusters. That's very bandwidth-heavy, which is perfectly fine in the data center. But at the edge, where moving a kilobyte of data can cost a million times more than it does within a single availability zone in the public cloud, we have to think much harder about what we actually want to put over the wire.

So can WebAssembly help us? Absolutely it can. I list at the bottom here a couple of resources that I've found particularly useful in my journey with WebAssembly: the Bytecode Alliance has some wonderful resources; Lin Clark, if you look her up on YouTube, has unbelievably helpful cartoons and diagrams explaining WebAssembly; and of course there's webassembly.org.

Let's take a quick review for the folks who may not be as familiar with WebAssembly. What is WebAssembly? WebAssembly is an intermediate representation, and essentially a compile target: other languages compile into WebAssembly, and WebAssembly is then run in different environments. One of the interesting things about WebAssembly is that you're already using it. Every browser in the world supports WebAssembly, as far as I know; I can't think of an exception. And it's been out in public for a very long time in the world of the browser virtual machine. What's changed over the past few years is that folks are starting to realize: hey, we have this wonderful technology that's incredibly small, provides near-native performance, can be compiled to from many different languages, has some wonderful secure-by-default functionality, and provides a standard system interface. So now folks are asking how we can use it not only in a browser environment, or a headless browser environment like a V8 engine, but also directly on a machine through runtimes like Wasmtime, Wasm3, and others.

We don't have time in this discussion to go through them exhaustively, but here are the key properties of WebAssembly. It's super tiny: we can take complex apps in high-level languages and get them running in very small footprints. It offers near-native performance: we can get within about 90% of native, and if you're talking about a small device like a Raspberry Pi, being able to extract all of that performance is critically important. As I mentioned, it's polyglot: we can compile from some 35 different languages into WebAssembly, though I should say that some languages have more functionality at this time than others. (This is a wonderful place to invest if you're looking to help grow the community: expanding and deepening the resources for compiling other languages to WebAssembly.) It's secure by default: every file and system permission needs to be granted explicitly, whether that's reading a file, writing a file, opening a network socket, and so on. And in a wonderful, I don't want to say accident of history, perhaps we're in a new era, the WebAssembly System Interface, the spec for running WebAssembly outside a browser, directly on a machine, has great community alignment behind it. There's still a huge amount of work to be done.
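To make the secure-by-default point concrete, here's a minimal sketch assuming a Rust program compiled to the wasm32-wasi target and run under Wasmtime. The file path is made up for illustration; the point is that the module cannot touch it unless the host explicitly grants that capability.

```rust
// Sketch of WASI's capability-based, secure-by-default model. The path
// "sensor/latest.csv" is a hypothetical example.
use std::fs;

fn main() {
    // A WASI module has *no* filesystem access by default. This read only
    // succeeds if the host preopened the directory for us.
    match fs::read_to_string("sensor/latest.csv") {
        Ok(data) => println!("read {} bytes of sensor data", data.len()),
        Err(e) => eprintln!("no capability granted for that file: {e}"),
    }
}

// Build, then run it under Wasmtime. Without --dir the read fails; the
// --dir flag is the explicit grant:
//   cargo build --target wasm32-wasi
//   wasmtime run --dir=sensor target/wasm32-wasi/debug/app.wasm
```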
Again, this is a wonderful area for folks to invest in if you're looking to contribute to a particular project or area within the WebAssembly universe: expanding not only what is defined in the WASI spec, but also the implementations of it, is a great place to look.

So let me make the challenges of AI/ML at the extreme edge a little more concrete with a case study. This is a project that I, and a few of my colleagues in the audience as well, had the privilege to work on with a global manufacturing conglomerate. What we were trying to do was on-machine anomaly detection. This is very popular these days, and the reason is that if we can get a little bit of a heads-up before one of these large manufacturing machines fails, then we avoid unplanned maintenance. When you're talking about a very large organization, it's not immediately obvious, but we can be talking about $100,000 of losses for every hour that machine is down without it being planned for. The second and even more important reason, of course, is that a machine failing in an unexpected way can lead to human injury, and that's certainly not something we want. And lastly, remember the key to the edge: cloud connectivity in these cases is unreliable. Some of these manufacturing facilities are in places with absolutely fantastic network connectivity; others are in very remote locations with intermittent connectivity, or very expensive connectivity over satellite or something like that. So all of the decisions, and all of the sovereignty over the data, need to remain local. There cannot be a cloud dependency.

The good news is that we were using a TensorFlow model, and there are a few different mechanisms for compiling TensorFlow to WebAssembly. If you're interested, take a look at an organization called Hammer of the Gods (hotg.ai is their website). They've got some really interesting work around WebAssembly containers: containerizing ETL and data pipelines within WebAssembly, as well as serving and monitoring models at the edge. Great folks, open source product, highly, highly recommended; but there are quite a few different mechanisms, and you can certainly play around with them.

Okay, so we've got our model compiled to WebAssembly. But as we saw in the sense-infer-act-learn pattern, there's actually quite a bit of data flow that needs to happen to make that work. And when we're talking about a connected factory, it's not enough just to take the sensor data from one machine, plug it into a model running on that machine, and take action based on that. In fact, we may have dozens, hundreds, even thousands of connected machines of different types that need to coordinate together, and again, we can't rely on a centralized cloud service to orchestrate all of those different applications. So we characterize the edge-native challenges not only in terms of delivering the actual application, the model and the inference engine, but in terms of three additional big challenges, and they're all data challenges.
The first is that we need to treat our local data as the source of truth, because we don't know whether we will have access to peers or other services in the environment. As a manufacturing machine, or as a robot, I need to be able to make sense of my world independently. But then I also need to interact with others. So the second challenge is that as I come into and out of connectivity, I need to be able to resolve, to zip up, if you will, or merge, my data with the understanding of the world that others have. I may be out of date; they may be out of date. So I need to be able to merge and synchronize data without conflict: I need a mathematically proven guarantee that state will converge to a single state among all of the peer applications that share it, in a way that's conflict-free. And lastly, as discussed before, we need to think about bandwidth differently here. We need to be smart about what and how we put over the wire, because it can be so expensive, whether we're talking about LTE, 5G, and so on.

It turns out there is a wonderful technology called conflict-free replicated data types (CRDTs) that solves exactly this problem. As the name suggests, a conflict-free replicated data type is a data type: something that looks and behaves like a primitive, like an integer or a string, but that under the covers has a synchronization component. We push all of the logic for synchronizing and merging data in a conflict-free way into the data type itself. CRDTs are just one of the most fascinating new technologies. You can think of them almost like Google Docs: when you use a Google Doc or another collaborative document editor, all of the conflicts (I change something, you change something) converge onto a single state, and everyone sees the same state at the end of the day. CRDTs guarantee that all the peers sharing a state will in fact converge to the same state, regardless of the order in which they received any given message or update. In an offline-first, local-first, disconnection-heavy environment, that is absolutely critical, because it means that if I'm disconnected from all the other peers sharing state, isolated over here on my own, then when I reconnect, my state will still converge, so I don't need to worry about mutating my data locally in the meantime.

You can see how that looks with the example on the right. The first machine receives the messages in the order A, B, C, and the resulting document is ABC. The second machine receives the messages in the order C, B, A, and by virtue of the algorithm converges to ABC. The last one receives them in yet a third order, and again converges to ABC. So it doesn't matter for how long I'm disconnected: when I rejoin, I am guaranteed to converge to the same state as every other peer and be consistent with them. That is incredibly powerful, because as an engineer I no longer have to write all that logic. It's just in the data type itself.
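As a toy illustration of that order-independence, here's a minimal operation-based sequence CRDT sketched in Rust. This is not Mycelial's engine, and real CRDTs are considerably more sophisticated; the trick shown here is simply that every edit carries a globally unique, totally ordered key, so merging is a commutative, idempotent set union, and every delivery order yields the same document.

```rust
use std::collections::BTreeMap;

struct Replica {
    ops: BTreeMap<(u64, u8), char>, // (Lamport clock, site id) -> character
    clock: u64,
    site: u8,
}

impl Replica {
    fn new(site: u8) -> Self {
        Replica { ops: BTreeMap::new(), clock: 0, site }
    }

    // Local edit: stamp the character with a globally unique key.
    fn append(&mut self, c: char) -> ((u64, u8), char) {
        self.clock += 1;
        let key = (self.clock, self.site);
        self.ops.insert(key, c);
        (key, c)
    }

    // Remote edit: merging is just set insertion, so it is idempotent and
    // commutative -- the properties that make delivery order irrelevant.
    fn apply(&mut self, (key, c): ((u64, u8), char)) {
        self.clock = self.clock.max(key.0); // keep the Lamport clock moving
        self.ops.insert(key, c);
    }

    // The document is simply the operations sorted by key, which the
    // BTreeMap maintains for us.
    fn read(&self) -> String {
        self.ops.values().collect()
    }
}

fn main() {
    let mut a = Replica::new(1);
    let mut b = Replica::new(2);

    let op1 = a.append('A');
    let op2 = a.append('B');
    let op3 = b.append('C');

    // Deliver the operations to each replica in *different* orders...
    b.apply(op2);
    b.apply(op1);
    a.apply(op3);

    // ...and both still converge to the same document.
    assert_eq!(a.read(), b.read());
    println!("{} == {}", a.read(), b.read());
}
```

If you run it, both replicas print the same string even though each received the operations in a different order. Note that, as with real CRDTs, the converged order is deterministic but not necessarily the one you intended, a point we'll come back to in a moment.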
To make this a little more concrete, I wanted to give an example: an actual unit test. I've left the Mycelial logo on here specifically so you know this is from our open source WebAssembly-based CRDT engine. I don't want to say that every CRDT works like this, but this is how our open source CRDT engine, the Mycelial library, works, and it gives you the general idea. So let's take a quick look at how CRDTs are able to do that. At the top, you can see we create two new list CRDTs. On the first, we append the word "hello"; on the second, we append "world". Then we generate a diff by passing the vector clock of each one to the other; that's essentially saying: these are all the unique updates I have, which are the unique updates that you have? In this case, one has "hello" and the other has "world". Then we apply the diff, and by virtue of the magic of vector clocks, which we don't have time to get into today, they converge. You can see here that both come down to "hello world", strictly ordered in that way. So even though the first one had "hello" and the other had "world", they always converge in the same way, to the same state. Now you may be asking: how do I know which state it's going to converge to? Great question. There's no promise that it's the state you intended or wanted, but once it has converged, you can take additional steps: manipulate the data, set up a trigger that deals with it, and so on.

So when it comes down to what we actually need to deliver AI/ML at the extreme edge, WebAssembly is a wonderful tool and a wonderful framework for providing it. We can take our AI/ML models and compile them to WebAssembly, and that's great, but we can also use WebAssembly for much of the data flow. And conflict-free replicated data types are, in my opinion, the future for that data flow, because they just work in a local-first, conflict-free, bandwidth-optimized way. When these come together, what we get is a very interesting paradigm for Edge Native that I find particularly compelling: safer, cheaper, faster applications and systems that are scaled to human beings. After all, that's the goal here: to take a digital process that is, not unnatural exactly, but synthetic, and, using AI/ML, really deep learning, let it interface with the organic world, with people, allowing us to interact with computers much the same way we interact with humans, and much the same way we expect of humans. For example, if a person looks at you, you know they're looking at you. If a computer is to look at you, well, it needs computer vision, AI/ML, to do that.

Thank you all so much for coming to my presentation. As I mentioned, I have colleagues in the audience who would be happy to chat with you as well. Please do check out Mycelial online. We're an open source company, and we would welcome your collaboration and participation. We're super active on our Discord, so please do join up or follow us on social. Thanks, everyone.