Welcome back, everyone, to theCUBE's coverage here, live on the floor at AWS re:MARS 2022. I'm John Furrier, host of theCUBE. Great event: machine learning, automation, robotics, space, that's the MARS in re:MARS. It's part of the "re" series of events. re:Invent's the big event at the end of the year, then re:Inforce, re:MARS. It's really the intersection of the future of space and industrial automation, which leans very heavily on DevOps and machine learning. Of course, machine learning, which is AI. We have Luis Ceze here, who's the CEO and co-founder of OctoML. Welcome to theCUBE. Thank you very much for having me on the show, John. So, we've been following you guys. You're a growing startup funded by Madrona Venture Group, one of your backers. You guys are here at the show, okay? This is a small show, I would say, relative to what it's going to be, but a lot of robotics, a lot of space, a lot of industrial kind of edge, and machine learning is the centerpiece of this trend. You guys are in the middle of it. Tell us your story. Absolutely, yeah. So, our mission is to make machine learning sustainable and accessible to everyone. I say sustainable because it means we're going to make it faster and more efficient and use less human effort, and accessible to everyone: accessible to as many developers as possible, and accessible on any device. So, we started from an open source project that began at the University of Washington, where I'm a professor. Several of the co-founders were PhD students there. We started with this open source project called Apache TVM, which actually has contributions and collaborations from Amazon and a bunch of other big tech companies. It allows you to take a machine learning model and run it on any hardware: CPUs, various kinds of GPUs, accelerators, and so on. It was the kernel of our company, and the project's been around for about six years or so. The company's about three years old, and we grew from Apache TVM into a whole platform that essentially supports any model on any hardware, cloud and edge. So, that was the thesis when you first started, that you want to be agnostic on platform? Agnostic on hardware, that's right. Hardware, hardware. What was it like back then? I mean, what kind of hardware were you talking about back then? Because a lot's changed, certainly on the silicon side. Absolutely, yeah. So, take us through the journey. Absolutely, yeah. Because I can see the progression; I'm connecting the dots here. So, once upon a time, yeah, I know, I walked to school in the snow in my bare feet. You have to be careful, because if you wake up the professor in me, you're going to be here for two hours, you know? So, no, the abridged version here is that machine learning has clearly shown it can solve real, interesting, high-value problems. And where machine learning runs, in the end, it becomes code that runs on different hardware, right? So when we started Apache TVM, which stands for Tensor Virtual Machine, at that time people were just beginning to use GPUs for real for machine learning. We already saw that, with a bunch of machine learning models popping up and CPUs and GPUs starting to be used for machine learning, it was clear there would be an opportunity to run everywhere, right? And GPUs were coming fast. GPUs were coming, and there's a huge diversity of CPUs, GPUs, and accelerators now, and the ecosystem and the system software that maps the models to hardware is still very fragmented today.
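To make "any model on any hardware" concrete, here is a minimal sketch of what compiling one model for two different targets might look like with Apache TVM's Python API. The model file, the input name, and the shape are placeholders for your own model, and exact calls can vary between TVM releases.

```python
# Minimal sketch: compile one ONNX model for two hardware targets with Apache TVM.
# "resnet50.onnx", the input name "data", and the shape are placeholders.
import onnx
import tvm
from tvm import relay

onnx_model = onnx.load("resnet50.onnx")
mod, params = relay.frontend.from_onnx(onnx_model, shape={"data": (1, 3, 224, 224)})

# Same model, different hardware: a server-class CPU and an NVIDIA GPU.
for name, target in [("cpu", "llvm -mcpu=skylake-avx512"), ("gpu", "cuda")]:
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
    lib.export_library(f"model_{name}.tar")  # deployable artifact for that target
```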
Right, so hardware vendors have their own specific stacks. NVIDIA has its own software stack, and so do Intel and AMD, and honestly, I hope I'm not being too controversial here to say that it kind of looks like the mainframe era. We had tight coupling between hardware and software. You know, if you bought IBM hardware, you had to buy the IBM OS, the IBM database, and IBM applications; it was all tightly coupled, and if you wanted to use IBM software, you had to buy IBM hardware, right? So that's kind of what machine learning systems look like today. If you want to buy a certain big-name GPU, you've got to use their software. And even if you use their software, which is pretty good, you have to buy their GPUs, right? So we wanted to help peel away the model and the software stack from the hardware to give people choice, the ability to run the models where it best suits them, right? That includes picking the best instance in the cloud that's going to give you the right cost properties and performance properties, or, if you want to run on the edge, you might run it on an accelerator. What year was that, roughly, when you went through this? We started that project in 2016, well, 2015, 2016. Yeah, so that was pre-conventional wisdom. I think TensorFlow wasn't even around yet. No, it wasn't. It was, I think, like 2017 or so. Right. So that was the beginning of, okay, this is an opportunity. AWS, I don't think, had released some of the Nitro stuff that Hamilton was working on. So they were already kind of going that way. It's kind of like converging. Yeah. The space was happening, exploding. Right. And the way it was dealt with, and to a large extent still is to this day, is by backing machine learning models with a bunch of hardware-specific libraries. And we were some of the first ones to say, no, let's take a compilation approach: take a model and compile it to very efficient code for that specific hardware. And what underpins all of that is using machine learning for machine learning code optimization, right?
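That phrase, machine learning for machine learning code optimization, refers to learned auto-tuning: a cost model guides a search over candidate low-level schedules for the target chip, measuring the promising ones on real hardware. As a rough illustration, this is roughly how that step looks with TVM's auto_scheduler (Ansor); it assumes the mod, params, and target from the sketch above, the trial budget and log file name are placeholders, and the API can differ across versions.

```python
# Rough sketch of ML-guided auto-tuning with TVM's auto_scheduler (Ansor).
# Assumes `mod`, `params`, and `target` from the compilation sketch above.
import tvm
from tvm import auto_scheduler, relay

tasks, task_weights = auto_scheduler.extract_tasks(mod["main"], params, target)
tuner = auto_scheduler.TaskScheduler(tasks, task_weights)
tuner.tune(auto_scheduler.TuningOptions(
    num_measure_trials=200,  # placeholder budget of on-hardware measurements
    measure_callbacks=[auto_scheduler.RecordToFile("tuning.json")],
))

# Re-build the model using the best schedules found during the search.
with auto_scheduler.ApplyHistoryBest("tuning.json"):
    with tvm.transform.PassContext(
        opt_level=3, config={"relay.backend.use_auto_scheduler": True}
    ):
        lib = relay.build(mod, target=target, params=params)
```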
But that was way back when. We can talk about where we are today. No, let's fast forward. So again, that was the beginning of the open source project, right? But there was a fundamental belief, a worldview there. I mean, you had a worldview that was logical when you compare it to the mainframe, but not obvious to the mainstream machine learning community. So, okay, good call. Check. Now let's fast forward, okay? Let's go through the evolution over the years at speed. More chips are coming. You've got GPUs, and look at what's going on at AWS. Wow, now it's booming. Now I've got unlimited processors, I've got custom silicon, I've got GPUs everywhere. Right. Yeah, and what's interesting is that the ecosystem got even more complex, in fact, because now there's a cross product between machine learning models and frameworks like TensorFlow, PyTorch, Keras, and so on, and then hardware targets. So how do you navigate that? What we want here, the vision, is to say: people should focus on making the machine learning models do what they want them to do, solving a problem of high value to them, right? And the deployment should be completely automatic. Today it's very, very manual to a large extent. Once you're serious about deploying a machine learning model, you've got to go and understand where you're going to deploy it, how you're going to deploy it, and then pick the right libraries and compilers, and we automated the whole thing in our platform. This is why you see the tagline on the booth right there, bringing DevOps agility to machine learning, because our mission is to make that fully transparent. Well, first of all, I used that line here because I'm looking at it live on camera, people can't see it, and I've used it in a couple of my interviews, because the word agility is very interesting. It's kind of the test of any kind of approach these days. Agility could be, and I talked to the robotics guys, just having their product be more agile. I talked to Pepsi here just before you came on; they have this large-scale data environment because they built an architecture that fostered agility. So again, this is an architectural concept, a systems view, with agility being the output. And removing dependencies, which I think is what you guys are trying to do. That's only part of what we do, right? So agility means a bunch of things. First, today it takes a couple of months to get a model from when the model is ready into production. We turn that into hours. Agile literally, physically, in terms of wall-clock time, right? And then the other thing is giving you the flexibility to choose where your model should run, right? So with our platform, including the demo and the platform expansion we announced yesterday, we give you the ability to take your model, get it compiled, get it optimized for any instance in the cloud, and automatically move it around. Today, that's not the case. You have to pick one instance, and that's what you do, and then you might auto-scale with that one instance, right? So we give you the agility of actually running and scaling the model the way you want, in the way that gives you the right SLAs, right? Yeah, and I think Swami was mentioning that, not specifically that use case for you, but that use case generally: scale being about moving things around, making them faster, not having to do that integration work. Scale, and run the models where they need to run. Like Swami said, you want to have large-scale deployments in the cloud, and you want to have models on the edge for various reasons, because the speed of light is limited. We cannot make light faster, so look at the physics there; it cannot change. Right, and there are privacy reasons. You want to keep data local, not send it around, and run the model locally, right? So, anyway, it gives you flexibility. Well, that's a good point. Let me jump in real quick, because I want to ask a specific question; it makes me think of something. We were just having a data mesh conversation, and one of the comments that's come out of a few of these data-as-code conversations is that data is the product now. So if you can move data to the edge, okay, which everyone's talking about (why move data if you don't have to), but I can move a machine learning algorithm to the edge, because it's costly to move data. I can move compute, everyone knows that. But now I can move machine learning anywhere else and not worry about integrating on the fly. So the model is the code. It is the product. Yeah, and since you said the model is the code, okay, now we're talking even more here. So machine learning models today are not treated as code, by the way.
They don't have any of the typical properties that code has. Whenever you write a piece of code and run it, you don't even think about the CPU. You don't think about where it runs, what kind of CPU it runs on, what kind of machine it runs on. But with a machine learning model, you do. So what we're doing is creating this fully transparent, automated way of allowing you to treat your machine learning model as if it were a regular function that you call, and that function can run anywhere, right? And that's what brings the agility, right? That's better. Yeah, and you can use your existing teams. That's better because I could run it on Artemis, too, in space. You could, yeah, if it has the hardware. And that allows you to continue using your existing DevOps infrastructure and your existing people, right?
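To picture what treating a model like a regular function means, here is a small hypothetical sketch: the compiled artifact from the earlier example is loaded behind a plain predict() call, so the application never mentions which hardware it was built for. The file and input names are illustrative, not OctoML's actual API.

```python
# Hypothetical sketch: hide a compiled model behind an ordinary function call.
# The caller never sees which hardware the artifact was compiled for.
import numpy as np
import tvm
from tvm.contrib import graph_executor

lib = tvm.runtime.load_module("model_cpu.tar")  # artifact from the compile sketch
module = graph_executor.GraphModule(lib["default"](tvm.cpu(0)))

def predict(image: np.ndarray) -> np.ndarray:
    """Run inference; to the caller this is just a function call."""
    module.set_input("data", tvm.nd.array(image.astype("float32")))
    module.run()
    return module.get_output(0).numpy()

scores = predict(np.random.rand(1, 3, 224, 224))  # placeholder input
```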
So I have to ask you, since you're a professor and this is a master class on theCUBE (thank you for coming on, Professor): I'm a hardware guy. I'm building hardware for Boston Dynamics' Spot, the robot dog. There's diversity in hardware, and it tends to be purpose-driven. I've got a spaceship; I'm going to have hardware on there. It's generally viewed in this community, everyone I talk to here and in other communities, that open source is going to drive all software. That's a check. But the scale and the integration are super important. And they're also recognizing that hardware is really about the software. They even said it on stage here: hardware is not about the hardware, it's about the software. So if you believe that to be true, then your model checks all the boxes. Are people getting this? I think they're starting to. Here's why, right? A lot of companies that were hardware-first, that thought about software too late, aren't making it, right? I mean, there's a large number of hardware companies, AI chip companies, that aren't making it. Probably some of them won't make it, unfortunately, just because they started thinking about software too late, right? So, you know, I'm so glad to see, and I hope I'm not just tooting my own horn here, but Apache TVM, the infrastructure we built to map models to different hardware, is very flexible. So we see a lot of emerging chip companies, like SiMa.ai, which has been doing fantastic work, using Apache TVM to map algorithms to their hardware. And there's a bunch of others that are also using Apache TVM. That's because you have an open infrastructure that keeps up to date with all the machine learning frameworks and models and allows you to extend to the chips that you want, right? These companies paying attention to that early gives them a much higher fighting chance, I'd say. First of all, not only are you backable by the VCs, because you have pedigree, you're a professor, you're smart, and you get good recruiting for PhDs out of the University of Washington, which is not too shabby a computer science department, but they want to make money. The VCs want to make money, right? So you have to make money. So what's the pitch, what's the business model, what's your thinking there? Yeah, the value of using our solution is, first, shorter time to value for your model, from months to hours. Second, you shrink operational expenses, because you don't need those specialized, expensive teams (and talk about expensive: engineers who understand machine learning, hardware, and software engineering) to deploy models today. You don't need those teams if you use an automated solution, right? So you reduce that. And also, in the process of actually getting a model specialized to the hardware, making it hardware-aware, we're talking about a very significant performance improvement, which leads to lower costs of deploying it in the cloud. We're talking about a very significant reduction in cloud deployment costs. And also, enabling new applications on the edge that weren't possible before creates latent value opportunities, right? So that's the high-level value pitch. But how do we make money? Well, we charge for access to the platform, right? So... Is it consumption-based? Yeah, consumption and value-based, right? It depends on the scale of the deployment. If you're going to deploy a machine learning model at a larger scale, chances are it produces a lot of value, right? So then we'll capture some of that value in our pricing. You have a direct sales force, then, to work those deals. Exactly. Got it. How many customers do you have? Just curious. So the SaaS platform just launched now, right? So we're starting to onboard customers. We've been building this for a while. We have a bunch of partners that we can talk about openly, revenue-generating partners, let's put it this way. We work closely with Qualcomm to enable Snapdragon on TVM, and hence in our platform. We work closely with AMD as well, enabling AMD hardware on the platform. And we've been working closely with two hyperscaler cloud providers that I can't name, right? And they're both here, right? Who are they? They both start with the letter A. Oh, that's right. Don't give it away. Don't give it away. One is three, one is four, okay. And we have early customers who have been using the platform from the beginning, in the consumer electronics space in Japan, in self-driving car technology, as well as some AI-first companies whose core business comes from AI models. They're serious, serious customers. They've got deep tech chops. They're integrating. They see this as a strategic part of their architecture. That's what I call AI-native, exactly. But now we have several enterprise customers in the pipeline that we've been talking to. Of course, because we just launched the platform, we're starting to onboard them and exploring how we're going to serve these customers, but it's pretty clear that our technology can solve a lot of their pain points right now, and we're going to work with them as early customers to refine it. So do you sell to the little guys like us? Could we be customers if we wanted to be? You could, absolutely, yeah. What would we have to do? Do you have to have machine learning folks on staff, or...? So here's what you'd have to do. You can see the booth; scan it there, and anybody can see it later, and you can try our demo. Right, so you see. And you should look at the AI app that's compiled and optimized with our flow, and deployed and built with our flow. You know those style-transfer models, where you can see what you'd look like as a painting, that kind of thing. Well, we've got a lot of transcript and video data. Right, yeah, exactly. So you can use that; there's a very, very clear tutorial on how to do it. But I can use it; you're not blocking me from using it?
It's pretty much democratized for everyone. You can try the demo, and then you can request access to the platform. Yeah, but you've got a lot of more serious, deeper customers. But you can serve anybody, is what you're saying. We can serve anybody, yeah. All right, so what's the vision going forward? Or let me ask this: when did people start getting the epiphany of decoupling the machine learning from the hardware? Was it recent, a couple of years ago? Well, on the research side we helped start that trend a while ago, right? I don't need to repeat that. But I think the vision that's important here, what I want the audience to take away, is that a lot of progress has been made in creating machine learning models, right? There are fantastic tools to deal with training data and to create the models, and so on, and now there's a bunch of models that can solve real problems. The question is, how do you very easily integrate that into your intelligent applications? Madrona Venture Group has been very, very vocal and has been investing heavily in intelligent applications, both end-user applications as well as enablers, right? We see ourselves as an enabler of that, because it's so easy to use our flow to get a model integrated into your application that now any regular software developer can integrate it. And that's just the beginning, right? Because now we have CI/CD integration to keep your models updated, to continuously integrate, and then there's more downstream support for other features that you normally have in regular software development. I've been thinking about this for a long, long time, and I think, with regular code, no one thinks about the code: I write code, I deploy it. This idea of machine learning as code, independent of other dependencies, is really amazing. It's so obvious now that you say it. So what are the choices now? Say, okay, I buy it, I love it, I'm using it. Now, what do I have to do if I want to deploy it? Do I have to pick processors? Are there verified platforms that you support? Is there a short list, or is it every piece of hardware? We can actually help you with that. I hope I'm not saying we can do everything in the world here, but we can help you with that. Here's how: when you have the model in the platform, you can actually see how this model runs on any instance of any cloud; by the way, we support all three major cloud providers. And then you can make decisions. For example, if you care about latency, your model has to run in at most 50 milliseconds because you need interactivity, and beyond that you don't care if it's faster; all you care about is that it runs cheaply enough. We can help you navigate that, and you can also make the choice automatic. It's like kicking tires in the dealer showroom. You can test everything out, you can see the simulation. Are they simulations, or are they real tests? Oh no, we run it all on real hardware. As I said, we support any instance on any of the major clouds; we actually run in the cloud. But we also support a select number of edge devices today, like Arm devices and NVIDIA Jetsons. And we have the OctoML cloud, which is a bunch of racks with a bunch of Raspberry Pis and NVIDIA Jetsons, and very soon a bunch of mobile phones there too, so you can actually run on the real hardware, validate it, and test it out. So you can see that your model runs performantly and economically enough in the cloud, and that it can run on the edge.
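The decision Ceze describes, meet the latency SLA first and then be as cheap as possible, is easy to picture as a filter-and-sort over benchmark results measured on real hardware. A toy sketch, with entirely made-up target names, latencies, and prices:

```python
# Toy sketch: pick the cheapest deployment target that meets a latency SLA.
# Target names, latencies, and prices below are made up for illustration.
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    target: str            # cloud instance type or edge device
    p95_latency_ms: float  # measured on real hardware, not simulated
    cost_per_hour: float

results = [
    BenchmarkResult("cloud-gpu-large", 8.0, 3.06),
    BenchmarkResult("cloud-cpu-xlarge", 42.0, 0.38),
    BenchmarkResult("edge-jetson", 61.0, 0.00),
]

def pick_target(results, sla_ms=50.0):
    """Cheapest option whose measured latency satisfies the SLA."""
    eligible = [r for r in results if r.p95_latency_ms <= sla_ms]
    if not eligible:
        raise ValueError("no target meets the latency SLA")
    return min(eligible, key=lambda r: r.cost_per_hour)

print(pick_target(results).target)  # -> "cloud-cpu-xlarge"
```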
You're machine learning as a service. Would that be accurate? That's part of it, but we're not doing the machine learning model itself. You come with a model, and we make it deployable, make it ready to deploy. So here's why that's important. Yes. Okay, let me try. There's a large number of really interesting companies that offer models as an API, as a service. You have NLP models, computer vision models, where you call an API endpoint in the cloud; you send an image and you get a description back, for example. But those mean using a third party. Now, if you want to have your own model on your own infrastructure, but with the same convenience as an API, you can use our service, right? Today, chances are that if you have a model you want to use, there might not be an API for it. We actually automatically create the API for you. Okay, so that's why I get the "DevOps agility for machine learning." Exactly. It's a better description, because you're not providing the model as a service; you're providing the service of deploying it, like DevOps and infrastructure as code. You're now ML as code. Yeah, it's your model, your API, your infrastructure, but with all of the convenience of having this thing ready to go, fully automatic, hands off.
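One way to picture "your model, your API, your infrastructure" is a thin HTTP wrapper around a predict() function like the one sketched earlier. This is only an illustration of the idea, using FastAPI as a stand-in; the route, payload shape, and stub names are assumptions, not what OctoML actually generates.

```python
# Illustrative only: expose a model behind an HTTP endpoint, the way an
# automatically generated model API might. Names and shapes are assumptions.
from typing import List

import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

def predict(image: np.ndarray) -> np.ndarray:
    # Stand-in for the compiled-model predict() sketched earlier;
    # swap in the real hardware-aware function for an actual deployment.
    return image.mean(axis=(1, 2, 3))

class PredictRequest(BaseModel):
    pixels: List[float]  # flattened 1x3x224x224 image, for illustration

@app.post("/predict")
def predict_endpoint(req: PredictRequest):
    image = np.array(req.pixels, dtype="float32").reshape(1, 3, 224, 224)
    return {"scores": predict(image).tolist()}

# Run with, for example:  uvicorn service:app --port 8080
```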
Because I think what's interesting about this is that it brings the craftsmanship back to machine learning, because it's a craft, let's face it. Yeah, and I want human brains, which are a very precious resource, to focus on building the models that are going to solve business problems. I don't want those very smart human brains figuring out how to massage this thing to actually get it to run the right way. That should be automatic. That's why we use machine learning for machine learning to solve that. Here's an idea for you: we should write a book called Lean Machine Learning, because The Lean Startup was all about DevOps. You could call it machine leaning. No, that's not going to work here. But remember when iteration was the big mantra? Oh yeah, iterate; that came from DevOps. That's right. Yeah. It allowed for standing stuff up fast and doubling down. We all know the history and how it turned out. That was good value for development. Completely. I mean, if you don't mind me building on that point, something that we see at OctoML, and that we also see at Madrona as well, is that there's a trend towards best of breed for each one of the stages of getting a model deployed, from the data aspects of curating the data, into the model creation aspects, to model deployment, and even model monitoring, right? And we develop integrations with all the major pieces of the ecosystem, so that you can integrate, say, with model monitoring to go and monitor how your model is doing, just like you monitor how code is doing in a cloud deployment. It's evolution; I think it's a great step. And again, I love the analogy to the mainframe; I lived through those days. I remember the monolithic, proprietary stacks, and then the OSI model kind of blew it away, but that OSI stack never went full stack; it stopped at TCP/IP. So I think the same thing's going on here: we're going to see some scalability around it, to kind of uncouple it, free it. Absolutely, and sustainability and accessibility, to make it run faster and make it run on any device you want, by any developer, right? So that's the tagline. Luis Ceze, thanks for coming on. Professor, thank you. I didn't know you're a professor; it's great to have you on. It's a master class in DevOps agility for machine learning. Thanks for coming on. Appreciate it. Thank you very much. Congratulations. OctoML here on theCUBE, really important: uncoupling the machine learning from the hardware specifically. That's only going to make space faster and safer and more reliable, and that's what the whole theme of re:MARS is. Let's see how they fit in. I'm John Furrier with theCUBE. Thanks for watching. More coverage after this short break. Thank you.