From our studios in the heart of Silicon Valley, Palo Alto, California, this is a CUBE Conversation.

Hello and welcome to CUBE Conversations. I'm James Kobielus, lead analyst at Wikibon. Today we've got an excellent guest, a CUBE alumnus par excellence: Yaron Haviv, founder and CTO of Iguazio. Hello Yaron, welcome in. I think you're coming in from Tel Aviv, if I'm not mistaken.

Right, pretty close to Tel Aviv. Thanks, James. Nice seeing you again.

Yeah, nice to see you again. So I'm here in our Palo Alto studios, and I'm always excited when I can meet with Yaron, because he always has something interesting and new to share about what they're doing in the areas of cloud, serverless, and real-time streaming analytics, and now data science. I wasn't aware of how deeply they're involved in the whole data science pipeline. So Yaron, it's great to have you. My first question is: can you sketch out the emerging marketplace requirements you at Iguazio are seeing in the convergence of all these spaces, especially real-time streaming analytics, edge computing, serverless, and data science and AI? Can you give us a broad perspective on that convergence and on the new opportunities it enables for enterprises that are making deep investments?

Yeah, so I think we sort of anticipated what's happening now; we just called it different names, and we'll probably get into that discussion in a minute. What you see is that traditional analytics, and even data science, started in research labs: people exploring cancer, exploring the impact of weather on people's moods, et cetera. Now people are trying to make real ROI from AI and data science, so they have to plug it into business applications. It's not just a data scientist sitting in a silo with a bunch of logs he got from his friend, the data engineer, scanning them, generating some insights, and running to the boss to say, you know what, we could have made some money a year ago if we'd done something. That doesn't make a lot of impact on the business. The impact on the business happens when you actually integrate AI into chatbots, into recommendation engines, into predictive analytics for analyzing failures, preventing those failures, or saving people's lives. Those kinds of use cases are the ones that require tighter integration between the application, the data, and the algorithms that come from the AI. And that's where we started to think about our platform. We worked on the real-time data, because when you're going into a more production-oriented environment, not data lakes, you need very good, very fast integration with data. And we had this fast computation layer, which from day one was microservices; now everyone talks about microservices, but we essentially started in this area. That is what allows people to build intelligent applications that are integrated into the business applications. The biggest challenge I see today for organizations is moving from this process of doing research on historical data to translating that into a business application, or into impact on a business application. This is where people can spend a year. I saw a tweet saying, we built a machine learning model in a few weeks, and then we waited eleven months for the productization of that artifact.
Yes, that's what we're seeing at Wikibon: AI is at the heart of modern business applications, and the new generation of application developers are, in many ways, data scientists, or at least leverage the skills and tools of data science. Now, looking at Iguazio's portfolio, you've evolved rapidly to address a broader range of use cases. Over the years you've positioned Iguazio as a continuous data platform, an intelligent edge platform, a serverless platform, and now I see you're a bit of a data science workbench or pipeline tooling platform. Can you connect those dots, Yaron, and explain what Iguazio's portfolio really is?

I think those are all nice marketing names for the same technology that we've built. Four years ago, when we started, we had to call it something else. People at that time thought that analytics sort of incorporated data science, and when we said continuous analytics, we meant essentially feeding data in, running algorithms on it, and spitting out results. This was opposed to the trend of Hadoop, which was the data lake: you throw data in, you run batch analytics, and a few days later you have some insight. So continuous analytics was a term we came up with, maybe not the best one, to describe taking data in from different sources, crunching it through algorithms, and generating triggers and actions, or responses to user requests. That was pretty unique; we were pioneering this before the industry called it streaming or real-time data science or whatever.

Now, if you look at our architecture, as I explained before, it's comprised of three components. The first is a real-time, multi-model database, which is pretty exceptional in its performance and its other capabilities. The second is a microservices engine that allows us to essentially inject applications of various kinds. Initially we started with applications that do analytics: grouping, joining, correlating. Then we started adding more functions for other things, like inferencing, image recognition, sentiment analysis, et cetera. Because we have this function engine, it gives us a lot of flexibility, and a really fast parallel engine running on really fast data can generate remarkable results. Then the industry started calling this microservices thing serverless, so we were even ahead of the game in serverless. The third element of our platform is a fully managed PaaS, a platform where all those microservices and data are managed through a self-service interface; think of it as a mini cloud. In the last two years we've shifted to working with Kubernetes instead of the proprietary microservices orchestration we used originally. So we went into all three of those major technologies.

Now, those fit into different applications. One very interesting application is the edge. In the edge you need to serve a mini cloud: you need a variety of data sources and databases, columnar, records, streaming, files, et cetera, and we support all of them on our integrated platform. Then you need those microservices, which can be developed in the cloud and then shipped to the enforcement point at the edge.
And you need some orchestration there, because you want to do software upgrades and you need to handle security. So having that whole integrated stack created an opportunity for us to work with edge providers. You may have noticed our joint announcement with Google around an edge solution for retailers and IoT. We've made announcements with Microsoft in the past, and we're going to make some very interesting announcements very soon. We've made joint announcements with Samsung and NVIDIA, and those engagements continue. It's not that we're limited to the edge; it's just that, because we have an extremely high-density, very power-efficient, very well-integrated data platform, it's a great fit for the edge. But it's also the same platform that we sell in the cloud as a service, or sell to on-prem customers, so they can run the same things. In the cloud it happens to be the fastest, most real-time platform, and at the edge that's an essential capability you cannot ignore.

So, Yaron, Iguazio is a complete cloud-native development and runtime platform. Now, serverless in many ways seems to be the core capability of your platform: Nuclio, which is your technology. You've open sourced it; it's built for on-premises private clouds, but it's also extensible to broader hybrid cloud scenarios. Give us a sense for how Nuclio and serverless functions become valuable for data science, or for executing services or functions of the data science pipeline. Can you connect the dots between Nuclio and data science and AI from the development standpoint?

Sure, sure. So the two most important technology pillars we have are the data layer, where we have, I think, about 12 patents on our very high-performance data engine, and Nuclio functions. And they're very well integrated, because usually serverless is stateless, so if you want to work with data, you have some challenges with serverless. With Nuclio you can actually do stateful use cases: you can mount files, you have real-time connections to data. That makes it a lot more interesting than just Lambda functions. The other thing about Nuclio is that it's extremely high performance; it's about 200 times faster than Lambda. That means you can actually go and build things like a stream processing engine, do joins in real time against very fast data, and build what we call collectors: things that go fetch information from weather services, from routers for cybersecurity analysis, from all sorts of sensors. Those functions become like the nanobots in the movies; you just send them over to go and do things for you, whether it's data collection and crunching or the inferencing engines, the things that, for example, take a picture, compare it with a model, and decide what's in the picture. This is where Nuclio really comes into play.

The interesting point is that you now see a serverless pattern emerging in data science. There are many companies that do model inferencing as a service. Essentially what they do is launch a URL endpoint that runs the model inside: you send a vector of numeric values and you get back numeric values. It's not really different from serverless.
It's just way more limited, because usually I don't just want to send a vector of numbers. I need to send real data, like the geolocation of my cell phone with some user ID, and I need the function to cross-correlate it with other information about me and the geolocation, and then give me a recommendation for which product I should buy. Then those functions also have all sorts of dependencies: different packages, different software, environment variables, build instructions. For all of that, serverless technologies are much more suitable.

Now, it's interesting: if you go to Amazon, they have a product called SageMaker, which I'm sure you're familiar with, which is their data science platform. Although you would say that's an ideal use case for Amazon Lambda functions, they actually don't use Lambda functions in SageMaker. And you would ask yourself, why aren't they using Lambda functions in SageMaker? They tell you that you could use Lambda as glue logic around SageMaker, and that's because Lambda doesn't fit the use case. Lambda is not capable of storing large content, and machine learning models can be hundreds of megabytes in size. Lambda is extremely slow, so you cannot do high-concurrency inferencing with Lambda functions. So they essentially had to create another serverless and call it by a different name, although if they had just improved Lambda, maybe it would have been more of a Swiss Army knife.

With Nuclio, we've taken the other approach. We don't have the resources that Amazon has, so we created one serverless engine that does batch processing and stream processing, can store lots of data, and can even run continuous services: all the different computation patterns with a single engine. Then you started seeing this trend in data science of, yes, we need to version our code, we need to record all our package dependencies and all those things, and serverless handles that. So we just had to tie in more deeply to the existing frameworks. If you've looked at our projects, we have a project called nuclio-jupyter, where a data scientist can write some code in a data science notebook and then run one command, nuclio deploy. It automatically compiles the data science artifacts, the notebook and so on, and converts them into a real-time function that can listen not only on HTTP; it can listen on streams, it can be scheduled on various timings, it can do batch, and many other things.

The interesting point is that data scientists are not the best programmers, because they should be the best scientists, and that means they actually have a bigger barrier to writing code. So if you have a serverless framework that also automates the logging, the auto-scaling, the security, the provisioning of data, the versioning of everything, and the package dependencies, and they just need to focus on writing algorithms, that's actually a bigger bang for the buck. If you take serverless to DevOps teams, they'll tell you, we know how to instrument Docker, we know how to do all those things, so the value in their eyes is smaller than the value in the eyes of data scientists.
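To make that notebook-to-function flow concrete, here is a minimal, hypothetical sketch of the kind of Python handler a data scientist might write and then deploy as a real-time function. The init_context and handler(context, event) conventions follow the open source Nuclio project; the stub model, the profile table, and the scoring logic are illustrative assumptions, not Iguazio's actual code.

```python
import json

def init_context(context):
    # Runs once per worker: load state before the first request instead of on
    # every invocation. A real model (possibly hundreds of MB) would be read
    # from a mounted volume; a trivial stub keeps this sketch self-contained.
    setattr(context.user_data, "model", lambda features: int(sum(features)) % 3)
    # A stateful function can also keep fast-data lookups warm. A plain dict
    # stands in here for a real-time table keyed by user ID (hypothetical data).
    setattr(context.user_data, "profiles", {"user-123": {"age": 34, "segment": "gold"}})

def handler(context, event):
    # The request carries real context (user ID, geolocation), not just a raw
    # numeric vector, and the function enriches it from stored state.
    payload = json.loads(event.body)
    profile = context.user_data.profiles.get(payload["user_id"], {})
    lat, lon = payload["location"]
    product = context.user_data.model([lat, lon, profile.get("age", 0)])
    context.logger.info("scored a recommendation for " + payload["user_id"])
    return json.dumps({"user_id": payload["user_id"], "recommended_product": product})
```

In the flow described above, a data scientist would write something like this in a notebook and deploy it with a single nuclio deploy step, after which the same handler could be triggered over HTTP, from a stream, or on a schedule, with logging, scaling, and packaging handled by the framework.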
So that's why we're seeing this appeal to people whose focus in life is writing math and algorithms and all sorts of sophisticated things: they don't want to deal with coding and maintenance and operations. And by operationalizing their code through serverless, you can cut time to market, you can address scalability, and you avoid rewriting code. Those are the big challenges organizations are facing.

Yaron, I have to ask, that's great: you have the tools to help customers build serverless functions for AI and so forth inside Jupyter notebooks, and you mentioned SageMaker, which is an AWS solution that's up and coming in terms of supporting a full data science toolchain for pipeline development among teams. You have high-profile partnerships with Microsoft and Google and so forth. Do you incorporate, integrate, or support those cloud providers' own data science workbench offerings, or third-party offerings? There are dozens of others in this space. What are you doing in terms of partnerships in that area?

Yeah, obviously we don't want to lock ourselves out of any of those, and if someone already has their workbench, I don't want my customer to say, you're locking me into your own workbench. In our workbench, things are really nice because our Jupyter has real-time connections to the database, and it has other features that give you a huge speed boost. We've done some tricks that we announced with NVIDIA around speeds and integration with GPUs, like creating a pool of GPUs where each of the data scientists running on the platform can launch jobs on that pool instead of owning the GPUs, which, as you know, are extremely expensive. But because all of our technology besides the actual database engine is open source, we can essentially just go and install packages, and we've demonstrated that with Google and the others: we can load a bunch of packages into their workbench and make it very close to what we provide in our managed platform. Not the same performance levels, but functionality-wise the same functionality.

So can you name some reference customers that are using Iguazio inside high-performance data science workflows? Or are you just testing the waters in that market? Your technology is already fairly mature.

As I told you before, although we've changed the messaging along the way, we've always done the same thing. We do continuous analytics, and we spoke a year or two ago about some of the use cases we run: telco operators running real-time predictive health monitoring of their networks, auto-healing networks, those kinds of things. They all use algorithms to control those applications. We work with ride-hailing customers, so we can feed in a lot of data, generate real-time maps, and do fraud detection and other applications on top of all of that. And we've noticed that all of the use cases we're working with involve data science.
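As a rough illustration of the continuous analytics pattern behind use cases like the network health monitoring mentioned above, here is a small, self-contained Python sketch: events stream in, an algorithm crunches them, and a trigger fires when something looks wrong. The telemetry values, threshold, and auto-healing action are hypothetical; in the platform this logic would run as a function fed by a real stream rather than an in-memory list.

```python
from collections import deque
from statistics import mean

WINDOW = 5        # sliding window of recent latency samples per network element
THRESHOLD = 2.0   # fire a trigger when latency exceeds 2x the recent average

recent = {}       # element id -> deque of recent latency samples

def on_event(event):
    """Crunch one telemetry event; return an auto-healing trigger or None."""
    samples = recent.setdefault(event["element"], deque(maxlen=WINDOW))
    trigger = None
    if len(samples) == WINDOW and event["latency_ms"] > THRESHOLD * mean(samples):
        trigger = {"action": "auto_heal", "element": event["element"],
                   "observed_ms": event["latency_ms"]}
    samples.append(event["latency_ms"])
    return trigger

# Example stream: a latency spike on cell-7 produces an auto-healing trigger.
stream = [{"element": "cell-7", "latency_ms": v} for v in (20, 22, 19, 21, 20, 95)]
for evt in stream:
    action = on_event(evt)
    if action:
        print(action)
```

A serverless function doing the same job would receive each event from a stream trigger and emit the resulting action to a downstream system instead of printing it.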
In some cases, by the way, because of organizational politics, once we said we do analytics or continuous analytics, we were sent to the analytics group within the organization, which is more focused on data warehousing, because to them analytics still means data warehousing and Hadoop. The actual people who build the applications, the data science applications and real-time AI, who incorporate AI into business applications, are more the development and business people. This is also why we changed our naming: we wanted to make it very clear that our technology is about building new applications. It's not about data warehousing, or faster queries on a data warehouse; it's about generating value for the business.

Now, if you ask about specific applications: we announced a few weeks ago Samsung's investment in Iguazio. Beyond getting a few million dollars, that has two pillars. One is that they've adopted Nuclio as the serverless framework for their internal clouds. The second is that we're working with them on a bunch of data science use cases; one of them, I think, was even quoted in the announcement we made. There are limits to what I can and can't say, but essentially those are real business applications, at least three of them, that involve intercepting data from users and customers, doing real-time analytics, and responding really quickly. And some of what we announced was possible because of Nuclio and some of the tricks we've done with Nuclio.

Yaron, do you see a fair number of customers embedding machine learning inside real-time streaming, stream computing backbones? This is the week of Flink Forward here in San Francisco; I was at the event earlier this week and I saw them presenting a fair amount of uptake of ML in stream computing. Do you see that becoming a mainstream best practice?

Streaming is still in the analytics bucket, okay? Because what we're looking for is applications that are more interactive. If you think about a chatbot, or doing predictive analytics, it's not about streaming, because streaming is a faster flow of data but it still has delay associated with it. It's not responsive. The aspect of latency is less critical in streaming; the aspect of throughput is higher in streaming, but not necessarily the response time. Think about Spark Streaming: it's good at processing a lot of data, but it's definitely not good at response time. No one would put Spark in the path of responding to user requests on the internet, okay? So we're doing streaming and we see the growth there, but where we see the real growth is embedding this into real applications: the ones where a customer logs in and sends a request, or, working with telcos, scenarios where technicians have RFID on their trucks and send all sorts of real-time inventory information, so when a customer calls and says, I need a set-top box, they can tell exactly which technician should go to that customer. Because how many times have you had a technician come to your house and say, no, I don't have that part? Exactly. And then they have to send a different person. So how do you impact the business? On three pillars of the business.
Okay, the three pillars are: one is improving your operations or reducing risk, which is essentially the cost-reduction aspect. The second is how you grab more customers or make customers more successful; this is around front-end applications, whether it's bots or targeted marketing or those kinds of use cases. And the third is how you grow your market, which again is around recommendation engines and those kinds of things. For all of those, if you don't have AI incorporated into your business applications in a few years, you're probably going to be dead. I don't see any business sustaining the competition without the ability to integrate real data with customer data and react based on that.

Let me change the subject slightly. You mentioned NVIDIA as a partner; you announced that a few weeks ago at their event, and they've recently acquired Mellanox. I believe you used to be with Mellanox, so I'd like to get your commentary on that acquisition, or merger.

Right, yeah. I was VP of data center at Mellanox in my last job, and I'm still good friends with all the guys there, including the CEO and the rest of the team. We met last week when I was in Israel, and we've talked to the NVIDIA guys as well. I think it's a great merger. If you think about it, Mellanox has the best networking and storage technology on the silicon side, and NVIDIA has the best GPU technologies. Mellanox also acquired some compute chip technologies, and they have very nice photonics technologies. Mellanox today is used by all the cloud providers; my previous role was essentially owning those technical engagements with Azure and the rest of the gang. So now, with NVIDIA coming with the computation engine and Mellanox coming with the rest of the pieces around storage and networking, that makes them a very strong player, and I think it threatens Intel, because Intel hasn't really managed to compete in high-speed networking recently, and they haven't really managed to come up with GPUs that can combat NVIDIA's technology. So I think that makes NVIDIA a pretty strong vendor in that space.

Another question, not related to that: you're in Tel Aviv, and of course Israel is famous for its startups in machine learning, especially with a focus on cybersecurity. I think Israel is near the top of the world in terms of the amount of brainpower focused on cybersecurity. What are the hot machine-learning-related developments or innovations you see coming out of Israel recently, related to cybersecurity and distributed cloud environments? Anything in terms of basic R&D technology we should be aware of that will find its way into mainstream cloud, Kubernetes, and serverless environments going forward?

Yeah, so there are different areas. The guys in Israel also look at what happens in the US, and there are players in all the different spaces. I think what's unique about us as a small country is that we're always trying to think outside the box, because we know we cannot compete in a very large market if we don't have innovation; that's what triggers the innovation. And because of all the security aspects in the country, there's also a lot of cyber.
But I've seen one cool startup, also backed by our VC, doing something like face un-recognition: pretty cool technology where they can take a picture and alter it so that machine learning would not be able to recognize it. Sort of an anti-cyber-attack for image recognition. That's something pretty unique that I've seen, but there are other startups working on all the aspects of DevOps and automation, AutoML, and automating cybersecurity in various ways.

Right, right. Thank you very much, Yaron. This has been an excellent conversation; we've really enjoyed hearing your comments. Iguazio is a great company, quite an innovator, and it's always a pleasure to have you on theCUBE. With that, I'm going to sign off. This is James Kobielus with Wikibon, with Yaron Haviv, and we bid you all a good day. Thank you.