From theCUBE Studios in Palo Alto and Boston, connecting with thought leaders all around the world, this is a CUBE Conversation.

Welcome back to theCUBE's coverage of the AWS Startup Showcase: the next big thing in AI, security and life sciences, featuring Ahana for the AI track. I'm your host, John Furrier. We're joined by two great guests, Steven Mih, Ahana CEO, and Sachin Nayyar, Securonix CEO. Gentlemen, thanks for coming on theCUBE. We're talking about next-gen technologies in AI, open data lakes, et cetera. Thanks for coming on.

Thanks for having us, John.

Thanks, John.

What a great lineup here. Hi, Steven. Great stuff. Sachin, let's get in and talk about your company, Securonix. What do you guys do? Take us through it; I know you've got a slide to help us through this. I want you to introduce your stuff first, and then we'll jump in with Steven.

Absolutely. Thanks again, Steven and the Ahana team, for having us on the show. So Securonix: we started the company in 2010. We are the leader in security analytics and response capability for the cyber market. Basically, this is a category of solutions called SIEM, security incident and event management. We are quadrant leaders in Gartner, have over 500 customers today and have been plugging away since 2010. We started the company really focused on analytics, using machine learning and advanced analytics to find the needle in the haystack, then moved from there to the needle in the needle stack, using more algorithms and analysis of analysis. Then we evolved the company to run on cloud and become sort of the biggest security data lake on cloud, providing all the analytics to help companies with their insider threat, cyber threat, cloud solutions and application threats emerging internally and externally, and then response. And we have a great partnership with Ahana as well as with AWS. So looking forward to this session. Thank you.

I can't wait to hear the news on that next-gen SIEM leadership.
Steven, Ahana, talk about what's going on with you guys. Give us the update; a lot of stuff happening.

Yeah, great to be here, and thanks for that, Sachin. We appreciate the partnership as well with both Securonix and AWS. Ahana is the open source company based on PrestoDB, which is a project that came out of Facebook and is widely used, one of the fastest growing projects in data analytics today. We make a managed service for Presto, easily on AWS, all cloud native, and we'll be talking about that more during the show. Really excited to be here. We believe in open source. We believe in solving all the challenges of having data in the cloud and making it easy to use. So thanks for having us again.

Yeah, I'm looking forward to digging into that managed service and why it's been so successful. Let's get into the Securonix next-gen SIEM leadership first. Share the journey towards what you guys are doing here. Obviously, open data lakes on AWS has been a hot topic. The success of data in the cloud is no doubt on everyone's mind, especially with the edge coming. It's just incredible growth. Take us through, Sachin, what you guys have got going on.

Absolutely. Thanks, John. So, I mean, we are hearing about cyber threats every day. No question about it, right? In the past, what we have done as enterprises is put all of our eggs in the basket of solutions that were evaluating the network data. With cloud, obviously there is no more network data. Now we have moved into focusing on EDR, endpoint detection, which is the right thing to do. But with that, we also need security analytics across on-premise and cloud.
And your other solutions like your OT, IoT and mobile: bringing it all together into a security data lake, running purpose-built analytics on top of that, and then having a response so we can prevent some of these things from happening, or detect them in real time versus waiting for hours or weeks and months, which is obviously too late. With some of the recent events happening around Colonial and others, we all know cybersecurity is on top of everybody's mind. First and foremost, I also want to...

Just making sure: that's slide one, and that's all built on top of the data lake, right?

Yeah, absolutely. Absolutely. Before we go into security analytics, I also want to congratulate everything going on with the new cyber initiatives from our government. We're really excited to see some of the things that the government is doing in this space to have stronger regulation and bring together the government and the private sector. From a Securonix perspective, today we have one third of the Fortune 500 companies using our technology. In addition, there are hundreds of small and medium-sized companies that rely on security analytics for their cyber protection. So what we do is, again, we are running the solution on cloud, and that is very important. It is not just important for hosting: in the space of cybersecurity, you need to have a solution where we can update the threat models and use the intelligence, the intel that we gather from our customers, partners and industry experts, and roll it out to our customers within seconds and minutes, because the game is real time in cybersecurity. That you can only do in cloud, where you have the complete telemetry and access to these environments. When we go on premise, traditionally, what you will see is customers even thinking about pushing the threat models through their standard dev/test lifecycle management, right?
And that just completely defeats the purpose. So in any event, Securonix on the cloud brings together all the data, then runs purpose-built analytics on it and helps you find the very few events that matter. Today we are pulling in several million events per second from our customers, and we provide just a very small handful of events and reduce the false positives so that people can focus on them. Their security command center can focus on that and then configure response actions on top of that, so we can take action for known issues and have intelligence in all the layers. So that's what Securonix is focused on.

Steven, he just brought up probably the most important story in technology right now: ransomware. I mean, cybersecurity in general, but ransomware in particular. He mentioned some of the government efforts. Some are saying that the ransomware marketplace is bigger than some nation-state governments. There's a business model behind it. It's highly active, it's dominating the scene, and it's a real threat. This is the new world we're living in. Cloud creates the refactoring capabilities; we're hearing that story here with Securonix. How do Presto and Securonix work together? Because I'm connecting the dots here in real time, and I think you're going to go there. So take us through it, because this is the most important topic happening.

Yeah. So as Sachin said, there's all this data that needs to go into the cloud, and it's all moving to the cloud. There are massive amounts of data, hundreds of terabytes, petabytes of data, moving into the data lakes. Those are the S3-based data lakes, which are the easiest, cheapest, commodified place to pool all this data. But in order to deliver the results that Sachin's company is driving, which is intelligence on when there's a ransomware possibility, you need to have analytics on them.
And so Presto is an open source SQL query engine for data lakes and other data sources. It was created by Facebook. It's part of the Linux Foundation, under something called the Presto Foundation. It was built to replace the complicated Hadoop stack and drive analytics with lightning-fast queries on very large sets of data. Presto fits in with this open data lake analytics movement, which has made Presto one of the fastest growing projects out there.

What is an open data lake, real quick, for the audience who wants to learn what it means? Does it mean it's open source in the Linux Foundation, or open meaning it's open to multiple applications? What does that even mean?

Yeah. Open data lake analytics means that, first of all, your data lake has open formats. So it is made up of, say, something called ORC or Parquet. These are formats that any engine can be used against, and that's really great, instead of having locked-in data types. Data lakes can have all different types of data: unstructured and semi-structured data, not just the structured data that is typically in your data warehouses. There's a lot more data going into the open data lake, and then, based on what workload you're looking to get benefit from, the insights come from that. Slide two actually covers this pictorially. If you look on the left on slide two, the open data lake is where all the data is pooling, and Presto is the layer between that and the insights, which are driven by the visualization, reporting, dashboarding and BI tools, or by applications, as in Securonix's case. Analytics are now being driven by every company, not just in security but in every industry out there: retail, e-commerce, you name it. There's healthcare, financials.
All are looking at driving more analytics for their classified applications, as well as for their own internal analysts, data scientists, and folks that are trying to be more data driven.

All right, so talk about the relationship now, where Presto fits in with Securonix, because I get the open data lake value in that. I also get what we're talking about with the cloud and being faster with the dataset. So how do Sachin, Securonix and Ahana fit in together?

Yeah, great question. So I'll give you an example. We have two Fortune 10 customers. One has moved most of their operations to the cloud, and the other is in the early stages of that process. The amount of data that we are getting from the customer who has moved fully to the cloud is 20 times, 20 times, more than from the customer who's in the early stages of moving to the cloud. That is because of the ability to add this level of telemetry in the cloud. In this case, it happens to be AWS, Office 365, Salesforce, Zscaler and several other cloud technologies, but the level of logging that we are able to get, the telemetry, is unbelievable. What it does is allow us to analyze more, protect the customers better, and protect them in real time, but there is a cost and scale factor to that. Like I said, when you are trying to pull in billions of events per day from a customer, billions of events per day, what the customers are looking for is that all of that data goes in, all of the data gets enriched so that it makes sense to a normal analyst, and all of that data is available for search, sometimes for 90 days, sometimes for 12 months. And then all of that data is available to be brought back into a searchable format for up to seven years.
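As an illustration of the retention windows Sachin describes (searchable for 90 days or 12 months, restorable to a searchable format for up to seven years), here is a minimal sketch. The function name, the tier labels and the seven-years-as-2,555-days approximation are assumptions made for this example; the conversation does not describe how Securonix actually implements its tiers.

```python
from datetime import date

# Hypothetical boundaries based on the figures quoted in the conversation.
SEARCHABLE_DAYS = 90          # some customers use 12 months instead
RESTORABLE_DAYS = 7 * 365     # "up to seven years", approximated in days


def retention_tier(event_day: date, today: date,
                   searchable_days: int = SEARCHABLE_DAYS) -> str:
    """Classify an event by age into a (hypothetical) retention tier."""
    age = (today - event_day).days
    if age < 0:
        raise ValueError("event_day is in the future")
    if age <= searchable_days:
        return "searchable"   # available for immediate search
    if age <= RESTORABLE_DAYS:
        return "restorable"   # can be brought back into a searchable format
    return "expired"          # beyond the retention horizon
```

Passing `searchable_days=365` models the customers who keep 12 months of data immediately searchable.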
So think about the amount of data we are dealing with here. We have to provide a solution for this problem at a price that is affordable to the customer, one that a small or medium-sized company as well as a large organization can afford. And again, Securonix is focused on cyber: bringing in the data, analyzing it. So after a lot of analysis, we zeroed in on S3 as the core bucket where this data needs to be stored, because of the price point, the reliability and all the other functions available on top of that, right? With S3, we've created a great partnership with AWS as well as with Snowflake, which is providing this from a bigger, enterprise data lake perspective. So now, for us to be able to provide customers the ability to search that data: data comes in, we are enriching it, we are putting it in S3 in real time. Now this is where Presto comes in. In our research, Presto came out as the best search engine to sit on top of S3. The engine is supported by companies like Facebook and Uber, and it is open source. So, open source, like you asked in your question. Companies like us cannot depend on a very small technology company to offer mission-critical capabilities, because what if that company gets acquired, et cetera? In the case of open source, we are able to adopt it. We know there is a community behind it, that it will be available for us to use, and that we will be able to contribute to it for the long term. Number two, from an open source perspective, we have a strong belief that customers own their own data. Steven used the term "locked in", and it's a key term: customers have traditionally been locked into proprietary formats, and those days are over. You own the data, and you should be able to use it with us and with other systems of choice.
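To make the idea of searching S3-resident data with SQL concrete, here is a minimal sketch of the kind of query a Presto engine could run over a date-partitioned table of events stored in an open format. The catalog, schema, table and column names are hypothetical; the conversation does not describe Securonix's actual schema. The sketch only builds the SQL text, which could then be submitted through any Presto client.

```python
from datetime import date

# Hypothetical catalog.schema.table; not taken from the conversation.
TABLE = "hive.security.events"


def failed_login_query(start: date, end: date, limit: int = 20) -> str:
    """Build a Presto SQL query counting failed logins per user in a date range."""
    if end < start:
        raise ValueError("end must not be before start")
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Dates are rendered from date objects and the limit from an int,
    # so no free-form user text is interpolated into the SQL.
    return (
        "SELECT user_name, count(*) AS failed_logins\n"
        f"FROM {TABLE}\n"
        f"WHERE event_date BETWEEN DATE '{start.isoformat()}' "
        f"AND DATE '{end.isoformat()}'\n"
        "  AND action = 'login_failed'\n"
        "GROUP BY user_name\n"
        "ORDER BY failed_logins DESC\n"
        f"LIMIT {int(limit)}"
    )


print(failed_login_query(date(2021, 6, 1), date(2021, 6, 7)))
```

Filtering on a partition column such as the assumed `event_date` is what lets the engine skip files outside the requested range, which matters when months of events are kept searchable.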
So now you get into a data search engine like Presto, which scales independently of the storage. And when we started looking at Presto, we came across Ahana, right? For every open source system, you definitely need a for-profit company that invests in the community and takes the community forward, because without a company like this, the community will die. So we are very excited about the partnership with Presto and Ahana. Ahana provides us the ability to take Presto and cloudify it, to make the cloud operations work, plus be our conduit to the Presto community, help us speed up certain items on the roadmap, and help our team contribute to the community as well. You have to take a solution like Presto, put it in the cloud, make it scale, put it on Kubernetes: the standard things that you need to do in today's world to offer it as a microservice in our architecture. In all of those areas, that's where our partnership is with Ahana and Presto and S3. We think this is the search solution for the future. With something like this, very soon we will be able to offer our customers 12 months of data, searchable at extremely fast speeds, at very reasonable price points. And you will own your own data. So the technology partnership that we have set up here has very significant business benefits for our customers. Very excited about this.

Sachin, that's very inspiring. A couple of things there. One, decentralized, owning your own data; having democratized that piece is killer. Two, open source, great point. If the company goes out of business or gets acquired or whatever, you don't want to lose the source code. That's a key enabler. And then three, a fast managed service that has commercial backing behind it. And by the way, you mentioned Snowflake, which wasn't around a couple of years ago. So this is what we're talking about. This is the cloud scale.
Steven, take us home with this point, because this is what innovation looks like. Could you share why it's working? What are some of the things that people could walk away with and learn from, now that the new architecture for the next-gen cloud is here? This is a big part of it. Share how this works.

That's right. As you heard from Sachin, every company is becoming data driven, and analytics are central to their business. There's more data, and it needs to be analyzed at lower cost without the lock-in. People want that flexibility. Slide three talks about what Ahana Cloud for Presto does. It's the best Presto out of the box. It gives your operations team real ease of use, so it can be one or two people managing this, and they can get up to speed very quickly: in 30 minutes, be up and running. That jump-starts their movement into an open data lake analytics architecture. That architecture is the one that is at Facebook, Uber, Twitter and other large web-scale, internet-scale companies. And with the amount of data that's occurring, it's now becoming the standard architecture for everyone else in the future. So, just to wrap, we're really excited about making that easy and giving you an open source solution, because the open source data stack based on data lake analytics is really happening.

I've got to ask you: you've seen many waves of the industry. Certainly you've been through the big data waves, Steven. Sachin, you're on the cutting edge, and billions of signals from one client alone is pretty amazing scale, and refactoring that value proposition is super important. What's different from 10 years ago, in the Hadoop era? You mentioned Hadoop earlier, which is RIP; obviously the cloud killed it, we all know that. But what's different now? I mean, skeptics might say, I don't believe you. This is crazy, there's no way it works. S3 costs way too much.
Why is this now so much more of an attractive proposition? What do you say to the naysayers out there? Steven, we'll start with you, and then Sachin, I want you to weigh in too.

Yeah. Well, if you think about the Hadoop era, and if you look at slide three, it was a very complicated system that was done mainly on-prem. You'd have to go and set up a big data team, rack and stack a bunch of servers, and then try to put all this stuff together. And candidly, the results and the outcomes of that were very hard to get unless you had the best possible teams and invested a lot of money in it. What you see in the slide, on the right-hand side, which shows the stack, is that now you have separate compute, which is based on Intel-based instances in the cloud. We run the best on that, and they're part of the Presto Foundation. And it's now data lakes. The distributed compute engines have become very much easier. So the big difference I see is that it's no longer called big data; it's just called data analytics, because it's now become commodified. It's become easy and the bar is much, much lower, so everyone can get the benefit of this across industries, across organizations. I think that's good for the world. It reduces the security threats, the ransomware, in the case of Securonix and Sachin here, but every company can benefit from this.

Sachin, this is a really good example, in my mind, and you can comment too on whether you believe it or not. Replatforming to the cloud, that's a no-brainer. People do that; they did it. But the value is refactoring in the cloud, right? It's thinking differently with the assets you have and making sure you're using the right pieces. I mean, it's a no-brainer: it costs more money to stand something up than to get value out of something that's operating at scale. It's a much easier equation. What are your thoughts on this? Go back 10 years and compare it to where we are now: what's different?
I mean, replatforming, refactoring, it's all kind of happening. What's your take on all this?

I agree, John. So, we have been in business now for about 10 to 11 years. When we started, my hair was all black, okay? Everything that has happened here is the transition from Hadoop to cloud. Okay, this is what the result has been, so people can see it for themselves. So when we started off, it was with partnerships with the Hadoop providers. And again, I mean, Hadoop is the foundation, right? It has now become EMR and everything else that AWS and other companies have picked up. But start with some basic premises. First, the racking and stacking of hardware: companies having to project their entire data volume up front, bring in the servers and have 50, 100, 500 servers sitting in their data centers. Then there are spikes in data, or, like I said, as you move to the cloud your data volume will increase between five and 20X, and you have to project for that. And then think about the agility: it will take you three to six months to bring in new servers and bring them into the architecture. So that's one big issue. The number two big issue is that the backend of that was built for HDFS. Hadoop, in my mind, was built to ingest large amounts of data in batches and then perform some Spark jobs on it, some analytics, right? But in security we are talking about real-time, high-velocity, high-variety data, which has to be available in real time. It wasn't built for that, to be honest. Even if you look at the Hadoop companies today, as they have defined their next generation, they have moved from HDFS to a cloud-based platform capability and have discarded the traditional HDFS architectures, right? Because it just wasn't scaling, and it wasn't searching fast enough for hundreds of analysts at the same time.
And then obviously the servers, et cetera; that wasn't working. Then, when we worked with the Hadoop companies, they were always two to three versions behind on the individual services that they had brought together. And again, when you're talking about this kind of volume, you always need to be on the cutting edge of the technologies underneath. So even while we were working with them, we had to support our own versions of Kafka, Solr, ZooKeeper, et cetera, to really bring it together and provide our customers this capability. Now that we have moved to the cloud, with solutions like EMR behind us, AWS has invested in solutions like EMR to make them scalable, to have scale-in and scale-out, which traditional Hadoop did not provide because they missed the cloud wave, right? On top of that, rather than throwing data into that traditional, older HDFS format, we are now taking the same format that it supports, the Parquet format, putting it in S3, and making it available and using all the capabilities. Like you said, the refactoring of that is critical: rather than having servers and redundancies on-prem, with S3 we get built-in redundancy, built-in lifecycle management, a high degree of confidence in data reliability, and then we get all this innovation from groups like Presto and companies like Ahana sitting on top of that S3. The last item I would say is that in the cloud, we are now able to have multiple resilient options on our side. So for example, we still have some premium searching going on with solutions like Solr and Elasticsearch. Then you have Presto and Ahana providing the majority of our searching, but we still have Athena as a backup. In case something goes down in the architecture, our queries will fall back to Athena, the AWS service built on Presto, and customers will still get served. So we have all of these options, but Athena doesn't cost us anything if we don't use it.
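The backup arrangement Sachin describes, a primary Presto engine with Athena behind it, can be sketched as an ordered fallback. This is a minimal illustration under stated assumptions: the function and the two stand-in engines below are hypothetical, and a real system would call actual Presto and Athena clients and handle errors more selectively.

```python
from typing import Callable, List, Sequence, Tuple

Rows = List[tuple]
Engine = Tuple[str, Callable[[str], Rows]]


def run_with_fallback(sql: str, engines: Sequence[Engine]) -> Tuple[str, Rows]:
    """Try each engine in order; return (engine_name, rows) from the first success."""
    errors = []
    for name, run in engines:
        try:
            return name, run(sql)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


# Stand-in engines for demonstration only.
def presto_down(sql: str) -> Rows:
    raise ConnectionError("primary cluster unreachable")


def athena_stub(sql: str) -> Rows:
    return [("alice", 3)]


engine, rows = run_with_fallback(
    "SELECT 1", [("presto", presto_down), ("athena", athena_stub)]
)
print(engine, rows)  # athena [('alice', 3)]
```

Because the backup is only invoked when the primary fails, a pay-per-query service in that slot incurs no cost while the primary is healthy, which is the point made in the conversation.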
But all of these options are not available on-prem. So in my mind, it's a whole new world we are living in. It is a world where we have now made it possible for companies, even enterprises, to think about having true security data lakes which are useful, and having real-time analytics. From my perspective, I don't even sign up today for a large enterprise that wants to build a data lake on-prem, because I know that is going to be a very difficult project to make successful. So we've come a long way, and there are several details around this that we've endured through the process, but we're very excited about where we are today.

Well, we'll certainly want to follow up with you on theCUBE on all your endeavors. Quickly, Ahana: why them, why their solution, in your words? What would be the advice you'd give me if I'm like, okay, I'm looking at this. Why do I want to use it? And what's your experience?

Right. So Presto is the standard SQL query engine for data lake analytics. More and more people have more data and want something that's based on open source and open formats, that gives you that flexibility, and that's pay as you go: you only pay for what you use. And so it proved to be the best option for Securonix to create a self-service system that has all the speed, performance and scalability that they need, which is based on the innovation from large companies like Facebook, Uber and Twitter, who have all invested heavily. We contribute to the open source project. It's a vibrant community. We encourage people to join the community, and Securonix will even be having engineers contributing to the project as well. I think, isn't that right, Sachin? Maybe you can share a little bit about your thoughts on being part of the community.

Yeah. So first, on why we chose Ahana, like John asked: the first reason is, you see Steven is always smiling. Okay, so that is very important. I mean, jokes apart, you need a great partner, right?
You need a great partner. You need a partner with a great attitude, because this is not a sprint, this is a marathon, right? So the Ahana founders, Steven, the whole team, they're world-class. The depth that the CTO has, his experience, the depth that Dipti has, who's running the cloud solution: these guys are world-class. They are very involved in the community. We evaluated them from a community perspective. They have the depth of really commercializing an open-source solution without making it too commercial, right? It's the right balance, where the founding companies like Facebook and Uber, and hopefully Securonix in the future as we contribute more and more, will have our say, and they act as the right stewards in this journey and contribute as well. And they have chosen the right niche: rather than taking portions of the product and making them proprietary, they have put the effort into the cloud infrastructure, making that product easily available on the cloud. So I think it was a no-brainer from our side. Once we chose Presto, Ahana was the no-brainer. The partnership so far has been very exciting, and I'm looking forward to great things together.

Likewise, Sachin. Thanks so much for that. We've found your team to be world-class as well in working together, and we look forward to working in the community and in the Presto Foundation too. So thanks for that.

Guys, great partnership, great insight. Really, this is a great example of cloud scale and the cloud value proposition as it unlocks new benefits: open source, managed services, refactoring, the opportunities to create more value. Steven, Sachin, thank you so much for sharing your story here on open data lakes. Again, open always wins in my mind. Thank you. This is theCUBE; we're always open, and we're showcasing all the hot startups coming out of the AWS ecosystem for the AWS Startup Showcase.
I'm John Furrier, your host. Thanks for watching.