From around the globe, it's theCUBE with digital coverage of AWS re:Invent 2020. Special coverage sponsored by AWS Global Partner Network.

Okay, welcome back everyone to theCUBE's virtual coverage of re:Invent 2020 virtual. Normally we're in person; this year, because of the pandemic, we're doing remote interviews. And we've got great coverage here of the APN, the Amazon Partner Network, experience. I'm your host, John Furrier. We are theCUBE virtual. Got a great guest from Tel Aviv, remotely calling in and videoing in: Nimrod Vax, who is chief product officer and co-founder of Big ID. This is the beautiful thing about remote. You're in Tel Aviv, I'm in Palo Alto. Great to see you. We're not in person, but thanks for coming on.

Thank you, great to see you as well.

So you guys have had a lot of success at Big ID. I've noticed a lot of awards: startup to watch, company to watch. Kind of a good market opportunity: data, data at scale, identification. As the web evolves beyond web presence, identification and authentication are super important. You guys are called Big ID. What's the purpose of the company? Why do you exist? What's the value proposition?

So first of all, best startup to work at, based on Glassdoor, worldwide. So that's a big achievement too. So look, four years ago we started Big ID when we realized that there is a gap in the market between the new demands from organizations in terms of how to protect the personal and sensitive information that they collect about their customers and employees. The regulations were becoming more strict, but the tools that were out there, and to a large extent are still out there, were not providing for those requirements, and organizations had to deal with some of those challenges in manual processes. For example, the right to be forgotten: organizations need to be able to find and delete a person's data if that person wants to be deleted. That's based on GDPR and later on even CCPA.
And organizations had no way of doing it because the tools that were available could not tell them whose data it was that they found. The tools were very siloed. They were looking either at unstructured data in file shares or Windows and so forth, or they were looking at databases. There was nothing for big data. There was nothing for cloud business applications. And so we identified that there was a gap here, and we addressed it by building Big ID, basically to address those challenges.

That's great, great stuff. And I remember four years ago I was banging on the table and saying regulation can stunt innovation, because you had the confluence of massive platform shifts combined with the business pressure from society. That's not stopping. It's continuing today. You're seeing it globally, whether it's fake news and journalism or privacy concerns for modern applications. This is not going away. You guys have a great market opportunity. What is the product? What is Small ID? What do you guys have right now? How do customers maintain their success as the ground continues to shift under them, as platforms become more prevalent: more tools, more platforms, more everything?

So I'll start with Big ID. What is Big ID? Big ID really helps organizations better manage and protect the data that they own. It does that by connecting to everything you have, structured databases and unstructured file shares, big data, cloud storage, business applications, and then providing very deep insight into that data: cataloging all the data so you know what data you have where, classifying it so you know what type of data you have, cluster analyzing the data to find similar and duplicate data, and then correlating that to an identity. Very strong, very broad solution. Fit for an IT organization. We have some of the largest organizations out there as customers: the biggest retailers, the biggest financial services organizations, manufacturing, et cetera.
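
The "find similar and duplicate data" step mentioned above can be illustrated with a small sketch. This is not Big ID's method, which the interview does not detail; it is a deliberately simple stand-in that groups texts by Jaccard similarity over word shingles. The function names, the sample file names, and the 0.5 threshold are all assumptions made for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles for a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_similar(docs, threshold=0.5):
    """Greedy single-pass grouping (an illustrative simplification).

    Each document joins the first cluster whose representative is at
    least `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative_shingles, [document names])
    for name, text in docs.items():
        sh = shingles(text)
        for rep, members in clusters:
            if jaccard(sh, rep) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((sh, [name]))
    return [members for _, members in clusters]


# Hypothetical sample data: two near-duplicate records and one unrelated file.
docs = {
    "a.txt": "customer record for jane doe with account number 12345",
    "b.txt": "customer record for jane doe with account number 12346",
    "c.txt": "quarterly revenue report for the finance team",
}
groups = cluster_similar(docs)
```

Here the two customer records share six of eight distinct shingles, so they land in one group while the revenue report stays on its own. A production system would need something that scales beyond pairwise comparison.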
What we are seeing is that, with the adoption of cloud and with the success, obviously, of AWS, there are a lot of organizations that are not as big, that don't have an IT organization, that have a very well functioning DevOps organization but still have a very big footprint in Amazon and in other kinds of cloud services. And they want to get visibility, and they want to do it quickly. And Small ID is really built for that. Small ID is a lightweight version of Big ID that is cloud native, built for your AWS environment. What that means is that you can quickly install it using a CloudFormation template straight from the AWS Marketplace, quickly stand up an environment that can scan and discover the assets in your account automatically, and get immediate visibility into your S3 buckets, into your DynamoDB environments, into your EMR clusters, into your Athena databases, immediately building a full catalog of all the data. So you know what files you have where, you know what tables, what technical metadata, operational metadata, business metadata, and it also classifies that information. So you know where you have sensitive information, and you can immediately address that and apply controls to that information.

So this is data discovery. So the use case: I'm an Amazon partner. I mean, theCUBE virtual is on Amazon. But let's just say, hypothetically, we're growing like crazy. Got S3 buckets over here, secure, encrypted at rest and all that stuff. Things are happening. We're growing like a weed. Do we just deploy Small ID? Is that how it works? Is that the use case? Small ID is for AWS and Big ID is for everything else, or?

So with Small ID, you can start small. You get the visibility you need. You can leverage the automation of AWS so that you automatically discover those data sources, connect to them and get visibility. And you can grow into Big ID using the same deployment inside AWS.
You don't have to switch or migrate. You use the same container cluster that is running inside your account, automatically scale it up, and then connect to other systems or benefit from the more advanced capabilities that Big ID can offer, such as correlation: connecting to, say, your Salesforce CRM system and getting the ability to correlate to your customer data and understand whose data it is that you're storing, connecting to your on-premises mainframe with the same deployment, connecting to your Google Drive or Office 365. But the point is that with Small ID you can really start quickly, with a very small team, and get that visibility very quickly.

Nimrod, I want to ask you a question. What is the definition of cloud-native data discovery? What does that mean to you?

So cloud-native means that it leverages all the benefits of the cloud, right? It gets all of the automation and visibility that you get in a cloud environment versus a traditional on-premises one. For one thing, Big ID is installed directly from the marketplace. So you can browse, find the solution on the AWS Marketplace and purchase it. It gets deployed using CloudFormation templates, very easily, very quickly. It runs on Elastic Container Service, so that once it runs you can automatically scale it up and down to increase the scan and scale capabilities of the solution. It connects automatically, behind the scenes, into AWS Security Hub, so you get the policy alerts fed into your Security Hub. It also has integration directly into the native logging capabilities of AWS, so your existing Datadog, or whatever you're using for monitoring, can plug into it automatically. That's what we mean by cloud-native.

And, you know, if you're cloud-native you've got to be positioned to take advantage of the data, machine learning in particular. Can you expand on the role of machine learning in your solution? Customers are leaning in heavily this year.
You're seeing more uptake on machine learning, which is basically AI; AI is machine learning, but it's all tied together. ML is big on all the deployments. Can you share your thoughts?

Yeah, absolutely. So data discovery is a very tough problem, and it has been around for 20 years. The traditional method of classifying the data, or understanding what type of data you have, has been looking at the pattern of the data: typically regular expressions or other kinds of pattern matching techniques. But sometimes, in order to know what is personal or what is sensitive, it's not enough to look at the pattern of the data. How do you distinguish between a date of birth and any other date? Date of birth is much more sensitive. How do you find country of residency, or how do you even distinguish a first name from a last name? So for that, you need more advanced, more sophisticated capabilities that go beyond just pattern matching. And Big ID has a variety of those techniques. We call that discovery in depth. What it means is very similar to security in depth, where you cannot rely on a single security control to protect your environment: you cannot rely on a single discovery method to truly classify the data. So yes, we have regular expressions. That's table stakes, the basic capability of data classification. But if you want to find data that is more contextual, like a first name, a last name, even a phone number, and distinguish between a phone number and just a sequence of numbers, you need more contextual, NLP-based discovery. We're using named entity recognition to extract and find data contextually. We also apply a deep learning capability called CNN (convolutional neural network) in order to identify and classify document types, which is basically being able to distinguish between a resume and an application form, finding financial records, finding medical records. So our advanced NLP classifiers can find that type of data.
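
To make the contrast between pattern matching and contextual classification concrete, here is a minimal sketch. It is not Big ID's classifier: the keyword list and the 30-character context window are assumptions chosen for illustration, and a real system would use trained NLP models such as named entity recognition rather than keyword hints.

```python
import re

# Pattern-only detection: matches any date written as YYYY-MM-DD or MM/DD/YYYY.
DATE_RE = re.compile(r"\b(?:\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{4})\b")

# Hypothetical context hints that suggest a date is a date of birth.
DOB_HINTS = ("date of birth", "dob", "born", "birthday", "birth date")


def classify_dates(text, window=30):
    """Return (date_string, label) pairs for every date found in `text`.

    The regex alone cannot tell a birth date from an invoice date, so the
    label is decided by looking for a hint in the `window` characters
    immediately before each match.
    """
    results = []
    lowered = text.lower()
    for match in DATE_RE.finditer(text):
        start = max(0, match.start() - window)
        context = lowered[start:match.start()]
        is_dob = any(hint in context for hint in DOB_HINTS)
        results.append((match.group(), "date_of_birth" if is_dob else "date"))
    return results


sample = "Invoice issued 2020-11-30. Customer DOB: 04/12/1985."
labels = classify_dates(sample)
```

Both strings match the same pattern, but only the second is preceded by a date-of-birth hint, so only it gets the more sensitive label. The same idea, stacking a contextual check on top of a pattern match, is what separates a phone number from an arbitrary run of digits.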
The more advanced capabilities, which go beyond Small ID and into Big ID, also include cluster analysis, which is an unsupervised machine learning method of finding duplicate and similar data, as well as correlation and other techniques that are more contextual and need to use machine learning.

Yeah, and unsupervised is also a lot harder than supervised. You need to have that ability. What you can't see, you've got to get the blind spots identified, and that's really the key observational data you need. This brings up the operational side: you hear cluster, I hear governance, security, and you mentioned GDPR earlier. This is an operational impact. Can you talk about how it impacts specifically the privacy protection and governance side? Because certainly I get the clustering side of it; operationally, it's great, everyone needs to get that. But on the business model side, this is where people are spending a lot of time scared and worried, actually, about what the hell to do.

Yeah, one of the things that we realized very early on when we started with Big ID is that everybody needs discovery. We actually started with privacy: you need discovery in order to map your data and apply the privacy controls. You need discovery for security, like we said, right? Find and identify sensitive data and apply controls. And you also need discovery for data enablement. You want to discover the data in order to enable it, to govern it, to make it accessible to the other parts of your business. So discovery is really a foundation, a starting point that you get with Small ID. How do you operationalize that? Big ID has the concept of an application framework. Think about it like an Apple App Store for data discovery, where you can run applications inside your kind of discovery iPhone in order to run specific use cases. So how do you operationalize privacy use cases?
We have applications for privacy use cases like subject access requests and data rights fulfillment, right? Under the CCPA, you have the right to request your data: what data is being stored about you? Big ID can help you find all that data in the catalog. After we scan, we find that information; we can find any individual's data. We also have an application in the privacy space for consent governance, right? Under CCPA, you have the right to opt out. If you opt out, your data cannot be sold, cannot be used. How do you enforce that? How do you make sure that if someone opted out, that person's data is not being pumped into Glue, into some other system, or into analytics in Redshift or Snowflake? Big ID can identify a specific person's data, make sure that it's not being used for analytics, and alert if there is a violation. So that's just an example of how you operationalize this knowledge for privacy. And we have more examples for data enablement and data management as well.

There's so much headroom, so much opportunity to build out new functionality and make it programmable. I really appreciate what you guys are doing; it's totally needed in the industry. I can just see endless opportunities to make this operationally scalable or programmable once you get the foundation out there. So congratulations, Nimrod and the whole team. A question I want to ask you: we're here at re:Invent virtual. Three weeks, we're here covering CUBE Action; check out the CUBE Experience Zone, the Partner Experience. What is the difference between Big ID and, say, Amazon's Macie? I mean, let's think about that. How do you compare and contrast? Amazon says, we love partnering, but we promote our ecosystem. You guys have a similar thing. What's the difference?

There's a big difference. Yes, there is some overlap, because both Small ID and Macie can classify data in S3 buckets. And Macie does a pretty good job at it, right? I'm not arguing about it. But Small ID is slightly different.
It's not only about scanning for sensitive data in S3. It also scans anything else you have in your AWS environment, like DynamoDB, like EMR, like Athena. We're also adding Redshift soon, Glue and other data sources as well. And it's not only about identifying and alerting on sensitive data; it's about building a good catalog of data. It's about giving you almost like a phone registry of your data in AWS, where you can look up any type of data and see where it's found across the structured, unstructured and big data repositories that you're handling inside your AWS environment. So it's broader than just security. Apart from the fact that it can be used for privacy, I would say the biggest value is in building that catalog and making it accessible for data enablement: enabling your data across the board for other use cases, for analytics in Redshift, for Glue, for data integrations, for various other purposes. We also have integration into Kinesis, so you'll be able to scan, and it will let you know which topics use what type of data. So it's really a very, very robust, full-blown catalog of the data across the board that is dynamic. And also, like you mentioned, accessible via APIs, you know, very much in the AWS tradition.

Yeah, great stuff. I've got to ask you a question while you're here. You're the co-founder, and again, congratulations on your success, and also the chief product officer of Big ID. What's your advice to your colleagues and potentially new friends out there that are watching? And let's take it from the entrepreneurial perspective. You know, I have an application and I start growing, and maybe I have funding, maybe I take a more pragmatic approach versus raising billions of dollars. But as you grow, there's the pressure of app sec reviews, of having all the table stakes features. How do you advise developers or entrepreneurs or even business people, small and medium-sized enterprises, to prepare?
Is there a way, is there a playbook, rather than looking back and saying, oh, I didn't do all the things, I've got to go back and retrofit, get Big ID? Is there a playbook that you see that will help companies so they don't get killed with app sec reviews and privacy compliance reviews? It could be a waste of time. I mean, what are your thoughts on this?

I think that very early on, when we started Big ID, our perspective was that we knew we were a security and privacy company. So we had to take that very seriously up front and be prepared. Security cannot be an afterthought; it's something that needs to be built in. And from day one, we have taken all of the steps that were needed in order to make sure that what we're building is robust and secure. That includes, obviously, applying all of the code and CI/CD tools that are available for testing your code, whether it's Snyk or Checkmarx or those types of tools, doing penetration testing, and working with the best-in-class pen testing companies and white hat hackers who will look at your code. These are the kinds of things that you get funding for, right? And you need to take advantage of that and use it. As soon as we got bigger, we also invested in a very strong CISO who comes from the industry and has a lot of expertise and a lot of credibility. We also have CISO groups. So at each step of funding, we've used it extensively to make our security posture a lot more robust and visible.

Final question for you: when should someone buy Big ID? When should they engage? Is it something that people can just download immediately and integrate? Is the go-to-market targeting the VP level? How does someone know when to buy you, download it and use the software? Take us through the use case of how customers engage.
So customers typically have those requirements when they start having to comply with regulations around privacy and security. Very early on, especially, organizations that deal with consumer information get to a point where they need to be accountable for the data that they store about their customers, and they want to be able to know their data and provide the privacy controls their consumers need. For our Big ID product, this is typically a medium-size and up company with an IT organization. Small ID is a good fit for companies that are much smaller, whose IT is basically their DevOps team. Once they have more than 10 or 20 data sources in AWS, that's where they start losing count of the data that they have, and they need to get more visibility and be able to control what data is being stored there, because very quickly you start losing count of that information. Even for an organization like Big ID, which isn't a big organization, right? We are 200 employees. We are at the point where it's hard to keep track and keep control of all the data that is being stored in all of the different data sources, right? In AWS, in Google Drive and some of our other systems, right? And that's the point where you need to start thinking about having that visibility.

You know, all growth plans start big: dream big, start small and get big. I think that's a nice pathway. So Small ID gets you going, and it leads right into Big ID. Great stuff. Final question for you while I've got you here. Why the awards? Someone's like, hey, Big ID, this is a cool company. Love the founder, love the team, love the value proposition, makes a lot of sense. Why all the awards?

Look, I think one of the things that was compelling about Big ID from the beginning is that we did things differently. Our whole approach to personal data discovery is unique. Instead of looking at the data, we started by looking at the identities.
The people, and only then looking at their data, learning what their data looks like and then searching for that information. So that was a very different approach from the traditional approach to data discovery. And we continue to innovate and to look at those problems from a different perspective, so we can offer our customers an alternative to what was done in the past. That's not to say that we don't do the basic stuff: the regexes, the connectivity that is needed. But we always took a slightly different approach, to diversify, to offer something slightly different and more comprehensive. And I think that was the thing that really attracted attention to us from the beginning, with the RSA Innovation Sandbox award that we won in 2018, the Gartner Cool Vendor award that we received, and later on the other awards as well. I think that's the unique aspect of Big ID.

You know, you solve big problems, and certainly it's needed. You know, we saw this early on. And again, I don't think the problem's going to go away anytime soon. Platforms are emerging, and there are more tools than ever before that converge into platforms. And as the logic changes at the top, all that's moving on to the underground. So congratulations, great insight.

Thank you very much.

Thank you for coming on theCUBE. Appreciate it, Nimrod. Okay, I'm John Furrier. We are theCUBE Virtual, here for the Partner Experience, APN Virtual. Thanks for watching.