Live from Midtown Manhattan, it's theCUBE, covering Big Data New York City 2017, brought to you by SiliconANGLE Media and its ecosystem sponsors.

Okay, welcome back everyone. Live here in New York, this is theCUBE's coverage of Big Data NYC, our event; we've been doing it for five years. This is our event in conjunction with Strata Data, which is O'Reilly Media's. We run a separate event, but we've been covering big data for eight years, since 2010 and Hadoop World. This is theCUBE, and of course theCUBE is never going to change. They might call it Strata AI next year, whatever trend they might see, but we're going to keep it theCUBE. This is New York City, our eighth year of coverage. Guys, welcome to theCUBE. Our next two guests are Andrew Burt, Chief Privacy Officer, and Andrew Gilman, Chief Customer Officer and CMO. These are startups, so you've got all these fancy titles, but you're on the A team from Immuta, a hot startup. Welcome to theCUBE. Great to see you again.

Thanks for having us, appreciate it.

Okay, so you guys have the startup feature here this week on theCUBE, our little segment here. I think you guys are the hottest startup out there that people aren't really talking a lot about. So you guys are brand new, you've come with a really good reputation, and you get a lot of props inside the community, especially from the people who know data, data science, and some of the intelligence organizations. Respected people like Dan Hutchins say you guys are rock stars and doing great. So why all the buzz inside the community? Now you guys are just starting to go to market. What's the update on the company?

So, great story. Founded in 2014, with our Series A investment announced earlier this year. And the team is a group of serial entrepreneurs who sold their last company to CSC and ran the public sector business for them for a while.
And it's a really special group of engineers and technologists and data scientists, headquartered out of DC, with a customer success organization out of Columbus, Ohio, and we're servicing Fortune 100 companies. So Immuta, I-M-M-U-T-A. Immuta.com; we just launched a new website earlier this week in preparation for the show. And the easiest way to say...

So Immuta, immutable, I mean. Immutable, I'm sure there's a backstory.

Immutable, yeah. We do not ever touch the raw data. So we are all about managing risk and managing the integrity of the data. And so risk and integrity and security are baked into everything we do. We want our customers to know that their data will be immutable, and that in using us, they'll never pose an additional risk.

When I think blockchain, I think about immutability. I'm so into blockchain these days, as you guys know, I'm totally into it.

There's no blockchain in our technology.

I know, but let's get down to the motivation to enter the market. There's a lot of noisy stuff out there. Why do we need another unified platform?

Yeah, so the big opportunity that we saw was that organizations had spent basically the past decade refining and upgrading their application infrastructure, right? And doing so under the guise of digital transformation, right? So we've really built out organizations, people, and processes to support monolithic applications. Now those applications are moving to the cloud. They're being re-architected in a microservices architecture. So we have all this data now. How do we manage it for the new application, which we see as really algorithm-centric, right? The Amazons of the world have proven how you compete against anyone, how you disrupt any industry: you operationalize your data in a new way.

Well, they were developer-centric, right? They were very focused on the developer. You guys are saying you're algorithm-centric, meaning the software within the software kind of thing.
It's really about how we see the future enterprise: to compete, you have to build thousands of algorithms. And each one of those algorithms is going to do something very specific, very precise, but faster than any human can do. And so how do you enable an algorithm-centric infrastructure to support that? And today, as we go and meet with our customers and other groups, the people, the processes, the data are everywhere. There are the governance folks who have to control how the data is used, the laws are dynamic, the tooling is complex, right? So this whole world looks very much like pre-DevOps IT or pre-cloud IT. It takes on average between four and six months to get a data scientist up and running on a project.

So let's get into the company. I want to just get the gist out and put some context on it. I see the problem you saw: a lot of algorithms out there, and more and more open source is coming onto the scene. The Linux Foundation has its new rebranded event, Open Source Summit, which shows exponential growth in open source. So no doubt about it, software is coming, new guys are coming on, new gals, tons of software. What is the company positioning? What do you guys do? How many employees? Let's go down by the numbers and then talk about the problem that you saw.

Okay, cool. So the company, we'll be about 40 people by Q1. Heavy engineering, go to market. We're operating and working with, as I mentioned, Fortune 100 clients in highly regulated industries: financial services, healthcare, government, insurance, et cetera, right? So where you have lots of data that you need to operationalize, that's very sensitive to use. What else? Company positioning: we are positioned as data management for data science. So the opportunity that we saw, again, is that managing data for applications is very different than managing data for algorithm development.

So you're selling to the data scientists? The CDO, Chief Data Officer? You're selling to the analytics people?
Yeah, so with a lot of our customers, like in financial services, we're going right into the line of business, right? We're working with managing directors who are building out next-generation analytics infrastructure and who need to unify and connect the data in a new way that's dynamic, right? It's not just the data that they have within the organization; they're looking to bring data in from outside. They also want to work collaboratively with governance professionals and lawyers. You know, we always jest in the company that different organizations have these cool new tools: the data scientists have all their new tools, and the data owners have flash disks and all this, but the governance people still have Microsoft Word. And maybe the newer tools are like a wiki, so now we can get it off of Word and make it shareable. But what we allow them to do, and what Andrew Burt has really driven, is the ability for you to take internal logic, internal policies, and external regulations, and put them into code that becomes dynamically enforceable as you're querying the data, as you're using it to train algorithms and to drive, you know, mathematical decision-making in the enterprise.

Andrew, let's jump into some of the privacy, the Chief Privacy Officer role, which is code words for you doing all the governance stuff. And there's a lot of stuff business-wise that's going on around GDPR, which is actually relevant. There's a lot of dollars on the table for that too, so it's probably good for business. But there's a lot of policy stuff going on. What's going on for you guys in this area?

So I think policy is really catching up to the world of big data. We've known for a very long time that data is incredibly important. It's the lifeblood of an increasingly large number of organizations. And because data is becoming more important, laws are starting to catch up.
And I think GDPR is really, you know, it's hot to talk about. I think it's just the beginning of a larger trend.

People are scared.

Yeah.

I mean, people are nervous. They don't know, this could be a blank check that they're signing away.

Well, so...

The enforcement side is pretty outrageous. Is that right? I mean, are people scared? What do you think?

Yeah, I think people are terrified because they know that it's important. And they're also terrified because data scientists and folks in IT have never really had to think very seriously about implementing complex laws. I think GDPR is the first example of laws forcing technology to basically blend software and law. One of our theses is that the only way to actually solve for GDPR is to embed laws within the software you're using. And so we're moving away from this meetings-and-memos type approach to governing data, which is very slow. It can take months. And we need it to happen dynamically.

This is why I wanted to bring you guys in. Not only, Andrew, did we know each other from another venture, but what got my attention with you guys was really this intersection between law and society and tech. And this is just the beginning. I mean, you look at the telltale signs there. Peter Burris, who runs research for Wikibon, coined the term "programming the real world." Basically you've got wearables, you've got IoT. This is happening. Self-driving cars. Who decides what side of the street people walk on now? I mean, law and code are coming together. That's algorithms, and there'll be more of them. Is there an algorithm for the algorithms? Who teaches the data set? Who shares the data set? Wait a minute, I don't want to share my data set because I have a law that says I can't. Who decides all this stuff?

So that's exactly it. We are starting to enter a world where governments really, really care about that stuff.
Just in...

But Silicon Valley, it's not in their DNA. And you're seeing it all over the front pages of the news. They can't even get it right on inclusion and diversity. How can they work with laws?

Tension, tension is brewing. And in the US, our regulatory environment is a little more lax. We want to see innovation happen first and then regulate, but the EU is completely different. There are laws in China and Russia and elsewhere around the world. And it's basically becoming impossible to be a global organization and still take that approach where you can afford to be scared of that.

I don't know how I feel about this, because I get all kinds of, you know, rushes from intoxication to fear. I mean, look at what's going on with Bitcoin and blockchain. There's a whole new counterculture going on around immutable data, but also anonymous cultures, with a completely anonymous underbelly going on.

I just think the risk factor's going up. You mentioned IoT, so it's where you are and your devices and your home. Now think about 23andMe, Verily, Freenome, where you're digitizing your DNA. We already started to do that with MRIs and other operations that we've had. But think about it now: I'm handing over my DNA to an organization because I want to find out my lineage. I want to learn about where I came from. How do I make sure that the derived data off of that digital DNA is used properly, not just for me as Andrew, but for my progeny, right? And so that introduces some really interesting ethical issues. And it's an intersection of this new wave of investment, to your point, like in Silicon Valley, bringing healthcare into data science, into technology, and the underpinning of the whole thing is the data, right? It's how do we manage the data and what do we do?

And AI really is the future here.
And even though machine learning is the key part of AI, we just ran an article this morning on SiliconANGLE from Gina Smith, our new writer: "Google Brain chief: AI tops humans in computer vision, and healthcare will never be the same."

Absolutely.

And they talk about little things, like in 2011 you could barely do character recognition of pictures, and now you can, 100%. Now you take that forward. In Heidelberg, Germany, the event we were covering this week, the Heidelberg Laureate Forum, or HLF 2017, all the top scientists were there talking about this specific issue: this is society blending in with tech. So there's societal impact and legal impact, kind of blending. Algorithms are the only thing that's going to scale in this area. This is what you guys are trying to do, right?

Exactly, I mean, that's the interesting thing. When you look at training models and algorithms in AI, AI is the new buzz, it's the new cloud. We're in New York, I'm walking down the street and there's "the algorithm era," and there are Ernst & Young billboards on algorithms. I mean, who would have thought, right? And AI, you know...

theCUBE is going to be an AI too, soon. "Hey, we're AI, brought to you by..." Hey Siri, do theCUBE interview.

But the interesting part of the whole AI and algorithm story is, you have n number of models, right? We have lots of data scientists and AI experts.

Right, Siri goes off.

Sorry Siri, I was trying to join the conversation. Didn't mean to insult you, Siri.

But it's all just applied math, right? By a different name. And you have n number of models; I mean, 90% of all algorithms are a single linear regression. What ultimately drives the outcome is going to be how you prepare and manage the data. And so when we go back to the governance story, governance in applications is very different than governance in data science, because how we actually dynamically change the data is going to drive the outcome of that algorithm directly.
So if I'm Immuta, right? We connect the data, we connect the data science tools, and we allow you to control the data in a unique way. I refer to that as data personalization. So it's not just whether I can subscribe to the data, it's what the data looks like based on who I am and what those internal and external policies are. So think about this, for example. I'm training a model that doesn't mask against race and doesn't generalize against age. What do you think is going to happen to that model when it starts to interact, say when it's delivered as a chatbot?

Context is critical, and so is the usability of data, because it's perishable at this point. Data that comes in quick is worth more, and historically the value goes down over time, but it's worth more when you train the machine. So two different issues.

Exactly. So it's really about longevity of the model. How can we create and train a model that's going to be able to stay in production? It's like the new availability, right? It's going to stay, it's going to be relevant, and it's going to keep us out of jail and keep us from getting sued as long as possible.

Well, I'm going to quote Jeff Dean on one more thing to get context, and I want to ask Andrew over here about his view on this. So Jeff Dean, the Google Brain chief behind all this stuff, is talking about AI-enabled healthcare. The sector is set to grow at an annual rate of 40% through 2021, when it's expected to hit 6.6 billion in spend on AI-enabled healthcare. 6.6 billion. Today it's around 600 million. That's the growth just in AI healthcare impact. Just healthcare. This is going to go beyond a policy and privacy issue. For one, healthcare data has been crippled by HIPAA, which slows this down, but where's the innovation going to come from? Where's the data going to be in healthcare and other verticals? That's just one vertical. Financial services is crazy too.
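
The "data personalization" idea described above, where the same data looks different depending on who is asking and which policies apply, can be illustrated with a minimal Python sketch. This is not Immuta's API or implementation; the `Policy` class, the role names, the `REDACTED` marker, and the ten-year age buckets are all hypothetical choices made for illustration. It only shows the two transformations mentioned in the conversation (masking race, generalizing age) being applied at read time while the raw rows are left untouched.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


def mask(_value: object) -> str:
    """Hide a sensitive value entirely (hypothetical masking rule)."""
    return "REDACTED"


def generalize_age(value: object) -> str:
    """Replace an exact age with a ten-year bucket, e.g. 37 -> '30-39'."""
    low = (int(value) // 10) * 10
    return f"{low}-{low + 9}"


@dataclass(frozen=True)
class Policy:
    """A rule saying how one column is transformed, and for whom it is waived."""
    column: str
    transform: Callable[[object], object]
    exempt_roles: frozenset


# Hypothetical policies: both rules are waived only for an auditor role.
POLICIES: List[Policy] = [
    Policy("race", mask, frozenset({"compliance_auditor"})),
    Policy("age", generalize_age, frozenset({"compliance_auditor"})),
]


def personalized_view(rows: List[Row], role: str,
                      policies: List[Policy] = POLICIES) -> List[Row]:
    """Return a per-role view of the rows; the raw rows are never modified."""
    view = []
    for row in rows:
        out = dict(row)  # work on a copy so the source data stays as-is
        for policy in policies:
            if policy.column in out and role not in policy.exempt_roles:
                out[policy.column] = policy.transform(out[policy.column])
        view.append(out)
    return view


# Made-up sample record, used only to exercise the sketch.
RAW = [{"patient_id": 1, "age": 37, "race": "group_a", "diagnosis": "flu"}]

if __name__ == "__main__":
    print(personalized_view(RAW, "data_scientist"))
    print(personalized_view(RAW, "compliance_auditor"))
```

In this sketch a caller with the `data_scientist` role would train on bucketed ages and a masked race column, while the exempt role sees the rows unchanged; the policy list, not the caller, decides what the data looks like.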
I mean, honestly, healthcare is one of the most interesting examples of applied AI, and it's because there's no other realm, at least now, where people are thinking about AI and the risk is so apparent. If you get a diagnosis and the doctor doesn't understand why, it's very apparent, and if they're using a model that has a very low level of transparency, that ends up being really important. So I think healthcare is a really fascinating sector to think about, but all of these issues, all of these different types of risk that have been around for a while, are starting to become more and more important as AI takes off.

All right, so I want to wrap up here and give you guys both a chance, and you can't copy each other's answer. So we'll start with you, Andrew, over here. Explain Immuta in a simple way for someone who's not in the industry, what do you guys do, and then do a version for someone in the industry. So an elevator pitch for a friend who's not in the industry, and one for someone who is.

So Immuta is a data management platform for data science, and what that actually gives you is that we take the friction out of trying to access data, trying to control data, and trying to comply with all of the different rules that surround the use of that data.

Great, now do the one for normal people.

That was the normal people one.

Okay, okay. Okay, I can't wait to hear the one for the insiders.

And then for the insiders. She was magic. It's magic, we're magic, you know. Coming from the infrastructure world, I like to refer to it as like a VMware for data science, right? We create an abstraction layer that sits between the data and the data science tools and will dynamically enforce policies based on the values of the organization. But it also drives better outcomes, because today the data owners aren't confident that you're going to do with the data what you say you're going to do, so they try to hold it there, like the old server huggers: the data huggers.
So we allow them to unlock that and make it universally available. We allow the governance people to get off those memos that have to be interpreted by IT and enforced, and actually allow them to write code and have it be enforced as the policy mandates.

And the number one problem you solve is what?

Accelerate with confidence. We allow the data scientists to go and build models faster by connecting to the data in a way that makes them confident that when they deploy their model, it's going to go into production and it's going to stay in production.

And what's the GDPR angle? You've got the legal brain over here in policy. What's going on with GDPR? How are you guys going to be a solution for that?

So we have, I'd say, the most robust option for policy enforcement on data available, I think. And so we make it incredibly easy to comply with GDPR. We actually put together a sample memo that says, here's what it looks like to comply with GDPR. It's written as if from a governance department, sent to the internal data science department. It's about a page and a half long. So we actually make that very onerous process...

I wonder what the TAM is for GDPR. You guys know the size of that market in terms of the spend that's going to come around the corner. I mean, I think it's like the Y2K problem, except it's actually real.

Exactly, it feels the same way. And actually Andrew and his team have taken apart the regulation, article by article, and have actually built in product features that satisfy it.

So it's interesting. I think it's really impressive that you guys bring a legal and a policy mind into the product discussion. I think that's something you guys are doing a little bit differently than anyone I see out there. You're bringing legal and policy into the software fabric, which is unique. And I think it's going to be a standard, in my opinion. So hopefully this is a good trend, and hopefully you guys keep in touch. Thanks for coming on theCUBE.
Thanks for making time to come over. This is theCUBE, breaking out the startup action, sharing the hot startups here that really are in a good position in the marketplace as the generation of the infrastructure changes. It's a whole new ball game: a global development platform called the internet, and the new internet's decentralized. We'll even get to the blockchain. We want to try that a little later, maybe in another segment. This is theCUBE in New York City. More after this short break.