Live from San Francisco, it's theCUBE. Covering RSA Conference 2020 San Francisco, brought to you by SiliconANGLE Media.

Hello everyone, welcome back to theCUBE coverage here in San Francisco, the Moscone Center, for RSA Conference 2020, for all the coverage for three days. I'm John Furrier, host of theCUBE. You know, as cybersecurity goes to the next level, and as cloud computing continues to grow for the enterprise, large-scale AI and machine learning have become critical to managing the data. We've got a great guest here from Intel, Kaz Warzynski, senior director in the AI Products Group at Intel. Thanks for joining us.

Oh, thanks.

So data is a huge problem when it comes down to cybersecurity, and generally across the enterprise. It's well-known, well-documented, but you're here giving a talk about machine learning and privacy, because everyone wants to know who the bad guys are. So, do the bad guys deserve privacy? Okay, we'll get to that later. But first, tell us about your talk. You gave a talk here at RSA. We'll get into the other stuff later.

I gave a talk, so thanks for having me. I gave a talk on a whole suite of exciting new techniques known as privacy-preserving machine learning. This is a set of machine learning techniques that help people realize the promise of AI and machine learning. We know that machine learning systems rely on underlying data to train, so how can you also respect the privacy and the security of the underlying data while still being able to train and use AI systems?

And just take a moment: where are you within the Intel sphere? Because Intel, obviously, is chips and power to all the enterprises at large scale. Are you on the software side, the AI group? Explain where you are within Intel.

So I'm in the AI group at Intel, but I have the most fun job at Intel, I think. I work in the CTO office of the AI group, which means I get to think about the more futuristic questions: where is AI going? What are some of the major inflection points?
One of these that we've been looking at for the last couple of years is this kind of collision course between the need for data to train machine learning systems, to unlock all the power of AI, and the need to keep data private.

Yeah, and I think that's generally consistent with our editorial and our research, which is the confluence of cloud native, large-scale cloud computing, multi-cloud, and AI and machine learning, all kind of coming together. Those are multi-generational technologies. So this wave is big.

That's right. And I think one thing that's maybe underappreciated about machine learning, especially in production, is that it's almost always a multi-party interaction. You'll have, let's say, one party that owns data, another party may own a model, and they're running the system on somebody else's hardware. Because of the nature of digital data, if you want to share things, you have to worry about what other parties may be doing with those data.

You know, Kaz, you bring up a great point I want to get your reaction and thoughts on, which is that it's multidisciplinary now. And as people are breaking into the field, people are really excited about AI. You talk to someone who's 12 years old, they see software, they see all this cool stuff, so machine learning, which powers AI, is very enticing to anyone with a technical or nerdy background. It's attracting a lot of young people, and it's not just getting a computer science degree; there's so much more to AI. So talk about what someone needs to be successful to engage in the AI wave. You don't need to just be a coder; you could be outside that scope, because it's an integrated model. Or is it?

It's very much so. My group at Intel is very heterogeneous. I've got mathematicians, but I also have coders, I have an attorney who's a public policy expert, I have cryptographers.
I think there's a number of ways to get involved. I mean, my background is actually in neuroscience. So...

That makes sense, yeah, you can stitch it all together. Well, societal change has to come: the algorithms need training, they need to learn, so having the most diverse input seems to me to be the posture the industry is taking. Is that right? Is that the right way to think about it? How should we be thinking about how to make AI highly effective versus super scary?

Right. Well, part of my message here is that to make these systems better, generally more data helps. If you can expand the availability of data, that's always going to help machine learning systems. So we're trying to unlock data silos that may exist across countries, across organizations. For example, in healthcare, you could have multiple hospitals that have patient data. If somehow they could pool all their data together, you would get much more effective models and much better patient outcomes, but for very good privacy reasons, they're not allowed to do that. So there are these interesting ideas like federated learning, where you decentralize the machine learning process so that you can still respect privacy but get the statistical power.

Let's double down on that for a second, because I want to explore it. I think this is the most important story that's not being talked about. It's nuanced a little bit. In healthcare you had HIPAA, which was built for all the right reasons back then. But now, when you start to get into much more of a cross-pollination of data, you need to manage the benefit of why it existed with privacy. So encryption, homomorphic encryption for instance, covers data in use.

Yes.

When it's being used, not just in flight or at rest. So now you have all three states of data.

Yes.

This is now causing a new formula for encryption and privacy.
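The federated learning idea described above — hospitals improving a shared model without ever pooling their patient data — can be sketched in a few lines. This is a minimal, hypothetical illustration (the three "hospitals," the synthetic data, and the linear model are all made up for the example): each site runs a few gradient steps on its own private data, and a coordinator averages only the resulting model weights.

```python
import numpy as np

# Toy federated-averaging sketch: three hypothetical "hospitals" each
# hold a private slice of data. Only model weights are shared with the
# coordinator -- the raw data never leaves a site.

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])       # ground truth we hope to recover

def make_site(n):
    # Synthetic private dataset for one site (never pooled).
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    return X, y

sites = [make_site(50) for _ in range(3)]

def local_step(w, X, y, lr=0.1, epochs=5):
    # Each site runs a few gradient steps on its own data only.
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w = np.zeros(2)                      # global model held by coordinator
for _ in range(20):                  # communication rounds
    local_ws = [local_step(w.copy(), X, y) for X, y in sites]
    w = np.mean(local_ws, axis=0)    # coordinator averages weights only

print(np.round(w, 2))                # should be close to [ 2. -1.]
```

The key property is visible in the loop: the coordinator only ever sees `local_ws`, the weight vectors, never `X` or `y`. Real deployments add secure aggregation and differential privacy on top, since even weights can leak information.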
What is some of the state-of-the-art thinking around how to make data open and usable, but still secure, encrypted, or protected?

That's right. So it's this paradox of: how do I use the data but not actually get the data? You mentioned homomorphic encryption. This is one of the most leading-edge techniques in this area, where there are ways of doing math on the data while it stays encrypted. The answer that comes out is still encrypted, and it's only the actual owner of the data who can reveal the answer. So it seems like magic, but with this capability you enable all kinds of new use cases that wouldn't be possible before, where third parties can act on your sensitive data without ever being exposed to it in any way.

So discovery and leverage of the data is what we're getting at, in terms of the benefits. Stay on that: the use cases of this new idea, discovery and usage. How would that work?

Well, when we talked about federated learning and pooling across hospitals, that's one set of techniques. Homomorphic encryption would be, for example: suppose that some AI system has already been trained, but I'd like to use it on sensitive data. How do I do that in such a way that the third-party service isn't exposed to the data? What makes machine learning different from other types of data security problems is that you have to operate on the data. You're not just storing it; you're not just moving it around.

Yeah, and this is a key thing. So I've got to ask you, because one of the interesting trade-offs these days is that AI and machine learning can create great benefits, but people also have the knee-jerk reaction of, oh my God, it's scary, my privacy. You saw that front and center with Amazon and facial recognition: oh my God, it's evil. So there's a lot of scared people that might not be informed.
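The "math on encrypted data" property described above can be made concrete with the Paillier cryptosystem, a classic additively homomorphic scheme: multiplying two ciphertexts yields an encryption of the sum of their plaintexts, and only the key holder can decrypt the result. The sketch below is a deliberately tiny, insecure toy (real keys use primes of roughly 1024 bits or more), meant only to show the homomorphic step.

```python
import math, random

# Minimal Paillier sketch (toy parameters -- NOT secure, illustration only).
p, q = 293, 433                      # toy primes; real keys use ~1024-bit primes
n = p * q
n2 = n * n
g = n + 1                            # standard convenient generator choice
lam = math.lcm(p - 1, q - 1)         # private key
mu = pow(lam, -1, n)                 # decryption helper (valid since g = n + 1)

def encrypt(m):
    # c = g^m * r^n mod n^2, with fresh randomness r per ciphertext
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    # L(x) = (x - 1) // n recovers the plaintext exponent
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

# Homomorphic property: multiplying ciphertexts adds plaintexts.
a, b = encrypt(20), encrypt(22)
print(decrypt((a * b) % n2))         # 42, computed without ever decrypting a or b
```

Whoever holds only `a`, `b`, and the public key can compute the encrypted sum but learns nothing about 20 or 22; only the private-key holder can reveal 42. Fully homomorphic schemes extend this to both addition and multiplication, which is what makes encrypted neural-network inference possible.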
How should companies invest in machine learning and AI, in your opinion, and how should they think about the next 10-year trajectory starting today? What's the right way to think about it, to build a team? What are your thoughts? This is the number one challenge right now.

Yeah, well, I think some of the scary issues that you mentioned are legitimately scary. They're going to have to be resolved not by companies, but probably by society and our delegates: lawmakers, regulators. Part of what we're trying to do at the technical level is give society and regulators a more flexible set of tools with which you can slice and dice data, privacy, and so on, so that it's not just all or none. I think that's my main goal. As an organization, again, it's this idea of having a heterogeneous set of talents. You're going to need policy experts and applied mathematicians and linguists and neuroscientists.

So diversity is a huge opportunity.

Very much so. Not just diversity of people, but diverse data, diverse mindsets, diverse approaches to problems that are hard, but very promising.

Okay, let's flip to the other side of the spectrum: what should people not do? What's a failure formula? One-dimensional thinking? Identify something that may not go in the right way.

Well, you know, one distinguishing feature of the machine learning field, and it's kind of a cultural thing, but it's given it a lot of traction, is that it's fundamentally been a very open culture. There's a lot of sharing of methods; it's a very collaborative academic field. So I think within a company, you want to be part of that culture too. Every company is going to have its secret sauce, the things it needs to keep proprietary, but it's very important for companies to engage this broader community of researchers.
So you're saying, and I would agree with this, but you can agree or disagree: to be successful, you've got to be open. If you're data-driven, you've got to be open.

That's right.

Open equals better data.

That's right. More data, more approaches to the data, more eyes on the problem. But you can definitely still keep what's proprietary. It forces organizations to think about what core strengths they really want to keep proprietary, and then open up the other things.

All right, so what's the coolest thing you're working on right now? What are some of the fun projects you guys are digging into? You've got a great job; sounds like you're excited about it.

I am very excited.

I think it's the most exciting thing. I mean, I wish I could be 20 again, in computer science or whatever field, because I think AI is more than a multi-generational thing. It's super exciting as a technical person. But what are you working on that you're excited about?

So I'm very excited about taking some of these things, like homomorphic encryption, and making them much more available to developers and data scientists, because it's asking too much for a data scientist to also be a post-quantum crypto expert. So we've written an open source package called HE Transformer, HE for homomorphic encryption. It allows data scientists to do their normal data science in Python, or whatever they're used to, and then flick a switch, and suddenly their model is able to run on encrypted data.

Can you take a minute to explain why the homomorphic encryption trend is so important right now? Give a peek into the why, because this is something that's now becoming much more real, the data-in-use kind of philosophy. Why now? Why is it so important right now?
Well, I think because of cloud, the power of cloud, and the fact that data are collected in one place and possibly processed in another place, your data are moving around and being operated on. If you can know that, as long as my data are moving around and people are operating on them, they're staying encrypted the whole time, not just in transit, that gives a much higher level of comfort.

And the applications are probably going to be onboarded. You can almost imagine new applications will emerge from this: discovery, cataloging, API integration points. You can almost imagine that trust will go up.

And you can also end up with these different business models, where you have entities that compete in some spheres but may decide to collaborate in other ways. For example, banks could compete on lending and so on, their normal activities, but for fraud detection they may decide, hey, maybe we can make some alliance where we cross-check each other's models on certain transactions, but I'm not actually giving you any transaction data, so that's maybe okay, right? That's a very powerful concept.

Yeah, it's really interesting. I mean, I think the compute power has made the overhead much more manageable, because people were working on this in the '80s and '90s, I remember, but it was just so expensive, overhead-wise.

That's right, yeah. You bring up a great point, and this is one of the areas where Intel, and my team, is really pushing. These techniques have been around for 20 years. Initially they were maybe 10 million times slower than real time, so people thought, okay, this is interesting mathematically, but not practical.
There've been massive improvements just in the last two years, where now things are running maybe a hundred times slower than unencrypted math. That still means something that would take 50 milliseconds now takes five seconds, but that's not an unreasonable amount for many uses.

All right, Kaz, you're my new friend now, my best friend on AI, and I've got a business to run, and I'm going to ask you: what should I do? I really want to leverage machine learning and AI in my business. I'm investing in more tech. I've got cloud, I'm building my own software. How should I be investing? How do I build out a great machine learning and AI team, and then, ultimately, capabilities?

Okay, well, I would start with a team that has a research mindset. Not because you want them to come in and write research papers, but because the path from research into production is so incredibly short in AI. You have things that are papers one year and are going into production at Google Search within a year. So you need that research mindset. Another thing is that you're going to require very close collaboration between this data science team and your CIO and systems people, because a lot of the challenges around AI are not just coming up with the model, but how you actually scale it up and take it to production.

Interesting, about the research: I totally agree with you. I think you can almost call that product management, kind of newfangled product management, because if it's applied research, you generally have your eye on a market, but you're not making hardcore product decisions. You're researching it, you're writing it up. You've got to do the homework; you've got to dream it before you can build it.

Well, I'm just saying that the field is moving so fast that you're going to need people on your team who can consume the latest papers.
Oh, you're saying consume the research as well.

Yeah, and if they can contribute, that's great too. It's this open culture where people consume, they find some improvement, and they can then publish it at next year's conference. It's just been this incredibly healthy ecosystem.

So it's all about acceleration, and the cloud is a big part of it. Awesome. Well, I really appreciate your insight. This is a great topic; I could go for an hour, it's one of my favorite things. I love the homomorphic encryption. I think that's going to be a game changer, and I think we're going to start to see some interesting discoveries there. Give a quick plug for Intel. What are you working on now? What are you looking to do? What are your plans? Are you hiring? Doing more research? What's going on?

Well, we think that this intersection of privacy and AI is at the core of Intel's data-centric mission. So we're trying to figure out whatever it takes to enable the community, whether it's optimized software libraries, custom silicon, or even services. We really want to listen to customers and figure out what they need.

It's so funny, Moore's Law is always going to be around. The next wave is going to have more compute. It's never going away. More storage, more data. It just gets better and better. Thanks for coming on, Kaz. I appreciate it.

Thanks for having me.

Okay, we have Intel inside theCUBE, breaking down the future of AI. Really exciting stuff on the technology front. Security and data, it's all going to happen at large scale. Of course, it's theCUBE bringing you all the data here at RSA. I'm John Furrier. Thanks for watching.