Live from Madrid, Spain. It's theCUBE, covering HPE Discover Madrid 2017. Brought to you by Hewlett Packard Enterprise. We're back at HPE Discover Madrid 2017. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, and I'm with my co-host for the week, Peter Burris. Cat Graves is here. She's a research scientist at Hewlett Packard Enterprise, and she's joined by Natalia Vasileva, CUBE alum, senior research manager at HPE, both with the labs in Palo Alto. Thanks so much for coming on theCUBE. It's good to see you again. Thank you. So for decades, this industry has marched to the cadence of Moore's Law. Bowed down to Moore's Law, been subservient to Moore's Law. But that's changing, isn't it? Absolutely. What's going on? Yeah, I can tell Moore's Law is changing. We can't increase the number of cores on the same chip in the same space anymore. We can't increase the density of compute today. And from the software perspective, we need to analyze more and more data. We are now marching into the era of artificial intelligence, when we need to train larger and larger models. We need more and more compute for that. And the only possible way today to speed up the training of those models, to actually enable the AI, is to scale out, because we can't put more cores on the chip. So we try to use more chips together, but then a communication bottleneck appears, so we can't efficiently use all of those chips. For us on the software side, for the people who work on how to speed up the training, how to speed up the implementation of the algorithms and the work of those algorithms, that's a problem. And that's where Cat can help us, because she's working on new hardware which will overcome those obstacles. Yeah, so in our lab, what we do is try and think of new ways of doing computation, but also doing the computations that really matter.
You know, what are the bottlenecks for the applications that Natalia is working on that are really preventing the performance from accelerating? Again, exponentially like Moore's Law, right? We'd like to return to Moore's Law, where we're in that sort of exponential growth in terms of what compute is really capable of. And so what we're doing in Labs is leveraging novel devices. So you've probably heard of the memristor in the past, but instead of using memristors for computer memory, non-volatile memory for persistent memory-driven computing systems, we're using these devices for doing the computation itself, in the analog domain. So one of the first target applications and target core computations that we're going after is matrix multiplication. And that is a fundamental mathematical building block for a lot of different machine learning, deep learning, signal processing. You name it, it's pretty broad in terms of where it's used today. So Dr. Tom Bradicich was talking about the dot product, and it sounds like it's related. You say matrix multiplication and suddenly people start breaking out in hives, but is that kind of related? That's exactly what it is. So if you remember your linear algebra in college, a matrix multiplication is built exactly out of dot products: it's the dot product between a vector and each row of the matrix. So exactly right, our hardware prototype is called the dot product engine. It's just cranking out those matrix multiplications. And can you explain how that addresses the problem that we're trying to solve with respect to Moore's Law? So, yeah, you mentioned what the problem is with Moore's Law. For me as a software person, the end of Moore's Law is a bad thing, because I can't increase the compute power anymore on a single chip. But for Cat it's a good thing, it's an opportunity, because it forces her to think about what unconventional devices she can come up with.
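As a minimal sketch of the linear-algebra point above (illustrative NumPy arrays and sizes, not HPE's actual hardware interface), a matrix-vector multiplication is just one dot product per row of the matrix, which is the operation the dot product engine cranks out:

```python
import numpy as np

# A toy weight matrix and input vector (sizes are illustrative only).
W = np.array([[1.0, 2.0],
              [3.0, 4.0]])
x = np.array([5.0, 6.0])

# A matrix-vector multiplication is one dot product per row of W.
y = W @ x
assert np.allclose(y, [np.dot(W[0], x), np.dot(W[1], x)])
print(y)  # [17. 39.]
```

The dot product engine evaluates all of those row-wise dot products at once in the analog domain, rather than one multiply-accumulate at a time.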
And we also, as you mentioned, understand that general-purpose computing is not always the solution. Sometimes, if you want to speed something up, you need to come up with a device which is designed specifically for the type of computation you care about. And for machine learning types of applications, again as Cat mentioned, matrix-matrix multiplications and matrix-vector multiplications are the core of it. Today, if you want to do those AI types of applications, you spend roughly 90% of the time doing exactly that computation. So if we can come up with a more power-efficient and more effective way of doing that, that will really help us. And that's what the dot product engine is solving. Yeah, as an example, some of our colleagues did an architectural study, taking the dot product engine as the core and then saying, okay, if I designed a computer architecture specifically for doing convolutional neural networks, image classification, these kinds of applications, how would it perform and how would it compare to GPUs? And we're seeing 10 to 100x speedup over GPUs, and even a 15x speedup over a custom-built, state-of-the-art, specialized digital ASIC. And so even comparing to the best that we can do today, we're seeing this potential for a huge amount of speedup and energy savings as well. So follow up on that if I may. You mentioned these alternative processors: GPUs, FPGAs, custom ASICs. Can I infer from that that those are a stopgap, architecturally, in your mind? Because you're seeing these alternative processors pop up all over the place. Yes. Is that a fair assertion? Well, I think recent trends are obviously favoring a return to specialized hardware. Yeah, for sure. I mean, just look at NVIDIA.
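The claim that convolutional networks spend most of their time in matrix multiplication rests on a standard rewriting trick; here is a hedged sketch (the `conv2d_as_matmul` helper and the data are made up for illustration) showing how a 2-D convolution becomes a single matrix-vector product via im2col:

```python
import numpy as np

def conv2d_as_matmul(image, kernel):
    # Hypothetical helper: a 2-D "valid" convolution (cross-correlation,
    # as used in deep-learning frameworks) rewritten as one matrix-vector
    # product via im2col. This is the common trick that turns CNN layers
    # into the matrix multiplications an accelerator can crank out.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    # Unroll every kernel-sized patch of the image into one matrix row.
    patches = np.array([image[i:i + kh, j:j + kw].ravel()
                        for i in range(oh) for j in range(ow)])
    # One matrix-vector product now yields every output pixel at once.
    return (patches @ kernel.ravel()).reshape(oh, ow)

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2))
out = conv2d_as_matmul(image, kernel)
print(out[0])  # first output row: [10. 14. 18.]
```

Because every convolutional layer reduces to this form, speeding up the matrix multiply speeds up essentially the whole network, which is why the architectural comparisons above focus on that one kernel.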
I think it really depends on the application, and you have to look at what the requirements are, especially where there are a lot of power limitations, right? GPUs become a little bit tricky. Yep. So there's a lot of interest in the automotive industry, space, robotics for more low-power, but still very high-performance, highly efficient computation. So many years ago, when I was actually thinking about doing computer science and realized pretty quickly that I didn't have the brainpower to get there, I remember thinking that there are three ways of improving performance. You can do it architecturally: what do you do with an instruction? You can do it organizationally: how do you fit the various elements together? And you can do it with technology: what's the clock speed, what's the underlying substrate? Moore's Law focused on the technology. RISC, for example, focused on architecture. FPGAs, ARM processors, GPUs focus on architecture. What we're talking about, to get back to that doubling of performance every 18 months from a computing standpoint, not just a chip standpoint, now we're talking about revealing and liberating, I presume, some of the organizational elements, ways of thinking about how to put these things together. So even if we can't improve the technology, or get the improvements that we used to get out of technology, we can start getting more performance out of new architectures, by organizing how everything works together, and make it so that the software developer doesn't have to know everything about that organization. Am I kind of getting there with this? Yes, I think you're right.
And if we're talking about some of the architectural challenges of today's processors: not only can't we increase the power of a single device today, but even if we could, the challenge would then be, how do you bring the data fast enough to the device? We would have problems with feeding that device. And again, what the dot product engine does is computation in memory, in place. So you limit the number of data transfers between different chips, and you don't face that problem of feeding the computation. So similar or same technology, different architecture, and using a new organization to take advantage of that architecture, the dot product engine being kind of that combination. I would say that even the technology is different. Yeah, I would say, in my view, we're actually thinking about it holistically. In Labs, we have software people working with architects, working with, of course, technologists. I mean, it's not just a clock speed issue. It's thinking about what computations actually matter, which ones you're actually doing, and how to perform them in different ways. And so one of the great things as well with the dot product engine and these kinds of new computational accelerators is that with something like the memory-driven computing architecture, we now have an ecosystem that really favors accelerators and encourages the development of these specialized hardware pieces that can slot into the same architecture and can also scale in size. And you invoke that resource in an automated way, presumably. Yeah. Exactly. What's the secret sauce behind that? Is that software that does that, or an algorithm that chooses the algorithm? Gen-Z. Yeah, Gen-Z is the underlying protocol that makes those devices talk to the data.
But in the end it's the system software, the algorithms also, which will make a decision at every particular point: which compute device should I use to do a particular task? With memory-driven computing, if all my data sits in a shared pool of memory, and I have different heterogeneous compute devices able to see that data and to talk to that data, then it's up to the system management software to allocate the execution of a particular task to the device which does it best, in the most power-efficient way, in the fastest way, and everybody wins. So as a software person, with memory-driven computing, you are now going to be thinking about developing software in a completely different way. Is that correct? I mean, you're not thinking about going through an I/O stack anymore and waiting for a mechanical device and doing other things. It's not only the I/O stack. As I mentioned, today the only possibility for us to decrease the processing time of certain algorithms is to scale out, and that means I need to take into account the locality of the data. It's not only when you distribute the computation across multiple nodes. Even on NUMA-based systems, we have different sockets in a single system, each with local memory, and memory which is remote to that socket but local to another socket. Today, as a software programmer, as a developer, I need to take into account where my data sits, because I know that accessing data in local memory will take me hundreds of nanoseconds, and accessing data on a remote socket will take me longer. So when I develop the algorithms, in order to prevent my computational cores from stalling and waiting for the data, I need to schedule that very carefully. With memory-driven computing, given the assumption that all memory is not only in a single pool, but also evenly accessible from every compute device, I don't need to care about that anymore.
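To make that scheduling burden concrete, here is a toy cost model (the latency numbers, socket layout, and task names are invented for illustration, not measurements): on a NUMA machine, a scheduler that ignores where data lives pays remote-access penalties, while under a single evenly accessible memory pool the distinction disappears:

```python
# Toy cost model (illustrative numbers only): on a NUMA system, accessing
# local memory is cheap and remote memory is costly, so the scheduler must
# place each task on the socket that holds its data.
LOCAL_NS, REMOTE_NS = 100, 300   # hypothetical access latencies

def total_latency(task_socket_of, data_socket_of, tasks):
    # Sum the access cost of every task: local if the task runs on the
    # socket holding its data, remote otherwise.
    return sum(LOCAL_NS if task_socket_of[t] == data_socket_of[t] else REMOTE_NS
               for t in tasks)

tasks = ["t0", "t1", "t2", "t3"]
data_socket = {"t0": 0, "t1": 0, "t2": 1, "t3": 1}

naive = {"t0": 0, "t1": 1, "t2": 0, "t3": 1}   # placement ignoring locality
aware = dict(data_socket)                      # run each task where its data lives

print(total_latency(naive, data_socket, tasks))  # 800: two remote accesses
print(total_latency(aware, data_socket, tasks))  # 400: all accesses local
```

Under memory-driven computing as described above, every device would see one uniform pool, so `LOCAL_NS` and `REMOTE_NS` collapse to a single cost and this whole placement problem goes away for the developer.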
And you can't even imagine what a relief that is. It makes our life so much easier. Yeah, because you were previously spending a lot of time trying to optimize your code for that factor, the locality of the data. How much of your time was spent doing that menial task? Years, I think. Since the beginning of Moore's Law, since the beginning of these traditional architectures, if you look at HPC applications, every HPC application developed today needs to take care of data locality. Yeah, and you hear about it when a new GPU comes out, or even just a slightly newer generation: they have to take months to redesign their algorithms and tune them to that specific hardware. And that's the same company, maybe even the same product line, but just because that architecture has slightly changed, it changes exactly what Natalia's talking about. I'm interested in switching subjects here. I'd love to spend a minute on women in tech. How you got into this role, both obviously strong in math and computer backgrounds, but give us a little flavor of your background, Cat, and then Natalia, you as well. Me or you? I'll go for it. Hmm, I don't know. I was always interested in a lot of different things. I kind of wanted to study and do everything. And I got to the point in college where physics was just something that still fascinated me. I felt like I didn't know nearly enough, like there was still so much to learn, and it was constantly challenging me. And so I decided to pursue my PhD in that. And it's never boring, and you're always learning something new. Okay, so that led to a career in technology development. Yeah, and I actually did my PhD in something pretty different. But towards the end of it, I decided that I really enjoyed research and was always inspired by it.
But I wanted to do that research on projects that I felt might have more of an impact, and particularly an impact in my lifetime. My PhD work was something that I knew would never actually be implemented. Well, maybe in a couple of hundred years we might get to that point. And so there are not too many places, at least in my field in hardware, where you can be doing what feels like very cutting-edge research, but be doing it in a place where you can see your ideas and your work be implemented. And that's something that led me to Labs. And Natalia, what's your passion? How did you arrive here? Yeah, as a kid, I always liked different math puzzles. So I was into math, and pretty soon it became obvious that I liked solving those math problems much more than writing about anything. Then, I think in middle school, there was the first class on programming, and I was right into that. And then the teacher told me that I should probably go to a specialized school, and that led me to a physics and mathematics lyceum, and then to the mathematics department at the university. So it was pretty straightforward for me since then. I mean, you're both obviously very comfortable in this role, extremely knowledgeable. You seem like great leaders. So why do you feel that more women don't pursue a career in technology? Do you have these discussions amongst yourselves? Is this something that you even think about? I think it starts very early. For me, both my parents are scientists. So I always had books around the house, was always encouraged to think and pursue that path and be curious. I think it's something that happens at a very young age, and various academic institutions have done studies and shown that when they do certain things, it's surmountable. Carnegie Mellon has a very nice program for this, where, I think, the percentage of women in their CS program went from 10% to 40% in five years.
And there were a couple of strategies that they implemented. I'm not going to get all of them, but one was peer-to-peer mentoring: when the freshmen came in, pairing them with a senior, so you don't feel like you're the only one doing what you're doing or interested in what you're doing. It's like anything human. You want to feel like you belong and can relate to your group. So I think, yeah. Let's have a last word. On that topic? Yeah, sure, on any topic. But yes, I'm very interested in this topic, because less than 20% of the tech business is women. It's 50% of the population. For me, it's not the percentage which matters. You just need to not stand in the way of those who are interested in it, and give equal opportunities to everybody. And yes, the environment from very early childhood should be the proper one. Do you feel like the industry gives women equal opportunity? For me, my feeling is yes, but you also need to understand that I'm- Because of your experience, yeah. Because of my experience. But I originally came from Russia, I was born in St. Petersburg, and I do believe that the Soviet Union, the ex-Soviet Union countries, have a much better history in that, because in the Soviet Union we didn't have men and women, we had comrades. And after the Second World War, there were women who took all the hard jobs, and we are used to moms working. All the moms of all my peers worked. My mom is an engineer, my dad is an engineer. I think because of that, there is less of a perception that the woman should stay at home or that the woman is taking care of the kids. There's less of that than probably in the States. So for me, yes. And now I think the industry is going in that direction, and that's right. That's instructive, great. All right, well, listen, thanks very much for coming on theCUBE. Sure. Sharing the stories, and good luck in Labs, wherever you may end up. And good to see you again. Thank you. Thank you very much.
All right, keep it right there, we'll be back with our next guest, Dave Vellante for Peter Burris. We're live from Madrid, 2017, HPE Discover. This is theCUBE.