From the Fairmont Hotel in the heart of Silicon Valley, it's theCUBE, covering When IoT Met AI: The Intelligence of Things. Brought to you by Western Digital.

Hey, welcome back, everybody. Jeff Frick here with theCUBE. We're in downtown San Jose at the Fairmont Hotel. When IoT met AI, it happened right here. You saw it first: the intelligence of things. A really interesting event put on by ReadWrite and Western Digital, and we're really excited to welcome back a CUBE alum and always a fan favorite. She's Janet George, Fellow and Chief Data Officer of Western Digital. Janet, great to see you.

Thank you, thank you.

So as I asked you when you sat down, you're always working on cool things. You're always kind of at the cutting edge. So what have you been playing with lately?

Lately I've been working on neural networks and TensorFlow. So really trying to study and understand the behaviors and patterns of neural networks, how they work, and then unleashing our data at them. Trying to figure out how the network trains through our data, how many nets there are, and then trying to figure out what results it's coming back with, what the predictions are. Looking at whether the predictions are accurate or less accurate, and then validating the predictions to make them more accurate, and so on.

So it's interesting, it's a different tool. You're learning the tool itself, you're learning the underlying technology behind the tool, and then testing it against some of the other tools that you guys have. I mean, obviously you've been doing mean-time-between-failure analysis for a long, long time. So first off, kind of your experience with the tool. How is it different?

So with machine learning, fundamentally, we have to go into feature extraction. You have to figure out all the features, and then you use the features for prediction. With neural networks, you can throw all the raw data at them. It's in fact data agnostic.
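For readers who want to see that contrast in code: here is a minimal NumPy sketch (not Western Digital's TensorFlow setup, and far smaller than any real workload) of a tiny neural network learning XOR from raw inputs. No features are hand-engineered; the hidden layer has to discover them during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Raw inputs only -- no hand-engineered features.
# XOR is the classic case a linear model cannot learn from raw inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 units; its activations are the "learned features".
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(10_000):
    h = sigmoid(X @ W1 + b1)      # hidden activations (learned features)
    p = sigmoid(h @ W2 + b2)      # predictions
    # Backpropagation of squared error through both layers.
    dp = (p - y) * p * (1 - p)
    dh = (dp @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ dp); b2 -= lr * dp.sum(axis=0)
    W1 -= lr * (X.T @ dh); b1 -= lr * dh.sum(axis=0)

print(np.round(p).ravel())        # should recover XOR: [0. 1. 1. 0.]
```

In a feature-extraction approach you would have had to supply a product column like x1*x2 yourself; here the hidden layer finds an equivalent representation on its own, which is the point Janet makes about throwing raw data at the network.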
So you don't have to spend enormous amounts of time trying to detect the features. For example, if you throw hundreds of cat images at the neural network, the neural network will figure out the features of the cat: the nose, the eyes, the ears, and so on and so forth. And once it trains itself through a series of iterations, you can throw a lot of distorted cats at the neural network, and it's still going to figure out what the features of a real cat are. And it'll predict the cat correctly.

So then how does that apply to the more specific use case, in terms of your failure analysis?

Yeah, so we have failures, and we have multiple failures. Some failures, to the human eye, are very obvious, right? But humans get tired, and over a period of time we can't endure looking at hundreds of millions of failures, and some failures are interconnected. So there's a relationship between these failure patterns, or there's a correlation between two failures. It could be an edge failure, it could be a radial failure, an eye-pattern-type failure. These failures, for us as humans, we can't scale. We used to have to take these failures, train on them at scale, and then predict. Now with neural networks, we don't have to do all that. We don't have to extract these features and try to show the network what these failures look like. Training is almost like throwing a lot of data at the neural network.

So it almost sounds like the promise of the data lake, if you will, that we've heard about from Hadoop Summit forever and ever, right? Dump it all in and insights will flow. But we found often that it's not true: you need hypotheses, you need to structure it and get it going. But what you're describing sounds much more along the lines of that vision.

Yes, very much so. Now the only caveat is you need some labels, right?
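The edge-versus-radial failure idea can be sketched with toy data. Everything below is invented for illustration: the 8x8 "failure maps", defect rates, and even the model (a simple logistic regression rather than a deep network) are stand-ins, not Western Digital's actual data or pipeline. The point is only that raw pixels plus labels are enough for a learner to separate edge failures from radial failures without hand-built features.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented 8x8 "failure maps": edge failures put defects on the border,
# radial failures cluster defects at the center. Purely illustrative.
def edge_map():
    m = rng.random((8, 8)) < 0.05                       # background noise
    m[0, :] = m[-1, :] = m[:, 0] = m[:, -1] = True      # defective border
    return m.astype(float)

def radial_map():
    m = rng.random((8, 8)) < 0.05
    m[3:5, 3:5] = True                                  # defective center
    return m.astype(float)

# Labeled examples: the domain knowledge "extracted into labels".
X = np.array([edge_map().ravel() for _ in range(50)] +
             [radial_map().ravel() for _ in range(50)])
y = np.array([0] * 50 + [1] * 50)    # 0 = edge failure, 1 = radial failure

# Logistic regression trained on raw pixels -- no feature engineering.
w, b = np.zeros(64), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    g = p - y
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

score = radial_map().ravel() @ w + b
print("radial" if score > 0 else "edge")   # a fresh radial map scores as "radial"
```

A human engineer labels a few examples of each failure type; the model then scales that judgment across millions of maps, which is exactly the scaling argument Janet is making.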
If there are no labels on the failure data, it's very difficult for the neural network to figure out what the failure is. So you have to give it some labels to understand what patterns it should learn, right? And that's where the domain experts come in. So we train it with labeled data.

If you're training with the cat, you know the features of the cat, right? In the industrial world, the "cat" is really what's in the heads of people. The domain knowledge is not out in the open like the sky or the animals or the cat; it's much more embedded in the brains of the people who are doing the work. And so we have to extract that domain knowledge into labels, and then we're able to scale the domain through the neural network.

So then how does it compare with the other tools that you've used in the past? Obviously the process is very different, but in terms of just pure performance, what are you finding?

So we're finding very good performance, and actually we're finding very good accuracy, right? Once it's trained, it's doing very well on the failure patterns; it's getting it right 90% of the time. But in a machine learning program, what happens is sometimes the model is overfitted, or it's underfitted, or there's bias in the model and you've got to remove the bias. Or you've got to figure out, well, is the model giving false positives or false negatives? You've got to optimize for something, right? Because we're really dealing with mathematical approximation; we're not dealing with preciseness, we're not dealing with exactness. With neural networks, actually, it's pretty good, because it's always dealing with accuracy rather than precision. So it's accurate most of the time.

Interesting, because that's often the comment about the difference between computer science and statistics, right? Computers are binary; statistics always has kind of a confidence interval.
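The false-positive versus false-negative trade-off mentioned above is easy to make concrete. A short sketch with made-up labels and predictions (the numbers are hypothetical, chosen only to show how accuracy, precision, and recall pull apart):

```python
import numpy as np

# Hypothetical labels and model predictions (1 = failure), invented for illustration.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 0, 1, 0, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))   # real failures we flagged
fp = int(np.sum((y_pred == 1) & (y_true == 0)))   # false alarms
fn = int(np.sum((y_pred == 0) & (y_true == 1)))   # failures we missed
tn = int(np.sum((y_pred == 0) & (y_true == 0)))   # good parts we passed

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)    # of everything flagged, how much really failed
recall = tp / (tp + fn)       # of the real failures, how many were caught
print(accuracy, precision, recall)   # 0.8 0.75 0.75
```

"Optimizing for something" means choosing which of these to push up: in failure analysis, a false negative (a missed failure, low recall) usually costs more than a false alarm, so you would tune the decision threshold accordingly.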
What you're describing, it sounds like the confidence is tightening up to such a degree that it's almost reaching binary.

Yeah, exactly. And see, the traditional computing programming paradigm is very much a brute-force paradigm, right? The traditional paradigm is very good when the problems are simpler, but when the problems are at scale, like you're talking 70 petabytes of data, or 70 billion rows, find all these patterns in that, right? I mean, the scale at which that operates, and the scale at which traditional machine learning even works, is quite different from how neural networks work. With traditional machine learning, you still have to do some feature extraction; otherwise you're going to have dimensionality issues, right? It's too broad to get the prediction anywhere close. And so you want to reduce the dimensionality to get a better prediction. But here, you don't have to worry about dimensionality. You just have to make sure the labels are right.

Right, right. So as you dig deeper into this tool and expose all these new capabilities, what do you look forward to? What can you do that you couldn't do before?

It's interesting, because it's easy to grossly underestimate the human brain, right? The human brain is supremely powerful in all aspects, and there's a great deal of difficulty in trying to code the human brain. But with neural networks, because of the various propagation layers and the ability to move through these networks, we are coming closer and closer, right?

So one example: when you think about driving, recently the Google driverless car got into an accident, right? And where it got into an accident was, the driverless car was merging into a lane where it had to yield, and there was a bus, and it collided with the bus. So where did the AI go wrong? Now, if you train an AI that birds can fly, and then you say a penguin is a bird, it's going to assume the penguin can fly.

Right, right.
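The dimensionality reduction step she describes for traditional machine learning can be sketched as well. Below is a small PCA-via-SVD example on synthetic data (the 200x50 table, the 3 hidden factors, and the 99% variance cutoff are all invented for illustration, nothing like 70 billion rows): many raw columns collapse to a few informative ones before a conventional model would see them.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for a wide industrial table: 200 rows x 50 raw columns,
# but driven by only 3 underlying factors (plus a little noise).
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

# PCA via SVD: keep just enough components to explain 99% of the variance.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
k = int(np.searchsorted(np.cumsum(explained), 0.99)) + 1
reduced = Xc @ Vt[:k].T     # 50 raw columns collapse to k informative ones

print(k, reduced.shape)     # k should come out around 3
```

This is the extra preprocessing a traditional pipeline carries; her point is that a neural network skips it and works on the raw columns directly, as long as the labels are right.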
We as humans know a penguin is a bird, but it can't fly like other birds, right? It's that anomaly thing. Naturally, when we are driving and a bus shows up, even if the bus is supposed to yield, the bus goes.

Right, right.

We yield to the bus because it's bigger. We know that, right? The AI doesn't know that. It was taught that yield means yield, right? So it collided with the bus. But the beauty is, now large fleets of cars can learn very quickly based on what the system just got from that one car.

So now think of, there are pros and cons, right? Think about you driving down Highway 85 and there's a collision. It's Sunday morning; you don't know about the collision. You're coming down the hill, blind corner, and boom. That's how these crashes happen, and so many people have died, right? If you were driving a driverless car, you would have knowledge from the fleet and from everywhere else, so you'd know ahead of time. We don't talk to each other about where we are in our cars; we don't have universal knowledge, right?

Right, in car-to-car communications, right?

Car-to-car communications, and the AI has that. So directly, it can prevent accidents; it can save people from dying. But people still feel, it's a psychology thing, people still feel very unsafe in a driverless car. So we have to get over our psychology.

I think they'll get over that. They feel plenty safe in a driverless airplane, right? Or in a driverless light rail. Or when somebody else is driving, you're fine with the driver who's driving. You'd just sit in a driverless car. But there's that one pesky autonomous car problem, when the pedestrian won't go and the car won't go, and they stop, and it's like a friendly standoff.

All right, well, good stuff, Janet. It's always great to see you. I'm sure we'll see you very shortly, because you are at all the great big data conferences.

Thank you.

So thanks for taking a few minutes out of your day. All right, she's Janet George.
She's the smartest lady at Western Digital, perhaps in Silicon Valley. We're not sure, but we feel pretty confident. I'm Jeff Frick. You're watching theCUBE from When IoT Met AI: The Intelligence of Things. We'll be right back after this short break. Thanks for watching.