Eric. Eric is here now. Hello. Hello. All right. So I welcome Sanjay, Eric, Don, and Michael for the Ask the Experts panel. So here you go. Over to you.

Great. Thanks. So thank you all for attending the session today, and thanks, three of you, for taking the time to participate. This is an Ask the Experts panel, so we're very much hoping that the audience here will participate and ask questions. Feel free to add them directly in the chat or on the Q&A page; that's where I will be keeping track of the questions. I thought that to get started, we could all introduce ourselves, so everyone has a sense of who we all are and why we consider ourselves experts in the AI field, and then we can start answering questions. If there aren't any from the audience, there are a few that we have prepared to get the ball rolling. So to get started: my name is Michael Clifford. I'm a data scientist, a data science manager now, working in the Office of the CTO at Red Hat on AIOps problems, and I will be moderating this session. Sanjay, would you like to introduce yourself?

Absolutely. I'm Sanjay Arora. I work in Red Hat's Office of the CTO, where we have something called the AICOE, the AI Center of Excellence. I work there partially on scaling the training of neural nets and the training of reinforcement learning agents on our platform, so OpenShift and Open Data Hub, and partially on some machine learning research projects with Boston University, who we have a big collaboration with. Don?

Hello, Don Chesworth. I'm a data scientist here at Red Hat. I have been at Red Hat for about five years, primarily in customer experience the whole time. There are two things I focus on. One is building classification models to classify text, primarily using PyTorch. The second is hoping to carve a path within Red Hat for data scientists to be more like software engineers: a data scientist who uses containers, who sets up their own CI/CD. I set up my own CI/CD, I try to get most of my jobs to run on OpenShift, and I have APIs that do the predictions, things like that. Eric?

Hi, I'm Eric Erdoganson, and I also work at the AICOE at Red Hat, with Sanjay and Michael. In previous companies I did work in applied AI, starting really prior to the advent of open source AI ecosystems, on projects like optical character recognition for Arabic characters and some of the early work in AI for drug discovery. Here at Red Hat, in recent years I've been focusing on helping customers and people internal to Red Hat migrate their AI and ML workloads onto OpenShift, so getting actual repeatable AI workflows and application deployments. In that sense it overlaps a little bit with what Don just talked about: really starting to treat AI and ML like first-class software engineering.

Cool, awesome. Thank you all. So the title of the talk today is real-world data science, and I think all of us are working on these problems of not just focusing on prototypes, but on how you actually implement these types of tools in real production applications. But before we get into that, I think it's always good practice to define our terms. There are a lot of buzzwords going around today, especially with AI, machine learning, deep learning, all this stuff. So here's a good starting question for the three of you.
What, in your opinion, is the difference between AI and machine learning, and is either of them different from deep learning in a meaningful way?

Well, I guess I'll take a stab. I think you can view it as a set-inclusion kind of framework. The highest level is AI, which encompasses a lot of technologies: common machine learning technologies like neural nets or decision trees, but also other technologies that aren't really learning models. Chess programs are AI, but they aren't exactly machine learning in the traditional sense, or at least the early ones were not. Expert systems are AI. They were huge, of course, in the 1980s. Not quite as much now, but you still see them; they still have a lot of domain applicability. So underneath AI there is machine learning. A lot of the big popular stuff now, the neural nets and the decision trees, nearest neighbor algorithms, these are all machine learning models. Then inside of that set is deep learning, which came out of the early neural network world but of course extended to much larger networks, with advanced structures, requiring larger datasets and new kinds of gradient descent technology to actually train.

If I can add to that, the history is interesting. My dates might be off, so I don't know all the details, but in the 60s I think there was this huge optimism in all the sciences. Physics was making huge strides, mathematics was making huge strides on very abstract questions, and computing was too. And I think it was John McCarthy, who invented LISP; I think he was the one who organized this conference with the who's who of that day's computer science. And they had a very optimistic timeline. They said something like: in the next five years you'll have computers that can talk and translate languages and play any game, and they'll be like humans. I think Shannon was at that conference too, and of course none of it panned out. And the question was: how do we start implementing something like this, something that's like an animal brain? People tried all kinds of things. If you are mathematically minded, you open a math book and it looks like a sequence of axioms, and then through logic you have theorems and their proofs and corollaries and lemmas and so on and so forth. So the thought was: maybe it's logic, right? Maybe if I define a few terms and implement mathematical logic, I can just let it run and see what it comes out with, and I'll have all these theorems and nice proofs. Another stab at the problem basically said: let's define entities and their relationships and logic and concepts, however we define a concept, and let's try graph algorithms and invent new algorithms that can answer questions based on all the stuff I told the computer. All of them had some successes, but they were limited. And then, I'm guessing in the 80s, definitely by the 90s, people doubled down on this trend where they said: let's just give it data. That was called statistical learning, or now machine learning. The idea was: there are inputs, there are outputs, and there's some mapping from the inputs to the outputs that we don't know. Presumably our brain, when I see the three of you, is doing some mathematical computation and saying that's Don and that's Eric and that's Michael. So what if we had an algorithm that takes a few input-output pairs and infers an approximation to that mapping? And that is really machine learning, right?
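To make that framing concrete, here is a purely illustrative sketch: a handful of made-up input-output pairs, with scikit-learn's nearest-neighbor classifier (one of the machine learning models Eric listed) standing in for whatever learning algorithm you prefer. The 2-D "features" and the names are invented for the demo.

```python
# Toy illustration: infer an approximate input -> output mapping from a
# few labeled pairs, then apply it to a new input it has never seen.
from sklearn.neighbors import KNeighborsClassifier

X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]  # inputs
y = ["Don", "Don", "Eric", "Eric", "Michael"]                     # outputs

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(clf.predict([[0.15, 0.85]]))  # -> ['Don']: the learned mapping generalizes
```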
It's trying to find that mapping, and that leads to a whole host of interesting questions, like what the algorithms should look like. So, like Eric said, it's neural nets, decision trees, boosting, all these things. And then there's a whole theoretical framework called PAC, probably approximately correct, which basically answers the question: if I give you a finite data set with some sample size, what's the probability that the mapping you learn from this finite data set generalizes well, meaning that it performs well on new data with high probability? There was a lot of progress made in that direction. In the 90s it was done in the context of support vector machines, and of course clustering, and all of these techniques, even neural nets, fall in this domain of statistical or machine learning, which is data heavy. I think the very first perceptrons came out in the 1940s, right? So during World War II. Conv nets I think were first invented in the early 90s, if I'm not mistaken. So many of these ideas weren't new ideas; the challenges were computational. But even with deep learning, the initial idea was: well, we know the brain has these cells, neurons, and there are these connections, axons, dendrites (I'm horrible at biology), and they connect. Maybe we should make a computational device like that. And just as an aside: yesterday, Quanta Magazine, which is great for popular science, was reporting on some research that a neuron in our brain is actually not at all equivalent to a node in a neural net. It's far more complex. It needs, I think they said, five to eight layers of a small neural net to actually approximate one neuron. But I think that's the global way to organize all these thoughts. There is the general fuzzy question of AI, and we don't really understand intelligence, so you try and see what works; and then there's a very data-focused approach. And actually, IBM for example has the MIT-IBM Watson AI Lab in the Boston area, and they are trying to combine neural nets with some of the old symbolic ideas. They call it, what is it called, neurosymbolic AI or something. So these are open questions. Sorry, yeah.

So you kind of touched on this, and it's another question that we wanted to get to here. How important is data, or what is the role of data, at these three levels of the hierarchy: artificial intelligence, machine learning, and deep learning? Because it seems to me that initially we say AI can be these rule-based expert systems that don't necessarily require data other than a human, but then as we go deeper down the funnel, the data becomes more and more critical. Is that true?

Yeah. I think everybody sort of knows by osmosis these days that data, and frequently quite large data sets, into the terabyte or petabyte range sometimes, are used for modern training algorithms. I thought I'd contrast that with what happened with expert systems, because I actually did a few of these back in the day. And I think the way you phrased it, the role of data, is a great way to phrase it, because the kind of data you collected for those mostly took the form of interviewing other humans. The general pattern there was: I want an expert system that captures some operational knowledge that humans are currently employing.
And so the way you actually did this was literally interviewing human domain experts and watching them work. So it was a very different kind of data. It was a very human process where, as the software developer, you were interacting human to human with the experts, and it was your job to take all that information and code it up as a bunch of very explicit rules in whatever kind of expert system framework you were using. So it was a lot like human speech in relationship to machine speech: very low bandwidth, very small amounts of data compared to the raw hugeness of today's data sets, but high context. The experts were applying a lot of human context to say, here are the rules. I'll never forget, we did this for Eastman Kodak, back before Kodak basically went under after the advent of digital photography. We actually interviewed a team of people who inspected film gels, and at the end of the project, the amount of code we actually produced would fit on a screen; it was probably like 20 or 50 lines of code. And we were discussing what that meant. The amount of data it actually took to capture what they wanted was very small, in the sense of a very small number of rules, but the amount of effort it took to find that was quite extensive. It was several weeks of interviewing and discussing and coding. So it was almost the polar opposite of what we do these days, which, like Sanjay said, is throw a lot of data at something and find something that works statistically.

Yeah, and on that point, we're talking about real-world data science, right? And there is, I think, a tendency to say: we have tons of data, so let's throw tons of data at this problem. What are your opinions about the quality of data, how data quality impacts the ability to perform some machine learning task?

I think anyone who has worked as a data scientist anywhere has a very high failure rate in terms of projects, and most of it is driven by the quality of the data, which is either completely missing or not labeled, which is a big issue. For example, we collect petabytes of log files, probably per day or per week, in a large corporation, and nothing is labeled. So yes, there is a lot of data on disk, but what do I do with it, right? And that's where I think it's not just the size of the data set, but a combination of that and defining the appropriate problem. And that often comes from the domain expert, right? People who look at log files every day, or look at radiological scans every day. For example, a radiologist might tell you: I don't need a machine learning model for this; I can make this decision in five seconds per scan, and I only get like 20 scans a day, so I don't care. So it's the problem which defines what data you need and what kind of labeling you need. And if you have good quality data sets with the appropriate labels for the appropriate problem, you might not need millions of images, for example. There's a lot of good data science you can do with linear regression, which is, I think, unfairly maligned. It's mathematically rigorous. It's wonderful. It works very well.
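Since linear regression came up, here is a minimal sketch of the plain version, fit by ordinary least squares on synthetic data; the "true" coefficients are invented so you can watch the fit recover them. This is an illustration, not anyone's production pipeline.

```python
# Plain linear regression via the closed-form least-squares solution.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(100, 1))                   # inputs
y = 2.5 * X[:, 0] + 4.0 + rng.normal(0, 0.5, size=100)  # noisy linear signal

A = np.column_stack([X, np.ones(len(X))])   # design matrix with intercept column
coef, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
print(coef, intercept)                      # recovers roughly 2.5 and 4.0
```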
And linear regression has very nice nonlinear generalizations, like Gaussian processes, or the tree-based methods Eric mentioned. So, especially for people who work with data scientists, I would say: when you have a lot of data, it helps to work with the data scientists to define what they need, and what you would need to do to provide them that data, so that the result is still useful to you. I can define problems on log files that are probably completely meaningless to someone who actually wants to use them. It might make a nice machine learning paper, but they can't use it. So I think the data story is probably 90% of the cause of failures of data science projects. And it's mostly because there might be a lot of data, but it's not labeled, or it's badly labeled, or the problem definition is so vague that you really can't do much.

Yeah. I have a practical example I'd like to jump in with on that, just from this week, last week. I'm building a classifier that theoretically seems fairly easy. Just three classes, and the difference between the three classes is fairly obvious, theoretically. It was unlabeled data, basically text coming in to be classified into three different groups. So we actually went through a maybe two-month process of having a sample set of the data labeled so we could train the neural network model. And through that labeling, we set it up as well as you would want to: we took snapshots and asked, okay, what's the inter-rater reliability? We had subsets that multiple raters were rating: how often are the ratings agreeing? And then we kept going back to the definitions of what we have for these three classes. So it was a fairly sound process, and my model is definitely better than the one before, because of all this labeled data and having humans in there labeling correctly. But for one of my classes, the accuracy, the things we use for that, precision, recall, the F1 measure, is pretty amazing; for one it's okay; and for the third one it's significantly worse in precision and recall. So I went back to the labeling to look through it. And it's interesting: it was obvious to the people who were in charge of that process that for the one class, people really had a hard time determining whether something fit into it or not. As humans, we were having a hard time: is it really that, or is it one of the others? And the class that had the lowest inter-rater reliability, which is the phrase they use, is the one that the model is having the hardest time with. So yeah, nine times out of ten it definitely goes back to the quality of the data. And here I am working on the quality of the model, coming up with 200 different models to try to find one that's accurate. It leads back to: even when we do have the time and the effort and the intelligence to label the data, sometimes it falls short at that step.

One thing we found back when we used to do optical character recognition sounds similar to what Don just said. You think you have a feature set, and you think it's good, and it's not working that well on certain cases, or maybe all the cases. And it's like: well, as a human, look at the feature vectors. If I can't tell either, that says something. I think we as humans often just intuitively access channels of information that we're not giving to the models.
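Backing up to Don's story for a moment, both of the checks he mentions are a few lines in practice. This is a minimal sketch with invented labels: per-class precision, recall, and F1 for the model, and Cohen's kappa as one common inter-rater reliability measure for the human labelers.

```python
# Per-class model metrics plus inter-rater agreement, on made-up labels.
from sklearn.metrics import classification_report, cohen_kappa_score

y_true = ["a", "a", "b", "b", "c", "c", "c", "a"]  # human-assigned labels
y_pred = ["a", "a", "b", "b", "c", "a", "b", "a"]  # model predictions
print(classification_report(y_true, y_pred))       # precision/recall/F1 per class

rater_1 = ["a", "b", "c", "c", "a", "b"]           # two raters labeling the same items
rater_2 = ["a", "b", "c", "a", "a", "c"]
print(cohen_kappa_score(rater_1, rater_2))         # agreement beyond chance
```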
And I think optical character recognition is a good example of this, because we just look at text, and who knows what our brains are doing with that information. The model's not seeing text; the model's seeing vectors of numbers. So you should look at those vectors of numbers; that's kind of the umwelt of the model. And I often find: yeah, I can't figure that out either. So why should I expect the model to do well on my crappy feature set that I can't even use?

And actually, if I can add something to what Don and Eric just said: there's an enormous amount of weight put on predictions, because we generally use these models for predicting something. But most data scientists don't stop at the prediction part. They don't say: my model's performing well, I'm done. They say: okay, this stage is done, and now, in some ways, the real work starts, which is, I want to understand what this model is doing. And of course, it depends on the use case. Sometimes you don't care; you say, okay, it performs well, I'll just put it in production, it's not a very high-stakes situation. But often you want to understand what it's doing, and that takes a lot of work. People try to come up with general techniques, and if it's linear regression, or depending on the class of models, there might be something. But really, that's very honest, low-level scientific work, where you try to understand what's going on.

Yeah, to that point, I want to make sure we get to this question at the very least before our time is up. So we've talked about getting your model to a good prediction: you like the output results on your test sets, and now the real work starts. How do you all put your models into production at Red Hat? I think this is still somewhat of a greenfield, a new discipline: what is the best way to actually productionize intelligent applications or machine learning models? Do you three have any experience or words of wisdom to give the audience about this process?

Yeah, I can jump in. Not knowing the range of our audience here, I would say one thing to think about first is whether you need what they call online or batch prediction. Does your system need a result immediately, meaning a customer clicks a button, we receive that data, and they need a response right away that's influenced by your model? That would be an online prediction model. Or, and this seems to be more common, at least at Red Hat, can you do batch predictions, meaning you take in the data from the day before or the week before, run things on it, and then provide the whole batch? That's a lot easier to set up, and you have a lot more options for batch processing, because you don't need it to be immediate within the stream of data. We have ones where, let's say, you have reports or decisions being made for the future based on your model. You can do those as batch: you can take in a whole month's worth of data, run it, provide insight to your stakeholder, usually internal, and then they can use that data. Those are pretty easy to set up. You can run them just about anywhere, as long as you can feed the data to your model and then feed the output somewhere.
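As a rough sketch of that batch pattern: score an accumulated file of data in one pass and hand off the results. The file names, feature columns, and saved model here are hypothetical placeholders, not Don's actual pipeline.

```python
# Batch prediction: load a trained model, score a month's worth of data,
# and write the predictions out for stakeholders. All paths are placeholders.
import joblib
import pandas as pd

model = joblib.load("model.joblib")            # previously trained model
batch = pd.read_csv("cases_last_month.csv")    # accumulated input data
features = batch[["feature_a", "feature_b"]]   # whatever columns the model expects
batch["prediction"] = model.predict(features)  # score the whole batch at once
batch.to_csv("predictions.csv", index=False)   # hand the results downstream
```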
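Don turns to the online case next; as a companion to the batch sketch above, here is a hypothetical minimal version of that listening-API pattern. Flask, the endpoint name, and the payload shape are all assumptions for illustration, not the actual service described in the talk.

```python
# Online prediction: a small REST endpoint that scores one event at a time.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")  # hypothetical trained model

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()                       # e.g. {"features": [0.2, 1.7]}
    prediction = model.predict([payload["features"]])  # score the single event
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # sits there listening, e.g. in a container
```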
When we're doing online predictions, that usually means setting up an API, a self-contained thing that's sitting there listening for data. You set it up so that on an event in your system, let's say the customer hits send, or some type of event like that, the data gets sent, usually via a REST API, into the model prediction service. For me, it's a container: the container receives that information, runs my model, and outputs the results back via REST to wherever they were sent from, and then that data can influence the next step in the pipeline. So that one's harder to do, obviously; the latency is important. In my area, customer experience: if someone opens up a support case with Red Hat, we want that model to add the predictive value to the pipeline as soon as it can, so that our support associates have that added insight about the case immediately. If our model takes an hour to provide a prediction, and our customer has an agreement with us that we will respond within an hour, obviously we didn't help out at all, right?

And when you talk about implementing these things, it sounds like you're doing containerization of your models. Are you leveraging any tools like Kubernetes or OpenShift to make these things, like the APIs, accessible to the folks that need the results?

Yeah. I'm in this odd space where I'm trying to do what you might call cutting-edge neural networks on GPUs, and the OpenShift community, the Kubernetes community, is kind of one step behind where I need them to be. For example, I have a PyTorch model that requires a lot of shared memory. Kubernetes, gosh, I haven't seen the numbers in a while, but I think 1.19 did not allow the configuration of shared memory. So the AICOE team, y'all's team, put in a merge request for Kubernetes 1.20. It got in there, and it's now in OpenShift 4.7. So I keep running into those chicken-and-egg scenarios where I can't get things in OpenShift soon enough. But yeah, most of the models that I have are built in a way that they're OpenShift-ready. I'm running on UBI containers. I do all my development within a container. When I push my code, it unit-tests within a different container, and then it pushes into a prod container. And we have an internal OpenShift where I can run my models.

Cool. Well, we're almost out of time here, but I think there's one question in the Q&A, so we have to do it, and it's a great question. Do you think that quantum computing will take us closer to an AI system mimicking a human brain?

That's loaded, but I'm going to go for it. My own personal opinion on this is that while it's superficially and trivially true that brains obey quantum physics, because everything does, I kind of think that quantum mechanics does not deeply influence the way brains operate. That's my take on what I've read about brain structure and how neurons work. Other people's opinions differ, but when I read the arguments about quantum computing in brains, it mostly feels like: well, brains are pretty weird, and quantum mechanics is weird, and we're sort of superficially non-deterministic and so is quantum mechanics, so there must be something quantum going on. And I'm not actually sure that's true. I guess that's my own hot take on that.

Yeah, in the last one minute, I'll take a stab at the second part of it. On quantum computing, I'm a jaded person, right?
I was in physics undergrad and grad school for like 10 years, and quantum computing was always one of those things that people worked very hard on, and they were extremely good physicists building them, but everyone promised it would be out in three years, three years. One day I'll be wrong, but for now I'll take the bet that it's like 20 years away, maybe even 50 years away; one day I'll be wrong and I'll happily pay out some bet to someone. On the second part: as far as I know, and I'm not a neuroscientist, we don't understand the brain. We know bits and pieces of how it works, and I'm not sure understanding our brain is a prerequisite to building an AI system, under some reasonable definition. The classic analogy is that of birds and planes: we build planes that don't fly like birds, but they work, because we understand the physics very well. And if we understood the mechanisms of intelligence, and I just said three words that I wouldn't know how to define if someone asked me to, but let's say we did, then maybe we could build something. In practice, I think what happens is the way someone described mathematics researchers when I was an undergrad: it's an infinite pie with a hole in the center, and you put in a bunch of mathematicians, and they nibble at this pie, and every now and then someone finds a blueberry, and then all the other mathematicians come rushing towards that blueberry saying, hey, I found something. But no one today, whether you're famous or not famous, have a prize or don't have a prize, has any idea how to make progress on this. All of us have some beliefs. Most data scientists day-to-day don't work on this, but the ones who do, the machine learning researchers, even they just have beliefs. Some think: well, reinforcement learning is what I find interesting; I have these concrete problems, I'm going to work on them. Someone else goes and works on symbolic systems. Someone goes and works on scaling neural nets to 10,000 GPUs. Who knows? It's just, I think, one of those things where the people who work on this have to keep nibbling at their thing, and I can't predict the future.

All right, cool. We've gone over time here, so thank you all for staying a couple minutes late, and thank you, Don, Eric, and Sanjay, for being such great experts this morning. That's the end of our session. If there are any additional questions, I'll be in this track room for most of the day, so feel free to reach out.