Welcome to SYDE 522, Machine Intelligence. I don't want to go over every item in the course syllabus, just the important topics, which I will briefly mention here. All communications will be through Learn, so please take a look at it, starting right now; there's nothing there yet except the course syllabus. Take a look at it, at all the details; I'm sure you will find the class times, the tutorial times, and everything else. Before we jump into the material, I want to go over the most important parts of the syllabus, and maybe we can have a little bit of discussion before we start. And I do want to start with the material today, because we have a lot to cover, and in order to do that, we cannot waste time. So, the objectives of the course. I hope you can see this, because one of my problems with this classroom is that it has no traditional boards, and I have difficulty writing with the markers; hopefully we will be doing okay. So, the objectives of the course: we want to learn the basic concepts of AI, and we will clarify what we mean by AI when we start, soon. We want to see what the basic concepts are, not just one or two methods that I can look up in some YouTube videos and say, okay, this method works like this, but the real basic concepts: what are they? We also want to learn different methods for function approximation. Why function approximation? Because, as we will hopefully see and discover together, machine learning, AI, computational learning, is nothing but function approximation. We approximate functions that are not known; nobody knows the equations for those functions, so we approximate them. That's the fundamental difference between AI and classical math. We want to learn some of those methods. And we want to learn how to select the correct, the right, learning scheme.
So if a problem is given, how do I know which method from the AI repertoire I can select to solve it? This is one of the fundamental skills in AI, and you cannot acquire it in a short time. There's no way around it: no online course, no certificate, no YouTube video, no blog can prepare you to do this in two months or four months. After ten years, you will be there. And that skill saves time: it saves development time, it saves design time. Of course, we also want to see and learn the difference between shallow and deep learning. Deep learning for us is maybe 20% of what we do in this course, but since it is at the moment one of the most successful AI technologies, we have to talk about it. We will talk about it; we will go into the details of one or two of those methods. And we want to see: what is the difference between shallow and deep? Should we now dismiss all the shallow techniques that we have, or do we keep them? That's very important for us to figure out in this course. There is no way we can solve all problems with deep techniques alone, for many, many reasons, as we will hopefully find out together. Next: how do we verify the learning capability of different techniques? As a matter of fact, we will start today with a historic example in this regard. If a technique says "I can learn," how do I verify that it has learned? How do I know? Who is in charge? Who will tell me, yes, this technique has learned, and now you can use it and rely on it? Because when I deploy it after some sort of training, people will trust that it has learned and that it will now deploy its knowledge in some way. And we will learn how to run and evaluate experiments. I'm trusting everybody can see, even if I write up to here; if not, let me know, just scream.
A big part of AI has left the theoretical domain, which freaks out many computer scientists. We are doing many things for which we don't have any theory, but we are doing them because experimentally they are fantastic. So AI has become largely empirical, based on the experiments that we run: we run them and say, yeah, it's giving me 98% accuracy. How do we run those experiments? How do we make sure they are reliable? Even if you say it's just an experiment, okay, how do I know it's a reliable experiment? We want to learn some of that in this course. And of course, we want to see how to write a scientific paper. You might say, okay, I'm graduating in two months, I'm going to work as an engineer, I don't care about scientific papers. Trust me, if you go into any serious organization, any company, they have research and development. At the very least you should be able to write a white paper, a report on how fantastic your software is, your device is, your robotic system is. You have to be able to write scientifically as an engineer; you don't need to stay in an academic environment for that. So we also want to look at how to write a scientific paper from A to Z: okay, I had this problem, I selected this AI technique, I ran this type of experiments, I validated them this way, and therefore my method is fantastic, so buy it or hire me. That is what we want to do in that regard. Also important for this course are the TAs that we have: Shivam, Amir, Morteza, and Aditya. Shivam, Morteza, and Amir are here; please stand up so that everybody can see you. They will be in charge of tutorials and many other things, so you will communicate with them a lot. Thank you. Aditya may not be on campus very frequently, but you will communicate with him. You have their emails in the course syllabus.
And you will know, from case to case, from week to week, which one of them to contact to get responses, to clarify things, and so on. So it is very important to know who is who. You have my office address and everything else in the course syllabus; I don't want to go into that. Of course, for us and for you, the grading scheme is important. Again, I'm not going through everything in the syllabus; I want to spend not more than 15 minutes on it. You read it, and if there's a problem, we talk about it. So we have lecture quizzes: five of them, and they amount to 15%. It doesn't mean we have quizzes during the lecture; it means we have quizzes that cover lecture material. They will be online, you will get an email, you usually have one or two days to do them, and they are automatically graded by the Learn system. I'm pretty sure most of you have done that before. So, five lecture quizzes; they are basically supposed to be advanced attendance checking, so I will not challenge you in the quizzes. They will be kindergarten questions. If you're not sleeping during the lecture, you should all get 100 on the quizzes. It's just: okay, you were there, you were listening, so tell me, what did he say? And we will go one by one: you will get an email from me, the next quiz is up, you have two days to do it, it's about this topic. Then we have tutorial tasks; we will have four of them, maybe more, maybe less, but they will cover 20%. The three TAs who just stood up are responsible for the tutorial tasks, so they will run the tutorials and design the tasks. It's completely up to them, because each one of them is an AI expert. I will not check what they are doing, because they are teachers themselves, so they can do it. So you have to connect with them; of course, I will coordinate with them.
They know what we are doing, I know what they are doing, so we will coordinate closely, but they are in charge of the tutorial tasks. If you miss tutorials, you will miss a lot of the material we want to cover there; this is the reason that, as far as I can remember, I didn't sign any override that involved a time conflict with the tutorial. We also have a midterm for this course, which will be around March. We are flexible on that; I will coordinate with you on when we should do it, because you have other things to do. From my perspective, if I don't get any push from the department that I should do it this way or that way, I can push it to the end of March, I don't mind. We can make it something between a midterm and a final and put it anywhere we want. The task is this: some things we will check with the quizzes, some things with the tutorial assignments, and a lot of things with the project, but the midterm will check things that conventionally we can only check with exams. So I ask you questions: okay, describe it to me, please. What is it? The midterm is 25%, and of course this course is fundamentally about a project paper, which amounts to 40% of the marks. Project paper means that as an individual student or as a group, which could be two, three, or four people if the problem is difficult enough, you will select a problem and work on it. We will talk about it all the time, we will go through it together, and we will have some deadlines: okay, tell me what problem you have selected, what is your project? Ideally it could be your fourth-year design project, or part of it, some of it, something that you may or may not integrate within your design project.
That's the best-case scenario, because then it doesn't take energy away from you: you will be doing your design project, and part of it you are also doing within the machine intelligence course. We will talk about this a lot, because this is the main task: we want to do an AI project, run experiments, and then report it in a scientific paper, not less than five pages. You are encouraged to build any teams you want, but you cannot be more than four, and if you are three or four, the problem should justify it. You cannot select an easy problem and then say, we will be four, three of us will go on vacation, and the one of us who is really hard-working will do it. It should be something appropriate for four people. Okay, any questions so far? No questions? Okay, good. So, lecture topics; again, I will not go into all the details. What will we cover? We will cover encoding and embedding right off the bat, because you cannot start AI before you know: okay, how do I deal with data? The data comes, the data is big; how do I deal with it? And we will see that AI is about data; you know that AI is about data. Without data, there is no AI, so I have to deal with data. How can I even start thinking about any AI technology if I don't know how to handle data, how to encode data, how to embed data within a specific application? When we talk about this type of thing, we are talking about data compression, or dimensionality reduction. We are talking about approaches like principal component analysis. We are talking about methods like t-SNE. We are talking about sophisticated methods like Fisher vectors, and so on. So we will talk about all of this under encoding; I use "encoding" in a much broader way. You cannot always work with the raw data. Sometimes we can, and that would be ideal, if I could just take the raw data and put it through some sort of AI technology. But most of the time, we cannot.
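Since principal component analysis will come up again and again, here is a minimal sketch of the idea in NumPy. This is a toy illustration only; the random data and the choice of two components are made up, not from the course:

```python
import numpy as np

def pca(X, n_components):
    # Center the data, find the principal axes via SVD,
    # and project onto the strongest directions.
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))   # 200 samples with 10 features
Z = pca(X, 2)                    # compressed to 2 dimensions
print(Z.shape)                   # (200, 2)
```

The singular value decomposition orders the directions by variance, so the first component always carries the most information; that is the sense in which PCA compresses data while losing as little as possible.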
Then we also talk about K-fold cross-validation: how do I validate techniques? And we talk about leave-one-out, and so on; techniques that we use for experimentation. How do I know that the experiments I'm running are reliable? I have one million data points: how do I organize my experiments? How much of it do I take for training, how much do I take for testing, and how do I test? We will answer these questions, and for that we have to be a little bit patient. We cannot jump straight into the exciting AI techniques; we have to answer these questions first. This will take us maybe two weeks. When we have gone through this, okay, the data comes like this, I can compress it like this, I can package it like this, I can clean it up like this, I can organize it like this, I can run experiments like this: now we are ready to do some serious AI design. Of course, the second class of topics that we will cover is classification and clustering, and it is no exaggeration to say that 95% of the things you hear about are classification and clustering. This is the big bulk of the tasks AI technologies are doing, and they are really good at it, specifically at classification. Maybe not so much at clustering, but at classification they are really good. Provided there is enough data, provided the data behaves, provided I have enough power, and many other things that we will hopefully discuss together. We will start with simple things like K-means: almost 60, 70 years old, a very simple technique, five lines of Python code. You can run it immediately and get some results. I have seen startups that say, we are doing AI-driven this and that, and then you dig in and ask, okay, what do you use? They use K-means. Okay, I love K-means, but if you use just K-means, this is hardly AI. It's one of the cute members of the AI community, but it's not really AI.
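The five-line K-means really is about that small. Here is a minimal NumPy sketch of Lloyd's algorithm; the two artificial point clouds are made up purely for illustration:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    # Lloyd's algorithm: alternate between assigning each point to its
    # nearest center and moving each center to the mean of its points.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            if (labels == j).any() else centers[j]
                            for j in range(k)])
    return labels, centers

# Two well-separated clouds of 2-D points; K-means should recover them.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels, centers = kmeans(X, 2)
```

Note that nothing here is "learned" in a deep sense: it is just distance measurements and averaging, which is exactly the point being made about startups calling this AI.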
K-means is clustering; I love it, it's a good technique: simple, capable, with good convergence behavior, everything. But calling your product AI-driven when you just use a five-line clustering technique in Python will fall apart at the first due diligence done by the investors, and many of them do it. So don't do that. We'll also talk about fuzzy C-means: how do I classify data when it's not about yes and no, but a question of degree? Many problems are a question of degree, and you need a percentage rather than a yes or no. We will of course talk about SVM, support vector machines. This is one of our favorites; even before deep learning, this gave us a boost in AI technology. We cannot have an AI course and not talk about support vector machines: the best classification technique we have with a mathematical guarantee. In contrast to many other AI techniques that don't give you any guarantee, the support vector machine gives you a guarantee in writing: what you find is optimal, which means nobody can do any better. So SVM is one of the best techniques we have, and we will definitely talk about it. And of course, we will talk about self-organizing maps, which arrived shortly before backpropagation networks were introduced. They are one of the few unsupervised technologies. At the moment, they are being neglected for some reason in mainstream AI. I have no idea why; everybody is in love with supervised learning: I get a million data points, they are labeled somehow, somebody has labeled them, then we put them through AI, and then we get this. There is a lot of power in unsupervised techniques, which do not need a teacher. And we have to go in that direction if we want to leave the domain of what at the moment we call weak AI and do strong AI. Strong AI is self-organizing; it doesn't need a teacher. It will pick up the things it needs.
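To make the margin idea behind support vector machines concrete, here is a heavily simplified sketch: a linear soft-margin SVM trained by sub-gradient descent on the primal objective. Real SVM solvers work on the dual formulation and support kernels; the toy data and all hyperparameters here are made up for illustration:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    # Sub-gradient descent on  lam*||w||^2 + mean(max(0, 1 - y*(w.x + b))),
    # the primal soft-margin objective; labels y must be in {-1, +1}.
    w = np.zeros(X.shape[1])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        mask = y * (X @ w + b) < 1          # points violating the margin
        w -= lr * (2 * lam * w - (y[mask, None] * X[mask]).sum(axis=0) / n)
        b -= lr * (-y[mask].sum() / n)
    return w, b

# Two separable clumps: positives around (2, 2), negatives around (-2, -2).
X = np.array([[2., 2.], [3., 2.], [2., 3.], [-2., -2.], [-3., -2.], [-2., -3.]])
y = np.array([1., 1., 1., -1., -1., -1.])
w, b = train_linear_svm(X, y)
print(np.sign(X @ w + b))   # all six points end up on the correct side
```

The regularization term `lam * ||w||^2` is what pushes the solution toward the maximum-margin separator; the optimality guarantee discussed above comes from the fact that this objective is convex, so there is one best answer.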
So we will talk about those kinds of techniques too. Then, of course, things get really interesting with learning. How do we learn in AI? What does it mean to learn something? Here we have to stop at the perceptron. Then we will talk about backpropagation, about convolutional neural networks, about autoencoders, and about some other things. We will talk about different learning techniques in the context of artificial neural networks, which are at the moment arguably the most exciting, most powerful part of AI, and which have given all of us the excitement to be here today. We will talk about many of them. We will also talk about things that, like the self-organizing maps, have been a little bit ignored, such as reinforcement learning. For maybe a year or so, people have started to talk about reinforcement learning again, but you usually see it with the adjective "deep" in front of it, which I would say is nonsensical. There is no such thing as deep reinforcement learning; there never has been, and there never will be, because reinforcement learning is not a type of neural network. It comes from a different corner of AI and does different things. Can you connect it to networks? Yes, you can; we have been doing that for 20, 25 years, and you can connect it to shallow networks or to deep networks. So "deep reinforcement learning" is a misleading term. But we will talk about reinforcement learning agents: arguably the only AI technology that can learn from scratch, from zero. It can start with no data; it is just learning by doing, learning by making mistakes. One of the most exciting concepts in AI, and it is in many, many things that we are already using, for example when you go into an unknown environment. Then we will go into uncertain versus vague.
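The perceptron itself fits in a few lines. Here is a toy illustration of Rosenblatt's learning rule on the logical AND problem with labels in {-1, +1}; the problem choice and epoch count are my own, purely for illustration:

```python
import numpy as np

def train_perceptron(X, y, epochs=20):
    # Rosenblatt's rule: on each mistake, nudge the weights
    # toward the misclassified example.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified (or on the boundary)
                w += yi * xi
                b += yi
    return w, b

# Logical AND, a linearly separable problem.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))   # [-1. -1. -1.  1.]
```

On separable data this converges after a finite number of updates; what it cannot do, as Minsky and Papert made explicit, is learn a non-linear problem such as XOR.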
Uncertain versus vague, because this is actually what AI handles. If we have a situation that is uncertain, we don't know what is what; there is no equation for it. Or it is vague: is it this, or is it that? We don't know the boundaries. In these situations, AI can do a lot. Here we will talk about probability, and we will talk about fuzzy logic. Whereas probability has been around for almost 300 years, fuzzy logic is just over 50 years old, and it is still struggling; this guy is still not being accepted into the family, because it comes from a different corner of AI. But we don't have any preference: we will cover everything that has a little bit of hope of giving us artificial intelligence. And this class of techniques is really interesting, because they work with linguistically formulated knowledge. One of the problems we have right now with deep networks is that they are not interpretable. Giving half a million numbers to somebody and saying, this is the decision, I said this image represents cancer, doesn't help that much. But maybe we can find some smart techniques that can take a neural network and convert it to a bunch of rules that a physician can understand. So we will not give up. Many people are quick to say that you have to abandon logic. Nonsense, absolute nonsense. There will be no AI without logic: without Cantorian logic, Boolean logic, fuzzy logic, any other type of logic. We need logic because humans use logic, so we need it in an artificial way as well. We will talk about those methods too. And of course, decision trees and random forests are also among the techniques that we should not abandon. In many cases, I can challenge anybody who is using a deep network with 120 layers: okay, give me the same features, and with random forests I will give you the same results.
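Coming back to the "question of degree" for a moment: fuzzy logic replaces a yes/no answer with a degree of membership between 0 and 1. A minimal sketch with a triangular membership function; the notion of "warm" and the temperature thresholds below are invented for illustration:

```python
def triangular(x, a, b, c):
    # Triangular fuzzy membership: 0 outside [a, c], rising to 1 at x == b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# "Warm" as a fuzzy set over temperature: fully warm at 22 degrees,
# not warm at all below 15 or above 30 (invented thresholds).
print(triangular(22, 15, 22, 30))   # 1.0
print(triangular(18, 15, 22, 30))   # about 0.43: warm "to a degree"
```

This is how linguistically formulated knowledge like "a little bit to the left" becomes something a computer can work with: not a crisp threshold, but a degree.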
Claiming that random forests can match a deep network sounds outrageous, because a random forest is a collection of simple classifiers, the decision trees that we had in second year. Just a tree: you break the problem down and then you conquer it, divide and conquer. It still works. Deep learning has not canceled out everything we know in computer science; it is just reinforcing some of it. So you have to be careful about that. We will also talk about evolutionary algorithms. And at the end, we will talk about ethics, which is very important. I'm not worried that robots will conquer the planet in ten years; that's nonsensical. You can take a deep neural network, and with millions of minutes of training, paying 50 PhD students, train it to recognize a cat and a dog. This technology can hardly conquer the planet in 50 years, so don't worry about that. I'm sure the militaries of the planet, the crazy dictators and governments, will come up with some strategies to misuse AI, just as you can misuse a knife: you can use a knife to cut bread and share it with a friend, or you can kill somebody with it. There is nothing different with AI. But those prospects that AI will conquer the planet are nonsensical from today's perspective. There is a lot we can say about the 40% for the paper, but I want to take it step by step, so we will talk about it as we go along. In the first tutorial on Thursday, I will start to say something about it, and then we can take it one step at a time; I don't want to say much today. The project paper is due on April 24th by midnight. I pushed it as far back into April as I could so that you have time, but there should also be enough time for us to evaluate it, because many of you are graduating and need your grades; so it cannot go further back than that. Any questions before we jump into the material? Any question? No question? Okay. Good. Good.
So, let's start by asking the question: why so much interest in AI? Why is there so much interest in AI? It is a valid question; why is that? Well, first of all, we have recent progress in algorithms, and this is one of the things that people simply ignore and don't want to see. Don't overlook it. Some people somehow have the imagination that you can do AI without algorithm design. I don't know; again, it's the magic of YouTube. You should have taken a data structures and algorithms course. There is no way around it; otherwise, you cannot go deep into AI. You need that understanding of how algorithms work: what is an iteration, what is recursion, what does it mean to go inside an algorithm and do something? You need that basic understanding. And in recent years, we had a lot of progress. The actual breakthrough came around 2006, 2007, which I will talk about, and it was a really simple idea that suddenly transformed everything. But it was an algorithmic idea; it was not the GPU. The GPU is also important, but you can have all the GPUs of the world, and if you don't have the algorithmic idea, nothing will happen. So you need the algorithmic idea, and the brains that come up with algorithmic ideas are a different type of brain. I don't know what type of brains they are, but they work really nicely, and they come up with insanely innovative ideas. We will talk about some of these ideas today, just to point to them. And of course, the availability of data: data has never been so available as today, in 2018, 2019. So data is available. Accept for the time being that AI is about function approximation; that is the magic of AI. It demystifies AI, and I cannot explain it in exciting terms to my grandma, but AI is function approximation. And what is function approximation here? The function is unknown; nobody knows it, there is no equation for it. So AI approximates something that you cannot see.
That's the magic of AI: approximating the relationship between input and output. But you cannot do function approximation, when there is no equation, if there is no data either. So you need data. Naturally, if there is no data, there is no AI. The availability of data has helped the reemergence of AI technologies. And of course, you cannot do any of that without computational power. We were just indexing 2 million images; it's not much, is it, 2 million images? It took two and a half weeks just to go through them and do some calculations. Two and a half weeks: that's not acceptable. If I go to any regular application, astrophysics, robotics, navigation, satellite imaging, medical imaging, if you go to a regular institution, you easily get millions and millions of images. So if I have to spend two weeks on 2 million images, that's not going to work. You need a lot of computational power, and GPUs gave us that power. Yes, they are expensive, I know, and you cannot just tell people, go get some GPUs; not everybody has access to GPUs. This has become a point of concern, because not everybody has access to those platforms, which means what? Research and development in AI will be monopolized by a few big companies, big universities, and whoever has the money to buy the computational power. This is very worrisome, very worrisome. Today we finished installing and connecting a Dell server with four GPUs, price tag $200,000. This is the minimum that you need to work with some big data. So who can afford that? That's a problem, but at least we have it as a research community. So: we have the computational power, the data is available, and we had some neat ideas about how to shape and design algorithms. Now we can do stuff, okay? Now I want to give you a little bit of the history of AI, because it's always good to know the background. Where are we coming from? In 1901, we started with PCA.
Principal component analysis is a statistical method, and some people may disagree with me when I say PCA is an AI method. But there is intelligence in taking a million numbers and making them 100; you have got to be intelligent to do that. So 1901, for me, is the start of AI, because for the first time we figured out how to reduce the dimensionality of data: how to compress it without losing information. That was Karl Pearson. Today there are many, many AI technologies that cannot work without PCA, and there are many different versions of PCA: kernelized PCA, this PCA, that PCA, many PCAs. But the fundamental concept is the same as 120 years ago. It's unbelievable that some ideas just stick around, and you can use them again and again. In 1933 came the further development of PCA, by Harold Hotelling, so it got better and better. Shortly before the world fell into the madness of the Second World War, we figured out the right formulation for PCA: okay, we have to use it this way to make sure that we don't lose information. But here, in 1958, came something where most people would agree with me: okay, this was really substantial. And that was the introduction of the concept of the perceptron by Frank Rosenblatt. We already had the concept of the automaton, a unit, a software entity, that does something autonomously; an automaton that does perception is a perceptron. And if you take a look at the PhD thesis of Frank Rosenblatt, you see that, for the first time, questions of intelligence are accompanied by equations; he tried to develop a probabilistic view of them, which was amazing. The notion of the perceptron was crucial. Without the perceptron, none of us would be here; there would be no machine intelligence course, there would be no deep networks. This was definitely a major milestone. In 1965, we got fuzzy sets, by Zadeh from Berkeley.
Fuzzy sets were outrageous, because 1965 was the beginning of the boom of digital technology. Everybody was showing off that now we can calculate all numbers really accurately, and then a guy comes along and talks about fuzzy sets. What are you talking about? Don't you want to calculate everything accurately, not fuzzily? People didn't get it until the 80s that this theory is not fuzzy itself; it is there to capture fuzziness. Where is the fuzziness in human reasoning? When your friend wants to park and you give him instructions: come, come, a little bit to the left. What is "a little bit to the left"? Can you put it in a number for a computer? Can a computer understand "a little bit to the left", or do you have to say 13.5 inches? So 1965 was very important. And in 1969, we basically figured out the limitations of the perceptron. (These markers are going to be a problem.) Here, of course, you have to take a look at the book by Minsky and Papert. The idea was Rosenblatt's; Minsky and Papert made it concrete, made it explicit. They said: look, this is a nice idea, yes, but this problem and that problem you cannot deal with this way; you have to do it that way. We got a more concrete version of the idea. What is a perceptron? The smallest artificial neural network that can learn a linear problem. Fundamental: the building block of everything we have today. So 1969 was of course very important. In 1982, we got the self-organizing map, thanks to Teuvo Kohonen. This is amazing, because it was independent from everything else that was happening on the way to the backpropagation networks that came after it. Self-organizing maps said: okay, I put a bunch of these processing units that we call perceptrons together, I give them the data, and I say, figure it out. What do you mean? Self-organizing: it should organize the data itself, without any instruction. That's a crazy idea. How do you do that?
Well, by a fundamental concept that we will use again and again and again: similarity and dissimilarity measurements via distance metrics. Something that has been around since Sir Isaac Newton, but now we are using it in a different way, applied to big data. Then came 1986, the year of backpropagation; of course, Rumelhart and Geoff Hinton. My own journey with AI started here. I was an undergrad student, and I got a job, as a URA I guess, and somebody said: can you read this paper and implement it in ANSI C? I read it, I struggled with it for two or three months, and I implemented it. I still have the code. I have no idea what the code was doing; I'm sure I didn't understand half of it. But that was the beginning of my journey with AI, and I said, my God, that's so interesting: the concept of learning. If you have a processing unit, you can make it learn to do what you want. That's fascinating for an engineer, to take an entity and say, okay, do what I want. Not just for engineers, I guess. The same year, 1986, we got the ID3 algorithm by J. R. Quinlan, and by 1993 the same guy also gave us the C4.5 algorithm. These are about decision trees. Back in the 50s, we promised too much with AI, and by the mid-60s, nobody was listening to us anymore, because back in the 50s we had promised we could build GPS, not the navigation GPS: the General Problem Solver. Look at the audacity: we said AI, we didn't even know what AI was, but we were promising what it could do. It was about expert systems and rules; we were saying we can write a program with AI that can solve any problem. Today we cannot do it, and my guess is that in 50 years we will not be able to do it; maybe even in 100 years we will not be able to do it. We couldn't even come up with the rules. Quinlan, for the first time, gave us something that is still in MATLAB and Python today, and you are using it every day without knowing it.
You take a bunch of numbers, you give them to this algorithm, it processes them and constructs a beautiful decision tree for you. Fantastic. The concepts of entropy and information gain come from communication systems and originally had nothing to do with AI. People who come from outside the field bring fresh ideas, because they don't have our biases; they bring a new perspective. So, algorithms for decision tree construction: very, very important, and relevant today. Today, random forests, based on these same algorithms, can compete with a deep network. That's amazing. Not for every application, but for some applications, and they are interpretable, which means a human expert can understand the results, and I can say why a decision was made; very important. In 1995, we got SVM, by Cortes and Vapnik. Vapnik had worked on support vector machines for almost three decades. He had little possibility back in Moscow to run experiments, and then somebody said, come to New York, we will work together, we will figure it out. So they figured it out, made the last adjustments, developed and ran it, and then they showed, on one of the major data sets that everybody is still using today, MNIST, the digit recognition data set: my God, we can really do nice classification. Again, what is important about support vector machines is the mathematical guarantee that they will find the best result; nobody can beat that, with one condition: it has to be a binary classification, yes or no. For multiclass, we have some problems to overcome. And between 1995 and 2001, we also got random forests, by multiple authors: partly by T. K. Ho and partly by L. Breiman.
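The entropy and information gain behind ID3 and C4.5 are a short computation. A sketch with toy label counts; the 9-versus-5 split is an invented example, not course data:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy, in bits, of a collection of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, splits):
    # Entropy reduction achieved by splitting `parent` into `splits`;
    # this is the quantity ID3 maximizes when choosing an attribute.
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in splits)

labels = ['yes'] * 9 + ['no'] * 5
print(entropy(labels))   # about 0.940 bits
print(information_gain(labels,
                       [['yes'] * 6 + ['no'] * 2,    # one branch
                        ['yes'] * 3 + ['no'] * 3]))  # the other branch
```

Tree construction is just this computation applied greedily: at each node, pick the attribute whose split gives the largest information gain, and recurse.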
Random forests, again, are important because they show one of the major concepts of AI: either you come up with one big solution, like a deep network with 120 layers, or you come up with many, many small solutions and then combine their forces — ensemble learning. As classifiers they are very good, and computationally they are super efficient; we cannot dismiss them. So people try to make them exciting and attractive by saying, okay, deep random forest. What is a deep random forest? Because it has many, many layers? Why? Okay, so then we also come to 1995. So many things happened in 1995. What else happened in 1995? CNN, the convolutional neural network, happened in 1995, or maybe a little before that. CNN was among a series of ideas — when we get to neural networks we will talk about it in detail — for overcoming the challenges of artificial neural networks. Before that there was the Neocognitron, by a Japanese colleague. It was not successful because it did not provide a learning scheme. This guy said: this is the architecture and this is the learning scheme — not just the architecture. So the idea was there in 1995, but nothing happened until 2006. The architecture was good, the learning algorithm was okay, but you still couldn't train it; the training would not come to an end, it would go on forever and ever — until 2006. In 2006 we got a fast learning algorithm for deep belief networks. Of course, the godfather himself was behind that, Geoff Hinton. A fast learning algorithm for deep networks: the first time ever that we figured out how to train a network that is so complicated, with so many layers. That was the breakthrough. And when we have a breakthrough, other breakthroughs follow. That came from Toronto; in 2007 something came from Montreal: greedy layer-wise training for deep nets. That came from Bengio and his team. Now we were complete; now we had everything.
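The "many small solutions combined" idea behind random forests can be shown with the simplest possible ensemble: weak one-dimensional threshold learners, each trained on a bootstrap resample of the data, combined by majority vote. This is only a sketch of bagging, far simpler than a real random forest; all names and numbers are illustrative:

```python
import random
from collections import Counter

def bagged_predict(data, x, n_learners=25, seed=0):
    # data: list of (value, label) pairs with labels 0/1.
    # Each weak learner places a threshold midway between the class means
    # of a bootstrap resample; the ensemble answers by majority vote.
    rng = random.Random(seed)
    votes = []
    for _ in range(n_learners):
        sample = [rng.choice(data) for _ in data]   # bootstrap resample
        zeros = [v for v, y in sample if y == 0]
        ones = [v for v, y in sample if y == 1]
        if not zeros or not ones:
            continue                                # degenerate resample: skip
        threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
        votes.append(1 if x > threshold else 0)
    return Counter(votes).most_common(1)[0][0]
```

Each individual learner is weak, but averaging over resamples makes the combined vote stable — the same intuition that makes a forest of small trees competitive.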
Here the work of Frank Rosenblatt from the late 1950s was complete. Now we knew how to define artificial neural networks, put them together, make them deep, and train them for a specific, difficult task. But still, nothing happened. Why? Because somebody has to do it — get his hands dirty, or her hands dirty — and show for one case that it is possible. That happened in 2012 with AlexNet. The euphoria and hype around AI, with everybody running around, dates from 2012, because for the first time somebody showed that you can take a deep neural network and train it for a difficult recognition task: image identification on a large database of over one million images — cats and dogs and buildings and cars and humans and bicycles and everything, 1,000 classes. Somebody did it. Yes, the theory was there, but somebody had to do it. This is engineering work. Theory is nice, theorems are good, but who does the job? Who takes it and applies it to a real-world problem? In order to do that, they had to work for two years. It's not easy to run an experiment with an architecture that nobody has tried. You don't even know how to do it; for everything, you have to write your own code — most of the code they wrote was made available to everybody afterwards — because everything you do is pioneering work. Unbelievable. Okay, enough history. So what are the main tasks of AI? You don't have a clock here. We will go through virtually all of them. I want to go through the history of AI, one step at a time, and see: why did it take us 70 years to get here? Almost 120 years to get here, starting from Pearson. We will hopefully figure that out together. So what are the main tasks? Of course, classification. This is where we are exaggerating, and it is very dangerous; it is becoming one-sided, and everybody is doing classification.
For two years now I have been intentionally avoiding classification, because I don't want to get biased by the mainstream. Everybody is doing classification because it's straightforward: you give me the data, you give me the labels, I train it, I generate the model, I give it to you. There you go, I can go home. That is okay for many cases, but not in general. Of course, we also do estimation. Usually we estimate the probability of something, which could also be prediction. Can you predict what my robot will see when it gets to the curve? Or what happens tomorrow on the stock market? Or, if I give this chemotherapy to the patient, will the patient survive or not? You can easily imagine many, many useful applications for prediction, where we estimate the probability of an event based on data from the past. Of course, search. This is my domain. I love it, because it is an absolute minority in the mainstream. Not many people are doing search, because search is tough, and you cannot easily show 99% accuracy for search; you have to spend years of development on it. Searching for something in a large archive, searching for patterns that we have never seen, patterns that are impossible for a human operator to see — can we do that? That would be something, an exciting aspect of AI: find things that we cannot see as humans. I want to see AI doing that for us. And of course, optimization. The entire history of the evolution of plants and animals on this planet is about optimization. How can there be intelligence without optimization? We have to optimize things. If I come up with something, can I optimize it in some way, so that I can deploy it as a solution? And of course, inference. This is not an exhaustive list, just the most important ones. How can you infer new knowledge based on the knowledge that you have from the past?
Let me give you a simple example that we will come back to. Say you have the rule: if the student works very hard, the student will be successful. And then you observe that the student works hard — just hard, not very hard. According to the rules of classical logic, you cannot make any statement; your only rule says "very hard", and a computer cannot conclude anything about "hard". So inferring knowledge from previous observations is subject to uncertainty and vagueness. This is still a big part of AI. People are telling the young people stories: now we can do everything with deep networks. No, we can't. I love deep networks. I work with deep networks. My career started with artificial neural networks, but they cover maybe 20% of the problems; for the other 80% we need everything else. So don't specialize too much, especially because you're young. I can specialize — I will retire in some years, so I can focus on search and say I will only use this one; I have nothing to risk. You have to keep your options open. You have to be a generalist at the moment; you have to have a broad overview of what is possible and what is not. So, I am making the argument that all of this is approximation. AI is about approximation. And what is approximation? You have a black box; some X goes in, some Y comes out. There is an f(x) inside. X is a vector, usually multi-dimensional, often hyper-dimensional — X can have a million components. Nobody knows f(x). So if I give you X, can you approximate Y? That's what AI does. I know it takes away the magic, but okay. It's not an easy task; it's very difficult. So we say Y = f(X), where Y can be a vector and X definitely is a vector, which means f maps X to Y — and f is unknown.
When you go with this to the department of mathematics, the general answer is: okay, bad luck, we cannot do anything. There are some interpolation and extrapolation techniques, and regression, which comes from math too. But there is no general solution. If this X is my measurement at the stock market and I want to guess what happens tomorrow, that is a highly non-stationary, non-linear, crazy, semi-chaotic problem. We don't have mathematics for that. Bertrand Russell recognized this 100 years ago: the mathematics that we have is not for this world; it's for some imaginary, simplified world. AI gets closer, because AI doesn't care. AI will try to estimate, to predict, to infer this f, this unknown function. The simplest approach is regression — also a good old technique, nothing new. If you had told somebody 20 years ago that regression is an AI technique, they would have said: what? No, it's just a statistical method. Of course it is. But regression can do really nice stuff. Okay, so I want to spend some time on the question: what is AI? What do we understand by AI? Let me go back up. Okay, hopefully it stays. No? Okay, it doesn't want to stay. That's fine, we don't have much here. I have a big circle to draw, so maybe I make it a rectangle — in my hand notes it's a circle, a more beautiful geometrical shape. So, AI, which for me is the same thing as machine intelligence. I have traditionally used the term machine intelligence; many people on this campus have used it. We have been hesitant, and I am still hesitant, to use the word AI too frequently, because if you do just one of these two or three things, you can hardly say "I'm doing AI". When we as engineers say we are doing machine intelligence, it means we use everything here that works in practice. There are many things in AI that are not working; they are purely theoretical.
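The regression idea above — approximating an unknown f from input/output samples alone — can be shown with ordinary least squares, the simplest function approximator. A minimal sketch, with an invented toy data set where the "unknown" function is secretly f(x) = 2x + 1:

```python
def fit_line(xs, ys):
    # Ordinary least squares for y ≈ a*x + b: approximate an unknown f
    # when all we have are input/output samples.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# The black box here secretly computes f(x) = 2x + 1; the fit recovers it.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = fit_line(xs, ys)
```

Nothing inside `fit_line` knows the true equation; it only sees samples — which is exactly the function-approximation view of AI.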
We don't work with those; we don't care, because we are engineers. You want to do something practical. So of course, there is one aspect of AI that is about statistics — and historically, probability theory is the biggest part. There is no AI without statistics. There is no AI without probability theory. It has to be there, because prediction, estimation, and to a certain extent also inference — Bayesian inference — none of those can we do without probability theory. Probability theory brought us to the moon, in the Kalman filter. It has passed its exam; we know it works; we are using it. So this is definitely part of it. Then we have logic. Under logic we have, for example, decision trees and all their derivatives, and we have things like fuzzy logic. Again, in the past two, three months I have heard, again and again, at some conferences dominated by colleagues who have put all their eggs in the basket of deep learning — a dangerous thing to do, especially for young engineers — that the logic approach is gone, the symbolic approach is gone. I don't think so. I want to keep that stuff. I see it working in my lab; why should I abandon something that is working? Then there are things in the middle, like reinforcement learning, that have little connection to anything else but have been crucial, specifically for engineering design. Do you think we can come up with a Mars rover navigation system without relying on technologies like reinforcement learning? Not going to happen. How much training can you do in Nevada? How similar is that to Mars? I know, it's not red. How can you be sure that when you deploy it in a completely unknown environment, it will do fine? It has to learn from scratch. It has to learn from interaction. That is not possible with neural networks — not shallow, not deep, not medium depth, whatever. You need techniques like reinforcement learning.
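The Bayesian inference mentioned above is, in its simplest form, just Bayes' rule. A small sketch with made-up numbers — the prevalence, sensitivity, and false-alarm rate below are purely illustrative, not real data:

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    # Bayes' rule: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)].
    num = p_e_given_h * prior
    return num / (num + p_e_given_not_h * (1.0 - prior))

# Illustrative numbers: a rare condition (1% prior), evidence that appears
# 90% of the time when the hypothesis is true and 5% when it is false.
p = posterior(0.01, 0.90, 0.05)   # ≈ 0.154: still quite unlikely
```

The counter-intuitive result — strong evidence, weak conclusion — is exactly why inference under uncertainty needs probability theory rather than crisp logic.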
Robotics without reinforcement learning: unimaginable. Not going to happen, because you need the interaction, and networks are static. After they learn, they are static; you just use them. They cannot be adjusted anymore — not the regular way, unless we come up with some tricks. Then we have classifiers and clustering. Again, as I mentioned, we have k-means, we have FCM, we have support vector machines, and we have many others. Please be careful: these are just examples. We have many other clustering and classification techniques that we may not mention, but I think historically and practically we benefit a lot when we talk about these three. Then, encoding: a method like PCA, which is very old, and a method like t-SNE, which is very young. How do I use those techniques for dimensionality reduction or for visualization? Visualization is very important. Retrospectively, we are now rediscovering the power of visualization, because if that black box contains a billion numbers, who can understand that? Nobody. Do you think people will use it if they cannot understand it? You will make this experience when you go out into the real world as an engineer: your customers want to understand the product. They want to know how it works, because they don't want to be just stupid users. They want to know: what is behind it? What does it mean when it says this? So it is important to have visualization techniques like t-SNE. Then we have techniques that are on the border and go a little bit outside: metaheuristics, like genetic algorithms, and — I put these a little further out — swarm intelligence and ant colonies, things like that. The reason I push these a little bit out of AI is that some colleagues may disagree with me and say: no, no, no, they are not AI techniques, they are optimization techniques. Well, here is my argument.
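A minimal sketch of k-means (Lloyd's algorithm) on one-dimensional data, just to show the assign-then-update loop behind the clustering idea; real implementations, for example in scikit-learn, are far more careful about initialization and convergence:

```python
import random

def kmeans_1d(points, k=2, iters=20, seed=0):
    # Lloyd's algorithm: assign each point to its nearest centre,
    # then move each centre to the mean of its cluster; repeat.
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return sorted(centres)
```

No labels are needed anywhere — the structure emerges from the distances alone, which is what distinguishes clustering from classification.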
You cannot have artificial intelligence if you cannot do optimization. Can't happen. You have to optimize the topology of the network, you have to optimize the weights, you have to optimize the batch processing — you have to optimize everything. How can you learn without optimization? No way. And of course, I saved perhaps the best for last: artificial neural networks. Here you have many different things. You have SOM, the self-organizing map, which could also go over here. You have backpropagation networks, also called MLPs, multi-layer perceptrons. And you have deep learning. Now, with respect to the size of deep learning, I am actually exaggerating a lot, so maybe I correct this: deep learning is even smaller than that. With respect to what? With respect to the body of literature: a very small part of AI is extremely successful, and that's why we hear about this small part of AI so much. Of course, nobody wants to ignore it. I don't want to ignore it as an engineer; I want to see what it is about, I want to use it. Sure — but be clear that we have a lot of other stuff, and it is good. It is not bad, not mediocre, not inefficient. It is really good, good for different things and from different perspectives. This picture is by no means complete; it is just based on my subjective evaluation. So when we say machine learning, what is machine learning? Machine learning is this. Machine learning does not include these guys; it includes these guys, and these, and these, and so on. There are some things that we don't call machine learning. For example, we don't call genetic algorithms machine learning. We don't call fuzzy logic machine learning — fuzzy logic does not have any learning; it is static and rule-based. Does that mean they are useless? Of course not. They still have some level of intelligence.
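To show what backpropagation in an MLP actually does, here is a tiny one-hidden-layer network trained on XOR with plain stochastic gradient descent. The architecture, learning rate, and seed are arbitrary choices for illustration, and with an unlucky initialization such a small net can stall in a local minimum — which is itself part of the lesson:

```python
import math
import random

XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_xor(epochs, lr=1.0, seed=1):
    # 2 inputs -> 2 hidden sigmoid units -> 1 sigmoid output.
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # [in0, in1, bias]
    w2 = [rng.uniform(-1, 1) for _ in range(3)]                      # [h0, h1, bias]
    for _ in range(epochs):
        for x, t in XOR:
            # Forward pass.
            h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
            y = sigmoid(w2[0] * h[0] + w2[1] * h[1] + w2[2])
            # Backward pass: squared-error loss, sigmoid derivative y*(1-y).
            dy = (y - t) * y * (1 - y)
            for i in range(2):
                dh = dy * w2[i] * h[i] * (1 - h[i])   # use w2 before updating it
                w2[i] -= lr * dy * h[i]
                w1[i][0] -= lr * dh * x[0]
                w1[i][1] -= lr * dh * x[1]
                w1[i][2] -= lr * dh
            w2[2] -= lr * dy
    def predict(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
        return sigmoid(w2[0] * h[0] + w2[1] * h[1] + w2[2])
    return predict

def mse(predict):
    return sum((predict(x) - t) ** 2 for x, t in XOR) / len(XOR)
```

Comparing `mse(train_xor(0))` with `mse(train_xor(5000))` shows the error shrinking as the gradient updates accumulate; deep learning is this same loop, scaled up by many orders of magnitude.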
But they do not fall under machine learning: if you go to a machine learning conference, there will be no paper whatsoever about fuzzy logic or genetic algorithms. Generally — there are always exceptions. So, just to see the big chunks again: the part of AI that is very successful is machine learning, and within machine learning, what is very successful is deep learning, which is a very small part of what we actually want to talk about. Okay. So we have 10 minutes. That's not good. So let me... We can ask the question: what is intelligence? In the dictionary, people say it is the act of knowing. Which means what — intelligence equals knowledge? Does it? With all due respect to every human, I know many knowledgeable people who are very stupid. Intelligence is not knowledge; it has to be something else. A big part of it is knowledge — you see people say, when somebody knows a lot of stuff: he's smart. What do you mean? He knows a lot. But being smart is something else. Memorizing stuff is not intelligence, and whatever you know, you have somehow memorized it, right? Perhaps you have a sharp sense for capturing stuff, but you have memorized it. So: is knowledge intelligence? A fundamental question for us. The dictionary also says that intelligence is the exercise of the understanding. So now we are saying intelligence is understanding, which makes it more complicated. What do you mean? What should I understand? Because just knowing something doesn't mean you understand it. Everybody knows water boils at 100 degrees. Okay, but do you know why? Can you explain it? Not really; you just heard it. Water boils at 100 degrees — okay, but why? What happens to the molecules? And what happens if you have salty water? I don't know. So you just memorized it. Like a deep network that swallows everything because it is big: it's a small problem and I am a big network, so I know it.
So basically, when we talk about this, we are talking humans versus machines. And on the other side we have two factors: thought and action. Humans use thought to do reasoning. Reasoning is a very complicated process — most of it happens unconsciously, subconsciously — and we use it to act. You reason, and then you act. That's human. Machines make rational decisions. They do not reason the way we reason as human beings; they take rational actions. This is what AI does: we try to make computers and machines and robots take rational actions. Actions that are devoid of any bias or emotions: just look at the situation and take the best action, a rational action based on some calculation. No emotions, no preferences, no cultural background, nothing. AI is concerned with this. Eventually we want to get here: to reason like humans and act like humans. We are not there yet; we will need some years to do that. Okay. So one of the big guys in AI asked the question: can we measure intelligence? It was clever, because everybody else was busy discussing definitions. We are talking about the fifties. Shortly before the Second World War, during it, and after it, governments spent a lot of money to attract young, smart people to build nice weapons that could kill many other people. And one of them asked this question. He said: okay guys, we cannot define it — but if we cannot define it, can we measure it? How can we say a computer is smart? So he said: assume you have three rooms. In one room there is a computer with a keyboard and a human being sitting in front of that keyboard — this is supposed to be the nose of that guy, so this is the bird's-eye view. This guy is the judge. The judge is using a computer, and the computer is connected to two other rooms.
In one of those rooms there is another computer, another keyboard, and another human being operating that computer. And in the other room there is an "AI computer" — whatever that means; what is an AI computer? I don't know, hypothetically. And then the man who asked the question says: we know how smart this AI computer is by how much time it takes the judge to figure out that it is a stupid computer. The judge asks many questions, and the answers come from room A and room B. The judge does not know whether A is the human or the computer, or whether B is; the judge just sees two answers on his monitor, from A and from B, knows that one of them is a computer, and wants to figure out which one. How long does it take the judge to guess that, say, room B is the stupid computer? That time, Alan Turing argued, is proportional to the machine intelligence quotient, MIQ. If it takes two seconds, two questions — okay, it's stupid. Five minutes is better. Turing predicted that by 2000 a computer would have a 30% chance of fooling a human for five minutes, and he was not far wrong. I guess we are right now at around 20 minutes for the major systems that I know — I may be wrong a little, maybe 25 or 30 minutes — but in 2000 we could not yet fool a human for five minutes. So what does that mean from today's perspective? This is the Turing test. I will upload the seminal paper by Alan Turing — the man who helped his own government during the Second World War to decrypt the encrypted messages of the Nazis. And as a reward, we tortured him, we discriminated against him because he was homosexual, and then 50 years afterwards we pardoned him. Lesson learned: don't work for governments. Companies pay a lot, and if they don't like you, they just fire you; they don't torture you.
So even that could happen, to Alan Turing. And from today's perspective, the Turing test is still relevant, because it tells us one thing: as long as the ultimate validator of AI is a human being, there is no strong AI — there is always a human being who decides whether the computer is stupid or not. For AI to become independent, to move out of humanity's basement, it has to go beyond the judgment of humans. Then we can talk about artificial general intelligence, strong AI. We are years, decades, I would say centuries away from that. I don't think we will get there soon, and I am not in a hurry at all to get there, because look at human affairs: what do we do with technology? The first immediate idea is: how can we create robot soldiers that go and kill everybody? Why not just an app that people have fun with, one that mistakes the dog for grandpa and gives everybody a laugh? I don't know why people always want the serious, violent applications. There is something else that I want to talk about next time, because then somebody came along and said: no, no, no, the question is not how we can measure it. The question is: is this real intelligence? And then we take it one level higher. So next time we will continue with this, and then hopefully we jump into encoding and embedding. We also have the first tutorial next time, which I will use to go into the details of the paper project. Okay, see you then.