Okay, good afternoon, everybody. We talked a little bit about what people call fuzzy logic, and we want to dedicate this lecture to it because it offers us one mechanism that we use a lot, which is inferencing: how can I use existing data to infer new knowledge? So we want to cover that today. Hopefully we get a good idea about it. Fuzzy logic was supposed to be computing with words. I guess we are not there yet, mainly because our computers cannot process words. The natural language processing that we do with TF-IDF and LSTM and all that is nice and good, but we are not really processing words. Processing words is a different thing. For that, you need a special type of computer. We don't have those computers, because if you had them, you should be able to process linguistic variables, that is, variables whose values are not numbers but words or sentences in a natural or artificial language. That's in one of the first three papers in which the father of fuzzy logic established the concept of linguistic variables, which in the mid-seventies of the past century was something outrageous. So what is a linguistic variable? So I want to work with numbers. A linguistic variable is a collection of five things: a variable name, a set of terms T of x, a universe of discourse, a set of syntax rules, and a set of semantic rules. So that's your variable name. This is the set of terms. This is your universe of discourse. These are your syntax rules. And these are your semantic rules. Usually when we are working with variables, I just have a variable name and it has a value. That's it. But apparently linguistic variables are more complicated. The main motivation for fuzzy logic was that we convey a lot of information through language, and human language is highly sophisticated: it's compressed, it's granular, it's ambiguous. If I tell you a little bit to the left, what does that mean?
What is a little bit? Somebody is not that old. What is that supposed to mean? We do that on a daily basis and we don't have a problem with it. Humans communicate with the vagueness of the language. But apparently, if you want to put it into a computer, you have to come up with something that makes it more accessible to computers. So let's say we are talking about temperature. Temperature is something we understand. It's intuitive. We can feel it. There's no problem with that. When we describe temperature as intensity, we use many different hedges, linguistic hedges. We say hot, warm, cool, very hot, extremely hot, and so on. Let's say I will go with very cold, cold, cool, warm, and hot. So these are the five terms that I'm using to describe temperature. Of course, as humans we don't do that. As humans we are very flexible. We play with many adjectives, with many intensities. We combine them. We are quite creative with language. So in Canada we are used to what? Minus 30 to plus 30. It's actually more drastic than that, but let's stick with minus 30 to plus 30. So that's the range of temperatures that we see, let's say. I intentionally made it symmetrical just to make my diagram more beautiful. Usually things in reality are not symmetrical, not linear; they are nasty. And then... if I go for very cold, for example, it could be... let me just start from the center. If I put a triangle around zero, that means: if you are at zero, this is a cool temperature for us in Canada. If you go to Saudi Arabia, zero is not a cool temperature. It's an extremely cold temperature because people are not used to it. So you see the subjectivity of measurement. When you use linguistic variables, subjectivity flows in.
And then suddenly one word means something here and means something else in another culture because of the environment, circumstances, whatever. It becomes really human. And let's say then I have cold, and cold is something like this, and then very cold is something like this. So let's say this is minus 10 degrees and this is minus 20 degrees and this is minus 30 degrees. And I can go symmetrical. Again, I'm making my life easy. I'm assuming that the perception of temperature by humans is symmetrical and nicely distributed, which it is not. And then I have warm temperatures perhaps, and I have hot temperatures of some sort. So let's say 10 degrees, 20 degrees, and then I have 30 degrees at the end. Okay. What does that mean? Let me go to the end and then I come back and explain. First of all, this is your linguistic variable. The linguistic variable is temperature. And this is our set of terms, which apparently I chose subjectively. I chose them to be five. It could be seven. It could be nine. Why is it uneven? Because there has to be something in the middle. There is always a middle, logically. So it has to be five, seven, nine, 11, whatever. So I can start with extremely cold, very, very cold, very cold, cold, still cold, cool, not so cool, a little bit warm, warmer, warm, hot, extremely hot. You play with it. How much accuracy do you need? It's up to you. You pay the price, because any function that you add, you are adding to the computation. So if you measure something, let's say you measure minus, whatever, 23. If you measure minus 23, you see... So this is very cold, cold, cool, warm, hot. If I look at minus 20, I see that minus 20 cuts two sets, very cold and cold, which means minus 20 is regarded as very cold and cold at the same time. But minus 20 is not cool anymore. It's not warm. It's not hot. So if I write it, minus 20, in the language of Zadeh, will be...
So let's say this is 0.75 and this is 0.4, let's say. I don't know. So it would be 0.75, 0.4, 0, 0, 0, which is very cold, cold, cool, warm, hot. Okay. Of course, in order to assign this temperature to some terms, you need some syntax. You need some syntax rules to say what is what. If I say temperature, which part of it is very cold? Which part of it is warm? And to come from the term to the actual value, you need some, sorry, not syntax, some semantics. So what is the meaning? What is the meaning of very cold if I want to measure it? Yes. They should add to one. Maybe I used something unnormalized, but they should add to one. You're right. But you don't have to. It makes life easier, but you don't have to. If I draw something really beautiful like this, symmetrical, and this is one, they should add to one, unless they are nonlinearly distributed. So, okay, and when I come here, see, these are my membership functions and this is my universe of discourse. So this is the set of all values that the variable can take. So this is actually the fuzzy set, and these are the fuzzy subsets. People mistake that. They show you a fuzzy subset and talk about a fuzzy set. Okay, so what? You know what you just did by defining a linguistic variable, by giving the variable name, a set of terms, some syntax rules, some semantic rules, and the universe of discourse? Here, you have a word. Here, you have numbers. This is what fuzzy logic does. Fuzzy logic gives you a bridge between words and numbers. Why? Because at the moment our computers are stupid and they cannot process words. So whatever you give me in terms of linguistic information and linguistic knowledge, I have to convert it to some numbers such that the computers can process it. And in general NLP, we usually don't include fuzzy logic, because fuzzy logic is the enfant terrible of AI. Still, many of us in AI don't like fuzzy logic. I don't know why.
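To make the bridge from a number to the five terms concrete, here is a minimal Python sketch of the temperature example. It rests on assumptions that are not on the board: the triangles are taken to peak every 15 degrees over the range minus 30 to plus 30, and the names `triangle`, `TERMS`, and `fuzzify` are hypothetical. With this evenly spaced layout, the degrees of neighbouring terms add to one, which is the normalized case mentioned in the discussion.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 1 at `peak`, falling linearly to 0 at `left` and `right`."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed layout (not from the lecture): peaks every 15 degrees on [-30, 30].
TERMS = {
    "very cold": (-45.0, -30.0, -15.0),
    "cold": (-30.0, -15.0, 0.0),
    "cool": (-15.0, 0.0, 15.0),
    "warm": (0.0, 15.0, 30.0),
    "hot": (15.0, 30.0, 45.0),
}


def fuzzify(temperature):
    """Map one crisp temperature to a membership degree for every term."""
    return {term: triangle(temperature, *abc) for term, abc in TERMS.items()}


degrees = fuzzify(-20.0)
print({term: round(mu, 2) for term, mu in degrees.items()})
```

Under these assumed triangles, minus 20 belongs to very cold and to cold only (about 0.33 and 0.67), and to cool, warm, and hot with degree zero. The exact numbers depend entirely on how the triangles are drawn, which is the subjectivity the lecture points out.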
There's some weird stuff going on. And maybe it has something to do with, whatever. So we don't want to see this as part of NLP. I mean, it's not, I would say, because it gives you something actually quite unique. At the moment, we cannot really exploit what fuzzy logic offers because our computers cannot process words. So, at the end of the day, fuzzy logic can be as accurate as you want and only one thing changes: the size of this set of terms, that is, the number of membership functions of the fuzzy subsets. If you want to have a super precise answer, you go with 111 membership functions, not with five. We will see an example today showing that it costs you a lot of computation if you increase the number. But fuzzy logic was actually one of the first techniques that gave us the choice. How much accuracy do you need? Not much, I just need a rough estimate? Okay, go with five or seven membership functions. No, no, I need a high-accuracy result? Okay, go with 25. So, we said that there is a difference between fuzzy sets and classical sets, which are a subset of fuzzy sets. Maybe that's what people don't like. Because the classical sets that we know, the yes and no sets, the Boolean sets, the true and false sets, are a special case of fuzzy sets. And if you have been working for 50 years using a theory and then somebody comes and says, your theory is a subset of my theory, people get upset. Most of it is ego. We get over it. When the guy who invented it has died, 20 years later we realize: oh my God, he was so smart. Okay. So, we said that the classical set is the set of variables x such that x, of course, comes from the universe of discourse, X. And x is something. X is delta. X has some property. So that's how classical sets are defined. Anything you come up with is a set. Anything can be understood and formulated as a set. So x comes from temperature and x is a very hot temperature. X comes from the set of all men and x is a very young man.
Anything you want. So, anything you want is a set. Now, if I want to define a fuzzy set, we sort of hinted at this. First of all, you cannot just go with x. You also need to give me a membership function, mu A of x, the membership of x in the set A. So for every element that you are giving me, you have to give me a number between zero and one. Because in the classical set, everything that you list is a member. Everything that is not there is not a member. Why do you think digital technology is everywhere? Because it's so simple. If I list it, it's a member. If I don't list it, it's not a member. Here, for every x, I have to tell you to what degree it's a member. That must be exhausting. Yeah, but if you want to have a more realistic model of the world, maybe you have to do that. And again, of course, x is still coming from the universe of discourse, and the membership function, mu A of x, belongs to the closed interval from zero to one. And we talked about this. We violated two rules. Blasphemy, bad people, we don't use this theory, and so on and so on. So the membership function mu A quantifies the degree of belongingness, if this is a word, of x. Now, since A is now a linguistic variable, we can modify its meaning. Okay, what is that supposed to mean? Well, we talked about it: mu A, which in dual, binary, Boolean logic was a characteristic function, is either zero or one. Either you are a member or you are not a member. Here, mu is a matter of degree. So you can get any number between zero and one. And if you're looking at the set of old men, hopefully I'm not a member of that set with a membership of one. Hopefully you cut me some slack and say, okay, you belong to the set of old men with a membership of 45%. The rest is open, you can still feel young. But if that set is regarded as a fuzzy set, then it's a linguistic variable. And any linguistic variable can be changed. The meaning can be changed. You say, hot and very hot.
Old and not very old. Look, in classical logic, if you have a statement, if you have knowledge about old people, that's all you know. If somebody says, I know something about not very old people, you have no clue what he's talking about, because you can only talk about old people, not very old people, not extremely old people, just old people. We don't have any way of modifying the meaning of those words because in classical logic they are not linguistic variables. They are binary functions. Okay, so if I look at my 001010101, let's say this is my A. This is my fuzzy set, or fuzzy subset. Whatever that means: it means sunny day, it means a young woman, it means bright light, it means hot temperature. It doesn't matter what it means. We can dilate this function. So, dilation. Or we can concentrate this function. So, concentration. And then hopefully that means something. So mu A concentrated of x will be, for example, mu A of x raised to the power 2. Because the numbers are between 0 and 1, if I raise them to the power 2, I suppress them. I compress them. So, big deal? What do you think probability is? You just count and divide by the total number. Big deal, but see how much we do with that philosophy. A lot. Then mu A dilated of x would be mu A of x raised to the power one half. Because the numbers are between 0 and 1, if I take the square root of that number, the numbers will be higher. So I dilate the values. Okay, what does that mean? Let's say mu A is known. For example, bright, old, cold, whatever adjective you want to use. So you know mu A. You have some measurements. You know what the perception of people is. Who is an old guy? You know what people perceive as extremely cold in Canada? Minus 55. That's extremely cold. So you know that. Okay, if I know that, what is the membership of, sorry, what is the membership of very A? Whatever A is.
So very bright, very old, very cold. What is that membership? You can do it as the concentration of A, with concentration being defined as this, raised to the power 2. Why raised to the power 2? It works. It doesn't contradict my intuition. I'm going with it. Deal with it. But it could also be 1.9, yeah? Or 2.1, yeah? What's your point? So you have a lot of degrees of freedom, yeah? Like reality. In reality, you have a lot of degrees of freedom. Try to get two people together who agree on what exactly it means to have a pleasant temperature. Is it 21, is it 22, is it 23, is it 24? There is a lot of vagueness. What is the membership of more or less A? Well, that could be the dilation of A. What is the membership of very, very A? Well, that could be the concentration of the concentration of A. So what is very, very old? I take it to the power 2 again. So if my membership in the set of old people is 80%, my membership in the set of very old people cannot be more than 80%. It has to be less. And my membership in the class of very, very old people has to be even less than that. So it's all about subjective perception. Of course. What did you think? Fuzzy logic is about how we can model human thinking. And human thinking is a mess. It's a lot of ambiguity, a lot of granularity, a lot of vagueness, a lot of subjectivity. I say this. You say that. Nobody can agree. We do some statistics: 23 is the perfect temperature, plus or minus 8 degrees. So a lot of variation. So what about the membership of not very A? Then I would go with 1 minus the concentration of A. Not very A. So I'm slowly getting an idea. Oh my God. If I understand normalized functions as fuzzy subsets, I get a lot of flexibility. I still don't know what it is good for. I can write it on the board and just feel good about it. But if this is going to be useful, we have to be able to do something substantial with it.
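The hedges discussed so far are one-line operations on a membership degree. A minimal sketch, using the exponents 2 and one half from the lecture and the 80% membership in "old" as the worked number; the function names `very`, `more_or_less`, and `not_` are hypothetical labels chosen for this illustration.

```python
def very(mu):
    """Concentration: squaring a degree in [0, 1] pushes it down."""
    return mu ** 2


def more_or_less(mu):
    """Dilation: the square root of a degree in [0, 1] pushes it up."""
    return mu ** 0.5


def not_(mu):
    """Complement of a membership degree."""
    return 1.0 - mu


old = 0.8                              # membership in "old", as in the lecture
very_old = very(old)                   # 0.8 squared
very_very_old = very(very(old))        # concentration of the concentration
more_or_less_old = more_or_less(old)   # square root of 0.8
not_very_old = not_(very(old))         # 1 minus the concentration

print(round(very_old, 4), round(very_very_old, 4),
      round(more_or_less_old, 4), round(not_very_old, 4))
```

The ordering comes out as the lecture argues it must: very very old is below very old, which is below old, which is below more or less old. Swapping the exponent 2 for 1.9 or 2.1 changes the numbers but not that ordering.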
If I cannot do, I don't know, function approximation, classification, some sort of learning, if I cannot do that, I don't care that it looks nice. And it's amazing that suddenly from old, I can say very old, more or less old, very, very old, not very old. From a logical perspective, that was a revolution in the 60s, that suddenly you can do this. What about the mu of more A? More A, for example: you want to say older. Not the set of old men, the set of older men. Well, you could take A and raise it to the power 1.25. Where do you get that number? I don't know. I didn't go out at weekends, I sat down and played with numbers. This is one of the major criticisms of fuzzy logic. Where do you get those numbers? What is the mu of less A? Well, I can go in the other direction: A raised to the power 0.75, whatever the membership is. So we can play with this, and at the moment you may say, I have no idea where you are going with this. So what? Well, most likely we can do something useful. But then I have to go back to Aristotle again. Let's go back to Aristotle. I don't mean literally. Go back to the old wisdom. Yes. Sorry? Where is it coming from? Just a Gedankenexperiment. It makes sense. Because I know it has to be more or it has to be less, I can come up with 1.15. Still OK. It's one of the criticisms of fuzzy logic that there is no deterministic way of getting those things established. So there is a lot of them. Yes, there is a lot of them. Yes, I mean mu of A. So why did I write it that way? Because some people do that. I should actually write mu of A, but I just did it that way, because a fuzzy subset is represented through its membership function. So, going back to the roots: modus ponens. Now we want to get serious. For almost 2,000 years, we didn't have anything but modus ponens. There was no AI, there was no SVM, there was no deep, there was no shallow. It was just modus ponens.
So you have a rule and you say: if A is true, then B is true. Where does the rule come from? Well, many centuries, painful life. Whales swim from the Hawaiian islands to Alaska, 5,000 kilometers, not eating anything for three to four months, to get there in May because they know there will be food. How do they know? If it is spring, there is food in Alaska. Okay, they figured it out over millions of years. There are some rules. And they get there, barely surviving. And then there is an abundance of herring. Oh my God, they are delicious. For two months they eat and eat and eat, and then they head back 5,000 kilometers toward Hawaii. They have a good life. Observation. So you have a rule. Because of life experience, you have a rule. And then today you make an observation and you see that A is true. And then you draw a conclusion. You say B is true. Our philosophy for 2,000 years was basically based on this. Modus ponens. You have a rule. You make an observation, which has to be compatible with reality, and based on the rule that you are using, you reach a conclusion. If students study, they will be successful. Students are studying. Students will be successful. Sounds cheesy. But this is one of the rules. Is that a rule? Or is it something we have invented? I don't know. So there is another one which may not be used that much. Modus tollens. Again, you have a rule. Of course, when we say you have a rule, you don't have one rule. You have many rules. So, rule: if A is true, then B is true. Some people want to make us believe that you don't need this anymore because we have deep networks. I love deep networks, but we still need this. If you want to do comprehensive AI, you still need this. Then you observe B is false. And then the conclusion is A is false. That is: if students study, they will be successful. Students are not successful. Students did not study. Sounds simplistic, doesn't it? So one of the things is this.
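Written as inference schemes, the two classical rules just described look as follows. The notation (premises above the line, conclusion below, the arrow for "if ... then", the negation sign for "is false") is a standard textbook convention rather than something taken from the board, and `\text` assumes the `amsmath` package.

```latex
\[
\frac{A \Rightarrow B \qquad A}{B}\ \text{(modus ponens)}
\qquad\qquad
\frac{A \Rightarrow B \qquad \neg B}{\neg A}\ \text{(modus tollens)}
\]
```

With A as "students study" and B as "students are successful", the left scheme is the first example and the right scheme is the second.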
You can say: if A is true, then B is true. That doesn't mean that you know more. This is the only thing that you know. If you use Boolean logic, you don't know what happens if the observation is not "A is true" but "very A is true". So you have the rule: if somebody is old, then Alzheimer's is very likely. But then I make the observation that somebody is very old. I cannot draw any conclusion. Don't think about yourself. You can draw a conclusion. I can draw a conclusion. The computer cannot draw any conclusion, because my rule is for old people, not for very old people. Because everything is a matter of yes and no. There is no gradation. Okay. I thought AI is about function approximation, wasn't it? We said that many times. AI is function approximation. Yeah, sure. Everything is function approximation. Logic is function approximation. So AI is function approximation. Now, we can use decision trees, DTs, for function approximation. Some people may say misuse decision trees for function approximation, because they aren't really meant for that type of functionality. Yes? No, this is the same rule. If A is true, then B is true. Then you observe that B is not true. Then you have to conclude that A was not true. So the premise was wrong. We don't usually use modus tollens a lot. Most of the time we work with modus ponens, which is more aligned with our logic. So if you have a space, X1, X2, X3, X4, X5, and I have Y1, Y2, Y3, Y4, Y5, then this is my input-output space. Every time you reach a decision using a decision tree, you basically find a point inside one of these squares, because decision trees are discrete. So I said, if X1 is this, and X2 is this, and X3 is this, and X4 is this, then Y is this. Y is Y1, Y is Y2, and so on.
If somebody had magical power and would show us the actual function that describes the relationship between input and output, which is all we are after with AI... What is that magical function that describes the relationship between input and output? If somebody had that magical function, you would see that you are getting something like this. And if we could get this, you would, of course, see that you have a lot of error in your approximation when you are using decision trees, because decision trees give you discrete values. Let's say I simplify this to the center of those squares. But the actual function that describes the relationship between temperature, humidity, wind, and whether the tennis club is open or not is a very complicated function. It's a nonlinear function. It's a function with a lot of curves. So if I use decision trees for function approximation, it would not be a good approximation. Why not? Because they are discrete. Fuzzy logic is not discrete. It gives you values between zero and one. So the expectation is that this is going to be much more accurate. So if I give the same problem to fuzzy logic, then I get something like this. I have to draw in reverse order. Let me see. I can do this. Five. So I'm drawing them in the other direction. So this is x1, x2, x3, x4, x5. And here I have y1, y2, y3, y4, y5. So now I'm not discrete-valued. I have functions that move between zero and one. It could be understood this way, not that it is only this way: I want to come up with a version of decision trees that can work with numbers between zero and one, not just discrete values, yes and no. Is it either a square or is it a circle? I don't care what else it is. Now, if I draw roughly, don't take this too seriously, because I'm not trying to really be accurate here.
So if I try to somehow draw the same function here, if I approximate this with fuzzy logic, I will get a point here, a point here, a point here, a point here, a point here, a point here. And if I connect them, this is a much more realistic version, a much better approximation than the decision tree. Of course, my drawings are not accurate or to scale or anything. I'm just making the point, at the global level, that if I go from scalars to functions, I gain more information. Of course you do. You go from one number to a function, of course you do. Yes, we do, but how are you going to process functions? Because every function that we have is operating on scalars. So it's not that easy. That's the magic of fuzzy logic. So you are processing rules like this. If x1, then y1. You process it, you get a point. If x2, then y2. So you get a point here. Hopefully we can go to an example to see how that works. And so on. So I need a set of rules. A set of rules, and nobody actually sees this picture. All you see is that somebody gives you a set of rules. Or even worse, nobody gives you rules. You just get that Excel file with some numbers in it. Okay, how can I extract rules from some numbers? Well, easy. K-means. Apply K-means. Get the clusters. Construct functions around them. Voila. You have your rules. It's very simple. It's not really difficult. So what does that mean? How can we use this in a systematic way? This is very high level. Your inputs come in, whatever the inputs are, and then you get the first block. In this first block, you do some sort of coding. In fuzzy systems this coding usually happens with some sort of membership functions. So the input comes in, and you put it through some membership functions. This step is called fuzzification. So you fuzzify your input. The example was that we said, what was it, cold temperature? No, it was minus 22 degrees. It was 0.75.
Let's make him happy. 0.25, 0, 0, 0. That's the encoding. So minus 22 degrees becomes encoded as a vector. In this case, five numbers, because I chose five membership functions. Why did you choose five membership functions? I don't know. I felt like it. I can do seven, nine, 11. So that's the encoding part, which means you don't deal with minus 22 anymore. Minus 22 degrees is for you now this vector, which shows how nasty things can get for fuzzy systems, because you have to process a lot more information. For the first 20 years, we couldn't actually do any fuzzy inferencing because there was no hardware for that, and you couldn't run it on a regular PC, until some Japanese companies came up with hardware, and then suddenly you could buy a card that could process 10,000 rules per second. Nowadays, no problem. You can do millions of rules in a second. So after you code stuff, you go somewhere where you have a rule base. You have rule number one, rule number two, rule number N. If X1 is this, then Y1 is this. If X1 is this, and X2 is this, and X3 is this, then Y3 is this. If you eat unhealthily and you smoke and you are old and there is history in your family, then you're susceptible to this type of disease. It's a rule, so I can put it in. If the government messes up, there is political conflict, and the oil price goes down, then the stock market behaves this way. So, rules. Apparently those rules are not straightforward, and apparently they don't take one variable. They take many, many variables. The last part is a decoding part. And this middle part, by the way, we call inference, which is a magical word that was actually dying when, in 1965, fuzzy logic emerged, and then fuzzy logic basically revived the word inference. You can infer knowledge.
And the decoding, we call it defuzzification. In the defuzzification you have multiple membership functions, and what it basically does is give you a combined value of the output functions and say, this is your answer, and you get some number, whatever. I don't know, 12.5. You are controlling something, whatever that is. So this would be your output. Actually, the output of a fuzzy system is a function like this, but since our computers don't understand that, we have to convert it to a number. So we use some tricks to convert the cut surfaces underneath the functions into a number. Okay. So I want to have an engineering example for this. How does this work? Especially because the most successful application of fuzzy logic was control. So, designing controllers: what have you learned? The P controller, the PID controller? So now I challenge you. Any problem, pick any problem. This weekend, we come together. I get a MATLAB or Python compiler. You get a MATLAB or Python compiler. Somebody else grabs a problem to control, and I give you a perfectly designed controller with fuzzy logic in five minutes. Smooth. Nice. Efficient. This is amazing. As an engineer, you cannot not love fuzzy inferencing. So how does that work? For the example, of course, we go back to the inverted pendulum. We didn't really do anything with reinforcement learning there. We just defined the reward function. We didn't do it because the design in reinforcement learning is a pain in the neck. You cannot do it in five minutes on the board. It takes some time. So we didn't go into it. So again, I have a cart that can go back and forth. And my pendulum is here. And the pendulum can go to the left or to the right. Problem analysis: what are the inputs? First, the error, which is the difference in angle. Anything that deviates from 0 or 90 degrees, if you want to put it at 90 degrees, anything that deviates from 0 or 90 degrees is an error. I want to keep it that way.
Second, omega, or angular velocity, because there is a velocity in that. So the angular velocity: how fast is it falling? Then I need to know the output. What is the output? The output will be, let's say, current, plus or minus for direction. So either I put plus and I go to the right, or minus, some small amount of ampere, to go to the left. So I want to control the pendulum, right? Go left and right. So I know my inputs. I know my output. Which is the beginning of every AI project: what are your inputs? What is the output? So, OK. Then the first step is fuzzification. So what is the error? The error is something that goes, let's say, from minus 30 degrees to plus 30 degrees. And in the middle is 0 degrees. And here is minus 15 degrees. Here is plus 15 degrees. So I'm assuming that if you go to minus 30 degrees or plus 30 degrees, it is too late. We cannot save it. It's gone. Actually, around 27 degrees it is already too late. So this is, of course, 1. And this is 0. This would be for us zero. Zero error. Then this would be for us negative small. And this would be for us negative medium. I'm intentionally using those acronyms or abbreviations because they are very common in the engineering design of fuzzy controllers: Z for zero, NS for negative small, NM for negative medium, NB for negative big, and so on. On the other side, I have positive small and I have positive medium. We are just putting a label on it, so this is the set of terms of the linguistic variable. I have to put a name on it such that I can work with it. So put any name on it. So zero for me is anything around here. See, if it really is 0, the membership is 1. If it is minus 1 or plus 1, it's still regarded as zero with a likelihood or membership of 0.999. So that's the tolerance that we want to have. Then we have the other input, which is angular velocity. Let's say angular velocity goes from minus 10 degrees per second to plus 10 degrees per second. And here in the middle is 0 degrees per second.
And then we have here minus 5 degrees per second. And here I have plus 5 degrees per second. So what is the angular velocity that I measure? And I do the same thing, 1, 0. I want to keep it simple. So zero is this, and this is negative small and this is negative medium. This is positive small. This is positive medium. I'm keeping the labels the same just for the sake of simplicity because they are numbers. And then I need the current as my output. So how should I control? I need the current. The current goes from minus 10 milliampere to plus 10 milliampere. In the middle is 0 milliampere. And then here I have minus 5 milliampere. Here I have plus 5 milliampere. Everything's symmetrical. This is 0, 1. And zero is again this. Negative small. Negative medium. Positive small. Positive medium. I'm keeping the labels the same. OK. So what? No, I don't know what you want from me. When I was working with a PID controller, I knew what was going on. I had some equations. The world had a structure. I was a happy guy. At the moment I'm confused. I don't know, OK, what, what. Well, we need somebody with experience who can give us some rules, because fuzzy logic does not work without some rules. So let's say we have no clue how to get rules. A senior engineer comes and says, look, relax. You are new here. I give you some rules this time. But next time you have to figure it out yourself. So what are the rules? The senior engineer will give us a small table and say, look, this is the error. And this is your omega. And this is the current, the output. If your error is zero and your omega is zero, then the current is zero. Is that clear? You are a smart guy. You say, yeah, I get that. Nothing to control. Everything is OK. If the error is zero and the omega is negative medium, then the current should be positive medium. So it's falling this way. You have to go in the opposite direction with the appropriate level of current. So if it is falling fast, you have to push.
If it's falling slowly, you have to push a little, which we control with the level of current. So is it 5 milliampere or 10 milliampere? In which direction? If the error is zero, but the omega is positive small, then the current will be negative small. Because again, we want to work against that in the opposite direction. If the error is positive medium and the omega is zero, then the current is negative medium.
And the senior engineer asks me, look, how many rules should I get for that problem? Can anybody tell me? How many rules are there? Five? Five cubed. Five times five times five. So if the error is negative medium, omega can be this, this, this, this, or this. And in any case, the current can be this, this, this, this, or this, just for negative medium. So it's five times five times five. So there are a lot more rules, but the guy gives me just four rules. He says, you should be fine. That's a risky business, but I don't have a choice. I'm new in the company. I have to figure it out. So next time I will write a program and I will get all the rules, and some of them are don't-care rules. Some of them are theoretically possible, but practically don't make sense.
So, okay. The senior engineer is taking care of us. He says, I give you an example, you calculate this, and then I come back tomorrow and take a look at what you did. And he says, look, I give you input measurements. The error is 27 degrees and the omega is minus 1.5 degrees per second. What should be the current? Okay, now this is an engineering problem. Now I understand the world. Now I am in my element. Okay, good. So I have an input. I have to calculate an output, but I don't have any equations. I don't have a PID controller. I just have some weird-looking functions that some people call fuzzy subsets. And he gives me one rule and says, look, process this rule. I want to make life really easy for you.
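The four rules handed over so far fit in a small lookup table. The following is only a sketch: the names `TERMS`, `RULES`, `consequent` and `uncovered_premises` are made up for illustration, and the labels NM, NS, Z, PS, PM are shorthand for the five terms on the board. Each of the 25 (error, omega) premises could in principle be paired with any of the five current terms, which is where the five times five times five count comes from.

```python
# Terms shared by error, omega and current (hypothetical short labels):
# NM = negative medium, NS = negative small, Z = zero,
# PS = positive small, PM = positive medium.
TERMS = ("NM", "NS", "Z", "PS", "PM")

# (error term, omega term) -> current term: only the four rules given so far.
RULES = {
    ("Z", "Z"): "Z",
    ("Z", "NM"): "PM",
    ("Z", "PS"): "NS",
    ("PM", "Z"): "NM",
}


def consequent(error_term, omega_term):
    """Return the current term for a premise, or None if no rule covers it."""
    return RULES.get((error_term, omega_term))


def uncovered_premises():
    """List the (error, omega) premises that the four rules say nothing about."""
    return [(e, w) for e in TERMS for w in TERMS if (e, w) not in RULES]


print(consequent("PM", "Z"))
print(len(uncovered_premises()), "premises still have no rule")
```

A table like this makes the gap visible: most premises have no rule yet, which is why relying on four rules is called a risky business.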
I know that if you really want to infer knowledge with fuzzy logic, you have to process all rules at every iteration. So whenever you make a measurement, you have to process five cubed rules to make sure that your controller stays stable. But he is not tough on me. He says, look, this is the rule: if error is positive medium and omega is zero, then current is negative medium. This is the rule that a nice colleague gives me. He says, don't worry about the rest. I know that you have five cubed rules. Just show me that you have understood the problem by processing one rule. Because if you can process one rule, you just put it in a for loop and you process all five cubed rules. It's not a big deal.
Okay. So if I look at the error, 27 degrees will be somewhere here. Right? And I will cut these two functions, and you see that I'm not drawing the other ones because for this measurement they are irrelevant. We say that they are not firing. If I get a measurement, not all of the rules are applicable to it. They don't all fire at the same time. Some of them are applicable, and we say that they fire. So I'm just looking at these last two that are applicable because I'm looking at 27 degrees. I don't care about negative medium, negative small and zero because they have no effect. They are zero for my calculation. So let's say this is 0.8, and this other one belongs to another rule. Because you said positive medium, and this here is positive medium. I'm not dealing with positive small because the guy did not give me any rule for positive small. I could deal with it, but he didn't give me the rule. I don't know what happens for positive small. Then I look at the omega. I get these two membership functions because minus 1.5 degrees per second is somewhere here. So what is this?
This is negative small and zero. And I cut these two. The rule says omega is zero, so I'm interested only in this guy. The other guy belongs to another rule that I don't have. And this guy, let's say, is 0.7. So this is the zero for error, this is the zero for omega, and this is the zero for current. So we are doing manual inferencing, processing just one rule. In reality, you have millions of rules. It gets really nice if you have many rules and they are nicely designed and they are multi-variable, high-dimensional. It gets really sophisticated.
So you said AND. AND in fuzzy logic is the minimum. You can do it another way, but to mathematically model the logical AND we use the minimum. Why? False and false is false, so the minimum of zero and zero is zero. True and true is true, so the minimum of one and one is one. False and true is false, because the minimum of zero and one is zero. It makes sense. Okay. So I use the minimum to model AND. So "error is positive medium and omega is zero" comes down to the minimum of 0.8 and 0.7, which is 0.7. Which means, by modus ponens, the observation that error is positive medium and omega is zero is not simply true or false. In fuzzy logic it is true to a degree of 0.7, to 70 percent. It is not 100 percent true. It is not 100 percent wrong.
Okay. But now for my controller I need to calculate the current, and the rule tells me the current is negative medium in this case. Where is negative medium? This is negative medium. So now I have the current, and negative medium is this, and I don't care about the rest because I don't have a rule for the rest. And now I cut this at 0.7. If the premise is valid at 0.7, if what I'm measuring is valid to 70 percent, I can only conclude with 70 percent as well. I do not have magical powers. I can only work with the numbers that I get. So I will cut it at 70 percent: if error is positive medium and omega is zero, which holds to 0.7, then the current is, what was it? Negative medium.
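The manual steps above (fuzzify both measurements, combine them with the minimum, cut the consequent at the resulting degree) can be sketched in a few lines. This is an illustration under assumptions rather than the controller from the board: the triangular shapes, with peaks at the marked values and feet at the neighbouring peaks, and every function and variable name are assumed here.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at left and right, 1 at peak."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Measurements from the example.
error_deg = 27.0
omega_dps = -1.5

# Fuzzification (assumed shapes): "positive medium" error peaks at +30 degrees,
# "zero" omega peaks at 0 degrees per second.
mu_error_pm = triangle(error_deg, 15.0, 30.0, 45.0)
mu_omega_z = triangle(omega_dps, -5.0, 0.0, 5.0)

# AND is modelled with the minimum.
firing = min(mu_error_pm, mu_omega_z)


def clipped_current_nm(x_ma):
    """'Negative medium' current (assumed peak at -10 mA), cut at the firing degree."""
    return min(firing, triangle(x_ma, -15.0, -10.0, -5.0))


print(mu_error_pm, mu_omega_z, firing)
```

With these assumed shapes the two memberships come out at about 0.8 and 0.7, matching the values read off the diagram, and the consequent can never exceed the degree to which the premise holds.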
So this region is the answer of the fuzzy controller for you. This is the answer. But I cannot give this to my supervisor because we need a number. Right? Because this is minus 10 milliampere. So how much is this? There are many different ways. For example, if I calculate the center of gravity of this area, it will cut it here, and this will give you minus 6.3 milliampere. If you really draw these functions carefully and accurately, you will get that. So this step that I did was the defuzzification, because that red part is the fuzzy answer of the controller, but I need a scalar. So I did a very simple trick: you are giving me a surface, I calculate the center of gravity of that surface, and I use it as a crisp number, as a scalar, because my controller needs a number. I'm pushing that cart back and forth. I need a number. So now if you are measuring 27 degrees on the inverted pendulum and the angular velocity is minus 1.5 degrees per second, you have to push it to the left with 6.3 milliampere and you're going to be fine.
So I guess it was 1987 that the city of Sendai in Japan opened the first fully automated subway system that was completely controlled by fuzzy logic. And as one of the things that they did, they put a cylinder, an aquarium full of water, in there, and the level of the water was marked with red, and the subway would accelerate, accelerate, stop, go, and this level would not change. Nothing can impress an engineer who knows what a controller does more than that. How can you accelerate and stop like that? Because what is the challenge? When you go from station A to station B you have to accelerate as much as you can because you want to be on time. But at some point between the stations you have to stop accelerating and start decelerating because you want to stop smoothly.
Who likes it if the stop is abrupt? Nobody likes that. Stupid engineering design. It has to be smooth, and it was more than 25% more efficient in energy. And this is the dream of every engineer, although the guy was a computer scientist: the day that they opened that Sendai subway system, and you can look it up, in the first station there was one guy standing and there were cameras on him. That was the guy who invented fuzzy logic. Honda gave him a prize. The Japanese are really good. After they did that, the Europeans came saying, what is this fuzzy logic? So if you want to get the attention of Europeans and North Americans you have to ask the Japanese to do something, because here we first spend millions of dollars on feasibility studies. What would happen if we go into this technology and invest? So we spend five years, spend money, pay advisors and consultants to figure out whether it would be good to invest in AI. What do the Japanese do? They say, let's go into AI. The money that you want to spend on consultants and feasibility studies, you spend on the AI project. The Japanese were very good. They are not as good anymore. Back in the 70s and 80s they were insanely innovative. They have lost some of that drive for innovation. I don't know what happened to them. In North America, we were sleeping anyway. In the early 2000s they say, the Japanese are doing what? The Europeans are doing it too? What is it? Have the Japanese created that? No. It's coming from UC Berkeley. What? I don't know. Things like that.
So it's very important: fuzzy controllers can approximate the relationship between inputs and outputs via interpolation in a vague environment. This was one of the things that people didn't understand, and we do that with any new technology. I don't know what is wrong with us. We don't really learn. We did that with neural networks. In the 70s and 80s, they were underground theories.
You couldn't say you were working with neural networks. You would say, I'm looking into connectionist models. You wouldn't use the words neural networks. The same thing happened with evolutionary algorithms. How does that work? What is the math behind it? The same with fuzzy logic, basically any new technology. The question was, what is this? What does it mean? Then in the late 90s, among many other works on it, a PhD student from Germany wrote a thesis and showed mathematically that what it does is a new type of interpolation. It's interpolation in a vague environment where the numbers are not accurate. Oh, okay. So we have a new type of interpolation. Fantastic.
Now, fuzzy logic does not have any capacity for learning. So how do you get those rules? Let me write it here, maybe. Okay, we have one more. So how do we get the rules? We can get the rules from experts. But that's not going to be efficient. I have to sit down with an expert, a financial expert, a medical expert, and ask him or her, how do you do it? That was one of the first applications of fuzzy logic, in the automation of a cement plant. They went in and sat down with the experts: how do you control the plant? And the workers were saying, yeah, if the pressure is high and the temperature is low and my mood is low too, I do this. So you interview the expert and you get some rules. This is not very reliable. So most of the time we get the rules from data, via clustering and via genetic algorithms and many other methods. You give me the data, I run clustering on it, I get the centers of the classes, and I can define the rules. It's possible. We can do it. There is also the concept of evolving fuzzy rules. This is one of the most recent developments: once you define the rules, there is something in difficult systems that we call concept drift.
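As a rough picture of the clustering route mentioned above, one possible reading is that the cluster centers become the peaks of the membership functions, and rules are then written over those terms. The sketch below assumes that reading. It uses a plain one-dimensional k-means (one clustering choice among many), needs at least two clusters, and runs on made-up error readings; `kmeans_1d`, `centers_to_triangles` and `readings` are hypothetical names.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalars; returns the sorted cluster centers."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers evenly over the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def centers_to_triangles(centers):
    """Turn neighbouring centers into (left, peak, right) triangles (k >= 2)."""
    triangles = []
    last = len(centers) - 1
    for i, peak in enumerate(centers):
        left = centers[i - 1] if i > 0 else peak - (centers[1] - centers[0])
        right = centers[i + 1] if i < last else peak + (centers[-1] - centers[-2])
        triangles.append((left, peak, right))
    return triangles


# Made-up error readings in degrees, purely for illustration.
readings = [-16, -15, -14, -1, 0, 1, 14, 15, 16]
centers = kmeans_1d(readings, 3)
print(centers_to_triangles(centers))
```

Placing each triangle's feet at the neighbouring centers is a design choice that makes adjacent terms overlap, so every reading between two centers belongs to both neighbours to some degree.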
Whatever you design, a neural network, clustering, reinforcement learning, a decision tree, a fuzzy system, it doesn't matter: things in reality change. You do diagnosis and new diseases arise. Or we rename the disease, or we have new measurements for the disease. So the concept drifts. So you have to be able to adjust. Reinforcement learning does a fantastic job in that regard because of its interaction with the environment. Neural networks suck at it. You have to take them offline, train them again and deploy them again. Very expensive. Evolving fuzzy systems and decision trees are fantastic for those types of problems. They can adjust as you go. They adjust minimally: okay, I change this rule, that rule, this rule. There you go. We are fine. So one of the things that I would look into is evolving fuzzy systems. And of course the combination of fuzzy systems and neural networks, the combination of fuzzy systems and evolutionary algorithms, and so on.
So today in the tutorial we will talk about TensorFlow and CNNs: how you can design a convolutional neural network in TensorFlow, and hopefully we can go step by step such that you can follow. And as you know, next Tuesday I'm not there. We will use the tutorial time to talk about evolutionary algorithms. Morteza and Amir will do that. And on Thursday I will use the tutorial time for a long lecture on the Bayesian approach. So I will see you on Thursday.