In backprop, error derivatives get propagated backwards down through the layers, but this is not what happens in the cortex. In the cortex, new activity does not propagate back through the earlier layers the way error derivatives do. Another problem is that backpropagation through time, as a model of how the brain learns, is especially implausible. Why? Because to backpropagate, we need to store the neural activities from the forward pass, and the network has to stop processing to propagate the derivatives. But the brain has to keep processing a continuous stream of sensory input; it cannot stop to run a backward pass. There are also some important points on all this in the paper by Lillicrap et al. on backpropagation and the brain, so let's look at those.

Also, feedback connections in the brain serve a number of functional roles. Top-down control through feedback connections has a very well-established link with gain control, which is basically the enhancement or suppression of neural responses. For example, attention to a particular feature in your visual field, right? That's something feedback connections influence. Feedback connections also drive activity in the cortex, rather than just modulating or enabling neural responses.

So this is where Geoffrey Hinton comes in. During Neural Information Processing Systems (NeurIPS) 2022, he proposed the forward-forward algorithm, because the main problem is that backprop does not model the cortex well enough; it's very inaccurate. He wants to model something more like the cortex, and also to be able to pipeline data.

So, what is the forward-forward algorithm? You can see, on the left-hand side over here, we have forward with backpropagation. That's the formal name for the learning procedure we've always been using, and we're comfortable with it: one forward pass in front, forward propagation, and then backpropagation, right? The forward-forward algorithm, however, just has two forward passes, two forward propagations. That's how it works. One of them is called the offline phase, formally named by Geoffrey Hinton, and the other the online phase; we'll come back to that later. There's another difference between the two. On the left-hand side, forward with backpropagation has only one cost function, and we're all comfortable with that fact: it comes at the end, evaluating how well the predicted value matches the actual value. But forward-forward doesn't have one cost function. It has multiple: a single cost function in every single layer. If you're wondering what the cost function is, it looks like this, right?
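(As a rough reconstruction of the slide, following the description in the talk and in Hinton's paper: the goodness of a layer is the sum of the squared ReLU activities, and the probability that the input is real is a logistic function of the goodness minus a threshold. The exact symbols here are my own choice:)

```latex
G^{(l)} = \sum_j \bigl(y_j^{(l)}\bigr)^2, \qquad
p(\text{real}) = \sigma\bigl(G^{(l)} - \theta\bigr)
```

Here y_j^(l) are the ReLU activities of layer l, σ is the logistic function, and θ is the threshold hyperparameter that comes up again below.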
The goodness function, formally named by Geoffrey Hinton, is basically just the summation of the squared activities of the rectified linear neurons. So that's the loss function; we won't go through the full derivation, because that would take too long. But the brief summary of what this cost function does is that it gives us a goodness value for each layer. One of the forward passes is called the online phase, and during this phase we want to increase the goodness value of every single layer as much as possible, so that the model can differentiate real and fake data. The offline phase, which is the other forward pass, decreases the goodness value of every single layer, again so that the model can differentiate real from fake data. So those are the two different phases.

Let's get into the online phase first. Okay, so it corresponds to wake. During the online phase, we only feed in real data; we'll come back to what real and fake data are later on, but you just have to know that in the online phase we only feed in real data, never fake data, and we want to maximize activation in every single layer. The real data is supplied; it can come from a dataset or whatever, but you do have to modify it, and we'll come back to that later on as well. The offline phase, basically, corresponds to sleep, like the sleep phase in our brain. We feed in fake data during the offline phase, never real data, and we minimize activation, as talked about just now. We can get fake data by modifying real data, or by using a generative model, basically.

So this is the probability function for how the model differentiates real and fake data, the one shown above. You can see, we subtract a threshold, which is a hyperparameter, from the goodness value, then feed it into a logistic function, which outputs the probability of the data being real.

But what is real data? I'm sure you're comfortable with the MNIST dataset: it's a dataset of handwritten digit images from zero to nine. So this is an example of real data. A normal image from this dataset would be just a four, right? But notice something in the top left: it's a one-hot encoded label of this image, written into the first 10 pixels. This is real data: we merge the label and the features, basically whatever we want to use to train the model, together. Fake data is the same thing again, but the one-hot encoded label (aka the label, which doesn't even need to be one-hot; it can be just a single number) is wrong. That's the only difference: the label is wrong. That's called fake data. Yeah, and this is the code for writing the one-hot label onto the images, into the first 10 pixels again.
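(A minimal sketch of what that code might look like; the helper names overlay_label and make_negative are mine, and I'm assuming flattened 28×28 MNIST images in a float tensor:)

```python
import torch

def overlay_label(images, labels, num_classes=10):
    # Write a one-hot label into the first 10 pixels of each flattened
    # 28x28 image: zero those pixels, then set the pixel at the label's
    # index to the maximum pixel intensity.
    data = images.clone()
    data[:, :num_classes] = 0.0
    data[torch.arange(images.shape[0]), labels] = images.max()
    return data

def make_negative(images, labels, num_classes=10):
    # Fake data: the same images, but tagged with a wrong label.
    # A random offset of 1..9 (mod 10) guarantees the label is wrong.
    wrong = (labels + torch.randint(1, num_classes, labels.shape)) % num_classes
    return overlay_label(images, wrong, num_classes)
```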
Also, for the forward-forward algorithm itself: as I've talked about just now, normal forward propagation is basically matrix-multiplying the weights with the inputs, then adding a bias to get the weighted sum of inputs, then feeding that weighted sum into an activation function to make it non-linear, basically. But for forward-forward, it's a bit different. Before we do all that matrix multiplication and feed it into an activation function, we first normalize the input. Why do we want to normalize it? We have to think of the previous layer's output as a vector. If the previous layer feeds every single piece of information into the next layer as-is, the next layer can just look at the length of that vector. Remember, the goodness value is determined by basically the length of the vector, so if we feed the length in, the next layer knows the goodness value already. It can tell whether it's fake or real data without learning at all, so that next layer just doesn't learn anything. That's why we normalize first: the input becomes a unit vector, with a length of one, so the length is not feeding in any information at all, and the next layer is forced to actually learn.

So, the MNIST dataset. I've explained so much; let's look at the fun part now, the coding part. This is a very simple model summary, taken from TensorFlow, for classifying the MNIST dataset. And this is how it performs: validation loss over training, then validation accuracy over training. 97.5% accuracy in 21.1 seconds. Pretty good performance.

So let's look at forward-forward. First we put the labels onto the images. Then we create negative data by randomly permuting the labels: we basically just gather the labels, mix them around, and give them back out to all of the images, so each image just gets a random label. Then this is training the model. The goodness value, we've talked about before. Calculating the loss, again, we've talked about: it pushes the goodness of real data above the threshold, so that the model can differentiate between real and fake data, and it pushes the goodness of fake data below the threshold, for the same purpose. Then calculating the derivatives.

Predicting images is a bit different from normal feedforward with backpropagation. The way you predict images with forward-forward is this: let's look at the MNIST dataset again. It's handwritten digits from 0 to 9, so you have 10 labels, right? And basically you cycle through all of them, 0, 1, 2, 3, 4, 5 and so on. What you do with each of them is merge the label with the image that you want to predict, and you get the overall goodness value of basically that image with that label. Once you have the goodness values of all the labels with the same image, your prediction is just the label with the maximum amount of goodness. That's your prediction; a sketch of all of this is below, and then let's look at how it works.
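(Putting the pieces together: a minimal, self-contained sketch of one forward-forward layer plus the cycle-through-the-labels prediction, reusing overlay_label from above. The class name FFLayer, the softplus form of the per-layer loss, and the hyperparameter values are my assumptions, not the exact code from the talk:)

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FFLayer(nn.Module):
    # One forward-forward layer: normalize the input to unit length,
    # apply a linear transform and ReLU, and learn from a local loss.
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.threshold = threshold
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize first, so the next layer cannot read the goodness
        # straight off the length of the incoming vector.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return F.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)  # goodness of real data
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)  # goodness of fake data
        # Push real goodness above the threshold, fake goodness below it.
        loss = (F.softplus(self.threshold - g_pos)
                + F.softplus(g_neg - self.threshold)).mean()
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        # Detach before handing activations on, so each layer trains
        # on its own local loss only -- no backprop across layers.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()

@torch.no_grad()
def predict(layers, image, num_classes=10):
    # Cycle through every label, overlay it on the image, accumulate
    # goodness across the layers, and pick the label with the maximum.
    scores = []
    for label in range(num_classes):
        x = overlay_label(image.unsqueeze(0), torch.tensor([label]))
        goodness = 0.0
        for layer in layers:
            x = layer(x)
            goodness = goodness + x.pow(2).sum()
        scores.append(goodness)
    return int(torch.stack(scores).argmax())
```

Training would stack a few of these layers and call train_step on each in turn, feeding the detached outputs of one layer in as the inputs of the next.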
So, the time is about the same: 22.4 seconds, not a big difference. Train error 0.09, test error 0.09. It's not bad, it's really not bad. But let's look at another problem set; we'll be looking at two more. First, the blood cell dataset: it basically contains images of different types of white blood cells, and this is a PyTorch CNN to classify them. Nothing too advanced or whatever, just normal convolutional layers, pooling layers, dropout layers, and then feedforward layers. The loss converges to about 0.05 after 1,396 seconds, which is quite long. Then this is the architecture for the forward-forward algorithm. This first one didn't work as well; it was clearly overfitting. The second architecture worked better, but the train and test accuracy is still not that good. You can see 54%, right? It's really not that great. However, I tested this on the Titanic dataset as well, and you can see that it works way better: about 80% train accuracy and 80% test accuracy. That's very good for something that's not using backpropagation, because it keeps all of the advantages of forward-forward, discards the disadvantages of feedforward with backpropagation, and still produces quite a good accuracy on both the train and test sets. So, in the spirit of open source, this is my GitHub repo for it if you want to have a look.

So, what are some of the advantages and disadvantages of forward-forward? Advantages include: it can be used when the precise details of the forward computation are unknown, because it doesn't have to perform backpropagation; it can pipeline sequential data while learning, which is one of the things the cortex does; and it does not need to store neural activities or stop to propagate error derivatives. But obviously it has some disadvantages too, right? It has less generalization power, it's somewhat slower as well, and it's unlikely to replace backpropagation for situations where power is not an issue. That being said, there are very specific use cases where this can be better than feedforward with backpropagation: it's a very good model of learning in the cortex, and it's a way of making use of very low-power hardware without requiring precise knowledge of how that hardware computes. This allows us to do that, basically.

Yeah, and this is my GitHub if you want to follow, and these are basically just two friends that helped me along, very important. That's basically it. My battery is so low it's going to run out, so I'll stop here.