So welcome back to the series for domain experts trying to learn something about deep neural networks. Up until now we've only looked at classification problems, and we've also stuck to our multilayer perceptron, that is, this densely connected neural network. I want to draw this to a close by moving away from classification problems, where our target variable is categorical in nature, to a target variable that is numerical in nature. So we're trying to predict a continuous numerical value, and those are called regression problems. I'm going to show you how to set up a regression problem and how to construct a deep neural network to handle it; there are one or two little changes we have to make from the classification problems we looked at before. I also want to give you one more piece of insight into how these layers are constructed, something you can explore on your own, something else you can change about your models. This is the end of the section on densely connected neural networks, and we're going to move on to the, I think, much more exciting world of convolutional neural networks very soon.

Let's have a look. We're going to use the readr library, Keras of course, and Plotly. I'm going to import some data using read_csv from the readr library. This document, the actual RMD file, is on RPubs, and the RMD file and the data file will be available on GitHub; you can look at the links in the description below this video. So regressiondata.csv, with col_names set to FALSE, because the first row does not contain column headers, there are no variable names; it's just rows and columns of numerical values.
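The import step described above can be sketched as follows. The file name regressiondata.csv comes from the text; the assumption here is that the file sits in your working directory.

```r
# Libraries used throughout this section
library(readr)
library(keras)
library(plotly)

# col_names = FALSE: the first row holds data, not variable names
data <- read_csv("regressiondata.csv", col_names = FALSE)
dim(data)  # expect 4898 rows and 11 columns
```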
And if I look at the dimensions of that, I see I have 4,898 samples over 11 columns, and those 11 columns will be my 11 variables: 10 feature variables, and the last one, just to tell you, is the target variable, which is numerical in nature. As per usual, this is my way of working: I want to change it into a matrix, and I'm also going to remove the dimension names, because even if I change it into a matrix, if I look at a View (that's with a capital V, in RStudio) of this, I'm still going to have the X1, X2, X3 at the top, so I remove all of that. This is a pure matrix of numbers; that's the way I like to use it.

Just to look at the summary of column number 11, which is my target variable: we see that it's a numerical variable with a minimum of 2.5, a maximum of 9.3, and a median of 5.9, and we see the first and third quartiles there. If I use plotly, remember, plot_ly() with open and close parentheses just gives me an empty canvas, and then I use my pipe operator to add a histogram. On the x-axis is going to be the dataset, all the rows, column 11, and I'm going to give it a name. The layout will have a title, which we see at the top, and the x-axis, as a list, will have a title and zeroline set to FALSE; I don't like those big black lines across my plots and graphs. You can do what pleases you. And look at this: we see the peaks that my numerical target variable forms, ranging from 2.5 to 9.3, this beautiful histogram, lovely.
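A sketch of the matrix conversion and the Plotly histogram described above; the trace name and titles are my own placeholders, not taken from the video.

```r
# Strip down to a pure matrix of numbers, no dimension names
data <- as.matrix(data)
dimnames(data) <- NULL

summary(data[, 11])  # target: min 2.5, median 5.9, max 9.3

# Empty canvas, then pipe in a histogram of the target variable
plot_ly() %>%
  add_histogram(x = data[, 11], name = "Target") %>%
  layout(title = "Distribution of the target variable",
         xaxis = list(title = "Target value", zeroline = FALSE))
```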
Now the train-test split. I'm going to do it in the usual way: I'll set the random seed so that when you run this code you get exactly the same split. Remember, I'm going to create an index of values 1 and 2 (that's what the 2 is for; sampling automatically starts at 1), and I do that over and over until the number of rows of the dataset, that's the 4,900-odd. I want an 80-20 split, so that 80% of the data goes into the training set and 20% into the test set. We've looked at this many times before, so I'll move over it quickly. I'm going to create my x_train, x_test, y_train, and y_test variables using the index I created up here, very easy to do.

I've also got to normalize my data. Remember, I calculate the mean and standard deviation of my training dataset, and I scale my test dataset according to the mean and standard deviation of the training set, not the test set's own mean and standard deviation. Then of course I've got to scale x_train as well, and scale will do that for me: it will bring all 10 of the feature variables to a mean of zero and a standard deviation of 1. Great stuff.

Let's build our model, because we're going to see the first little difference. It's still a sequential model. I'm going to have one, two, three dense layers, just for demonstration's sake, nothing special. I have 25 hidden units in each, the rectified linear unit as my activation function in each, and the input shape for my first hidden layer is just the number of feature variables, which is 10. I've got 20% dropout after each of these. And then here's the special one, the last layer: because this is a regression problem, I'm just going to have a dense layer with one unit. No activation function, nothing else, just one unit; that's it.
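Putting the split, the normalization, and the model together, roughly as described above. This is a sketch: the seed value here is an arbitrary choice (the text only says a seed is set), and the variable names are my own.

```r
set.seed(123)  # any fixed seed reproduces the same split

# Index of 1s and 2s, one per row: 1 -> train (80%), 2 -> test (20%)
index <- sample(2, nrow(data), replace = TRUE, prob = c(0.8, 0.2))

x_train <- data[index == 1, 1:10]
x_test  <- data[index == 2, 1:10]
y_train <- data[index == 1, 11]
y_test  <- data[index == 2, 11]

# Scale BOTH sets with the TRAINING set's mean and standard deviation
mean_train <- apply(x_train, 2, mean)
sd_train   <- apply(x_train, 2, sd)
x_train <- scale(x_train, center = mean_train, scale = sd_train)
x_test  <- scale(x_test,  center = mean_train, scale = sd_train)

# Three hidden layers with dropout; the last layer is a single unit
# with NO activation function -- the mark of a regression network
model <- keras_model_sequential() %>%
  layer_dense(units = 25, activation = "relu", input_shape = 10) %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 25, activation = "relu") %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 25, activation = "relu") %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 1)

summary(model)              # layer shapes and the parameter count
model %>% get_config()      # every layer argument, including defaults
```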
Let's look at the summary of that model, which gives me 1,601 trainable parameters, and we can see the last layer there: just a single node, no activation function, nothing; I want the pure value that comes out of there.

And this is the bit of deeper insight I wanted to share with you this time around, as we end off the section on multilayer perceptrons, or densely connected neural networks: the get_config() function. I'm using the pipe to pass model as the first argument to get_config(), and look at this. Let's start with our first layer. It has a class name. This is what happens behind the scenes when we just write layer_dense: there are lots of other arguments we could pass, and here they all are. The class name was Dense, and the config name was dense_1; I didn't give it a name, but remember, we could, so this is just the default.

Then trainable. That is something we're going to look at in the future, especially when we get to convolutional neural networks. It says that all these weights and biases should be trainable with backpropagation; I want them to be updated. Now, with convolutional neural networks, you can actually download pre-trained neural networks that already have these values in them, and they can form the first part of your own neural network. Those weights are cast in stone: someone else created them, used huge amounts of data, and has already trained those weight and bias values, and we're simply going to use them. So we can actually set the weights for a layer to be non-trainable, especially if we bring that layer in from a pre-trained model.

We see dtype there is float32; that's the default for TensorFlow, which takes 32-bit floating-point values. The units: 25, as we set. The activation. Then use_bias, which by default is set to TRUE; you can also set it to FALSE, so there are no biases, no bias vector (or tensor, I should say) in your deep neural network. And the kernel initializer: remember, the first time this runs it just initializes random weight values, and there are different ways to go about this. The default is variance scaling, and its config is a scale (or standard deviation) of 1, a mode of fan_avg, a uniform distribution, and no random seed. Look into all of these; they are actually quite fascinating, and there are a lot of distributions you can use to set your initial weights. The bias initializer is set to zeros. The kernel regularizer is set to none, the bias regularizer is set to none as well, and the activity regularizer (I'm not going to try to say that word today, it's impossible for me) is none. The kernel constraint and bias constraint, also called clipping, are set to none too; also something you can look at.

Then we get to the dropout layer. The dropout layer also has trainable set to TRUE, the rate is the 0.2 we set, there is no noise shape, and we didn't set a seed (which would make the same values drop out every time). And then we get to dense layer number two, and so on. So there are lots of these arguments which we never used when we constructed this model; look into those, they are quite amazing.

Now we've got to compile, and that's our second difference, other than the dense layer with a single node and no activation function: our loss. We'll choose mean squared error here, and our optimizer is going to be RMS propagation, optimizer_rmsprop() with empty parentheses, because we're just accepting all the defaults. The defaults are a learning rate of 0.001, a rho of 0.9, an epsilon of NULL, a decay of 0 for the learning rate, and clipnorm and clipvalue of 0. The metric we're going to set to mean absolute error. Remember, a metric is like a type of loss function, but it does not form part of the actual gradient descent in the backpropagation; it just gives us our view of the error as training runs. The loss function that we're actually going to use is
not mean absolute error, but mean squared error.

Let's fit that data. We run it through 50 epochs here, with a batch size of 32 and a validation split of 0.2, and I put in a callback. My callback was early stopping, monitoring the mean absolute error: if five epochs did not do any better, it would terminate the run. Verbosity at 2. I ran my model; let's do that.

And then, let's print this out. Something new: this backwards pipe, %<-%, look at this, it's a less-than sign and a minus sign between percent signs. The evaluate function is going to give me two values, and I'm placing them inside this list of loss and mean absolute error; that is what it returns for me. So I pass x_test and y_test to it, I didn't want anything printed to the screen, and it does the evaluation and gives me those two values, the loss and the mean absolute error. I'm going to print those out using, and you haven't seen this before in R, the paste0 function. I have my string there, "Mean absolute error on test: ", and I'm pasting it together with the sprintf function, where %.2f means two decimal places, and what I want of those two values is the mean absolute error. It prints out the mean absolute error. Remember the target variable ranged from 2.5 to 9.3, so a mean absolute error of 0.62 is not too bad.

So there you go: in short, a regression problem. We don't often deal with regression problems, but now you know how to do it. You know how to create that last layer of your multilayer perceptron, your densely connected neural network, and you know which loss function, MSE, mean squared error, is a good loss for it. You've also learned about all the other arguments you can change inside the creation of your model. Next up, we're going to start looking at convolutional neural networks.
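The compile, fit, and evaluate steps above can be sketched like this; it's a sketch under the stated defaults, and the printed error will depend on your own split and training run, so I haven't hard-coded an expected value.

```r
# MSE is the loss driving gradient descent; MAE is only reported
model %>% compile(
  loss = "mse",
  optimizer = optimizer_rmsprop(),   # all defaults: lr 0.001, rho 0.9
  metrics = "mean_absolute_error"
)

# Early stopping: give up if the MAE fails to improve for 5 epochs
model %>% fit(
  x_train, y_train,
  epochs = 50, batch_size = 32, validation_split = 0.2,
  callbacks = callback_early_stopping(monitor = "mean_absolute_error",
                                      patience = 5),
  verbose = 2
)

# %<-% (re-exported from the zeallot package) unpacks the two values
c(loss, mae) %<-% (model %>% evaluate(x_test, y_test, verbose = 0))

print(paste0("Mean absolute error on test: ", sprintf("%.2f", mae)))
```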