Welcome to the next module of the course. In this module, we will build our first neural network model for an image classification task using the tf.keras API. For this exercise, we use the Fashion MNIST dataset, which is very similar to the MNIST dataset we used earlier for building our "Hello World" TensorFlow model. Fashion MNIST has 10 labels covering different fashion items. There are 60,000 images in the training set and 10,000 images in the test set. Each image is 28 by 28 pixels in size and is associated with exactly one label. As you go through this particular Colab, I would urge you to pause, open a separate Colab notebook, and try coding the things you see on your screen on your own. That will help you understand the TensorFlow Keras API better. So let us begin. To begin with, we first connect to the Colab cloud runtime. TensorFlow 2.0 may not be installed in this runtime, so we first install it using the pip install command. Whenever a command starts with an exclamation mark, it runs on the underlying system; this one installs the TensorFlow 2.0 API there. So let us do it. You can see that in this particular runtime the TensorFlow 2.0 API is already available, so nothing happened. If it were not available, TensorFlow would be downloaded and then installed in the cloud runtime. Once TensorFlow 2.0 is in place, the next task is to import the libraries required for building the model. We will be using the Keras API, so we import the Keras library, and we will use NumPy for storing and manipulating the data.
We will also import NumPy, and we will use matplotlib.pyplot for plotting the images of the fashion objects in this dataset, so we import that as well. Finally, we make sure that we have the right TensorFlow version loaded on the Colab runtime by printing tf.__version__. Let us run this and make sure that we have TensorFlow 2.0 beta 1 installed; you can see that it printed the version, which is 2.0.0-beta1. So we have ensured that the right version of TensorFlow is installed on the Colab runtime. The next task, as we discussed in one of my previous sessions, is data: data is the first prerequisite for a machine learning model, and in this exercise Fashion MNIST is the dataset we are going to use. We will use its training split for training our model. Each example in the training data is an image with an associated label; the image is presented to us as 28 by 28 pixels, and each image has exactly one label out of 10 possible labels. The labels are given IDs ranging from 0 to 9. One more important thing to note here is that all the images in the Fashion MNIST dataset are grayscale images. You can see some of the images printed on the screen right over here. These are images of objects from different classes: these are tops from one class, then bottoms form another class, then there are some shoes that form another class, then there are bags, and so on. Fortunately, the Fashion MNIST dataset is already available in TensorFlow, so we can directly import and load the data from TensorFlow.
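The setup steps described above, installing TensorFlow 2.0 and importing the libraries, can be sketched roughly as the following cell (the pip line is a Colab shell command, shown here as a comment):

```python
# In a Colab cell the install would be: !pip install tensorflow==2.0.0-beta1
import tensorflow as tf
from tensorflow import keras  # high-level model-building API

import numpy as np                # for storing and manipulating the data
import matplotlib.pyplot as plt   # for plotting the images

# Confirm the runtime has the expected TensorFlow version loaded.
print(tf.__version__)
```

In the lecture's runtime this prints 2.0.0-beta1; on a different runtime it will print whatever version is installed.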
If your dataset is not present in TensorFlow, you will have to make provisions for bringing it in; we covered this in one of our previous modules, so you can go back and refer to that if you want to bring your own data into TensorFlow. But the dataset we are using here, Fashion MNIST, is already present, so we will use the fashion_mnist.load_data() command to load the data in the Colab. This load_data command returns four NumPy arrays: two arrays correspond to the training data and the other two to the test data. In the training data we have one array of training images and one array of training labels; similarly, in the test data we have one array of test images and one array of test labels. As we said earlier, each image is a 28 by 28 NumPy array, and the pixel value in each cell of this array ranges from 0 to 255. The labels are an array of integers from 0 to 9. You can see all the labels and the corresponding class names displayed over here. Since these class names are not included in the dataset, we store them in a list so that we can use them later to print the name of the class along with each image. This will be useful when we are exploring our dataset, or whenever we are trying to print the actual and the predicted labels. If we execute this particular cell, we get all the class names in the class_names list. We have executed the previous cells up to this point; let us first load the Fashion MNIST dataset and then put the class names in the class_names list. Now let us explore the dataset and see how exactly an image looks.
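The loading step can be sketched as below; load_data returns the four NumPy arrays described above, and class_names is the list we maintain ourselves, since the names are not shipped with the dataset:

```python
import tensorflow as tf

fashion_mnist = tf.keras.datasets.fashion_mnist

# Two arrays for the training split, two for the test split.
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

# Human-readable names for labels 0-9, in label order.
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
```

The first call downloads the dataset into the runtime; subsequent calls reuse the cached copy.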
Before looking into an image, you can see that train_images has the shape (60000, 28, 28): 28 by 28 is the pixel size of each image, and we have 60,000 such images in the train_images NumPy array. Likewise, there are 60,000 labels in the training set, so it is good to see that there are exactly 60,000 images and 60,000 labels, one label corresponding to every image. This particular sanity check is very nice to confirm. We also said that each label is an integer between 0 and 9; let us confirm that by checking the training labels, and you can see that the labels displayed are indeed between 0 and 9. Let us also do a sanity check on the test set: it has the shape (10000, 28, 28), so there are 10,000 test images, each of size 28 by 28. You can observe that the test images have exactly the same pixel size as the training images, which is a good thing, and the test labels array has 10,000 items in it. Here, too, the number of test images and the number of test labels match, which is a good sign. By doing this particular exercise we made sure that our data is clean and there are no missing labels or similar problems. So the first thing that we do after getting data is exploration, which we just did. After exploring the data, it is important to pre-process it. In the pre-processing step we generally transform and normalize the data; let us see how we do that in the case of images. But before normalizing the data, let us plot an image and see how it looks. We will use the imshow method of pyplot to plot the first image from the train_images NumPy array; along with the image we will introduce a color bar, and we call plt.show() to actually display the image. So you can see an image of an ankle boot on the screen.
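A minimal sketch of the plotting cell; here a random 28 by 28 array stands in for train_images[0] so the snippet is self-contained:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe; in Colab the default backend is fine
import matplotlib.pyplot as plt

# Stand-in for train_images[0]: one 28x28 grayscale image with values 0-255.
image = np.random.randint(0, 256, size=(28, 28))

plt.figure()
plt.imshow(image)
plt.colorbar()   # shows the 0-255 pixel-value scale next to the image
plt.grid(False)
plt.show()
```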
You can see that there are exactly 28 rows and 28 columns; each cell in this array corresponds to one pixel, and the pixel values range from 0 to 255, which is what the color bar encodes. So this is a visual representation of the first image. Now, as part of image normalization, we will make sure that each pixel value lies between 0 and 1. The simplest way to do that is to divide each pixel value by 255, the maximum of the pixel range, which ensures that each pixel has a value between 0 and 1. After performing this operation on every training image, we get 60,000 images in which each pixel value is between 0 and 1. We do the same thing on the test data, in order to make sure that the training data and the test data are pre-processed, or specifically normalized in this case, in exactly the same way. It is very important to ensure that we use exactly the same normalization across the training and test data. Let us run this particular cell; the train_images and test_images NumPy arrays now contain normalized values for each of the images. Let us confirm that the data is in the correct format and that we have normalized it correctly. We will again use the plt.imshow function to display the images. Here, we are going to plot the first 25 images, as specified in a for loop, in a subplot grid of size 5 by 5. We remove the x and y ticks, and then call imshow to plot the i-th image; along with the image we print the class name that we stored in the class_names list, shown as a label on the x axis. Finally, we display all these images using the plt.show method.
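The normalization step, applied identically to both splits, is just a division by 255; a small sketch with random stand-in arrays (the real arrays have shapes (60000, 28, 28) and (10000, 28, 28)):

```python
import numpy as np

# Random stand-ins for the real Fashion MNIST arrays.
train_images = np.random.randint(0, 256, size=(5, 28, 28))
test_images = np.random.randint(0, 256, size=(3, 28, 28))

# Divide by the maximum pixel value so every pixel lands in [0, 1].
# The same transformation must be applied to both train and test data.
train_images = train_images / 255.0
test_images = test_images / 255.0
```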
Now you can see that each image is printed here along with its name just below the object: we have plotted 25 images in a 5 by 5 grid, so each row has 5 images and there are 5 such rows. Now that we have explored, normalized, and visualized the data, the next task is the core task of model building. In this case, we are going to use a neural network model for classifying the images. Let us look at the architecture of the neural network model before getting into the code. We essentially have a 28 by 28 image, so there are 28 rows and 28 columns, and after normalization each cell has a value between 0 and 1. What we want to do is use this pixel information to learn the label: we are trying to design a function that takes this image and maps it to one of the 10 possible labels, 0 through 9. This is a classic classification task; we could build any machine learning model for it, but in this particular exercise we are going to use a neural network model to detect the class of the fashion object. So whenever we are building a neural network model, what are its components? The first component is the architecture of the model: in the architecture we specify the number of layers and the number of units per layer. In this particular example, we are going to build a very simple neural network. We have the 28 by 28 image as an input, and the first layer we introduce is called a flatten layer. What the flatten layer does is take these 28 by 28 pixel values and convert them into a list of 784 numbers, because 28 by 28 gives 784 values, or 784 cells, in total.
So we essentially open these cells up and append the rows one after the other. We get pixel values indexed from 0, 1, 2, up to 783, so there are 784 such values: the image, which is in matrix form, is opened up into a list, or an array. This array becomes the input to the next layer: we will use one hidden layer with 128 units. After this first hidden layer, we have an output layer with 10 units, where each unit corresponds to one class. So this is the model; let us draw it as a network. This is a block representation of the neural network: we have inputs 1, 2, up to 784, then 128 units in the first hidden layer, and then 10 units in the output layer, corresponding to the classes 0 through 9, which are the outputs. We connect each of these units using a dense layer, which makes sure that each node in the previous layer is connected to every node in the next layer. We use a dense layer again between the hidden layer and the output layer, so each of the hidden units is connected to every output unit, and this is essentially the model. Now let us see what each node is doing. Each node also has an additional input, called a bias, so each one of them has a bias. Let us expand how one such node looks. We have all 784 inputs and one additional input, the bias. Let us draw this node slightly bigger: this particular node does two things. In the first phase, it computes nothing but a linear combination. You can see that there is a weight associated with each of the incoming arrows; let us also draw arrows here, so information flows in this particular direction.
Here, we first compute the linear combination b + w1·x1 + ... + w784·x784, where the pixel values are x1, x2, ..., x784, and then we apply some kind of non-linear activation on the whole thing. Let me expand this on the next slide with a bigger circle representing one neuron. We said that there is a bias b and inputs x1, x2, up to x784, and we saw on the previous slide that we compute a linear combination; written compactly, z = b + Σ (i = 1 to 784) wi·xi. These are the weights that are flowing in: w1 is the weight corresponding to x1, w2 corresponds to x2, and w784 is the weight corresponding to the 784th input. So we get some real number z, which we pass to the activation function. In this case we are going to use ReLU, which is a popular choice of activation function, and ReLU works like this: take the real line, with z, the value coming out of the linear combination, on one axis and ReLU(z) on the other. For any negative number, ReLU outputs 0, and every positive number is returned by ReLU as-is. So this is how the ReLU function looks, and it helps us bring non-linearity into the equation. Each of the hidden units has this particular computation going on within it. Now, coming back to our Colab, we write whatever I just explained using this particular code. We first use a flatten layer, which helps us convert the matrix representation of the input into an array representation.
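The single-neuron computation above, a bias plus a weighted sum followed by ReLU, can be sketched in NumPy; the weights and inputs below are made-up illustrative values, and the neuron has 3 inputs instead of 784 to keep the sketch short:

```python
import numpy as np

def relu(z):
    # ReLU: 0 for any negative input, the input itself for positives.
    return np.maximum(0.0, z)

# Illustrative neuron with 3 inputs instead of 784; same arithmetic.
x = np.array([0.5, -1.0, 2.0])   # input pixel values
w = np.array([0.1, 0.4, -0.3])   # one weight per input
b = 0.2                          # bias term

z = b + np.dot(w, x)   # linear combination: b + sum_i w_i * x_i
a = relu(z)            # non-linear activation
```

Here z works out to -0.75, so the ReLU clamps the neuron's output to 0.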
This flatten layer converts the 28 by 28 representation into an array of size 784. Then we have one hidden layer, which we call dense, and I just explained how the dense layer works; this dense layer has 128 units and uses ReLU as its activation function. Finally, we have a dense layer of 10 units, which uses softmax as its activation function. The softmax layer returns an array of 10 probability scores that sum to 1; each node contains a score indicating the probability that the current image belongs to one of the 10 classes. Having set up the model, we will now compile it. After setting up the model, what else is required for training it? We need to specify what kind of optimizer we will use for training, what loss function we should be using, and which metric to track during the training process. In this case we use Adam, which uses an adaptive learning rate and has proven to be one of the better optimizers for training deep neural network models. We use the sparse categorical cross-entropy loss because we have 10 different classes in the output, and we use accuracy as the metric that we will be monitoring. Let us execute this particular step: set up the model, and now compile it. You can also use other optimizers here, like RMSprop or standard gradient descent, but Adam is one of the default choices for training neural network models, so we are sticking with Adam; and sparse categorical cross-entropy is the appropriate loss for this particular dataset. I would encourage you to stop at this point and try to code these things on your own, and see whether you are able to replicate what we are seeing in this particular Colab.
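The architecture and the compile step described above come together roughly as follows, with the layer sizes and choices exactly as in the lecture:

```python
import tensorflow as tf

# Flatten: 28x28 matrix -> vector of 784.
# Dense(128, relu): the hidden layer.
# Dense(10, softmax): one probability per class, summing to 1.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

Sparse categorical cross-entropy fits here because the labels are plain integers 0-9 rather than one-hot vectors.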
After compiling the model, the next step is to train it. For training, we essentially have to specify the training data and for how many iterations we want to train the model. It is advisable to train a neural network model in a batch setting, so we sometimes also specify the batch size, and sometimes a regularization parameter as well. In this case we will keep it simple, this being our first model: we simply specify the training data, which is the training images, or features, and the labels. The train_images array has all the pixel values, train_labels has the corresponding label for each image, and we will train this model for 10 epochs. You could train the model for longer, but since this is more of a demonstration we are choosing 10 epochs. You have to be careful with the number of epochs: if you train for too many epochs, there is a chance that the model will overfit, so you have to watch out for overfitting. We are not specifying the batch size, so the default batch size of 32 will be used by this fit function. So let us train the model and see where we reach. You can see that there is a progress bar showing progress; these are the updates that happen in each epoch, and to remind you, an epoch is one full pass over the training set. You can also see the time taken per sample, which is 92 microseconds in this case, as well as the loss and the accuracy numbers. You can observe that the loss goes down as we train the model further and the accuracy goes up. We started with a loss of about 0.49 and an accuracy of 0.82; let us see where we reach after the 10th epoch. The model has completed all 10 epochs, the loss has come down to 0.23, and the accuracy has gone up to 91 percent.
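The training call is a single fit; in the sketch below, small random arrays stand in for the normalized Fashion MNIST data so it runs quickly, and the epoch count is cut down from the lecture's 10:

```python
import numpy as np
import tensorflow as tf

# Random stand-ins for the normalized train_images / train_labels.
images = np.random.rand(64, 28, 28).astype('float32')
labels = np.random.randint(0, 10, size=64)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# No batch_size given, so the Keras default of 32 is used.
# The lecture runs model.fit(train_images, train_labels, epochs=10).
history = model.fit(images, labels, epochs=2, verbose=0)
```

The returned history object records the loss and metrics per epoch, which is what the progress bar is printing.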
So we started with 82 percent accuracy with the initial parameter set, and after 10 epochs we got an accuracy of 91 percent. This is an important point: at this stage we have a trained model, by which we mean a neural network with learned weights; after the 10th epoch we have weights for every edge of the network. Let us now see how this model performs on unseen data. As we have been discussing in this class, or as you must be aware from a basic background in machine learning, we want machine learning algorithms to work well on future data. And how do we really test performance on future data? We use some held-out data as a surrogate for it; that dataset is called the test data in the parlance of machine learning, and the test data is not at all exposed during training. So we will evaluate the accuracy of the model on the test data using the model.evaluate function, which takes the test images as input along with the actual labels. The actual labels let us compare each actual label with the predicted label, and that gives us the test accuracy. The evaluate function returns the test loss and the test accuracy; we are more interested in the accuracy part, so we print that out. Let us evaluate the model: we got a test accuracy very close to 88 percent, which is slightly lower than the training accuracy. The numbers you see after the 10th epoch are the accuracy on the training set, and since the model was trained on that very set, the training accuracy is typically higher than the test accuracy. After looking at the test accuracy, we can be confident that we have a model of reasonable quality.
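Evaluation follows the same pattern as training; again, random stand-ins replace the real test split so the sketch is self-contained (on real data this untrained-on-random-labels accuracy is meaningless, it only shows the API shape):

```python
import numpy as np
import tensorflow as tf

test_images = np.random.rand(16, 28, 28).astype('float32')  # stand-in
test_labels = np.random.randint(0, 10, size=16)             # stand-in

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# evaluate returns the loss followed by each compiled metric, in order.
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=0)
print('Test accuracy:', test_acc)
```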
Let us now use this model to make predictions, because that is our objective: we have trained the model, and we will use it to predict the label of a new fashion item. We use model.predict, a function that takes a batch of test images as input and returns a prediction for each image. Let us run this on all the test images and look at the prediction for the first image. The prediction for the 0th image is an array in itself: you can see that there are some float values here, and each value represents a probability. For example, the first value represents the probability of this image having label 0. It appears that this particular image has the highest probability mass at label 9. So what we do is use the np.argmax function to assign the label corresponding to the position with the highest probability mass; since the highest probability mass was at position 9, we assign label 9 to the first image. Let us look at what the actual label was, and yes, the actual label was also 9, so we know this is a correct prediction. Let us also plot this as a graph. We will plot the image along with the predicted and the actual values. If the actual value matches the predicted value, we will use the color blue, and if there is a mismatch, we will use red. We specify that information here, and here we obtain the predicted label by doing np.argmax, just as we did over here, taking the prediction array as input. So let us look at how the first image looks.
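The np.argmax step above can be illustrated with a made-up prediction row of 10 softmax probabilities (these values are invented for illustration, not actual model output):

```python
import numpy as np

# Hypothetical output of model.predict for one image: 10 class probabilities.
prediction = np.array([0.01, 0.00, 0.00, 0.00, 0.01,
                       0.02, 0.00, 0.05, 0.01, 0.90])

assert np.isclose(prediction.sum(), 1.0)  # softmax rows sum to 1

# The predicted label is the index carrying the highest probability mass.
predicted_label = int(np.argmax(prediction))
print(predicted_label)  # -> 9, i.e. 'Ankle boot'
```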
For the first image, we supply the predictions array, the test labels, and the test images; we give all this information and print the result. So this is the actual image along with its label; the actual label, shown in the bracket, is ankle boot, and the predicted label is again ankle boot, with 99 percent confidence, or 99 percent probability. You can see that there is a very tall bar at the position corresponding to ankle boot. Let us look at the label for the 12th image. The 12th image is a sneaker, which is correctly predicted as a sneaker, but here the probability is much smaller compared to the earlier example: it is 49 percent. You can see that there is a strong probability mass of 49 percent here, but there is also substantial probability mass on a couple of other possible predictions; since 49 percent is still the highest probability mass, we assign the label corresponding to that position, which is the sneaker label. Let us plot several images along with their predictions, just to see how we do on some more images. We are going to print 5 rows and 3 columns, so each row has 3 images, and here are the predictions for the first 15 objects. You can see that in this case all the objects have been correctly classified; some objects have been classified with a 100 percent confidence level, and the sneaker example that we saw has the least confidence among all the objects on the screen. Finally, let us understand how to make a prediction for a single image. This is a single image with the shape 28 by 28, and the Keras predict function is optimized to make predictions on a batch, or a collection. Hence, we insert the single image into a collection and then send it for prediction.
So let us add the image to a batch where it is the only member; you can see that the shape of the batch is (1, 28, 28). We pass this image tensor to the predict function, and it gives us the prediction vector, which has 10 values as we saw earlier. We plot the array of predicted values, with the class names as labels on the x axis, and you can see that there is a very high probability mass on ankle boot, which is also displayed in blue: this is the correct prediction. And if we do np.argmax, we get the label 9, just as before. So in this particular exercise we built an image classification model with a feed-forward neural network. This was our first model built with the TensorFlow Keras API. In the next exercise, we will use the TensorFlow Keras API to build models for structured data as well as for regression problems. See you in the next module. Thank you.
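Wrapping a single image into a batch of one, as described above, is a single expand_dims call; a random array stands in for the real test image:

```python
import numpy as np

img = np.random.rand(28, 28).astype('float32')  # stand-in for one test image

# Keras predict works on batches, so make the image the sole member of one.
img_batch = np.expand_dims(img, axis=0)
print(img_batch.shape)  # (1, 28, 28)
```

The resulting (1, 28, 28) tensor can then go straight into model.predict, which returns a (1, 10) array of class probabilities.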