Let me talk about deep learning with Python and TensorFlow. This is a fairly high-level talk, so I don't want to offend any of the deep learning experts in the room, but I'm going to keep it fairly simple. That said, I'll try to explain what TensorFlow is doing under the covers and what deep learning actually is, so you can get a better idea of what's going on. Just as an introduction to myself, since I was told to give some details on my background: I'm a developer advocate, and you can parse that any way you want. I know you're machine learning people, so you can treat "developer" as either the subject or a modifier: either I'm a developer who advocates, or I'm an advocate for developers. Either way, I have a developer background. I was a Python engineer before I joined Google. I became a developer advocate, but I'm still a developer in the sense that I still build lots of things; the focus is just different, more about enabling other developers to do things. I'm based in Tokyo, Japan, so if you're ever out there, you can catch up with me. I also travel a lot around Asia, so I'm in Singapore, India, Taiwan, and places like that pretty often. You can follow me on Twitter and Google+, if you're into that sort of thing. As background, I'm a Python person, but recently I've also been doing a lot of Go. This is an image created by a Japanese artist who does a lot of really cool Go gopher images; I think the best one is the gopher in a Japanese onsen. I also do a lot of Kubernetes-related stuff, so if you're into containerization and things like that, come talk to me later. So, deep learning 101. Deep learning is something people say a lot, and it's kind of a buzzword, but what does it actually mean? For me, it's a type of machine learning.
If you draw the Venn diagram, machine learning is the big circle, and deep learning is a smaller circle inside of it. Deep learning is a specific type of machine learning where you create neural networks, and not just neural networks but deep neural networks. I'll be talking about neural networks in this talk and doing a little demo, but it's not particularly deep; I just want to get across that in order to be doing deep learning, the network actually has to be deep. So what are these neural networks, and what are they good for? There are really two classic problems that neural networks are really good at. One is classification, where you take some input and put it into one of a number of buckets: you say it's either A, B, or C, or rather, what's the probability that it's A, B, or C? The other is regression, where you essentially create a formula from the input: you take some input, train on some data, and get out a formula that describes the data or allows you to make predictions on new data. To give an example of what these neural networks are, let's look at a couple of problems. This is the playground.tensorflow.org website, a cool little app you can play with. It has a few little data sets, each made of orange and blue dots, and what you want to do is create a neural network that can classify them: if it sees new data, can it classify it as blue or orange? Three out of the four of these are pretty boring. If you take this one here, you can, without even really creating a neural network, just take the inputs and train on them.
You can have it solve the problem, because all you have to do is draw a line down the middle: everything on this side is blue, everything on that side is orange, and you're done. The same thing goes for this one: if you use the right type of inputs, like X1 times X2, you can create a very simple way to classify the points. If these were the kind of classification problems we had to solve, we would never need neural networks; we would just draw lines down the middle of these plots and classify things one way or the other based on those lines. But what if you have something that looks like this little spiral here? It's a spiral, but once it hits the middle, it changes to orange: orange is on this side of the spiral and blue is on the other side. No matter what kind of simple inputs you give it, a linear model is not going to be able to figure it out. Even something like this one, with orange on the outside and blue on the inside, can be solved by just drawing a circle if you give it the right input features. But with the spiral, it will never actually converge on the right answer. So we need something that gives us a different way of solving the problem, and we can do that with these neural networks. You give the network the inputs, and each part of the neural network solves part of the problem. Here we can take these inputs and say, okay, we're going to create some kind of intermediate picture that looks like this, then test it to see how good it is.
Then, based on how good it is at solving the problem, we change the little weights that connect each particular node in the network to the other nodes. When we do that, it's able to come up with a reasonable approximation. This may or may not actually converge; let me give it a little bit more help. These take a little while to figure it out, but essentially it's trying to solve a little bit of the problem at a time, and eventually it comes up with a solution that combines all of these values into a final answer. And here it's actually getting kind of clever: you can see some of the gaps here will be classified as orange even though they're not really necessarily orange, which is actually a little bit of overfitting right there. But you can see that this is something neural networks can potentially solve. So what is a neural network? A neural network is essentially a way of taking input data and getting a classification out. What we have here is an image, maybe a cat image. We change that into an array, a set of inputs, which then get sent into the layers in the middle. These are what's called hidden layers, the actual meat of the neural network. Each of those has what's called an activation function, which transforms the inputs in order to move them to the next layer or to the output. I'll talk a little bit later about what these actually do and what the inputs are. But you can think of it this way: neural networks are loosely based on how the brain works.
Imagine that each part of your brain has a neuron connected to another neuron, and you form memories and do pattern recognition based on how strong the connections between the neurons are. Each of these different layers has what's called a weight, which weights the connections between them. From an actual computation point of view, you take an input image, transform it into what's called a tensor, perform operations on it at every step in the neural network, and eventually get out the output, which is also in the form of a tensor. So the whole thing is basically just doing operations on tensors, and that's kind of the reason for the name TensorFlow: you have this big flow graph of operations on tensors. I mentioned tensors, so what are tensors? Tensors are an abstraction, a very generic way of saying a matrix. You can think of a vector in computer science as just an array, and a matrix as a two-dimensional array. A tensor is something that is n-dimensional: it can be any number of dimensions. So a matrix is a type of tensor and a vector is a type of tensor, but you could also have a 1,000- or 10,000-dimension tensor; that's perfectly fine too. What's interesting about these things is matrix operations. When you do matrix multiplication (you all did that in high school, right? does anybody remember how to do it? you do? well, that's more than I thought), that same method for doing multiplication can be generalized so that it works on tensors as well.
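As a rough illustration of "a tensor is just an n-dimensional array", here is a plain-Python sketch (no TensorFlow; the `shape` helper is a hypothetical function written just for this example) showing a vector, a matrix, and a rank-3 tensor as nested lists:

```python
def shape(t):
    """Return the shape of a nested-list 'tensor' as a tuple of dimension sizes."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return tuple(dims)

vector = [1.0, 2.0, 3.0]                                    # rank-1 tensor
matrix = [[1.0, 2.0], [3.0, 4.0]]                           # rank-2 tensor
cube   = [[[0.0] * 4 for _ in range(3)] for _ in range(2)]  # rank-3 tensor

print(shape(vector))  # (3,)
print(shape(matrix))  # (2, 2)
print(shape(cube))    # (2, 3, 4)
```

Nothing stops you from nesting further: a 10,000-dimension tensor is the same idea, just with more levels.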
Depending on the number of dimensions and the sizes of the tensors, if the dimensions line up, you can multiply a tensor against other tensors. So this is what the multiplications actually look like. When you have an input, say this x1, x2, x3, that's essentially a tensor; in this case it's just a vector of three values. Then you apply the weights, which are the weights between each of the nodes in the neural network, and that's also a tensor. And then this b is the biases, an extra term that shifts the outputs to better fit the training data. So what you have here is basically a matrix multiplication of the input with the weights, where the weights are stored as a tensor as well, and then the biases are added to the result. Once you do that, you get out a kind of garbage-looking tensor. It's not very meaningful to a person; it's going to be numbers like 120, 37, and 50. But what you really want to know is whether the input fits into category A, B, or C, here at y1, y2, or y3. So we apply the softmax function, which bounds the end results into values between zero and one, and that gives us a probability that the input fits into a particular bucket. Each of these y1, y2, and y3 values ends up being between zero and one, so you'll get something like 70% for y1, 3% for y2, and 1% for y3, and you can essentially say the input fits into the y1 bucket. That's how you would actually send data through the neural network: if we already know the weights and the biases and we send some data through, we'll get an answer out. But that's when we've already trained the neural network, right?
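The "multiply by the weights, add the biases, apply softmax" step can be sketched in plain Python (no TensorFlow; all the numbers here are made up purely for illustration):

```python
import math

def matvec(W, x):
    """Multiply weight matrix W by input vector x (one dot product per row)."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def softmax(z):
    """Squash raw scores into probabilities between 0 and 1 that sum to 1."""
    m = max(z)                              # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

x = [0.5, 0.1, 0.9]                         # input tensor: three features
W = [[0.2, 0.8, -0.5],                      # one row of weights per output class
     [1.0, -0.3, 0.4],
     [-0.6, 0.7, 0.1]]
b = [0.1, -0.2, 0.0]                        # biases, one per output class

logits = [z_i + b_i for z_i, b_i in zip(matvec(W, x), b)]
y = softmax(logits)                         # probabilities for buckets y1, y2, y3
print(y, sum(y))
```

Whichever bucket gets the highest probability is the predicted class.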
We already have the weights and the biases set up to work, but we need to figure out what those weights and biases are, and that's what we do with training. This is kind of a flow graph; you can't really see the text here, but what it's essentially doing is this: you take some input and put it through a couple of layers in the neural network. Up until the middle part, it's just the prediction part, so it's doing the prediction steps I talked about earlier. But then at the end it's using what's called an error function, or a loss function. What that does is take some training data. I say: here's my input, and I know it fits in the y1 bucket. Then you run it through the neural network and ask, okay, what output does it give me? Maybe it says the buckets are all equally likely, or something like that. In that case, we want to take the difference between what we expect it to be and what the neural network actually gives us. There are a number of methods you can use to compute the error; in this case it uses cross entropy, which is a common one. You take that error and use it to correct the weights and the biases we set up earlier. We use what's called a gradient to move the weights and the biases in the direction that's going to give us the correct output. There's a lot of math involved, but that's essentially what it's doing. With each training iteration through the neural network, the idea is that you inch the weights and the biases closer and closer to giving you a correct value. So that's an overview of what neural networks do and how you train them, at kind of a high level.
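The cross-entropy error mentioned above can be sketched in a few lines of plain Python (the label and prediction vectors are hypothetical, just to show that a confident correct prediction gets a small loss and a wrong one gets a large loss):

```python
import math

def cross_entropy(y_true, y_pred):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    eps = 1e-12                              # avoid log(0)
    return -sum(t * math.log(p + eps) for t, p in zip(y_true, y_pred))

label = [0.0, 1.0, 0.0]                      # we know this example is class y2
good  = [0.1, 0.8, 0.1]                      # confident, correct prediction
bad   = [0.6, 0.2, 0.2]                      # mostly wrong prediction

print(cross_entropy(label, good))            # small loss
print(cross_entropy(label, bad))             # larger loss
```

Training then nudges the weights and biases in whatever direction shrinks this number.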
But why are neural networks, why is machine learning, even an interesting topic nowadays? You see it all over the place, in lots of meetups like this one, with people talking about TensorFlow and neural networks. So why do people care about this? One of the reasons is a number of breakthroughs that have happened in machine learning recently, led by a lot of people at Google and other companies. What it essentially means is that we've been able to use machine learning to solve some interesting problems, problems that people actually care about. Up until recently, we've been able to use machine learning, and it's been great and interesting and all of that, but we haven't really been able to make it into a product, into a thing people can use every day; it's been mostly academic research. So I think the reason people have started caring about it is these recent breakthroughs. This is an image of the Inception model that Google uses for things like classifying images and adding tags and labels to images. You can think of each step in this as a layer in the neural network, and you can see it has something like 60-odd blocks here, I forget the exact number. Each individual layer breaks off into other layers, so you have this huge flow graph. And each one of these is essentially a matrix multiplication. If you remember from the maths, in matrix multiplication you multiply one value against every other value in the other matrix, and if you have something like a 10,000-dimension vector or tensor, you can imagine how many times you have to multiply values together.
Each one of these is a matrix multiplication and addition, or something similar. If you have an image that's, say, a megabyte big, you can imagine how much processing it would take to go through this whole graph and get some sort of output. And then you have to do that millions and millions of times to actually train the graph. One of the other things that's really interesting about neural networks recently is that they've been shown to scale, in the sense that if you give a network more data, it learns and gets better accuracy. Up until recently, we've only been able to do medium and small neural networks, because the processing for large neural networks was too much; it was too resource intensive for us to actually do it. But recently we've been able to start parallelizing that and making it easier to compute, so we can start training these large neural networks and proving that they actually scale as you get more and more data. In terms of the breakthroughs I'm talking about, we're talking about these large neural networks. Medium and small neural networks are great on an academic level, but you can't actually create something that people can use every day. Even with large neural networks, it's not like it's going to clean your house like a robot or something; it's still something like adding labels to images. But medium and small neural networks are just not complex enough, not good enough, at solving these kinds of problems. One of the things that's important for creating these large neural networks is being able to do these complicated computations in a reasonable amount of time.
You don't want to be taking months or years to train your neural network. Things like GPUs are great, but in order to do anything really interesting, you need something on the level of a supercomputer, and not many people have access to supercomputers; I know I don't personally. A few GPUs will help you, but you have to wait hours, days, or even weeks for your output to come out. Maybe a supercomputer could help with that, but that's not something people really have access to, and even at Google we don't really have supercomputers in the traditional sense. By the way, does anybody here have access to a supercomputer? Usually somebody does at these kinds of meetups. Nobody, okay. Some of the things we've used this for at Google are labeling images, taking the text out of images or identifying where text is in images, playing Go, that sort of thing. And this is a graph showing how many projects within Google are actually using machine learning. You can see that from 2004 there's this kind of hockey-stick growth, and that's basically because of all these breakthroughs that have happened recently. So, coming back to TensorFlow. TensorFlow is essentially the second or third iteration of the library we use at Google to do machine learning. Previous to this, we had a library called DistBelief that we used to develop machine learning algorithms, but it wasn't terribly flexible, and we also wanted to create something that was open source that we could build a community around. That's why we released TensorFlow. What it is, is a generic library for doing machine learning, not necessarily just neural networks, but that was the first thing we really added support for.
At a high level, TensorFlow does all the things I said you need to do for neural networks. What it does is create this kind of flow graph, so that each of these little layers becomes a generic part of a flow graph, which can then be computed on a device, something like a CPU or a GPU. Those can be mapped independently of the actual operations you're doing, so when you write TensorFlow code, you don't really have to think about whether it's going to run on a CPU or a GPU; TensorFlow takes care of all of that for you. It does other things too, like being able to distribute the work onto multiple GPUs, multiple CPUs, or a combination of the two. Some of the core TensorFlow concepts and structures are graphs, operations, and tensors; these are essentially classes you can use in Python to create your TensorFlow models. Then there are others: constants, placeholders, variables, and sessions. Constants are what you'd expect: constant values. Placeholders are values that are inputs into your neural network. Variables are things that are modified as you're training the model, so the weights and biases will be variables; as you're training, those get updated. And a session basically encapsulates an environment, a single run through your training steps. TensorFlow supports a lot of operations; this is a totally non-exhaustive list, but these are the types of math operations you can do on tensors: multiplication, division, subtraction, that sort of thing. So what I want to do now is do this live. I'm going to start up Jupyter here and walk through creating a TensorFlow model. This is a really, really simple example, but it helps illustrate the ideas. That's too big; that's all right.
Let's clear this. This is going to be an example, and I'm really glad the previous talk was all about how to pre-process images, because this particular demo is also OCR. MNIST is a really popular, really well-known training set. It's just a bunch of handwritten digits that we're going to do OCR on, to figure out what number is actually written there. All the data is totally pre-processed: each individual image is a single digit. So if you're, say, reading a handwritten phone number, you still have to figure out how to find all the digits and split them up and all of that before you even get to this part; that's why I'm glad the previous talk covered it. What I'm going to do is just load the data. This is basically 55,000 images of handwritten digits, already parsed into a format that's good for inputting into the model. You can take pretty much everything from the last presentation up to the point where we actually put it into TensorFlow; you really do have to do most of that stuff. But to see what you end up with as input into TensorFlow, let's just run all of the cells above. What we have here is the training set of images: 55,000 images, each 784 values big. What 784 means is that it takes the image, which is 28 by 28, and creates one big long array of 784 values. Each value is between zero and one based on how dark the pixel is: a light pixel will be something like 0.23, a pure white pixel will be zero, and a very dark pixel will be near one. So you can see what the tensor looks like; it's basically one big vector.
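The flattening step can be sketched in plain Python. Here a hypothetical 4x4 grayscale "image" with raw values 0-255 stands in for MNIST's 28x28 (the actual MNIST loader does this pre-processing for you):

```python
# A made-up tiny 4x4 grayscale image, raw pixel values 0-255
# (0 = white background, 255 = darkest ink).
img = [
    [  0,  30, 200, 255],
    [  0, 120, 255,  60],
    [  0, 200, 180,   0],
    [ 50, 255,  20,   0],
]

# Flatten row by row and scale each pixel into [0, 1]:
# the same trick that turns a 28x28 image into a 784-value vector.
flat = [px / 255.0 for row in img for px in row]

print(len(flat))      # 16 here; it would be 784 for a real 28x28 image
print(flat[:4])
```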
Then, give me a second. Here we actually create the neural network; this is setting up all of the values. We have the placeholder, which is our x input tensor. You can see it's 784 big, and we give it None here because 784 is just one image, and then we can have any number of images from the training set as input. This W and b are the weights and the biases, because what we're going to be doing is creating a neural network that has just a single layer, and for that layer we have W and b as the weights and the biases. Then we do exactly what I showed earlier: we take the input, do a matrix multiplication with the weights, add the biases to that, and at the end apply a softmax, which gives our output. And here's where we actually do the training. This y prime is basically our training data, and it has the exact same format as what we get out of our neural network. Out of the neural network we get a tensor that's basically a single vector with 10 values in it. Those 10 values correspond to the 10 buckets the digit can be in. If you get a single image and it's an image of a one, then the output of our neural network is going to be the probability that it's a zero, the probability that it's a one, the probability that it's a two, and so on, in the 10-value vector we get out. Our training data is exactly the same as that, except we know exactly what the digit is. If it's a one, then in the place that says one there's going to be 100%, a value of one, and all of the other entries will be zero.
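That label format is usually called one-hot encoding, and it's simple enough to sketch in plain Python (this `one_hot` helper is hypothetical, written just for illustration; the MNIST loader produces these vectors for you):

```python
def one_hot(digit, num_classes=10):
    """Encode a digit label as a 10-value vector: 1.0 in the right slot, 0.0 elsewhere."""
    v = [0.0] * num_classes
    v[digit] = 1.0
    return v

print(one_hot(1))  # an image of a "1": 100% in slot 1, zero everywhere else
print(one_hot(7))
```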
Then we input that into cross entropy, we create this cross entropy function, in order to compute the loss, and then we do gradient descent on it in order to optimize in the training step. It's a little bit difficult to visualize, but essentially you're taking the difference and then doing this gradient descent thing: if your difference is really high, you modify the weights and the biases so that you move in a direction that's going to give you a better value next time. When you get a bad value for cross entropy, you say, well, if I go in this direction, if I modify the values this much, it'll probably be better. But you don't see the whole plane; you see just an individual part, so it's kind of like walking down a mountain with a flashlight. As long as the slope is going down, you can find your way, but you could hit local minima if you're not careful with that kind of algorithm. Once we have the training step, we can actually do the training. Here we're initializing all of the variables we set up earlier, creating our session, running the initialization of the variables, and then running through 1,000 iterations to train on this particular set. And notice what we're doing with the training set: the training set, we noted, has 55,000 examples, right? We're not going through 1,000 iterations and training on all 55,000 values each time; we're picking a random batch of 100 out of those 55,000 to train on each time.
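The "walking downhill with a flashlight" idea can be sketched with a one-dimensional toy problem (a made-up function, not the real network loss; real training does this simultaneously over all the weights and biases):

```python
# Minimal gradient-descent sketch: minimize f(w) = (w - 3)^2,
# whose minimum is obviously at w = 3.
def grad(w):
    """Derivative of (w - 3)^2: the local slope, all we can 'see' at each step."""
    return 2 * (w - 3)

w = 0.0                      # initial weight, far from the answer
learning_rate = 0.1          # how big a step to take downhill
for step in range(1000):     # like the 1,000 training iterations in the demo
    w -= learning_rate * grad(w)

print(w)                     # converges toward 3.0
```

Each step only looks at the local slope, which is why a bumpier loss surface can trap you in a local minimum.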
What this does, if the training data is reasonably random, is let us just take a random sample, and the training will work pretty much the same as if you trained on the entire data set each time. It's kind of like polling, say, everybody in Singapore to figure out who the next president should be: you can basically take a random sample of everybody in Singapore, ask them who the next president should be, and then extrapolate that to all of Singapore. That's what we're doing here: taking a representative sample from the training set to train on, and running that. That gives us our proper weights and biases. Then at the end here we can test the prediction on a test set. The test set is exactly the same kind of data as the training set, except we didn't actually train on it; the neural network has never seen this test set. So you're testing it on data the network hasn't seen yet, to check its accuracy, and we can see here that it's about 92%. That's actually pretty bad, but this is a very good way of visualizing and stepping through what a neural network is doing. One of the things I didn't talk about earlier: because we only have a single layer here, you can actually visualize the way this particular neural network has trained by looking at how the darkness of a particular pixel influences the output of the neural network. In this case, if there's a dark pixel in these blue areas, then it's highly likely that the image is a zero.
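The accuracy check at the end is just "what fraction of the held-out examples did we get right", which is easy to sketch in plain Python (the predicted and true digits below are hypothetical, chosen so one of ten is wrong):

```python
def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the true labels."""
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)

# Made-up predicted digits vs. true digits for 10 test images;
# index 8 is wrong (predicted 6, actually 5).
preds = [7, 2, 1, 0, 4, 1, 4, 9, 6, 9]
truth = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]
print(accuracy(preds, truth))   # 0.9, i.e. 90% accuracy
```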
And the same thing for a one and a two and a three. You can see that these blue areas show where in the image the dark pixels would likely be, and you can see that one looks kind of like a seven, or kind of like an eight, or kind of like a nine. But this is a really, really simple example. So who can tell me why this would not be a great way of figuring out whether a particular image represents a particular number? Anybody have any ideas why this is a bad way of doing it? No? Everybody thinks this is good? One in 10 images coming out wrong is okay? So how about if we take this zero and move it a little bit, say five or six pixels to the left; what would happen then? In this particular example, this neural network would not really give you a very good output. The right side of the zero might go through the middle of this big red section of the image, which negatively correlates with it being a zero, and so it would misclassify a lot of perfectly good zeros. So in this case you actually have to do a little bit more to get higher accuracy for this type of OCR work. One of the things you can do is make your neural network deeper. By making it deeper, adding extra layers to it, you can make it deal with these types of problems a lot better. A common way of building multi-layer networks is to build a convolutional network. Building convolutional networks is a little bit more complicated, so they require a little more time to explain than I have.
But I'm going to go over it a little bit and explain why it does a better job on images that are skewed or moved to the side a little bit. One of the things this particular example does, when you do the convolution, is go over the image kind of one pixel at a time and create a new image, based on these sliding windows over the original image. That's the actual convolution part, where we create a convolved feature. And then we do some max pooling: in this case we're taking these parts and applying a two by two filter. I can't remember exactly all the details here, but the convolution plus these max pooling layers lets us reduce the image into an easier representation that corresponds much better to the actual digit. What you get out is this: even if the zero or the one or whatever is slid slightly to the left or the right, the convolved feature will give us something that's kind of normalized against that kind of input. Then, after you do one convolutional layer, you can do the same thing again: you create another convolutional layer on top of that to improve the accuracy of the neural network. So here, you've reduced the image down to seven by seven. Then, between the two different kinds of layers, you densely connect them, which means that all of the values from the previous layer are connected to the values in the next layer; the original example was densely connected as well. And then this does a number of other things, like dropout and creating readout layers, to improve the accuracy; that's a little bit beyond what I can cover here.
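The sliding-window convolution and 2x2 max pooling steps can be sketched in plain Python. This is a toy version (a made-up 4x4 binary image and a hypothetical 2x2 kernel, with no padding, strides of one, and no learned weights), not the actual TensorFlow pipeline from the demo:

```python
def convolve2d(img, kernel):
    """Slide the kernel over img ('valid' padding) and sum elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            row.append(sum(kernel[a][b] * img[i + a][j + b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool2x2(img):
    """Keep only the largest value in each 2x2 block (stride 2)."""
    return [[max(img[i][j], img[i][j + 1], img[i + 1][j], img[i + 1][j + 1])
             for j in range(0, len(img[0]) - 1, 2)]
            for i in range(0, len(img) - 1, 2)]

image  = [[0, 0, 1, 1],
          [0, 1, 1, 0],
          [1, 1, 0, 0],
          [1, 0, 0, 0]]
kernel = [[1, 0],
          [0, 1]]                          # a made-up 2x2 filter

feature = convolve2d(image, kernel)        # 3x3 convolved feature map
pooled  = max_pool2x2(feature)             # pooled down further
print(feature)
print(pooled)
```

Max pooling is what buys the shift tolerance: a feature that moves a pixel or two still lands in the same pooled cell.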
Once you do these kinds of things, you can improve the accuracy quite a bit. This example runs for a lot longer than the original one, but it gets up to about 99.2 percent, which is reasonably respectable for just a two-layer network. One of the things that's really interesting about TensorFlow is that you can define all of this in Python, so it's much easier to understand and step through. And as you're building this, you can use something like TensorBoard. (This particular URL no longer exists.) What TensorBoard does is visualize the inputs and outputs of each layer at every intermediate step. If I blow this up a little: you take the TensorFlow summary FileWriter and wrap your existing graph so that it writes out log files automatically — in this case, the author is logging both the training step and the test step. You can then run TensorBoard on the directory of logs that's output in order to visualize things like the cross-entropy, the loss function. You can see the loss getting lower over time, which means the accuracy should be going up. There are a lot of really cool examples of things you can visualize this way. Something else that happened recently was the TensorFlow Dev Summit in San Francisco. If you search for TensorFlow Dev Summit, you'll find the website, and one of the things they did was record all the talks.
At the top of the page there's the marketing — TensorFlow, Machine Learning for everyone — but you can also see the videos of all the talks that were given, and some of them are very, very good. One that was really cool covered TensorBoard: creating custom visualizations and all the other things you can do with it. That one is worth watching, because Jupyter and similar tools for visualizing machine learning can only go so far. A couple of announcements came out of the Dev Summit. TensorFlow is now 1.0, and there's a new project called TensorFlow Fold, which I thought was really cool. Fold is a little difficult to explain, but it essentially flattens arbitrary structured inputs into something that can be run through the existing TensorFlow machinery. So something like a sentence can be flattened into a graph that allows you to parse the sentence. I thought that was pretty interesting, and it works really well with TensorFlow — they have examples of building tree LSTMs for doing sentiment analysis on text and things like that. Lastly, I want to talk about some of TensorFlow's other features at a slightly higher level. One of the things that makes TensorFlow really cool is that each step in the data flow graph it creates can be run independently. So you can break the processing up and run it on multiple devices — multiple GPUs, multiple CPUs — and those can even be on other machines, so you could have multiple machines in a distributed network all doing the processing together.
There are a number of ways to do distributed training, and TensorFlow supports several of them. The main thing distributed training gets you, as I mentioned, is that you can schedule individual pieces of the data flow graph to run on particular CPUs or GPUs. TensorFlow uses things like RPC or RDMA for communication when data needs to be sent over the network, and it takes care of faults or other problems that happen during training. If you're familiar with this area, there are a couple of ways of breaking up the problem so that you can run training steps in parallel: model parallelism and data parallelism. With model parallelism, you break up the model itself and run different parts of it on different machines. Each part of the model sees all of the data, but the parts run on different machines — if you imagine the flow graph, parts of it run on machine A, parts on machine B, parts on machine C, and so on. With data parallelism, instead of breaking up the model, you break up the data: you segment it into multiple pieces, run the same complete model on each machine, but train each machine on only a subset of the data, and then combine the results at the end.
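The data-parallelism idea — each worker computes gradients on its own shard, then the gradients are combined — can be sketched in a few lines of NumPy. This is a toy illustration with made-up data, not TensorFlow's actual distributed runtime:

```python
import numpy as np

# Toy dataset for linear regression: y = X @ true_w.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)  # current model parameters, identical on every worker

def grad(X_shard, y_shard, w):
    """Mean-squared-error gradient computed on one worker's shard."""
    err = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ err / len(y_shard)

# Split the data across two "workers" and average their gradients --
# the synchronous data-parallel update.
shards = np.array_split(np.arange(len(y)), 2)
g = np.mean([grad(X[idx], y[idx], w) for idx in shards], axis=0)

# With equal-sized shards, the averaged gradient equals the
# full-batch gradient, so the combined step is the same.
g_full = grad(X, y, w)
```

In asynchronous variants the workers skip the averaging barrier and apply their gradients to a shared parameter server as they finish, trading exactness for throughput.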
Then there are subtypes: sub-graph or full-graph for model parallelism, and synchronous or asynchronous for data parallelism, and there are trade-offs — pluses and minuses — to all of these. Just as an aside, Google uses data parallelism for most of its machine learning work, with an approach that's somewhere in between synchronous and asynchronous. Another interesting thing about TensorFlow is that it provides a machine learning API that abstracts away the actual hardware you're running your computations on. That allows us at Google to build something as part of Cloud Platform called Cloud ML, which will let you run TensorFlow graphs on Google hardware. You can generate a TensorFlow graph and say, "Google, run this on your hardware, distribute it however you like," and have it worry about all the hardware and latency issues. So you can use the hardware Google already has for machine learning, but through an interface that's not proprietary: it's open source, and what you write will run anywhere, but somewhere like Cloud ML will hopefully be faster and easier to use. It will also let you use the dedicated machine learning hardware Google is developing, such as Tensor Processing Units. These are essentially like GPUs, but they use a lot less power and are specifically designed for machine learning work.
One of the problems with GPUs is that they were originally designed as graphics cards, so they consume a lot of power, and when you put thousands of them in a data center that costs a lot of money and generates a lot of heat. Hence this new hardware. So that's an overview of what TensorFlow does. And because it's open source, you can imagine other possibilities — maybe even open-source hardware for machine learning — with people mapping TensorFlow operations directly onto that kind of hardware as well. That's one of the cool possibilities of something like TensorFlow. I think that's probably enough about TensorFlow and Cloud ML, but if you're interested in learning more, check out tensorflow.org. There's lots of really good documentation there: if you click on resources, or go to "Get Started" or the programmer's guide under develop, that's where all the good documentation is, and it's really easy to follow. Also — I don't know if this still works — there's a cool TensorFlow workshop at bit.ly/tensorflow-workshop, or if you search for "TensorFlow workshop" on GitHub you should be able to find it. That's another really good resource for getting your hands dirty and taking the next step towards playing with this. I hope that was useful and not too dry. If you have any questions, I'd be happy to answer them. That was a very enthusiastic hand up, so yeah, go ahead. You have three questions? Do you want to go one by one or all at once? Let's go one by one.
[Audience: Will TPUs be available on Google Cloud Platform outside of Cloud ML — for example, attached to a regular VM?] Okay, so first off, I think we're probably going to make it work through Cloud ML before it can be used anywhere else. Are you talking about attaching one to a regular VM? I don't really know if that's specifically on the roadmap yet. Is there a specific use case — why wouldn't Cloud ML work in that case? [Audience: We don't want that lock-in. Right now we're limited to CUDA and effectively forced to use NVIDIA. A lot of the community has been asking why Google doesn't support alternatives — even if you don't give us TPUs, why restrict us to NVIDIA?] Okay, that's great feedback, but I would argue that accessing a GPU directly makes a lot of sense because you can use it for more than machine learning — graphics processing, for example — whereas a TPU is a very specific piece of hardware. So using the Cloud ML service would probably be fine for you, rather than having a TPU attached to a VM. But we can talk offline about that later. [Audience: The problem is that we run on-premise, so we can't use Cloud ML. As a data engineer, I need to deploy and serve models internally, and I don't know how to serve my model.] Right. Well, that's a good question.
I don't have the links handy, but if you email me later I can send you information — there are a couple of really good workshops on how to deploy TensorFlow in a distributed way on your own hardware, and I can point you at those. So if you give me your card later, hopefully that'll work for you. Any other questions? Yeah. [Audience: Does Google plan to provide a high-level library like Keras for configuring neural networks?] Yes — that's one of the interesting things that came out of the TensorFlow Dev Summit. Keras is going to be one of the supported APIs for TensorFlow: it's being moved into the contrib module within TensorFlow and will be supported at Google moving forward. So for Keras specifically, in the future you should be able to use it for things like Cloud ML. I don't know how well other libraries will work, but for Keras specifically you should be good. The TensorFlow team is also looking at including and supporting more high-level libraries. As you've probably noticed, TensorFlow is somewhat low-level in terms of actually creating a neural network. Things like Keras are much more high-level and let you write code that's a little easier to understand, if a little less powerful. TensorFlow exposes all the knobs and sliders, whereas Keras gives you sane defaults and a somewhat better user experience. [Audience: When creating a neural network in TensorFlow, are the hidden layers added automatically, or do you have to add them yourself?] The hidden layers themselves are created by me.
So you end up writing those yourself, but when you train, the training updates the values in those layers to give you the right outputs. You can't just have a neural network created automatically — you have to know what each of the layers is supposed to do and what you're planning for each of the steps. I think you had a question. [Audience: Traditionally TensorFlow is meant for neural networks, but can it do more than that? What about other algorithms — is it suitable for, say, collaborative filtering?] I don't know the status of these projects, but there was one called TensorFlow Wide — the idea being that neural networks are "deep" and other types of algorithms are "wide." TensorFlow Wide is basically the part of TensorFlow that implements other types of machine learning algorithms that aren't neural networks. I don't know offhand the status of that project, just that it exists, so if you're interested in checking it out, go look for TensorFlow Wide. [Audience: If I want to learn TensorFlow, do I need to acquire a TPU?] You don't need to — you can't run on TPUs yourself right now anyway. TPUs just make it run faster or more efficiently than GPUs or CPUs. CPUs are the least efficient: if you run training on CPUs alone, it'll take a really long time. GPUs will run it on the order of ten times faster — probably less than a hundred times, but more than ten. And TPUs will let you run a little faster than that, but hopefully more cost-effectively: the idea is that running on TPUs should be cheaper, because they use fewer resources than running it on GPUs yourself. One more question? Is that okay? Yeah.
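Going back to the hidden-layers question for a moment: the point that you declare the layers yourself, while training only updates the numbers inside them, can be sketched in plain NumPy. Everything here — the layer sizes, the activations, the random input — is invented for illustration, not taken from the talk's examples:

```python
import numpy as np

rng = np.random.default_rng(42)

# YOU choose the architecture: 4 inputs -> 5 hidden units -> 3 outputs.
# Training would only change the numbers in W1, b1, W2, b2, never the shape.
W1 = rng.normal(scale=0.1, size=(4, 5))
b1 = np.zeros(5)
W2 = rng.normal(scale=0.1, size=(5, 3))
b2 = np.zeros(3)

def forward(x):
    h = np.maximum(0.0, x @ W1 + b1)   # hidden layer with ReLU activation
    logits = h @ W2 + b2               # readout layer
    e = np.exp(logits - logits.max())  # softmax over the 3 classes
    return e / e.sum()

probs = forward(rng.normal(size=4))    # a probability for each class
```

Frameworks differ in how much of this boilerplate they hide — Keras lets you declare the same stack in a couple of lines — but the decision of how many layers and which activations is always yours.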
[Audience: For the parallelism, will it be like Apache Spark, where the machines and the parallelism are somehow transparent to the user?] Transparent to the person who writes the model? Yes, to some degree. There's a little you have to do, but it's relatively simple. TensorFlow is, for the most part, able to map the operations onto the machines automatically. So from the point of view of the person writing the model — the TensorFlow code — they don't necessarily have to think about how many machines it's going to run on. And if you add nodes to the cluster, you don't have to change your code; it'll just run on more machines and hopefully run faster. The problem with distributed training is generally that the network becomes a bottleneck very quickly, so you have to be cognizant of that. It's not that you can just throw generic machines at it; you have to care a little bit about the network. Once you get above five to ten nodes — on the order of ten, fifteen, twenty, that level — you need to care about what your network is like and how fast it is. But from a theoretical point of view, you can scale it out as much as you want. Do you mind if I take that offline? Because I think we're out of time. Okay, go ahead. [Audience asks about the status of TensorFlow Fold.] I haven't really looked at it too much; I just thought it was an interesting project that came out of the Dev Summit. From what I saw, it looked like it was somewhat separate from actually running the layers.
It was almost like a library for doing more pre-processing-type steps before you actually run things on the network. But I'm not really sure, so I can ask around, figure that out, and get back to you if you have an email address. [Audience asks about research positions in Singapore.] You're asking the wrong person — I have no idea. You want to send your resume? Sure. Yeah, do you mind if I take that offline? [Audience: How many layers can a convolutional network have?] Theoretically, any number. In the example I showed, I just used two, but you can do more. In that particular example, I don't think adding more layers would actually make it more accurate, but depending on the problem, adding layers will help. [Audience describes a case where, past a certain number of layers, training stops converging and gets stuck.] Yeah — do you mind if I continue that offline with you? So thanks to all of you for coming.