All right, hello. Thanks a lot, Sam. OK, so let me introduce myself. I'm James, a GDE in machine learning based in Bangkok, and I'm also a data scientist at Agoda. Last month I had the lucky chance to go to Mountain View and attend the TensorFlow Dev Summit for the second time. It's a one-day event where I met a lot of TensorFlow people, and it was very fun and very nice. Let's see what it looked like, in two minutes.

[A two-minute recap video of the Summit plays, covering highlights such as TensorFlow Lite, new projects joining the TensorFlow family, and using machine learning to improve people's lives.]

All right, so that's the recap in two minutes. What I'm going to do today is pick some of the topics that I'm excited about and highlight them for you. Before I start, I just want to learn a bit more about you. How many of you use TensorFlow in your daily job? Oh, nice. And how many of you use machine learning in your daily job? All right, cool. So I think this talk will be most relevant for those of you who are already using TensorFlow, but if you're a beginner, I hope it inspires you to learn more about it.

Let's start with the first topic, which is tf.data. The reason I picked this one is that in machine learning the data pipeline is extremely important, and before we had tf.data there were many ways to feed data into a TensorFlow graph. The most naive way is to use tf.constant and put a NumPy array in it, which is highly inefficient. A bit better is to use feed_dict in session.run: you write Python code that does the data preparation in NumPy, shuffles and batches it, and then feeds it into your TensorFlow graph through session.run with feed_dict. That's a bit better, but still not the best way. The most efficient way before tf.data was to use something called a QueueRunner: you write a TensorFlow graph that does the data preparation and use the QueueRunner to feed data to the GPU. But QueueRunners are very difficult to work with, especially if you want to do something complicated. That's why they came up with the tf.data API.

The promise is that tf.data should be fast, flexible, and easy to use. It abstracts away a lot of the things we used to have to do ourselves, and you can do almost everything related to data pipelining with it. The idea of tf.data is basically ETL for TensorFlow, and I assume most of you know what ETL is: extract, transform, load. So, the extraction part.
tf.data allows you to extract data from various data sources: it can be a NumPy array, a text file, images, or even a CSV file. Second, it allows you to transform the data: alter the data representation, extract features, do data augmentation, shuffle your data, batch your data. That's the transformation part. And lastly, the load part: you have to load your data into the TensorFlow graph so it can run on a GPU or some other device. So let's see the code.

The E part, extraction: let's say you have a list of files, in this case TFRecord files, and you want to extract the data from them. You can do that in just two lines, pretty simple. Then you want to shuffle and make mini-batches and so on; this is the transformation part, where you can also extract features or do data augmentation. And the last part is to tell TensorFlow how to feed that data into the graph and train on your device: you make a one-shot iterator and call iterator.get_next(), and then you can feed those features into your TensorFlow graph.

Let's talk a bit about performance first. They have some tips that can speed up your Dataset pipelines. The first one is very easy: if you have very large files, or many files, you can load data in parallel using the num_parallel_reads argument. Another one that I find very exciting is prefetch_to_device. This transformation makes sure that while the current batch is running, the next batch is already being loaded into device memory, so when you want to run the next step the data is immediately available.

Flexibility. The flexibility of the Dataset API stems from the fact that it uses a functional style. You have dataset.map, .filter, .flat_map, and you pass a function into them. Say you write a function that decodes an image using TensorFlow ops; you can pass that function to dataset.map to extract your image data. It's very flexible because you can define any function and just plug it into dataset.map, and you also have filter, flat_map, and the other functional operations. It now supports nested structures, like dictionaries, and it also supports sparse tensors. And if you don't want to write your function with TensorFlow ops, you can write it in plain Python instead using Dataset.from_generator: from_generator takes a Python function, so you don't have to write TensorFlow code at all. Of course, the performance won't be as good as a pipeline defined with TensorFlow ops. And lastly, if you're hardcore, you can write your own custom op kernel in C++.

Ease of use. tf.data is designed so that it also supports eager execution mode. How many of you know about eager execution? All right, cool. Eager execution is basically, in one phrase, the PyTorch way of doing TensorFlow: it lets you work with TensorFlow in a Pythonic, define-by-run way.
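Here's a minimal sketch that pulls together the extract, transform, and load steps just described, in graph mode. The file names, feature spec, and image size are invented for illustration, and the prefetch_to_device step assumes a GPU is available.

```python
import tensorflow as tf

# Extract: read a list of TFRecord files, several files in parallel.
files = ["train-00.tfrecord", "train-01.tfrecord"]  # hypothetical paths
dataset = tf.data.TFRecordDataset(files, num_parallel_reads=4)

# Transform: parse each record, decode and resize the image,
# then shuffle, repeat, and batch.
feature_spec = {"image": tf.FixedLenFeature([], tf.string),
                "label": tf.FixedLenFeature([], tf.int64)}

def parse_fn(example_proto):
    parsed = tf.parse_single_example(example_proto, feature_spec)
    image = tf.image.decode_jpeg(parsed["image"], channels=3)
    image = tf.image.resize_images(image, [224, 224])
    return image, parsed["label"]

dataset = (dataset
           .map(parse_fn, num_parallel_calls=4)
           .shuffle(buffer_size=10000)
           .repeat()
           .batch(32))

# Keep the next batch staged in GPU memory while the current one runs.
dataset = dataset.apply(tf.contrib.data.prefetch_to_device("/gpu:0"))

# Load: make an iterator and pull tensors into the training graph.
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()
```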
Coming back to eager execution: it's much easier than building the graph first and then running it later. For example, once you have a dataset you can think of it as an iterable: you just write a for loop, for batch in dataset, and pass each batch into your train step. That's a very PyTorch-like way of working.

Another thing is that they also have pre-made functions that encapsulate the standard cases and make data pipelining much easier. I showed you before that if you want to shuffle, repeat, and parse a TFRecord so it can be consumed by a TensorFlow model, you would call things like tf.parse_single_example yourself. They've put all of that into a single function called make_batched_features_dataset, which is very convenient. They also support CSV now: you can call make_csv_dataset, and you can think of it just like pandas.read_csv. Once you have the dataset, it's iterable and you can loop over it to get data into your TensorFlow model.

To recap, tf.data is fast, flexible, and easy to use, and for those of you who are starting with TensorFlow or aren't using tf.data yet, I highly recommend moving to tf.data for your data pipelines.

The second thing I want to talk about is the practitioner's guide to the high-level APIs. The reason I picked this one is that I've been using the Estimator high-level API myself for a while, and it's highly convenient; it makes my life much easier when I write TensorFlow code. So I'll try to convince you why it's so easy.

The core object of the high-level API is something called an Estimator. So what is an Estimator? An Estimator encapsulates everything you need to train a machine learning model in a single object. It hides a lot of TensorFlow concepts that you then don't have to learn, like sessions, and it also does the training loop for you. You don't even have to know about sessions, because the Estimator handles them internally. Once you have an Estimator, you can train, evaluate, predict, and export a SavedModel. If you've used scikit-learn, this is basically built in a scikit-learn style.

When you use an Estimator, you have to provide two things. First, you define an input function, and you can define it using tf.data, which I just talked about. Second, you define something called a model function. The model function receives features and labels, and in it you define how to train the model, how to evaluate it, how to predict, and how to export it. And that's all. The good news is that for standard machine learning algorithms they have pre-made the model function for you, so it's even easier: you don't have to specify your own training graph at all. I'll show you an example in a minute.

So let's say you have a hike recommendation project, and your goal is to recommend hikes to users. The data you have is hike info, info about the users, and the interactions between users and hikes.
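As a rough illustration of the eager-mode iteration and the make_csv_dataset helper mentioned above: the file name and the label column here are invented, and depending on your TensorFlow version the helper may live under tf.contrib.data or tf.data.experimental.

```python
import tensorflow as tf

tf.enable_eager_execution()

# make_csv_dataset feels a bit like pandas.read_csv, but returns a
# tf.data.Dataset that yields (features, label) batches.
dataset = tf.contrib.data.make_csv_dataset(
    "hike_interactions.csv",   # hypothetical file
    batch_size=32,
    label_name="liked")        # hypothetical label column

# With eager execution the dataset is just an iterable.
# (On older versions you may need tf.contrib.eager.Iterator(dataset).)
for features, labels in dataset.take(3):
    print(labels.numpy())
```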
For the interaction data, a record might say that this user liked this hike, or give the rating this user gave to this hike. We can turn this into a simple machine learning problem: the features are the hike and the user, and you want to predict whether this user will like this hike or not, so the label is 1 or 0, like or not like. This is a standard binary classification problem, and there's a pre-made Estimator called DNNClassifier that you can use immediately.

Let's start with a very simple case where you use just one feature, the hike ID. It's not going to be personalized at all, but let's start simple. What you have to specify is what to do with the hike ID; here I say, embed it. This is something called a feature column, and I'll talk about it in a second. Then you instantiate your DNNClassifier, specify how many hidden units you want, pass in the feature column, specify your training input and your evaluation input, and then you train using the train_and_evaluate method. And that's it: in just a couple of lines you can train your first model.

The good thing is that it comes with a lot of free stuff. When you create the DNNClassifier you specify the model directory where the model should be saved, and when you open TensorBoard on that directory you see a lot of free summaries and metrics: your training loss, your evaluation loss, your accuracy, and for this binary classification you also get AUC and the area under the precision-recall curve automatically.

If you want to do something more complicated, feature columns can support you. What is a feature column? Feature columns let you do feature engineering without putting it inside your Estimator: you describe the feature engineering in feature columns, and then you train your model just like before. There's a bunch of functions that support feature engineering, like bucketizing, crossing, hashing, and embedding.

Let's see an example. Say you want to put more features into your model. You have the tags of the hike, like kid-friendly or dog-friendly, and you want to one-hot encode them, so you use an indicator column. You have the elevation gain and you want to treat it as a numerical value, so you use a numeric column, and you can also attach a normalization function. And lastly, if you want to use the distance as a feature but bucketize it, you use a bucketized column. You put everything into a list, and that's it: you can train your second model. If you want to go a bit further and make it personalized, you have to use user features in your model; the easiest thing you can do is embed the user ID and add it as another feature column just like that, and everything should work just fine.

Another thing I want to highlight: I've talked about DNNClassifier, but they also have a lot of other pre-made estimators, and they now have gradient boosted trees. I think most of you would reach for XGBoost when you want gradient boosted trees, but now you can do gradient boosted trees in TensorFlow as well, with the BoostedTrees estimators.
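Here's a rough sketch of the Estimator and feature-column pieces just described. The column names, vocabularies, and in-memory data are invented for illustration, and a BoostedTreesClassifier could be dropped in along the same lines.

```python
import tensorflow as tf

# Feature engineering lives in feature columns, outside the model itself.
hike_id = tf.feature_column.categorical_column_with_hash_bucket(
    "hike_id", hash_bucket_size=10000)
user_id = tf.feature_column.categorical_column_with_hash_bucket(
    "user_id", hash_bucket_size=100000)
tags = tf.feature_column.categorical_column_with_vocabulary_list(
    "tags", ["kid_friendly", "dog_friendly"])
elevation = tf.feature_column.numeric_column(
    "elevation_gain", normalizer_fn=lambda x: (x - 1000.0) / 500.0)
distance = tf.feature_column.numeric_column("distance")

feature_columns = [
    tf.feature_column.embedding_column(hike_id, dimension=16),
    tf.feature_column.embedding_column(user_id, dimension=16),  # personalization
    tf.feature_column.indicator_column(tags),                   # one-hot encoding
    elevation,
    tf.feature_column.bucketized_column(distance, boundaries=[5.0, 10.0, 20.0]),
]

def train_input_fn():
    # Tiny in-memory stand-in for a real tf.data pipeline.
    features = {"hike_id": ["h1", "h2"], "user_id": ["u1", "u2"],
                "tags": ["kid_friendly", "dog_friendly"],
                "elevation_gain": [1200.0, 300.0], "distance": [8.0, 3.0]}
    labels = [1, 0]
    return tf.data.Dataset.from_tensor_slices((features, labels)).repeat().batch(2)

estimator = tf.estimator.DNNClassifier(
    hidden_units=[64, 32],
    feature_columns=feature_columns,
    model_dir="/tmp/hike_model")

# Reusing the same input_fn for eval, just to keep the sketch short.
tf.estimator.train_and_evaluate(
    estimator,
    tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=1000),
    tf.estimator.EvalSpec(input_fn=train_input_fn))
```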
So the question is, why would I bother doing GBT in TensorFlow? I would say: suppose you want to compare GBT with deep learning. If you use XGBoost for one and TensorFlow for the other, comparing them is not that easy; now you can compare them within a single framework, which is very convenient. I think they also support SVMs and other algorithms.

But let's say you're not satisfied and want to do something a bit more complicated. They support that too, with something called the head API. Before we talk about the head API, let's talk about what a model function is. To train a neural network, you have features, you specify your network, which can be fully connected, convolutional, or an RNN depending on the problem, and at the end you specify your predictions and your loss. The predictions and the loss are what they call the head, and the network plus the head is what they call the model function. The idea is that they've refactored the head out of the estimator. For example, instead of a DNNClassifier, you can use a DNNEstimator and plug a binary classification head into it. You might ask why you would write two lines instead of one; the point is that now you can play with a lot of heads. If you don't want plain binary classification, you can use a multi-class head, a Poisson regression head, or any other head through the head API. And what I'm excited about is the multi-head, because you can combine several heads into a single estimator and do multi-task learning with this API, which is very convenient.

And if you still want more control, you can write your own model function. It takes features, labels, and mode as inputs; you build your training graph, your evaluation graph, and your prediction graph, and return something called an EstimatorSpec, I think.

All right. The last thing on this topic is TensorFlow Serving. If you use the Estimator API, you can also save your model in an easy way. There's a method called export_savedmodel, and you just have to specify a serving input receiver function, which describes how requests are handled at real-time inference. And that's all you need to do. Once you have this SavedModel, you can put it in TensorFlow Serving, or load it in your own API server; for example, at Agoda we use Scala for the real-time serving API server, so we use the TensorFlow Java API to load this object that was trained in Python. I definitely recommend you check it out; the high-level API is very convenient to work with.

Okay, the third thing I want to talk about is distributed TensorFlow. I picked this one because before I attended the Summit I didn't know it was this easy to do distributed TensorFlow, so I just want to share with you how easy it is. Let's talk about the distributed training architectures first. The simplest case is not distributed at all: you have a single CPU and a single GPU on one server. But let's say you have multiple GPUs on a single node; the question is, how can you distribute your training on this architecture?
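Before moving on, here's a hedged sketch of the custom model function, head API, and export pieces described above. The head API lived under tf.contrib.estimator in TF 1.x, and feature_columns and train_input_fn are assumed to come from the earlier sketch.

```python
import tensorflow as tf

def model_fn(features, labels, mode, params):
    # The "network" part: features -> logits.
    net = tf.feature_column.input_layer(features, params["feature_columns"])
    for units in params["hidden_units"]:
        net = tf.layers.dense(net, units, activation=tf.nn.relu)
    logits = tf.layers.dense(net, 1)

    # The "head" part: logits + labels -> loss, metrics, predictions.
    head = tf.contrib.estimator.binary_classification_head()
    optimizer = tf.train.AdagradOptimizer(learning_rate=0.05)
    return head.create_estimator_spec(
        features=features, mode=mode, labels=labels, logits=logits,
        train_op_fn=lambda loss: optimizer.minimize(
            loss, global_step=tf.train.get_global_step()))

estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    params={"feature_columns": feature_columns, "hidden_units": [64, 32]},
    model_dir="/tmp/custom_hike_model")

estimator.train(input_fn=train_input_fn, steps=100)

# Export for serving: describe how an incoming request is parsed at inference time.
feature_spec = tf.feature_column.make_parse_example_spec(feature_columns)
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
estimator.export_savedmodel("/tmp/exported_hike_model", serving_input_fn)
```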
Coming back to the architectures: the most complicated one is when you have multiple nodes and each node has several GPUs. So how do we do this in TensorFlow? With an Estimator, you already have these two lines, and all you have to do is add one more line, which is the distribution strategy. Right now they have MirroredStrategy, which is suitable for the case where you have a single server with multiple GPUs: you put the strategy inside the RunConfig and then use that RunConfig in the estimator, and now you can do distributed training very easily; there's a rough sketch of this at the end of this section. They're working on putting more strategies into this API, and I think the multi-node case, where you have many servers and each server has many GPUs, is going to be supported soon. They also say to check out the Horovod project, the project from Uber that built a distributed training framework on top of TensorFlow; it's a bit different from what I just described, but it might suit your case. If you want to try the distribution strategies, just install tf-nightly.

TensorFlow Hub. I think this is the one I'm most excited about, because I had never heard about TensorFlow Hub before I attended the Summit, and it's functionality I had wanted in TensorFlow for a long time, and now it really exists. So what is TensorFlow Hub? How many of you know about TensorFlow Hub? Okay, cool. Before we talk about TensorFlow Hub, think about repositories in the software engineering world. Repositories allow you to publish your code and share it, or reuse other people's code. So why don't we do the same in the machine learning world? That's why they created TensorFlow Hub: you can build a machine learning model, share it, and reuse other people's models.

Why is this a good idea? Think about what you need to build a machine learning model: you need algorithms, you need data, you need compute power, and you need machine learning expertise. The idea is that TensorFlow Hub packs these four things into something called a module, and then you can share your module on TensorFlow Hub so other people can reuse it without having those four things themselves; they can still do something useful with machine learning. One thing I want to point out is that they call this a module, not a model. A model has a specific input and a specific output, so it's not really shareable; a module is composable, reusable, and, importantly, retrainable. It contains pre-trained weights and the TensorFlow graph. For those of you who have used Keras before, the Keras applications, like ResNet, are pretty much the same idea: they let you use a pre-trained model very easily.

As for use cases, the most common one for TensorFlow Hub would be image retraining. At Agoda we also do image retraining: we want to classify photos into specific categories, maybe bedroom, kitchen, and so on, but we don't have millions of images, so we don't want to train everything from scratch. We just take a module trained on ImageNet and retrain it on our data.
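As promised above, here's a minimal sketch of the distribution-strategy lines. In the TF 1.x nightlies the strategy class lived under tf.contrib.distribute, so the exact import path may differ in your version; feature_columns and train_input_fn are again assumed from the earlier sketch.

```python
import tensorflow as tf

# Mirror the model across all GPUs on this machine.
strategy = tf.contrib.distribute.MirroredStrategy()
config = tf.estimator.RunConfig(train_distribute=strategy)

# Same estimator as before, just with the extra RunConfig.
estimator = tf.estimator.DNNClassifier(
    hidden_units=[64, 32],
    feature_columns=feature_columns,
    model_dir="/tmp/hike_model_mirrored",
    config=config)

estimator.train(input_fn=train_input_fn, steps=1000)
```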
Coming back to image retraining: I think a lot of you have probably tried this already, or are doing it right now. And if you go to TensorFlow Hub, you can see a lot of modules they have. Let's see the code. This is the code for image classification, using a pre-trained module to classify images in new data that you have. What you have to do is just call hub.Module and paste in the URL of the module you want; in this case it's NASNet, which is probably the state-of-the-art image classification architecture right now. Once you have that module, it's a function that takes an image as input and produces a feature vector for you, so you can think of it as a feature extractor: it extracts the features from your image. Once you have those features, you can just put them through a dense layer and do binary or multi-class classification, whatever you want. The good thing is that it took something like 62,000 GPU hours to train NASNet to produce those features, and you can reuse them without spending any of that GPU power yourself. You can also retrain the module: you just pass trainable=True plus something called a tag, and then you can fine-tune the image model as well. For available image modules, they have NASNet and PNASNet, plus standard stuff like Inception, ResNet, and MobileNet.

Text classification is also very exciting. Let's start with sentence embeddings. The idea is, say you have a sentence and you want to classify whether it's positive or negative: you want to transform that string into a fixed-length vector and use that vector for the binary classification problem. Now they have the Universal Sentence Encoder, which lets you actually embed sentences. You probably know about word2vec, which embeds each word into a fixed-length vector; now you can embed a whole sentence into a single vector using the Universal Sentence Encoder. The paper was released the exact same day the TensorFlow Summit happened, so it's very new and you should try it. I think the algorithm behind it uses the Transformer from the paper "Attention Is All You Need", trained on many NLP-related tasks in a multi-task setup to get the universal sentence encoding. If you want to use it with TensorFlow Hub, it's pretty easy as well: you just use hub.text_embedding_column, and that gives you one of the feature columns I talked about before with Estimators, so you can use it inside your estimator just like I showed you. And of course you can retrain it if you want. For other text modules, they have neural network language models, word2vec, and something called ELMo, and they're working on adding more to TensorFlow Hub. And the last thing is that you can also build your own module, put it on TensorFlow Hub, and share it with other people, which I think would be really nice for collaborating and sharing your expertise.

The last topic I'm going to talk about is debugging TensorFlow. The reason I picked this is that I myself was really annoyed when I tried to debug TensorFlow code; it's very hard and not intuitive at all.
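Before the debugging part, here's a hedged sketch of the TensorFlow Hub usage just described. The module URLs are the ones publicly listed on tfhub.dev around that time, but double-check them, and the classifier on top is just a placeholder.

```python
import tensorflow as tf
import tensorflow_hub as hub

# Image: use a pre-trained NASNet module as a feature extractor.
module = hub.Module(
    "https://tfhub.dev/google/imagenet/nasnet_large/feature_vector/1")
height, width = hub.get_expected_image_size(module)

images = tf.placeholder(tf.float32, [None, height, width, 3])  # values in [0, 1]
features = module(images)               # fixed-length feature vectors
logits = tf.layers.dense(features, 2)   # your own small classifier on top

# To fine-tune instead, load the trainable variant of the graph:
# module = hub.Module(..., trainable=True, tags={"train"})

# Text: a sentence-embedding feature column for an estimator.
review_column = hub.text_embedding_column(
    key="review",
    module_spec="https://tfhub.dev/google/universal-sentence-encoder/1",
    trainable=False)

text_classifier = tf.estimator.DNNClassifier(
    hidden_units=[64],
    feature_columns=[review_column],
    model_dir="/tmp/text_model")
```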
So this TensorBoard debugger plugin, now that it's out, is going to be really helpful for debugging TensorFlow. The idea is that you get it right inside TensorBoard: you can literally step through the graph as it runs, step by step, and see the tensor values inside each node, which is very cool. And you can also see the line of source code that created each node. What I'm going to do now is the demo they did at the TensorFlow Summit. Okay, I hope it works. All right.

The idea is that I'm going to train an MNIST model. Very simple, but it's not the usual one, it's a debug-MNIST version. So what happens is: I train the model, and it gets stuck at step three; we only get about 10% accuracy. So we suspect something is going wrong inside the code, and maybe some node has a NaN or infinity value, something like that. That's our hypothesis, but we don't know exactly which node produced this NaN or infinity value, and we want to debug the code with this TensorFlow debugging tool.

The first thing we have to do is start TensorBoard. Oops, all right. We start TensorBoard with a special argument called debugger_port, and you just pick your debugger port. Okay. When you go to TensorBoard, you will see a message that suggests what you have to add to your code to use the TensorFlow debugger plugin. For example, if you use tf.Session directly and not an Estimator, you just have to add one additional line, wrapping your session in a TensorBoardDebugWrapperSession, and then it should just work; and if you use an Estimator, you pass a debug hook instead. So I'm going to run the MNIST model again, but now with the special flag that turns this plugin on. Okay.

Now we have this very cool visualization. You can see your graph here, and on the left-hand side you can see the node names you specified in your code, so you can see what's inside your accuracy scope, and so on. What you can do is click step. Okay, and now it actually takes one step. And the cool thing is you can click continue and say: please run until some condition is met. The condition we're looking for is whether any node contains infinity, minus infinity, or NaN. So we set that up and run until it meets the condition. All right. But I think this is not the... okay, I think I did something wrong. Can I start it over? Sorry about that, I only learned how to use this last week, so please forgive me. I set the condition again, and now it should work. Okay, it works. It runs until it meets that condition, and it says that this node, cross_entropy/Log, has a minus infinity value, and we can see where it sits in the graph. And what you can do is trace back: what is the node feeding into this Log? It's a softmax, and you can click expand and highlight and see that the softmax produced a zero value, and when you take log of zero, that's basically minus infinity. And you can see the code that produced this error. Okay, it's a bit hard to see, but you can see that this is the softmax of the logits.
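For reference, here's roughly what hooking up the TensorBoard debugger looks like in code; the port number is arbitrary, and this follows the tfdbg documentation as I understand it.

```python
import tensorflow as tf
from tensorflow.python import debug as tf_debug

# First, start TensorBoard with the debugger enabled, e.g.:
#   tensorboard --logdir /tmp/debug_mnist --debugger_port 6064

# If you manage the session yourself, wrap it:
sess = tf.Session()
sess = tf_debug.TensorBoardDebugWrapperSession(sess, "localhost:6064")

# If you use an Estimator, pass a hook to train/evaluate instead:
debug_hook = tf_debug.TensorBoardDebugHook("localhost:6064")
# estimator.train(input_fn=train_input_fn, hooks=[debug_hook])
```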
So once you know that this is where the error is, you can go back to your TensorFlow code, fix it, and it should work. Okay, so let's go back. Like I said before, there's just an additional argument you have to specify when you start TensorBoard, plus the extra line or hook in your code. And that's it from my presentation. Let's see if you have any questions. Do we have time for Q&A? We have time for maybe one or two questions. Does anyone have a question?

[An audience member asks whether there is a tutorial or other resource to follow for learning how to use the TensorFlow debugger.]

Yeah, okay, thanks for the question. If you go to TensorFlow.org, they have a tutorial on debugging TensorFlow. I don't think they have a tutorial for the interactive TensorBoard version yet, but they have one for the non-interactive debugger, so you can just follow that first. They say everything should be exactly the same as the non-interactive one; the plugin just makes it easier because you can visualize things. All right, yeah. Any other question?

[Another audience member asks about the debugging demo: we figured out that the problem is the softmax producing a zero value, so how should we change the code to prevent this error?]

So the simplest way is just to add a small constant when you take the log, like log(x + epsilon), so it never sees an exact zero. Thank you.
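To make that last answer concrete, here's a hedged sketch of two common ways to avoid the log-of-zero problem; the epsilon is arbitrary, and the fused cross-entropy op is generally the preferred fix.

```python
import tensorflow as tf

# Hypothetical stand-ins for the demo's tensors.
hidden = tf.placeholder(tf.float32, [None, 128])
labels = tf.placeholder(tf.float32, [None, 10])   # one-hot labels

logits = tf.layers.dense(hidden, 10)

# Option 1: keep the explicit softmax + log, but add a small epsilon
# so tf.log never sees an exact zero.
probs = tf.nn.softmax(logits)
cross_entropy = -tf.reduce_sum(labels * tf.log(probs + 1e-8), axis=1)

# Option 2 (usually preferable): let TensorFlow compute softmax and
# cross-entropy together in one numerically stable op.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(
    labels=labels, logits=logits)

loss = tf.reduce_mean(cross_entropy)
```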