Hello, everyone. In this talk, I'm going to show you how to design functions that can be correctly graph-converted using two of the most exciting features of the new TensorFlow release, 2.0: AutoGraph and tf.function. But first, let me introduce myself. I'm Paolo Galeone. I'm a computer engineer, and I do computer vision and machine learning for a living. And I'm literally obsessed with TensorFlow. I started using TensorFlow as soon as Google released it publicly, around November 2015, when I was a research fellow at the University of Bologna at the Computer Vision Laboratory. And I have never stopped since then. In fact, I blog about TensorFlow. You can see the address of my blog there. I answer questions on Stack Overflow about TensorFlow almost daily. I write open source software using TensorFlow, and I use TensorFlow every day at work. For this reason, Google noticed this strong passion and awarded me the title of Google Developer Expert in Machine Learning. So, as I mentioned, I have a blog, and I invite you to go read it, mainly because this talk was born from a three-part article I wrote about tf.function and AutoGraph.
So after this brief introduction, we are ready to start. In TensorFlow 2.0, the concepts of graph definition and session execution, the core of the descriptive way of programming used in TensorFlow 1, have disappeared or, better, they've been hidden in favor of eager execution. Eager execution, as almost everyone should know, is the execution of the computation line by line, typical of pure Python. This design choice has been made with the goal of lowering the entry barrier, making TensorFlow more Pythonic and easier to use. Of course, the description of the computation using dataflow graphs, proper to TensorFlow 1, has too many advantages that TensorFlow 2.0 must still have. For instance, graphs have faster execution speed and are easy to replicate and to distribute. Graphs, moreover, are a language-agnostic representation.
In fact, a graph is not a Python program, but a description of a computation. Being agnostic to the language, graphs can be created using Python and then exported and used in any other programming language. Moreover, automatic differentiation comes almost for free when the computation is described using graphs. So, to merge the graph advantages proper to TensorFlow 1 and the ease of use of eager execution, TensorFlow introduced tf.function and AutoGraph.
So this is the signature of the function. tf.function allows you to transform a subset of Python syntax into a portable and high-performance graph representation with a simple function decoration. As you can see from the function signature, in fact, tf.function is a decorator and uses AutoGraph by default. AutoGraph lets you write graph code using natural Python-like syntax. In particular, AutoGraph allows you to use Python control flow statements like if, else, while, for, and so on inside a tf.function-decorated function, and it automatically converts them into the appropriate TensorFlow graph nodes. For instance, a Python if statement becomes a tf.cond, a for loop becomes a tf.while_loop, and so on.
However, in practice, what happens when a function decorated with tf.function is called? This is a schematic representation of what happens, and it is a two-phase execution. The most important thing to note is that when a function decorated with tf.function is invoked, eager execution is disabled in that context. On the first call, the function is executed and traced. With eager execution disabled, every tf.* method just defines a tf.Operation that produces a tf.Tensor object as output, exactly as in TensorFlow 1. It's the exact same behavior. At the same time, AutoGraph starts and is used to detect the Python constructs that can be converted to their graph equivalents. So a while becomes a tf.while_loop, and so on.
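As a small illustration of the AutoGraph conversion mentioned above, here is a minimal sketch, assuming TensorFlow 2.x is installed. The function name `absolute_value` is made up for this example and is not from the talk. Because the condition depends on a tf.Tensor, the Python `if` is turned into a graph conditional rather than being evaluated by the Python interpreter.

```python
import tensorflow as tf


@tf.function
def absolute_value(x):
    # x is a tf.Tensor, so AutoGraph converts this Python `if`
    # into a graph conditional instead of evaluating it in Python.
    if x < 0:
        y = -x
    else:
        y = x
    return y


# Both calls reuse the same graph: the input dtype and shape are identical.
print(absolute_value(tf.constant(-3.0)).numpy())
print(absolute_value(tf.constant(2.0)).numpy())
```

Both branches assign the same name `y`, so the conditional produces a single, well-defined output whichever branch runs.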
So once all these pieces of information have been gathered, we can build the graph. We have the function trace and the AutoGraph representation. And since we have to replicate the eager execution order line by line, what happens is that the execution order of every statement is forced using the TensorFlow 1 tf.control_dependencies statement. At the end of this process, we have built the graph. Then, based on the function name and on the input parameters, a unique ID is created and associated with the graph. Then the graph is placed and cached into a map, so we just have a map from ID to graph. Any function call will then reuse the defined graph only if the key matches. Of course, since tf.function is a decorator, it forces us to organize the code using functions. In fact, functions are the new way of executing something in a session.
Now that we have a basic understanding of how tf.function works, we can start using it to solve a simple problem and see if everything goes as described here. So this is the problem, and it is really easy: it's just the multiplication of two constant matrices followed by the addition of a scalar variable b.
So this is the TensorFlow 1 solution. In TensorFlow 1, we first have to describe the computation as a graph inside a graph scope. There is always a default graph present, but in this case we define it explicitly here. Then we create a special node with the only goal of initializing the variables, and everyone familiar with TensorFlow 1 should have seen this line thousands of times. And in the end, we create the session object, which is the object that receives the description of the computation, the graph, and places it on the correct hardware. Then we can finally use the session object to run the computation and get the result. So this is the standard implementation in TensorFlow 1. In TensorFlow 2, thanks to eager execution, things are different.
The solution to the problem becomes much easier. In fact, we only have to declare the constants and the variables, and the computation is executed right there, without the need to create a session. In order to replicate the same behavior as the session execution, we rewrite the code inside a function. Executing the function has, in fact, the same behavior as the previous session.run of the output node. The only peculiarity here is that every TensorFlow operation, like tf.constant, tf.matmul, and so on, produces a tf.Tensor object and not a Python native type or a NumPy array. For this reason, as you can see in the last line, we have to extract the NumPy representation from the tf.Tensor by calling the .numpy() method. We can call the function as many times as we want, and it works like any other Python function.
So right now we have only a pure eager function. But what happens if we try to decorate this function and convert it to its graph representation using tf.function? Adding the decorator is pretty straightforward. And of course we might expect that, since this function works correctly in eager mode, we can convert it to its graph representation just by adding the decorator. Let's try and see what happens. I added two print statements before the return statement. The first one is a print statement executed only by Python. The second one is a tf.print statement, which is a node in the graph. This will help us understand what's going on.
So, this is the first output we see on the console. When the function is called, the process of graph creation starts. At this stage, only the Python code is executed, and the execution is traced in order to collect the data required to build the graph. As you can see, this is the only output we get. The tf.print call is not evaluated, like any other tf.* method: TensorFlow already knows everything about that particular node, and therefore there is no need to trace its execution.
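For reference, the eager version of the function described in this part of the talk can be sketched roughly as follows, with the two print statements already added. The slides are not part of this transcript, so the matrix values and the name `f` are assumptions chosen for illustration. Only the structure follows the description: two constant matrices, a scalar variable b, and a final `.numpy()` call.

```python
import tensorflow as tf


def f():
    # Illustrative values, not necessarily the ones shown on the slides.
    a = tf.constant([[10.0, 10.0], [11.0, 1.0]])
    x = tf.constant([[1.0, 0.0], [0.0, 1.0]])
    b = tf.Variable(12.0)  # in eager mode this is just a Python object
    y = tf.matmul(a, x) + b
    print("Python execution:", y)    # plain Python print
    tf.print("Graph execution:", y)  # TensorFlow print operation
    return y


# In eager mode the computation runs immediately, with no session;
# .numpy() extracts the NumPy array from the returned tf.Tensor.
result = f().numpy()
print(result)
```

In eager mode both print statements run on every call. The difference between them only shows up once the function is decorated with tf.function.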
Moving forward, we can see the second output. So, we got this exception: the tf.function-decorated function tried to create variables on non-first call. But in eager execution, this function worked correctly. So, what's going on here? The exception, of course, is a little bit misleading, since we called this function only once, but the exception is talking about a non-first call. In practice, tf.function called this function more than once while trying to trace its execution to create the graph. But in short, as is easy to understand, tf.function is complaining about the variable object. This first exception brings us to the first lesson of this talk.
And this is the lesson. A tf.Variable object in eager mode is just a Python object that gets destroyed as soon as it goes out of scope. That's why the function works correctly in eager mode. But a tf.Variable in a tf.function-decorated function is the definition of a node in a persistent graph, since eager execution is disabled in that context. So, since the graph is persistent, we can't define a new variable every time we call the function. And this brings us to the solution of the problem. The solution is to just think about the graph definition while defining the function. We can't declare a new variable every time the function is called; we have to take care of this manually. By declaring the variable as a private attribute of the class F and creating it only during the first call, we can correctly define a computational graph that works as we expect.
And this brings us to our second lesson. The second lesson is that eager functions are not graph-convertible as they are. There is no guarantee that functions that work in eager mode are graph-convertible. Always define the function structure thinking about the graph that's being built. Okay, so this was the first topic of the analysis of tf.function.
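The failure and the fix described above can be sketched as follows. This is a reconstruction under assumptions, not the code from the slides: the matrix values, the helper `compute`, and the name `broken` are made up, and only the class name F and the idea of creating the variable once come from the talk. The exact exception message may differ between TensorFlow versions.

```python
import tensorflow as tf


def compute(b):
    # Illustrative constants, not necessarily those on the slides.
    a = tf.constant([[10.0, 10.0], [11.0, 1.0]])
    x = tf.constant([[1.0, 0.0], [0.0, 1.0]])
    return tf.matmul(a, x) + b


@tf.function
def broken():
    # A new tf.Variable would be defined on every trace of the function.
    return compute(tf.Variable(12.0))


try:
    broken()
    error = None
except ValueError as exc:
    error = exc
    print("tf.function refused to build the graph:", type(exc).__name__)


class F:
    def __init__(self):
        self._b = None  # the variable is created lazily, exactly once

    @tf.function
    def __call__(self):
        if self._b is None:
            self._b = tf.Variable(12.0)
        return compute(self._b)


f = F()
print(f().numpy())
print(f().numpy())  # the second call reuses the graph and the variable
```

Keeping the variable as object state means its lifetime matches the lifetime of the graph that uses it, which is the point of the first lesson.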
Now we can move forward and analyze what happens when the input type of a tf.function-decorated function changes. This part of the talk is perhaps by far the most important, since tf.function has to bridge two completely different worlds. In fact, Python is a dynamically typed language where a function can accept any input type, while TensorFlow, being a C++ library under the hood, is a strictly statically typed library. Every node in the graph must have a well-defined type and also a well-defined shape. So we are going to define a function to test what happens when we change the input type.
This is the function; it is the identity. As we can see on line one, the function accepts a Python variable, x, that can be literally anything. On line two, we have a print function that's executed only once, during the function tracing. On the third line, we have the tf.print function, which is executed every time the graph is evaluated. In the end, since this is the identity, we return the input parameter.
Okay, this is the first test. When the input is a tf.Tensor, we expect a graph to be built for every different tf.Tensor dtype. This should happen, of course, only once. Then, every time we call the same function with the same type, we have to reuse the graph created on the first call. On every second call, therefore, we don't expect to see the Python execution line, but only the output of the graph execution. Let's see the output. As you can see, when the input is a tf.Tensor, everything works as we expect.
And since everything is going smoothly, we can try to dive a little deeper into the AutoGraph structure and check whether the graph that is built after the AutoGraph execution and the function tracing is what we think it is. In short, we think that it should only contain the tf.print statement and the return of the input parameter.
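The first test can be reproduced with a sketch along these lines. The `trace_log` list is a hypothetical helper added here to count how many times the function is traced. It is not part of the talk, and it relies on the fact that Python side effects only run during tracing.

```python
import tensorflow as tf

trace_log = []  # hypothetical helper: filled only while tracing


@tf.function
def identity(x):
    trace_log.append(x.dtype)        # Python side effect, runs only when tracing
    print("Python execution:", x)    # also runs only when tracing
    tf.print("Graph execution:", x)  # graph node, runs on every call
    return x


# Two calls per dtype: the second call should reuse the graph of the first.
for dtype in (tf.int32, tf.float32):
    identity(tf.constant(1, dtype=dtype))
    identity(tf.constant(2, dtype=dtype))

print("number of traces:", len(trace_log))
```

With four calls and two dtypes, the Python print line should appear twice and the tf.print line four times.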
Okay, using the tf.autograph module, it is possible to see how AutoGraph converts a Python function to its graph representation. The code, of course, is a mess because it's machine-generated, but we can notice something unexpected. Maybe I can try to move this line. This is a little bit unexpected. In fact, there is a reference to the Python execution inside the graph translation. This is strange, and it is not what we expect when we just want to create a graph. We can analyze only this part. Without digging too much into the construct, we can see that there is the name of the function that is executed by Python; of course, it is print. Then there are its arguments, "Python execution:" and x, all wrapped inside a control dependency on return. The second parameter of the AutoGraph converted call is the owner and, as you can see, it is None. This means that there is no package known to AutoGraph or TensorFlow that contains the print function definition. So in short, this line is a statement that gets converted to a tf.no_op operation, and its only side effect is to force the execution order. In practice, we are just enforcing the execution order of the subsequent statements after the execution of a tf.no_op node.
Okay, after this short analysis of how a function gets graph-converted, we can now see what happens when the input is not a tf.Tensor but a Python native type. The code is similar to the previous one. We just defined a helper function called printinfo to be sure that we are feeding the correct data type to the function. Since the function is trivial, we expect, of course, the same behavior we got before. Now we can see what happens when a Python integer is fed as input, and something weird is going on. The Python execution, as you can see, is displayed not only once, as we might expect since this is a single data type, integer, but twice.
The graph, therefore, is being recreated at every function invocation, and this is really weird. But, trust me, things get even worse. Now, in the first execution, we have defined two graphs, one for the value 1 and one for the value 2. But what happens if we now feed the same values with a different data type, as floats? As you can see, the graph now is not recreated at every invocation, but given a float input, we get an integer output. So this is no longer the identity function. It is somehow broken. In fact, the return type is wrong, and the graphs that were built for the integers 1 and 2 are being reused for the float values 1.0 and 2.0. So this was my face when I discovered this.
I spent some time figuring out what was going on, and I summarized it in the next lesson. This is lesson number three. tf.function does not automatically convert a Python integer to a tf.Tensor with the expected data type; since integers in Python are 64 bits, we would expect a tf.int64, and so on. The graph ID, when the input is not a tf.Tensor object, is built using the variable value, not the type. This is a design choice of the tf.function authors that I don't like that much, since it makes the graph conversion not natural, and you have to worry about this behavior. Moreover, since a new graph is created for every different Python value, we run the risk of designing terribly slow functions.
In fact, we can see a simple performance measurement. g is the identity function here. In the first loop, g is fed with the tf.Tensor objects produced by the execution of the tf.range function. The second loop instead invokes g with 1,000 different Python integers, and this means that we are building 1,000 different graphs. AutoGraph is actually optimized, and it works well when the input is a tf.Tensor object, as you can see from the time measurement here.
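A scaled-down sketch of this comparison is shown below. It counts traces instead of measuring time, since timings depend on the machine. The `traces` list is a hypothetical helper, and the loop length of 50 is an arbitrary choice, smaller than the 1,000 iterations mentioned in the talk. The behavior with float inputs described above may differ in later TensorFlow releases, so it is deliberately left out here.

```python
import tensorflow as tf

traces = []  # hypothetical helper: one entry per trace of g


@tf.function
def g(x):
    traces.append(x)  # Python side effect, runs only while tracing
    return x


# tf.Tensor inputs: same dtype and shape every time, so a single graph.
for i in tf.range(50):
    g(i)
tensor_traces = len(traces)

# Python integers: each distinct value produces its own graph.
# TensorFlow may log retracing warnings while this loop runs.
traces.clear()
for i in range(50):
    g(i)
python_traces = len(traces)

print("traces with tf.Tensor inputs:", tensor_traces)
print("traces with Python int inputs:", python_traces)
```

Wrapping each Python value with tf.constant before the call would bring the second loop back to a single trace.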
By contrast, tf.function creates a new graph for every different Python input value, with a huge drop in performance. And this brings us to the fourth lesson: use tf.Tensor everywhere. Seriously, this is the mantra to repeat.
tf.Tensor is not the only TensorFlow object that we have to use when we are using tf.function. In fact, tf.function has this weird behavior when using Python native types, but it also has other weird behaviors when using other Python native constructs. This brings us to the last part of the presentation, which is really brief. What happens when we just plug some Python operators into the tf.function-decorated function? This function works correctly in eager mode. Given the tf.Tensor x that holds the constant value of one, we expect to get the output "a == b", since a and b are the same Python object. I guess everyone here would agree that the final else branch should never be reached, because if we feed a number, one of the conditions should be satisfied and we should never reach the "wat" line. But in practice, if we execute this function, this is the output. Wat? Yes.
So, keeping this really short, there are several problems in that function. The biggest one, which has affected TensorFlow since the early releases, is that the Python equality operator is not overloaded as tf.equal. The second huge problem is that AutoGraph handles the conversion of the if, elif, and else statements, but not the conversion of the Boolean expressions defined using the Python built-in operators. So in short, the correct way of writing the function is to use the TensorFlow Boolean operators everywhere instead of the Python native operators. And this brings us to the last lesson, the operator lesson. That's this one: use the TensorFlow operators explicitly everywhere. Seriously. Otherwise, you get weird behaviors that are complete nonsense and really hard to debug. So we are reaching the end. And this is a recap of the five points.
So, variables need special treatment. You have to think about the graph while designing the function. The conversion from eager to graph is not so straightforward. There is no autoboxing of Python native types to tf.Tensor, so we have to use tf.Tensor everywhere, and we also have to use the TensorFlow operators explicitly everywhere. So this is the end. I hope you enjoyed the talk. And I just want to share with you the fact that I'm writing a book about TensorFlow, tf.function, and neural networks. If you want to stay in touch and be informed when the book is out or when a new article about the TensorFlow ecosystem is out, just leave your email on the subscribe page. Thank you.
We have all the time we want this evening for questions, because it's the last talk in this room. If no one minds, I would start with one question. So first, please post your slides after the talk. I really want to reproduce the examples you gave us, because this is what I like about TensorFlow: you sometimes get this crazy stuff and crazy errors, and you have no idea what they mean. That was my first thought. And the second: what do you think, did the developers of TensorFlow do this on purpose? All this you said about the less, greater, and equal operators. Did they do this on purpose, not replacing them with tf.math.greater and so on?
I'm 100% sure that the Python __eq__ operator has not been overloaded as tf.equal because internally, in the TensorFlow code base, they use tf.Tensor as a key in maps. So tensors have to be hashable. And therefore, they can't use tf.equal, because tf.equal generates a new TensorFlow operation and therefore is not something hashable. This is the reason for tf.equal. The other replacements, so for greater, less, and so on, should be converted.
And perhaps they will be converted, because in the RFC they said that, of course, in the future they will handle these comparisons. But since there is this problem with the equality operator, perhaps they can't do it, and they force us to use the TensorFlow Boolean operators for this reason.
Thank you. If you have any questions, please come to the mic, because it is not detachable. So we still have a bit of time for one or two questions. Okay, then, thank you very much for the talk, and thanks everyone for being here.