We're starting with an amazing talk from Divya Goswami. She'll be teaching us more about our beloved Python programming language and its interpreter. I won't take much of your time, so I'm handing over the stage to Divya. Happy learning! Divya, good luck with your talk.

Thank you. Good afternoon, everyone, from India. This is Divya Goswami, presenting "Who Begets Python". Today I'm going to talk about how Python was originally built, how it came into existence or, in my words, who begets Python. I will also share how the Python interpreter works behind the scenes. You just know that it runs your code, but we'll look at what happens under the hood. We'll also go through the steps Python takes your code through, from parsing to running. For people who want to follow along with me, there's the bit.ly link, and here's the QR code. You can scan the code or go to the link to get access to the slides.

Before I start the topic, a little about me. I'm a senior computer science student from Kolkata, and I've worked with the Open Mainframe Project and IBM Z, this time as a summer mentee at LFX. I am also an IBM Z ambassador this year, and I'm currently working as a DevSecOps intern at Trell. You can go through these links to see what I do on Twitter and GitHub. My portfolio link is right there, and a few of my blogs are posted at that link as well.

Before I start, a little flashback on how I came to this topic. The story of CPython began for me while I was working as an intern at a company some months back. I had the job of building a crawler in C++, and a crawler in C++ was a bit difficult for me. Back then I had started learning Python, and I knew I was much more confident in Python than in building a C++ crawler. So I discovered that you can embed C++ inside Python.
You can literally use some extensions and run the code from Python. The pro would be that I would be learning a new language. But what happened is that I became confused by terms like Cython, CPython, Jython, and PyPy. These were so misty back then that I took up the challenge of learning one of them and understanding how they differ from each other. As for the crawler, I did go back to C++ and built it there. But this is the first error I got while using Cython back then, and I thought it was something interesting that I should explore. So I saved the error in a file and kept it for later. The result is PyCon India 2021 and this presentation. That's how I started thinking about CPython.

Before going further, let me clear up the mist I was talking about. Cython, which is what I was trying to write the crawler in, is a compiled programming language. It's totally different from what I am going to talk about today, which is CPython. Cython gives you the power of C inside Python; it's basically Python with C data types. PyPy is an implementation of the Python interpreter in Python. It uses a just-in-time compiler, which means it compiles your Python code during execution, rather than the traditional way where a compiler compiles before execution. Another term I mentioned is Jython. It is another implementation of Python, but on the Java virtual machine, which is totally different again.

Now we come to CPython, which has a totally different meaning from all of these. CPython is how the Python interpreter was first built. The creator himself and a few other people back then wrote this language, this interpreter, in C. That's how Python came into existence. So CPython is a binary compiled from many C files.
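A small sketch of how to check which of these implementations is running a given script, using only the standard library. The variable names are illustrative; the printed values depend on the machine and interpreter, so none are shown here.

```python
import platform
import sys

# Name of the implementation running this script,
# for example "CPython" or "PyPy".
implementation = platform.python_implementation()

# Path of the interpreter binary that is executing this script.
# On Linux this is usually a file inside a bin directory.
executable = sys.executable

print("implementation:", implementation)
print("interpreter binary:", executable)
```

On a standard install from python.org or a Linux package manager, the implementation reported is normally CPython.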
I'll show you exactly which C files in a while. It's the compiled version of many C files in one single binary, and that is what resides in your bin directory. I'm not very familiar with Windows, but on Linux you have the bin directory where Python lives. That's what CPython is, and that's what we are going to talk about.

Before starting: why Python? This is from Guido van Rossum, the creator of Python. Remember that name. It's from the original tutorial he wrote himself, Chapter 1, "Whetting Your Appetite". It says that Python's elegant syntax and dynamic typing, which many of us already use, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms. From this we can understand that Python has an interpreted nature. What is that? We'll jump into it after some time. Even as an intern, I use Python extensively for automation and for writing small scripts; it's widely used in my daily workflows. So that's why most of us use Python.

As a starter for the curious ones, I have prepared some trivia. You can discuss it in the comments, maybe debate it, and we'll discuss the answers at the end of the talk. Trivia one says my balance is 10,000, a guitar costs 10,000, and food is 100. If guitar is balance, obviously people will say you cannot afford it, because you need the food, right? But what Python says is, "Let's rock it." Forget the food, we can get the guitar. What exactly is happening here? For the people who are curious, you can start discussing it in the comments right now.

Another trivia I'm giving you is a tuple that has nested lists inside it and an element. I have tried modifying one of the elements inside the tuple.
You can see I am trying to add, or append, to it. It does show an error saying that the tuple is not fit for mutation. But when you print it, it has actually mutated the inner list. So what exactly is happening here? That's another trivia you can discuss in the comments, and we'll come to it at the end.

So let's start from the beginning. The Python that you run on your system comes from source code, and this is the source code. You can get it from this GitHub link. It's under active development, and anyone can go and contribute to it. Every directory here has its own purpose, but to make this talk easier to understand, I will stick to the ones that are really useful for today's presentation. The Grammar directory contains the computer-readable language definition. The Include directory contains all the necessary C header files. Objects has the definitions of the various objects you use, like lists, tuples, and dictionaries. And Python, finally, is the main source code of CPython, the Python interpreter written in C. Everything here condenses and comes to this place, and the binary gets formed right there in that directory. All of these C files, together with the headers, get combined into a binary, and that's how Python is begotten. So who begets Python? C. But I'm not done yet. I have already answered my question, but there is a lot more to do. I have been given the time, and I will explain how exactly.

So here is the first hint to our previous trivia. This is an example I wrote: the definition of a function. It simply concatenates and gives "CPython in PyCon India". But what happens inside is that, in one of the steps, Python parses the code and produces some weird thing like this. You'll come to know that it's a tree. What exactly it is called, I'll reveal after some slides.
So this is one of the steps that happens inside Python before your code gets executed. You can see that it recognizes the code as a module. The module has a body, and the body recognizes some expressions, each with a string, another string, and some loads. But let's not bust our brains right now; we'll go through it slowly. First we'll talk a bit about how exactly Python reaches this step.

So, the roadmap. Python takes our code through the following steps: lexing, the initial step; parsing; then generating an AST, the abstract syntax tree, which is exactly what the previous tree was; then compiling, that is, creating the bytecode, which we'll come back to; and finally running the bytecode, the final step of execution. We'll go through them one by one.

First is lexing. Again, do not bother reading all of this. This is a file from Python's source, what is called the grammar. It is found in the Grammar directory inside the CPython source tree. It has all the recognition rules Python needs to identify syntax, tokens, and keywords: which word is which, and what each one means. Python does not inherently understand what each keyword means; this is what it references. So this is the first step, where Python distinguishes and finds out the purpose of each word that you typed in your code. That is the lexing step.

Now we come to the parsing step. So what is parsing? I have used a module named tokenize, where Python gives you information about how it referenced the grammar and worked out what is what. As you can see, the encoding of the Python code I've written is UTF-8. A small fun fact: once upon a time you could have written Python in ROT13 as well. Right now it's not supported; it was removed in Python 3.
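A minimal sketch of the tokenizing step with the standard tokenize module. The slide's exact source is not in the transcript, so the function below is an assumed reconstruction of the sample (a function foo that stores "CPython" in a variable py and concatenates a second string).

```python
import io
import tokenize

# Assumed reconstruction of the talk's sample function (not the original slide).
source = (
    "def foo():\n"
    "    py = 'CPython'\n"
    "    return py + ' in PyCon India'\n"
)

# tokenize.tokenize() expects a readline callable that returns bytes.
readline = io.BytesIO(source.encode("utf-8")).readline
tokens = list(tokenize.tokenize(readline))

for tok in tokens:
    # tok.type is an integer; tokenize.tok_name maps it to a readable name
    # such as ENCODING, NAME, OP, STRING, NEWLINE, INDENT, or DEDENT.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

The first token reported is the source encoding, followed by NAME tokens for def and foo, OP tokens for the brackets and colon, and NEWLINE and INDENT tokens for the layout.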
So this is how Python identifies which is what. As you can see, it knows that def is a name and that foo is a name. The brackets are operators. This is a newline, and the tab is recognized as an indent. You can see that it recognizes the string and the variable name. So this is the second step, where Python is tokenizing, or in our words, marking every single word in our code with what exactly it is. Now the second step, the parsing, is done. The code is fully readable and marked by Python, ready to go forward to dealing with the purpose of the code. We haven't yet gone into what the code actually wants Python to do. The question now is which object does what, and in which order.

Before we come to the AST, the third step, I'm going to talk a bit about what a tree with nodes looks like. This is an example expression I have written. If you feed it to Python, it makes a tree out of it. As you can see, it gets recognized as an expression. The expression breaks at the plus sign; we all know that terms are separated by pluses and minuses. So the terms get separated, then it recognizes the factors on either side of the multiplication, and that's how the whole expression goes into a tree. Here the expression is the root node, and it has a lot of child nodes. That's what a tree with nodes looks like.

Python does exactly that. It follows the same order and makes something called the abstract syntax tree. As the name says, it does something with the syntax. It makes the code into an object tree: it recognized all the objects in the tokenize step, and now it works out which object should be doing what, and in which order.
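A short sketch of how an arithmetic expression nests inside Python's AST. The expression "a + b * c" is an assumption chosen for illustration, since the slide's expression is not in the transcript. Python's AST does not keep separate expression, term, and factor layers; it keeps only the operators and operands, but the nesting follows the same precedence.

```python
import ast

# Illustrative expression (assumed; not the one from the slide).
tree = ast.parse("a + b * c", mode="eval")

# The root of the expression is the '+' operation.
root = tree.body
print(type(root).__name__, type(root.op).__name__)

# Its right-hand child is the '*' operation, which binds tighter.
right = root.right
print(type(right).__name__, type(right.op).__name__)

# The whole tree as a single string.
print(ast.dump(tree))
```

The multiplication ends up deeper in the tree than the addition, which is how the tree encodes that it must be evaluated first.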
So I have used the ast module, which is built into Python; you can use it as well. It parses the full function definition. Right now the result is not really in a human-readable format. As you can see, the tree is giving you some weird addresses. You can travel down the tree node by node; I have used index zero for the root node. You can go on like this, but for a more human-readable format you also have ast.dump, which dumps the whole AST. That, again, is not very readable, so I have used the pretty-print module to show line by line what it looks like. So this is the third step, where the AST gets generated, and this is exactly the tree you saw some slides back. That was the abstract syntax tree. It arranges an object tree on the basis of syntax analysis, and after this Python goes on to generate the bytecode.

So the tree is done, and now we go to the bytecode: making code objects, or creating the bytecode. There is a dunder code attribute, __code__, that is already available, and what it does is give you the code object. In a general sense, object code is a sequence of statements or instructions in a computer language, usually a machine language. So what Python has done now is create the AST and, from it, produce the compiled, low-level code, which is ready to be run. This is the stage just before Python runs the code. As you can see, I have used the __code__ attribute to look at the raw form of foo as a code object. You can also use several attributes on it. One is co_consts which, as the name suggests, gives you the constant values: it prints the literals inside the function definition. You can see "CPython" and "in PyCon India".
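A minimal sketch of the AST and code-object steps with the standard library. The function source is an assumed reconstruction of the talk's sample, and the file name "<sketch>" is a placeholder.

```python
import ast
from pprint import pprint

# Assumed reconstruction of the talk's sample function (not the original slide).
source = (
    "def foo():\n"
    "    py = 'CPython'\n"
    "    return py + ' in PyCon India'\n"
)

# Parse the source into an AST; the root is a Module node.
tree = ast.parse(source)

# Index zero of the module body is the function definition node.
func_node = tree.body[0]
print(type(tree).__name__, type(func_node).__name__)

# ast.dump() returns one long string; pprint wraps it over several lines.
pprint(ast.dump(func_node))

# Compile the tree and run it so that foo exists as a real function.
namespace = {}
exec(compile(tree, "<sketch>", "exec"), namespace)
foo = namespace["foo"]

# The code object and the constants stored inside it.
code = foo.__code__
print(code)
print(code.co_consts)
```

Printing tree on its own shows only an object address, which is the unreadable form mentioned above; ast.dump turns the same tree into text.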
Along with that there are several more, which I'll show you in some time. One more example I have shown is co_varnames, which prints all the variables used locally inside the function definition. And finally we come to co_code, which prints the unreadable raw bytes of the code. That's how the code gets broken down from the AST. Each of these will be converted into bytecode, that is, code in bytes. There will be small chunks of instructions, each bytecode holding one instruction, and they will be loaded onto something we'll come to afterwards, and that will finally run the code. So now you can see that Python has reduced that whole big def foo into these small bytecodes. What should be done next? It will be run.

Before running, now that we have seen co_code, we can look at what exactly is inside it. I have used another module here that gives you the opname attribute, which shows you what each code value means: which operation is carried out in each bytecode, which instruction is happening. For example, the first one is LOAD_CONST. That's the opname, the operation that will be carried out. But a bytecode actually contains two bytes: the opcode and the op argument. We'll see what the op argument is on each line. If you remember the function I have written, it stores the string "CPython" in a variable. So, as you can guess, it will be something like "CPython" getting loaded onto the stack. Now, talking about the stack.
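A minimal sketch of reading the bytecode with the standard dis module, which is one module that exposes opname. The function body is an assumed reconstruction of the talk's sample. The exact instruction list differs between Python versions (newer releases add instructions such as RESUME), so no output is shown.

```python
import dis


def foo():
    # Assumed reconstruction of the talk's sample function.
    py = "CPython"
    return py + " in PyCon India"


code = foo.__code__

# Names of the local variables used inside the function.
print(code.co_varnames)

# The raw bytecode: a bytes object, not meant to be read directly.
print(code.co_code)

# dis.get_instructions() decodes the raw bytes into named instructions.
instructions = list(dis.get_instructions(foo))
for ins in instructions:
    # opname is the operation, arg is the numeric op argument,
    # argrepr is a readable form of that argument.
    print(ins.offset, ins.opname, ins.arg, ins.argrepr)

# dis.opname is a list that maps an opcode number to its name.
first_opcode = code.co_code[0]
print(first_opcode, dis.opname[first_opcode])
```

Each decoded row pairs an operation with its argument, for example a LOAD_CONST whose argument refers to the "CPython" string in co_consts.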
This is the next step, where Python takes each of these instructions and loads it onto something called the stack. What is the stack? Before that: if you want to understand what each of these opcodes actually does, there is a file inside the Python source tree called ceval.c. It has a big switch statement. If you search it for LOAD_CONST, you will find its case, and it will show you what exactly the opcode does with the op argument. For example, I have shown LOAD_FAST here. LOAD_FAST is another case inside the switch, and this is what it does. If you are interested, you can check this out in ceval.c inside the Python source tree.

So, we were talking about a stack. Before coming to how Python runs the code, what is a stack? A stack is a data structure; the CS students among us have been taught this in class. It has two operations, push and pop, and it has a pointer called top, which points to the item currently on top of the stack. If it wants to carry out some operation, it pushes the item onto the stack, and once it's done, the item gets popped off. Something exactly like this is carried out inside Python.

So here is Python. Again I have used this module and fed in the function definition, and this is how each of the opcodes is loaded onto the stack and run. For those who have understood what the stack is: first it loads the "CPython" string, which, as I said before, is stored in the py variable. Then, in the next step, it loads the py variable and loads the constant "in PyCon India". And if you remember, the last step was to concatenate both strings, so it
finally added them. Since these two are strings, they got concatenated, and it finally returned the value. So that's very simply how Python has run the code using its stack-based VM. This is the final step in the execution of Python code.

If that is still difficult to understand, there is a module called instaviz. I think you have to install it using pip; I used pip to install it. Once you have installed it, you can use it in the interpreter's interactive mode: you call show on foo, and it opens a web server. What it shows is the definition, and then we come to the AST. So for the people who were confused by the addresses and that small structure, this is how instaviz lets you see it. I have shown you the function definition part because it has most of the information, and this is where I showed you the tree getting broken up. You can see foo, the name, and the keyword entries, where there was nothing, and then it shows the py variable and the value it holds, "CPython". This is one example of one root node and its AST, the abstract syntax tree.

You can also see the code object, which I showed you some time back. There are many attributes, as I told you before. Here it shows you the code as the bytecode string, and the constants are visible too. As I ran it in interactive mode, there is no file name, and the first line number is obviously one. If you follow along, you can see another one called nlocals, which shows the number of local variables inside the function definition, the stack size, which, if you remember, was two, and the variable names that were used inside.
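The push-and-pop walk-through of the sample function can be mimicked with a toy interpreter. This is only an illustrative sketch, not how ceval.c is written: the run function and the program list are hypothetical, and the instruction names follow older CPython releases (Python 3.11 and later use BINARY_OP in place of BINARY_ADD).

```python
def run(instructions):
    """Execute a list of (opname, argument) pairs on a simple value stack."""
    stack = []       # the value stack: push with append, pop with pop
    local_vars = {}  # local variable storage for STORE_FAST / LOAD_FAST

    for opname, arg in instructions:
        if opname == "LOAD_CONST":
            stack.append(arg)                 # push a constant
        elif opname == "STORE_FAST":
            local_vars[arg] = stack.pop()     # pop into a local variable
        elif opname == "LOAD_FAST":
            stack.append(local_vars[arg])     # push a local variable's value
        elif opname == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)        # push the sum or concatenation
        elif opname == "RETURN_VALUE":
            return stack.pop()                # hand back the top of the stack
        else:
            raise ValueError(f"unsupported instruction: {opname}")

    raise RuntimeError("program ended without RETURN_VALUE")


# Hand-written instruction list mirroring the assumed sample function.
program = [
    ("LOAD_CONST", "CPython"),
    ("STORE_FAST", "py"),
    ("LOAD_FAST", "py"),
    ("LOAD_CONST", " in PyCon India"),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]

result = run(program)
print(result)
```

At most two values sit on the stack at any moment in this program, which is consistent with the stack size of two mentioned for the code object.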
So what instaviz does is show you the whole breakdown of the code, so you don't have to go through all the modules and do it step by step. As you can see, there is also the disassembled code, where it shows you how the stack operations are carried out. If there is a jump target, it is shown there as well. That is what instaviz does; you can check it out if you want.

So, in conclusion, Python takes our code through the following steps. To revise: lexing was the first step, referencing its dictionary, the grammar. Parsing is where it tokenized every single keyword, marking what is what. Then it generates the AST, which decides in which order it will execute the code. Then it compiles that into chunks of bytecode, instruction by instruction, and finally it runs the code on Python's stack VM. That's how Python does these steps.

Now for the trivia; we have almost come to the end. In trivia one, what "is" does is check whether the same object is being referenced or not. This is a game of referencing in Python. guitar and balance are not referencing the same object. "is" does not compare the values but the objects, and each object has a different ID. As you can see, guitar has one ID and balance has a different ID, so the comparison evaluated to False, and that's why it printed "Let's rock it."

The solution to trivia two is again a game of referencing in Python. As you can understand, the tuple held a reference to a mutable object, a list. When Python tried to access that list, it had no problem, because it was a list. But then it used the plus-equals sign, which means it was appending.
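A hedged sketch of both trivia snippets; the exact code on the slides is not in the transcript, so the values and messages below are assumptions. The integers are built with int() at run time so that, on CPython, guitar and balance are typically two distinct objects even though their values are equal.

```python
# Trivia one: equal values, but not necessarily the same object.
balance = int("10000")
guitar = int("10000")
food = 100

same_value = guitar == balance   # compares the values
same_object = guitar is balance  # compares object identity
print(same_value, same_object)
print(id(guitar), id(balance))

if same_object:
    print("You cannot afford it")
else:
    print("Let's rock it")

# Trivia two: a tuple holding references to mutable lists.
t = ([1, 2], [3, 4], 5)
error = None
try:
    # The list is extended in place first; only afterwards does the
    # assignment back into the tuple fail.
    t[0] += [9]
except TypeError as exc:
    error = exc

print(error)
print(t)
```

Using == instead of is in trivia one would compare the values and take the other branch. In trivia two, calling t[0].append(9) would mutate the list without raising any error, because no assignment to the tuple is attempted.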
It did understand that it was trying to mutate the tuple, but by then the list referenced inside had already been mutated. That's why you got the mutated list inside the tuple.

So now we come to the end of the slides. Thank you for listening to this talk. For the people who are interested in going deeper into this: if you don't remember these links, just remember Ten Thousand Meters; he does a lot of these things. There is also "Inside the Python Virtual Machine", which you'll find readily if you search for it. It shows you every single step, in even more depth than I showed here, by writing small functions and visualizing each of the steps, even the stack VM, because each of these is a huge topic. Also, if anyone wants to contribute or get a deeper understanding of the source tree, Larry Hastings has given a talk called "Stepping Through CPython"; you can search for that as well. So thank you so much for coming to this talk. If you have any questions, this is the time to throw them at me. Thank you.

I think, Nikhil, are you audible? Sorry.

So we have one question. Thank you for that very insightful talk. In the chat, tuples were mentioned as hashable and said to be immutable, so what does hashable mean?

Hashable, in the chat? Okay, I have to check the chat. Hashable is not really a topic I'm well acquainted with, so this might be a chance for me to learn as well. I'll definitely go through the chat; I don't think I'll be able to find it at the moment.

We have another question: if a is 256 and b is 256, "a is b" returns True. Can you explain why?

That used to happen, though maybe not right now. It was true in some of the versions; I don't think it evaluates to True right now. But it's again about the referencing part.
It was referencing the same stored object, but in the latest versions it's actually getting replaced. So the object reference changes the second time you write it; it doesn't actually.