Alright, it's two o'clock. Let's get started. Any class questions? Questions on anything to do with class projects? Content? Midterms? Solutions for the midterms? Have we ever been able to know what we want to impose it on? Go to recitation sections and talk to each other. There are seven recitation, six recitation sections. You signed up for one when you signed up for the class, right? DJ. I think Eric also recorded a video. I don't know if this week's is specifically him going over the recitation section, but Eric also recorded his recitation section. Any other questions? Questions, or more questions? Yeah.
So the midterm, is it going to be structured the same as the one that you have just given us? They'll be similar. Yeah. They'll be very similar to the practice midterm. Maybe more questions, maybe fewer questions. I don't know. Yeah. So is it going to be on recursive descent parsers? Yes, it's covering lexical analysis, lexing, and syntax analysis. That's it. No semantics. Although we're getting into a lot of stuff that's going to be good testing-type material, I'll wait until the next exam. Going once, going twice, back to semantics. Okay. Now let's go through and talk about exactly what's going to be on the midterm. Just kidding.
Okay. So we're talking about semantics. Why are we talking about semantics? What do we do with the parse tree? To figure out what the parse tree does. Yeah, to figure out what the parse tree does. To figure out what this program that we've just spent all this time lexing and parsing actually does. What does it mean? Right? So there are a couple of things we want to know. We want to know, does this parse tree make sense? Does this do what we think it should do, or did the programmer do something nonsensical, like try to add a string to an integer, for instance? Cool. So we talked about three different ways of actually defining semantics.
We have English documentation, a reference implementation, and formal math like this. So you can think of this two ways: you want to learn a new programming language, or you want to describe your new cool programming language. What language features do you need to describe? What do you need to describe the semantics of, so that somebody else can either write a compiler for it or use it and write a program in that language? Say we're learning about a new programming language.
The data types? Yeah, so maybe something about the type system, like some of the basic data types. What else? Operator behavior. Operator behavior? What do plus and minus do? Yeah, exactly. Plus, minus. In some programming languages, I'll say OCaml off the top of my head, you can't use the plus operator on floats. Plus is only for integers. You have to use a different operator to add floats together, which is incredibly annoying. Or in some programming languages, you can use the plus operator to concatenate strings. It kind of makes sense, but you're not really adding the strings together, right? It's actually a new operator with different data types. Whereas in a different language like PHP you concatenate strings by using dot, which is very confusing if you switch between them.
So what else? You need to know built-ins, you need to know types. What was that? Functions. What about functions? The different functions, like the built-in functions? Yeah, so what's the standard library? What kind of functions are provided? What else? How to do input and output? Yeah, so it could be baked into the language. In a language like Python, print is a language statement. In other languages, like C, you have to include stdio.h from the standard library in order to get functions that will print for you. What else? Class or function. Yeah, how to construct a class? That would be one thing. How to construct new types? So you have the basic types.
How to construct more? How do you say, I want a point which is composed of two float values, an x and a y, and I want to consider that as one type? Or how do I create classes? Or how do I write functions? We just talked about which functions do what, but how do you even write your own functions? What else? How to handle errors. So different languages do it differently. Java throws exceptions. C returns different values from a function depending on the error codes. What else? Imports, yeah, modules, right? How to use other people's code, whether it's a #include in C or an import in Java. We want some way to be able to reuse other people's code. Awesome. So we've got a lot of different things.
Even variables, actually. We talked about this, right? How do we define variables? What does it mean to define a variable? When can I refer to that variable? Can I have variables with different names? What does that mean? Functions, creating functions, we just talked about. What about calling functions? How do you pass parameters to functions? There are different ways that we can do that. And what are the exact semantics of calling a function? As we'll see, they vary between programming languages, and even within the same programming language you can have different semantics for passing arguments into a function.
So function parameters, types, operators, exceptions, control structures. This is something that seems so basic we didn't even think about it. So you have ifs, whiles, do-whiles. In some languages, like Lisp, you can actually create your own control structures. So you can create something called an unless, which is like if but the opposite. So unless the condition is true, do the branch. It's the opposite of an if, but it can actually make your code read a little bit easier. So, unless this variable is true, do this code block, right? Rather than, if this variable is not true, then do this code block.
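C cannot define real control structures the way Lisp can, but as a rough sketch of the unless idea, a preprocessor macro can approximate it. The macro name `unless` and the helper `clamp_nonneg` are made up for illustration, and this is plain text substitution, not a true language extension.

```c
/* Hypothetical macro: "unless (c)" expands to "if (!(c))".
   The preprocessor only substitutes text; this is not a real
   user-defined control structure like a Lisp macro. */
#define unless(cond) if (!(cond))

/* Illustrative helper: returns v, or 0 when v is negative. */
int clamp_nonneg(int v)
{
    unless (v >= 0) {
        return 0;   /* runs only when the condition is false */
    }
    return v;
}
```

Calling `clamp_nonneg(-5)` takes the unless branch and returns 0, while `clamp_nonneg(7)` skips it and returns 7.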
Are control structures just for readability's sake, or is there some other benefit of having control structures? Of making your own. Oh, making your own. Yeah, it all depends. Like with most language features, you can go crazy and shoot yourself in the foot and do something insane, and people read your code and it's like, what in the world does this do? You can do that in any language, right? But if you use it correctly, it can maybe help improve the readability and get across what you're trying to do. So it just depends. Yeah, so some languages have for loops, like a C-style for loop with some initial condition, a loop condition, and then a thing to do on every iteration. Some languages have a for-each loop, where you want to loop over every element of a collection. In a language where you can define your own control structures, you can create a for-each loop. You don't need the language itself to change to enable that.
Constants. Why do we want to use constants? Why don't we just use variables? The name gives you a guarantee. Yeah, right. The compiler will guarantee that this variable does not change. When I was in undergrad I was working for a company, which I will not name since we're being recorded, and I inherited some code that was written by, I think, a physics student who was a self-taught programmer. That person had written code that was like variable one equals one. Like the variable o-n-e equals the value one. And then was using that. It hurt so bad, because what if somebody changed that? What if later on it was one++ or something, right? If you're going to take the time to declare a variable called one, you probably never want that thing to change, right? So anyway, constants are super important language features. So you can think about constant variables. What about const in, like, C++? How does that help? You just ignore it. Unless the compiler tells you, then you just put it in.
Like the const keyword in C++. Why would you want to use that on an argument to your function when you're defining a function? Yeah, well, the key thing is, just like with the constant variable, you're guaranteeing to the person who calls your function, I will not change this parameter, right? This value is going to remain constant. So you don't have to worry about passing in something and then having the function change it. That's one of the key benefits of doing that in C++. Cool.
Methods. What's the difference between a method and a function? A class. Yeah, right? So the way you usually think of it is that a method is associated with a specific class, and when it's invoked it operates on an instance of that class. That instance is usually accessible through something like the this keyword, or self in the case of Python, right? Classes, all that kind of stuff, right?
So we need definitions for every single one of these things. And can we have arbitrary, vague ones, like, oh yeah, just add these things together? No, we need incredibly precise and accurate descriptions of what things do. So those of you who are very detail-oriented will have a very easy time with this section, because we're going to be very precise in how we define the semantics of the language operators that we use, and we'll look at different types of semantics for different types of things and how that changes our programs. So you really have to focus on these things and be very precise. And of course, as I say this, now I'm worried I'm going to mess up when I go through the examples, so let's say I'm doing that on purpose, and your job is to catch me as I say things, so make sure I'm saying them very precisely. Cool. Let's start with something super simple, like declarations. So what is a declaration?
Yeah, so defining, usually. And we'll kind of think about C for right now, though it applies to other languages. So we want to define variables. What other things do we want to declare and define? Functions you may want to explicitly declare. So in C you can declare a function without defining it and say, here's my function signature, I'm going to use this, I'm going to define it later, trust me. So we'll see that scope and declarations are very intertwined. So sometimes in our language constructs, just depending on the language itself, we need to explicitly say, I want a variable called i and I want a variable called foo. I'm writing my name so I'm not going to do any more. But oftentimes we want to give these definitions names. We want to declare, I have a variable, I want to call it foo so I can reference it later, and it has a specific type. That would be what you're declaring.
Does every declaration have or need a type? Sorry? Yeah, name, type works too. Let's go with type first. Does every declaration need a type? Yes. Yes? So you're not going to implicitly... Yeah, so once again it depends on the language, and it depends on the semantics of that language. In C and Java you pretty much need to declare types 100% of the time, but in a language like Python you don't need to declare types at all. And in some languages types are optional, so you can leave them undeclared, but if you do declare them, the compiler will use that information and generate optimized code for that specific type.
What about names? Are names mandatory? You may need to use it somewhere. What if I assign it to something? Think about this, we're going to come back to it. I'm trying to think of whether you've experienced this. I would say you probably haven't seen it yet, but you can create anonymous types, types with no name. You can create a struct that has no name. You can define a variable whose type is a structure with no name. We'll get to the uses of that, but names are even optional, right?
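As a small sketch of that last point, here is a C variable whose type is a struct with no name (no tag). The variable name `origin`, its fields, and the helper `origin_sum` are assumptions chosen for illustration, not something from the lecture.

```c
/* The struct has no tag, so the type itself has no name.
   Only the variable `origin` (an illustrative name) can be referred to. */
struct {
    float x;
    float y;
} origin = { 1.5f, 2.5f };

/* The type cannot be named anywhere else, but the variable works normally. */
float origin_sum(void)
{
    return origin.x + origin.y;
}
```

Because the type has no name, no second variable of the same type can be declared later; everything that needs that type has to be declared right where the struct body is written.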
So you want to think about the design space here. We're talking about these kinds of things because we're talking about the design space of programming languages, so I want you to think about crazy stuff like that. What if I had a language with no names? It would be weird. You could do it. Actually, you could definitely do it. I don't think I'll remember, but maybe bring this up when we talk about lambda calculus, because we have no names there and we can do cool stuff. What about languages with no names? What about languages with no types? What about languages with no names and no types? Right? So think of that design space, and then think, how does that affect me, the programmer, when I write programs in this language?
So, a super simple declaration, right? int i: I have an integer i. But in some languages declarations are implicit. So in Python you don't actually have to declare a variable. You're actually going to be... I don't even think you do. I think I should just put it there. So like we said, with implicit variable declarations. So here, target equals test value plus 10. This is saying, I want a new variable target if it doesn't already exist. So why might that be good or bad? Types. So, A, I'm not going to specify types, which is kind of cool. I don't need to specify types and all those other things. B, I don't have that line at the top of my function that's specifying exactly everything I'm going to use. It would be kind of like reading a book where at the beginning of the book, or at the beginning of the chapter, you have a list of all the characters who are going to appear in this book or in this chapter, just to get you ready for reading. You can skip over that, because you'll see it, you'll kind of figure it out. What are some of the downsides? That's the big one, right?
So if you mistype it. Let's say you already declared target above, or even worse, let's say above I used targets, and now here I'm using target without an s. The compiler doesn't complain. It sees a new variable, it creates a new variable for you, but the code doesn't work. And the code doesn't crash, that's the worst part. It just doesn't work right. So that's actually a huge problem when you're using this implicit style where you don't have to explicitly declare it. Cool.
Let's talk about names. I already said you can do stuff without names, but let's talk about names. So one of the key questions that we're going to be looking at now is, once something is declared, how long is that declaration valid? Only for the scope that it's declared inside. But what is the scope? What could it be? Function? So a variable defined in that function is accessible in that function. Class: if you declare a member variable in a class, it's accessible throughout any function in that class. What was that? Only in its braces, so it won't work anymore outside, just like for a counter variable. So how does that work? So let's see, is it a function? If you define a variable inside of, let's say, an if block, can you access that variable outside that if block? No. So C actually has block-level scope. It's all about the braces. Whereas in other languages, like JavaScript, variables are declared at the function level. So a variable declared anywhere in a function is accessible anywhere in that function until the end of the function.
What are some other types? So we talked about member variables, which are accessible all the way throughout the class or any method associated with that class. What else? Globals. Globals? So where are those accessible? Everywhere. Everywhere? Can I access your global variables? In whose program?
Maybe. There actually is a file scope. So if you look at the lexer.c from project 2, some of the variables are declared static, and that means they're only accessible in that file. So even if you include lexer.h, you can't access any of those values, even though they're global variables. So that's what the static keyword means on variable declarations in C code. It's file scope.
Well, let's say I don't have static, and think back to project 2. How do you get the line numbers in project 2? Did it manually, by hand? It's all in the running process. Where did the line number live? lexer.c. Yeah, in lexer.c, but you're writing code in, what is it? project2.c. Something else .c. And it's including that, but it's actually not. It's including the header file, and the header file is only declarations, which say, hey, there will be an integer called line number, don't worry about it. And then you compile both, and then the linking step links that together and says, hey, your program.c is actually relying on a line number variable, and great, that's defined in this .o file. That's what the .o files are. And so then it links it together before it spits out the binary. So that's how you access global variables.
So me, at my computer here, I can't access your global variables, right? But could I? Could you write a language, could you think of a language, where, let's say, function definitions are global, like actually global? What would that mean? Would it be cool or terrible? Yeah. Yeah, so you need some way of distinguishing between different instances of the same name. Why would it be useful?
Wouldn't it be kind of cool if you could just call a function that I wrote, just by saying, oh, you can call this function, and you know it's my function, I wrote it, it's out there, let's say on the web, and you can just reuse my code? Then code reuse becomes incredibly easy. It's actually even better if you then have a way to uniquely identify a specific instance of a function, and every instance is unique, so I could refer to that specific version of your function and I know you're not going to change it.
So yeah, we've seen that some declarations are accessible in the entire program, some just in the entire file, and some could be global. And if you think that's super crazy, consider things like Android package names. If you're going to do Android app development, you have to give every package a name, and that name has to be unique across every single Android application in the Google marketplace. So they have the same problem, but they manage. I think there are like 2 million Android apps now or something crazy, but they're able to do it. They do things like Java package names, which is where this came from: you reverse the domain name, so the Facebook app is something like com.facebook.katana. Some things are function-level.
And so the related question here, these are two sides of the same coin. One question is how long the name is valid, which is what we've been talking about: if you declare something in C in a block, it's accessible throughout that block. The reverse question is, when you see the use of a variable, if you see foo, how do you map foo to a declaration? How do you know whether foo is an integer or a string or what it is? These are pretty much the same question. So scope is what we're going to be studying here. Scope is the semantics behind this question, behind how long the declaration is valid and how to actually resolve a name. And so we'll see there are different styles of doing this, and these mean different things for your program and its behavior.
Thoughts on names? It seems like something super simple and trivial. Anyway, was it obvious to you when you first learned it? So there was a resounding no when I asked you whether a variable defined in an if block in C is accessible outside. Is that because you've been burned by that, or you've tried to do that and then the compiler complained and you didn't really know why, so you moved it somewhere else? Right, these things aren't necessarily obvious. They need to be taught, and they're completely up to the language designer's control. That's the key piece. You have a lot of variety and a lot of different ways you can do this.
So cool, let's think about C-style scoping. C uses block-level scoping, like you said, so declarations are valid in the block that they're declared in. And then we have this caveat about globals, right? So declarations that are not in a block are global, unless the static keyword is used, in which case the declaration is valid only for that file. Boom, that's how you do that one. Done. Should we go home? Not yet.
Okay, JavaScript is different. Declarations are valid in the function. Have any of you done JavaScript programming? Yeah, so you have the problem that when you have a script block and you want to declare a variable inside that script block, but not have that variable leak into the global scope, you have to surround that inside a function. And so that's why, when you're doing a variable declaration in JavaScript, if you don't want that declaration to be global, you have to create another function, because that declaration needs to be only in that function. Cool.
Okay, let's look at some C code. I love looking at code. Alright, so we're including stdio.h. What's in there? What is it? Input, output. Input and output stuff. printf is in there. What was that?
scanf. scanf, yeah. Yeah, scanf, that's a C function. I was thinking about, what's the C++ one? Oh, that's the C one, yeah, that's right. Cool.
Alright, so we're defining a function main. Is this valid C? Should I add more arguments here? Yeah, so okay, we'll see if that's true later. We haven't revealed all the code yet. But main can take in up to three parameters. The first parameter is argc, the number of arguments that are passed. The second one is char pointer pointer argv, a vector of character pointers to the string arguments that are passed in. And the third one is an environment pointer, usually called envp, which is a pointer to a bunch of environment variables. So we can define all of that, or we can define nothing, and it will still work. Okay, cool.
Let's say inside here we have a block. I have an i, I say i is equal to 10,000, and I print out the value of i. Is that valid C, just without anything else? Yep, you can do it as much as you want. Yep, blocks. The only thing you care about is blocks. And then I make another block, and I say print out i. Is this valid C? Yeah, let's think about it. Is it syntactically valid C? Do I have to put a semicolon anywhere? No. Look at it in your head. Is it going to compile? Yeah. You're just saying that because it seems strange. I should do that, I should put random oddities into these things so you can practice. Right, syntactically it should be a valid C file. Our include is correct, the int main is correct, we have matching blocks here. Is that invalid syntax? Syntax, I mean. Don't you have to type it in one block? We're talking about syntax. Is the syntax valid? So what would make it invalid syntax? Yeah, if I miss this quotation mark, then it's going to be invalid syntax. So can quotations span new lines?
No, I don't think so. I don't think you can do multi-line strings in C. That's strange. So yeah, that would definitely be an error. If I forgot a semicolon anywhere, well, anywhere that there is a semicolon, these three places, that would be an error. If I forgot any one of these braces, that would be an error. If I didn't have these parentheses, that would also be a syntax error. So it's important to think about, is it valid syntactically? Does it look like it could be a valid C program?
And the semantics question is, is this semantically a valid C program? No. Why not? Yes, why? It's not in the same scope where it was declared. So we have declarations. How many declarations do we have here? Two. Where's the first one? main. We actually technically probably have a lot more, because the include of stdio.h is going to paste that stdio.h header file in there, and that's going to have a ton of declarations for all the printf functions that we want. But we don't have to worry about that. Just in this code, we're declaring here a function called main. What's the return type of main? An int. And then what's the parameter type? It doesn't take anything. It takes nothing, no arguments here. So that's the first declaration. And that means, well, we didn't talk about how things would be declared here, but later on, in another function, I could call main. There's nothing special about main. main just says this is where to start, but I could call this function again. And I guess we should say main is valid inside its own body, so I could have main call main recursively, if I was a crazy person. Maybe I am.
So for my second declaration I have an int i here. So I'm declaring a variable i. And so on this line, what's going to happen? It's going to assign the value 10,000 where?
Which i? This declaration of i. So we want to map the usage to the declaration. Right here, this use of i, I'm going to map it here. Good. What does this i map to? Yeah, the same one, right? And so when we think about block-level scope, it means that i is declared here and it's accessible anywhere in that block. And so when we see this i here, what do we map it to? Can we map it to any declaration? Is it main? No, and that's the only other thing. Is it defined in stdio.h? Probably not, that'd be weird.
So we can try compiling it. It says i is undeclared on line 11. And the reason is exactly what we talked about. This i has this scope, it is defined in this block. And so all usages of i first look in their local scope, their local block: hey, was there something declared here that's called i? If there's not, then it looks in the scope above that, and in any other enclosing scope, and we keep looking until we get to the global scope, and finally it says no, there are no variables defined with that name. So this i has nowhere to go, so it's an error. Now you know where your errors come from. Put an if around here, it's the exact same thing. Get rid of these braces and it's still the same thing, because this i is only accessible in this scope, and this one is outside.
What if I got rid of these braces? Would it work or would it not work? It would work. What do you think? Is it going to work? Is it not going to work? I was going to do an example, but it would work. So without this block, this i is declared for the whole scope of main, and this block would be a child scope of main's scope. And so still, this i would look first in its local scope and say, is there an i here? No. Let's check the outer one, which is main. It would say yes, there is a declaration of i here, and so we know that that's the one we can use.
I made a one-line change. Is it syntactically valid? Is it semantically valid? It's not initialized yet. Is that invalid semantics, though?
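The lookup order just described (the local block first, then each enclosing block, then the global scope) can be sketched with three small functions. This is not the program from the slides; the function names are made up for illustration.

```c
int i = 1;                      /* global declaration of i */

/* The innermost declaration wins when several are visible. */
int innermost_wins(void)
{
    int i = 2;                  /* hides the global i inside this function */
    {
        int i = 3;              /* hides the function-level i inside this block */
        return i;               /* resolves to the block's own i */
    }
}

/* No i in the inner block, so the lookup moves out to the enclosing block. */
int falls_back_to_enclosing(void)
{
    int i = 10000;
    {
        return i;               /* resolves to the i declared just above */
    }
}

/* No i anywhere in the function, so the lookup reaches the global scope. */
int falls_back_to_global(void)
{
    {
        return i;               /* resolves to the global i */
    }
}

/* If there were no global i either, the use above would fail to compile:
   the lookup would run out of enclosing scopes. */
```

The three functions return 3, 10000, and 1 respectively, one result for each place the lookup can stop.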
So let's think about it. I guess we need to break that question down a little bit. Does this usage of i here map to a specific declaration of i? Yes. So in that sense, that's semantically fine. Some people also noticed some other problems. There's no return in main. Can you still compile a C program without a return value? It's actually just a warning, so that's technically not against semantically valid C. It's not good C code, but you can do it. The same goes for here. You're basically using a variable before you've assigned it a value. So this is actually undefined behavior. I mean, it's not fine. You can do this, but the compiler makes no guarantee on what it's going to output. So I can compile this, I can run it, and it's going to output 10,000 and then it's going to output something else. And so this was on a CentOS 6.7 system that I compiled and ran this on. This one is on the Mac, so it's actually not using GCC, it's using clang to compile everything. When I run it, I get a completely different answer. Why?
It's not the memory address of i. The memory address of i, as we'll see, is on the stack somewhere. That can be one theory, that's a good theory. So one could be that maybe the compiler automatically initializes it to zero. I would say that may be unlikely, because compilers want to be very fast, and if they have to initialize every variable that you didn't initialize, I don't know, they may be doing more work, because you may initialize it later and now they've done this unnecessary work. They definitely don't need to do that. Really what it's doing is just outputting whatever is at this memory location, whatever was there before. In this case, on Linux, for whatever reason, it's zero, and here on the Mac it's some other value.
So okay, now we have to look at some examples of different ways in which our intuition and our understanding of C work. It seems almost silly to be talking in this level of detail about these very basic things, but we really want to understand how this is done, and so we're going to focus on how to resolve a name. So when we want to resolve a name, as we saw, we want to map that name back to the place where it was declared. Why do we want to do this? The declaration has the information about the type. So you want to make sure that in that scope all instances of that variable are the same data type. You wouldn't want to treat it as an integer in some cases and then later assume it's a string. You're going to have problems. Why else? To determine the value of it. It's one thing to think about this abstractly: these are variables, variables have types, and variables have values as the program is executing. What is a variable to the computer?
To the CPU it's just a chunk of memory somewhere on the computer, and it's going to be the size of the data type, and that's going to be system dependent. On a 32-bit system this integer will be 32 bits, but it's just 32 bits somewhere in memory. So the computer needs to know, hey, when I set that memory to the equivalent of 10,000 for an integer, then when I print that out I'm going to print out that same memory address, and I'd better not print out something else. Because all the CPU, the hardware, knows is registers and memory. That's it. But we don't want to think about registers and memory, we want to think about variables. So the compiler's job is to do this mapping, mapping variables and variable usages to memory addresses. And we're going to get into that much later. Much later?
So to do this we're going to use a data structure we're going to call a symbol table. So the symbol table is going to store, for every scope, what declarations are there in that scope and any metadata associated with them. Is it an integer, is it a function, what's the data type, all those specific things that we need. And so the symbol table is basically mapping declarations to attributes.
And so we're going to look at two different ways. Remember I talked about design decisions. We're going to study different ways of how to do this mapping. It may seem like there should only ever be one way, because that's the only way you've ever learned or used in your programming, but the point is that those are just arbitrary decisions that we've made as a community, and they have pros and cons as opposed to other ways of doing this scoping and this resolution. But fundamentally we still need to do the same thing: we see a name, how do we map that to a declaration?
So, static scoping. It's in the name: it's done statically. What does that mean in that instance? Should it ever change based on how the program executes? It shouldn't change. So on different runs of this program, i would not point to different declarations. So that's what we mean by statically.
We mean that when we compile the program we do this mapping, we do it once, and that's it, it's done. It never changes. No matter how that program is executed, every time that i is seen it's the same memory address as this other i over here.
The flip side of that is, you think, okay, one is static, so the other one's going to be dynamic. So we're actually doing this resolution of names to declarations dynamically, as the program executes. We're building the symbol table dynamically as the program executes, and we're going to look up and map usages to declarations to values in the symbol table at runtime. This means that different paths through the program, different executions, can mean that different usages of variables are mapped to different declarations. As we'll see, there are pros and cons to each approach, so we really need to understand them to be able to talk about that. So the key difference is... alright, let's look at an example.
So this is going to be static scoping. This is the stuff that you're familiar with, this is your stuff. So we have an int x. What's going to be the scope of this x? What was that? Global, yeah. It's a bit of a lie, though, more on that in a second. What do I have here? A function. What's the name of the function? bar. It returns nothing and it accepts no parameters. Now I have a function foo. Is this a declaration? That's important, right? It's a declaration, and it's also a definition of the function. So you can think about, what's the big difference in information between bar and foo here? foo tells you it takes in no parameters, and the declaration of bar tells you it takes in no parameters. What was that? Yeah, there's no body. We don't know what the code of bar actually is, but here we know the code, we know the body of foo. That's the big difference between definitions and declarations. And here I can do things like this: I can say I have a character c and I'm setting it to the character 'c'.
With this declaration, is c going to be accessible from bar's definition? No. Block level, right? It's only defined in this block. Now, can I call bar? Yes. Why? Yeah, I can call bar because it's already been declared. We're just stepping through this now; we'll go through and build a symbol table and see how you resolve these. And I can print out x as an integer and c as a character. Can I do these things? What does x refer to? You have a global x.
What's the scope of baz? Not what's inside, baz itself. It's global. But this is why that's a misnomer: can I call baz from foo? No. No, it doesn't sound very global, does it? So really the super strict definition is from the declaration to the end of the file, and for any files that include this, for a global. Right, so from here to the end of the file anybody can use x. This is why, if you're coding a single-file C program, it's good to have the declarations of your functions at the top, so that any function can call them. Otherwise you have to properly structure your functions such that you never call a function you haven't defined yet. And this is why header files are so important, because it's the same idea. You take those declarations, you move them to a separate file, and you include them at the top, and the include just pastes them right in there. But now you're actually declaring all the functions you're going to use, and they're going to be available to any function in that C code.
Okay, we have baz. baz prints out x and then it sets x to 1,337. Which x is this? Global x, perfect. Okay, now we have the function bar, which now has a new declaration creating a new variable x in which scope?
In bar's local block scope. Exactly, cool. Then it can call baz. So what if I access x here, which x would get accessed? It would be this one, so the local one, yeah. The question is why. We'll see why in a second, with the algorithm for how you look up and map the usage to a declaration.
Now we finally get to our main function, where we set x to 10. Which x is this? Global. Then inside another scope we declare a new x, where x is now not an integer, it's a character pointer. It's a pointer to a character, and we print out the string x. So this usage, which x is this going to refer to? The global int x? It's going to be the character pointer, we'll see. And then it's going to call foo.
So the question is, are there different declarations of x? And the other way to think about this is, is there anywhere you can write an x in this program that maps to two of the same declarations? Let's think about that. That's the only thing we're going to do. Strike that one, that's not true: study for the midterm, do project three, and also think about this.
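As something to think with, the symbol table described earlier (each scope stores its declarations plus their attributes, and a name is resolved by searching the current scope and then each enclosing one) could be sketched like this. Everything here is an assumption made for illustration: the struct layout, the fixed capacity, and the names `declare` and `resolve` are not the course's actual implementation.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMBOLS 8           /* arbitrary small capacity for the sketch */

/* One declaration: a name plus the one attribute tracked here (its type). */
struct symbol {
    const char *name;
    const char *type;
};

/* One scope: its declarations and a link to the enclosing scope. */
struct scope {
    struct symbol symbols[MAX_SYMBOLS];
    size_t count;
    const struct scope *parent; /* NULL for the global scope */
};

/* Record a declaration in a scope. Returns 0 on success, -1 if full. */
int declare(struct scope *s, const char *name, const char *type)
{
    if (s->count >= MAX_SYMBOLS)
        return -1;
    s->symbols[s->count].name = name;
    s->symbols[s->count].type = type;
    s->count++;
    return 0;
}

/* Resolve a name: search the current scope, then each enclosing scope.
   Returns the declared type, or NULL if no declaration is found. */
const char *resolve(const struct scope *s, const char *name)
{
    for (; s != NULL; s = s->parent) {
        for (size_t k = 0; k < s->count; k++) {
            if (strcmp(s->symbols[k].name, name) == 0)
                return s->symbols[k].type;
        }
    }
    return NULL;
}
```

To model the example, a global scope would get `declare(&global, "x", "int")`, and the inner block of main would be a scope whose `parent` points at the global one and which gets `declare(&inner, "x", "char *")`. With static scoping the `parent` link follows the textual nesting of blocks, so it is fixed at compile time; a dynamically scoped language would instead follow the chain of active function calls at runtime.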