Okay, the line-in is blinking, so something is recording. Good, I'll put this down. So, good afternoon everybody. My name is Carsten; some people call me Raster. I'm here to talk about C. It's not very trendy, it's not very popular, and I'm going to try to make it trendy and popular. That's my job: bringing the fashion back. I have written a few things in C before: a window manager, a terminal emulator, a whole UI toolkit, a video player. I've actually written a window manager several times over, and so on, so I have a little bit of experience with it here and there, and I'm going to try to share some of that with you today. If you have a laptop, please turn it on and make sure you install things, because this isn't going to be me just telling you to do X; I want you to actually do something. I'm making the assumption here, this being an open-source conference, that you have a laptop that runs Linux or BSD or something similar. If you have something that runs something else, you'll have to deal with that. Don't ask me how Macs work; I've never used one. I have some commands here that might not actually exist on a Mac. Maybe the terminal examples will work on a Mac, beats me; figure that out, or ask me questions and I might be able to help you. First of all, I do not like just speaking at you. I want you to ask questions and be interactive. You've got a question? Ask it.
Don't sit there confused and quiet in the corner, feeling embarrassed about asking a question. Asking isn't bad; it's a good thing. Try things, actually do stuff. You learn best by poking something and seeing how it wobbles, so poke it and see how it wobbles. And give me comments and any feedback you have, whether you think something is the right way or the wrong way and why, because these discussions start a chain of talking about a topic and then maybe learning something. So don't be shy. Now, I didn't actually upload the examples and source code for this session yet, because the network here didn't work; they're blocking SSH, so I can't upload it. I tried this morning and it wouldn't let me; something's wrong, I don't know what. I'll put it up later when I have a network that works. So: C is actually probably the most popular language in the world, if you measure popularity by the amount of code running on your average server, your average PC, even most phones and embedded devices. The vast majority of that is probably C code. You might not see it, you might not know it, but in terms of the amount of code executed all day long, it's done in this language. It's quite old, and being old,
it's actually a language a lot of other languages are inspired by and build on top of. A lot of the syntax of languages like Java, JavaScript, D, and of course C++ is based on C; they basically use the same syntax. C itself was based on B, and before that there was a language called BCPL, I believe; I don't know, I wasn't actually doing things back then, I started with C. But C is basically where everything comes from. If you learn it and understand it, most other languages are fairly easy to deal with; they're an extension of it. So that's a good thing. Another reason it's really important is that major things like the Linux kernel are in C. I was actually having this discussion with people here during the conference: the chances of replacing something like the Linux kernel with another language are approximately zero, because you'd first have to rewrite what's there, which is tens of millions of lines of code, and that would probably take 10 or 20 years, except the kernel won't stand still for 20 years. So if the kernel is to be maintained in future, you need people who understand C. This brings me to a nice little line, something the people at the Linux Foundation always quote: there are basically no kernel programmers who have individually submitted more than three patches and stayed unhired, because after the third patch you get hired by somebody. Basically, if you know how to work on the kernel and you get patches in, you get hired.
Basically, it's a guaranteed job. Asterisk: there is apparently one crazy guy who has submitted lots of patches and refuses to be hired by anybody, but everyone else who works on the kernel gets hired. So it's actually a good job, and the Linux kernel is everywhere. It is basically the kernel of the world today: it runs on most phones, on mountains of embedded devices, on probably the vast majority of smart TVs and smart watches, on servers, and even on some PCs and laptops. It's only in PC and laptop land that it isn't the majority; everywhere else it's basically number one. So: very important. Most of the things built on top of the kernel, in the next layers up, are also done in C, so for several layers up you will find more and more C code, and that multiplies the amount even more. The embedded space is very heavily C, because you need something small, lean, and mean that can run on small microcontrollers, etc. The other big benefit of C is that it basically teaches you how the machine works. If you get good at C, you know how a machine works, and if you know how machines work, you know how to make things very optimal, because at the end of the day performance is related to how your machine works. It's not about whether you have the most beautiful algorithm in the world, or whether your code is really pretty and nice to read and review;
it's based on how well your code runs on that machine. Understanding machines lets you write more optimal code in languages other than C too, but learning it through C is not a bad idea. For me, C was always a fun challenge. Admittedly, before I started C I did a lot of other low-level stuff, but it's a challenge to live and work at that level and actually get things right. The good thing about it is that it teaches you good habits. To avoid bugs, you develop a certain way of doing things so the bug doesn't happen. For example, I don't leak memory. I don't allocate and then forget to free, because I write the allocate, then I write the free, then I go back up a line and insert my code in the middle, so I never actually forget. It's just a habit you create, and after a while you don't think about it. Learning good habits means you write better code with fewer bugs. But it doesn't mean you have to do everything in C. Other languages are good for other things. If you really care about programmer productivity, C is not necessarily the best language per hour spent; but if you really care about the result, about how efficient, lean, and mean your product is, it probably is a very good language. Unlike a lot of other languages and runtimes, C doesn't limit your performance. If you're going to use Java, you still have a whole VM and a JIT to go through that may or may not produce optimal code. C, and its cousins like C++, will let you produce perfectly optimal code; it's only you that stands in the way, not the runtime. Whereas with Python, it's always going to be slow and there's nothing you can do about that. So you're the only one in charge of making things good. And also, C is great if you're into very unhip stuff. Since it's not hip, it's good. I like it. So why would you listen to me?
Well, let's see. I've done more than 20 years of C. Before that I did about six years of 68K assembly. I've written most of my code living on Linux; it's ported to Windows and actually runs on macOS too, for the Mac users here. I have no Apple products at all at home, so I don't actually see it; people just tell me it works. I have a Windows VM I test in: I boot it up, run it, say "oh, it works", then swear at Windows for a bit and shut it down. I've also done a lot of ARM work in the more recent decade or so, including a lot of NEON assembly for optimizing and doing graphics and things like that. So I've done a fair bit of code. I've probably written somewhere in the region of a million lines of C, and the libraries and window manager I work on total probably about 1.5 million lines of code these days, all done in C. So I've done a bit; I've been through a few wars doing this. Also, I'm old and grumpy. Oh, sorry: I'm old and wise. Another reason to listen to me. So, what you need today: you'll need a terminal, something to run stuff in. You'll need some sort of text editor: Emacs, or Vim, or Vi, or nano, or jed; or just use an IDE like Eclipse if you want to be lazy, or whatever works for you. Go and find something. You will need a compiler today; basically your compiler is GCC or Clang. Yes, Microsoft actually make a compiler, but it only runs on Windows, so it's a niche market, so to speak. And there are a few other really niche compilers, like TCC, the Tiny C Compiler, but the main ones, GCC and Clang, pretty much own the compiler world. So have one of those. And you need tap-dancing shoes; we're going to dance. No, we're not, I'm just kidding. So this will be your basic first C program. That's it. You compile it that way: cc is the standard C compiler command, and even if you install GCC, cc still works.
It's a symlink pointing to GCC; it's just what everyone agreed on from the early days. So you compile it, and if the compiler succeeds, the ampersand-ampersand runs the code. That's kind of lazy-man shell scripting in C, so to speak: you can literally compile and run on the same line, and it compiles so quickly it may as well be a shell script these days. All that program does is run and say hello world. It pretty much makes sense, right? Everyone gets that. By the way, question: who here has never actually programmed before? Okay. Who here has actually done any C? Okay, three. For the people who've done C, is it beginner, intermediate, or expert? Beginner. Okay, so beginner-slash-intermediate. All right. So who here has done things like Java? Yes. Python? Yep. Ruby? Yeah. So you've done a smattering of other languages; okay, we're good. So basically: call a function, give the function a string to do something with, and presto, out it comes. That's really simple. What I missed before is that I didn't handle the arguments. Every program is passed arguments; this is pretty much standard between all operating systems. Arguments are basically a series of strings: argument one, argument two, argument three, and so on. The first argument is always yourself, i.e. the executable, and your first real argument is the next one along. So if we go that way, we start at argv[1]. Arrays in C start at zero; most languages start at zero, unless you're in Lua land where they like to start at one, for whatever strange reason. So this program looks at your first argument and just hands it as a string to C's standard print function, printf, which takes a format that says: take this string here, and modify it by replacing the following placeholders with the things
I'm about to give you. So %s there would be a string. It takes the first string and puts it in, so if you run ./source Bob, it'll say "Hello Bob". Pretty simple. If you printed argv[0], you'd get the actual executable name instead of Bob; and if you print argv[2], you might get a crash or you might get a null pointer. I don't know, it depends on your system, but it's undefined, because we didn't check the argc value, which we skip here for simplicity. And a warning: don't nitpick my code over these things. I've specifically not checked lots of things to keep the code simple and short. You should always do more checking in your real code; I've only done it this way for illustration purposes. So argc there would actually say two, which means two of your args, indices zero and one, are valid. This is pretty common: arrays of strings. You get one the moment you start a program, so it's useful to know. So now let's improve this a little more. Sorry? Oh, sorry, okay. I'm going to get to that in a bit. The hard bit is that to explain it I'd have to start talking about pointers, and pointers I wanted to do a little further on, not right at the moment you pass Go. char **argv is basically a pointer to a pointer, a double pointer. I'll get to that in a bit, but basically, think of it as an array of strings. I could write it as char *argv[], with open and close square brackets, and it would do the same thing; in C they are the same kind of thing. If there's one important thing in C to master, it's pointers, but in my experience people either get them or they don't, and people who don't get them generally don't understand the concept of indirection. If you can't understand indirection, your life as a programmer is going to be horrible. You're not going to like programming.
It's going to suck, especially doing anything complicated. Understanding pointers is a key to understanding indirection, and once you understand indirection, guess what: programming is much easier, much less of a challenge. So it's very important to understand pointers, but I'll get to them a bit later. So here, instead, this time, what do I do? First I print the first argument, the command itself, as I was discussing before, just as illustration. Then I tell you how many arguments you were passed, and then we run a for loop. C has for loops, while loops, and all sorts of other constructs; I'm only going to cover a few here. We only have two hours; I could do this for several days. So basically we do a loop: we have an integer, we start at one because that's where the real arguments actually start, and then we just start printing them. We print the %s string, we put a space next to it, and then finally at the end the backslash-n is a newline. To make sure we don't get a line break between each of these things, we don't print a newline inside the loop; we just throw one in at the end. So you can print your arguments: that's how you iterate over an array and pump it out. So now, on to types. C is a typed language, unlike some scripting languages. Data types have a format and a size. One of the more common ones is an int, an integer. Generally speaking:
It's 32 bits, or four bytes. So you get a range of either roughly minus two billion to two billion, or zero to four billion, depending on whether it's signed or unsigned. The way it's done is two's complement. I'm not going to go into the details of two's complement, but think of it like this: values go from zero up to about two billion, and when you go beyond that, it wraps around to the largest negative value and starts counting back up, so that minus one sits just before zero. You've effectively just moved the zero point somewhere else in your range; that's how two's complement works, and almost all binary integer types do this. There used to be other ways, like a separate sign bit and so on, but two's complement became how hardware does it. Then we've got char, another really common type. It's basically a byte: 8 bits, often used to store characters, like string characters, but that doesn't mean it's only for characters; it can be any 8-bit type. Warning: on ARM, char is actually unsigned by default, while on most other platforms it's signed by default, if you don't put signed or unsigned in front of it. So if you really care about it being signed, you have to put signed in front of it. Same idea: zero to 255 is what 8 bits can store, or 0x00 to 0xff in hex. Unicode, of course. Do you know the Unicode standards? Before people started extending them, you needed about 21 bits to store everything in the Unicode range, but people have been extending this, so I think now you need about 22 or 23 bits; I'd have to go back and check. Either way, 22 or 23 bits isn't 16 bits, and it's not 8 bits, so you go to the next size up: 32 bits. It's kind of really cool.
I wrote a terminal emulator, and I use the unused Unicode range for metadata. I use it to tag special cells with extra data so I can do things: I can put videos in the terminal, or images, or little icons from disk, and I use that unused range because Unicode's not using it. So yeah, it's a 32-bit int value per character. But note that Unicode only defines the character set. Zero is, well, actually end-of-string; it's defined that way, as null or nothing. The first 128 values are actually ASCII, so you have zero to nine, the letters, and all the other ASCII characters, and beyond that you start getting more ranges. It only defines the values; it doesn't define the encoding. Now, encoding is what you might mix up with Unicode: encoding is how you represent these Unicode values as a series of bytes, in memory or on disk. There have been many ways. At one point there was 16-bit Unicode, which ended up being not enough because Unicode went past 16 bits; that's why today Windows does UTF-16. UTF-16 is a way of using previously unused bit patterns to extend it, so a character can be multiple 16-bit units. On Unix they never went to 16-bit characters by default, but Windows did, to try and solve Unicode once and for all, which didn't happen. So Unix never went there.
There were all these encodings; there are probably 200 or 300 different kinds of text encodings in common use. If you've got a machine, try iconv: I believe iconv -l will list the encodings it handles, and you'll get a big blob of every possible encoding. But today, Unix has basically said: we're doing UTF-8, because it takes ASCII and extends it. It's totally ASCII-compatible, but it can be extended using multi-byte sequences to handle the extra characters. So a lot of the time it'll be one byte, one byte, then maybe a two- or three-byte sequence for the higher values, and so on. For UTF-8, a string is just a series of chars, and you have to look at the first char and, based on its bits and values, decide: is the next char the next character, or is it extending the current value? You read them, munch them together, and get a Unicode value at the end of the day, and then you go and print that. Today UTF-8 has become almost the de facto standard on everything outside of Windows. Windows has UTF-16; the Internet, most web standards, all the Unixes, etc. have pretty much decided on UTF-8, because it's ASCII-compatible. Also, it's actually very compact: most of the time, when you're not transferring high-value Unicode codepoints, it packs better. It is, however, an absolute royal pain in the rear for programmers, because now, what's character number 16 in this string? I don't know; you have to actually walk the string from the beginning, decoding as you go, to find which one it is. It's painful. Before, with fixed sizes, you'd just go to byte number 16; or with UCS-2, which was two bytes (16 bits) per character, you'd go to byte number 32. No, no, wait, I'll get to that near the very end; I'll come to that. In C, you don't get this handled by the types.
You don't get the language picking up all the pieces for you; it provides the real basics. What we have is libraries. Libraries do this work for you. A library says: give me a UTF-8 string and I'll give you a Unicode string of ints as a result; you don't have to know or care, I'll deal with that for you, and I'll do the reverse and pump out UTF-8 for you, etc. It can walk UTF-8 characters for you: go to the next one, go to the previous one, decoding as it goes. So you can use libraries that do that. I know, because I wrote them. I've written these libraries, and what I do is: I write the library, I make it work, I debug it, and then I use it and stop writing that code. I can remember what it did when I wrote it six years ago, kind of vaguely, so I can tell you about it, but mostly I just use it. The trick to doing really effective C, once you've mastered the basics, is using libraries. Unless you really want to go write your own, and sometimes you do: sometimes a library doesn't solve something the way you want it solved. And that's the point of knowing your basics. If you know your basics, you can build your own solution, or take somebody else's and modify it, and maybe, if it's a good change and it's clean and neat, send it back upstream to improve the library for everybody else. Then you don't have to maintain that code anymore, it just works, everyone else benefits too, and if everyone does that together, you end up with quite a lot of good code. So there are quite a lot of good libraries that do these things for you, but I'll get to that towards the end.
It's getting towards the more advanced topics. So: if everyone looks at the ASCII manual page, man ascii, it'll give you a table of the values for all the ASCII characters. This is probably something most programmers should know. You don't have to know every value, but have a rough idea of what the ASCII table looks like. So, if I actually want to declare a string in C: a char *, which is a pointer to something, a pointer to chars, a pointer to characters; or an array. If I refer to an array, I'm effectively referring to a pointer to its first member, so it's effectively the same thing. If I define a string with actual characters put in each element, the first line makes the size explicitly six. If I do it the second way, the C compiler will automatically figure out how big it is, because I've given it content, so it knows how big to make it. The third way, I can just give it an actual string literal and it'll figure that out too. And the last one, the pointer, points to exactly the same kind of thing as all the others. In memory you just have a series of bytes with those decimal values in them, which represent the characters. Strings are just numbers. Strings are not some magic new different thing; they're just numbers. Everything is just numbers, actually. So, okay, here's pointers a little bit. Oh, yes: how did I convert the characters? I looked them up in the ASCII table and wrote them down for you there. If you were to printf it so you can read the output: printf with %c will print it as a character, and printf with %i will print it as a number. So you can get that out; it's very easy. But in memory they are just numbers. A string, a character, a number: they're no different. You're just arguing about how many bits are in it, how big it is.
That's all it is. Yeah, I think a lot of people may not appreciate just how simple C is, and often they ask: well, how do you do this? You don't. It doesn't exist. You don't have to do it. It's all really basic. You master the basics, and everything gets built on top of the basics. And if you master the basics, you can do assembly, and you can maintain kernels, because you know how this stuff works. It's good. So, pointers are really important. Pointers do what they say: they point to something. They say "this thing is there". It's an address; they're actually called addresses, but think of it like a street address. It's saying this thing lives at 120 Bob Street, and it's telling you that's where it is. So a pointer can say: go to this address, find me the piece of paper over there, and that points to the next address, and that points to the next address, and that points to the next address. You follow the chain of pointers until you find the thing you're after, and pointers to pointers to pointers to pointers are a real thing in code. You actually do this stuff; it's very useful, and playing pointer games is really, really fun. You can do really interesting things with them; I can tell you beautiful stories about them later if you want. Oh, yes: can you keep pointing pointers at pointers? Right, and it never has to end, but I'll get to that in a second. And remember, I'll get to this in a bit, but a pointer is just a number. It's nothing but an integer. Well, okay, the asterisk: the asterisk is a command, a "do this", a kind of jump to a number, an address, a place in memory. But the actual pointer itself is just a number. On a 32-bit system it's 32 bits; on a 64-bit system it's 64 bits; it's just a number of that size. Asterisk: on today's modern systems,
The pointers might be 64 bits, but you're only allowed to use about 48 bits of that as the actual pointer. Modern Intel and modern x86 only give you 48 bits for the pointer; the other bits are unused at the moment. So on a modern 64-bit machine, if you look at pointers, you print one out and you see a run of zeros in there, or ffs, and so on. You get to know what pointers look like when you print them out, so you can go: I smell a pointer right there. It's a number like everything else, and if you get to know and recognize your numbers, you can debug things really effectively. You look at a memory dump and go: that's a pointer, those are characters, blah blah blah. Back in my day, when I was a kid, this is how we hacked games. You took a floppy disk, you dumped it, and you hunted it for code, for the copy-protection code. You looked for things that looked like machine code, the actual opcodes you knew, and you'd go: wait, that's a jump instruction, oh, that's it. And then you could figure out the bit that does the copy protection, change the machine code to just skip to what's after it, and presto, copy protection removed. Knowing your numbers allowed you to do that. These are also the people who would write, have you ever heard of them, 4k and 64k demos? People would write these programs where the entire program, everything, all data, was four kilobytes, and they did amazing things on screen. Of course, they knew their numbers: they knew how to deal with hardware registers, how to compress things, how to compact things, how to use fewer instructions, and so on. Knowing your numbers allows you to do really incredible, amazing things.
So Anyway, um, but just keep in mind the more in directions you have the pointers the more memory you access Accessing memory is not free. It costs. So getting something from Maybe in the range of a hundred to two hundred clock cycles that means of those four gigahertz You lose two hundred of those gigas just sitting there and waiting for something to come from memory So it's expensive to go get something from memory That's why we have caches in between blah blah blah to try and speed that up But nevertheless the more memory you access the slower your world gets and it's the same with pointers So the more pointers you indirect via there isn't any magic that compresses it down to one single in direction It's your job to find a single in direction if you want that if you want to have that optimization But everyone is going to cost you because it has to go and do that in direction It's not free Just a little on an aside There was something we did in our project to try and save memory Because we found certain parts of our data structures with the same values again and again and again and again Because they're kind of default values and sometimes someone changes those fields to be something different It's kind of rare not common. So what we did is we'd actually created on copy on right We move that to be a pointer to another data structure somewhere else that everyone shares So it's one data structure memory for all those default values and only when someone Modifies it to they copy and make a custom copy themselves. 
That copy then needs to be freed. Interestingly enough, by reducing our memory footprint, our rendering speed went up by 5%, because we just accessed less memory during rendering. So reducing how much memory you access will give you a speed-up. If you live in a world where performance really, really matters, you have to care about these things; going low level and knowing how your memory is laid out, and how to compress things and get them to use less memory, is very useful. Next. So, I covered printf before; you saw a few versions of it. Whenever you see a percent in the format, unless it's a literal percent-percent, it's converting something. It's saying: please put an integer here, a string here. %c is a char, %p is a pointer, %d is also an integer, the same as %i, and then you can have %li for a long integer, etc., etc. There's a whole manual page on these; I'm not going to cover it all, but printf is really useful and you're going to use it all day long. printf debugging is a thing, and it actually is very useful at times. Can it tell what size the value is? Well, the i means an integer, so 32 bits. You have to know; you have to know, that's your job. Yes? It points to the first one. Okay, okay, I skipped that; I'm terribly sorry, you're right, it's good you bring it up. It points to the first one: the first one's here, the next one is the next byte after that, the next one's a byte after that, and so on. So when you print the string, it goes byte, byte, byte, byte, until it finds a byte whose value is zero.
That's the end of the string. A common bug in the C world that can lead to buffer overflows and things like that is forgetting to put the zero at the end. You fill a buffer with a string and forget to add the zero, and what happens is someone then uses that string, starts reading, reads off the end of the buffer, and keeps reading through memory until it eventually crashes or does something, because you didn't put a zero there. It might encounter a zero in memory and stop by itself, maybe before any damage has happened, but maybe it doesn't; maybe it walks off the end. When you write it as a string literal, yes, and you'll notice in two of those arrays I actually added the zero by hand before; it was kind of hidden in there, I didn't point it out, but yes, it's the zero that ends the string. So always remember: null-terminate your strings. If you don't do that: bad, man. No, no, that's a convention for strings. A string is just an array of chars, and the convention is that we know the string has ended when there's a zero byte. For other arrays, you don't know the size; you get a pointer to the beginning, and you have to be told the size separately. That's why you'll notice there's a function called snprintf. sprintf is basically printf, but instead of printing to output, it prints to a memory buffer, generating a string in memory, which is very useful for many purposes. sprintf writes to a buffer: you just give it a pointer to a buffer and it fills it. What's the problem with just giving a pointer to a buffer?
You don't know the size. So if you use sprintf and you're not absolutely, 100% sure that the amount you're going to print is less than the size of your buffer, and you can guarantee that, then fine; otherwise you should use snprintf. The n in snprintf is for providing n, as in the size of the array: you put in the buffer, then you put in the size as the next parameter, and then you do the rest of the stuff after that. So the size is passed along with the array.

Note that in real life you don't actually need that size all the time; you often have other ways of knowing it. For example, you've got an image: you don't pass the size of the array, you know the dimensions of the image in pixels. If you give me the base pointer, I know where the whole blob of memory ends by multiplying my two dimensions by my size-of-pixel thing, and then I know how big it is; in fact, I will just never walk outside those bounds, because that would lead to a crash. So you decide how you pass that information. Maybe it's passed in via some other data structure, maybe it's in a global variable, some other way of knowing it. With strings, they're kind of variable-length bits of memory, and you walk till you hit a zero. You figure it out; you can come up with lots of imaginative ways of doing this. It could be that you have a pointer, and the first thing it points to is the size, and everything else trails after that. You can invent all sorts of ways. That's how you can compress things or make things more efficient: by choosing a method of representing your data that works best for your situation.
Anyway, I covered for loops before. (Hmm, that highlight should be there. Why isn't the highlight there? Anyway.) In C, for loops are actually quite powerful; they can do a lot. For loops basically have three parts: a start, which is your initial state; a condition for whether the loop should continue; and an iterate step, what to do each time around. Your really bog-standard, out-of-the-box for loop, the one anyone will ever show you, is for (i = 0; i < 10; i++). That's: set the variable i to zero at the start; then each time around, check whether i is actually less than 10, and if it is, run the body; then do the iterate step, i++, increase i by one, and go back to the beginning of the loop and continue.

The cool bit is you can make them do two things at the same time, like go through two variables simultaneously: start i at zero and j at 10, and if... oh, sorry, that should be j is greater than five; I'm terribly sorry, typo. If j is greater than five, increment i by one and decrement j by one, and that will only iterate over a certain sub-part of the range. You can make these as complicated as you like; I've done interesting ones that walk through linked lists, that walk through complex string arrays, that do lots of interesting stuff in the iterate step. If you know how to use for loops, they're just a nice, handy construct to save you time, but the common one is the first, basic one. As long as you understand you can do more with them, you can explore that later. Good.
So, back to the machine now, to machine concepts. In reality, from your point of view, your machine is pretty much this: there's a CPU, it does math and makes decisions, and it talks to RAM, and there's a lot of storage. That's it. That's what a machine is. We're done; you now know how a machine works.

OK, there's a little bit more than that. You've got a CPU and it talks to RAM. Then you've got a GPU, and it may talk to RAM by DMA; it might be living on the memory bus, straight onto it, or it may be on a separate card over a PCI bus that can map memory and talk to it. Your network cards will do the same thing: they'll probably have DMA engines, and they'll write to memory and read from memory. The disk: same thing. If you have a system that doesn't do DMA, you have a very slow system; you do not want that, it's a horrible place to be. So most hardware will actually have some kind of DMA, and it's the CPU's job to tell every other unit what to do and when to do it. E.g.: "disk controller, please load block X from here on disk and write it to here in memory", and the CPU tells it to do that. Basically, most of the hardware has another piece of memory you map, and in that memory are registers for the hardware, and the CPU just writes to that memory to give it information, like base address, block number, transaction type, blah blah blah. Then that either gets written out on a command queue, or it acts directly: when you write the last register, the operation begins.

Wait, wait, no, I was talking about hardware registers. That's how the CPU talks to the hardware, like your network card and disk and so on. I'll come to that in a bit; I have diagrams for that. But effectively, think of it from a programmer's point of view: you just have a blob of memory to store stuff in, and you have a CPU to fetch from that, do some processing, write back out to it, and make decisions. That's what the world is. Interestingly enough, the people who make a GPU think the same way; they think the world is like a CPU, but inside the GPU, and they work on the world in exactly the same way. The disk controller guys pretty much do the same. They all run firmware, a kind of software of their own, and they think that way. So everyone's sharing this one big place to put stuff: memory is where it goes.

Now, this is where I start getting controversial. RAM is generally really, really slow. You might not think that; you might think, no, god no, hard drives are slow. Well, fine: floppy drives are slow, and then tape drives, there isn't much slower than that. But in the scheme of things, RAM is actually very, very slow. It's not very fast, but it is big; it's huge, you can put a lot of stuff in it. And to make things faster, your CPUs have more memory inside. Today you'll find, in a lot of the bigger desktop CPUs, an L3 cache (on embedded, not so much), L2 caches and L1 caches, and each one gets smaller as you go. The L3 cache is slow, but still faster than RAM, and it's quite cheap to make, therefore it's quite big, because they can afford to put a lot of it in. Your L2 cache is more expensive, therefore they put less of it in, but it's faster than L3. And L1 is faster still, more expensive, and thus smaller. For L1, normally these days 64K is what you have; it may be split into 32K data and 32K instruction, but you'll have 64K, sometimes shared, sometimes not. So 64 kilobytes is not very much, and L1 cache is reasonably fast. But what you really have on top of that is registers, and registers are really fast: registers are zero-cost access, you can basically get to them immediately. Whatever's in a register,
you can work with it right away. But data has to get to the register; you work with it, and when you're done with it, it's got to go somewhere else, i.e. back out to memory. So we're going up and down; data flushes through these caches all the time, etc. etc. So those are the registers you were asking about before: they sit there inside the CPU.

Two questions came up. What makes L1 expensive: is it a different material, a different clock, or what? It's the way the transistors are designed: these caches consume more transistors, and therefore more space on the die, and they have to sit with shorter runs to the ALUs, the arithmetic logic units, rather than further away.

Do you actually address these L3, L2, L1 caches yourself? No, no: these are invisible to you, generally speaking (asterisk). But it is actually something you should know about, because basically you have to ask yourself: I've got some data, where's it going to live? And you're going to think: how big is the data? If it's really big, it's not going to live in L1 cache. It might not live in L2 cache. It might not even live in L3; it might have to live out in memory. And it means that if you just keep rummaging through a huge amount of data, you get very little caching speed-up, because you basically nullify the caches by having such a large data set. But if you only access a small section of that data, and you access it repeatedly as your current working set, then caching works really, really well. So sometimes designing your data to fit inside the caches is a really smart thing.
Now I'm going to jump over to GPU land. Who here knows how GPUs arrange memory these days? No, no, no, there's a reason for it: caching. So what GPUs do is they don't lay out pixels pixel, pixel, pixel, pixel in one nice big row, like you might think, like a linear framebuffer. They arrange them in little tiles, and those little tiles are actually designed so one tile fits in the GPU's cache. The idea is that it doesn't fetch a byte; well, it fetches a cache line, but when it fetches a cache line, it fetches an entire tile of pixels. And the reason it wants a 2D tile is that GPUs tend to access pixels within a 2D area around each other: this pixel, the one next to it, the one next to that, often to do linear interpolation and smoothing and stuff like that. Or when you rotate something: you're not always walking horizontally, you're walking diagonally, but you're still walking through a 2D region, a tile. So they do things in tiles to optimize for caching, to make those tile caches work really efficiently. And if you design your software on the CPU to work the same way, it can work really, really efficiently too.

OK, so that's what you're getting to now. Your average machine today might have four actual cores. When they say they have eight, they're actually hyperthreading, which means they're creating a virtual core, and when the real core stalls, when it's waiting on RAM, it switches to another context and starts executing something else on the same core. They basically just hardware-switch it over. So you'd generally have four, maybe eight cores, and some of the really high-end machines, especially the new ARM servers, are now shipping with 42 or 44 or so cores, like real cores. But that pales in comparison to GPUs. GPUs often have... OK, wait, wait, hear me out. They'll have a unit, and they might have four, eight or sixteen of these units, where one instruction can do 256, 512, 1024 operations in parallel. So because, say when you're hashing, you're doing the same mathematical operation again and again and again, you just say: do the math on this one and this one and this one and this one, and you basically iterate each step of the algorithm over 1024 things in parallel, and they have a whole bunch of these units also working in parallel, blah blah blah. That's why GPUs are awesome at parallel stuff. Until you have to branch, until you have to do if-then-else; they suck at that. You know why? When you do the if, the whole 1024 things have to handle both cases, so they actually run both the if and the else, and what happens is they just filter: some of the units take the if result, some take the else result. But you actually have to iterate over all of it, because they're so wide. So that's why they're awesome at doing hashes: they really don't have to do any of that, they just do math, math, math, pump it out; they're awesome at math and pumping it out in parallel. So imagine having thousands upon thousands of cores compared to your four.

When moving data between the RAM and the caches, do you have to manage a queue? No, that's done in hardware for you.
The hardware actually knows what has to be flushed out and what's in the caches, and it will go and do that.

What about when you're interfacing with a piece of hardware, like a serial board? It depends how it's wired up. The old-fashioned way is there's just a register, and you write to it, and whatever byte you write to it gets pumped out by the hardware. With more modern hardware, you'll just write to a bit of memory, and you'll say: this bit of memory, now start. And the hardware will read from that bit of memory, that buffer, and just start writing out the bits for you from each byte, and then you'll get an interrupt to say when it's done. Yes: so from the CPU's side, the job is that you write the stuff somewhere out in memory (that's why, in the diagram before, you put it somewhere in memory), then you tell a piece of hardware: here in memory, I put it here; go make it work for me, go write it out for me. And whether it's the serial port, a hard drive, some GPU, pretty much anything else, they pretty much all do the same thing. GPUs are probably by far the most complicated piece of non-CPU hardware out there, by a huge mile, so there's a lot more you can do with those, but the serial port is a nice, easy example. It'll either write data out, or for reading in, it'll write to a buffer it's been told about beforehand: here's the input buffer. Then it'll send you an interrupt saying "I have written bytes into this buffer" when it's full. And you might actually have two buffers, an A buffer and a B buffer, and once it's filled the A buffer, it sends you an interrupt saying it's full, and then you go and take that buffer and read from it on the CPU and decide what to do with it.

Interrupts are a side line which basically tells the CPU: hey, something happened. One of these interrupt lines wakes it up; it just says something happened. No, it doesn't call... well, OK, there is actually a jump table, an interrupt table, that says: if this interrupt comes in, jump to this address and run the code that's there. Generally speaking, your kernel will have this: it sets up all the interrupt handlers, and its interrupt handler goes "oh, this thing happened", and based on that decides what to do with it. They'll split it into an upper half and a lower half, where one half is really compact and does very, very little, which allows the interrupts to be unmasked again quickly and things to continue, and then the second half actually does the load of work, which might be copying the data out and so on. So the first half might be: flip buffers to the second buffer, go back, and tell some other part of the kernel "the buffer is now able to be read from, and this is how many bytes are in it; go deal with it". And this other guy copies it out somewhere else and does something with it. Then, well, actually the kernel doesn't send an interrupt back; it calls into user space and switches, and then user space, to switch back into the kernel, actually raises an interrupt itself,
and that interrupt switches back into kernel space, and so on. But it's pretty much just memory, memory, memory; the interrupt line is a side line of signaling, and it's actually quite limited. You only have, you know, maybe 8, 16, 32, 64, 128 interrupt lines, depending on your hardware and platform and so on.

OK, where was I? Here we go. So, as I was covering: your data lives in memory. It'll just move back and forth as needed, and it's generally the job of software to move it back and forth, and also between hardware, the disk, other I/O, etc. etc. All the caches are just ways of making memory look faster than it really is, by having the things you access often go live in a cache. Note that caches are also fairly coarse: they only fetch memory in things called cache lines, and cache lines these days are approximately 64 bytes; I think on some of the more modern ARMs they're 128 bytes. That means if you access just one byte in that memory, it'll fetch the entire 64 bytes together, put that into the cache, and now you get to access it. The cool thing is: you access one byte in that cache line, and the rest of them are free, because it's already paid the price to move them in. So knowing how your caching works is in fact useful: you design your data structures to be cache-friendly. I.e., if you're going to have to stop and wait on memory, make sure the other things you're going to access immediately afterwards are together in memory, in the same cache line, so the accesses after that are quote-unquote free.
There are also CPU instructions to prefetch, i.e. you go to the cache and say: cache, please fetch this now. Then you go do something else for a while, and come back later to get it, and you hope that by the time you come back, it's already been fetched and you don't stall. If it hasn't been fetched yet, you might have to wait, but maybe you wait less, because the fetch already started.

No, no, no. Keep in mind processors have very, very few registers. It varies from processor to processor, but normally something in the ballpark of four, eight, maybe sixteen registers is pretty common; you might see some more. It's generally a fairly limited resource. So the CPU just reads and writes data and does math, either actual math or logical math: i.e., if this, then this, else that; compare and branch. That's how it makes its logic decisions. Life is all about comparing numbers.

There we go. So all your data in the world, as I was covering before, is just numbers. Strings are numbers, your pixels are numbers, your audio is numbers; everything in memory is just numbers, numbers, numbers. And where is your data? That's numbers too: addresses in memory. This data lives at 123 over there, this data lives at 876 over there, this data lives at 5 over there. Well, the numbers are actually much bigger than that, but you get the idea: everything is numbers, and where your data lives is just numbers too. It's all numbers. And in the world of numbers, only two numbers exist: zero and one. There is nothing else. There is no spoon. Everything else is built on top of zero and one. Has everyone here at least learned binary? OK, so you've learned binary, we all know that, blah blah, I can skip that. Good. And please get used to hex. Hex is good, and you'll find out why very soon. Hex is good because you can nicely represent memory in a way that you, as a human, can read, and it's still very machine-centric. And if you learn to read your hex, kind of grasp it and understand it, and know your magic values and what they are, you will be able to debug and do things far more effectively. Basically, hex is just declaring sixteen actual digits, and they just represent binary blobs of four bits. You can figure out the decimal from that, blah blah blah, but it's really base 16 instead of base 10; that's all it is.

So, what is memory? As I was starting to say before, memory is a series of letterboxes, and a letterbox just has a number in it. Your memory is a lot of little letterboxes, each like a piece of paper, and written on that piece of paper is a number in the range 0 to 255, or 0 to 0xff in hex. And a register is a small selection of that: your registers these days might be 32 bits, or 64 bits on modern machines, i.e. four or eight bytes.

Then there is the SIMD unit. The SIMD unit on Intel is often called the MMX unit, or MMX and SSE, and that adds a bunch more registers specifically for those instructions, and they can often be a bit bigger; some of them can actually be as large as 512 bits wide. SIMD: single instruction, multiple data. Yes, MMX and SSE are that; on ARM it's called NEON. That's why I prefer to say SIMD: it's the generic term for it.
So with SIMD you're basically saying: here, I've got eight byte-sized numbers; take these eight numbers and add them to the eight numbers over here. So add this one and this one, this one and this one, this one and this one, and it puts the result down here with each pair added together, but it does it in one instruction. So you can add eight pairs of numbers together in one instruction, for example; the really, really wide ones do even more than that. And they might not be 8-bit values: they might be 8-bit or 16-bit or 32-bit values, and it'll do that too. Also, the SIMD units can do extra math, like saturation math: instead of overflowing and wrapping back around past zero, it'll stop at the largest value, and likewise for underflow. So they do a bit of extra stuff that's really handy when doing media things; then you don't have to handle dealing with that yourself. So they can be handy. But generally, if you're writing C, you're probably just going to write C, and your compiler may or may not use these instructions for you when it detects something. You can also explicitly use them: there are things called intrinsics. Intrinsics are basically like function calls: they look like a function call, but each one actually is a single assembly instruction, and with them you can access these instructions and these units directly, and the compiler will do some of the footwork for you around that. But note: the moment you start doing intrinsics, your code is not portable anymore.
Intrinsic code will only run on that architecture. Even worse, if you put things like NEON intrinsics in there and you run on a CPU without NEON, you'll crash with a SIGILL, an illegal instruction, because that instruction is unknown. So you can't actually even run that code until you've detected what kind of CPU you're on and whether that CPU supports those instructions. Eventually, when you do advanced code, you'll know how to do this: you know how to detect that, and you switch between code path A, which is just pure C code, and code path B, which is C plus intrinsics, and you switch at runtime based on which one can run on the CPU. You can detect this stuff.

So, to give you an idea of just how much memory there really is: 1K is 1024, we all know that; a megabyte is a thousand-ish of that, and a gigabyte is a thousand times that again. That's a lot of bytes. So registers are generally very small compared to the amount of memory you have, and your average server these days is just huge in terms of memory footprint; it can have a lot of memory, as can your average powerful PC, and even your phone has masses.

So, the common data types. You should know these; this is pretty much all of them you really need to know. You've got bytes, i.e. chars. You've got shorts, which are two bytes together. Trust me, they actually are useful if you want to reduce the size of things: you've got a piece of data that needs more than eight bits, but it doesn't need a full 32 bits, and your 16-bit values, your shorts, are very useful for that; if you use them carefully, you can actually save quite a lot of memory by doing things sensibly. Your ints are four bytes, generally speaking. I know: if you want to go and quote the actual K&R C book, that isn't a guarantee; you only know that a char is at least the size of a byte on the machine, and so on. There were days when you had, say, VAX-era machines where stuff was strange; was it ints that were 36 bits, and chars that were 7 bits? I don't remember, but either way, stuff was weird.
I don't remember but either way stuff was weird today on any architecture You really are going to find that is going to be the case Now longs can longs may vary depending on your whether you're 30 to a 64 bit if you're in 64 bit They will be 64 bit Asterix not on windows on windows longs will stay 32 bits But you have to deal with the fact they may change in size Your long longs are always going to be eight bytes and your floats are four your doubles are eight Just small little asterix here Intel the Intel floating point unit actually has more than eight bytes of precision. It actually has 80 bit registers So you may actually get different results when you're doing things on Intel floating point Then you might on another architecture because it can do things to slightly more precision That's it might round up or down Differently compared to another thing even compile the same because they have different size register So if it's keeping the data in registers, it'll keep a bit more precision But when it goes back out to memory, it has to go back down to 64 bits or for eight bytes There are compiler flags to force it to keep it into 64 bits and Points is of course vary in size as well Go hey There we go, I've got space bar it'll do So pointers are just a number As if everything else they just tell you is you are asking before memory Starts this is the start of the thing. That's what they use for but they don't have to only put to the start You can put it somewhere to the middle of something if I I want you if you are a function and I want you to access some piece of data that's inside this big buffer I have I don't have to give you the base pointer of the buffer and then some other information like an offset and the size I say the data is here in memory already. It's over here. 
So I can just pass you the pointer straight to where the data is. It doesn't have to be the start of the thing; it can be anywhere. But generally speaking, a pointer points to the start of something, so think of it that way, but know you can move it around too.

How a value is stored in memory, though, depends on endianness. So, who here knows what endianness is? You've heard about it? Yep. For those who don't: endianness is a convention for how you lay out your bytes in memory. Do you lay them out least-significant to most-significant, or most-significant to least-significant, in order of increasing address? Intel and ARM are little-endian. Mind you, ARM can actually do big-endian, it's a switch, but by convention they all boot in little-endian mode; all the OSes do that, that I know of. Big-endian: MIPS, SPARC, PowerPC, 68K; they like to write things out big-endian. You need to know your endianness and be aware of this, because when you write out binary data, you're going to have to be able to load it on a machine that might be of a different endianness. So you either do what TIFF does, I think: it may be written in either endianness, you don't know, but there's a way to detect which endianness the file is in, and then you can interpret it the right way. Or, what often happens, you always write the file in one endianness, and it's defined: this file format is written with this endianness, deal with it. And then if you're on big-endian, for example, and the format is written little-endian, you have to actually do the swapping when you load it into memory.

This is actually something Microsoft is famous for, interestingly, because they were traditionally an Intel-only operating system. All their products, like Office and Word and all these things, would just write what's in memory straight out to disk. They wouldn't really care about endianness, alignment, and so on. Of course, on Intel you don't actually have to align things in memory: the hardware does the alignment fix-ups for you. This doesn't happen on most other architectures; you get an exception, and either it crashes your program, or the kernel may trap it and fix it up for you, and that creates a huge slowdown. But they would just write what's in memory and read it back. Then, when they actually had to support other platforms, had to move from 32-bit to 64-bit, etc. etc., they actually had to do a lot of work to make sure it kept working, because the memory layout changes and things like that. They did actually have PowerPC at one point, I think, and supporting that was really hard. ARM requires everything to be properly aligned in memory, so they had a lot of work to make their stuff work correctly, because they'd been relying on the Intel architecture being really lax in this department. So this is why it's important to know: if you want stuff that's going to run on your machine here, on some ARM tablet over there, or on some big fat server over there... imagine one day you have 128-bit servers and pointers have gone to 128 bits. OK, that's going to be ridiculous, but imagine they do: you want your software to just magically work, and it will, if you do it right.

Yes, well, it's just like writing files to disk: it's the same thing if you're transferring across the network. The machine on the other end might have a different endianness. You know, most of the Wi-Fi routers you buy off the shelf today are MIPS; there's actually a CPU in there, they run an operating system, sometimes Linux, sometimes some BSD, sometimes something else, I don't know. And you can talk to them, and guess what: they have a different endianness to your PC. They're big-endian; your PC is little. No, they don't swap it for you. But if they were dealing with your data, or you had some kind of RPC to them that was binary, then you'd actually have to make sure you deal with that.
Binary RPCs are in fact a very good thing: they're very, very efficient for a machine to decode. Text-based ones are far more expensive, because you have to examine every single byte to find out if the next byte is going to be the right thing, yes or no. So it's examine, figure it out; examine, figure it out; examine, figure it out. With binary you go: I know the next thing is going to be a 32-bit integer. I know it. I don't have to make a decision; I already know, because the previous thing told me the next n bytes are always exactly this sequence. I've got a header, and the header tells me it's this, therefore all of this is laid out like this in memory; I can just memcpy it in, and if my endianness is correct, I do nothing. If mine is not correct, I just blindly go swap, swap, swap, swap, and now I can use the memory as-is. I don't have to do any compare-branch, compare-branch, compare-branch, etc. So that is one of the reasons why binary IPC, or any sort of binary protocol, will generally beat a text-based one, over the internet too. So if you care about performance...

Protobuf? That's one; I can't tell you much about protobuf. I wrote my own. I wrote it for my desktop; it wasn't actually a protocol, it was a serialization mechanism to take data structures in memory, write them to disk, and do the reverse: take the disk data and write it back into memory. And I've actually benchmarked mine against libjson, and it runs rings around libjson in terms of encoding and decoding data, and everything's much smaller. I've got numbers somewhere on a web page, but I think I came out at something between two and seven times faster, depending on what you're doing, writing, reading, blah blah blah, and the memory footprint is much, much smaller, like a fraction of the size for the same thing. And I didn't even optimize it for speed; I wrote it because I just wanted a fast serialize and deserialize. I didn't want to have a competition, but when a bunch of people were saying, yes, we're going to use libjson, blah blah blah, we're going to do all the config in it, I went: now you're asking a bunch of people who really care about performance to do config in libjson, and these people will probably veto it. So before that happens, let's see if libjson is actually fast enough or not. So I did some comparisons with something else I had, which we were already using, and it was already fast, although it could be faster and more compact, and libjson was already significantly slower. And my suggestion was: don't do this, because all you're going to do is take what we have and make it slower. You're not making it better.

But binary is not that easily editable by a human being, right. So there's actually a whole tool that goes with the binary one I have, which will decode it into a text file for you and re-encode it for you. But the actual file stays binary. So if the machine is reading and writing it all the time, you don't go through text every single time; you only go through text when a human actually has to deal with it, and there's a tool to do that.

So yeah, if you have something in memory, say you have an integer there: depending on how it's laid out, if the number is, like, 123456, it might look like that in memory, or it might not, because the zeros won't be at this end; they'll be at the other end.
That's little-endian. The binary representation of a string, meanwhile, will be literally character, character, character, character, and the NUL terminator; then you might have padding up to the next data structure, because we like to align things; and then that might be an integer of value one, and that's what it looks like in memory. Very handy to know. C provides you ways of laying out your memory: they're called structs, or data structures. Every language pretty much has this, but in C it's actually useful to think of a struct as a way of saying: I've got a blob of memory, and I'm going to name my memory — this bit is this, this bit is this, and this bit is this. That's really what a structure is doing, and the compiler lays it out in certain ways. It likes to align things, so you'll notice we have a gap here in grey: I have a char, then an int, and the compiler wants to align the int to a 32-bit boundary, so it leaves a gap to keep that alignment. Same again: I have a short here, and there's a big gap before this pointer, which is eight bytes on a 64-bit system, and so on. All a struct is doing is describing how memory is laid out. The padding is undefined: it's not initialized and it's not meant to be accessed. Actually, I lie a bit on that second part — you can technically access it, as long as you don't rely on the content in it. So: who here knows debugging tools? Have you heard of GDB? Do you know Valgrind? If there's one thing you take out of these two hours: Valgrind, Valgrind, Valgrind, Valgrind. It is the best tool on the planet; I owe those guys many cases of beer. Valgrind is a CPU interpreter.
It takes binary machine code and interprets it, and as it's interpreting it examines everything the program does — every memory access, every single thing — and whether what it accesses has been initialized or not. It can tell you exactly which byte has not been initialized and whether you accessed it, and it knows where it was initialized. It knows where you freed some memory, and if you go and access freed memory, it says: you're accessing memory that was freed over here. It is the most awesome debug tool on the planet. Valgrind, Valgrind, Valgrind — send those guys lots of beer, they are really awesome. That's V-A-L-G-R-I-N-D; I think I have the link at the very end of the slides. Compare it with an ancient tool, Electric Fence — an interesting tool, but Valgrind is byte-accurate; it's accurate to the byte. Electric Fence, all it does is take every memory allocation you make, use mmap to allocate it, and add an empty, unmapped page on either side. The idea is that if you walk far enough past a buffer — before it or after it — you'll hit an actual empty operating-system page and segfault. But if you go just one byte over an allocation it won't catch it, unless that one byte happens to sit at the very end of a page with nothing mapped after it. One byte over? Valgrind will catch it. It catches it because it flags and tags every byte in memory: it knows whether each byte has been initialized or not, so if you read a byte your code hasn't written to yet, it complains that you're accessing uninitialized data. It does sometimes complain about some libc functions, though. Functions like strcmp or strcpy are optimized: instead of reading one byte at a time, they read four or eight bytes at a time into a register, and then base their decisions on what's in that register.
So Valgrind complains that you accessed uninitialized memory, and you go: no I didn't, actually — the code isn't going to use those extra bytes, and because of the way memory is aligned, all allocations start on an alignment boundary and are rounded up to an alignment, so there's always padding space behind the allocation; the wide read will never run off a real page of memory and crash, because it's aligned. The libc code masks that part out and never acts on it — but Valgrind complains anyway. It knows too much about what's going on, and sometimes you need to know these little tricks people play: it's a valid trick, and Valgrind is technically correct but in practice wrong. So you often have to tell Valgrind to ignore it; you keep lists of suppressions — yes, we know, it complains here, it never actually crashes, it's strcmp doing its thing, it's fine. But that is the trap almost everyone falls into in C and C++: undefined land. People start relying on things being defined that are not. You might assume that memory is zero, or assume you can do certain things, and that assumption may not be correct. Don't make assumptions you're not entitled to make; be careful about that. So — some code. Everyone type this up now! OK, maybe not; I would have loved to have uploaded it if the network worked. This is the data structure we had before, and at the top line we have #include <stdio.h> — I think I almost skipped covering this before.
I skipped it. #includes in C are literally copy and paste: find the stdio.h file and paste it in at this line, replacing the directive. That's exactly what they are — compile-time copy and paste; the compiler goes and does it for you. So it looks for that header file, and the header file declares which functions exist in the standard I/O set of things — printf, for example. After I've included it, the C compiler knows a printf function exists, that it works like this, that it accepts these parameters, and so on; now the compiler knows what to do with it. That's what #includes really are. You can do really evil things, like #include a C file in the middle of a block of code, and I sometimes do this for code generation: you have a piece of code and a macro, you #include different blobs of content, and you generate different pieces of code as you go. It's not a bad trick — it's perfectly valid, and it's intended to be possible. So, if we have this data structure — and that's how we've told the C compiler the memory is laid out — we can fill it in like this: set val0 to 123, the next one to 99 (I've used decimal here), the next one to 101. I've taken my own executable name as a string and set the char * for val3 to that. I've chosen really bad variable names here — I've done it for brevity and just for illustration. I've then put a full string, "stuffit", over there, which fills that buffer; then another integer, another one, and a pointer to the structure itself, just for illustration purposes. And what we do here is literally find the memory address where the data starts.
We take the address of the data structure and convert the pointer to an unsigned char pointer, so we're going to look at a whole bunch of unsigned characters and walk through them. We start there, and the end pointer is the base address plus the size of the data — however big the compiler decides that is in bytes — and we keep iterating until we hit the end pointer, and all we do is print out the hex value of each byte. We're just rummaging through memory to see what's there. Compile that and you'll end up with something like this: it basically dumps out memory. You'll notice the grey areas seem to be filled with zeros, like there's nothing in them — that's actually luck; that memory happened to be zero, but it might not necessarily be. The highlighted regions are the same colours as in the data structure before: that's where those fields live. But interestingly, this padding area here doesn't have zeros in it — it's got junk, and that was your question from before. It's uninitialized; there's random junk in that memory. Sometimes it's zeros — in fact quite often it's zeros — but sometimes it's not, because the memory has been recycled: it was used before, thrown away, and now it's used again, retaining previous junk. Something to keep in mind for security. So, the same colour as before — the dark grey area has junk in it, and the longer your program runs, the more of the memory you touch will have more and more junk in it. So it's very important not to rely on memory containing anything in particular: when you need it initialized, you have to make sure you put something there.
So, going through it: that's 123 there in hex, the first field as we initialized it; 99 there as an integer; then value 101, written in hex. That's the memory address of my first argument — look at that, there's a bunch of zeros and then 7f; you'll get to know these patterns later. And there's the string "stuffit" over there in ASCII. After it there's the NUL byte terminating "stuffit", and then junk. And that's the location in memory where the data itself actually lives — 7f 00 00, 7f fd — look at that, the addresses are near each other. Yes, that's the NUL byte of "stuffit", the end of the string; it's just long enough to spill over into this column, and then this is the alignment padding. That byte is part of the string — the NUL terminator. So those are our memory locations. Now, in C — in fact in pretty much every single language — when a function is called, or any local scope is entered, it quickly pushes that scope onto the stack. The stack is just something where you keep building: you add something on top, you add something on top, you add something on top. Someone asked: what is alignment based on — the processor's bus width? And is it 8-bit alignment, or do you mean 8-byte?
Byte alignment — and it depends on the type. Bytes must be aligned to bytes, shorts must be aligned to a short, an int to an int, a double to a double, and a pointer to a pointer. So if a pointer is 8 bytes, it must be aligned to an 8-byte boundary; if it's a char, it can sit at any byte in memory. The general rule is exactly that: a type must be accessed aligned to its natural alignment value, which is its size. Intel doesn't require alignment, and if you put an attribute packed at the end of your struct, you can tell the compiler to pack it all down in memory. There's a very, very small performance hit for doing that, but Intel handles it. Other architectures don't even allow it — they raise an exception and might crash your program: the processor cannot access memory at that address. Which is kind of interesting: that's where you sometimes get a SIGBUS on some architectures — you won't see it on Intel. SIGBUS, as opposed to a segfault, means: the memory is valid memory, it's within your memory range, but I can't get to it. Accessing a value on an unaligned boundary, on a system that doesn't allow unaligned access and where there is no kernel trap for it — the kernel trap is optional, because it's very expensive — will get you a SIGBUS.
There's another main reason you get a SIGBUS: memory-mapped I/O. You have a file, you memory-map it into memory, and you access pieces of the file — and if there's an I/O error on disk, you'll get a SIGBUS. You did a valid mapping, you did everything correctly; you get SIGBUS because the processor cannot page that data into the mapping, thanks to the I/O error on disk. There are a few other little cases as well, but those are the main ones. So: every time you push onto the stack, you add a blob of memory; you put more stuff into it — actually, I should speed up — and the more you put on the stack, the more it needs. How many of you learned, in the good old computer-science days, about recursion? Your computer-science lecturers told you recursion is really awesome; they wanted to use functional languages like Haskell or Lisp to demonstrate how great it is, how you can find all of your solutions with recursion. Problem is, recursion is expensive. Every single time the function calls itself, calls itself, calls itself, it adds more to the stack and keeps using more memory — and as I said before, the more memory you use, the lower your performance goes, because you have to access more of it. It is not free, and believe it or not, stacks are not unlimited in size: eventually you'll hit the end of the stack and your program will crash; you can't put more things on it. So be careful, to some extent, how much you put on the stack, and note that some systems have much smaller stacks than others. 16 megabytes these days is not uncommon on most PCs, but I've seen down in the region of a couple of hundred kilobytes. It's to the point where I've had my code crash because I put 16k and 32k buffers in functions — they go on the stack, you call a few of those, and you suddenly blow the stack. That was actually QEMU doing its misc binary emulation: it did a pretty bad job of that,
and I was unhappy, so I redesigned it to use malloc instead and moved it onto the heap. When you exit a function, everything that was on the stack just goes away — everything that was pushed gets popped. You're no longer allowed to access it; you can technically go and rummage through that memory, but it's now undefined, and that's where you can get garbage, because nothing clears it out — it just gets left there. What's the point of clearing it? Someone's going to overwrite it later when it's reused anyway. A question from the audience: functional programming is all the rage in JavaScript — does the same rule apply, that a more functional style means a bigger stack? It does tend to happen: the more function-heavy you are, the deeper your stack gets, and in JavaScript you're maintaining a stack regardless. Regardless of the language — any language I know of maintains a stack, because you are calling into something and you've got to return somewhere. And even procedural, low-level languages can recurse; they can be almost as functional as functional languages. It's not what you call it that counts, it's how you use it that counts — that's what makes things good. You can also dynamically allocate on the stack, not just declare variables: that's alloca. I have no idea why they named it that. It allocates n bytes on the stack for you when you need them — but be careful not to allocate big things, for the same reason: eventually you blow your stack. The heap is basically everything that isn't the stack. Normally you get heap memory by calling a function called malloc.
My cat's name is actually Malloc — everyone thinks that's really awesome; I named my cat after a C function — and my next cat will be called Free, because you get rid of malloc by using free. You can also use calloc; calloc is interesting in that it guarantees the memory is zero — all the bytes are set to zero by the time it returns. realloc changes the size of an allocation, but it may move it around in memory if needed; thus it returns a new pointer to you. And mmap is an interesting one I'm not going to cover here — it's probably too advanced. It's a way of going straight to the operating system and saying "give me memory". It's expensive, of course — it's a system call — but it moves memory outside the normal libc heap area, and you manage it yourself. There are special reasons you might want to do this, and we can go into them later if you want to know. But remember: if you're getting memory from the heap, you have to remember to release it. It's your job to say "I don't need this anymore" — that's what free is for. If you forget, you're going to have leaks; it's going to be fun. Don't do that. Valgrind will track all these functions, and it can tell you things like: you had a pointer stored inside a data structure, you freed the data structure without freeing what it pointed to, and you haven't got that pointer stored anywhere else — you've leaked memory, because you've lost track of that pointer. Valgrind can figure these things out; it's awesome, best thing on the planet. Now — this is the same code as before, with one change: I've called a pause function down here. pause does exactly what it says: it pauses forever and never comes back. Well — it waits for a system signal. If you send the process a signal, like kill -HUP or kill -USR1, that will release the pause.
It may also kill the process, depending on whether the signal is caught or not. But I did this for a reason: so the process would start and not exit. We compile it, and then I can go and look at the memory mappings of the process. There's this wonderful tool called pmap — print the mappings of a process. Give it the process ID of anything and it'll print you this. I chose a very, very simple application, so it's a simple mapping. You can see here in red — that's where the stack is, and the stack has a 136k mapping; that's all that's been given to it in this mapping, so you could probably blow the stack with more than 136k. This here is some magic anonymous memory, 4k, read-write; we don't know what it is. This one, read-and-execute, is from libc. So when you run ps or top and you see a process apparently using 500 meg of memory, it probably doesn't: it might have mapped shared libraries into memory, and everyone maps the same shared library. That means libc only exists once in memory for every single application you run on your system: they all share libc, and that same bit of memory is shared between every process you have — it's not duplicated. So it's important to know: read-and-execute, that's a bit of code there. This next one is for running the dynamic linker, which figures out what to link in — like libc.
That's also code. This here is probably some global variables, and this might be other constant data, which is not executable. Different parts of that file get mapped into different regions: this part of the file is now this part of memory, this is now this, and it's accessing the file directly. Why the different permissions? You need execute permission for running executable code; you need write permission for things that actually get written to, like global variables. For writable global variables, Unix does copy-on-write: it first maps them in in their on-file state, and when the first process writes to one, the kernel makes a private copy of that page for it — now you have a private copy of your one variable. That's why it's read-write. If a region is read-only — try to write to your libc code — it'll crash; it won't let you write, because it's not writable, and that's actually the right thing to do. So each part of these files gets mapped in differently, and anonymous memory is a way of going to the kernel and saying: map in no file at all — just give me pages; you don't have to back them onto a file. And the kernel will do that: it gives you fresh pages. libc's malloc will use that trick to get large chunks of memory, and otherwise it manages its heap, moving it around with brk and sbrk. Another question: this is obviously a shared binary — if it was static, the binary would have its own copy? Correct. You see this region marked read-execute here — that's the code from that binary. If it was static, the libraries would be rolled into that, it would be a lot bigger, and the permissions would be the same. And everything listed here against the binary's own path — that's the binary itself.
So we've got some code there, probably some constant values, and some read-write globals you can read and write to. Those regions would just become bigger if I'd statically compiled things in, and then you'd have fewer of these extra files mapped. The other mappings would probably still be there — the anonymous ones, and definitely the stack. The ld.so mapping would go away as well: ld.so is the runtime linker, the thing everyone links against; it goes and finds libc, and finds the next library, and the next, and links them all in before your main function is called. That's its job. So even though it says there that the process is using 4.2 meg of RAM, it really isn't — it's actually probably using a tiny fraction of that. The number lies. So remember: knowing how your memory looks is actually very useful for debugging, and for knowing whether you're leaking, and so on. Now, back to the stack. When the stack grows and shrinks, it just increases and decreases the stack pointer — in this case by plus or minus 16. Stacks may grow up or may grow down: they might start at a high address and go down in value, or start at a low address and go up. Starting at a high address and coming down is actually quite common, but it's an abstraction, so I think of a stack growing up.
Real life just turns it upside down. The stack includes the parameters you pass to a function; it also includes space for the return value and a return address — where the CPU should jump to continue, once it finishes going through this bit of code, back where it was before it called the function. And it includes the local variables of each scope, as we were talking about before. So here is a little program I wrote to quickly dump through memory. There are good reasons why I put a lot of recognizable hex values in it, like 7777 and 8889 and so on. At the top I just have a function to dump memory: it goes from one pointer to another, printing out whatever's in memory — rummaging around through it. You can do this if you want to; it's kind of not valid conceptually, but on the machine level it is valid in practice. And then I have a function that recurses, in a very limited way: I pass in these hex values, a counter, and a pointer to the top data value, and inside, the function has some local data. If the count has reached zero, it dumps memory and returns; otherwise it calls itself, decreasing the count — three, two, one — and when it gets to zero it dumps and then heads back up, incrementing as it returns. If you compile and run this, you'll end up with memory like this. That's your stack; that's what's sitting there — a lot of zeros. Unfortunately it wasn't aligned nicely, so I shifted it over a little bit to make it easier to read.
You'll notice some of the hex values from before — 7777 is lurking there. You'll notice the start address and end address, 7fff-something: that's where one of the addresses is. If you look carefully you should find an ff 7f somewhere here — here we go: ff 7f, blah blah blah, and it actually continues on the previous line; that's zero padding up there. And looking back here, there's d3 d3 bb bb — you'll see there's a bunch of pointers sitting in the stack. They refer either to a return address or to a variable on the stack, like maybe the top data value. Most of the stuff is living there in the stack. And with the alignment fixed, it's now easier to see that this here is actually a pointer — but written in reverse, because of little-endian, et cetera. All the data is there if you go rummage through it. So whenever you call, the stack grows. In this case, think of the stack growing down: function zero calls function one calls function two calls function three calls function four, and every call adds something to the stack — a return address, frame pointer, parameters, and local variables. Now, an important thing: buffer overflows.
Everyone here has heard of buffer overflows, right? A buffer overflow is a silly thing: you have the "stuffit" buffer here, and you go s, t, u, f, f — and you don't stop. You continue walking through the buffer, and if you're writing, you walk on and start writing over this memory, and this memory, and write and write and write. You can modify the return address, for example — use the return address to jump to a different function. That can even let you pass parameters to that function and execute something: you call the system function and pass it /bin/sh, or, you know, rm -rf /. That is how buffer overflows work in principle: an attacker finds someone who hasn't checked their buffers, writes past the end, basically nukes the stack, does something bad, and gets the machine to do something it was never intended to do. You do need to know how memory is laid out to do this stuff — so if you want to be a security researcher, you need to know it, and if you want to protect against it and write good code, you also need to know it. But actually, I kind of lied to you: not all parameters are always passed on the stack. They're conceptually passed on the stack.
There are optimizations where CPU registers are used to pass them, so you might find the parameters aren't visible on the stack — or you'll find that the previous function's parameters get written out to the stack to save them, because the registers are being reused for something else. If you only go one level deep, nothing may need to be saved at all: the callee might just use the registers, if it doesn't need too many local variables. Return values can also be passed in registers — it depends on your architecture and your ABI, basically. So, as we were talking about before: CPU registers are where your variables sit in the CPU, where you have direct access to them — 32 or 64 bits each. You get instant access to this stuff; you generally have to move data from RAM to a register and then back again. Note that some CPUs have instructions that can access memory directly — "add this value to whatever this pointer points to" — and the CPU will go and do that for you, hiding it, without needing a register to stage the value in first. But the best way to think of it is that the data does have to be loaded from memory, maybe into an invisible, unnamed register, and then written back out. Variables get assigned to registers on the fly, and which variables get assigned is basically invisible to you. You can hint for a variable to get a register by putting register in front of it — register int i, for example — but the compiler takes care of it, and the chance that you can do better than the compiler is effectively zero. Compilers do a very, very good job here.
So don't worry about that. An example: if you have three variables i, j, and k, and you use them in these three loops, the compiler is not going to need three registers for i, j, and k. It'll probably recycle the same register for all of them, because it doesn't need them any more afterwards — and if they're never used after that, the compiler knows it and will just use one register for everything. So don't get into the thought that the more local variables you have, the more registers you need; it doesn't work that way. What matters is the number of values that are live at the same time. So: heap. Now we pass from stack to heap memory — that's where we store permanent things, or "permanent", as you might say. It doesn't really have a structure like a stack; it's just all over the place, and how it's structured is totally system-dependent. It may depend on the size of your allocation, when you allocate it, how often you allocate, et cetera. You have to track every allocation — it must be done — and libc provides functions for this: malloc, calloc, realloc, and free. That's basically your main set. You'll probably use malloc and free mostly at the beginning, then every now and again stray into realloc, and eventually discover calloc and go "oh, I didn't know that". My hint: don't wait to discover calloc. It's awesome — use it from day one. calloc guarantees your memory is zero. You know I said before you can't rely on things unless someone gives you a guarantee? calloc gives you one. calloc is also good because it can optimize: if the memory is already zeroed and it knows that, it won't re-zero it for you. So here's an example: allocate some memory, fill it with characters — I keep looping around, every 10 characters picking one — and then print it out. malloc to get it, and then free at the end.
That's how it works.

realloc: you have some data and you want a bigger allocation, so you ask for a bigger one. Notice it may return NULL. What happens if I write data = realloc(data, size) and it returns NULL? I lose my pointer and I leak the memory: I no longer have a pointer to the data, I can't even free it. That's why you always assign realloc's return value to another pointer first, and if it returns NULL, handle the error; here I just use abort().

calloc does the same kind of allocation, but its parameters are different. It takes a count and then a size: how many of what. Here, one lot of one megabyte, please, and make it all zeros. Simple. So calloc is awesome; remember, guaranteed zeros. I use it a lot. It basically saves you from bugs: you don't have to remember to initialize every single value in a data structure you're allocating, because it's guaranteed to be zeroed out.

Yes: if you absolutely know you are about to fill this memory yourself, then there's no point having someone else fill it with zeros only for you to refill it. And notice I said data structures. If I'm allocating a lot, say 15 megabytes of memory for an image, for pixels, I don't care. First of all, I don't want someone zeroing 15 megabytes of data; that costs. And it's pixels: if there's a bit of junk there, you'll see it, but it won't crash the application, and I'm going to fill those pixels and render to them anyway, so I don't want the zeroing. That's kind of my dividing line. If it's a smallish data structure, I'll use calloc. If it's a data structure I'm continually allocating all the time, I might avoid calloc and specifically initialize the fields by hand, because I want to avoid the overhead. Otherwise, for larger things, I'll use malloc, but I'll be careful about it.
I'll pick and choose. One thing to note about realloc: if you realloc memory you calloc'd before, there's no guarantee the extra memory it extends the block by is zeroed. realloc doesn't give you those zeros; there's no magic "crealloc" or whatever.

The main reason to use calloc instead of malloc plus memset to zero the memory is that calloc can optimize if it knows it got fresh memory from the system. The kernel guarantees that all fresh memory given to your process is zeroed, for security reasons: otherwise you'd be able to see what other processes left in their old memory. Imagine some other process running under another user ID with a password in its memory, and you happen to get their old pages: "oh, a password, thank you". No, of course not. Yes, so the allocator may reuse your own previously freed memory, and that's where you get junk.

Also: libc is the basic toolbox. It's really basic; it's not really great, but it will do your basic I/O (fopen, fclose), memory management (malloc and friends), string handling (which is really painful; I hate doing strings in C, just avoid strings where you can), and all the POSIX system calls: read, write, mmap, fsync, kill, and so on. But beware: an actual system call is expensive. You go into the OS and come back out, and you can easily lose a thousand or two thousand cycles. Often you might even get your TLB flushed, which means your caches are no longer hot with your own data and have to be refilled. System calls hurt; don't make system calls if you don't have to. And there are tons and tons of other libraries out there. Don't stop at libc.
C has so many libraries it's not funny, and you should use them, because this saves you time. EFL is one I work on: it does data structures, I/O, network I/O, main loop handling, GUI rendering, and so on. GLib does data structures as well, plus I/O and main loop handling; GTK is often used together with it and does the GUI side. There's libjpeg for JPEG, libpng for PNG, zlib for compression, OpenGL for graphics if you want to do rendering (it deals with accessing the hardware for you), FreeType for font rendering, and so on. There are lots of libraries: use them. Don't rewrite things yourself; if you do, your life will be painful. Don't do it unless you know what you're doing.

OK, now file I/O. These are the basic libc file I/O calls. This one just opens a file in read mode; the mode is a string for fopen ("r" and so on; I don't know why they did it that way, but they did), and the other one I open for writing. It takes the first argument and the second argument and copies file one to file two: a really simple file copier. It reads into the buffer, and I use sizeof to know how big the buffer is. Read and write, read and write, and print out the counts. It's not hard to write a copier; it's very basic. If you read someone's code that does this, you go "OK, I understand that". The nice thing about C is that you can read people's code and generally understand what it's doing; it's usually not that mysterious. If you run it, it tells you what it read and wrote each time; at the end it didn't quite get a full 1,024 bytes, it got 768, and it wrote that out.

Now this is a more interesting one. It uses a library: zlib, a compression library, with compressors and decompressors. As you can see up here in red, I've included the zlib header, so I've said "please get me the functions from zlib", and now I'm calling functions from zlib.
How do you know those functions are from zlib? Well, the way I know is that I read the file zlib.h. It actually has documentation in the file, in comments, so just go there: it lists the functions and, next to each one, comments telling you what it does. Go read it. I don't understand why people don't read more header files. The first thing I do when I get a library is read its header files, because I'm looking for the functions, the definitions, and whatever documentation should be there.

All we do now is find out what our compressed buffer size should be: given this input buffer, how much bigger the compressed buffer might need to be. We open a file here, open a file there, and read. If we actually manage to read something, we compress it into the compressed buffer, so the data goes from here to here (read the documentation if you want to know exactly what it does), and then we write it out. We write out the size of the compressed data first, because it'll be different each and every time (that's dest_size, which is in fact a long, up here), and then we write the actual compressed data. So you're basically writing a small header saying how big the blob is, then the blob of data; how big it is, blob of data; how big it is, blob of data; and so on. You end up with something like this: each time it writes out not 1,024 bytes but 313, then 540, and so on, so you can see the compression changing as you go; different parts of the file compress better or worse.

I was compressing as I was writing: I read 1,024 bytes, compressed them, and wrote out whatever that compressed to. You would probably use larger chunks in practice, but you normally do it chunk by chunk. No, it's not a stream; this is done in chunks, and there are reasons for that we'll get to later. So, as I said, I've included zlib, and the one thing you have to do is tell the compiler to link to that library: -lz. Again, read the documentation for your libraries; it'll tell you what to do.
There are wonderful build systems that hide this from you in a standardized way; I don't have time in this session to cover them. But basically, -lz links libz, and the same goes for libm, libeina, lib-smelly-skunk, whatever: they all work the same way. It's very simple.

So I don't think we're going to get to do the exercise. I was going to have you, as an exercise, modify this to decompress the file we just compressed, but we don't have tables here and we don't have much time, so I'll just walk through it. This is the reverse: it decompresses what I just compressed. Again, it opens the file, reads each chunk header, uses the uncompress function on each chunk, and writes the result back out, producing the same file we started from. It's not that hard to do, and we close our files at the end, etc.

There are some things wrong with what I wrote. I don't check the results of fopen; it'll return NULL if the input file isn't there or the output file can't be opened for writing. I use unsigned longs in the chunk headers, which is system dependent (is it 32-bit or 64-bit?) and also endian-specific. I could have made the headers shorter, say 16 bits, because they'll never be much more than 1,024 for the chunks we have. I don't look for short reads on decompress: I assume that if a header says this many bytes follow, they will, and I don't handle the error case where there's an actual file I/O error. And I don't check my malloc results. Bad man; I'm a bad man. You should check. Although sometimes you don't really need to: it returns NULL, you try to access the NULL pointer, and your machine will crash. That's OK.
Well, your process will crash, strictly speaking, not your machine. And if you were going to abort or crash anyway, it's the same thing. I probably also should have opened the files in binary mode for the reads and writes; yes, the text/binary distinction is a leftover from the days when systems treated text files differently from binary files. That doesn't happen anymore on the systems we care about.

So, advanced topics. You can have pointers to functions. Functions are just addresses; they live somewhere in memory, so you can have a pointer pointing at a function and say "call whatever is at this address". It's a good way to indirect things: you can say "call this later, when the thing is done". You can store a function and call it later, which is kind of what you were getting at before about C and functional style. They're often called callbacks, because you call them back, just like on a phone.

(Why did the slide go back there?) So in that example, I created a typedef for a function type and just stored functions in an array, and then I do atoi on the first argument: pass 0 and I call func one, pass 1 and I call func two, pass 2 and I call func three. It just picks which one to call.

Even more advanced: you can do OO in C. You combine structs, function pointers, typedefs, heap allocation, and so on. Many people do this. The Linux kernel has OO in it: it uses structs with function pointers to define which function handles the reading and writing and so on of something. GTK does it, via GLib's GObject. EFL does it; there's Eo for that. And this is just a really simple example of one I brewed up.
This is the very last thing. I've created an object type and a public type; the public type is the set of functions I expose. The object struct has the public part at the beginning, the public function pointers, and then some private data after that. Here's my implementation: I define some functions that actually exist, and I create a class which says: this is the create function, this is the destroy function, this is the set-text function, with each function implemented over here. Then, over here in OO style, I do class.create to create an object, get an object pointer back, call set_text (yes, I do have to pass the object again, plus the text), and I can call destroy.

Now, if you keep the same struct layout at the beginning, with the same create, destroy, and so on, you basically get inheritance, because the function pointers are at the same locations in memory. So you can call destroy, set_text, etc. on any object that inherits from the same class. That's kind of how you do OO in C; yes, you can get much more elaborate.

Anyway, I didn't cover a lot of stuff: gdb, Valgrind. Electric Fence we actually did cover; you pointed it out. Most of the rest is just putting the tools together to make something better. And we're done just on time. Perfect timing. Thank you.