Hello, folks, welcome back to another Rust stream. This is going to be another one where we implement something from scratch. And in particular, it's gonna be one of those streams where we go through basically a sort of guided set of exercises or challenges. We've done a couple of these. So we did one following the Fly.io distributed systems challenge. And we did one where we did the CodeCrafters implement-your-own-BitTorrent one. Both of those were really popular, and I've been talking to a couple of people, and it seems like one of the reasons is that people can try the thing on their own at their own pace and then sort of switch over to the video and use me as a sort of reference as they go, comparing their solution to mine at each step of the challenge. And so, you know, if that is a way that people enjoy going through the learning, and I totally understand why, then I should do more of them. And so here I am, I'm gonna do another one. I also think they're really fun. So today we're going to build Git from scratch. We are gonna follow a CodeCrafters thing here as well. Apologies for the saw outside my window. Hopefully it's not too bad. So we're gonna build Git from scratch following the CodeCrafters Git challenge. I've been told that the last step of this challenge, cloning a repository, is the hardest single step in any of the CodeCrafters challenges. So we'll see whether we get to it. Depends how far we actually get. But we'll see how it all pans out. Now, CodeCrafters is not free. It is a paid site, but there are two ways that you can sort of get access to this. The first of them is, I have a referral link you can use. I'll put that in the video description. I'll also put it in chat here. So if you go through that link, then you get like seven days free or something. They also have the entire challenge on GitHub.
I'll put that in the video description as well and in chat. That has all of the sort of steps of the challenge and the text of the challenges. It doesn't have all the nice infrastructure that is on the site for like running your test suite and stuff, but at least it's there. And I think all the tests are there too. So if you don't want to pay for it, then you can do it that way instead. And as a sort of disclaimer at the beginning here, I'm not sponsored to do this stream. Like, no one has paid me to make this video. I do have the referral link, so if people like it and pay for it, then I get money from it, but it's not a paid thing. Before we get started, there's just a small amount of housekeeping. And I know people hate housekeeping because it delays the start of the stream, but we'll do it anyway. The first one is that there is now, if you haven't seen it already, a Discord for, I guess, me. Specifically, Jay here has a Discord server that I happen to be on. The Discord server mainly has sort of announcements of whenever I do new videos, when I'm planning to do new videos, and other things that might be relevant, and sort of live notifications and such. And you can find that at discord.johnhu.eu. That'll redirect you to the invite link. And I guess I can also put the invite link right here in chat so that you have a handy way to get to it quickly. There are a bunch of other channels too that are only available to people who sponsor me. So that brings me to the second thing, or the third, I guess, which is that I now have a GitHub Sponsors. There is no requirement to sponsor me. You don't get anything particularly fancy except access to a couple of these Discord channels. The main thing is, if you find that you've gotten something valuable out of the streams that I've done and the other content that I've produced over the years, then I would appreciate it if you could sponsor me and sort of help me do more of this.
But I'm also in a stable job, so this is not a thing that sustains my living. It is more of a way to sort of go the extra mile. Okay, and then the last bit is I will be at Rust Nation at the end of March, and Helsing, who I now work for, is sponsoring a hackathon at Rust Nation. So I'm going to be there running a hackathon with remote-control drones and cars. I think it's gonna be pretty cool. If you happen to be there, stop by and say hi. And that, I think, is all the housekeeping I wanted to do. And so now we can get started. I have not looked at this challenge at all before starting, so this is sort of my normal style of how to do these videos, right? I don't wanna go in already knowing what I'm gonna build, because if I already know what I'm gonna build, it's gonna be much less educational for you, and it's also fun to watch me get stuck. So we're gonna start this challenge sort of blind, and hopefully this will go well. We'll see how far we get through the challenges. My guess is, based on the difficulties here, my guess is we'll get to the last one. We might not get through the last one. Sort of depends on how long I'm willing to stream today as well. I'm guessing we'll stream for four to five hours, but we'll see. Oh, okay, are there any questions before we start, before I click the start building button here? "The saw is very authentic, good, just like at home." The Discord is for anyone; like, you don't need to be a sponsor to join the Discord. It's just that the only channels that are on the Discord, if you're not a sponsor, are sort of the announcement channels; like, there's no actual chat channel. And then, depending on the tier you sponsor at, you get access to sort of a community chat channel, one where I post interesting tidbits that I come across, and then this sort of goes a little bit up from there.
So there's a more frequent Q&A tier, and there's also a tier where you can sort of suggest additional streams I should do and such, further up. "Will you implement rebase?" Probably not; I think that rebase, to me, would come after cloning a repository here. Like, I think it's worth pointing out that I've worked a lot with Git, so it's not like I don't know what Git is. And in fact, I know a decent amount of the data model for Git too, because sometimes that's how you have to debug why Git doesn't do what you want. I just haven't actually looked at this challenge before. Prerequisites for this: this is going to assume that you know Rust. Like, I'm not gonna teach you Rust in this stream. But if you don't know Rust, hopefully you should still be able to follow along; it might just be some parts of it where you're like, oh, I don't know what that piece of code does, or what the syntax does, but you should still generally be able to follow along. But the goal of this is someone who generally knows Rust and then wants to sort of see someone with experience building this. And hopefully over the course of that, you'll get exposed to some techniques, some libraries, some ways to use the standard library, and just techniques for programming in Rust more broadly, and potentially also debugging techniques. So I'm hoping to sort of teach some intermediate Rust concepts as we go through this as well. If you want an intro to Git, one thing that I can recommend is, so a couple of years ago I ran this class, or like a mini class, at MIT with two other labmates of mine, called the Missing Semester of Your CS Education. It's missing.csail.mit.edu. I'll put the link in chat. Nope, that's not the link. This is the link. And it has a lecture on Git. It's not given by me, it's given by Anish, who's also a great lecturer. And if you go in there, it has both a bunch of written notes throughout this.
And it also has the entire lecture video, at least in theory, if I can get that to load. And so this one is, I think, a really good walkthrough of sort of the mental model of Git, but also the data storage model of Git. We'll get into a lot of the details of this as we go through the stream. But if you're completely new to Git, I would recommend checking that out. All right. So let's then get started. I don't think there are any other burning questions. So let's go for it. Start building. I would like to do it in Rust, please. Language proficiency: I'm going to go with advanced. Next question: how often do you intend to practice? Once a month. Accountability? I'll pass on accountability for this. All right. Step one, clone the repository. That's fine. We can do that. git clone. Okay. And then they want an empty commit. I can do that. Right. So one of the things that CodeCrafters has set up, and I remember this from when we did the BitTorrent challenge, is that there's sort of a push hook. So whenever you do a git push, it runs the whole test suite over your stuff. You see here, it built the Rust app. It ran the thing. And it's telling me that, basically, test one failed. Which is totally fine. That's sort of expected. I guess we can go here. Woo. Okay. Great. They received the git push. So that means we're now set up. Okay. That's fine. Your next stage: implement the git init command. Okay. Okay. So the idea here is that they'll run our binary basically as the git binary. And the git binary has an init command. It initializes by creating a .git directory with some files and directories inside of it. Okay. Woo. Yeah. What's in the .git directory? A bunch of text files. Yeah, that's fine. I'm guessing this is the same thing that's described below. All right. So we've got to look at what we have in here. Let me get another couple of terminals in here. Let's see what we have. So where's main? All right. I like clap.
So we're going to go ahead and add in clap here. And then I can never remember the setup for this. It is like this. Oops. Great. Like so. And the arguments we want here, actually, we want subcommands, right? So, in the clap derive reference: Subcommand. There's the example for subcommands. Yeah, here. Okay. So what I actually want here is I want subcommands, right, for things like init. And command here is one of these guys. And I don't actually have any global arguments as of yet. I do want to include here Subcommand, which I want to derive. And actually, let's use the tutorial instead, which has Subcommand without all the extra bits. So I don't need this. Don't need this. Get rid of this Init. And initially, Init is going to take no arguments. We do want to derive Debug for it. And then down here, instead of doing the sort of env::args thing that they're using here, what we will instead do is match on args.command. And the only valid command initially is Init. And then we're going to have, you know, this. And then we don't actually need an else branch here, which they have for unknown commands, because clap will take care of just crashing for us and giving a useful error message if a subcommand was given that's not a thing we support. And then, okay, what do we do here? We create the directory .git. We create the directories .git/objects and .git/refs. And we write "ref: refs/heads/main" to .git/HEAD, and we print "Initialized git directory". All right. Just to see what that does. That's fine. That's all fine. First exercise. git push. Let's go look at what it says here. So we get an explanation as well. Okay. Yeah. Okay. So we have a structure. We have a HEAD file. The objects directory contains Git objects. The refs directory contains Git references, and HEAD contains a reference to the currently checked-out branch. So in this case, it says that the main branch is checked out. Okay.
So let me give you a brief overview of Git here rather than sort of diving straight in. So this is linking into the Git book, which is a good read, but just in the interest of time here: in Git, everything is stored as, roughly speaking, a content-addressed blob. So that means if you have a file, for example, the file is really stored as all of its bytes, keyed by the hash of those bytes. And so this is an object. Similarly, a commit is also an object. So a commit is an object that has a, well, actually there are a bunch of things in between here. So a file, sort of, the blob is the lowest-level thing: it's an object whose key is the hash of the blob's contents, and whose contents are the contents of that file. A tree object is something that holds something like a directory, and what it really holds is a list of names and object keys for the contents of those files at that time. So you can imagine that the tree for, well, I don't really want to use the same thing, but let's say I do here a mkdir foo, and I echo hello into foo/bar. Then the way this will actually be stored is, this sort of thing will be stored somewhere. And that is the tree entry that we would have for foo/bar. So for the repository here that's rooted at foo, we would store the hash of the contents of foo/bar and the name of foo/bar in a tree object. And we would also, separately in the object store, store this hash and the contents of foo/bar, like this. And then where would this, you know, this actual string right here, where would that be stored? That would be stored in a tree object, and the tree object is keyed by the hash of this literal string. And where is that hash stored? That object key is stored in the commit object.
So when you create a commit, the commit points to a tree; in particular, it points to the object key, the hash, of the root tree object, and it also includes information about the metadata, like who authored it and such. And then the commit is stored as all of the bytes that make that up, keyed by the hash of that content. And that hash is what we know as a commit hash. So if I do git rev-parse HEAD, this hash right here is the hash of the commit object that HEAD points to. And similarly, you know, if I do master here, for example, it's the same hash, because master and HEAD currently point to the same thing. If I did something like HEAD^, which means the previous commit, this is the hash of the commit object before HEAD. And I can even do cat... No... What is the name of that command? Maybe it is just cat-file. So cat-file lets you print out what is contained within an object. So for example here, we can do git cat-file commit HEAD. And this, sorry about the chainsaw, this is the entirety of the contents of the commit at HEAD. And if I did a rev-parse of HEAD and then I did a cat-file of this commit, it prints the same thing, right? Because ultimately the commit is keyed by its hash, just like everything in Git is. So the core of Git, in a sense, is this object store. And the object store just stores a map from hashes to the body of the thing that hashes to that value. So whether it's a commit or a tree or a blob, whatever it is, it's all stored in the same object store that has the same rules, which is that it's content-hashed storage. And so you see, when we look at the cat-file here of this commit, you see its contents: the tree for this commit, so basically the contents of all the files, recursively all the way down, is this hash; the parent commit has this hash; and then the information about the author, the committer, and the commit message. And then we can keep walking down, right? So we can, up here, cat-file tree this tree.
And this might not immediately look very reasonable, but this is basically a serialized representation of the tree. And in particular, the things you'll see here: you'll notice that there's the mode of the file, so like, is it readable, is it writable; the name of the file; and then these bits in between are basically a binary representation of, you guessed it, the object key for the contents of those files in this particular tree, in this particular commit. The reason why you have this structure is because, for instance, if you create a new commit where you change nothing, right, so we did a commit earlier that was like --allow-empty, then it can use the same tree. So you don't need to store the entire tree twice, once for each commit. You can have two commit objects that have the same tree reference embedded in them. And similarly, if you have two trees that differ only in, like, the contents of one file, then the tree objects are going to be different, but they're going to reuse the same object keys and blobs for most of the files, except for the one file that's changed, right? And so that's sort of the setup here. And you'll see here, like, you see here is source, and then source just points at its own tree, right? So if I, there's not really a nice way to turn this back into the hash that we need. But there is then a tree object for the sort of hex equivalent of this binary string. And if we cat-filed that, we would see a similar tree object that would list all of the things that are under source, which is main. Okay. So that's what we're creating here. So we're creating objects, which is the sort of root store for this content-addressable store. And then refs is really just a mapping between human-readable names, like the name of a branch, for example, and which commit hash that reference points to. So if we in fact look inside the .git that we have for this and we look inside of refs, you can see here that the refs are in heads.
You can see master. And if we cat master, you'll see that that is the same hash that git rev-parse gives us back for HEAD when we parse out the commit hash. So that's just where those are stored. If you had multiple branches, there would be different heads under here. At the moment, we only have the one. But under refs, we also have remotes. So remotes is the same kind of refs, except for references that are stored remotely, such as branch pointers on GitHub, for instance, from the last time you did a git pull. So if I do a cat of origin/HEAD here, for example, you see that it doesn't have a commit hash; instead it says HEAD is the same as, that's what this "ref:" here means, the same as refs/remotes/origin/master. So that means HEAD is pointing to master. And if we cat master, you see that it's the same thing as here, because we did a push. If I do a git commit --allow-empty, like this, "test two", then now if I rev-parse HEAD, you see this commit hash is now different. Similarly, if I now cat master, that's the same, because that's the local one. But if I cat the master in the remote origin, I still get the old commit, because that's where that points. And so now you're sort of starting to see how this comes together. When you do a commit, you update your local refs. When you push, what you really do is you're telling the remote: hey, update your refs' master to have this commit instead. This commit hash, rather. The other thing that's in refs is tags; obviously every tag has some name and has to point to the hash of the commit object that that tag is pointing to. And if we look at objects, you see this is almost entirely just a list of files by hash. And in fact, we could even find here, so HEAD was that, so .git/objects/6f. You'll see here that the first two characters are used as a directory name and then the rest are flat files. The reason for that is just because Linux doesn't like it if you have a bajillion files all in the same directory.
So splitting by the first two characters of the hex just allows the file system representation of this to be a little bit more efficient. But you see here 6fc85..., 6fc85.... So this is the same object. And indeed, if we now cat this, it's a binary file, but really the contents of that is the same as this. It's just a compressed version of it. Okay. "If you hex dump that, we see the same hash for tree." So you want to hex dump this file? I don't think that's going to help you too much. No, this is not just, like, the binary representation of that, because then the contents would be the same. I believe these are compressed; I forget exactly. I'm guessing we're going to get to that in one of the first exercises. Okay. So hopefully now you have a sense of what's in each of these two directories. So there's objects, which is the object store; refs, which we just talked about; and HEAD, which is just a special kind of ref. It is specifically which commit is currently checked out. That's what HEAD means. And initially that's just going to be the main branch. That's fine. So we did this first thing where we just create the correct directories, and then we also create the HEAD file. So if you look at main here, we created .git, we created .git/objects and .git/refs, and we created this .git/HEAD file that points to main. That's all we did. Git actually creates more files and directories, right? So if you go look in .git, there's a bunch of other things here like logs and branches and stuff. We're just going to ignore those for now. These are all we need to get started. "They're zlib-compressed," chat says. That sounds about right. "Git HEAD has a newline at the end." That's fine. Okay. View next stage. Read a blob object. So this is the cat-file command that we actually just used. And so I'm guessing here the goal is going to be that you can cat-file some blob and it just prints what the contents of it are.
So I'm guessing here we're going to get to the point of doing a decompress, right? Which is exactly what we expect. We'll deal with three Git objects: blobs, trees, and commits, right? So we've talked about all of these so far. Blobs are sort of the lowest-level thing, which is just the contents of a file. Yep, only the contents of the file, not names or permissions. And this is so that, for example, if you have two copies of the exact same file in your directory tree, it's only stored once, because the tree just stores the same hash twice under different names. Yes, it's a SHA-1 hash, known as the object hash. This is an example. Git object storage: they're storing Git objects. We looked at that. We looked at how the path would be structured. Each Git object has its own format for storage, right? So blobs, trees, and commits have a different on-disk binary representation from each other. And this is presumably because you want to optimize both for how they're used and for how they compress best. I'm guessing we'll see that in a second. So for blob storage, they are zlib-compressed. Yeah, so whoever was in chat was totally on board with this. The format of a blob object looks like this: "blob", space, the size, a null byte, and then the contents. Okay. So we can sort of verify this if we want, right? So the... what is the easiest way for me to do this, actually? I just want to see if I can find, like, if we look under .git/objects, and we look at, I don't know, f9... That doesn't look like a blob. 60... That doesn't look like a blob. But notice how it starts with the same kind of thing. So I'm guessing these are then probably either commits or trees. 9f... So that one starts differently, so that's a different type of file. 6f... That we know is a commit, because we looked at that one earlier. 73... Okay, yeah, so these are all compressed. I don't think we're going to get something useful out of just looking at them.
So let's instead try to just implement this first bit. So we're going to have to... I see, because they're written like this and then zlib-compressed, all we're seeing is the binary output of the zlib compression. And so we need to zlib-decompress, and then we'll see this bit. Okay, so let's then add here a CatFile. And we should be able to here now do Command::CatFile. And obviously cat-file actually takes some arguments, right? In particular, it takes which thing you actually want to print out. So it takes a... what's the -p here for? Let's go look at cat-file. So cat-file takes a -p, and I'm guessing p is "pretty-print based on its type". Okay, so let's go up here and say that CatFile is going to take a pretty_print (pretty print, I can't type), which is going to be a clap short flag, like this. So it's going to take that, but it also is going to take a positional argument, right? Which is the object hash, which is going to be a... we can choose how we want to structure this, right? So we could say a String. Alternatively, we could say something like... I actually wonder whether... Yeah, I was afraid of that. I was hoping that clap might be smart enough to realize that this is something that can be parsed from a string, to validate that it's exactly 40 characters. I think instead what we'll do is just take a String here, and then if they give a hash that doesn't have the right length, then we'll just error out. There is actually the point here that, I believe, cat-file is a little bit smarter in that it allows you to abbreviate: you only need to give the shortest prefix that is unique. So here, because there are no other hashes than the one that starts with this, it actually lets you get away with specifying fewer characters. So that's even more of a reason here to use String. So this is also going to have the object hash. And then in CatFile, we go down here.
Yeah, so we'll have to read the blob object, decompress it, extract the contents, and print the contents to standard out. Okay, shouldn't be too bad. The cat-file output must not contain a newline. That's fine. You can use flate2. Okay, so here's what we're going to do. We have to open the file. In fact, we don't... So there are two ways we could do this. We could either read the entire file into memory and then decompress it in memory and then do what we want with the output, or we can open an I/O reader here, which is realistically the nicer thing to do. File::open, and we're going to try to open... I'm going to leave a TODO here: support shortest unique object hashes. Right, because what we're going to do here, for now at least, is to just open a file at exactly where the path should be, according to the object hash. But realistically, we basically want to use something like glob here, to put a star at the end and find the number of files that match that prefix. And if there is more than one, then we error, saying you need to choose. If there's only one, then we give that file. But for now, let's do the whole thing. So what we want to open here is .git/objects/, and then the first two characters, and then the rest, with format!. So that is going to be the first two characters of object_hash, and then the rest of object_hash, like so. So that is going to open the file. And then we're going to use the flate2 crate. The flate2 crate basically provides different kinds of compression and decompression things. In our case, what we want is a reader, and in particular, I want to decompress. Do they have a copy-paste example I can just use? Because that would be nice. This is for compression. Don't need multi-member. I guess I'll do read. ZlibDecoder. Aha. Amazing. So. Oh, and I don't really need this bit at the end; that one can go away. And so down here, I'm then going to create a new ZlibDecoder.
And ZlibDecoder::new here takes anything that implements Read, and files implement Read, so we can just pass that in right here. We get a ZlibDecoder back, and then Read is implemented for ZlibDecoder, right? So the idea here is that it's sort of a streaming decoder: it reads from the file, decompresses, and produces a thing that you can then read from. In our case, the question then becomes, okay, what do we want to read out here? Well, really, what we're reading out is just a string, but it is a string with a little bit of structure to it, right? So it has this structure right here. So we can either write a custom implementation over Read here, or, in fact, here's what I think I want to do. What is size here? It's the size of the content in bytes. Yeah, but how many bits though? I see, okay, it's a variable-length decimal encoding. So this is really just a null-terminated string. Okay. So in that case, what we can do is use z. But in fact, I think what I want here is z = BufReader::new(z). So, BufReader. Oh, right. I guess we'll return an anyhow::Result here. We'll do a cargo add anyhow, so that at the end here we can do this, and we can use context just to make it a little bit nicer than just unwrapping everywhere: "open in .git/objects". So, this BufReader here. So BufReader is a type from the standard library that basically keeps both the reader that you pass in and a buffer that it's reading into. And because it keeps that buffer and doesn't just hand you whatever comes out of each read, it can do slightly smarter things with this growable buffer, like read until you hit a given character. And what it'll do is keep filling the buffer until it hits something that matches that, which is exactly what we want here, right? We can do z.read_until; we want to read until we get a zero byte. And the buffer we're going to give it is buf, which is Vec::new().
So we're going to read until that, into this buffer, like this. "read from .git/objects", or "read header from .git/objects", is really what we're going to do. And these don't need to be mut. Okay, so if we look at the docs for read_until, it says: until the delimiter byte or end of file is reached, all bytes up to and including the delimiter will be appended to buf. Okay, so in theory at least, this means that after doing this read, the stuff that's in here is technically a valid CStr. So, in the standard library... no, CStr. There's a type called CStr in std::ffi that represents a borrowed C string, which is really what we have here, a C string being a null-terminated array of bytes. That's exactly what this string is. It can be constructed directly, safely, from a &[u8], which is exactly what we want here. So we can do, I wish this was the default view... So what I want is: let header = CStr::, what's the name of it, it's like from_bytes_with_nul. This one is safe, but it does scan the string once to make sure there aren't any null bytes within the thing you gave it, and that there is one at the end. I think that's okay. Technically, we could have a smarter implementation here; we already know that there is no null inside, so we could use the unsafe constructor here. But in reality, I don't think it actually costs that much, and it means we don't have to use unsafe. So let's just stick with it. So we can do this, and this one we can actually expect, because we can say: we know there is exactly one null, and it's at the end. So this isn't an error we expect; this shouldn't ever be possible at runtime. And so then we can use expect. And I'll bring in CStr here. And at this point, back here, we were told the structure of this is "blob", space, and then the size. So we should now be able to do header.to_str(). In fact, I'm going to do this with context: ".git/objects file header isn't valid UTF-8". And remember here that really it's ASCII, right? But all ASCII is valid UTF-8.
And so at this point, we should now be able to do: let Some(size) = header.strip_prefix. We want to strip off the "blob " (blob, space) that we know is at the beginning. And if there isn't a "blob " at the beginning, then something is terribly wrong, right? And we can bail and say: ".git/objects file header did not start with blob", and then we can even give the string that was actually produced, right? Which we can do here. And then we want to say: let size = size.parse(), and we want to parse this as a usize, right? It should not be a signed number; a negative wouldn't make sense here. And here too, we can give the context of ".git/objects file header has invalid size", and we'll print that out as well. So now we have the size. Now we know how long the content should be. And so we should now be able to say buf.reserve_exact. In fact, we can truncate it to zero, which is really just a clear, right? And then we can do reserve_exact. And we want to reserve exactly size, right? So this is saying: clear all the stuff that was in the buffer that we read into previously, this header bit, and then reserve space for exactly this much, which is how long Git claims the file should be. And at this point, we can now do z.read_to_end. In fact, is there a... there's a read_exact. That's what I want. Okay, so there are two ways to go about this. The question really is, what do we want to happen if this is wrong, right? Like, if the size here doesn't match how much is actually in the file. So we have two options here. We can either do read_to_end, and what read_to_end will do is, if I pass in the buffer here, it will read until it hits end of file, and it will grow buf if necessary. So if the real file is way larger than what the size header said, then this will just keep growing buf, potentially until it produces a giant buffer. And then we could check afterwards that it matches the size, but still. The other alternative is to use read_exact, right?
read_exact takes a slice. So you have to take the Vec and produce a slice of that size, and then it will only read exactly that many bytes. If it hits end of file before that, it returns an error, which is kind of what we want here: unexpected end of file if you read something shorter. And then, after doing the read_exact, we would check that the next read returns end of file. That's the nicer way to do this, so I think we're going to go that way for now. In that case, what we actually want is a slice into buf of the right size. So if we go now to Vec, I forget exactly what this method is called, because what we really want here is a method that returns, yeah. So, yeah, sure, let's do this properly. Here's what we're going to do next. I'm going to do it the slow way first, then tell you why it's slow, and then how we could fix it. So where do I want to start with this? z.read_exact of mut something. So this has to be a slice. If I just give buf here, like so, with the context ".git/objects file contents did not match expectations". It's a little bit of a weird error. The problem here is that this slice will be of length zero, because we cleared the buffer and then we reserved this many bytes, but the length of the vector is still zero. We made the capacity be large, or whatever the size dictates; we did not actually set the length to anything. And the reason we did that is: if the length were set to size, what would the value of buf[0] be? Or, worse yet, buf[size - 1]? It doesn't have a defined value, because we haven't set it to anything. And if there's one thing Rust dislikes, it is memory with undefined values. So there are a couple of ways we can go about this. One of them is resize. With resize, I can pass size and say that every value should be given the value of zero.
So instead of reserve_exact here, I do resize to size with zeros. The zero here basically means every element should be given the value of zero. So: allocate to size and then set everything to zero. This works. The problem is we're writing a bunch of zeros that we're then immediately going to overwrite. So that feels kind of wasteful, an unnecessary performance penalty. What we really want to say is: I just want the length to be size, and the bytes are all going to be overwritten anyway. There is a way to do this, which is to use the MaybeUninit type. What you would do is, instead of having a Vec of u8, you have a Vec of MaybeUninit of u8, and then you can resize it; here you would use MaybeUninit::uninit, so you would set the length and just have uninitialized bytes the entire way. Then you would cast it into the u8 slice that's needed here through an unsafe call, and afterwards you would assert that the vector now holds initialized bytes. We're not actually going to do that here, because it's a little bit of wonky code and I think it distracts too much from what we actually want to do, but it is worth pointing out. One of the reasons I don't want to do it is because one of the methods I know I'm going to want here is a way to turn a slice of MaybeUninit u8s into a slice of u8s. There's a nightly, unstable function to do this, but it's not stabilized yet, and I don't want to also have to opt into nightly here. The resize will turn into an optimized memset, I believe, but even so, you're still telling the computer to write all these zeros. So we're going to stick with this for now. And what read_exact returns, you'll see, is an IO result with unit, because it doesn't need to tell you how many bytes it read; you told it to read exactly this many bytes.
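That trade-off, resize(size, 0) zero-fills so that read_exact has a real slice to write into, looks like this as a minimal std-only sketch, again with a Cursor standing in for the decompressed stream:

```rust
use std::io::{Cursor, Read};

// Read exactly `size` bytes of object contents into a freshly sized buffer.
// `resize` gives the Vec a real length (zero-initialized -- an optimized
// memset), so the slice passed to read_exact is `size` bytes long, not empty.
fn read_contents(mut z: impl Read, size: usize) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    buf.resize(size, 0);
    z.read_exact(&mut buf[..])?; // UnexpectedEof if the stream is too short
    Ok(buf)
}

fn main() {
    // A stream of exactly the advertised size reads cleanly.
    assert_eq!(
        read_contents(Cursor::new(b"hello world"), 11).unwrap(),
        b"hello world"
    );
    // A stream shorter than the header claimed is an error, as desired.
    assert!(read_contents(Cursor::new(b"hi"), 11).is_err());
}
```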
And so now, if we do a read and I pass it a one-byte-long buffer here... yeah. n is this. I guess this read is really "read the true contents of the .git/objects file", and this one is "validate EOF in the .git/objects file". And here we should assert that n is zero: there should be no bytes after the size that the header promised. And here we can use, I think it's ensure: n == 0, or ".git/objects file had {n} trailing bytes". Okay. So in theory now, we're reading out this whole thing and writing it into the buffer, and at the end of all of this the question becomes: what do we actually print out? Because the content that comes after the header is just a bunch of bytes. There's no guarantee that it's actually a string; it could be an image. Any file that you can put into Git will have this structure. And so for cat-file it's worth pointing out that the stuff we end up with is just a binary blob, and we don't actually want to print it using something like println!, because println! will require that the thing you give it is a string. So instead, what we're going to do is take stdout, which is going to be io::stdout(). This gives us a handle to standard out. And then, if we want to maximize performance here, we can lock stdout so that no one else writes to it and we don't get any garbled output. There is no concurrency here, so it doesn't really matter, but: you can write directly into the handle, and what that means is that every time you write a new chunk, it takes the lock again and again. Here we're just going to lock it once and then do all our writes. And so now we can write to stdout. In fact, I don't want write; I want stdout.write_all of buf. And then this context will be "write object contents to stdout". And then this needs to be mut.
Okay, let's try to cargo run cat-file. The -p doesn't do anything at the moment. Let's see what that does. Oh, right. So it tells me: well, I couldn't do that because you didn't tell me about an object hash. So let's give it an object hash. cat-file this. ".git/objects file header did not start with 'blob '. It started with 'commit'." Right. We've only told it how to print blobs; we haven't put in handling for things like printing a commit yet. That makes sense. So instead, I guess, up here where we're currently looking for the blob header, what we actually want to do is say: let (kind, size) = header.split_once on a space. And we do expect that that should always be the case, otherwise the header "did not start with a known type". And then we're going to match on the kind, and the size handling can keep going here; I'm guessing they all have the same structure for this header. But down here, what we want to do is match on the kind. If it's blob, then we do this. If it is anything else, then we write to standard out. Now, what the -p flag is doing, I'm going to guess, is that if the -p flag is not given, it prints out just the raw decompressed... If I do git cat-file on this without -p, I see "you need to give the type". And if you give -p, okay, never mind then. So if we get anything that's not blob, then we say "we do not yet know how to print a {kind}". And this needs to be a comma. And I've done something else wrong. What else have I done wrong? There we go: buf is borrowed as mutable. This is because we're still borrowing kind out of the buffer here. So let's go up here and do enum Kind, with Blob, and Blob is the only variant we have so far. And down here we can say: let kind = match kind, and we'll make this a little bit nicer. "blob" gives us Kind::Blob; for anything else we do not know what to do, so we'll do anyhow's bail!, like this. And then down here, this is now going to be Kind::Blob, and we can get rid of this guy.
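The borrow-checker fix just described, turning the borrowed &str kind into an owned enum so the buffer can be mutated again, can be sketched like this. Names are hypothetical but match the shape described in the stream; the error type is a plain String rather than anyhow to keep the sketch dependency-free.

```rust
// Object kinds we know how to print so far. Parsing into an enum drops the
// borrow of the header buffer that a `&str` kind would keep alive, which is
// what let the later mutable use of the buffer compile.
#[derive(Debug, PartialEq)]
enum Kind {
    Blob,
}

fn parse_kind(kind: &str) -> Result<Kind, String> {
    match kind {
        "blob" => Ok(Kind::Blob),
        k => Err(format!("we do not yet know how to print a '{k}'")),
    }
}

fn main() {
    assert_eq!(parse_kind("blob"), Ok(Kind::Blob));
    assert!(parse_kind("commit").is_err());
}
```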
So now if I do this, it'll say "we do not yet know how to print a 'commit'". Fantastic. Do we have something that is a blob, is the question? Yes, we probably do. We can print out this tree, like so. Oops, I meant to use the real git version here. Tree. Can I pretty-print it? It probably just does the same. Pretty-print. Okay, great, that's what I wanted. So let's try, I don't know, .gitignore is probably a nice short file. And so now if I run our version and do cat-file on this... and look, we indeed printed out what does look like a .gitignore file. Amazing. Okay. So that's really nice. So now we have a thing that actually prints out what we wanted. We can get rid of this "logs from your program will appear here" line, or rather make it go to eprintln!. And I think actually up here we're going to do anyhow's ensure! on pretty_print, because that's what the real git command does: when I tried git cat-file on this without flags, it just gave me the help text and said you need to give both a type and an object, or you need to give -p. And so this is "mode must be given without -p, and we don't support mode". Great. Someone in chat asks: is it possible to redirect the decoder directly into standard out, without the Vec allocation? It is. So the proposal here, right... actually, let me push this first and then show you that change. Great. Yep. "Commit second exercise." git push. Let's see what it does. I don't know why we have tokio in here. We can probably get rid of tokio and get a significantly faster build. A whole lot of steps. Let's see what they have in Cargo.toml. Why do we need HTTP requests? Interesting. Oh, we're going to need that for cloning a repository at the end, right, then you need to actually read things from the remote. I'm going to just comment those out right now because we don't need them for this. I'm also fine to use anyhow instead of thiserror. So we'll do this just so all tests pass. Great.
It's happy with us. Great. So we did the right thing. So now the question is: can we make this a little bit better? Instead of reading into a buffer, the moment we've read the header we don't really need to do anything else; we can just stream the rest of the decoder directly to standard out. And that is indeed doable. So here's what we'll do to accomplish that. We'll do a match on kind, because this might only be possible for blobs, for example. But if what we get is a blob, then we should be able to stream it directly out. And what we'll actually do is io::copy, which takes a reader, in this case z, and a writer, which is going to be mut stdout down here. And n is going to be that, and we'll give it the context "write .git/objects file to stdout". And at the end here, after printing it all out, we'll do this assert. Now, the downside of doing it this way, and I think I can now get rid of most of this, is that we might actually end up copying a lot more. Imagine that the file is actually way longer than what the size dictates. Then the copy here is still going to copy until the decompressor runs out of stuff. And this is how you get things like zip bombs: you just keep reading as long as the decompressor is giving you stuff, even if there's a header that says how much should be there. And so this is just going to write everything out. There is a way to get... I forgot: the one-byte read. You don't need the one-byte read here, the thing that checks that we hit end of file, because copy is going to run until it hits end of file anyway, and then it's going to tell you how many bytes it actually copied over. And so I guess this "had {n} trailing bytes" is really going to be "file was not the expected size": expected is going to be size, actual is going to be n, like this.
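A sketch of that streaming variant: io::copy drives the reader into the writer and reports how many bytes it moved, and that count replaces the trailing one-byte EOF check. Assumptions: a Cursor stands in for the ZlibDecoder and a Vec for the locked stdout handle, so this runs without flate2.

```rust
use std::io::{self, Cursor, Read, Write};

// Stream the object contents straight into the writer, returning the byte count.
fn stream_contents(mut z: impl Read, out: &mut impl Write) -> io::Result<u64> {
    io::copy(&mut z, out)
}

fn main() -> io::Result<()> {
    let size: u64 = 11; // as parsed from the object header
    let mut out = Vec::new(); // stand-in for io::stdout().lock()
    let n = stream_contents(Cursor::new(b"hello world"), &mut out)?;
    // copy runs until EOF, so instead of a trailing one-byte read we just
    // compare the reported count against the expected size.
    assert_eq!(n, size, "file was not the expected size");
    assert_eq!(out, b"hello world");
    Ok(())
}
```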
Yeah, so we don't really want to do an unguarded read here. There are a couple of ways around this. For example, some of these libraries actually provide you with a way to set a limit; we'll see whether that's the case here. This one does not have decompression limit settings. Let's see here. This one doesn't. Okay. Given that we don't have that, what we can do is create a limit reader. There are crates that provide this as well, but we can write it pretty easily ourselves, so we might as well. The reader is going to hold an R, and limit is going to be a usize. And then we're going to implement Read for LimitReader where R is Read, and we'll have it implement the missing members. In fact, we would ideally also implement the default methods, specifically because we want to forward as many of them as we can. Well, we'll not do that right now; we'll leave that to other crates. My thinking here was that if the underlying reader has an optimized vectored read, for example, we would really like to make use of it. But in the interest of time, I won't do that here. There are crates that provide exactly this type for this reason. What we'll do in read is: let n = self.reader.read into buf. And then if n is more than limit... in fact, we're going to shorten this even more. We're going to say: if buf.len() is greater than self.limit, then buf, which has to be mutable, is reassigned to the mutable slice of buf up to self.limit, so that we never read more than what we're allowed to. We then do the read into buf, and we know now that n will never be more than that; then we do self.limit -= n, and return Ok(n). The idea here is that there's one bit that's going to be missing, which is that we really want to error if there's more.
I think what we'll actually do here is a plus one: slice to self.limit + 1, and if n is greater than self.limit, then we return an io::Error with ErrorKind::Other, I guess, and "too many bytes". And so now, out here, mut z is, in fact we can do that out here, z is a LimitReader over the original z with a limit of size. And so now we can guarantee that we'll never read more than size bytes out of the reader, ever, right? If the reader produces more bytes, then we'll hit this "too many bytes" error right here. And so now this copy is going to read through the limit reader, which means it'll be limited. It might still produce a value that's lower than size, in which case we want to error. But if it produces something that's more, it'll hit this too-many-bytes error. And so, in theory, cat-file should still work. And it's going to complain about this bit, because the limit here needs to be a u64, almost certainly. This then should also be a u64. "as u64" is fine here. Actually, I guess the limit should be usize, and then this should be "as usize". That's fine. So it still works, but if you were to get an object file where the size header didn't match, you would get an error, regardless of whether it was too long or too short. Could you use the reader's take here? Ooh. You might be right. take... and afterwards it will read at most limit bytes from it. Yes, indeed. Although, the difference here is that take will return end of file; it won't error. I guess that's okay. It's kind of nice to get the error. But, in the interest of shorter code, which is easier to reason about, we can do z = z.take(size). Same effect. Good call. Note this won't error if the decompressed file is too long, but it will at least not spam stdout or be vulnerable to a zip bomb. And you see this still works. And then we can get rid of the io import. Let's git add Cargo.lock and Cargo.toml first.
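A self-contained sketch of the LimitReader described above, including the "+1" trick: by letting one extra byte through, a read that comes back larger than the remaining limit proves the stream overshot, and we can return an error instead of silently truncating. That error-on-overshoot is exactly the behavioral difference from Read::take. A Cursor again stands in for the decompressor.

```rust
use std::io::{self, Cursor, Read};

// A reader that errors if the inner reader yields more than `limit` bytes --
// unlike Read::take, which would just report a clean EOF at the limit.
struct LimitReader<R> {
    reader: R,
    limit: usize,
}

impl<R: Read> Read for LimitReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Allow one byte past the limit so overshoot is observable.
        let cap = buf.len().min(self.limit + 1);
        let n = self.reader.read(&mut buf[..cap])?;
        if n > self.limit {
            return Err(io::Error::new(io::ErrorKind::Other, "too many bytes"));
        }
        self.limit -= n;
        Ok(n)
    }
}

fn main() {
    // Exactly the advertised size: fine.
    let mut ok = LimitReader { reader: Cursor::new(b"hello world"), limit: 11 };
    let mut out = Vec::new();
    assert_eq!(io::copy(&mut ok, &mut out).unwrap(), 11);

    // A "zip bomb": the decompressor keeps producing past the header's size.
    let mut bomb = LimitReader { reader: Cursor::new(b"hello world and then some"), limit: 11 };
    let mut out = Vec::new();
    assert!(io::copy(&mut bomb, &mut out).is_err());
}
```

The shorter alternative from chat is simply z.take(size as u64), which caps the read at size but reports EOF rather than an error when the stream is too long.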
"Comment out big deps for now." And down here, "mitigate zip bomb". git push. I've educated chat, so they've surpassed me now; chat is now telling me about all the things I didn't know about Rust. Yeah, so see how much faster the build was here, because we got rid of all those extra dependencies. Nice. We might have to add them back, but at least we don't have to wait for them at each stage until that last one. Okay, let's go see what the next exercise is. I guess we're supposed to be doing exercises. "Create a blob object. You'll implement support for creating a blob using the git hash-object command. It's used to compute the SHA hash of a Git object. When used with the -w flag, it also writes the object to the .git/objects..." Oh, I see. So this produces the hash that this file would have been stored under if it were added to Git, and then with -w it will also write it into .git/objects. Nice. Okay, that should be pretty easy. Yeah, that's fine; it thinks we're at the next exercise, so that one failed, but you see stage two succeeded. Okay, so we have, what's it called, hash-object, right, with write and a file, which is going to be a PathBuf. So down here now: Command::HashObject with write and file. Let's see what we get into. So what are we going to do? Well, we need to compress it. And then we need to add the size field to it. No, the other way around: we need to add the size header, and then we need to compress it, and then we need to hash it. So the inverse of what we did for printing. So the header is going to be... ah, see, here's the awkward part. We don't know the size until we've read the bytes, which means we can't start creating the hash until we've read the bytes, because the hash is going to include the size, and it comes before the contents. That's why these formats often tend to have the size at the end rather than the beginning, because that way you can do the writing in a streaming way.
But if the size is at the end instead of the beginning, then when you read, you can't pre-allocate the storage. So you have to choose: either writing is annoying or reading is annoying. And in this case, writing is annoying, which is fine. Or rather, when I say annoying, what I mean is a little bit less efficient, because you need to read all the contents into memory in order to produce the hash you're eventually going to print; you can't just stream it through. But you can stream reads. And given that you read more often than you write, that's probably a worthwhile trade-off. Yeah, we can stat the file to get the size ahead of time. But that doesn't work if, for example, file here were standard in. My guess is that hash-object also allows you to pass things on standard in. Ooh, does it not? Oh, it doesn't. Well, okay, then ignore me. Then what we will do is stat the file. So we'll do metadata of file, with context "stat file". And we can actually be a little bit more helpful here: we can do with_context and a format! that actually prints the path to the file. And then we'll create a hasher. So I think we have the sha1 crate here already; I think that was already in the Cargo.toml. Yep. So we're going to create one of these guys, and we'll import this. And I think we already have the hex crate. Although, looking at the Cargo.toml, I think it's using hex. And I seem to remember something about... because there's hex and there's hex-literal. I think one of them is maintained by the RustCrypto community, crypto as in cryptography, who are the same ones who maintain things like sha1, and that's why they're using it here. I don't think it matters too much either way. We'll use hex, that's fine, given that it's already in our Cargo.toml, so we can just leave that. So we'll go down here. The hasher is going to be that, and then we're going to do, sorry, hasher.update: we're going to write "blob" and a space.
And then hasher.update with the size, which is going to be a format! of stat dot... now rust-analyzer doesn't like me. stat.len(). So we update it with the size, and then we update it with a literal nul byte. I wonder, does update allow passing that as a bare u8? No, so I'll do this then, like this. And then we want to stream the actual contents of the file into... no, this is not even true, because we need to compress it first. So ignore me for a second while I grab the ZlibEncoder. Oh, there is a stdin flag for reading from standard in. Okay. We want the ZlibEncoder; we'll grab the same imports here, and we want one of these guys. So, ah, so this is where it's going to be. Yeah. Okay. So we create a ZlibEncoder, and we're going to write_all "blob" and a space. compressed is going to come at the end; the hasher is also going to come at the end. We're going to write_all that. We're going to do e.write... in fact, I think e here just implements Write, so we can just do this instead. It's a little bit nicer, and that way we don't need the format!: we can just write! stat.len() into e. And that also means we can then write... and I think we can just do this, like so, and this is going to get a question mark. That's fine. So the compressed thing here is a Vec of u8: you give the encoder what you want it to write the output into. And here, really, what we would do is construct the file that we're going to output into. But we also want to keep a running hash of the thing that we're writing, so that we get the hash at the end. So what do we actually want here? And here I am going to claim that we need our own implementation of Write. So we'll do a HashWriter. I mean, unless the sha1 crate has one, but I'd be very surprised. Yeah, it does not. Okay. So: HashWriter, which takes a W. It has a writer, and it has a hasher, which is going to be a Sha1. And then we're going to implement Write for HashWriter of W, where W implements Write.
Implement the missing members. And again, here, there are a bunch of other methods you might want to pipe through; we're just going to skip them for now, because the efficiency isn't that important to us. So here what we want to do is self.hasher.update(buf), and then self.writer.write(buf). And actually, this is not quite right, because we need to hash only the first n bytes of buf: it could be that the writer does not write all of the bytes that it's given, and we would only want to update the hasher with the bytes that were written, because the rest are going to be given to us again the next time around. So this is actually, I'm guessing, something someone has been tripped up by in the past. And then we're going to return Ok(n). And flush is just going to be self.writer.flush(); we don't need anything special for that. And so now our writer is going to be... and this is going to be annoying, isn't it? Yeah. The generics here are going to come bite us, which is a little frustrating. So we'll have a function for this; I'll show you what I mean. What I really want up here is: the writer is, if write... oh, actually, the thing we're going to have to do here is File::create of "temporary". I'll talk about why we need a temporary file in a second. And otherwise it's going to be Vec::new(). In fact, it's not even going to be that; it's going to be something unit-like, because we don't actually care about the bytes that are written; we only care about the hash. And then what I want here is the given writer. The problem, as the compiler helpfully points out, is that the if and the else have different types here, and so writer doesn't have a well-defined type. It has one of two, which means the ZlibEncoder has one of two types, because it's generic over the writer. And, you know, we could put all of this code inside each branch, but instead what we'll do is just have an fn. In fact, we can define that up here: write_blob is going to take a W.
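The HashWriter just described, a Write adapter that feeds every successfully written byte into a hasher, can be sketched without external crates. One loud assumption: the stream uses the sha1 crate's Sha1 with update/finalize, while this sketch substitutes std's DefaultHasher so it runs standalone; the shape, including the short-write subtlety, is the same.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, Write};

// A Write adapter that feeds every successfully written byte into a hasher.
// (The stream uses sha1::Sha1 here; DefaultHasher is a dependency-free stand-in.)
struct HashWriter<W> {
    writer: W,
    hasher: DefaultHasher,
}

impl<W: Write> Write for HashWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.writer.write(buf)?;
        // Only hash what was actually written: on a short write, the rest of
        // `buf` will be handed to us again on the next call.
        self.hasher.write(&buf[..n]);
        Ok(n)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.writer.flush()
    }
}

fn main() -> io::Result<()> {
    let mut w = HashWriter { writer: Vec::new(), hasher: DefaultHasher::new() };
    w.write_all(b"blob 11\0")?;
    w.write_all(b"hello world")?;
    // The inner writer saw the full object...
    assert_eq!(w.writer, b"blob 11\0hello world");
    // ...and the running hash matches hashing the same bytes in one go.
    let mut direct = DefaultHasher::new();
    direct.write(b"blob 11\0hello world");
    assert_eq!(w.hasher.finish(), direct.finish());
    Ok(())
}
```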
And the bits that it is given are the file and the size, the file size here being a u64 and the file being a reference to a Path. In fact: file, size, and writer. And it's going to return an anyhow Result, like so. And then we're going to go up and grab this and put it in here. And so now, if write, then we're going to do write_blob of file, stat.len(), and this, with context "write blob object to disk". And really, this is "to temporary file"; again, I know I've promised to explain why we have that, so I'll do that in a second. And otherwise, we're going to do this, like so. And then down here, this is going to return Ok of nothing. So inside here, we take the writer, where W is Write, and we'll bring Path in here. I was pretty sure unit implements Write. Does it not? Why? It should, right? Really? There's a Sink type instead of it just being implemented for unit. All right, fine. "A writer which will move data into the void. It does no actual system calls. Generally created by calling sink." Okay, fine, fine, fine, like this. So inside here, we're going to open a ZlibEncoder over the writer that we're given. We're going to write out "blob", a space, and then the size. And in fact, that means this bit can also move in here, because there's no real reason for it not to, at which point we don't need to take the size anymore, and these don't need to pass it in. Oops, one too many. stat.len(). And at the end here, we're going to call e.finish(), which gives us back the writer after we finish the encoding. And the hash now... compressed, the writer we get back, is actually going to be one of our HashWriters here. So: a HashWriter of writer and hasher, where the hasher is a Sha1::new(), like this. And so now the hash at the end is compressed.hasher dot... what is it for Sha1? finalize. And then this, I guess, we will give a context, because these write_blobs return an error, which is going to be "write out blob object".
And this one is really "construct temporary file for blob". Right. And at the end here, I guess we get the hash, and so we could really just return the hash here instead. What is the type of hash? It is something... ooh, not what I meant to do. So in our Sha1 here, Sha1Core... is there an easy way I can name the output type? What I see is FixedOutputCore, OutputSizeUser... I just want something that I can name as the output type here. That's fine; I can have it be hex instead, I suppose. So this will output hex::encode of hash; that's ultimately what we get out here. And so this then is going to be: let hash = this. And this, actually, I'm going to have to explain now, I think. So here in the write case, we wrote to a temporary object file, because we don't know the hash until we've run through the entire input file, and so we don't know where we want to write the output until we're done reading the input. And so we have two options: either we do it all in memory and then write out to the final file, or we write to a temporary file and then move the temporary file to where it's supposed to be. And in this case, I'm going to do the latter. I mean, we're going to do create_dir_all. Oops. Yep. .git/objects/ and then this, where "this" is going to be the first two characters of the hash: create the subdirectory of .git/objects. And then we're going to do std::fs::rename of, and for now let's just say it's called "temporary". Realistically here, you would use an actual thing that generates random temporary file names; for now I'm just going to pretend that we have one of those. And then we're going to move it to the final location, which is going to be this plus the hash from character two onwards, "move blob file into .git/objects". And in fact, I think we want to get the hash out of here. So this is going to do this, and this is going to do that.
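The write-to-a-temporary-file-then-rename dance can be sketched like this. Assumptions: the scratch directory under the OS temp dir stands in for a real .git/objects, the fixed "temporary" name mirrors the stream's placeholder (real code would generate a random name), and the contents are assumed to be already compressed.

```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

// Write an object to a temporary file, then rename it into
// <objects>/<first two hash chars>/<remaining chars>. rename is atomic on the
// same filesystem, so the object store never sees a half-written object.
fn store_object(objects: &Path, hash: &str, contents: &[u8]) -> std::io::Result<PathBuf> {
    fs::create_dir_all(objects)?;
    // Placeholder name, as in the stream; real code would randomize this.
    let tmp = objects.join("temporary");
    fs::File::create(&tmp)?.write_all(contents)?;
    let dir = objects.join(&hash[..2]);
    fs::create_dir_all(&dir)?;
    let dest = dir.join(&hash[2..]);
    fs::rename(&tmp, &dest)?;
    Ok(dest)
}

fn main() -> std::io::Result<()> {
    // Scratch directory instead of a real .git/objects, purely for illustration.
    let objects = std::env::temp_dir().join("objects-demo");
    let dest = store_object(&objects, "2a70beef", b"compressed bytes")?;
    assert!(dest.ends_with("2a/70beef"));
    assert_eq!(fs::read(&dest)?, b"compressed bytes");
    Ok(())
}
```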
And then here, regardless of which path we took, we do want to print out, and it looks like hash-object prints with a newline, so we'll print out the hash, regardless of which path we took. Okay. So now let's just see if this does the right thing. So if we run hash-object Cargo.lock, what does it do? Well, it printed a hash, but it's not the same hash. So that seems problematic. "As the list of implemented commands is growing, why not split the command processing of each into individual dedicated functions?" We'll probably do that. There's no real reason to have them all be directly inline here, I agree, and it's going to get unwieldy as we grow. Same thing as there are probably some things we can reuse, like, for example, this way to construct paths; there's no real reason to repeat it, quite the contrary, you probably want to share it. I think for now this is okay, but the moment we go to the next exercise, I'm going to split them. Okay. So we did something, but it's not quite right: we end up with a different thing than Git does, which I think suggests that what we actually want to do here is... I want to do a -w. And then I want to do a cargo run with -w, and then I want to diff .git/objects/31e..., which is what Git actually produced, against .git/objects/2a70..., which is what we produced. So it did actually write to the file, that's good. Okay. "Binary files differ." Yeah, that's fine. So let's hexdump each of them. Hexdump of this guy... oh, I want to decompress them first, which is zcat. I think zcat can do this. Can't. Isn't there a... oh, I don't remember. There was a command equivalent to zcat but for zlib, and it is called... anyone remember? I don't have xxd? Really, Neovim doesn't ship with xxd? That's really annoying. Yeah, I could write it out myself, but I also want the other one. There is a command for this; I'm just blanking on the name. In fact, I actually wonder whether I can just have Vim open the file.
No, I did not want to do that. No, hexdump... I don't think hexdump can decompress zlib data. Wow. That's awful. That's so awful. I mean, I guess that works, but that's just terrible. Wow. Wow. Okay. That's so stupid. Okay, so that's what it gives for our file, and then what was the other one? 2a... wasn't it 2a? Too hard to scroll. 2a70 is ours. Interesting. Well, that's certainly different. So notice that the one from Git has the actual file contents, and ours just does not. Interesting. Why do we not... oh, that's because we don't actually write the bytes at all. We write the header, and then we don't write anything else. We need a std::io::copy of... that's so stupid. Okay, std::fs::File::open of file; we'll grab this same one here. So of course they're different, because we didn't do it. So we're going to read from file into e, with context, I guess, "stream file into blob". This needs to be mut. Okay, let's try that one more time. How about now? Okay, so it's different. It's still wrong, but it is different. Okay, let's now see what we get this time. So, no, 51a is now ours. Okay, that's better. And what is the one that Git produces? Git produces this one. Okay, so our sizes are the same, right? blob 11792, blob 11792. "This file is automatically generated by Cargo." Interesting. So how are these different? Let's hexdump this, and then... I think I only need the first three. What? I'm apparently bad at files. The 31e one, and then I want the 51a one. So that's Git, and this is ours. That all looks the same to me. Files a and b, diff a b: they are the same. Ah, is the SHA-1 computed over the uncompressed data? Because that would certainly make a difference. If it is the hash of the... yeah, the notes at the bottom: the SHA needs to be computed over the uncompressed contents of the file, not the compressed version. Okay. That actually makes this slightly more annoying, just because it means that what we really want is a hash reader rather than a hash writer. It's not the end of the world, right?
So we just switch this around. We implement Read for R... actually, let's keep this, because we're going to need almost the same thing. "Why not wrap the ZlibEncoder with your HashWriter?" No, because the whole point is that we want the hash of the input, not the hash of the output. In fact... and it doesn't include the header either, I don't think. In fact, that's going to be kind of weird if the hash is... in fact, we can find this out, right? What is the sha1sum of Cargo.lock? Okay, what is the sha1sum of this file, the decompressed one? Okay, so it's with the header, but it is uncompressed, right? You see this hash here in the file name matches this hash here. So it is a hash of everything, including the header. And so, therefore, we could just have the hasher operate... yeah, we can still do it on write. We just need to do it before the compression rather than after. So we want here this. So this is going to be... yeah, so you need to invert this in your head: when we do a copy here, the writer that's hit first, the thing that gets to see the initial input, is the last writer we add to the stack here. So this is the ZlibEncoder, and then we wrap that in our HashWriter. And our HashWriter is then going to see the uncompressed bytes and forward them to the ZlibEncoder, which is going to produce the encoded bytes. So here we do writer, writer, writer, writer. The hash is then writer.hasher.finalize(), and this is writer.writer.finish(). And we don't actually need the writer back. And this needs to be mutable, and this does not. Okay, let's try that again. Ah, that's the right hash. Amazing. All right. So now we have the same hash as what Git does, and just to sort of... I mean, there's nothing to diff here, right? Because the hashes are the same, and we would overwrite the same file. Okay, great. So now we have hash-object working.
And at least in theory, if I do this, it prints it, but hopefully did not do anything to .git/objects/31e... So that was last modified. In fact, I can just stat that file, and then I'll run our thing without -w and then stat it again. And the modification time has not changed. And if I do -w and then stat it, then the modification time has changed. Okay, so now we have a hash-object that seems to be doing the right thing. Amazing. And we can remove the mut, and our diff here is "implement hash object". Git push. See what it does. Ha-ha, all tests pass. This string of a dumpty-yike-dumpty-dunky-dooby-dunky. What a string. Okay, fireworks. Amazing. Read a tree object. We'll implement the ls-tree command, which is used to inspect a tree object. Okay, so before we do that, let's do a little bit of splitting here. So let's in fact make modules. Create that. And then I want a cat_file. And I want a hash_object. And inside of here, I want to grab all of cat file. That's going to go in here. And that's going to be something like, I guess, invoke. And that needs to take pretty_print. And here we could introduce an object with options that we could then forward on to clap. But I don't think we're quite at the point where we need that yet. Object hash here only really needs to be a string reference. And this returns an anyhow::Result of nothing. I typed too fast. And then this goes in here. This goes away, this goes away. These go away. These go away. These go away. And then Kind also is in here. It's the only place it's used. So cat file here is now going to do commands::cat_file::invoke of pretty_print and object_hash. Commands. Yeah, that can borrow. And then if we also look at commands::hash_object, we'll do the same thing, which will take this bit. And we'll do fn invoke. Returns anyhow::Result of nothing. And it takes the arguments write and file. So write is a bool and file is a path. And this is now going to call commands::hash_object::invoke of write and file. Right.
And I guess we'll do here a pub(crate). I also can't type, apparently, today. Hash object is going to get these. Same thing with the writer that we have at the bottom. That's also going to go in here. And if we go into commands, these are both going to be pub(crate). And cat_file is going to be pub(crate). Oops, pub(crate). And hash_object is going to be pub(crate). And it's yelling at me for something. I've messed up my syntax here. This should be a colon. The PathBuf goes away. This goes away. This goes away. These go away. If we now go back to main, file takes a reference. Amazing. And just check that that still hasn't broken anything — it has, because cat_file needs an Ok at the bottom. And main now doesn't need most of its imports, which is nice. Like so. "Split commands into mods." "When you're not streaming, for day-to-day coding, do you use Copilot?" No, I do not. I've just never really found a use for it. Maybe it's just because I haven't integrated it with my editor, but like, my bottleneck is not usually actually writing the code. It is thinking about what I want to write and how to do it well. Okay. So now we have this split into mods. So now we want to add the next command, which is ls-tree. ls-tree. Is it taking any arguments? I wonder. Okay. We'll look at the tree object later. No arguments except maybe --name-only. Okay. So clap(long). name_only is a bool. So we're going to have here LsTree. Name only. That's going to be ls_tree::invoke. Name only. We're going to go here. We'll do this. Create that module. We'll go grab the start of this from here. And this is where we're probably going to start to see some reuse between cat_file and ls_tree, right? Because ls-tree also needs to read out something that is stored in the object tree — or in the object store, rather. Right. So the argument to ls-tree would be a tree SHA. Yeah. So this is the thing we looked at earlier, right?
So if you have a nested directory structure like this, the actual tree object only stores the listings for one level deep. So that means if you do ls-tree of the tree of your repo, you read out file1, dir1, and dir2 and nothing else. For the things that have subtrees, like anything that is a directory, the hash that's listed for that entry in the tree blob — or the tree object, rather — is a tree hash that you can then recurse down into. Whereas anything that's a file is a blob hash. So, right. So a caller that wants to print out something like this would actually need to walk the tree objects going all the way down. The object is alphabetically sorted — it stores its entries sorted. That's fine. With the --name-only flag. Okay. So we are going to, here, name_only. We're going to anyhow::ensure! name_only — "only --name-only is supported for now". And eventually we'll return one of these. Right. So we're going to get a tree SHA. We're just going to print the names and nothing else, and we won't recurse down. "We recommend implementing the full ls-tree output too, since that'll require you to parse all the data in a tree object, not just file names." Okay. So it's optional whether to also print the hashes, but we are allowed to. Great. Let's look at what a tree object looks like. Trees are used to store directory structures. Multiple entries. Yep. We know that the name, the mode — for directories the value is this. Interesting. So are they actually stored in ASCII? We have a command to check this now. So if I now do a git rev-parse HEAD. Nope. git cat-file. Nope. Commit of this. So this has this tree. So let's now print out 60 slash 46. Interesting. Oh, that's because that commit didn't actually commit anything. So do a rev-parse of HEAD. Let's do this guy — cat-file this commit. Tree here. So if I now print out the tree at 6790... Oh, that's because I'm passing it to sha1sum. Okay. Interesting. So there is something binary in here. Right.
Like, this is the contents they're talking about. But what I'm trying to figure out is whether that's actually what's stored in the file. But it sounds like what's actually stored in the blob is this thing. Right. So we see the object prefix, the object header here. And then this is what's actually stored. But this is binary. This is not ASCII. So it's not actually ASCII that's stored in the file. Tree object storage. Here we go. Tree objects are stored in the .git objects directory. That's fine. Looks like this after zlib decompression. Tree, size, zero. Okay. So mode, space, name, null byte, and then a 20-byte SHA. I see. And the 20-byte SHA is stored as actual bytes — it is not stored hex encoded. Great. Yeah, exactly. Not hexadecimal. Okay. Amazing. So in that case, I think this shouldn't be too bad. I think what we actually want here now is, if we go back to main here, we'll add an objects. And then we'll go to cat_file and we'll pull out some of this. So, trying to figure out what we actually want to pull out. So we want something like read_object. It'll take a hash, and it'll return an anyhow::Result of something. And in fact, it'll return a tuple of kind and an impl Read, I think — or BufRead, maybe, I think, is actually what we want. So when we get down here — yeah, let's have it also return the size, actually, which means we'll probably want a struct here. And we'll say this is an R. So it returns a kind. It returns the expected size as a u64, and it returns a reader, which has the remaining bytes, of type R. And what this will actually return here is an Object. And I'm using impl here because I don't necessarily want to commit to exactly the order of operations in here and which wrapper types we use, for example. And so now I think we can then return here Ok of Object, where the reader is z, the kind we have, the size we have — expected size is this — and this no longer needs to be mut. This is pub(crate). This is pub(crate). And this is pub(crate).
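The shape being sketched here can be written out like this. It's a std-only illustration (the names `Object` and `read_object` follow the stream; in the real code the reader would be a `BufReader` around a zlib decoder, whereas here any `BufRead` over already-decompressed bytes stands in):

```rust
use std::io::{BufRead, Read};

#[derive(Debug, PartialEq, Eq)]
enum Kind { Blob, Tree, Commit }

// By the time you hold an Object, the "<kind> <size>\0" header has been
// consumed, and `reader` yields only the remaining payload bytes.
struct Object<R> {
    kind: Kind,
    expected_size: u64,
    reader: R,
}

fn read_object<R: BufRead>(mut r: R) -> Object<R> {
    let mut header = Vec::new();
    r.read_until(0, &mut header).expect("read header");
    header.pop(); // drop the trailing NUL
    let header = String::from_utf8(header).expect("header is ASCII");
    let (kind, size) = header.split_once(' ').expect("header is '<kind> <size>'");
    let kind = match kind {
        "blob" => Kind::Blob,
        "tree" => Kind::Tree,
        "commit" => Kind::Commit,
        _ => panic!("unknown object kind {kind}"),
    };
    Object {
        kind,
        expected_size: size.parse().expect("size is decimal ASCII"),
        reader: r,
    }
}

fn main() {
    let mut obj = read_object(std::io::Cursor::new(&b"blob 3\0abc"[..]));
    assert_eq!(obj.kind, Kind::Blob);
    assert_eq!(obj.expected_size, 3);
    let mut rest = Vec::new();
    obj.reader.read_to_end(&mut rest).unwrap();
    assert_eq!(rest, b"abc".to_vec());
}
```

Note that the caller can keep streaming from `obj.reader` or stop after the header, which is exactly what ls-tree exploits later to learn an entry's kind cheaply.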
And now in cat_file, we can use crate::objects. We only really need read_object here. And I guess, actually, let's have that be an impl on Object. It's going to be a little bit weird as an impl on Object, actually. I guess I can do this. Right, because otherwise the R on the outside doesn't actually matter here. So we'll do this. And that way this can be Object. And we'll also take in kind. And then we can do here: now that object is Object::read of the object hash, .context "parse out object file". And we can go down here. And we can match on the object's kind. And if it's anything — I guess here we could now have cat-file understand how to print trees, but for anything else we'll do an anyhow::bail!: don't yet know how to print a {object.kind}. And then we'll add a couple of things to this. So this can derive Debug. It can derive PartialEq and Eq. And we can also just do impl Display for Kind, because why not? We'll bring in std::fmt here. We'll implement this. And this is just a match on self. There are crates that give you this, but it's not too bad to just do it yourself here. Like so. And so now you can print an object.kind here. And we're going to go ahead and let all the fields be pub(crate). And so now this is going to be object.reader. This is going to be object.expected_size. So — and that means I need the object to be mutable, because I need to consume the reader. Nice. That makes this a whole lot nicer. And this says unreachable pattern, because we don't have a tree yet. We'll have a tree now. "Tree is never constructed" — that's totally fine. This is tree. How to parse, or... to process. We might as well add Commit here too, actually, while we're at it, because we know it's going to come, right? So a commit is also going to presumably have at least the same structure for the header and the compression. So this is more of a "what even is"... Great. So now we have a sort of generic object reader.
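The hand-rolled Display impl mentioned here is short enough to show in full (the enum mirrors the stream's `Kind`; crates like `strum` could derive this, but the match is trivial):

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind { Blob, Tree, Commit }

// Print the kind exactly as Git spells it in object headers and in
// cat-file/ls-tree output.
impl fmt::Display for Kind {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Kind::Blob => write!(f, "blob"),
            Kind::Tree => write!(f, "tree"),
            Kind::Commit => write!(f, "commit"),
        }
    }
}

fn main() {
    assert_eq!(Kind::Blob.to_string(), "blob");
    // Usable directly in error messages, e.g.
    // anyhow::bail!("don't yet know how to print a {}", kind);
    assert_eq!(format!("{}", Kind::Tree), "tree");
}
```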
The pretty printer in cat-file just knows about blobs. And the thing in ls-tree is a todo!() for now. Just to see that this still works. Yep. So we'll do a commit: "pull out object store reading". Okay. "Do we need this to be generic, or would it be simpler to just box the reader?" Yeah. So up here in objects, instead of storing an R, we could just have a Box<dyn Read> here. I don't really want to do that, because, well, it's fine for this to be generic and only ever be instantiated with one type. And that way you don't get the indirection via Box. So it's more efficient, but you also get the code to be able to take — like, you get monomorphization, right? And it doesn't really feel like this is particularly painful, right? Sometimes generics get really painful, and boxing just makes the pain go away. There's not really any pain here, at least the way I see it at the moment. The other nice thing here about using the monomorphization is we can get the stuff from BufRead, which is going to be pretty nice. And — well, I guess you could get the BufRead stuff through dynamic dispatch as well. I think that trait is object safe. But I don't think there's really any win here from switching this to dynamic dispatch, at least at the moment. We could always switch it later. Okay. So now let's go and see if we can do ls-tree. So ls-tree, huh? What is it? What do you give it? You give it a tree SHA. Ah, so in main, this also needs to take a tree hash. And similarly, then, you know, this needs a tree hash. ls_tree needs a tree hash. This needs a tree hash. This is now a tree. The only kind we're going to be willing to grab in here is Kind::Tree. "How to ls a different kind." And then, you know, this bit in here, it is TBD. "I'd also be tempted to call that type ReadObject, so you can have a WriteObject later." Yeah, it's not a bad idea. The Object that we have here is very much a — it's more like an object ref, really. Yeah.
The cool thing, actually, here, with this being generic in this way, is you could also have the reader here be a vector. So you can have an object entirely in memory, which would actually also work for write. In fact, this doesn't even need to be a reader. This could be a file that you write into. It's a little weird, but it could be. I'm going to leave it as Object for now. We'll see how we adjust it over time. Okay, so let's now go back to this bit here. So the representation is this bit, which is the header, which we already read out. The actual file doesn't contain newlines. Okay, good to know. So these are going to be... All right. All right. And how's the mode encoded — is the mode encoded in decimal? Okay, great. So here's what we have to do. We have to — this sort of alternates, right, between mode, name, and 20-byte SHA. And I think the easiest way to split this is probably to read until a null. Yeah, I think this is actually just a pretty simple loop. So we do a loop where the first thing we do is mode and name. And we can have a buffer here, which is the thing we're going to need to read into. So mode and name is going to be object.reader — and we want to read_until — we want to read until we hit a zero, really. And in fact, yeah, give me BufRead. So this is going to read into the buf. And in fact, we're going to buf.clear at the start of each of these loops. We're going to read until we get a zero. And this is going to be n. If n is equal to zero, then we break, because that means we've hit end of file. Provide the argument context. Oh, what is — oh, here: "read next tree object entry". So here, we know that what's now in buf is going to be the mode and the name. And so what we'll do is — in fact, we could have two different buffers here. We could have one, which is going to be mode and name, and we can have one, which is the SHA-1 hash.
So this first thing is going to be .read_until — and I just wish read_until would allow me to write into a string. There is a version of this that does... but it's not going to be nice. Is it because it's a C string instead? All right, fine. We won't do that. Take that back. So we're going to read until this. Then we're going to say mode and name is CStr::... new with — what was the name of it? from_bytes_with_nul. Of buf. And then we can do here "invalid tree entry". And here's actually an interesting question, right? So one thing that's interesting about C strings is that they're not guaranteed to be valid UTF-8. So you can imagine running this command on Windows, for example, where file names are not encoded as UTF-8. Or, in fact, on a Linux system that's configured in the same way. And so the byte string that we get out of here is not necessarily compatible with a String or a str reference. It truly is a C string — that is, a sequence of bytes that doesn't contain a null. And so the question here becomes: how do we want to print these? I think what we're going to do here is just print them out raw, maybe. So we do know that mode and — this should be mode and name. Mode and name is going to be mode_and_name dot... Oh, isn't there a — I thought — to_bytes. And then I want to — I want to split_once. Why is there not a split_once? I think the mode fields are actually guaranteed to all be the same length. That's an interesting question, actually. Mode, mode, mode. Like, I think they are always six characters. Realistically, we should really just split by space, I suppose. So there is a split_first. No. I'm fairly sure there's a — Rust, slice... That's not what I wanted it to do at all. Give me the splits. Do I really need to do a find and then a split? Come on. Well, split does what I want. And I guess I could do a splitn, but I want to split once. Yeah, split_once. Oh, it's nightly only. Fine, we'll do a splitn for now. Two.
And this is going to be |&b| b == b' '. I think technically it's, like, is_ascii_whitespace, but I think specifically we want it to be a space here. Let me go ahead and do this and add a "TODO: replace with split_once". So bits is going to be this. let mode is bits.next().expect("split always yields once"). let name is bits.next().ok_or_else — anyhow::anyhow! — this is, like, "tree entry has no file name". Great. This should be this one. This can be this one. Okay, so we have the mode and the name. And the next thing now is the hash. Now, the awkward part is that if we read more into the buffer, it invalidates our borrow of the buffer, which is where mode and name here live. In theory, we could sort of read into the tail end of the buffer, but we're not guaranteed that that won't reallocate and invalidate the references. So I think what we'll do here — I guess we could just print out the name before we have the hash. I suppose that's okay. Well, what's the expected output format here? It's like we're supposed to print mode, space... Oh, I guess for now it's name only. So let's do the name only. So that means this will do stdout.write_all(name). Right — "write tree entry name". And notice here, we don't actually require UTF-8, right? We just write the bytes directly out to standard out. And if they happen to not be UTF-8, that's up to the terminal to deal with. And then we'll also do a write of a newline to standard out — "write newline". A little sad we can't combine these. And then we just don't do anything about the mode, because we would only print that if we didn't have --name-only. And then this can go away, right? We're not going to do that anymore. And now we're going to buf.clear. And then we're going to — actually, we don't need to do that. We can do the following. We can just read them all into the buffer first. That's what I want to do. So here we do a read_until. If we hit end of file, then that's fine. We break. And then we do n is object.reader.read.
Why isn't there a read exact? Because what I really want to do here, right, is I want to take the buffer and I want to take the object reader and I want to read exactly — sort of from n to n plus 20, right? There is a way to do this. I'm just getting the compiler to help me here for a sec. I didn't know that's why. Okay, so read_exact. And in fact, I guess I can actually just use a different buf here. So I'll do hash_buf is a [0u8; 20]. And I want to read into hash_buf. "Read tree entry hash." Tree entry object hash. And that is all that's stored in there, right? Yeah, okay, great. And here too, if n is less than... It is 20 bytes, right? Or is it 40? 20 bytes. Is less than hash_buf.len(). Then break. "Expected" — because this is unit. Ah, read_exact will error anyway if the number of bytes available is not enough to fill hash_buf. So that is fine. And so now we actually do have access to this at all times. So we're good. Okay, so this writes the newline. And then we loop. Are we allowed to print a trailing newline here in the output of ls-tree? Alphabetically sorted — just because that's how it's already stored anyway. Are there any notes? Yeah, okay. I think that's all, right? We have the hash here. We just choose not to print it. In fact, we could say that the hash is hex::encode of hash_buf. So down here, we could say... No, I do need that to be here. And then we could say if — so now we don't need to enforce name only. So if name_only, then this one's easy. We just do this. Else — and we can have the newline printing at the end anyway. Else, we decode the hash and we print out. So for the long output of this, it's supposed to be mode, space, the kind — which we don't know yet. And that's presumably why this one is semi-optional, right? So we're going to write_all the mode. And then we're going to write_all the name. And then, I guess, actually, in the middle here, we're going to write space. In fact, we could just write out buf here.
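The whole parsing loop built up over the last few steps can be condensed into a std-only sketch. Names are illustrative (`parse_tree` isn't the stream's exact function); the input is the already-decompressed tree body, after the `tree <size>\0` header has been consumed:

```rust
use std::io::{BufRead, Read};

// Parse a tree object's body: repeated "<mode> <name>\0<20 raw sha bytes>"
// entries with no separators. Returns (mode, name, sha) triples.
fn parse_tree(mut r: impl BufRead) -> Vec<(Vec<u8>, Vec<u8>, [u8; 20])> {
    let mut entries = Vec::new();
    let mut buf = Vec::new();
    loop {
        buf.clear();
        let n = r.read_until(0, &mut buf).expect("read entry");
        if n == 0 {
            break; // end of the tree body
        }
        buf.pop(); // drop the NUL terminator
        // Names need not be UTF-8, so split on the raw space byte
        // instead of going through str.
        let space = buf.iter().position(|&b| b == b' ').expect("mode SP name");
        let (mode, name) = (buf[..space].to_vec(), buf[space + 1..].to_vec());
        // The hash is 20 raw bytes, not hex; read_exact errors on short reads.
        let mut sha = [0u8; 20];
        r.read_exact(&mut sha).expect("20-byte sha follows each entry");
        entries.push((mode, name, sha));
    }
    entries
}

fn main() {
    let mut body = b"100644 a.txt\0".to_vec();
    body.extend_from_slice(&[0xab; 20]);
    let entries = parse_tree(std::io::Cursor::new(body));
    assert_eq!(entries[0].0, b"100644".to_vec());
    assert_eq!(entries[0].1, b"a.txt".to_vec());
}
```

Using a separate fixed-size `sha` buffer sidesteps the borrow problem discussed above: reading the hash never reallocates the entry buffer that `mode` and `name` point into.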
Right, because that's already how buf is structured. So we can just do buf here instead. "Write tree entry to stdout." But the interesting next bit is then we want... Oh, no, we can't do that, because we actually want to write the mode and then the hash and then the file name. So we don't actually want to write buf. We want to keep what this was. So we're going to write the mode. Then we're going to write the hash. So we're going to write space, tree or blob — we don't know yet. So let kind — let's just say that that's tree for now. So the kind is going to go there. And then a space, and then the hash. And then, I guess, this is probably some, like, spacing to make them all be aligned, but let's do a single space for now. And then it'll be the actual name. So this is "write tree entry hash to stdout", like so. All right, let's do a cargo run of — let's do a git ls-tree of... Do we have a tree here? This is a tree, right? Yeah, this is a tree. Okay. Moment of truth, I suppose. cargo run, ls-tree. That looks right to me, right? And if I do --name-only, it prints only the names. The hash isn't printed raw — it's stored raw in the file. We do need — yeah, so there are two bits we need to fix here. One of them is aligning the mode. So that should be zero-padded to six characters. Luckily, that's kind of easy. So we actually know that the mode is UTF-8. So here we can do let Ok(mode) is mode.to_str — mode — str::from_utf8. In fact, isn't there a mode.as_ascii... That's fine. str::from_utf8 of mode. We could error here too, actually. We don't need to do that. We can say here that the mode is always valid UTF-8. And now we can do this. Mode, colon, zero, six. Left-aligned? No, I think just colon is fine. "Write tree entry meta." Am I just being silly here? There we go. I just needed to get the characters in the right order. So now you see this entry here is right-aligned with zeroes padded on the left, which is indeed what we wanted.
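The format specifier being fiddled with here does the zero padding in one go (`padded_mode` is just an illustrative wrapper around the format string):

```rust
// Git's long ls-tree output zero-pads the mode to six characters,
// e.g. a directory's "40000" prints as "040000".
fn padded_mode(mode: &str) -> String {
    // `0>6`: pad with '0' on the left to a width of 6.
    format!("{mode:0>6}")
}

fn main() {
    assert_eq!(padded_mode("40000"), "040000");
    assert_eq!(padded_mode("100644"), "100644");
}
```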
And the second one is that they're all listed as trees, which is obviously wrong. The way for us to fix that part is we will need to read out each object to figure out what type it is. And so the way to do that is going to be very slightly painful. But it's going to be: object is Object::read of the hash — which, conveniently, we have right here. And so this is "read object for tree entry". And here we can be a little helpful, so we can give the object hash. And now this is the object kind. And notice that we don't read the rest of the reader. So we don't actually stream in the entire file contents. We just read the header. And now these are all blobs. And this one is a tree. And so, in fact, we should be able to now sort of recurse, right, by calling ls-tree on this. And now it works further down. Amazing. It does the right thing. Okay, that makes me happy. We now have ls-tree. "Implement ls-tree." Git push. Let's see what it thinks. All tests passed. Tab back to the browser. Boom. View next stage. Write a tree object. Ah, so this is the first time we're going to... Well, I guess technically we created a blob object with our write object earlier. So write-tree... We need to implement — do we actually need to implement the staging area? Yeah, exactly. We won't implement the staging area. We'll just assume all the files in the working directory are staged. Create a file with some contents. git add. git write-tree. The output of git write-tree is the 40-character SHA hash of the tree object that was written to .git/objects. We'll have to walk the files in the working directory, create blobs for every file, for directories recurse into them, create tree objects, record their hash, and walk all the way up. Okay. So let's see — it's one. Okay, so we're at the two-and-a-half-hour mark. Wow. Time flies when you're having fun. That means I'm going to go make some tea. And then I'm going to be back in a second, and then we'll write a tree object. Okay. I'll see you all shortly.
I have returned. I have tea now. "Is the volume really low?" I hope not. My audio gear says that it's good. "Does Rust not have an organize-imports equivalent like Go does?" So rust-analyzer does. You see it sort of — when I use the code action to add an import, it will sort of reorder them and stuff. It doesn't automatically remove unused imports. But if I have something like a hash here, then there is a code action to remove unused imports. There's one for merging imports as well. So it does have them. I just don't use them very well. Okay. So now we're writing tree objects, huh? So that's back to our hash-object thing that we had earlier. Right. So we'll want something very similar, except that we want to be able to write things that are not necessarily just a blob. The interesting thing here with trees — and this gets at the thing we originally had with files, right, with blobs — is that you don't know how long the input is until you've assembled all the input. In the file case, we could sort of cheat, because we could stat the file. But even this is technically dodgy, right? Because there's technically a race condition here, which is: imagine that the file changed between when you stat it and when you actually stream it through the encoder. If that happens, then what we'll end up doing is write the wrong size in there. So, like, you know, we're sort of cheating here. Let's see here — my tea alarm went off. So I'll write a "technically there's a race here if the file changes between stat and write". "Would writing the contents of the file first and then going back to prepend the header work?" It might, although prepending to files is kind of annoying — appending is pretty easy; prepending is usually expensive. The way I would probably do this instead is — there are cheats you can do, right? So you can, because this is encoded as an integer, you can always prefix zeros to an integer.
So you just write out a bunch of zeros, and then you just replace those bytes after the fact. But that's really not very nice. Another alternative is you write the raw bytes to a file and then you read that file out, which you know is not going to be modified — but even then it's kind of dodgy. Or you write it all to memory, and that way you're guaranteed, but it is a little weird. So it's all, like, a little painful regardless. I think what we'll actually do here is — I think it just has to be in memory. And the way this is probably going to work in practice is I'm going to change our Object thing here, and I'm going to give it a write. And I know some people are going to hate this. And write is going to take a — it's going to take a kind, it's going to take a size, and it's going to take a writer. So Object here is kind of a lie, right? Because we don't really have methods on Object. We just have constructors to it that are essentially namespacing. And then — I'm not sure if this is what I want. I don't think it's going to give you back an Object, actually. I think it's going to just do this, which is kind of stupid. But what's interesting is that this can actually be a method on Object, where we basically say that this R is any I/O, so it can be a reader or a writer. You get it back as a reader if you call read, but you can construct one with a writer and then you can call write. It's pretty gross, though. But, like, here, let me show you what I mean. So it'll take a self, and it will return the object ID. And the way it'll do that is here. And so we will steal the hash writer from over here, right? Support this. And then this is going to write now self.kind, self.expected_size, self dot... And I'm going to rename the reader field. I'm not going to let reader be a writer. We're going to import Digest. And then this is going to be from self.reader.
The thing inside Object needs to be the — I'm basically trying to decide whether the thing inside of Object is the object we want to write or the writer that we want to write to. Maybe it actually is a reader. Maybe write takes — maybe that actually is pretty reasonable here. Maybe this is a Read. And then write takes a W that implements Write, and it writes into that W. And so this does actually write into the writer. And then this is the self reader. And then we return Ok of the hash. And this has to be .into(). And this has to be a mut self. And when I say write, I mean write as in W-R-I-T-E, not right as in R-I-G-H-T. Yeah, we could use an impl Write here. That's fine too. And so now — let's see if we can actually rewrite this one in terms of the other one now. We should now be able to do: the object kind is Kind::Blob, expected size is stat.len(), and reader is file. And now I should be able to .write that into writer, like this. And then the hash should be the outcome of that write. This goes away. This goes away. That's not too bad. And the reason why this is kind of nice is because here the reader is a file, but because vectors implement Read, we can also — just write, as we will for trees. For example, we could take a string and we can provide that as the reader, which will then have that be the thing that gets written in. So I think this will actually turn out pretty nicely. And then it remains true that this is a reader. And so now, what is it — it wants us to implement a write-tree. Okay, so let's go back to main here. Write-tree — and write-tree takes no arguments, at least for now. WriteTree. And that's going to look a lot like hash-object. So let's start from there. But what it will do is... Oh, actually, here's another thing we could do. So let me comment this out for now.
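The `Object::write` shape being settled on can be sketched std-only. In the real code `w` would be the hash writer wrapping a zlib encoder and the method would return the finalized SHA-1; here it just streams the header plus body into any writer so the structure is visible (names follow the stream, the body is illustrative):

```rust
use std::io::{self, Read, Write};

enum Kind { Blob, Tree }

impl Kind {
    fn as_str(&self) -> &'static str {
        match self {
            Kind::Blob => "blob",
            Kind::Tree => "tree",
        }
    }
}

struct Object<R> {
    kind: Kind,
    expected_size: u64,
    reader: R,
}

impl<R: Read> Object<R> {
    // Stream "<kind> <size>\0" followed by the body into `w`. Returns
    // the number of body bytes copied; the real version would instead
    // return the SHA-1 accumulated by a HashWriter around `w`.
    fn write(mut self, mut w: impl Write) -> io::Result<u64> {
        write!(w, "{} {}\0", self.kind.as_str(), self.expected_size)?;
        io::copy(&mut self.reader, &mut w)
    }
}

fn main() {
    // Because &[u8] implements Read, an in-memory object works as well
    // as a File -- which is exactly what write-tree needs later.
    let obj = Object { kind: Kind::Blob, expected_size: 3, reader: &b"abc"[..] };
    let mut out = Vec::new();
    obj.write(&mut out).unwrap();
    assert_eq!(out, b"blob 3\0abc".to_vec());
}
```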
This thing that actually goes through a file might be a handy convenience thing on Object, which is — we could have a blob_from_file, which takes an AsRef<Path> and returns this, that actually does this bit for us, just because we might end up using the same thing in the tree writer. let file is file.as_ref(). "What's the color theme?" It's gruvbox dark hard. And so this is actually going to return — a buffer is fine, doesn't really matter. Now, the weird thing here is that... Oh, yeah, no. Okay, so the reader here is always one without the header. The reader here is always a raw reader. So that's fine. This only guarantees that it implements Read, which is kind of interesting, actually, because we may want to do a buffered reader here to boost the runtime performance, but I think we can ignore that for now. So that means this is always going to return a Kind::Blob. So now this should be blob_from_file of file. And these now go away. This goes away. This now becomes "open blob input file". That's pretty nice, because now we can have that exact same code be in our write-tree. "How do you set up rustfmt in your Neovim config? Is it on save, or do you have a hotkey for that?" It's on save. I have rust-analyzer set to run rustfmt on save. Okay, so now let's go back to write-tree. So our task here is going to be to walk the current directory. And I guess we actually want to walk it — yeah, I think we need to do this recursively, right? Because any time you hit a subdirectory, you need to construct a tree object for the subdirectory, return the hash, and that's the thing that goes into the parent. So here we're going to need a sort of write_tree_for, which is going to take a path and return an anyhow::Result of a [u8; 20]. And so this is where we're going to write our recursive thing. There are crates that help you write this, like this one called walkdir. walkdir is really nice for this. There's also one on top of this called ignore.
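The `blob_from_file` convenience described here can be sketched in std alone. This is a simplified stand-in (it returns the size and an open file rather than the stream's `Object`), and it deliberately keeps the stat-then-open sequence so the race mentioned earlier is visible:

```rust
use std::fs;
use std::io;
use std::path::Path;

// Stat the file for the size, then open it for streaming. Note the race
// discussed in the stream: if the file changes between the stat and the
// eventual read, the recorded size is wrong.
fn blob_from_file(path: impl AsRef<Path>) -> io::Result<(u64, fs::File)> {
    let path = path.as_ref();
    let stat = fs::metadata(path)?;
    let file = fs::File::open(path)?;
    Ok((stat.len(), file))
}

fn main() -> io::Result<()> {
    let tmp = std::env::temp_dir().join("blob_from_file_demo.txt");
    fs::write(&tmp, b"hello")?;
    let (size, _reader) = blob_from_file(&tmp)?;
    assert_eq!(size, 5);
    fs::remove_file(&tmp)?;
    Ok(())
}
```

In the real code the returned pair would be wrapped up as `Object { kind: Kind::Blob, expected_size, reader }`, which is why the same helper serves both hash-object and write-tree.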
This is what ripgrep uses, for example, and it knows about things like gitignore files, which might actually be really handy for us here. And I believe that you can tell the WalkBuilder to have a max depth of one. So we could use this one, and that way we get things like gitignores for free. For now, let's do it the straightforward way, and we'll see if it comes back to bite us. So dir is going to be fs::read_dir of path, .context "read directory". And I guess we can do this so that it's — we could even do "open directory", this, and say path.display(). And then we'll do while let Some(entry) is dir.next(). And I think actually this is going to be an Option, because it is technically possible that you have a directory, for example, with no files in it, and Git will just not represent empty trees — they will just be completely skipped from the thing that you create. So I think we want this to be an Option. So while entry is this — oops — then let entry is entry dot... And then, when you walk over things in read_dir, every individual entry might also error. "Bad directory entry in" — what am I doing with context? That goes there. This goes here, for path.display(), like so. And I've got to write this correctly. So what we want to do when we create a tree is we want to create that same representation that we've been talking about, right? So we want a string — in fact, we want a Vec. So the tree object is going to be a Vec::new. And for each entry, the thing we want to write to the tree object is the mode, the name, a literal NUL character — right, mode, space, name, literal NUL character — and then the hash. That's ultimately what we want to produce. And because the hash is not a string and not hex encoded, we're actually going to have to do tree_object.push — well, I guess .extend — hash. So we need to compute the mode and the name. The mode is going to be a match on entry.metadata(). Context: "metadata for directory entry". Meta is this.
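The per-entry serialization just described — mode, space, name, NUL, then the 20 raw hash bytes — is small enough to sketch on its own (`push_tree_entry` is an illustrative helper, not the stream's code):

```rust
// Append one tree entry: "<mode> <name>\0" followed by the 20 raw
// (not hex) hash bytes. Entries are simply concatenated, no separators.
fn push_tree_entry(tree: &mut Vec<u8>, mode: &str, name: &[u8], hash: &[u8; 20]) {
    tree.extend_from_slice(mode.as_bytes());
    tree.push(b' ');
    tree.extend_from_slice(name); // file names need not be UTF-8
    tree.push(0);
    tree.extend_from_slice(hash);
}

fn main() {
    let mut tree = Vec::new();
    push_tree_entry(&mut tree, "100644", b"a.txt", &[0; 20]);
    // 6 (mode) + 1 (space) + 5 (name) + 1 (NUL) + 20 (hash) = 33 bytes.
    assert_eq!(tree.len(), 33);
    assert_eq!(tree[..13].to_vec(), b"100644 a.txt\0".to_vec());
}
```

Once all entries are appended, the finished `Vec<u8>` is the body whose length goes into the `tree <size>\0` header, which is why building it in memory resolves the size-before-content problem.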
And I guess we'll also grab the path out, because we're definitely going to need the path. Like so. Isn't there also entry dot... actually, we only really need the file name, we don't need the path. file_name is entry dot... And then I think we want to match on meta.permissions(). Well, I guess mut mode is... if entry dot... where's the file type? That's really what I want. Oh, file_type is equal to this — file type for directory entry. I guess technically that might be in the meta. Yeah, it is. Okay, great. So the initial state of the mode is: if meta.is_dir(), then this is going to be "0"... no, just "40000" — they write it without leading zeroes. Otherwise, it's going to match on meta.permissions(). Or, what's the... there was some set of rules here somewhere. Recap. Mode, mode, mode. That's not helpful. It was in a previous exercise, I think, a description. Can I go back? Yeah, tree objects. For files, the valid values are these, these, and these. Okay, so this is going to be: else if meta.is_symlink(), then it is that string, "120000". Else if meta.permissions() dot... right, that is the thing that's only available on Unix. No, why does my... oh, have I somehow changed my search engine? That's not what I want; I want Rust docs. That's why my searches aren't working anymore. What is it called... Permissions... yeah, PermissionsExt, which is Unix-only. Because I don't think there's actually a way to check for executable on Windows, is there? Like, there might not be a flag like that. It would be under fs... file metadata... file_attributes. Yeah. Okay, so there might be a way to pull it out through that. I think what we'll do here is go the Unix-only path for right now, which is to grab this guy, which then gives us meta.permissions().mode(). And we want to see whether that... and what is the bit for... how do you even determine whether it is executable?
I suppose we could just take the mode as-is, but it seems like Git is a little bit more restrictive about which modes it actually allows in there. I think what I'll actually do here is just say: AND with octal 111, not equal to zero — "has at least one executable bit set". I think that's right, because of the way you construct executable bits: a one in any given octal position is the executable bit. Read-and-write is 644, for example — read and write for the owner, read-only for group and other. So the four is the read bit, two is the write bit, and one is the executable bit. So if we AND with all the one bits and it's not equal to zero, that means at least one of the executable bits is set. And so I think this is roughly what we want — you could imagine only doing this if, say, the owner bit is set, but I think this is fine. Great. Else if it's a symlink... We don't need the mut. The file name here is an OsString, right, so we can't necessarily write that out like this either. So we need to do this, and then we need to do tree_object.extend(file_name), and then tree_object.push of a NUL byte. And at this point, like, why even use write!, right? An OsString is definitely valid as just bytes. Right? Give me OsString... where's OsString... an OsString had better be valid as just a bunch of bytes. Really? Oh, as_encoded_bytes. Okay, fine, as_encoded_bytes. Right. And we don't have the hash yet. So the hash is going to be: if meta.is_dir(), then — and this is where we get into the recursion — it is write_tree_for of entry.path(). If it's not, then it is the thing we made for write-object, which is this from_file of entry.path(). I guess that means I might as well pull that out here. And this needs to write to... so there's also the question of where that goes, and this is where this extra business is needed.
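Putting the mode rules together — "40000" for directories, "120000" for symlinks, "100755" when any execute bit is set, "100644" otherwise — a sketch might look like this (git_mode is an illustrative name; note that fs::metadata follows symlinks, so you would need symlink_metadata for the symlink arm to ever fire):

```rust
use std::fs::Metadata;
use std::os::unix::fs::PermissionsExt; // Unix-only, as discussed in the stream

// Sketch: picking the Git mode string for a directory entry.
// "40000" has no leading zero, matching how Git writes tree entries.
fn git_mode(meta: &Metadata) -> &'static str {
    if meta.is_dir() {
        "40000"
    } else if meta.file_type().is_symlink() {
        "120000"
    } else if meta.permissions().mode() & 0o111 != 0 {
        // any of the three execute bits (owner/group/other) is set
        "100755"
    } else {
        "100644"
    }
}
```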
Do we want to put that somewhere? No, I'm going to replicate this for now. It's annoying, but it is what it is. So that's going to do this. And that means we actually also need to hex-encode the hash in order to use it here in the path that we generate. And then... this is actually let Some(hash) = this, and this is where we end up skipping — so this is "empty directory, so don't include in parent". And then that's going to be the hash. Amazing. I think that does it, right? So then the main bit we have left is we need to decide here what to... so this extends the tree object to include all the entries, and then I guess: if tree_object.is_empty(), then we return Ok(None) — meaning there were no entries. Otherwise we return — and this is where the hash is going to come in — we do this. Object... we're going to have to do the same temporary-file dance here, aren't we? Yeah, we are. That makes me think we probably need to make that more standardized. So this is going to be an Object — let's fill in the bits here first. Kind is going to be Kind::Tree, expected size is going to be tree_object.len(), and reader is just going to be the tree object, because Vec implements Read. "Unsatisfied trait bounds." Doesn't it? Oh, we might need a Cursor wrapper. I guess it does not. Let's see, look at Read, maybe it says there. It does not. Okay, so then I think it is Cursor. Yep, Cursor. So we'll want here a Cursor::new. So this is then going to be Ok of hash — Ok of Some of hash. File for tree stream, file into tree stream, tree object into tree object file, create subdir of .git/objects, move tree file into .git/objects. So clearly this is code we've now repeated a bunch of times, which is not super nice. And that means we'll go over here and probably make our write function a little bit smarter.
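The "temporary dance" being repeated here — write the object to a temp file, then move it into .git/objects under a subdirectory named by the first two hex characters of the hash — can be sketched like this (move_into_objects is an illustrative name; compression and hashing, which the real code does before this step, are elided):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Sketch: write contents to a temp file, then rename it into place at
// .git/objects/<first-2-hex-chars>/<remaining-38-chars>.
// `hash_hex` is the 40-character hex-encoded SHA-1.
fn move_into_objects(git_dir: &Path, hash_hex: &str, contents: &[u8]) -> std::io::Result<()> {
    let objects = git_dir.join("objects");
    fs::create_dir_all(&objects)?;
    let tmp = objects.join("tmp_object");
    fs::File::create(&tmp)?.write_all(contents)?;
    let subdir = objects.join(&hash_hex[..2]);
    fs::create_dir_all(&subdir)?;
    // rename is atomic on the same filesystem, so readers never see
    // a partially written object
    fs::rename(&tmp, subdir.join(&hash_hex[2..]))?;
    Ok(())
}
```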
We probably still want write as a convenience method for testing, for example, but let's add a write_to_objects, and that's going to have all of this logic here, which is going to do this. And again, of course, we will fix this at some point; it just doesn't matter at the moment. So now this can all become a lot easier, because we can say write_to_objects. write_to_objects does not need to take a writer at all — it just takes the object — and then this is just "write tree object", and this becomes Ok of Some of that. And similarly, now we can also simplify hash-object, which now just becomes... this one's a little more annoying, but I think this one's actually okay now, because we don't even need this anymore. This helper is now so small that it can go away. So this goes away and becomes this, and this becomes write_to_objects — "into .git/objects", "into blob object", and this is, I guess, really "object file". And then all of this goes away. And in fact we can still keep this one — right, that remains the same. So this is now just "where do we write the object?", and the hex-encoding we have to do regardless, so we can say here that this is going to be hex-encode of hash. So that makes this a little nicer, and this makes those nicer. That's pretty short and sweet, wouldn't you say? And so now we have write_to_objects there. It still uses this single thing called temporary, but I think that's okay. This is now just write_to_objects: outputs the hash, recurses, does all of that stuff. And so in theory now, we should be able to do hash = write_tree_for of Path::new(".") — "construct root tree object" — and that's just going to make the hash, and then we print the hash. Right, right, I think that's... oh, I guess let Some(hash). And so this would be, like, anyhow bail: "asked to make tree object for empty tree".
And then I suppose we want to at least hard-code that it shouldn't commit the .git directory. So what we'll do up here is: if file_name == ".git", then continue. Otherwise this will potentially recurse forever, which is not what we want. I guess the question becomes... right, I guess I need to git add dot. And then I want to see... what's the comparison they give you here? They say... no, "write a tree object" — write-tree. I could run this in a different directory, but where's the fun in that? If I do git write-tree, what does it give me? Okay. And if I run cargo run write-tree... I was about to say, yeah, it gives the same hash — but it's just the first character that's the same. So that doesn't really work. Oops. Well, it did something. Is there something else that we end up including that they don't? Oh, I wonder if it's the alphabetical ordering, because we walk these in random order — whatever order the reader here gives us. I think that should be... actually, we can find this out, right? We can do ls-tree of the object we just constructed. It's a valid tree — at least Git thinks it is. In fact, all the entries... no, something's not the same. It's the ordering. See, in our tree we print out Cargo.lock, then README, then codecrafters. They print out Cargo.lock, Cargo.toml, README, then codecrafters. So I think it really truly is just the ordering. That's pretty promising. So that means we're actually going to have to sort all of these first. That's not the end of the world, although sorting is interesting, because do they sort... I guess they have to sort by byte value, right? Because otherwise you'd get a different ordering on, for example, Windows versus Linux. Or do they actually understand which encoding you're using for the file names? I don't think that's reasonable. I think they probably just do an ASCII-ish sort of the file names. Like, what does "sorted" mean, right? Do they tell us down here? That's fine. That's fine.
"Ignore the .git directory" — that's fine. But what does "ordered" mean? I don't think there's a sorted reader, so I think the way we're going to do this is: entries is... I guess we could use a BTreeMap here, given we're sorting them anyway. It's a BTreeMap... and then we'll do entries.insert of entry.file_name()... no, I think I want this to be a vector. Yeah, I'm just going to have this be a Vec. And then I'm going to do entries.push(entry), and then entries.sort_unstable_by_key with entry.file_name(). And then this is just going to be for entry in entries. Still not the same. So that's ours and that's theirs. They are now in the same order, but the hashes aren't all — the hash for the nested directory is not the same. The hash for source for us is whatever this is, and the hash for them is whatever that is. The ordering here is different: they order shorter strings first... no, we order shorter strings first; they order shorter strings last. Why? Okay, interesting. So they consider commands.rs... let me double-check that I'm not lying here. git write-tree is the one that's 401, 401 is the one that has 446, 446 is the one that has commands.rs first. So they order commands.rs before commands, whereas we order commands before commands.rs. No, they don't have directories first — this is the output from Git, and the output from Git does not have the tree first; it has a blob first. So they specifically order differently here: if one name is strictly longer than the other, the longer one gets ordered first, which is not how string comparison works in Rust. Well, that's interesting. So in that case we're going to have to do... no, it's not files versus directories; this is entirely based on the name. Again, do they order the whole line entry? No, they don't order the whole line entry — then BD would have come before 99, or rather 99 would have come before this. It is just the name.
But that means that our sort-by up here actually needs to take both a and b. And I think... so there are two ways we could do this. Oh, it's probably because they sort including the terminating NUL byte — yeah, that's the same thing someone's getting at in chat. If you sort with the terminating NUL byte, then NUL comes before all other characters, and so the NUL would come first, which is what they do... no, that's the thing — they don't sort that way; we sort that way. This first one is ours, and this one has a sort of NUL at the end here, and that one comes before the one that doesn't have a NUL there, which to me seems reasonable. But in the Git one, it's the other way around. So I think it really is... we may actually need to... it may even need to be the stupid thing. So if... afn... oh, really? This is an OsString. Right — as_encoded_bytes, as_encoded_bytes. In fact, that's an interesting question: if I go to OsString, what is their implementation of Ord? I would sort of assume their implementation of ordering is just by the bytes. Yeah, it is. Okay, great. Yeah, you could think of NUL as having a really high weight for them as opposed to a low weight. And then I think what we want to do is: if afn.starts_with(bfn), then return the... if afn starts with bfn, that means afn is longer, so then we want to return afn first, right? If bfn.starts_with(afn), then we want to return bfn first. Otherwise, we want to return a compared to b. And this needs to be an Ordering. So this means a comes before; this means b comes before. I may have confused myself here. Oh, file_name returns an OsString. Can I have it just give me the thing without allocating? Because that's what I want. Really? Entry... where's my DirEntry... wow, they all need to allocate. Alright, that's fine then, I guess. That means this has to be this, and then let afn... starts_with... bfn, as_encoded_bytes; bfn is this. I guess because they're owned, we can pull this trick instead.
So because they're owned, we can do the following: into_encoded_bytes — this is at least assuming that's the method. Okay, great. So then I can do afn.push and bfn.push, and then I can return afn compared to bfn. Git considers shorter strings to come after longer strings, which effectively means the terminating character is valued high and not low. git add, write-tree... that's the same hash. Yeah! Nice. "Push will probably need to reallocate." You think so, here? You think this actually allocates like a lame OsString? It's really annoying that they don't expose this method. I guess since we're on Unix, there might actually be a Unix version of this — so if we go here, that gives me access to the raw bytes. DirEntry... ext... no, that's not helpful. Okay, that's fine. Yeah, you might be right, because file_name here is going to get back the OsStr, and then I'm going to call to OsString, which calls to_owned, which calls to a Vec. So what does that do... unclear... into the Vec... you're probably right that it won't over-allocate. The allocation is sad here; this probably allocates. "If your directory contains a dot, does Git sort it normally, as before?" What if your directory contains a dot?
Okay, let's do a mkdir foo-dot, touch foo-dot... no, not foo-dot. mkdir src/commands-dot, touch something in it, git add dot, git write-tree, git ls-tree of this thing... what, what is this ordering? Git, what is this? This makes no sense. Okay, okay, Git, go home, you're drunk. "It's got to have to do with the..." — I agree with you, I think it has to do with the extension somehow. Like, maybe it's everything with extensions before anything that doesn't have an extension? But what about if there are multiple extensions — what does that get ordered as? Does dot come before the letters? So that's not necessarily wrong... "the .rs thing only applies one level deep"... look, this is just ridiculous. What is this ordering? It's not plain files — I guess we can test if it's plain files, right? Like, if I mkdir a source dir, foo.bar, and then I touch foo.bars and foo.s, and I guess I'll do this one as well, and I do git add dot and then git write-tree and then git ls-tree of this... and foo.bar-dot... now I'm just trying all the things. I just want to understand. foo.bar-dot... maybe you are right, maybe it is files first, but it's in a weird way, because if you compare these two here, the one without a dot at the end comes first, but if you compare foo.bar-dot and foo.bar, that's the same pattern, but there the one with the dot at the end comes first. So I think you're right — I think the type does affect the ordering here. But what's the rule? Because it's not as though, if you have a file and a tree with the same name, the file comes first — they cannot have the same name. So I think it's like... here's what I think it does: I think it removes the extension and then it orders them by name, then type. I think that's what's going on — it removes the extension, orders by name with the extension removed, then orders by type, and I guess then by the extension. I think that's what's going on. But how many extensions do they remove? Like... there's a PathBuf::from of this... "I believe it compares conflicting directory/file entries as equal" — but these aren't conflicting. "It compares conflicting directory/file entries as equal" — okay, but they're not conflicting here; they're different names. "Note that while a directory name compares as equal..." — that's from someone reading who's looked up the Git source code. I guess let me pull this up: Git source code, GitHub... sure, go ahead and give me something like... probably tree... tree.c... "is identical to base_name_compare, except it compares conflicting directory/file entries as equal. Note that while a directory name compares as equal to a regular file, they then individually compare differently to a filename that has a dot after the base name." Right, but we don't have conflicting entries. I think I'm going insane. There's got to be something where they do something about this trailing dot. What do they do... is_dir... oops... nope. Okay, is_dir, but that's comparing the mode. And in fact this first thing just straight up compares the names, so this is only if they're... yeah, and the problem here is the names aren't the same, right? So we should be taking this first branch and all of this logic shouldn't matter. Ah — ha! They take the shorter of the lengths, and they compare up to the shorter of the lengths. Okay, this is very cursed. No — ah, so here's what it does. It doesn't look at the extension, but it does look at the mode. So it takes: let common_len be the min of afn.len() and bfn.len() — that's step one — into_encoded_bytes — and then we match on afn up to common_len compared with bfn up to common_len, and if the ordering is Equal, then we continue; otherwise we return it. If afn.len() equals bfn.len(), then we return Ordering::Equal. And then this bit — so if len here is the shared length, this is just checking whether we've hit the end of... ah, it adds a slash at the end if the thing is a directory, for the purposes of comparison. That's base_name_compare as opposed to df_name_compare — okay, yeah, that's fine, we can use this one instead because we're not handling potential conflicts, so that might make this a little easier, but it still does the same thing. Then it looks at what the next character should be, and the way it does that is it says: c1 — I guess that is actually a u8 — c1 is afn.get at common_len, c2 is bfn.get at common_len — that's the equivalent of what they're doing there — and then it is unwrap_or_else... that's what this thing is here... except it's not an unwrap_or_else. We could do it as an unwrap_or_else... no, we'll do this as an if-let-Some, else: if this is a directory, then that's going to be a slash, else it's going to be None. Right. And this is really then if a.metadata() — which means we'd have to extract the metadata in the comparison, which is awful. So I think we're actually going to extract the file name and the metadata when we do this. So we're going to say file_name is entry.file_name(), and we're also going to pull out the metadata, which we can grab down here, and then we're going to push the entry, the name, and the metadata in here. And so this is then going to be a.1, this is going to be b.1, as_encoded_bytes, as_encoded_bytes, and then this is going to be a.2 — that's the metadata — is_dir. Okay, and then c2 is going to be the same but for b. And then what do they do? They do this — okay, and so now they do c1 compare — I think this is just c1.cmp(c2), right? "If c1 is less than c2, return minus one; if c1 is greater than c2, return one; otherwise return zero" — which is what c1.cmp(c2) does. And then this can be the entry's file name and metadata, because we already extracted them. Okay. write-tree... cargo run write-tree... we still get the same hash — wait, we get the same hash! So that means we succeeded, including all the weird test cases we just added. Okay, so we did it.
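The rule eventually worked out here — compare names byte-wise up to the shorter length, then treat a directory as if its name had a trailing slash — can be sketched as a standalone comparator (a simplification of Git's base_name_compare; tree_entry_cmp is an illustrative name):

```rust
use std::cmp::Ordering;

// Sketch of Git's tree-entry name ordering: compare names byte-by-byte,
// but a directory sorts as if its name had a trailing '/' appended.
fn tree_entry_cmp(a: &[u8], a_is_dir: bool, b: &[u8], b_is_dir: bool) -> Ordering {
    let common = a.len().min(b.len());
    match a[..common].cmp(&b[..common]) {
        Ordering::Equal => {}
        other => return other,
    }
    // One name is a prefix of the other. The "next character" of the
    // shorter one is '/' if it names a directory, otherwise it just ends.
    let c1 = a.get(common).copied().or(if a_is_dir { Some(b'/') } else { None });
    let c2 = b.get(common).copied().or(if b_is_dir { Some(b'/') } else { None });
    match (c1, c2) {
        (None, None) => Ordering::Equal,
        (None, Some(_)) => Ordering::Less,
        (Some(_), None) => Ordering::Greater,
        (Some(x), Some(y)) => x.cmp(&y),
    }
}
```

This reproduces the observation from the stream: since '.' (0x2e) sorts before '/' (0x2f), the file commands.rs lands before the directory commands, while between two plain files the shorter name still comes first.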
Right, yay! Git has very specific rules for how to compare names. That's fine, as long as the thing we got was correct. Celebration! I guess: source foo, and also source commands-dot... git add dot... "git source commands with the dot"... "implement write-tree", push... all tests pass. Congrats! Okay, I guess we get the fireworks. Weird. Okay. So, writing of the tree object: we actually got there pretty quickly, and then we spent like two or three times as long as writing the tree object on directory entry ordering. But hey, we still did it. Okay: the git commit-tree command — the commit. Alright, let's go to main and add a commit-tree. It takes a message and a hash. Okay, so commit-tree takes a tree hash, which is a String, and it also takes a message, which is a String. And here, you know, you could implement things like: if dash-m isn't passed, then spawn an editor and everything — but we'll leave that for... never. Okay: commands, commit_tree, "commit-tree invoke", message and tree hash, commit_tree. Alright, so now if we go to this, let's actually read about the commit. "A commit object contains information like the committer/author name and email, a timestamp, the tree SHA, and the parent commit SHA, if any." Okay. commit-tree... dash-p for parent, I see. So the idea is that you can create a commit that doesn't have a parent, which would just be the initial commit, or you can create one that does have a parent. So that means in our main we actually need a tree hash, and there's also a parent hash, which is optional — so that here would be parent_hash. Let me go ahead and generate this function, and we're probably going to need most of this. The interesting thing here is, I think we can actually reuse pretty much all of this, right? We can just make this be pub(crate), and then what do we have to do in commit-tree? Well — oh no, the tree is already given to us, so we don't even need to reuse this; the expectation is that we're just given a tree hash. Okay, so in that case this is then pretty similar to just the bit at the end, where we construct the object. So we should be able here to do Kind::Commit — and we don't actually know what the commit is going to look like yet, but it is probably going to be formattable — and this will then be commit.len() and commit; write that, hash this, and then, just like down here, we'll just print the hash and be done. This is going to be an anyhow Result of nothing, and the question just becomes what we put in the commit. And I think the bit we put in the commit is pretty easy, because it's just... what's in a commit? It's just this kind of thing, right? But what is the actual encoding of it? Aha, format... So we have one of the people working on codecrafters in the chat, and someone in chat pointed out that all the ordering cases here that we've talked about — like the name reordering — can't possibly be part of the test cases, right? And the codecrafters person is like, "it wasn't so far, but now it's going to be." Nice. Okay. Aha, the output is... okay, content — I think it actually just is ASCII-encoded. Alright, well, if it's just an ASCII string, then this is trivial, right? Tree is the tree hash... actually, I don't want to do this with format!, because the parent hash is optional. So we're going to do String::new(), and then we're going to write into the commit — because you can write into Strings, which is really nice. We're going to write — writeln, actually — tree and the tree hash, so that's the tree for the commit. And then if let Some(parent_hash) = parent_hash, then we will write parent and the parent hash. And then we'll write author — and we'll just hard-code this one — and we'll set committer as well. And then we have to write an empty line, it seems to me at least — right, it's an empty line in between here — and then we writeln the message into the commit. It's unclear whether there's a newline at the end... 0a — it does look like there's a newline at the end. Okay. Oh, and then it wants me to use Write. Ordering can go away, fs can go away,
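The commit body being assembled here can be sketched as a standalone function. This is a simplified sketch: the author and committer values below are hard-coded placeholders (real Git writes "Name <email> <unix-seconds> <tz-offset>"), and make_commit is an illustrative name:

```rust
use std::fmt::Write;

// Sketch: building a commit object's body as a plain string.
// Header lines, then a blank line, then the message with a trailing newline.
fn make_commit(tree_hash: &str, parent_hash: Option<&str>, message: &str) -> String {
    let mut commit = String::new();
    // writing into a String is infallible, so unwrap never triggers
    writeln!(commit, "tree {tree_hash}").unwrap();
    if let Some(parent) = parent_hash {
        // the parent line is simply omitted for an initial commit
        writeln!(commit, "parent {parent}").unwrap();
    }
    writeln!(commit, "author Example <e@example.com> 0 +0000").unwrap();
    writeln!(commit, "committer Example <e@example.com> 0 +0000").unwrap();
    writeln!(commit).unwrap(); // the empty line separating headers from message
    writeln!(commit, "{message}").unwrap();
    commit
}
```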
these two can go away as well. This is just the message, right? Isn't there a... I thought there was a way to use writeln without it returning a Result, for Strings... really? I was so sure — the macro — I was so sure there was one where you didn't have to do the unwrap, but okay. I guess we can add the unwraps here; they'll never trigger, as we're writing into a String, but we'll add them anyway so that the compiler is happy. Did we really make it that easy for ourselves? Oh, hang on — 40-character SHA. Yeah, 40 characters, so 20 bytes; that's fine. So it's the same kind of hash, if I'm not mistaken — like, this is the same length as this? Yes. Okay, great. I think that's all right: we just construct the string, we make the object, and we write that out into the object store. git write-tree... let's do a git rev-parse HEAD... write the commit... dash-p this thing... "what"? Oh, it's called commit-tree — is that what I named it? I did; I'm just blind. "commit-tree must give exactly one tree" — yes, I do indeed have to do that, which is fine, because that's going to be git write-tree, so that goes at the end here. Okay, and now, moment of truth yet again. Okay, we produce a different hash. Oh — I mean, we're hard-coding a bunch of things in here, like the timestamp of the commit and stuff — but that's actually to our benefit here, because it means the hashes should be the same... uh, oh, but, right — Git will introduce a timestamp here. Interesting. Well, here's something fun we can try, though. Let's do a cat-file dash-p of this first — that seems right to me. What if I do an ls-tree of that commit? Okay. So here's one thing we can try. Remember, what I just did is basically create a commit of the current directory. So if I now do a commit — "implement commit-tree" — then there's now a commit at HEAD, right? Imagine that I now, for example, try to... what's the way I want to do this... I'm going to create a branch called foo, I'm going to check out foo, and then I'm going to reset foo to origin/master — so this now doesn't have the commit-tree commit. And then I'm going to git write-tree, and then I'm going to cargo run commit-tree... yeah, now git write-tree... I'm now going to — no, cargo run commit-tree of this tree, and — no, I need the current commit as well. So cargo run with this tree that I just wrote up here, and this commit that is the head of foo here. Oh, I don't have commit-tree here — fine, fine, git reset hard master. I just wanted it on foo so that if I happened to have done something completely wrong, I don't mess up my tree too much in a way that's annoying to recover from — back to the place where we just built commit-tree. But we can do this just fine. So write-tree, rev-parse HEAD, cargo run — I'm going to commit this tree with this parent — and now I'm going to try to reset the foo branch to that commit and then do a git log. So master was there; foo is the commit that we just created, called "something", and it has the correct parent. If I do a git diff from master to where we are, there's no difference, and if I do a show dash-p of HEAD, then that is an empty commit. Okay, so now let's try to echo hello into world, git add world — and then write-tree. But notice that I haven't committed that tree; this is going to be important. So now the parent is going to be this, the message is going to be "commit world", and the tree is going to be this, which is the one with the world file added. So that gives me this. And if I now reset hard... in fact, I could rm world, add dot — so now there's no world file here — and then I can reset the branch hard to the commit that I created with commit-tree. And if I now look at git log, I have two commits on foo since master: "something" and "commit world". If I do a diff from master onwards, I have a world file with "hello", and a show dash-p of HEAD gives me that commit. Okay, so this is definitely a working commit-tree. Amazing; that's really cool. Okay, so now I go back to master and I push this... all tests pass. Amazing, look at us go. Okay. Oh, it's kind of tempting, actually, to just implement git commit as well, because we now have all the bits, right? You just call write-tree, as I just showed. Maybe we do that just kind of for fun. So if we go to main and we add commit — maybe we don't even do the parent hash. Okay, commit just takes a message. And if I go to commit_tree, let's extract some of this out so that there's a write_commit which returns you the hash of the commit and nothing else, right? And then this now becomes: hash is write_commit of message, tree hash, parent hash — "great commit" — and by having this one be pub(crate)... and then we do the same thing in write_tree: we take this guy and do this. And now go back to main, and now we say: okay, tree_hash is going to be commands::write_tree's write_tree_for of Path::new("."). So you can see now maybe why commands like write-tree, for example, are called plumbing commands in Git: they do one thing really well, and then things like commit are just stringing together the plumbing commands. And so now we get the tree hash. The parent hash is going to be Git's HEAD, so here we're going to need a little bit of smarts. We're going to do fs::read_to_string of .git/HEAD. And I guess technically here we kind of want this to work even if HEAD doesn't exist — like, if you haven't created any commits yet, for example — but let's just ignore that for right now and assume there's something in there. Now, it is possible for HEAD to contain a hash directly. So, "if parent_hash.len() == 40"... no, let's not do that. Let's do: if let Some of the ref is parent_hash strip-prefix "ref: " — right, that's what's in there — then it's just in .git/refs, and I think it's just that same path: refs/heads/master.
same path right heads master so we're just going to cat that then let then parent hash is anyhow ensure parent hash dot len is equal to 40 and then we can say unknown type of head ref so here resolved is going to be fs read to string of format dot get slash and then just what the ref is for right so if it is for example refs heads master dot get slash and then that path and cat that file which is what I did here to get the hash and so then this becomes here head reference target and I guess what we can do here with context because that seems like it might be relevant I guess actually that's going to be get ref and then we should be able to just return resolved here I believe right we can do that before we start to do the whole constructing of the tree so now we have the tree hash and now we should be able to do the commit hash should now be commands commit tree right commit of the message which we've given the tree hash and some of the parent hash that we just computed create commit and write tree for remember can return can return none so we're going to do this bail actually this isn't even a bail it's just a return right which is really not committing empty tree this is going to be a hex decode bad tree hash bad tree hash and I guess we'll give the context here as well of bad tree hash I'm silly it's a hex encode of the tree hash so that gives us the commit hash and then we want to be really bold here now as the next thing right which is which thing do we update and that's going to be the thing that head points to it's the thing we're going to update so we could print the commit hash here but then that wouldn't move us forward what we really want to do is make head point at the newest thing at the new commit that we made and you don't actually want to update head itself because head might be a ref and if head isn't a ref then it's a little unclear what to update because it's pointing at a specific commit so trying to commit over that can't change that commit so I 
So I think what we're going to do here is actually require that it is a ref, so we're going to bail with "refusing to commit onto detached HEAD", which is what git calls this state — it's what you get if you check out a particular commit rather than a branch or some named head, and you can't update a commit in place. "HEAD is a symbolic ref, and in the original git it was a symlink" — oh, interesting. So that means that here we're going to call this head_ref, and then we're actually going to put this in the else branch so that we actually get that value out. And then this can now be head_ref — this no longer needs to be an if let — and this can now be the parent_hash, because crucially we can now do the write down here. And this is where it's going to be really interesting to see whether we got this right or not, because this might just completely corrupt my entire repository — it's recoverable, though. So we want to write the commit hash into that file. "Any reason to use with_context instead of context(format!(…))?" Yeah — if you do context(format!(…)), then even if there's no error, you're still going to allocate a String and then immediately throw it away. If you do with_context, the string will only be allocated if there actually is an error. That's why I do those differently. I forget what git commit prints — where's the last place I ran git commit? I ran it up here somewhere, right before I did the last push. Aha — oh, it just prints this. I'm not going to try to print that summary; that seems complicated. So let's just println! the hash, shall we: "HEAD is now at", then maybe the commit hash — good catch, the commit hash needs to be hex-encoded. So in theory this will let me commit what's here. Actually, I don't want to do this yet: I'm going to commit this first — "implement commit" — and then I'm going to do something like cargo run commit with "empty commit". "No such file or directory", error reading head reference target refs/heads/master… because there is a newline!
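The eager-versus-lazy allocation point about anyhow's `context` vs `with_context` has a crate-free analogue in the standard library's `Option::ok_or` vs `ok_or_else`, which makes exactly the same trade-off. A small sketch (the helper names are mine, for illustration only):

```rust
use std::collections::HashMap;

// Eager: format! runs on EVERY call, even when the key exists,
// allocating a String that is immediately thrown away on success.
fn lookup_eager(map: &HashMap<&str, i32>, key: &str) -> Result<i32, String> {
    map.get(key).copied().ok_or(format!("no value for key {key}"))
}

// Lazy: the closure only runs (and the String is only allocated)
// when the lookup actually fails — like anyhow's with_context.
fn lookup_lazy(map: &HashMap<&str, i32>, key: &str) -> Result<i32, String> {
    map.get(key).copied().ok_or_else(|| format!("no value for key {key}"))
}
```

Both return the same values; they differ only in when the error message is built, which is why the lazy form is preferred on hot paths.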
So head_ref is head_ref.trim(). And I guess write_commit here can just take a &str, and this can be a &str, and this can be a &str; this can be the message, and this can be this. Alright, let's try that again. Okay: git show, "empty commit"; git log, "empty commit"; git show -p, "empty commit" — we have a commit, our commit works! What if I, like… what if I echo hello into world, and then git add world (because we don't have git add yet), and then our cargo run commit with "added world"? git log — that is 079, "added world"; git show — "added world", and there's now a world file. And if I reset --hard back, then there's no more world. git commit — nice. "Just missing author and timestamp." Sure, why not — commit-tree. So this one's annoying, because really you need to parse git config and such, but: email, name, and timestamp — here's what we'll do. if let (Some(name), Some(email)) = … and we'll grab those from std::env. What am I doing — env::var_os of EMAIL, I guess. I'll do it this way; it's not perfect, but at least it'll do something kind of interesting, right? Then name.into_string() with context… it doesn't technically require it — in fact, maybe the thing to do here is a map: we could map name through into_string if we really wanted to, so we would map — I don't really want to write that code. Then name and email, and the write line here becomes name, email. And you know, the real git allows you to pass in what name and email to use for the author, but the committer will always be set to your credentials. Oh, into_string returns the OsString in its error — that's fine. anyhow, anyhow — whoops, ugh — email, email, email. And we can just ignore this and ignore this, and this needs a bracket — I guess this is really a map_err. Okay, and then the time: this is just a Unix time, I think — at least that's certainly what it looks like. SystemTime — so time is going to be SystemTime::now() minus, um, the Unix epoch, and this is probably UNIX_EPOCH. Yeah, this is gonna be UTC, right? So we'll do — oh, isn't there a trick for this?
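The identity lookup sketched above — environment variables instead of parsing git config — could look like this. The NAME/EMAIL variable names are my guess at what was used on stream (only EMAIL is mentioned explicitly), and the getter parameter exists purely so the sketch is testable; on stream this was `std::env::var_os` called directly:

```rust
use std::ffi::OsString;

/// Read the author identity from environment variables, the way the
/// stream does, rather than parsing .git/config or ~/.gitconfig.
/// Returns None if either variable is unset or not valid UTF-8.
fn author_identity(get: impl Fn(&str) -> Option<OsString>) -> Option<(String, String)> {
    // into_string() returns Err(OsString) for non-UTF-8 values;
    // .ok()? collapses that case into "no identity".
    let name = get("NAME")?.into_string().ok()?;
    let email = get("EMAIL")?.into_string().ok()?;
    Some((name, email))
}
```

Called as `author_identity(|k| std::env::var_os(k))` it behaves like the in-stream version.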
There is: I can just do .duration_since — nope, I need to do that here. That's what I want, so that's time — but actually that has to be time.as_secs(), I think. And duration_since returns a Result, because "current system time is before Unix epoch" — I think that's gonna be okay, I don't think we're gonna run into that problem. And then it's a good question what this returns — like, will this actually give me UTC? If I go look… I'm gonna guess this is basically just — yeah, it's the duration since the epoch, but it doesn't actually tell me what the current time zone is. That's fine; realistically we'd use something fancier here that actually handled time zones, but for now I'm gonna pretend that this should be UTC. Ooh — is the offset encoded as "UTC", or "+0000", or "Z"? Let's find out; this is gonna be interesting. So if I now do cargo run commit "added world" and git show -p: January 1st, 1970. Okay, that didn't work — or at least I didn't do the right thing. So let's do "+0000" — yep, it just does +0000. git add, cargo run, git show: 14:01 +0000 — that's correct, because I'm at +1. Okay, great: so now we have something that uses my name and email. So now if I reset HEAD, and then — to tie this all together — set the name to "Inspector Gadget" and the email to inspector@gadget.biz, then it is committed by Inspector Gadget. Nice. And then what I really want is… I see, this has now ended up a little bit weird, so we'll go back. Okay, now we're at a good place, and we can git add this and run it — I still want it with the Inspector Gadget bit — "added support for setting name, actual time"… "set name, email, actual time". And it makes me happy that the commit that lets us set that is written by Inspector Gadget. Amazing. git push — I mean, this challenge will fail at the end, because we haven't implemented git clone. Okay, so: clone the repository.
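The timestamp experiment above lands on seconds-since-epoch plus a hard-coded "+0000" offset, which is the shape git uses in author/committer lines. As a sketch (the function name is mine; like the stream, it hard-codes +0000 rather than detecting the local time zone):

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Build the "<seconds-since-epoch> <tz-offset>" suffix for a git
/// author/committer line, pretending the local zone is UTC.
fn commit_timestamp() -> String {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        // duration_since returns Err only if the clock predates 1970.
        .expect("current system time is before Unix epoch")
        .as_secs();
    format!("{secs} +0000")
}
```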
Given that this is supposed to be the hardest exercise, and way harder than the things we've done so far, I'm not going to attempt it four and a half hours into the stream. So this might be a thing we do in a follow-up, or it's a good exercise for you to do as a follow-up to this. What I will do is — let me see if I can't make a new repository… nope, I'll push this out so that it's available; the repository is public, that's fine. git remote add github, git push github master — now it'll be up here. Amazing. I'll put that in chat, and in the video description as well, so it's easy to find. Okay, I think that's actually where we're going to end it. I don't want to start on git add, because setting up the staging area is its own different thing: the staging area is an in-memory representation of your file system that is different from what's on the actual file system, and it's what you then need to construct your trees from. So doing that is its own whole ordeal — a fun ordeal, just not one that I want to do right now. "Look, you got it on this little indicator" — that's funny. So what do I want to leave this with? I think trying to implement git add would be super interesting — not something I'm going to do right now, but a fun follow-up exercise. Same thing with actually going through and implementing git clone: we might do it in another video, or, if I don't end up doing that, then it's a good chance for you to do it yourselves, as I mentioned. So here — I'll send the link, and again, it's in the video description — there's a referral link where, if you sign up through it, you get access to all of the challenges for something like seven days. So you could in theory clone my repo and then just try to implement cloning a repository yourself, and see if you can do it in a week. Or, you know, you could also actually pay for this — that would certainly make me happy, through the referral — but whatever floats your boat.
And then, as I also mentioned, there is the codecrafters repo on GitHub — I'll put the link in chat again, and it's also in the video description — which has all of the exercises in their raw form. It doesn't have all the infrastructure for running the tests and getting the test framework and stuff, but it does have at least the raw bits of the exercises, so you can go through it yourself if you can't pay for it, and it works beyond the seven days. There are recent commits to it — one hour ago? I think someone is watching the stream and making changes as we go. I want to see what these are — this is the best kind of meta-stream. This is the note that we ran into: the SHA having to be over the uncompressed version, and the SHA being over the content including the header. Nice, that's funny. What else do we have? "Ignore the .git directory — important when creating the entries." Nice, it's the stream talking to itself from the past. I know, right — amazing. "How can I clone the repo if I haven't implemented clone?" It's an infinitely deep problem; I guess you might have to resort to that old git command, and then one day you can do it yourself. Okay, I think it's time for me to go eat some food. Thank you all for watching — I hope that was interesting. If you found it super cool, then maybe we'll do more of these; if you were like "okay, I'm done with challenges now", then I'll find something else to do. As I mentioned, you can join the Discord to get announcements for new streams and stuff ahead of time, if you're not on Twitter or Mastodon or LinkedIn, which are the other places I post this. It's discord.jonhoo.eu, and that gets you to the Discord invite link; there's an announcement channel there where new videos are announced. And if you sponsor me — ideally on GitHub Sponsors, because they take the lowest fees, or on YouTube — you also get access to a couple of the channels there, potentially being able to suggest ideas for new streams, or even just a general community chat. Okay,
thank you, and I'll see you later!