Hello folks, welcome back. Or I should say welcome for the first time, because this is a new kind of stream. It's one that people have asked me for for a while. It's basically taking a look at core crates in the Rust ecosystem and looking at how they work, what's happening under the surface. Not necessarily reading through all the code. Maybe that could be interesting too as a separate thing, but this is more "I want to understand this crate better". So after watching this, you're not gonna immediately know how to make changes to Serde's proc macro code. That's not the goal. The goal is for you to have a better understanding of what actually goes on when you're using Serde, whether that is deriving Serialize and Deserialize, implementing them yourself, or implementing your own data formats. Basically: what's the mental model you should use? What are the basic constructs in use by the library? And maybe some of the design decisions and subtleties that are useful to know.

Serde seemed like a great place to start because so many people use Serde every day and don't really know how it works, which is not to their detriment. In some sense it's a compliment to Serde that you don't have to know exactly how it works in order to use it. But it makes it very worthwhile to dig into a little bit.

We are also in a position where, because of a scheduling snafu, I have a hard stop in a little under two hours, which means I'm gonna try to be efficient, which is not always something I'm good at, but we'll see how well that goes. Speaking of scheduling, I've also started committing to a regular streaming schedule, which people have asked for for a while. So every fourth Friday, so that is today and four Fridays from now and so on, I'm gonna do a stream at this time. I'm gonna stick to the UTC time.
So if there's daylight saving and stuff, then the UTC time is gonna be the correct one. The UTC time is gonna be 6 p.m., and my plan is to stick to that for the foreseeable future. Maybe it ends up changing if I move to Europe and such, but at least this is the starting plan. I put this on Mastodon and Cohost as well, and there's a link to a calendar that you can subscribe to so that you can see when streams come up and add them to your calendar. So at least that's the start.

And without further ado, I think we should just dive into Serde. First of all, how do you pronounce Serde? Honestly, I'm not entirely sure. I don't know whether it says anywhere. I know this is not the most important thing, I'm fully aware, but I feel like it says somewhere. Serde is short for serialization and deserialization, so "ser-de", but I've also heard it pronounced a couple of other ways, which is also weird. I think it's just "ser-dee", but who knows at this point.

And the goal of Serde is not actually to provide any particular serialization or deserialization format. Instead, its goal is to provide the infrastructure for doing serialization and deserialization, specifically of Rust data structures. And it tries to do so in a way where, in the general case, it gives you fairly high efficiency and performance. It doesn't always give you a zero-cost abstraction between the data you have and the data format you're writing to, but the general goal is for it to be a relatively thin glue layer that hopefully mostly gets optimized out.

Serde has a bunch of concepts that are really useful to get into your head when you start going beyond the very basic derive. And the start for that is the Serde data model. There's a page on this in the Serde docs. It's pretty good. But basically, and in fact, maybe I should draw this. I haven't drawn for a while.
That seems maybe appropriate. Let's go with a nice orange serialization color here. So Serde's model consists of three parts. We're just gonna make them arbitrary shapes. One is the data format. One is, let's call it, the data type. And one is the Serde data model. In terms of the actual types in Serde, the data type is the Rust data type that's in use. So that's gonna be blue, because blue is types, right? We all agree that blue things are types. This is where you have Serialize and Deserialize. And I apologize for my handwriting. Over here, on the data format side, is where you have Serializer, notice the R, and Deserializer.

The goal of the Serde data model is basically to provide the mapping between these. In Serializer and Deserializer, the only thing those traits give you information about is stuff from the data model. So the data model provides not just a layer of abstraction but encapsulation, so to speak, so that there's a separation of concerns between the data format and the data type. Each data format only needs to know about the Serde data model, and each data type also only needs to know about the Serde data model. And then the little bit in between takes care of mapping one to the other. We'll talk more about the concrete specifics.

You might also have heard of the Visitor type. We'll look at that a little bit as well. The Visitor lives over here. Visitor is a little weird because it's sort of a part of both sides, in that the deserializers will make use of a visitor, but the visitor is implemented by the data type. It's owned by the data type side of things. It's essentially part of the interface, which Serialize and Deserialize are too. But that is the abstraction boundary that you get here. And so you might wonder, well, what is the Serde data model?
And so that's what we're gonna look at next. The Serde data model is mostly a set of types. They're not arbitrarily chosen; they're chosen to represent the kinds of primitives that we usually have for data. So these are the numeric types, the string type, byte arrays, options, units, and structs of various kinds: unit structs, newtype structs, and just regular data structs. And enums and their constituent components, so things like unit variants, newtype variants, and just straight-up enum variants. Sequences, where whether it's a Vec or a VecDeque doesn't really matter to Serde; it's just a sequence of elements that has some order. And tuples and maps. So these are all the things that are in the Serde data model.

The goal for any given data type, for Serialize, is to take the data that's stored in the Rust data type and turn it into one of these. So if you internally have a field that is, I don't know, a NonZeroUsize, then you're gonna emit that as a u64 into the Serde data model. When Serde asks you, how do you get serialized, you answer with u64 in the data model. And Deserialize is the other way, where you say: my type can be constructed from the following types in the data model. So if your type contains, again, let's say a NonZeroU64, you might say I can be created from the following unsigned integer types from the Serde data model. And there are other variants of this, right? For example, a struct you could construct from anything that is a map, as long as the keys of the map correspond to the names of the fields in the struct. So you can start to get a sense for how that mapping works on the data-type-to-data-model side. On the data format side, your goal with the data model is to figure out how the bytes of the serialized format map into the data model.
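The data-type side of that mapping can be sketched in plain Rust without pulling in Serde at all. Everything below (ToySerializer, ToySerialize, UserId, Nickname, Recorder) is made up for illustration and is far smaller than Serde's real traits; the only point is that a type with a Rust-specific invariant still describes itself using a data-model type such as u64.

```rust
use std::num::NonZeroU64;

// Toy stand-in for the format side: one method per data-model type.
// (Hypothetical and tiny; this is not Serde's Serializer trait.)
trait ToySerializer {
    fn serialize_u64(&mut self, v: u64);
    fn serialize_str(&mut self, v: &str);
}

// Toy stand-in for the data-type side.
trait ToySerialize {
    fn serialize<S: ToySerializer>(&self, serializer: &mut S);
}

// A Rust type whose invariant the data model knows nothing about.
struct UserId(NonZeroU64);

impl ToySerialize for UserId {
    fn serialize<S: ToySerializer>(&self, serializer: &mut S) {
        // "Non-zero" is a Rust-side detail; the data model just sees a u64.
        serializer.serialize_u64(self.0.get());
    }
}

struct Nickname(String);

impl ToySerialize for Nickname {
    fn serialize<S: ToySerializer>(&self, serializer: &mut S) {
        serializer.serialize_str(&self.0);
    }
}

// A pretend "format" that only records which data-model calls it received.
#[derive(Default)]
struct Recorder {
    calls: Vec<String>,
}

impl ToySerializer for Recorder {
    fn serialize_u64(&mut self, v: u64) {
        self.calls.push(format!("u64({})", v));
    }
    fn serialize_str(&mut self, v: &str) {
        self.calls.push(format!("str({})", v));
    }
}

// Serialize one value of each type and return the recorded calls.
fn recorded_calls(id: u64, nick: &str) -> Option<Vec<String>> {
    let id = UserId(NonZeroU64::new(id)?); // None if id == 0
    let mut recorder = Recorder::default();
    id.serialize(&mut recorder);
    Nickname(nick.to_owned()).serialize(&mut recorder);
    Some(recorder.calls)
}
```

The recorder never learns that UserId wraps a NonZeroU64. It only hears "here is a u64", which is the encapsulation boundary described above.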
So for Serializer, the goal is to take types from the data model and turn them into the bytes of the serialized protocol, right? You're gonna be told, you know, here is a sequence, and then you're gonna be told about the elements of the sequence one after the other. And the goal of the Serializer implementation is to turn those into whatever the bytes for a sequence with things of that particular type are, however those are represented in the data format. And for the Deserializer, going the other way, your goal is to take the stuff that you're getting out of the underlying data format and turn it into the Serde data model.

When I say turn it into, it's not like you're turning it into a value in memory. Instead, the way the mapping works is that when the data format, the Deserializer in particular, encounters something of a given type, it calls a method through the Serde data model. Let's say deserialize string, technically it's a visit method, but just for clarity here: I found a string, as described in the Serde data model. And that ultimately ends up calling the string-handling method on the Deserialize side of the data type. But these two are disconnected. So you can take any data format and any data type and say, connect these two, right? Use this data format to turn something into this type. And that's one of the ways in which Serde becomes really, really versatile: you can mix and match these. This is why you can put derive Deserialize on one of your Rust types and then deserialize it from both TOML and JSON. When you're using the JSON deserializer, you pass in your type, and when the deserializer runs, it's gonna call the deserialize methods on your derived Deserialize type.
Or you can use the TOML deserializer, again pass in your type, and when it calls the deserialize, or rather visit, methods, those will end up going to your data type again. And because it all goes through the Serde data model, if there's a string in the TOML or a string in the JSON, they will both end up calling the same method through the data model, which ends up calling your method on the data type.

Okay, does that make sense so far? Just in terms of the mental model; we're gonna dig into what the code actually looks like as well. But before we go forward, I think it's useful to make sure that this separation makes sense.

"Does this kind of model have performance implications? Might a straight JSON parser be far faster or do fewer allocations than Serde?" It's possible. Serde doesn't claim to be a zero-cost abstraction. That said, the deserializers don't generally allocate unless it's specifically necessary. If you give the JSON deserializer a string, a &str, then when the deserializer walks that &str and encounters something that it discovers is a string in JSON, it can just give a slice of the input into the data model to say, I encountered a string, here's a reference to it. So it doesn't need to do any allocation; it only does the mapping to the data model. That's not always possible though, right? The best example is probably when you have escaped values in the string. Imagine someone using backslash-something in a JSON string. Now you can't just give a reference to that string through the data model, because you actually need to decode the escapes. That means you have to allocate a new String, you don't really have an option, and then you pass that owned String in.
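The borrow-when-you-can, allocate-when-you-must idea can be sketched with std's Cow. This is not how any particular JSON deserializer is written; decode_string_body is a hypothetical helper that handles only a tiny subset of escapes, just to show the two paths.

```rust
use std::borrow::Cow;

/// Decode the body of a string literal (without the surrounding quotes).
/// Hypothetical helper: only `\n` and "backslash + any other char" are
/// handled, which is far less than a real JSON parser would do.
fn decode_string_body(raw: &str) -> Cow<'_, str> {
    if !raw.contains('\\') {
        // No escapes: hand out a slice of the input, no allocation.
        return Cow::Borrowed(raw);
    }
    // Escapes present: the decoded text differs from the input bytes,
    // so a new String has to be built.
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some(other) => out.push(other), // covers \\ and \"
            None => out.push('\\'),         // trailing backslash kept as-is
        }
    }
    Cow::Owned(out)
}
```

Calling decode_string_body("hello") returns a borrowed slice of the input, while an input containing a backslash escape comes back as an owned String.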
But even so, in both of these cases, this is the same kind of thing that you would need to do in a manual implementation. And so it's really just generics going all the way through.

Okay, so I think we have a rough sense of what's going on here. What we're gonna do now is dive in at the complete deep end on the other side and do, let's see, cargo new --lib, no, fine, let's do a --bin. What are we gonna call this? We're gonna call it "serde-what-now". Okay, and we're gonna go in here and say serde = "1" in Cargo.toml, and we're gonna pull in the derive feature so that we can derive Deserialize. Then we go to src/main.rs, use those, and derive Serialize and Deserialize on what type? struct Foo, which has an a, which is a u64, and a b, which is a String. And then we're gonna go ahead and just do cargo expand, to expand all the macros. Let's see what it generates.

Okay, so obviously, this is a proc macro, so it generates a lot of stuff. Let's see if I can put that in an expanded.rs. Okay, so there's a lot here. You see there's a bunch of macro tricks that you have to play, like this anonymous const here, which is used to basically generate a new scope where we know we only have impls and where we can control the namespace a little bit better. This is not generally stuff you need to think about when writing your own Serde stuff. This is just macro magic, these lines here. Same with the automatically_derived attribute here, which has to do with whether it's considered unused code for the purposes of warnings and such. So you can ignore that as well.

Where it gets interesting is down here. Okay, so we have impl Serialize for Foo. Before we move on, let's go to Serde's Serialize. So the Serialize trait, again, this is on the data type side of things, not the data format.
The Serialize trait just says: how do you serialize yourself? That's all it does. It's given access to a serializer, but it's generic over that serializer. And the job of Serialize, as we talked about before, is to turn self into the data model of Serde. The way you do that is by calling the appropriate method on the Serializer. Serializer has a bunch of methods that correspond to the different components of the data model. So the goal for your implementation of Serialize is to call the appropriate serialize method on the serializer that you're passed. Remember, the serializer here could be, for example, a JSON serializer, it could be a TOML serializer. You don't know. All you know is that encapsulation boundary. All you get to know is the data model.

So if you contain a bool, you would call serialize_bool. If you're a struct, realistically what you're gonna do is call the serialize_struct method. To call it, you pass in the name of the struct, because for some protocols you actually wanna encode the names of the types as well, and the length, as in the number of fields in the struct. And interestingly, when you call serialize_struct, you don't pass in anything else. You get back one of these SerializeStruct things. This is a common pattern in Serde: for more complex structures, you have sub-serializers for things like sequences and tuples and enums and maps and structs. So here you see, and we're still on Serializer here, there are associated types for each of these. In particular, there's an associated type SerializeStruct, which implements the SerializeStruct trait, which has the methods serialize_field and end. It also has a skip_field. This is so that you can have a type where you explicitly communicate that you're skipping a field during serialization, which sometimes matters.
There are some protocols where you're expected to produce a complete description of the value, so if you choose not to include a field, you have to say so in the serialized format. So the idea is that if you're implementing Serialize and you're serializing a struct, you're gonna call the serialize_struct method on the serializer you're given, passing the name of the struct and the number of fields. On the thing you get back, you're gonna call serialize_field repeatedly, once for each field you wanna serialize, and then call end when you have finished serializing the struct.

Okay, so if we go back to the generated code, hopefully that's what we should see, right? All of these underscores and stuff are, again, just macro bits; we don't need to worry too much about the complexities of them. You see that there's a return type, which is a Result that can either be Ok or Err. This is also dictated by the serializer. Some serializers will actually return the serialized data in Ok. Usually though, if you create a JSON serializer or something, it actually just wraps an io::Write. So the Ok value is empty; the output gets written to a buffer somewhere or directly to disk. This is, again, to avoid the overhead of first having to serialize into a byte string or something and then putting that on disk. And the error covers errors in the serializer, but can also be errors in the serialization into the data model. We'll see that as we go a little bit further down.

Okay, so you see here we call serialize_struct. That's what we expected. We pass in "Foo", which is the name of the struct. Because we put derive Serialize on Foo, the proc macro knows that the name of the struct is Foo and passes that in here. This, again, is some macro magic.
And what this ultimately ends up giving us is the number two, which is indeed the number of fields in Foo. Why it's written out this way, well, macros are complicated. I'm sure there's a good reason, and it's mostly irrelevant to what we're talking about now. And you see that if serialize_struct completes successfully, then we assign the value we get back, which is this sort of constructor, if you will, that we're gonna call serialize_field on, into __serde_state. And if it errors, we return. So this is basically the question mark operator, is what's going on here. I'm guessing it generates it this way so that it works with older versions of Rust as well. That would be my guess.

Okay, and then it serializes the field a by calling serialize_field, giving "a" as the field name and a reference to the value of a as the second argument. If we go back and look at serialize_field, you see that the second argument is a reference to the value of the field, which is also expected to implement Serialize. So this is how it nests. When you're serializing a struct, what you're actually doing is serializing each field. You first tell the serializer, here comes a struct, it has this name and this number of fields. And then for each field, you tell the serializer, this field has this value. And when you give the value, you give a thing that itself implements Serialize. So then the serializer is gonna call the serialize method on each of those values and feed those into the data format as well. They also need to map into the data model. If that succeeds we continue, otherwise we return; again, this is just a question mark. We do the same thing for b, and then we call end, and that's the end of serialize. So if we were to write this ourselves, right, instead of deriving Serialize, we could impl Serialize for Foo.
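The shape of such a hand-written implementation can be tried out without the serde crate. The traits below (ToySerialize, ToySerializer, ToySerializeStruct) are local stand-ins that only mirror the structure described here; they are not Serde's real traits, and the Jsonish output format is a made-up approximation of JSON that leans on Rust's Debug quoting for strings.

```rust
use std::convert::Infallible;

// Local stand-ins mirroring the *shape* of Serde's traits (hypothetical).
trait ToySerialize {
    fn serialize<S: ToySerializer>(&self, serializer: S) -> Result<S::Ok, S::Error>;
}
trait ToySerializer: Sized {
    type Ok;
    type Error;
    type SerializeStruct: ToySerializeStruct<Ok = Self::Ok, Error = Self::Error>;
    fn serialize_u64(self, v: u64) -> Result<Self::Ok, Self::Error>;
    fn serialize_str(self, v: &str) -> Result<Self::Ok, Self::Error>;
    fn serialize_struct(self, name: &'static str, len: usize)
        -> Result<Self::SerializeStruct, Self::Error>;
}
trait ToySerializeStruct {
    type Ok;
    type Error;
    fn serialize_field<T: ToySerialize + ?Sized>(&mut self, key: &'static str, value: &T)
        -> Result<(), Self::Error>;
    fn end(self) -> Result<Self::Ok, Self::Error>;
}

impl ToySerialize for u64 {
    fn serialize<S: ToySerializer>(&self, s: S) -> Result<S::Ok, S::Error> {
        s.serialize_u64(*self)
    }
}
impl ToySerialize for String {
    fn serialize<S: ToySerializer>(&self, s: S) -> Result<S::Ok, S::Error> {
        s.serialize_str(self)
    }
}

// The data type: hand-written equivalent of what the derive produces.
struct Foo { a: u64, b: String }

impl ToySerialize for Foo {
    fn serialize<S: ToySerializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        let mut s = serializer.serialize_struct("Foo", 2)?; // name + field count
        s.serialize_field("a", &self.a)?;
        s.serialize_field("b", &self.b)?;
        s.end()
    }
}

// A made-up JSON-like format that appends to a String.
struct Jsonish<'a> { out: &'a mut String }
struct JsonishStruct<'a> { out: &'a mut String, first: bool }

impl<'a> ToySerializer for Jsonish<'a> {
    type Ok = ();
    type Error = Infallible;
    type SerializeStruct = JsonishStruct<'a>;
    fn serialize_u64(self, v: u64) -> Result<(), Infallible> {
        self.out.push_str(&v.to_string());
        Ok(())
    }
    fn serialize_str(self, v: &str) -> Result<(), Infallible> {
        self.out.push_str(&format!("{:?}", v)); // Debug quoting, not real JSON escaping
        Ok(())
    }
    fn serialize_struct(self, _name: &'static str, _len: usize)
        -> Result<JsonishStruct<'a>, Infallible> {
        self.out.push('{');
        Ok(JsonishStruct { out: self.out, first: true })
    }
}
impl<'a> ToySerializeStruct for JsonishStruct<'a> {
    type Ok = ();
    type Error = Infallible;
    fn serialize_field<T: ToySerialize + ?Sized>(&mut self, key: &'static str, value: &T)
        -> Result<(), Infallible> {
        if !self.first { self.out.push(','); }
        self.first = false;
        self.out.push_str(&format!("{:?}:", key));
        // Nesting: the field value serializes itself into the same output.
        value.serialize(Jsonish { out: &mut *self.out })
    }
    fn end(self) -> Result<(), Infallible> {
        self.out.push('}');
        Ok(())
    }
}

fn to_jsonish<T: ToySerialize>(value: &T) -> String {
    let mut out = String::new();
    value.serialize(Jsonish { out: &mut out }).unwrap_or_else(|never| match never {});
    out
}
```

Foo only ever talks to the three trait methods, so swapping Jsonish for another toy format would not require touching the Foo impl at all.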
And all we would do is, you know, let s = serializer.serialize_struct("Foo", 2), question mark, then s.serialize_field("a", &self.a), the same for "b" and &self.b, and then s.end(). This is complaining because we need to import the SerializeStruct trait. And s needs to be mut, that's fine. So now we've replicated the same thing that the derive did; this is all the derived Serialize actually does.

Now, that's not quite fair to say, right? There are other things the derive does, namely it's configurable. If you derive Serialize, you can do things like #[serde(skip)]. So if we now go ahead and comment this out, let's look at how that changes what gets generated. So cargo expand again into expanded.rs. If we go back up to Serialize, you see that in the generated code there's serialize_struct, serialize_field for a, but there's no code for the field that we ended up skipping. I'm curious why this doesn't call skip_field. Remember, there's a skip_field on SerializeStruct? Not sure why it doesn't call that. That seems odd.

"It's not using question mark because question mark produces slower code and larger binary size due to the implicit into conversion being done." Oh, I see, that's fair. Yeah, so remember, the question mark operator in Rust is not quite equivalent to this code. It is slightly different. If this were written with question mark, what it would actually generate is this. It has a conversion call in it (strictly From::from, the other side of into) to allow conversion of error types. This is one of the ways in which question mark makes it nicer to work with error types: the error type doesn't need to exactly match, it just needs to be convertible into the thing you return. But this of course is extra machinery, extra stuff that the compiler has to work through. And in this case it's entirely unnecessary, because we know that the error type is already exactly the right one. And so therefore we do it with this.
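The difference between the question mark operator and the explicit match can be shown with nothing but std. ConfigError and the three parse functions are hypothetical names for illustration; the second function is only roughly what the operator amounts to for Result, not the compiler's literal desugaring.

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
enum ConfigError {
    BadNumber(String),
}

impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e.to_string())
    }
}

// With `?`: the error passes through From::from, so a ParseIntError can
// become a ConfigError without any explicit conversion at the call site.
fn parse_with_question_mark(s: &str) -> Result<u64, ConfigError> {
    let n = s.parse::<u64>()?;
    Ok(n)
}

// Roughly what `?` amounts to here, written out by hand.
fn parse_with_conversion(s: &str) -> Result<u64, ConfigError> {
    let n = match s.parse::<u64>() {
        Ok(v) => v,
        Err(e) => return Err(From::from(e)),
    };
    Ok(n)
}

// The shape used in the derive output: no conversion call at all, so the
// error type must already be exactly the function's error type.
fn parse_without_conversion(s: &str) -> Result<u64, ParseIntError> {
    let n = match s.parse::<u64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(n)
}
```

In the third form there is simply less for the compiler to instantiate and optimize away, which is the motivation quoted above for generated code that gets expanded on every derived type.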
Okay, so that's an example of the kind of attribute you can use, and there are other ones, right? So if we did here, let's say, rename = "x". Then of course we're not gonna skip it. If we now look at what it generates, you see that instead of saying that this field is called b, it says that the field is called x. So in general, for all of these additional attributes that you can put on serialization, most of them are just relatively simple modifications to the serialization code that gets generated. There are some exceptions, right? You can do things like completely switch out how a given field gets serialized. But even that's not that complicated. It just means that in the generated code, instead of using the standard Serialize implementation for b's type, you're gonna call some other code here to do the serialization itself. We're not gonna get too much into that, but just so you're aware. You can also read the serde_derive code, and it's sometimes a little hard to parse because it needs to do a lot of token manipulation. But with cargo expand, it's usually pretty easy to figure out what actually happened under the hood.

Okay, so that's Serialize. We're gonna talk a little bit more about Serialize later, but for now let's move over to Deserialize and see what that looks like. So I've removed the rename, then we go back here, and go to Deserialize. You see for Deserialize too, it generates this macro wrapping stuff that we can mostly ignore. And for the implementation of Deserialize for Foo, you see the structure is similar, right? There's a deserialize method that takes a deserializer and is generic over the Deserializer trait. Now, there are a couple of things that are different here. And the first one is this 'de generic lifetime parameter.
This one is more or less a reference to the input that the deserializer is working over. Imagine something like serde_json. If you call serde_json's from_str or from_slice, so anything where it has a reference to its input, then for efficiency we want to be able to return references into that input. Imagine there's some huge string in there, for example, that we really just want to return a reference to directly rather than having to copy it out into an owned String. Then what is the lifetime of that string reference? That's what 'de is. It is the lifetime of references into the original data for the deserializer, if available. Some deserializers aren't gonna have this, right? If you're reading from disk, for example, then there's no long-lived input to point into, and you're never gonna produce any borrowed references, you're only gonna return owned data. So if you read from disk and you're deserializing JSON and the thing you come across is a string, what you're gonna do is construct a String, because by the time you read more from the file, a reference into the buffer that you read from disk into is no longer gonna be valid. So you're only gonna generate owned values. Okay, so that's where the 'de lifetime comes from. We'll see this come up later as well.

And then in the implementation here, you see we start talking about this visitor thing. And Visitor is, hmm, what's the best way to describe this? The visitor pattern is Serde's way of making it easy to deserialize nested structures. In some sense, you can think of this as the inverse of what we're doing with serialization, where things like serialize_struct return a constructor that you then call serialize_field on and so on. With deserialization it's not quite as easy to do that with associated types, but with visitors, you can do it pretty easily.
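What the 'de lifetime buys can be sketched with a toy visitor that is told whether a string can be borrowed from the input or has to be handed over as owned data. ToyStrVisitor, Text, TextVisitor and the two from_* functions are hypothetical; the method names merely echo Serde's visit_borrowed_str and visit_string, and trimming whitespace stands in for real parsing.

```rust
use std::io::Read;

// Hypothetical visitor: the format reports a string either as a slice that
// lives as long as the input ('de) or as freshly allocated data.
trait ToyStrVisitor<'de> {
    type Value;
    fn visit_borrowed_str(self, v: &'de str) -> Self::Value;
    fn visit_string(self, v: String) -> Self::Value;
}

// A value that may or may not point into the original input.
#[derive(Debug, PartialEq)]
enum Text<'de> {
    Borrowed(&'de str),
    Owned(String),
}

struct TextVisitor;

impl<'de> ToyStrVisitor<'de> for TextVisitor {
    type Value = Text<'de>;
    fn visit_borrowed_str(self, v: &'de str) -> Text<'de> {
        Text::Borrowed(v)
    }
    fn visit_string(self, v: String) -> Text<'de> {
        Text::Owned(v)
    }
}

// Input already sits in memory for 'de: hand out a slice of it.
fn from_slice_input<'de, V: ToyStrVisitor<'de>>(input: &'de str, visitor: V) -> V::Value {
    visitor.visit_borrowed_str(input.trim())
}

// Input is streamed: the local buffer dies when this function returns,
// so only owned data can be passed to the visitor.
fn from_reader_input<'de, R: Read, V: ToyStrVisitor<'de>>(
    mut reader: R,
    visitor: V,
) -> std::io::Result<V::Value> {
    let mut buf = String::new();
    reader.read_to_string(&mut buf)?;
    Ok(visitor.visit_string(buf.trim().to_owned()))
}
```

The same TextVisitor works with both entry points; only the slice-based one can produce a Text::Borrowed.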
So let's see what that actually looks like. In this case, Serde generates this private type, in this case an enum, and a field visitor for it. That's what it's called here; it's gonna be the thing that visits each of the fields of the struct. And it implements Visitor, so let's go look at Visitor. If we go back to the Serde docs, we have Deserialize. It's given a deserializer, that's fine. And Deserializer has all of these, no, that's not what I want, Visitor, here.

Yeah, so the bit you want here is that when deserialize is called on you, what you're given is a deserializer. And what you have to tell the deserializer is: what do you want it to deserialize? What do you expect it to deserialize? You can say deserialize_any, which is where you, as the thing being deserialized into, say: I don't know ahead of time what type this is gonna be. This only works for self-describing formats, so formats where the underlying data actually encodes things like: this is a map, this is the start or end of a field, that kind of stuff. So JSON, for example, supports deserialize_any. There are some formats, especially compact binary formats, where if you told it, just give me whatever comes next, if it's a string, give me a string, if it's a number, give me a number, it can't, because it's just a sequence of bytes with no structural information. And so the deserializer would go, I don't know what comes next, you have to tell me. So there's this separation between self-describing formats and non-self-describing formats. In general, deserialize_any is gonna act like whichever of these other methods, deserialize_bool, deserialize_i8, et cetera, matches, if it can detect the type. And for data formats where it cannot detect the type, it's just gonna refuse and error out, saying: you told me to guess the type and I cannot guess the type.
So in general, when you implement Deserialize, what you want to do is give the deserializer information about what you're expecting to come next. If you know that the next field you're gonna deserialize is, let's say, a usize... usize is a bad example here; u64. If you know the next field is gonna be a u64, you should call deserialize_u64 and tell the deserializer, the next thing I want from you is a u64, because that way, even if the data format is not self-describing, it can still decode and give you what you need.

Now, what you'll see is that when you call any of these deserialize methods, you pass in a visitor. And this is where the mapping is a little odd, right? You tell the deserializer, what is coming next in the data format is a u16, or, to take a slightly more complex example, a sequence. And then you give it a visitor, and the visitor is what the deserializer is gonna call methods on when it does in fact find that sequence. So you tell it: the next thing you find is a sequence, and when you find that sequence, use this visitor to explore it. You see all of these consume self. So you end up basically passing control back and forth: the implementation of Deserialize calls into the deserializer, which consumes self, so that's now where execution is happening. It's then gonna find the sequence and call a method on the visitor, and the visitor is then gonna call the deserializer again, and we go back and forth. We'll see how that works when we start looking at the code.

And so Visitor has visit_bool, visit_i8, and so on. These are roughly the counterparts of the deserialize methods, one per type in the data model, right? So in the deserializer, when it finds the thing you asked it for, it will call the appropriate visit method on the visitor that it was passed.
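The hint-then-visit handshake, and why deserialize_any only works for self-describing formats, can be sketched with two toy formats. All names here (ToyVisitor, ToyDeserializer, ToyError, TextFormat, RawFormat, U64Visitor) are invented for illustration and are much smaller than Serde's real traits.

```rust
#[derive(Debug, PartialEq)]
enum ToyError {
    NotSelfDescribing,
    InvalidType { expected: &'static str },
    Syntax,
}

// Hypothetical visitor: anything it does not override is a type error.
trait ToyVisitor: Sized {
    type Value;
    fn expecting(&self) -> &'static str;
    fn visit_bool(self, _v: bool) -> Result<Self::Value, ToyError> {
        Err(ToyError::InvalidType { expected: self.expecting() })
    }
    fn visit_u64(self, _v: u64) -> Result<Self::Value, ToyError> {
        Err(ToyError::InvalidType { expected: self.expecting() })
    }
}

// Hypothetical deserializer: a "guess" entry point and a typed hint.
trait ToyDeserializer: Sized {
    fn deserialize_any<V: ToyVisitor>(self, visitor: V) -> Result<V::Value, ToyError>;
    fn deserialize_u64<V: ToyVisitor>(self, visitor: V) -> Result<V::Value, ToyError>;
}

// Self-describing: the text itself reveals whether it is a bool or a number.
struct TextFormat<'a>(&'a str);

impl<'a> ToyDeserializer for TextFormat<'a> {
    fn deserialize_any<V: ToyVisitor>(self, visitor: V) -> Result<V::Value, ToyError> {
        let text = self.0;
        match text {
            "true" => visitor.visit_bool(true),
            "false" => visitor.visit_bool(false),
            _ => self.deserialize_u64(visitor),
        }
    }
    fn deserialize_u64<V: ToyVisitor>(self, visitor: V) -> Result<V::Value, ToyError> {
        let n = self.0.parse::<u64>().map_err(|_| ToyError::Syntax)?;
        visitor.visit_u64(n)
    }
}

// Not self-describing: eight raw bytes that could mean anything.
struct RawFormat([u8; 8]);

impl ToyDeserializer for RawFormat {
    fn deserialize_any<V: ToyVisitor>(self, _visitor: V) -> Result<V::Value, ToyError> {
        Err(ToyError::NotSelfDescribing) // cannot guess without a hint
    }
    fn deserialize_u64<V: ToyVisitor>(self, visitor: V) -> Result<V::Value, ToyError> {
        visitor.visit_u64(u64::from_le_bytes(self.0))
    }
}

struct U64Visitor;

impl ToyVisitor for U64Visitor {
    type Value = u64;
    fn expecting(&self) -> &'static str {
        "an unsigned integer"
    }
    fn visit_u64(self, v: u64) -> Result<u64, ToyError> {
        Ok(v)
    }
}

// The data-type side: give a precise hint, or ask the format to guess.
fn read_u64<D: ToyDeserializer>(d: D) -> Result<u64, ToyError> {
    d.deserialize_u64(U64Visitor)
}
fn guess_u64<D: ToyDeserializer>(d: D) -> Result<u64, ToyError> {
    d.deserialize_any(U64Visitor)
}
```

With the typed hint both formats succeed; with the guess, only the self-describing one does, which is the argument for always telling the deserializer what you expect.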
So again, if you tell the deserializer, let's say, deserialize_bytes, then the data format is presumably gonna call visit_bytes in response.

Okay, so with that in mind, let's see what it actually generated for us. The field visitor here is gonna be a visitor for the fields of the struct, right? When you encode a struct, usually you wanna emit both the field names and the values. Think of something like JSON. If you had this Foo struct that we have and serialized it as JSON, the way it would look is the string "a", colon, and some number, then the string "b", colon, and some string. The field visitor is gonna be the visitor for the keys in the underlying data format. And then presumably there's gonna be a value visitor further down. You see that the value the field visitor produces is the field type, which in this case is an enum with field0 and field1, corresponding to a and b.

Now, ignore here is interesting. This comes from the fact that Serde by default, let me dig up the docs to make sure I'm not lying to you, Serde by default will allow unknown fields. There are a couple of reasons for this. The primary one is backwards compatibility. Imagine that you're deserializing JSON or something, and the underlying JSON files that you're parsing change, like someone adds a field. Your code shouldn't necessarily break, right? If there's just more data in the JSON, then this is a great way to take just a subset of that information, or to allow the underlying JSON to evolve over time while you're still able to run, because all the stuff you care about is still in there. And you can set #[serde(deny_unknown_fields)] to say, every field must be in my data type. If you ever encounter a field in the data format that I don't have in my data type, that should be an error rather than just be ignored.
And so this is where that ignore comes from: if we encounter a field while deserializing that isn't one of the ones we know about, it's gonna be mapped to this ignore variant and then get ignored. If we passed deny_unknown_fields, this enum variant wouldn't be there, and we would end up erroring instead if we encountered a field that we couldn't map.

On Visitor, there's this expecting method, which is primarily used for generating error messages. So if you've ever done Serde deserialization and seen a message like "expected something, did not get that" or "got something else"... usually you don't necessarily get the "got something else" part, you just get "unexpected input, expected X". That "expected" part comes from here. So imagine that you are visiting a JSON structure, a dictionary, or a map, if you will. Every key has to be a string. (I think there can be numbers too; I think there's only strings and numbers, but I could be lying.) If the data format is in the process of visiting a dictionary and it's reading a key, so you're in this field mode, and then it finds something that isn't a string, then expecting is gonna get called.

And the way this works is actually pretty simple in Rust type terms: there's a default implementation for all of these visit methods. If we look at visit_bool, for example, the default implementation just calls, well, Error::invalid_type. Let's see if we can find the... oh, these aren't great examples, because these actually have specialized implementations. But I thought there was a call to expecting in here. Right, so here for visit_bool, for example, it says Error::invalid_type and passes in Unexpected::Bool. And the unexpected type, when it gets printed... I guess I can go back to using this search.
The Expected type, when it's printed, will use the expecting method of the visitor to say what was expected instead of what you actually got. So the default implementation for the various visit methods is to emit a message saying, I got this, I expected this. And for the "I expected this" part, it's gonna print whatever the expecting method of the visitor writes. So in this case, the expecting for the field visitor is gonna be "field identifier", right? So we're expecting a field, or something that names a struct field. And then we're gonna override the implementation of visit_u64, because, okay, so we're gonna allow two ways to identify a field of the struct. We're gonna allow people to refer to it as the zeroth field or the first field. So if you are visiting a dictionary, Serde basically allows you to say: you can produce the fields where the key is the index of the field. So in this case, you can see field zero: if the value of the key is zero, then it gets mapped to field zero, which we know as a. If it's one, it gets mapped to field one. And it also has a visit_str, which takes "a" or "b", which get mapped to field zero and field one. It also allows visiting bytes. So if it's not a string, but it happens to be bytes, this could be something like: the underlying data format doesn't support strings, it only knows about byte slices. But that's okay. If there's a byte slice that represents the string "a", that's just as good. And so in this case, numbers are okay as long as they're the field indices, and strings and bytes are okay if they map to the names of the fields. So that's the visitor implementation for this field visitor. We'll see how that gets used a little bit further down. And so now we can implement Deserialize for Field. Right, so again, this was just the field visitor.
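The three ways of naming a field described above (index, string, bytes) can be sketched without Serde at all. This is a simplified model, not the derive output: the names Field, field_from_index, field_from_str, field_from_bytes and field_from_str_strict are made up for illustration, and the real generated code does this inside visit_u64, visit_str and visit_bytes on a visitor.

```rust
/// Simplified stand-in for the generated field enum (hypothetical names).
#[derive(Debug, PartialEq)]
enum Field {
    A,      // "field zero"
    B,      // "field one"
    Ignore, // any field we do not know about
}

/// Models visit_u64: the key is the index of the field.
fn field_from_index(index: u64) -> Field {
    match index {
        0 => Field::A,
        1 => Field::B,
        _ => Field::Ignore,
    }
}

/// Models visit_str: the key is the name of the field.
fn field_from_str(name: &str) -> Field {
    match name {
        "a" => Field::A,
        "b" => Field::B,
        _ => Field::Ignore,
    }
}

/// Models visit_bytes: same names, but for formats that only have byte slices.
fn field_from_bytes(name: &[u8]) -> Field {
    match name {
        b"a" => Field::A,
        b"b" => Field::B,
        _ => Field::Ignore,
    }
}

/// Models the deny_unknown_fields case: there is no Ignore to fall back on,
/// so an unknown name becomes an error instead.
fn field_from_str_strict(name: &str) -> Result<Field, String> {
    match field_from_str(name) {
        Field::Ignore => Err(format!("unknown field `{}`, expected `a` or `b`", name)),
        known => Ok(known),
    }
}
```

In the strict variant the model reuses Field::Ignore internally only to keep the sketch short; as described in the stream, the generated enum simply has no such variant when deny_unknown_fields is set.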
And so we wanna implement Deserialize for Field, because ultimately what we're gonna do is say we wanna deserialize a map, and the visitor is gonna be a visitor that alternates between getting fields and getting values, and the fields therefore need to implement Deserialize and the values need to implement Deserialize. So now what we're doing is we're implementing Deserialize for the field part of that deserialization process. So down here, we do implement Deserialize for Field, and that's gonna call deserialize_identifier on the deserializer. That's interesting. I didn't know it had a deserialize_identifier. Huh, I'm surprised it's special-cased. It might just be to give better error messages. Although maybe there are some data formats where identifiers are encoded differently. That could totally be. Okay, so deserialize_identifier. It's not super important which method it called here, but it's basically telling the deserializer: okay, the next thing you should expect now is something that's an identifier, and you should visit it using the field visitor, which is the thing that we just constructed further above. So that's all you need for the Deserialize here: use the visitor we just had and then call the appropriate deserialize method on the deserializer. And here, you know, you could have called deserialize_any, but if you did, the data format underneath has to be self-describing. By calling deserialize_identifier, even data formats that aren't self-describing will roughly know what thing they're expected to produce. And so they might actually now succeed where previously they would not. Okay, and then we're gonna have a visitor for all of Foo. So ultimately what we're trying to produce with this Deserialize implementation is a Foo. So we need a visitor that produces a Foo, and that's what this thing is.
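The point about hints versus deserialize_any can be illustrated with a toy format. This is only a sketch under assumptions: ToyBinary is a hypothetical format invented here in which a field identifier is one byte and nothing in the bytes says what kind of value comes next, and the two methods only mimic the idea of Serde's methods of the same name rather than their real signatures.

```rust
/// Hypothetical, non-self-describing toy format: raw bytes with no type markers.
struct ToyBinary<'a> {
    input: &'a [u8],
}

impl<'a> ToyBinary<'a> {
    /// No hint given: the bytes alone do not say what the next value is,
    /// so a format like this cannot decide which visit method to call.
    fn deserialize_any(&mut self) -> Result<u64, String> {
        Err("this format is not self-describing; a type hint is required".to_string())
    }

    /// Hint given: "an identifier comes next". In this toy format that means
    /// one byte holding the field index, which would be handed to visit_u64.
    fn deserialize_identifier(&mut self) -> Result<u64, String> {
        let input: &'a [u8] = self.input;
        match input.split_first() {
            Some((first, rest)) => {
                self.input = rest;
                Ok(u64::from(*first))
            }
            None => Err("unexpected end of input".to_string()),
        }
    }
}
```

The same input fails without the hint and succeeds with it, which is the reason given above for the generated code preferring the specific method.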
So here we implement Visitor for this anonymous visitor type, and it produces a value of type Foo. So this is the visitor that's actually gonna produce what we want. The expecting message here is that we expected a struct Foo. Okay, great. And we're gonna implement visit_seq and visit_map. Those are the only two things we're gonna implement. So again, this comes back to: we want to be helpful, in that the data format is allowed to serialize structs as just the field values in order. It doesn't need to capture the field names, right? There are some data formats where, because the fields are static, if you actually have a highly compressed and optimized storage format that's standardized and has a schema, then you don't need to put the string contents of the field names into the data format. The order of the values is all you need. So that's why we want to be able to deserialize from just a sequence that produces the fields in order. Yeah, so gRPC for example. So for visit_seq here, SeqAccess is access to something that is a sequence. This is one of the traits that a data format implements alongside its deserializer. If you have a JSON deserializer, it has a separate impl that just deals with arrays and that implements the SeqAccess trait. And that gives you things like next_element, right? So here for visit_seq, we're first gonna try to access the next element of the sequence that we're currently visiting, and we're gonna try to deserialize that as a u64, which is the type of the first field. If we get it, great, that's the value of field zero. If we don't, it's an error. Yeah, and we can produce error messages here if what we got is actually the end of the iterator, right? So remember, next_element here is basically like an iterator.
So if what we got back here is None, saying the sequence has ended, then we say, well, we expected to get two elements and we didn't. But ultimately, assuming all of this ended up in the Ok branches, field zero here is gonna be the value of field zero, which is the u64, which is the field a. And then for field one, we do the same thing: call next_element, tell it to deserialize as a String. And this could be any type here, right? It's any type that implements Deserialize. And we do the same thing. If the sequence stops, then we say we expected two elements. Ultimately, field one is gonna contain the string value of whatever was deserialized. And now that we're done, now that we have the values for both fields, all we have to do is say: great, this visit_seq succeeded, and it produced a value Foo where a is field zero and b is field one. And I think you can already see where this is going for visit_map. It does the same thing, right? So it, ooh, this is a very long line. Okay, so I guess it's structured a little differently. This starts out by saying field zero and field one are both Options that are set to None. And the reason why it has to do it this way is that if the field name is encoded in the data format, that means the fields could come out of order. So you don't know which you're gonna get first. Therefore, you can't just read one and store it in a variable and then read the other and store it in a variable, because you wouldn't know the types of the two variables, since they could be in either order. And so instead you create variables for both of them, make them Options, and store None in them. And then you keep asking for the next key, so see, this uses MapAccess instead of SeqAccess. You keep asking for the next key, and then you look at: okay, which key did I get out? Which field, right? So this is saying deserialize the next key as a Field.
This is why we have the Deserialize implementation for Field in the first place. And if it's field zero, then if we already have a value for field zero, we say this is not okay, duplicate field. Otherwise, we try to decode the next value from the map as a u64, which is the type we know that field zero has. And now we have a value for field zero. If the key that we got is field one, we do the same thing. We check whether we already have a value for field one. And assuming we don't, we ask for the next value and say deserialize it as a String. Again, this could be any type. And we store it in field one. And any other fields we just ignore. So we say, you know, give us any value, we're just gonna ignore it. And then ultimately down here, we're gonna check that we indeed have a value for field zero and a value for field one. If we don't, we emit a missing field error. And then we produce, again, Foo with a set to field zero and b set to field one. And remember, this is still the visitor, right? This is the Visitor implementation for the type that can visit a Foo value. And so then we need to tie that into a call to the deserializer, and that call is pretty straightforward. That call down here is just saying Deserializer::deserialize_struct, so telling the data format that the next thing it should try to grab from the data stream is a struct. The first argument is just self, right? So this could be written as a method call on the deserializer; those are equivalent. The only reason it's written like this, I think, is that you then don't need to bring the trait into scope. You give the name of the struct, and you give the list of fields. And again, this is because the data format might actually need to know the fields and the order that they're defined in. So we pass that in. This is just a requirement of deserialize_struct. And then we pass in the visitor, which is ultimately the thing that we wrote up here.
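The visit_map logic walked through above can be modeled in plain Rust. This is a sketch under assumptions, not the generated code: the Value enum and the free function visit_map are invented here, keys are plain strings rather than a deserialized Field, and errors are plain strings rather than Serde error values.

```rust
#[derive(Debug, PartialEq)]
struct Foo {
    a: u64,
    b: String,
}

/// Hypothetical stand-in for "whatever the map hands us as the next value".
#[derive(Debug)]
enum Value {
    Num(u64),
    Text(String),
}

/// Models visit_map: fields may arrive in any order, so each one gets an
/// Option slot that starts as None and is filled when its key shows up.
fn visit_map(entries: Vec<(&str, Value)>) -> Result<Foo, String> {
    let mut field0: Option<u64> = None;
    let mut field1: Option<String> = None;

    for (key, value) in entries {
        match key {
            "a" => {
                if field0.is_some() {
                    return Err("duplicate field `a`".to_string());
                }
                match value {
                    Value::Num(n) => field0 = Some(n),
                    other => return Err(format!("invalid type for `a`: {:?}", other)),
                }
            }
            "b" => {
                if field1.is_some() {
                    return Err("duplicate field `b`".to_string());
                }
                match value {
                    Value::Text(s) => field1 = Some(s),
                    other => return Err(format!("invalid type for `b`: {:?}", other)),
                }
            }
            // Unknown key: take the value and throw it away.
            _ => {}
        }
    }

    // Only after the map is exhausted do we check that everything arrived.
    let a = field0.ok_or_else(|| "missing field `a`".to_string())?;
    let b = field1.ok_or_else(|| "missing field `b`".to_string())?;
    Ok(Foo { a, b })
}
```

Calling it with ("b", ...) before ("a", ...) still produces a Foo, which is the reason for the Option slots.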
And so what that's gonna end up doing, right, is it's gonna call the deserializer, and the deserializer's gonna continue walking its input stream, its data format, looking for whatever comes next. If it discovers that the next thing that came in was a map, then it's gonna call visit_map on this visitor, and then the visit_map code runs, which is gonna call back into the deserializer's MapAccess saying: give me the next key, give me the next value, give me the next key, give me the next value, until there are no more keys. And then it will produce the value. If instead the deserializer finds that there's a sequence and not a map, it'll call visit_seq on this visitor. That'll do the same thing. It'll keep calling the SeqAccess part of the deserializer to continually get the next value and ultimately produce a Foo. Okay, so that's the deserialize. Does this make sense? We've now been through both serialize and deserialize. We're gonna talk a little bit more about the subtleties here, but are there questions about the basic structure of the serializer call graph and the deserializer call graph? "Is there an attribute to force the generation of only seq or named deserializer visitors?" You know, that's a good question. I mean, the way that you do it is you just write your own implementation. I don't know if there's an attribute you can set specifically. No, it doesn't look like it actually. I don't think so, except writing your own implementation. Okay, so that was a flurry through deserialize. Now that we have a rough idea of the structure here, let's look at the kind of attributes you can have for the serde derive and look at how they might change both the serialize and the deserialize. We already looked at rename. rename_all is the same, right?
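The call graph just described, where the format decides which visit method to call and the default visit methods report what was expected, can be modeled roughly like this. It is a sketch under assumptions: the Visitor trait, Input and deserialize_struct below are small stand-ins written for this example and only echo the shape of Serde's real traits. To show the default-method fallback, this toy visitor overrides only visit_seq, whereas the derived visitor discussed above overrides both.

```rust
#[derive(Debug, PartialEq)]
struct Foo {
    a: u64,
    b: String,
}

#[derive(Debug)]
enum Value {
    Num(u64),
    Text(String),
}

/// What the toy "data format" found next in its input.
enum Input {
    Seq(Vec<Value>),
    Map(Vec<(String, Value)>),
}

/// Stand-in visitor trait: every visit method defaults to an error that
/// uses expecting() to say what the visitor wanted.
trait Visitor: Sized {
    type Output;
    fn expecting(&self) -> &'static str;
    fn visit_seq(self, _seq: Vec<Value>) -> Result<Self::Output, String> {
        Err(format!("invalid type: sequence, expected {}", self.expecting()))
    }
    fn visit_map(self, _map: Vec<(String, Value)>) -> Result<Self::Output, String> {
        Err(format!("invalid type: map, expected {}", self.expecting()))
    }
}

/// Stand-in for the format side: look at the input, call the matching visit method.
fn deserialize_struct<V: Visitor>(input: Input, visitor: V) -> Result<V::Output, String> {
    match input {
        Input::Seq(items) => visitor.visit_seq(items),
        Input::Map(entries) => visitor.visit_map(entries),
    }
}

struct FooVisitor;

impl Visitor for FooVisitor {
    type Output = Foo;
    fn expecting(&self) -> &'static str {
        "struct Foo"
    }
    /// Fields arrive in declaration order, so we just pull them one by one.
    fn visit_seq(self, seq: Vec<Value>) -> Result<Foo, String> {
        let mut items = seq.into_iter();
        let a = match items.next() {
            Some(Value::Num(n)) => n,
            Some(other) => return Err(format!("invalid type: {:?}, expected u64", other)),
            None => return Err("invalid length 0, expected struct Foo with 2 elements".to_string()),
        };
        let b = match items.next() {
            Some(Value::Text(s)) => s,
            Some(other) => return Err(format!("invalid type: {:?}, expected a string", other)),
            None => return Err("invalid length 1, expected struct Foo with 2 elements".to_string()),
        };
        Ok(Foo { a, b })
    }
}
```

Feeding this visitor a map hits the default visit_map, and the error text ends with the string returned by expecting.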
It's saying, before you put the field name into the deserialize and serialize code, turn things into lowercase, turn things into uppercase, that kind of stuff. Easy enough. deny_unknown_fields we talked about; it only applies to deserialization. And in fact, I mean, we can add this if we wanna see what it looks like. So if I now look at this, most of this code is gonna look basically the same. One difference you'll see is that now up here, when we match on the key in visit_map, there's no longer a sort of ignore arm that calls IgnoredAny on next_value. And further up here, the field visitor, if it now gets the name of a field that it doesn't recognize, just errors rather than producing the ignore variant, because there is no ignore variant in the Field enum. Tag, okay, so this goes into how we do enums. So let's hold that for a second and then we'll look at enums. Tag, tag plus content, untagged, that's fine. Serde bound: this is a way to get the Serde auto-generated code to include additional bounds that might be necessary. This is because by default, when you derive Serde's Deserialize and Serialize, the impl of Deserialize and the impl of Serialize just generate this impl block. But imagine that Foo was generic over some X, right? And Foo is only deserializable when X implements Radical, right? Then Serde has no way to know that this bound is required in order for Deserialize on Foo to work. So the reason this might be necessary is: imagine that our struct Foo<X> has a field of type X, and the impl of Deserialize for X only works where X is Radical. Or let's say there's a Bar, and the implementation for Bar<X> is only there when X is Radical, right? So this might be one reason why you need such a bound: because this impl exists and is constrained. And in your definition of Foo, there's not even a mention of Radical. So the derived implementation wouldn't know that this bound is necessary. So that's where you can use serde bound.
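Why a derive cannot guess such a bound can be shown with ordinary traits. This is a sketch under assumptions: Decode is a made-up stand-in for Deserialize, Radical is a placeholder trait as in the stream, and the impl for Foo is written by hand to show what the bound attribute effectively has to produce.

```rust
/// Made-up stand-in for Deserialize.
trait Decode: Sized {
    fn decode(input: &str) -> Option<Self>;
}

/// Placeholder trait; nothing in Foo's definition mentions it.
trait Radical {}

struct Bar<X>(X);

/// Bar<X> can only be decoded when X is also Radical.
impl<X: Decode + Radical> Decode for Bar<X> {
    fn decode(input: &str) -> Option<Self> {
        X::decode(input).map(Bar)
    }
}

struct Foo<X> {
    x: Bar<X>,
}

// Looking only at Foo's definition, the natural guess is
// `impl<X: Decode> Decode for Foo<X>`. That does not compile, because
// decoding the field needs Bar<X>: Decode, which needs X: Radical.
// The extra bound has to be spelled out by whoever knows about it.
impl<X> Decode for Foo<X>
where
    X: Decode + Radical,
{
    fn decode(input: &str) -> Option<Self> {
        Bar::<X>::decode(input).map(|x| Foo { x })
    }
}

impl Decode for u64 {
    fn decode(input: &str) -> Option<Self> {
        input.trim().parse().ok()
    }
}

impl Radical for u64 {}
```

With the derive, the where clause above is what the text given to the bound attribute is meant to supply.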
And this one, it really just copy-pastes the text you put in there into, basically, here, after the impl Deserialize, and the same for impl Serialize: it just copy-pastes whatever you put in bound there. Default, default is a fun one. Let's look at what default does. So let's do, here, serde default. So what default is saying is: if this field isn't in the input, then, actually, here's what we're gonna do. We're gonna do even better. We're gonna keep expanded, and then we're gonna add serde default and expand that to expanded2, and then we're gonna diff expanded and expanded2. I know this is very narrow, but you'll see that very much of the code that was produced is actually the same. You see there are 170 lines here that are the same. And the place where it differs is down here. Let me expand this a little bit. So this is in the deserialize code. And you see that if, after having extracted the fields, we find that field one is None, then in the old implementation we would error saying invalid length, and in the new one we set the value to be the default. That's the only thing it changes. Oh, that gets hidden by my face here. So it changed from producing an error if field one was None to just putting the default value into that field instead. And default equals a path just means call this function to get the default rather than using the Default trait. Remote, I think we're not really gonna talk about. Transparent is similar to repr(transparent). This is just saying don't try to generate any wrapping code for this type. Like, don't call deserialize_struct with a thing that has only one field and stuff. Just directly call serialize and deserialize on the single inner value. This only works for newtype structs. Serde from and try_from. So this is used if you have proxy types where you wanna say, hey, Serde, in order to, let's see, let me give you an example here. So this is gonna say: Serde, deserialize into a String and then use the From trait to turn that String into a Foo.
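Going back to the default diff for a moment, the one-line change it makes can be modeled like this. A minimal sketch, assuming a String field; the function names finish_strict, finish_with_default and finish_with_path are invented for illustration and the error text is only indicative.

```rust
/// Without default: a field that never arrived is an error.
fn finish_strict(field1: Option<String>) -> Result<String, String> {
    field1.ok_or_else(|| "invalid length 1, expected struct Foo with 2 elements".to_string())
}

/// With default: a field that never arrived falls back to Default::default().
fn finish_with_default(field1: Option<String>) -> Result<String, String> {
    Ok(field1.unwrap_or_default())
}

/// With default = "path": call the named function instead of the Default trait.
fn finish_with_path(field1: Option<String>, make_default: fn() -> String) -> Result<String, String> {
    Ok(field1.unwrap_or_else(make_default))
}
```

Everything before this final step, the slots and the loop that fills them, is the same in both versions, which matches the mostly unchanged diff described above.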
So that way Foo doesn't need to implement Deserialize, only this other type needs to implement Deserialize. And then Serde's gonna take care of the conversion just by calling the From trait. This can be really useful if you have a type that's from some other crate, for example, and that crate doesn't implement Deserialize for its types. It doesn't have a dependency on Serde at all. So you can have your own type that can be deserialized, and then all you need to do is implement a mapping from your type into that crate's type. I don't know if you, yeah, so default on the container is just the same as default on all fields. So that's from and try_from and into, they're all sort of analogous here. Serde crate is just used for, you know, in the generated code here, there are a bunch of places where it refers to, let me go up to the top of this, extern crate serde as _serde. Sometimes you want to use a different crate than serde, or you have serde available but under a different name. And so this just lets you change what this value is. So if we said here, crate equals "foobar". See if it even lets me do that. Am I not? Yeah, it's gonna refuse to do that. This doesn't work because I haven't actually set it up so that foobar works, but maybe if I do, let's see how that works. So if I do foobar with package equals "serde", I don't know if it'll let me do this, bring in the same dependency twice. Yeah, maybe if I do this foobar, there we go. So now it uses foobar as _serde instead of having its own extern crate serde. That's all it changes. It just changes all of the imports; anywhere where this code is talking about serde, it uses the name you specified instead. Those are all the container attributes. So let's then talk about enums. So say this is an enum Foo instead, and it has Bar and Baz. And let's say this has something like this, and then now we expand. Let's see what we get here. So now this is an enum instead. And what gets generated is very similar actually.
You see for serialize, now, instead of serialize_struct, we call serialize_struct_variant. And the main difference between serialize_struct and serialize_struct_variant is that we give both the name of the enum itself and the name of the variant, and the index of the variant in the definition of the enum. Again, that's for formats that don't bother encoding the names of the types into the output and just rely on the sequence. And what we get back from that is again a sort of constructor, this time one that implements the SerializeStructVariant trait rather than the SerializeStruct trait. And it has a serialize_field just like the other one. So very, very similar code. And it calls end. So this should be entirely unsurprising. For deserialize, things are a little different. So for deserialize, when it says field visitor here, what it's actually a visitor of is the variant name. It can visit a u64, which is the index of the variant. So again, variant index here. Field is a little misleading because it's not a struct field at the moment. It can visit a string, in which case it expects the name of the variant, and same with bytes. And then the Deserialize impl is uninteresting. This is deserialize_identifier like before. The visitor for Foo now is where we're gonna see things be a little different. So we're gonna expect to be visiting an enum. And when we do, we want to know from the data format: okay, you're visiting an enum, what is the variant of this enum? So again, the data format gets to dictate how enums are encoded, right? Basically through how it implements the EnumAccess trait, and deserialize_enum, which we'll see later. And so we say, okay, tell me the variant. And if that is field zero, then, and you see here, this code now sort of nests, right? So this is first deserializing which variant am I in?
And then if it discovers that it's in field zero, which is Bar, then it just generates the code for deserializing that variant's type, right? And it can do this because the deserialize code, if we go all the way up here, the deserialize code for this is exactly the same as the deserialize code for just a struct Bar with one u64 field. So that's why, when it generates the Deserialize for Foo, it really just generates the code for a Deserialize for Bar and a Deserialize for Baz, like for each variant separately. And then it generates a sort of outer Deserialize around them that determines which of those it should use. And so that's what we're seeing down here: it first asks for which variant. And depending on the variant, it calls either the derived Deserialize for the first variant, which has visit_u64 and visit_str for the fields. So this is all basically the code we looked at for the old Foo. And then further down here, this is if the variant we got was field one. Then again, this is just the derived deserialize code for the Baz variant, the second variant. And then if we go to the bottom of that, we see this is the end of the Visitor implementation for Foo. Then we call deserialize_enum. Again, instead of deserialize_struct. We give Foo, which is the name of the enum; instead of the list of fields, we give the list of the variants; and then we give the visitor. And that's all there is to it. Now, where the details are is in, sorry for all the scrolling, I know it's gonna be disorienting, when we ask the data format which variant we are in. How it determines this is gonna matter a lot. And this is gonna vary by data format, right? And not just by data format, but even within a data format, you might have multiple ways of encoding an enum. In fact, if we go back to the Serde docs here, you'll see enum representations. So this is a separate page on enum representations that gives, here's an example, an example type.
So they can be externally tagged, in which case the variant is placed outside a representation of the contents of that variant, right? So you separately serialize the contents of it, and then you make that the value of a thing that holds the variant. So that's what Serde refers to as externally tagged. And that's what it uses by default here. Internally tagged is: you serialize the contents of the type, but you add an additional field inside that serialization that tells you which variant you're in. So that would be something here like "type": "Request". And you have adjacently tagged, which is that you encode a sort of tuple of the variant and the contents of that variant. Like here, "t" is the variant name and "c" is the encoding of that variant. And then untagged, which is: you don't even encode which variant it is, you just serialize the contents. And so when you deserialize an untagged enum, you basically just look at what fields you're given, and depending on which fields you're given, you make an assumption about which variant it originally was. We can look at untagged too; there's a lot of interesting complexity in how you deserialize an untagged enum. So let's look at what happens if we try to switch this to an internally tagged thing here. So if we go back here and we say serde tag, the value here is the name of the field inside the representation. We can use "type", that's fine. expanded.rs and expanded-int.rs. So let's see here. Okay, so one of the things that changes in serialize here is that instead of serializing a particular variant, we're actually now just serializing a struct. So basically we're not telling the data format that what we're serializing is an enum. We're just telling it, we're gonna serialize a struct now, and it just happens to have an extra field. So it's not like this is cheating, right?
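The four representations can be put side by side by rendering one variant as JSON text in each style. This is an illustrative sketch only: Repr and render are hypothetical helpers written for this example, they build strings by hand rather than going through Serde, and field values are assumed to be already valid JSON fragments.

```rust
/// Hypothetical selector for the four enum representations.
enum Repr<'a> {
    External,
    Internal { tag: &'a str },
    Adjacent { tag: &'a str, content: &'a str },
    Untagged,
}

/// Render a struct-like variant with at least one field as JSON text.
/// `fields` holds (name, already-encoded JSON value) pairs.
fn render(variant: &str, fields: &[(&str, &str)], repr: Repr<'_>) -> String {
    let body: Vec<String> = fields
        .iter()
        .map(|(key, value)| format!("\"{}\":{}", key, value))
        .collect();
    let body = body.join(",");
    match repr {
        // The variant name wraps the contents from the outside.
        Repr::External => format!("{{\"{}\":{{{}}}}}", variant, body),
        // The variant name is one extra field next to the real fields.
        Repr::Internal { tag } => format!("{{\"{}\":\"{}\",{}}}", tag, variant, body),
        // Variant name and contents sit in two sibling fields.
        Repr::Adjacent { tag, content } => {
            format!("{{\"{}\":\"{}\",\"{}\":{{{}}}}}", tag, variant, content, body)
        }
        // Only the contents; the variant name is not written at all.
        Repr::Untagged => format!("{{{}}}", body),
    }
}
```

For a variant Bar with a single field a set to 1, the four calls give the four shapes described above, with the tag and content names chosen by the caller.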
But it's basically saying that if you use internal tagging, then as far as the data format is concerned, this just isn't an enum anymore. This is just a struct that happens to have one field more. And so that's gonna be reflected in the remainder of the code. So it's gonna be a pretty big diff here. So if I just open expanded-int and we look at the implementation of Serialize, it really just serializes a struct Foo, and then it serializes a field called type where the value is Bar, and then it serializes all the other fields. So this one is actually very straightforward. It's really just serialize and deserialize as if it were a struct with one additional field. So for deserialize, if we go back here, you see we still have a type that represents which variant. So field zero here is Bar, field one is Baz. We implement Deserialize, that's fine. deserialize_any, I'm just gonna say here, there are some private types involved. So the deserialize_any here is telling the, I wonder why this is a deserialize_any and not a deserialize_struct. That's interesting. Yeah, so this is telling the deserializer: okay, deserialize whatever comes next, and we're gonna visit it using this TaggedContentVisitor, which apparently is a thing that's in Serde itself. So let's go look at what that looks like. Serde source, where did it come from? private, de, TaggedContentVisitor. pub use self::content. Oh, it's just a submodule in the same file. private de, pub struct, here we go. "Not public API." Yeah, so this is a public type that's basically hidden from the API, that the generated code gets to use just so that it doesn't have to generate tons and tons of code in every implementation. So what does this visitor do? "Internally tagged enums are only supported in self-describing formats." I wonder why this isn't just considered a struct serialization. I guess it's because you are changing the struct.
So it would be sort of weird to have this be part of your schema, because you're already having your data type be a different definition from what's actually in the format. So maybe that's the rationale here. So, the visitor for tagged content. If it's a sequence, then it expects the first thing to be the tag. If it's a map, then it, oh, interesting. If it's a map, it visits everything in the map and checks: is the thing I just visited the same as the tag field? And if so, then we know that this is the value of the tag. Otherwise, just store it. And the reason you have to store it here is that you're basically buffering, right? If you know that this is an internally tagged enum, then you need to keep parsing out of the data format until you get the thing that tells you which variant it is. And that means that anything that you parse out of the data format before you get to the tag, you have to put somewhere, because you're later gonna have to actually use those values in your deserialization. So that's what this vector is for. This is just keeping around those extra fields, those fields that you passed by while trying to get to the tag. And it looks like it actually walks all the way through. So it doesn't try to stop early when it hits the tag. It actually just walks all of them, so that the tagged content has the tag and a vector of all of the other fields. And so this is one of the reasons why this representation is actually gonna be slightly less performant than the externally tagged variant: because it has to buffer. And this might be the reason why it needs deserialize_any too, well, unclear. Okay, so tagged content, do, do, do, do, do. So the visitor for the thing that tells you whether this is the tag or not the tag is just: if it is the name of the thing that they said they wanted us to use as the tag, then it's the tag, otherwise it's content. That's fine. And then, right, and then there's a deserializer.
So basically, Serde is gonna treat the buffered list of fields we walked past as a data format and implement Deserializer for it. And this makes a lot of sense, right? Because it is a thing that holds data that can be turned into other data types. And so you see down here, you know, the content that we stored up here, the vector, where did it go? This content is a Content::Map that holds the vector of all the key value pairs. And when we implement Deserializer for that, we're just gonna walk the content. And, you know, depending on what type we ended up putting in there, we're gonna call the appropriate visitor method. And one thing that's interesting here is when we deserialize into content, so map next_value here, we're not mapping this into the value that is, let's backtrack here a little bit. Imagine that the enum variant has, you know, one field that's a u64 and one field that's a String. This code doesn't yet know which variant you're in. But it has to capture all of the fields that it walked by until it gets to the tag. And in fact, it just collects all of the ones that are not the tag. But it doesn't know their types yet. So it has to deserialize them, because it has to buffer them for later, but it doesn't know which concrete type to deserialize them into. And therefore, it has to deserialize them just directly into the data model. And that's what this Content thing is. So let's see if I can easily find its definition. Yes, you see TaggedContent here has a content. Let's see if we can find its definition here. Here we go. So the Content enum here is the Serde data model. Like, it's a data type that can hold any value in the data model. And so this is why the format has to be self-describing. As we walk through the fields whose types we don't know yet, we can't give any hints, because we don't know which enum variant we're in. And so we don't know what the type is expected to be.
So we just need to deserialize_any, which means the format needs to be self-describing. And what we grab out of the data format, we store directly into this Content thing, which encodes all of the possible values in the data model. So you see it has all the integer types, it has the floating point types, characters, strings, newtype, seq, and map. And actually I happen to know, if you look a little bit over here, you see: "to buffer the contents of the deserializer when deserializing untagged enums and internally tagged enums". So this is the same trick that's used for untagged, except there you don't even know what you're looking for, you're just gonna grab all of them. And then it's gonna try to deserialize based on this buffered content that is just a direct encoding of the data model. And there is a separate crate called serde-value that has this enum publicly exposed. And I know that there's some work on trying to move that into Serde. I don't know if there's been any progress on that work, in part because it should be rare that you actually need this in your own code, but it is possible that you do. And I think some people just use serde_json's Value for this, even though serde-value is the more appropriate mapping for the data model. Okay, so going all the way back to the code that we had here, the reason we called deserialize_any is that we don't know all of the stuff that's gonna be in the type. We don't know the full definition of this type yet. So we just need to say: visit this, and one of the things that you will visit is gonna be the tag, and anything else we'll tell you later. And then we're gonna match on the tag that the TaggedContentVisitor found. And if it's field zero, remember field zero here is really Bar, then we're in the Bar variant. And so then we have a visitor for Bar. Again, this is the regular derive for the definition of the Bar variant. So it visits the fields.
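The buffering step for internally tagged enums can be sketched as follows. This is a simplified model under assumptions: the Content enum here has only a handful of variants, split_tag is an invented helper, and the tag is taken to be a string, whereas the real visitor discussed above deserializes the tag into the generated variant identifier.

```rust
/// Cut-down model of a "holds any value in the data model" enum.
#[derive(Debug, Clone, PartialEq)]
enum Content {
    Bool(bool),
    U64(u64),
    Str(String),
    Seq(Vec<Content>),
    Map(Vec<(Content, Content)>),
}

/// Walk every entry of a map-shaped input. The entry whose key equals
/// `tag_name` becomes the tag; everything else is buffered, in order,
/// because its concrete type is not known until the variant is.
fn split_tag(
    entries: Vec<(String, Content)>,
    tag_name: &str,
) -> Result<(String, Vec<(String, Content)>), String> {
    let mut tag: Option<String> = None;
    let mut rest = Vec::new();

    for (key, value) in entries {
        if key == tag_name {
            if tag.is_some() {
                return Err(format!("duplicate field `{}`", tag_name));
            }
            match value {
                Content::Str(name) => tag = Some(name),
                other => return Err(format!("invalid tag value: {:?}", other)),
            }
        } else {
            rest.push((key, value));
        }
    }

    // Note that the loop never stops early: it walks the whole map.
    match tag {
        Some(tag) => Ok((tag, rest)),
        None => Err(format!("missing field `{}`", tag_name)),
    }
}
```

The buffered rest is what then gets replayed through the chosen variant's visitor, which is the role of the content deserializer in the walkthrough.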
And the interesting thing is gonna be where we, so this is just the same auto-generated Deserialize for the variants that we looked at before. That's why I'm scrolling past it. But when we call deserialize, what we're gonna do is we're gonna call deserialize_any. This one could probably be deserialize_struct, because at this point we do know the type, but let's ignore that for now. So here we're creating one of these ContentDeserializers that we looked at from Serde itself, right? So we're gonna give it the content of the tagged content. The content here is that vector of the Content enum, which is the direct encoding of the data model. And this implements Deserializer just by calling the appropriate visit method for whatever that stored data model value is, with the visitor that we just had. I know that was a lot to take in at once. So ask me questions about how this was confusing so that I can try to explain it again. This is certainly a tricky deserializer, but that's one of the reasons I wanted to talk about it: if you're implementing Deserialize yourself, these are the kinds of subtleties that sometimes you have to think about if you do get into really weird data types, or in some cases data formats. Although all this is currently data type stuff. For field one, for Baz, it's the same thing, right? This is the derived Deserialize for the definition of Baz. And then at the end we just call deserialize_any on the ContentDeserializer, with the content of the tagged content, with the visitor that we just generated. And so this is one of the reasons why Visitor comes up here, right? The visitor is the real implementation of deserialize. And then this, the implementation of Deserialize, is the connection between the visitor and the particular deserializer that we end up using. Okay, so for adjacently tagged, this is gonna be similar to externally tagged.
It's just that instead of using the value of a field, you look for the value of an adjacent field by some name. So this one's not gonna be that interesting. Untagged is probably gonna be interesting, but I'm aware of the time, so I'm not gonna talk about untagged here. It's basically the same as internally tagged, except that you don't get to look for a tag. So you just end up with that vector of Content, and then you're just gonna try deserializing into each variant. And this means you need to do even more buffering, because you might start to deserialize into one of the variants and then, halfway through walking the content, realize that actually this is not the right variant. And so now you're gonna have to try to deserialize into another variant, but that means you still need to be able to deserialize the parts of the content that you already walked in the previous variant attempt. And so there's a little bit more complexity around there, but that gets very in the weeds. Okay, so that's enums. We've talked about their serialization and their deserialization. We can look at the variant attributes to see if there's anything interesting. Rename, not really. Alias just allows it to be deserialized with multiple different names. It's not too interesting. Skip is not that interesting here. serialize_with and deserialize_with, now that we know how all this code actually works, are also not that interesting, because most of what they do is just change the generated code so that instead of passing in a reference to the underlying type, you have a little wrapper around it that calls this method rather than calling the deserialize method directly on the type of the field. Same with with. Bound we talked about. Borrow and other. Okay, so other is for tagged enums, where you can say: if the tag didn't match, if none of the variants were matched by the data that came out of the data format, use this variant instead.
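The fallback behaviour of other is easy to model by hand. This is only a sketch of the idea, not what the derive generates: Shape, variant_for_tag, and the allow_fallback flag are all made up for illustration, with Unknown playing the role of the variant marked as other.

```rust
#[derive(Debug, PartialEq)]
enum Shape {
    Circle,
    Square,
    // Hypothetical stand-in for the variant carrying the "other" attribute.
    Unknown,
}

// Map a tag string to a variant. With `allow_fallback` set, an unrecognised
// tag selects the fallback variant instead of producing an error.
fn variant_for_tag(tag: &str, allow_fallback: bool) -> Result<Shape, String> {
    match tag {
        "Circle" => Ok(Shape::Circle),
        "Square" => Ok(Shape::Square),
        _ if allow_fallback => Ok(Shape::Unknown),
        unknown => Err(format!("unknown variant `{}`", unknown)),
    }
}
```

Without the fallback, a tag like "Hexagon" is an error; with it, the same tag quietly becomes Shape::Unknown.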
So other is sort of a fallback, a default variant. Borrow is, do I wanna talk about borrow? Borrow, I think, is mainly there to ensure that serde generates the appropriate bounds. So this happens for something like, let's go back to our struct Foo. It had an a that's a u64 and a b that's a String, right? So if we instead wanted to generate this, or perhaps more interestingly, something like this, with a Cow. If we do this, then Serialize is straightforward, because this is just serializing a string. That's totally fine, we know how to do that. But for Deserialize, where this gets tricky is: if the underlying data format gives us a reference to a string, then we want this to continue to use that reference, we want it to turn into Cow::Borrowed. But if it doesn't, if it produces an owned String, then we wanna produce the Cow::Owned version of this. And in either case, the Deserialize will only work if the 'de lifetime outlives the 'a lifetime, and I'll let you mull over that separately. But we need to tell serde that it should associate 'a with 'de, because if we don't, it's gonna generate a Deserialize implementation here where 'a and 'de are unrelated, and that won't work. If we are truly gonna deserialize from a reference with lifetime 'de into this field, which has lifetime 'a, there has to be an association between them, namely 'de has to live for at least as long as 'a does. And so that's where you stick in #[serde(borrow)], like this. And we can look at what happens if we generate this. So Serialize is still the same: it's a serialize_field, it's not terribly interesting. Deserialize: you see here there's an impl which has the lifetimes 'de and 'a, Deserialize<'de> for Foo<'a>. There's no association between the two of them. And when it visits, visit_seq, that's fine. It deserializes a Cow<'a, ...>, that's fine. So all the code here looks basically exactly the same as it used to, and that's fine. But when we do cargo check here,
I would need to write some code that actually uses this impl of Deserialize, but if you ever tried to use this, you would get an error from the compiler saying that 'de doesn't live long enough. And in fact, I mean, I can do this, I suppose. serde_json is version one, and we're gonna do serde_json::from_str, and it's fine that it doesn't actually work. Oh, interesting. Maybe it's just that I missed some special casing here. So that's interesting. I wonder why that works, because it shouldn't. Let me get rid of the stuff in between. Oh, it is 'static, you're right. Let's do, no, that still works. Oh, by default Cow always becomes owned; it only becomes borrowed when used with #[serde(borrow)]. Okay, so Cow is sort of special-cased here. So I could do this instead by saying Cow borrowed. It's not gonna like this, is it? Yeah, there's more trickiness here. So okay, the answer is that Cow is kind of special-cased here, where by default Cow will always turn into an owned String, not a str, and therefore it can assume any lifetime here. It wouldn't work if I did this, for example. How does that work? Oh, because here 'a just gets set to 'de. How can I make this fail so that I can show what it does? I wish I had an example off the top of my head here. Yeah, use any other type; Cow and str here are sort of special-cased. If I just did that, it's not gonna like that, but if I did bool, maybe? Okay, great, so here it's gonna complain at me. And we don't need the drop anymore. So here it complains, saying that Deserialize is not implemented for that type. Fine, fine, fine, fine. We'll derive on a struct Bar with a lifetime 'a, and we're gonna have a PhantomData of 'a, and then this is gonna be a Bar. Why is it okay with me doing this? Now I'm confused. It should not be okay with me doing this, unless, do I have to put borrow on this one?
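The lifetime relationship being chased here can be reproduced without serde. In this sketch FromInput is a hypothetical stand-in for Deserialize<'de>, and Bar simply holds a reference into the input; neither is serde's actual API. The only thing it is meant to show is the bound on the impl.

```rust
// Hypothetical stand-in for Deserialize<'de>: build a value out of
// input that lives for 'de.
trait FromInput<'de>: Sized {
    fn from_input(input: &'de str) -> Self;
}

// A type that wants to keep borrowing from the input.
struct Bar<'a> {
    s: &'a str,
}

// The `'de: 'a` bound says the input outlives the value borrowing from it.
// Writing `impl<'de, 'a>` without that bound makes the body fail to compile
// with "lifetime may not live long enough", because nothing would relate
// the two lifetimes.
impl<'de: 'a, 'a> FromInput<'de> for Bar<'a> {
    fn from_input(input: &'de str) -> Self {
        Bar { s: input }
    }
}
```

That bound is the piece a hand-written impl has to spell out and a derived impl has to be told to add.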
Great, okay, so now we finally get the error I was after, which is "lifetime may not live long enough": the lifetime 'a is defined here, and the implementation of Deserialize for Bar requires that 'de must outlive 'a. And this is the error we would have gotten for Cow as well, if it wasn't special-cased. And the trick here, of course, is that we want serde, when it implements Deserialize here, to add the bound that 'de is going to outlive 'a, because that is the case in which this is okay. In fact, we can make this be a str so the explanation is a little easier. This string reference in here is going to be a reference into the string that was passed all the way down into the deserializer as its input. And in order for that to be the case, the Deserialize implementation here has to have a bound saying 'de outlives 'a. That's the only way in which this will end up being valid, and #[serde(borrow)] makes it so that we have that bound. And it's not going to do anything particularly special. If I expand this, you'll notice all of this is going to be pretty much the same. The only difference is going to be here, the impl of Visitor<'de> for the visitor with 'de and 'a, and you see it has a bound here saying 'de outlives 'a, and the generated value just uses 'a. The only thing that borrow really changed was that it added this constraint between the 'de and the 'a lifetime parameter. And we'll see this on the impl of Deserialize too, which is near the top, here. So that's borrow. And I mean, I can show you without, although no, it won't compile, so I can't show you without. And if we did Cow here now, so if we now go back to what we originally had, with #[serde(borrow)], then if I now expand this, what we would hope, right, is that if you get a string reference it turns into a Cow::Borrowed, and if you don't, it turns into a Cow::Owned. And indeed, if we scroll down here,
for the visitor (again, this is the visitor for creating a Foo), you see there's now a visit_seq, that's fine. So here the code for deserializing this field has changed a little bit. It's no longer just deserializing a string. It creates its own little __DeserializeWith that has a value that's a Cow, and it generates an implementation of Deserialize for it, and it uses this borrow_cow_str implementation from serde's private module. So let's go look at that one. Over here, borrow_cow_str. So it has a Cow str visitor, and you see, if it visits a String, down here, it creates a Cow::Owned. If it visits a str, then it creates a Cow::Owned. But if it visits a borrowed str, that is, a string where you have a reference that is going to be long-lived, that is going to live as long as the input to the data format, then you can produce Cow::Borrowed. visit_str is for when you have a string reference but it doesn't point into the data format's input, so it's not a long-lived string. So specifically this extra function is the thing we wanted: if it turns out that this can be a reference all the way back to the input, then generate a Cow::Borrowed. And same thing for bytes, so this is bytes, and this is visit_borrowed_bytes. So that's what borrow does. It primarily adds this 'de bound, but what that enables is that your implementation can now have fields that ultimately borrow from the data format's input. Okay, so now we've been through all the variant attributes. The field attributes are mostly the same. Flatten, just for time, I'm not going to talk about; flatten is super interesting. And borrow we've talked about. Remote types I'm not going to talk about. So the question becomes, okay, what's left? Well, we haven't really talked about the data format, but the data format actually isn't that interesting.
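Going back to the Cow visitor for a second, the three string cases can be written out in a few lines of plain Rust. This is a sketch under assumptions: CowStrVisitor here is a hypothetical struct with inherent methods that return a Cow directly, whereas serde's real visitor methods live on the Visitor trait and return a Result.

```rust
use std::borrow::Cow;

// Hypothetical model of the three string entry points a visitor can be
// called through, specialised to producing a Cow<str>.
struct CowStrVisitor;

impl CowStrVisitor {
    // The reference points into the format's input and lives for 'de,
    // so it can be kept as-is without copying.
    fn visit_borrowed_str<'de>(self, v: &'de str) -> Cow<'de, str> {
        Cow::Borrowed(v)
    }

    // The reference is only valid for the duration of the call (for example
    // it points into a scratch buffer), so the data has to be copied.
    fn visit_str<'de>(self, v: &str) -> Cow<'de, str> {
        Cow::Owned(v.to_owned())
    }

    // The format already produced an owned String; just take it over.
    fn visit_string<'de>(self, v: String) -> Cow<'de, str> {
        Cow::Owned(v)
    }
}
```

Only the borrowed entry point can return Cow::Borrowed, which is why the data format has to opt in by calling it when its input really does live for 'de.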
Let's look here. So there's a page on writing a data format in the serde docs, and it talks to you about conventions: you should have an error type that's shared between serialization and deserialization, you should have a serializer, you should have a deserializer, and roughly what the names of the methods should be, what convenience methods you should implement, and what modules you should have. This is just basically how you should structure your data format. Error handling talks a little bit about the kinds of errors that might come up, and in particular the fact that you have to support custom messages. These are errors from the data type. So if the data type says, you know, I was expecting this kind of input, you want that to be possible to propagate up through the data format, so through the deserializer and serializer implementations. This is not super important, it's just a little detail, but the interesting parts are implementing a serializer and implementing a deserializer. One of the things that is stressed a lot in this documentation, because it's important, is that serde is not a parsing library. It does not give you any mechanisms for parsing or interpreting a data format, or for producing that data format. It just provides you with the connection to the Serialize and Deserialize trait implementations of data types. Basically, it gives you a connection to the serde data model. How you parse JSON, for example, is not serde's business. All you are expected to do, let's talk about the serializer first, because it's sort of the easiest one. To implement a serializer, all you really have to do, and it gives the example here of JSON, is implement the Serializer trait, which has an Ok and an Error associated type. The Ok type: generally you're not going to return
a thing. So if you serialize to JSON, for example, you're not going to return a string. Instead, the serializer type itself is going to hold your output, because it might be a file writer, for example, where you don't really want to buffer it into memory and then write it out; you just want the serializer to directly call into the underlying output socket, for example. So when the serializer is finished, it hasn't produced a value, hence type Ok equals unit here. And the Error type is the error type we talked about. Then there are these sort of sub-serializers. Oftentimes, especially for simple data formats, you can just keep this all in one type. Sometimes you want to separate them out, but it's not terribly important for what we're talking about right now. And then you really just want to implement all the serialize methods, right? So serializing a bool, well, in JSON at least, is writing true or false. Serializing numbers: in JSON they're all serialized the same way, which is the string representation of the number but without quotes. Serializing characters, serializing strings: you really just want to implement all of these with whatever makes sense for your data format. And then for things like serializing structs or enums, usually, at least for some data formats, you're just going to turn those into the equivalent of dictionaries, right? So they're going to be maps, and similarly sequences are going to turn into arrays. Newtypes are mostly going to be the same as if they were a struct, but for some data formats, like protobufs for example, there might be more specific encodings that you could use. And so for sequences here, for example, you see what serialize_seq does is it starts the sequence and then returns self as a SerializeSeq implementation. Remember, this is like a separate
associated type that implements a trait. And if we go down to the implementation of SerializeSeq, you see that it has a serialize_element. You implement that, and it just adds a comma to the output and then serializes the element using self. Same with maps: the call to serialize_map is going to emit a curly bracket and then return self, and the implementation of, where do we have it, SerializeMap is just going to add commas, and then add colons between the key and the value. And ultimately the end is just going to be a closing curly. So serializing really is just: you're told what to serialize, now do the thing. So not that bad. Deserialization is a little trickier, and in particular, again, serde is not a parsing library. When you implement Serializer, Deserializer I mean, you have the choice of whether to support deserialize_any or not. For self-describing formats you generally want to implement deserialize_any, and for formats that can't, that method is just going to return an error if someone calls it. And usually your deserialize_any is just going to forward to the appropriate other deserialize implementation, and then you implement the appropriate deserialize methods.
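For the serializer side, the "hold the output, push bytes as you're told" shape can be sketched with nothing but the standard library. This is loosely modelled on the JSON example being walked through, but JsonWriter and its method names are hypothetical, it does not implement serde's Serializer trait, and it assumes strings that need no escaping.

```rust
// The writer owns its output; nothing is returned from the individual
// serialize calls, which mirrors the unit Ok type described above.
struct JsonWriter {
    output: String,
}

impl JsonWriter {
    fn new() -> Self {
        JsonWriter { output: String::new() }
    }

    fn serialize_bool(&mut self, v: bool) {
        self.output.push_str(if v { "true" } else { "false" });
    }

    // Numbers are their string representation, without quotes.
    fn serialize_u64(&mut self, v: u64) {
        self.output.push_str(&v.to_string());
    }

    // Assumption: `v` contains nothing that needs JSON escaping.
    fn serialize_str(&mut self, v: &str) {
        self.output.push('"');
        self.output.push_str(v);
        self.output.push('"');
    }

    fn begin_seq(&mut self) {
        self.output.push('[');
    }

    // Emit a comma before every element except the first, then let the
    // caller serialize the element through the same writer.
    fn seq_element(&mut self, write: impl FnOnce(&mut Self)) {
        if !self.output.ends_with('[') {
            self.output.push(',');
        }
        write(self);
    }

    fn end_seq(&mut self) {
        self.output.push(']');
    }
}

// Usage: write a three-element array and hand back the accumulated output.
fn demo() -> String {
    let mut w = JsonWriter::new();
    w.begin_seq();
    w.seq_element(|w| w.serialize_u64(1));
    w.seq_element(|w| w.serialize_bool(true));
    w.seq_element(|w| w.serialize_str("hi"));
    w.end_seq();
    w.output
}
```

The ends_with check is a shortcut that only works because this toy never nests sequences directly; a real format would more likely track "is this the first element" explicitly.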
Well, so there are two parts to this, right? If you're told to deserialize a bool, then you first want to extract a bool, or try to extract a bool, from your input, right? So in the case of JSON, the string, or the byte buffer, or whatever. And then you want to pass that value to the appropriate visit method on the visitor that you were passed. So really all you're doing is, you know, parse the format however you choose to parse it, and then give it back to the Deserialize impl of the data type, through the data model, using the visitor. So you see all these implementations are basically the same, right? They call visit of the appropriate type with the thing that you got out of your input by parsing. So parse a signed integer, pass it to visit. And then there are a couple of things that are a little more involved, things like deserializing options, for example: you need to know how your data format represents values that may or may not be there. And then if we look at deserialize_seq, you look for an open bracket, and then you call visit_seq, and you create a type that represents your parser for the contents of a sequence. So in this case, CommaSeparated here is really just a type that you own, as in the data format owns, that is a parser for the stuff that goes in arrays, and it is going to implement the SeqAccess trait so that the caller can keep calling next_element or next_value. And then ultimately, once there are no more comma-separated values, you check that you in fact have a closing bracket for the array, and otherwise you emit an error because the data was broken. Same thing for maps, right? So if we go back to a map: it looks for an open curly bracket, has a CommaSeparated parser, calls visit_map. So there's an implementation of MapAccess for CommaSeparated, and we can go down and look at that too. So SeqAccess for CommaSeparated is just: see whether we're at the end, and if we're not, look
for a comma. If you find a comma, then you just deserialize the thing that is between you and the next comma, and then you return that value from next_element. And map is going to be very similar, right? We look for the comma that separates elements, and then the next key is the thing that follows the next comma, and the next value is the thing that follows the next colon. But again, how you actually parse your data format is not serde's business. All you are supposed to do is implement the Serializer and Deserializer traits, and these access traits for whatever your sub-parsers are for maps, sequences, tuples, structs, that kind of stuff. Okay, I think that means we're all the way through. I don't think there are any other things I really wanted to talk about for serde. There's obviously a bunch more stuff to it, like how serde_derive does its code generation. I'm not going to go into that, that's very low level. Or looking at the actual implementation of Serializer and Deserializer for serde_json, or the implementation of Deserialize and Serialize for serde_json's Value. Those are interesting things to look at. There's a lot of good, juicy stuff there if you want to pick up more about how serde works, but it's not really something we need to spend time on here. And so I think with that, let's see if there are any questions at the tail end here, because I've talked a lot, but hopefully some of this now makes sense, and you can rewatch some of it maybe, and then go and dig into this on your own. And now hopefully you have more of the mental model and the terminology and a basic understanding of the types involved and the traits involved
to figure the rest out yourself. We did it on time too, that's pretty good. Also, yeah, I mean, it's really fun to read through serde because there's so much smart engineering. It's a cool architecture. You know, there's been a decent amount of "let's come up with a better serde", or "serde has these problems and we want to fix them", and I think those efforts are really good. There are certainly cases where you can improve upon serde, but I still think serde in and of itself is an accomplishment. And I know David is in chat, so clap, clap for David, or dtolnay; for those who don't know, the d stands for David. Alright, I don't see any more questions in chat. I'll give YouTube a couple of minutes to catch up. Oh, nice, thank you. It's on here, apparently; apparently we are now on here. Hey, look at that. I will add that. So as usual, I'll upload this to YouTube afterwards, and that will actually have chapter marks that people can scan through and stuff, so I'll send you the link to that later on. It'll be a better thing to embed than the link to the live stream. Nice. Alright, thank you everyone. Hopefully that was useful. I'm hoping to do more of these decrusting things. I think they're gonna be a useful thing for the ecosystem to go through. There are a couple of other things that I think are really good crates that are candidates for going through this kind of decrusting, like, I think tokio is a good candidate, tower is probably a good candidate, rayon, maybe crossbeam, pin, nom. I think there are a bunch of good candidates here, so I'm pretty excited about this new series. And now that I'm actually gonna do streams every four weeks, I think we're gonna see some good streams. Clap, maybe axum. There's lots of good candidates. Alright, thank you everyone for coming out, and
I'll see you next time, whatever the next stream is. There might still be more streams than every four weeks if I have the time, but at least there will be a stream every four weeks now. Thank you all. Bye.