Should be good to go. All right. Hey, guys. Sorry it took us a bit; we were trying to get this thing set up. So what I'm going to be talking about today is something we did recently at Zalora, something that made our life slightly easier, and we want to share how we did it, our journey, and our reasoning with you. As the title says, this is about code generation in Go, and how we automated some of our work to make it really cool. One of the nice things about Perl, I think, is this quote by Larry Wall, its creator: good programmers need to be lazy. My colleagues and I are a bit lazy. We were tired of doing the same thing over and over again, so we decided we'd automate it and try a little code generation. This is one of my favorite slides: yes, we can do it, but we also need to tell you why we want to do it. So the story starts with microservices. We all know they're awesome. This slide is deliberately empty, because most of us already know why we like microservices: they're very scalable, you can work on services independently of each other, and you can scale different services independently of each other as well. The way we structure our microservices, they're reusable by multiple services downstream, which led us to this: the pluggable code for the microservices is abstracted away into SDKs that downstream services can use. Why did we do this? One, they're reusable: downstream services can just import the SDK as a dependency and call these functions over and over. Two, it's homogeneous: all the SDKs live in one monorepo, and because they're homogeneous and live in a monorepo, it's very easy to make modifications and distribute them. Another neat consequence of abstracting into SDKs is that we use Protobufs a lot.
And because we use SDKs, the Protobuf code is abstracted away from the client itself into the dependency, which is awesome. So we've got microservices, we've got SDKs, and we got this. This is the bit of a jumble we got ourselves into: we had a lot of microservices, and now we had a lot of SDKs, because obviously these microservices were talking to each other and we could put all our code in the SDKs. What happened was, as you can see, maintenance hell. We've got about three to four developers, and we had to keep updating our SDKs all the time. Any time we added a new endpoint, we had to update the SDKs. Any time we added a new query parameter, we should ideally update the SDKs. And God forbid we changed something: obviously we try not to break backwards compatibility, but if we changed something, we needed to update the SDKs. This led to a lot of copy-pasting. Every time we had to write a new endpoint, we'd copy the old endpoint, change very few things, and cut a release. So that's a lot of repetitive work and a lot of frustration, and when you're copy-pasting, you can forget something, which could end up being disastrous for maintenance as well. Before we go further on how we did it, let's talk about a few pros and cons of code generation. These are general industry observations, but it's a very opinionated list, so feel free to disagree or think differently. The good parts of code generation are basically why we wanted it: we don't have to worry about boilerplate. We don't have to keep rewriting the same code; we already know it's boring, and we're lazy, and therefore we wanted to automate it. It also makes our turnaround time fast, because we don't have to do the manual work; it's all about adding the one pluggable point that triggers the generation.
Then your build systems take care of it. You could integrate Bazel with it, you could have a Makefile that runs it, or you could put it on a CI/CD pipeline and have it generate automatically based on certain changes. So it's very fast. Those are the nice parts. The bad parts: you can end up with code duplication. If you've seen some of those old generated C++ files, you'll notice there's a lot of repetition, a lot of mapping and remapping. Some of the Thrift files come to mind: if you've used Thrift, you'll know that generated Thrift code is sometimes really, incredibly hard to read. That's the second point: generated code isn't necessarily easy to read. There's a bunch of boilerplate baked in, and it's hard to follow. If you expect someone to actually be reading your generated code, then code generation can be a hassle. And we absolutely expected people to read our generated code (I'll show you in a bit what we intended to generate), so this was important to us: we couldn't have code that was difficult to read or change. Also, in generated code bases, one difficulty is telling apart what is generated and what is not. If you want to make changes, you want to be able to tell which part of the code is generated and which part isn't. This is not too much of a problem for Go (again, I'll show you why), but it used to be a big problem for generated C++ and Clojure code. And the ugly, the really bad part about code generation: generation is like slapping a band-aid on a battle wound and assuming it's just a flesh wound. You're going to think your code is okay, and you're going to be fine ignoring it forever, because generation is taking care of it.
So if generation automates a complex piece of code, it's basically automating the complexity away. You're generating the complexity away from your application, and you absolutely don't want to do that. If you look at the slide, it's very obvious that the bad and the ugly outweigh the good. So why did we go ahead and do it anyway? To answer that, I think it's important to show you how a project is structured, oops. Right, so this is a basic skeletal outline of how we structured our SDK. The client itself is a package: it's the Go HTTP client abstracted out, with wrappers on top of Do and NewRequest. The reasoning is that we wanted to bake authorization and retrying logic into it, so the client is an abstracted mini client that is sufficiently self-contained. api.go is the specification, a set of interfaces. If you look at the AWS SDK, they give you a bunch of functions you can use once you call the constructor to construct an object, and api.go is just a set of specifications to do that. methods.go is where the definitions of those specifications go. In C++ speak, api.go is sort of the .h and methods.go is sort of the .cpp or .c. So api.go is a set of declarations and methods.go is the implementations, and that's where we do our generation: methods.go is the file where we generate code. This structure serves as a demonstration of why we think only the positives apply to us and the negatives don't. I'll get to the code duplication in a bit, but we're not abstracting any complicated piece of logic away. What we're abstracting is this: the comment is the endpoint, and the code itself is how we abstract the endpoint away.
To give you a brief run-through: you create an operation struct, then you use that to call NewRequest (there's a bit of a bug here, this should be nil, err, but regardless), and then you call the request's Send. If it succeeds, you return the details in the struct that was already defined. And here is an example of a different endpoint. If you look at these two, you can see that the only things that stay the same are these two lines, and everything else is basically configuration. You essentially write what the inputs are and what the outputs are, and the outputs go here, as the response. If there were any inputs, they would be processed; most of my examples are GETs, so they're not present here. And that's why we wanted to automate it. If you notice, it's very easy to read. As a matter of fact, the code you're looking at was actually generated, and it doesn't look hard to read at all, because it's very simple. It does one specific thing: go to an endpoint, do this, get this back, and send it back. And that leads us to a simple observation: code generation, the way we were intending to do it, was simply building a little compiler. And we were doing it on steroids, simply because Go provided all the tools we needed. So maybe a quick primer. Compilers work in three steps. There's parsing, where you read code that's already written. Then there's transformation, where you change the code into a format that's easy to write from.
And then you finally have the code generation part. This is mostly where compilers optimize and write out binary so your computer understands it; in our case, this would be to just write more Go code. Parsing, you can split into lexical and syntactic analysis. Lexical analysis is reading through the code and creating tokens: the lexer knows the keywords, it knows the primitives, it knows the values you assign to primitives, and it produces a list of tokens from all of these. Syntactic analysis, interestingly, is a sort of transformation already: it's the part where you convert all those tokens into an abstract syntax tree. So when Go compiles your code, first there's a lexer running, which tokenizes all of your code, and then it builds an abstract syntax tree, and this abstract syntax tree is a representation of your code. And then there's transformation. The reason transformation and intermediate representations beyond the AST exist (this part is not strictly relevant to us, but we'll still be doing it) is so that you can swap the front ends and back ends of your compilers. Modern compiler toolchains like LLVM can essentially swap one frontend or backend for another, and the intermediate representation is what allows you to do that. We took advantage of this style of compiler design as well, which is why I'm running you through it. And finally, code generation uses this representation to arrive at your desired output. So there's an input phase, a process phase, and an output phase, and that's essentially how we designed our generator. The generator was made up of a reader and a writer.
So the reader would do the parsing part, and the writer would do the code generation part. But where is the intermediate representation, you may ask? That is basically this array. The API struct is our version of an intermediate representation of what we're reading. If you look at this definition, you'll notice it has all the values we need to generate these functions: we know the name, we know what endpoint to call, we know what path to call (the path and endpoint together give you the final URL), we know what HTTP verb to use (we need to know if it's a GET or a PUT, because in the case of a PUT or a POST you need a body), and we know what input and output parameters the method needs. In our scenario, this is an exhaustive list of what we needed to generate the final output. Now that we knew the working pieces, we wanted to see how to build each of them independently. We needed somewhere to read from. We already know what we're going to write: a .go file. This is what we want to write to. And we already know the intermediate representation as well. So the steps are: read from somewhere, put it into an intermediate representation, and write to somewhere. The reason we have an intermediate representation is that we want the liberty of changing the reader and the writer as and when someone wants to modify them. What if someone wants to generate the output as a .c file to make an ABI? The way this thing is structured would still allow you to do that. Okay, so let's talk about the reader. Where could we define the way the application is structured, and where could we read from?
And we have multiple options. One is Swagger: we could read from a Swagger specification and then generate the code. The slide itself is very opinionated; I'm not very fond of Swagger, I've found it clunky and difficult to write, so we almost immediately dismissed it as an option. Could it be JSON? This looks nice, right? If you wanted to create these two functions, all you had to do was define what the input parameters were and the structure of the output parameters, and you could say what the endpoint was and provide all the details. This could work. But let's look at the third option as well: you could also just define Go interfaces to do this. Remember our api.go: you could define it by hand, and if you notice, these two have the same details; it's just a different way of representing it. So you can represent the same thing as the JSON, but in Go itself. What we could do is define an interface and represent the functions, the inputs, and the outputs in that interface. The one downside of doing this is that you have to annotate certain values. For example, there's no way of representing the endpoint in these function signatures, so that becomes a comment, and you annotate the function with it. So those are the three options we had, and we specifically decided to go with option number three. I'll tell you why. This is a Go code base, so we wanted to keep all the changes internal to Go. We knew the developers maintaining this code base would have to know Go already, because they're going to be reading these methods, and the other components of the code base are Go, so why not have all of it in Go?
We did not want to rely on a declarative syntax, because that's the road JSON takes you down. It turns into something like HCL for Terraform, where you write a declarative syntax and it gets transformed into something entirely different. And we had to write the interfaces anyway, so it made sense for us to write them by hand rather than generate them from JSON. That was the reasoning: we decided we'd just take the interfaces, simply because we wanted to keep everything Go. I know everyone probably knows JSON, but they don't necessarily know the structure of this particular JSON or the schema they'd have to adhere to, and that's cognitive load we didn't want to add, so we decided to go with what we think is the simpler approach. That's why we went with interfaces. And because we went with interfaces, we had an interesting problem to solve. Imagine we had used JSON: we could have solved it with a simple parse, just unmarshal the JSON and arrive at the intermediate representation we had. Unfortunately for us, we decided to go with parsing Go code, and because of this, we had two ways to do it. One was to read it line by line using a buffered scanner, interpret every line, and use a preconceived opinion on how a function should be defined to arrive at the values our intermediate representation needed. But you can see the problem with this as I say it: it sounds so painful, so opinionated, and not robust at all. A more robust way was to parse using the AST, and Go has really, really cool tools that let you do this very easily.
So it was just a matter of importing go/ast, go/parser, and go/token to do this. go/parser parses the source into an AST for you: you give it a reader or a file name, and it parses it into an AST. And once you have the AST, you can loop through it. Go gives you the tools to walk through the AST, so you can pick out the interfaces, pick out their methods, and pick out their parameters, and using that, you can build the intermediate representation. So that's how we leveraged the AST to build our reader and arrive at the intermediate representation. And finally, the last part of the code was the easier part: converting this intermediate representation into the final .go file. Initially, this was a lot of fmt writes and string replacements on a file, and that seemed very ugly. It was very difficult to follow, because with a lot of string replacements you're left asking, where do I look for the place I changed a line? So we decided to use Go templates, and this seemed to be a really neat solution, because if you just look at this template, it's, in my opinion, not too difficult to work out what it's trying to do. It's declaring a function right here, it's adding all the input parameters, adding all the output parameters, and then, if you look at this, it's calling the operation, and then it's basically the same. It's very readable and easy on the cognitive load, so we decided we'd use Go templates.
And this turned out to be interesting: it's easy to change, easy to maintain, and works out really well. So that's basically the whole end-to-end of how we did it. There's also a little housekeeping in case someone wants to do this; these notes are how we decided to do it. We put all the generator code in the internal package. This seemed to be one of the nicer use cases for internal; I'm not its biggest fan, but this was a perfect use case for it. The generator lives inside internal, and we obviously also expose a command for the generator to run. And the most important thing is to add the go:generate directive to the root of your program. Our api.go, if you remember our project structure, has this right under the package declaration, and this is how Go knows to run the generator: when you run go generate, it looks for the directive, and the directive points it at your actual generator. And that's basically the final pluggable point for us. If you want, there's a blog as well, with more code snippets and the logic for how I looped through the AST. I didn't include that today because it seemed a little dry and reasonably easy to understand on your own. So if you want more information, it's definitely there on the blog. And that's basically it. If you guys have any questions, I can take them. Virtual claps, guys. Anyone have questions? You can post them in the Zoom chat. Two questions, two questions up. Oh, two questions, yes. Okay, I'm just going to see how I can open that. You can check your webcam as well, so we can see you. Oh, okay. Hey, guys. Sorry, I'm just trying to navigate through this thing too. So there are two questions.
All right, so the first question I'm seeing is: how did you parse the comment that acts as the annotation for the URL? Let me answer that one first. The AST parses it, so that's easy as well. If you look through my blog, you'll see it gets parsed into a docs struct that you can loop through. So what we essentially did was loop through the doc comments and apply a regex to extract what looked like an HTTP verb and what looked like a URL. Yes, regex could be a performance penalty (good old regexp), and my answer is going to tie into the second question: what about the performance of template generation? Essentially, while we wanted to write very clean code, performance was the last thing on our minds, simply because we generate code at merge time. When the code is merged, that's when generation happens; it never happens at runtime, or even at build time. Anyone using this library already sees generated code. The templates are not rendered on the fly; they're only rendered when we add a new change. It's sort of like Hugo or Jekyll: you build just before you deploy, and once you've built it, you never have to worry about it again until the next build. So the performance of template generation doesn't seem to be much of a concern, and I can tell you it's not slow, in the sense that I've never had to wait: it's as instant as running go generate and seeing your methods come out. That's also why regexes don't seem to be a problem for us here: one, it was a simple or-condition, and two, we only regenerate when there's a change to the library itself. But you're right.
I think it's something we could benchmark, because our plan is to make it a monorepo, and the cost could add up, so that's definitely something we could take into consideration. How about authorization and authentication? Excellent question. The reason you don't see any authorization or authentication in this presentation is that it's all abstracted away into the client. The way we did it, if I may, is we made a separate interface for the authorizer and made it pluggable into the client. That happens when you do a send request. It's a separate concern, absolutely abstracted away, and you don't want to be putting it into every method to authorize it. So that ended up being a good decision as well, and therefore it's abstracted away into the client. No problem. Does anyone else have any questions? Thank you. I'm just going to see if I missed any questions. Authentication, authorization, did I answer that? Yeah, I answered that. Any more questions? I'll give it a minute. No? Right. So thanks so much for listening, guys. Feel free to hit me up on my GitHub or, oops, I think it's on the first slide. Sorry, I'm just terrible with slides. Thanks. I'm just going to put it up here so you can get it. So if you've got any questions, any more questions, or if you really want to know how to do it, feel free to shoot me something on these accounts. My Twitter's got about 20 followers, so don't worry about it. Just feel free to send me a ping, and it should be good.