Thank you. Thank you all for coming. All right, everybody hear me okay? Perfect. Today we're going to be talking about building services in Go. If you want to follow along, please have a copy, or a checkout, of this repository. Alternately, I also have it on a USB stick if your internet's not working. We will be compiling stuff as we go; this is designed to be a hands-on tutorial, so ideally you also have a working installation of Go. If you don't, it's also available on the USB stick. So, all right. I'll start with a bit about who I am and what I do. I'm Mark. I work at Dropbox, doing site reliability engineering, and have for the past 10 or 12 years at various places. By night, I contribute to Dreamwidth Studios, an open-source, Perl-based fork of LiveJournal: kind of a blogging, social-network sort of website. And here's my GitHub, Twitter, email, et cetera. It'll be repeated at the end, but feel free to contact me.

So what are we going to do today for the next hour and a half or so? We're going to start by talking a bit about services: what I define as a service, and what I mean when I say building services in Go. We'll do a brief history, talking about the architecture of services over the past 10 years that I've been working in this field. We'll talk a bit about Go and why I think it's actually a good choice for building services. And then we'll spend the bulk of today's time on our tutorial program, and then we'll wrap up.

I'll start with the caveat lector. This is not strictly an intro-to-Go tutorial. It's designed for people who have done some Go: you've probably done the tutorials, read through a blog post or two, written Hello World, and have a basic understanding. But it's also not the advanced guru level. If you're coming here expecting to go really deep into the garbage collector and how all of that works, I don't know that depth, so we won't go too far into it. Ideally this tutorial lands in the intermediate range: you come in with a little knowledge and you walk out with more. There's going to be a lot of code, but it's a tutorial and we'll be going through it. So please ask questions. I want you to come out of this knowing the material, and I've built time into the tutorial for questions. So please feel free.

So we're going to start by talking about services. And I failed completely at animation, so I have three slides that say "services." Service design over the past 10 years has actually changed quite a bit. If you look at almost any architecture or application from a decade ago, you'll probably find one large monolithic application. And I'm going to speak specifically from my background, which is with the Googles and Mozillas of the world, not so much the enterprise companies. I don't know a lot about how they've been doing things; they actually adopted service-oriented architecture a lot earlier than many of the internet companies. But older sites are often large monolithic applications. LiveJournal slash Dreamwidth certainly is: one large mod_perl app that's hundreds of thousands of lines of code, or even millions at this point. Dropbox was one large application; at this point we've actually been pulling things out of it to make them smaller.
The problem with this design, the problem with building an application that is one huge mess of code, is fairly obvious to most of us: they're very complex. They become impossible to extend, impossible to work with. They're filled with dark little edge cases where the edge cases have edge cases, and when you hire someone new and they say, oh, I'll just improve this little thing, you get a lot of, well, that's going to change this, that's going to break that. It becomes really hard to keep growing these applications.

So a buzzword that you'll probably hear thrown around a lot is service-oriented architecture. If you think about it, it's very close to the Unix philosophy. If you've done a lot with the Unix command line, you've got your little commands that do one thing and do it fairly well, and you chain them together to get more complex behaviors. That's roughly the concept behind service-oriented architecture, or SOA: small, well-scoped services that are easy to think about. It's the difference between "I have my website and I run my website" and "I have a search service, I have a payment service, I have whatever other services I run."

So why SOA? As mentioned, for me it's largely a way of managing complexity, a way of keeping the scope sane. If you have a narrow specification, it's much easier to build. Think about it logically: if I say, okay, we're going to build a chair, you can probably envision how to build a chair. You can probably think of what pieces it's going to need. Even if it's a fancy chair, like a folding chair, you can probably picture it and draw it out, and I think we could trust anybody in this room to build a chair. That's pretty straightforward. On the other hand, if we want to build another Auckland Harbour Bridge, I don't know if anyone in this room could actually do that, unless you're a civil engineer, in which case, fantastic. But I certainly couldn't. Bigger projects have much more complexity and scope built in.

Smaller services, or smaller projects, are also much easier to monitor and reason about. It's easier to ask "is my search service down, is my search broken?" than it is to ask "is my website working?" I'm not even sure what that question really means. The Wikipedia article on service-oriented architecture tries to describe it as a highway system, which sort of works as a model. In today's day and age, with the Googles and Facebooks of the internet, we have commodity hardware. We scale out rather than scaling up, and SOA fits very well with that, because we're breaking services down into smaller chunks that require fewer resources and less computing power, and fit much better on the hardware stacks that we have. SOA also makes failure much easier to reason about: when you lose 1% of your hardware, or you lose a rack, it's much easier to understand what's going to happen than if you're running a mainframe or something really large. The other thing that happens with SOA is that we make heavy use of the network. Instead of having one application that's all internal, with maybe one connection to a database and one to a cache, you're now possibly making many connections to many different services. There are some drawbacks, of course, to this.
We've sort of hinted at them, but if you've ever had to do a lot of console jockeying, where you have some data and you want to process it, it doesn't always work out the way that you want. Sometimes there are edge cases, little cases where it just doesn't do exactly what you want, and this can lead to hacks. If you have services and they have their boundaries, the things that they do, and you want to add new functionality, it's sometimes hard to know where to put it. Similarly, and kind of obviously, network calls incur overhead. You can go hog-wild and have 50 different services and say, great, my code is super easy to test, it's easy to monitor, it's easy to work with, anybody can spin up and work on this, it's fantastic. But now you have to make 50 network calls to answer one user request. That may not work out. Finally, we talked a bit about complexity, and people will say that a service-oriented architecture reduces complexity, and in a way it does. But in a sense it externalizes complexity, because instead of having one application that you're deploying and maintaining, you now have 20 applications: 20 things you have to monitor the versions of, have a deployment cycle for, have checks for, et cetera.

So, some examples from the industry. Presently I work at Dropbox. We have a file distributor; that's all it does, it just distributes files. We also have a metadata service. Dropbox is logically one large file system, and we have a service that just manages what goes where. It doesn't have to think about what's in the files, it doesn't have to think about ownership or things like that. It just manages locations, which makes it really easy to test and to ensure that all the edge cases are covered. If we go to open source, there are a million examples. In fact, most of the packages you can think of can be thought of as services: your proxy kinds of packages like Nginx, Varnish, HAProxy; your caches, Memcache, Redis, et cetera; your databases, MySQL, Postgres, Cassandra, Riak, et cetera, et cetera. All of these are effectively services that you deploy. Ideally you want them to do one thing and do it well. You want your database to store data and to store it very efficiently; you don't want it to also double as some sort of offline processing framework or something like that. Probably. I mean, maybe. So for today, we're going to define a service as software that makes or receives requests, typically via an RPC layer, and usually over the network. That covers pretty much everything, so it's not a very narrow definition.

So why Go? Why would we consider using a new language from a couple of years ago? Go was built at Google four or five years ago by Rob Pike and a couple of folks over there, and it was built in an environment that is probably the largest purveyor of network services in the world. Google has a very large infrastructure, a very wide variety of things that they do, and they've historically used a lot of C++, Java, Python, et cetera. So they have a pretty good idea of what it takes to build an efficient service. Go was designed not only in that modern era, but in an era where every computer has at least 16 cores, if you count hyperthreading. Every server in a data center probably has 64 plus. You're talking about tens of gigabytes of RAM, loads of RAM. Computers are a lot bigger than they were even 10 years ago.
Back when I was working on LiveJournal, we had four gigs of RAM on our web servers, and that was amazing, right? Now you have 10 or 20 times that, and it's standard. Go was also designed in a world that is fairly modern as far as protocols go, as far as best practices for how to communicate over the internet. JSON, REST, all of those things are fairly new in the grand scheme of things; they're less than 10 years old by and large, especially for wide-scale adoption. Go was created after all of these things started to become true. Because of that, it supports them by default. They're built into the language and the standard library; they're not afterthoughts. Yes, anything we talk about today you can do in C, in Perl, in Python, et cetera. Go just has it built in ahead of time. Also, slight bias: I like compiled languages. I like statically typed languages and static compilation. It's also really fast, it's easy to learn, and it's Unicode by default. These days most things we do have to support languages that are not just English, and Go has that built in, or is at least very well designed for it. So it's just a pleasure to work with.

So, awesome. We're going to dive into the actual meat of the tutorial here. Today we're going to be building a proxy. I happen to really love proxies. In specific, we'll be doing an HTTP proxy. Let me ask: how many people are familiar with the HTTP protocol, at least a little bit? Great. Fantastic. That makes the rest of this easier. We're going to build something that is fairly simple but actually relatively performant. The version of the code we end up with can do thousands of queries per second on a single core, and there's another version in there that I hacked on that'll do almost 8,500 per second, which is not too bad for a couple hundred lines of code.

So, why a proxy? And why is a proxy a good choice for a tutorial? I think everything can be solved with a proxy, and pretty much every company I've worked at has deployed a proxy for one thing or another. Particularly if you don't control one end of a transaction: if you're talking to a database and you want to do some analytics, but you don't want to write code inside MySQL, you might deploy a proxy. Dropbox uses a MySQL proxy for statistics and other sorts of functionality. HTTP proxies are great for gathering statistics, changing behaviors, doing analysis, doing request redirection, you name it. Also, since we're talking about network services, a proxy is a good combination of client and server behavior, which are both things we want to cover. So we'll be doing that.

If we think about the basic design of a proxy and what's in it, there are four basic phases. How many people have actually written network code, a network server, before, in any language? Okay, actually, that's great. Basically, the phases are: accepting connections, reading requests, proxying them to a backend, and then writing the response back to the client. It's pretty straightforward. So now we're going to go into some actual code. How this tutorial is designed to work: you should have a checkout of the code. If not, I have a USB stick with it. The code has a bunch of directories: part one, part two, part three, et cetera.
If you want to follow along and write code, go into the part one directory and you'll see a main.go. If you just want to look at code, follow along visually, and listen, go into the part one final directory and you'll be able to look at the code that we will be writing. So, your choice: if you want to actually type it and compile it, go for the first one; if you just want to hang out, listen, and follow along, that's fantastic, and you'll go into the other directory. Is anybody not finding that? Or lost? Any questions? I want to make sure you're all with me. Fantastic.

So, if you've ever done network listening with sockets in any other language, it looks exactly the same in Go. In fact, if you've ever written C or Java, there's a main function. In other languages, they just start executing at the top, but in Go there's a main function. So, in your part one slash main.go, we're going to start with the implementation of the if statement here. If you haven't seen if statements in Go, there are basically two parts to them: an initialization, and then the actual conditional. You can think about it like a for statement in C, where you have the initialization, the condition, and the incrementer. If statements in Go can have an initialization and then the conditional. So, in this case, we execute the first part, net.Listen, we get back a listening socket and an error, and then we do the conditional check on the error. We're throwing away errors at this point, or rather, we're checking to make sure we didn't get one, but we're not doing anything else with them. We're not going to do anything fancy with errors yet. Then we go into an infinite for loop, because with most network programming, you set up your listening socket and then you accept connections over and over. The idea of the proxy is that it's just going to accept a connection and do something with it. So this looks the same as in pretty much any other language.

All right. So, this is what we're familiar with, but now we're going to talk about HTTP in Go. One of the reasons it's really nice to write network code in Go is, effectively, the standard library. We're going to implement a complete HTTP proxy that will do full end-to-end HTTP/1.1 in about 14 lines of code and never have to touch the protocol ourselves. But one thing I will say: if you've done network implementation before, you've probably run into the situation where you write 10 bytes out to a socket, then another 10 bytes, lots of small writes, and you go, wow, this is really slow, it uses a lot of CPU. So what you typically end up doing is buffered I/O. You wrap your socket in a buffered I/O abstraction, you write into that, and your bytes go into a buffer. Eventually you tell the buffer to flush, and at that point it writes everything out to the socket. The Go standard library for HTTP does not give you the option of doing things in the slow, terrible way. It forces you to use the buffered I/O library. Thankfully, the buffered I/O library is extremely easy to use, because they know this is a common pattern, and we'll come back to this a couple of times with Go: they know the common patterns, because we've been implementing them for a couple of decades in other languages, so they just build them into the standard library.
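Just to anchor the code we've described so far, the listen-and-accept skeleton looks roughly like this. This is a sketch, not the repository's exact code; port 8080 matches what we'll use later in the browser demo.

    package main

    import "net"

    func main() {
        // Go's if with an initialization: run net.Listen, then check its error.
        if ln, err := net.Listen("tcp", ":8080"); err == nil {
            for {
                // Block until a client connects; the request handling comes next.
                if conn, err := ln.Accept(); err == nil {
                    _ = conn // proxying logic will go here
                }
            }
        }
    }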
And in this case, they don't give you a choice about what to use; they force you to use buffered I/O. At this point, if you've got your code up, you've probably got a couple of lines that set up a listening socket and then actually accept a connection. Well, the next step, when we talked about those original four phases, is that we get a request. In the HTTP protocol, and most of you put your hands up earlier, the first thing that happens is the browser connects to a server and sends a request. So that's what we expect to come in on this connection we just accepted. Since I don't want people pulling up all the docs right now and digging through them, I've cut and pasted the function definition for ReadRequest from the http library. If you read the docs, it says, basically: given a bufio reader, it will read an HTTP request from that connection and return it. In other languages you can find modules that do this, but in Go it's just built into the standard library.

So this is the code you already have; we're back in our code now. You've already got this line, this is your accept line, and the next thing we do is create our bufio reader. When I said it's basically one line in Go, this is it: you use the bufio package, call NewReader, and give it the socket you're working with, and it returns a reader. From now on, you can just do read calls on this reader, and it does buffered I/O behind the scenes to make everything more efficient. So that's great. Now we have to actually read that request. If you look at the function definition for ReadRequest, you give it a reader and it returns a request and an error. Go has multiple return values, which you might be familiar with from a couple of other languages, but this is the common idiom that we'll see time and time again: a function returns a value and then possibly an error, and then you always check the error. In our case, we're sort of not. So the next part, as mentioned, was ReadRequest. We have our bufio reader, we want to pull a request out of it, and it's the same structure as all the other lines we're writing: we do our initialization, request comma error colon-equals ReadRequest, and then we check that we did not get an error. How's the pacing? Are people able to follow along? Yes? Okay, there are no head shakes, so we'll keep going.

So we talked about proxies, about accepting a connection and reading a request from it. The next step is that we have to send the request somewhere, so we have to talk to a backend. We're using the net package to do this, and it's very simple. It's the same thing we've been doing; you'll see this over and over and over. We do net.Dial, we say we want a TCP connection, and we want it to go to this address and port. And definitely, if you want to test what you're writing, do use this IP and port, because we'll be running a web server on that port here in a few minutes. And again, here we see our bufio NewReader; the HTTP library requires bufio, so we go ahead and use it. Pretty straightforward. The next step, now that we've connected to a backend, is to send the request to that backend. We're going to say, okay, here you go. And then we're going to read the response back.
We're going to pull the response back from the web server. This again goes inside that if statement. If you notice a pattern here, we're just nesting 13 if statements, which is a terrible design and we'll fix it. But it'll work for now; it'll suffice. One of the nice things with the Go HTTP library is that the request and response objects have a lot of methods to help you out. In this particular case, the HTTP request object, req, has a Write method where you can say: I just want you to write yourself out to this socket. You don't have to worry about the protocol. You don't have to think about how to write out headers, how to write out the body, how to deal with chunked encoding versus a normal content length and close. You can just say: request, here's a socket, write yourself out. The observant will probably notice that we're not using buffered I/O here; for the writer, we're writing straight to the socket. This is pretty inefficient, so we'll fix it later, but I'm just noting it in case you noticed.

After we write the request to the backend, the next thing that happens is we get a response back. Ideally. In theory. So we read the response, but we have to use the bufio reader here because the ReadResponse function requires it. So, fantastic, we've only got one step left to a fully functioning proxy. We've accepted the connection, read a request, connected to a backend, sent the request to the backend, and read the response from the backend. Now we have to send the response to the client, and then close everything out. This one is a little bit longer, but you've already got the top and bottom lines from the read response; all we're doing in the middle is another resp.Write, like we just did for req.Write. The resp.Close bit: since we're writing a serial proxy right now, we don't want to let one user tie it up. So we're telling the response that we want Connection: close; we don't want the client to think it's going to be allowed to keep the connection alive. So we set resp.Close. Then we write the response out to the client, and then we print something to standard out, because we want to be able to see that our proxy is actually working. In this case we print the URL.Path and the status code, but you can print the content length or whatever the heck else you want here. Finally, since we set resp.Close to true, we've told the browser Connection: close, so it can expect the connection to be closed, but then we actually have to close it ourselves. So, conn.Close. This is the same in pretty much any language that has object-oriented socket support.

So, great. If you've followed along with that, you should now have a functioning HTTP proxy. The code looks terrible, it's very slow, and it ignores errors, but it should work. So let's actually test this out. If you're writing code and you have a working installation of the Go compiler, you should be able to go into that directory and type go build. And in theory, if I have a terminal here... I have a terminal somewhere. We have a bunch of files here. Go build. Actually, I need to go into part one final. Go build. The code looks something like this now; that's horrible. And if you build, you should end up with an executable named part one final, or part one. And if you execute that, it'll just sit there. Question?
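To recap before we build it, the body inside the accept loop ends up looking roughly like this nested-if tower. This is a sketch with errors silently ignored; the backend address 127.0.0.1:8081 and the exact Printf format are my assumptions about the tutorial code, and it sits inside main's for loop from the earlier skeleton (imports: bufio, fmt, net, net/http).

    if conn, err := ln.Accept(); err == nil {
        reader := bufio.NewReader(conn) // buffered I/O, as the http package requires
        if req, err := http.ReadRequest(reader); err == nil {
            if be, err := net.Dial("tcp", "127.0.0.1:8081"); err == nil {
                beReader := bufio.NewReader(be)
                if err := req.Write(be); err == nil { // writing straight to the socket for now
                    if resp, err := http.ReadResponse(beReader, req); err == nil {
                        resp.Close = true // tell the client we're closing the connection
                        if err := resp.Write(conn); err == nil {
                            fmt.Printf("%s: %d\n", req.URL.Path, resp.StatusCode)
                        }
                        conn.Close()
                    }
                }
            }
        }
    }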
No, log, bufio, those are standard library dependencies. So that sounds like you don't have your GOROOT or your GOPATH set. I have the packages installed on my Mac; you can only run those scripts if you've actually got the packages, if you're using the source distribution that I have. That's what those scripts are for. So if you're not using my distribution, then you probably do not want to run the shell scripts, the bash or fish ones. And if you did run them, just open up a new terminal, because they set up the environment assuming you're using the packages we have. So did that work when you opened up a new terminal? Has anyone had success compiling? Yes. Fantastic. Okay.

So let's go back to this. If you've had success compiling and running, perfect, then you now have a proxy. But a proxy with no backend is not going to do any good, so we're going to have to run the backend. If you go up a directory, and you'll probably need a new terminal, there's a webserver directory. Just go build that and run it. You're welcome to look at the code if you want to make sure it's not doing anything dicey. It's four lines of code, and it's a fully functional HTTP web server, because Go kind of has that built in. Once you run that, you can test that your backend is working. Oops, I have to actually run mine. If you do something like a curl to port 8081, you'll see that your web server is working. And if that's working, you can go to your browser, and if you go to port 8080, oops, remind me to actually run my code, you should actually see the Go documentation. And then if you go back and look at your console, you should see that Printf you put into your code. Yes, success, perfect. So you now have a fully functional HTTP proxy in, I don't know, 14 lines of code. It's not very efficient, but it works. So let's make it better. Let's actually do some stuff. All right, we already tested it. Does anyone have any problems before we proceed? Great.

So now, part two. That existing code was kind of lacking. If you've done network stuff before, you'll realize it was serial, one customer at a time, and really slow, although it's funny to call it slow. 500 QPS: 10 years ago, when I was working on proxies for LiveJournal, which were written in Perl because we wrote a load balancer in Perl, the day we got 400 QPS was an amazing day. So, yeah. Let's make it faster. This is where Go starts to really shine. When you think about going faster in Go, or going faster in any sort of programming, you start to think about things like doing things asynchronously or non-blocking, doing things in parallel, forking, running things in multiple threads. There are a lot of ways to make your program faster, ways to do multiple things at the same time or at approximately the same time. In Go, what we do is concurrency, and it's really built into the language. What am I talking about when I say concurrency in Go? In essence, all the code you write in Go is blocking, by and large. All the code we just wrote is blocking. You say accept, and that blocks until there's a connection available for you to accept. You say write, and that blocks until the write completes. Same thing with read response, read request, et cetera. Those are blocking calls all the way down.
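Going back to that webserver directory for a second: the whole thing is essentially Go's built-in file server. Here's a sketch of the idea; the port matches what we just curled, but the directory being served and the exact code are my assumptions, not the repository's file.

    package main

    import (
        "log"
        "net/http"
    )

    func main() {
        // Serve files from the current directory on port 8081; net/http does the rest.
        log.Fatal(http.ListenAndServe(":8081", http.FileServer(http.Dir("."))))
    }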
So, the logic in Go is basically that it has this functionality called goroutines, which lets you effectively run blocking operations in multiple goroutines concurrently ("parallel" is really the wrong word here). What happens is that whenever one of your goroutines blocks, the runtime can pick up on that and schedule another goroutine. You can think of it like cooperative multitasking; if you go back to how multitasking used to work on computers back in the day, it's very similar to how it works in Go. If you dig into the details, the runtime can schedule new goroutines when syscalls happen and when you make function calls. So, if you cross function boundaries, there's a chance that you'll get preempted and another goroutine gets scheduled. But these work together to effectively run goroutines concurrently.

So, how do we apply this to a proxy? And how do you apply this to services in general? Typically, you think about breaking your project down into the units of work you're doing, and which kinds of work can be completed concurrently. For a proxy, it's pretty straightforward. With HTTP, if you exclude pipelining, a client is doing one thing at a time: they send you a request, you process it, you give them a response. But another client might also have sent you a request, and you could have hundreds of clients making requests at the same time. So this seems like a good candidate for goroutines: splitting up the work our proxy is doing by client. Somebody might come back and say, well, if you're used to doing this in a threaded model, one thread per client connection works up to maybe a couple hundred connections, depending on your environment, before you start running into contention or resource issues. But in Go, you can have hundreds of thousands to millions of goroutines. The system is designed to run at that scale. I've never played with millions, but I've definitely played with hundreds of thousands, and it works pretty well.

So we're going to rewrite our proxy to do this. You have a choice here. You can either continue on with the code you've been writing, if you're happy with it, or you can switch over to the part two directory and start with a fresh copy, or you can go to part two final and follow along with this phase of the tutorial. Your choice, whatever you want to do. But we're going to start in our main function. This is the same code that you already have, just rearranged slightly. We're adding error handling, because I wanted to demonstrate how you can do it here. The traditional way of error handling in Go: you run your command, you get your error, and then you check if the error is not nil and do something. In this case, we just exit the program, because if we can't listen on the port, there's literally nothing we can do, so we might as well exit. Also, in writing this talk, I don't know how many times I lost one of these running in some window, had the port tied up, and couldn't find it. Pain in the ass. So, we're going to start here. But now, what was the next step after accepting a connection? Well, the next step was that we have to do something with that connection. In our case, we're going to read a request, because we know there's going to be a request on it. But when we just talked about concurrency, we said we were going to add our concurrency at the client level, at the connection level.
So that's probably about here. After we accept, we want to do something concurrently. How do we do that in Go? If you've played with Go, you've probably seen the go keyword. So what we're going to do here is say go handleConnection and pass in the connection. The go keyword is magical. What it tells the runtime is: take this function call, this bit of code, and go run it somewhere else. Invoke it just like you had called handleConnection normally, except invoke it in a separate goroutine. The main goroutine then just continues on. So from the perspective of main, it's like nothing happened here. From the perspective of handleConnection, it got invoked from main and it starts running. Main immediately loops back to accept and accepts another connection. So when another connection comes in, it spawns a new goroutine. First connection comes in, we spawn a goroutine; second connection comes in, we spawn another; et cetera, et cetera.

Now, what happens in handleConnection? You've probably guessed we're going to take a lot of the code we just had and move it into this new function. In fact, you can start by just cutting and pasting all of the code you had inside the accept block into a new function. That's going to look like this. Again, this is mostly the same code you had, there's a NewReader call, there's a ReadRequest call, but we're restructuring it for better behavior. So let's walk through this and see what's actually happening.

First, we have this defer. If you're fairly new to Go or haven't seen this, defer is really amazing. Let's think about it from our concurrent proxy's perspective. We have a user. That user is going to be sending us requests. The part of our code that is talking to that user is this function. So logically speaking, if this function were to fail or die or crash or exit or whatever, if that goroutine goes away, we want the user's connection to go away too, because we don't want them to still have a connection open, sending into the void. Go lets us do this sort of thing using defer. Defer says: run this code, this command, this function call, when my surrounding context, my function, exits, no matter how it exits. In this case there's a return down here in an error case, and there are return calls later; if any of those get hit, or if our function panics, which is Go's equivalent of throwing an exception, somewhere downstream conn.Close will get called. If you've written this kind of code in other languages, you've probably had to do: oh, error case, conn.Close, return; and then later, error case, conn.Close, return; and you have to remember to close the connection every time. Go lets you do it once, make it very clear that this will always happen, and then not worry about it.

The next part is our NewReader; this is the same thing, we're just creating a bufio reader. This is new, though: we've added an infinite loop here. If you remember, in the last part we did a while-true, an empty for, around the accept, because we were just going to accept connections over and over, but we were only processing one request per connection because we weren't concurrent. Now we have one handler, one goroutine per user, so we might as well leave the connection open and just read requests over and over and over.
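So main, with the go keyword added, ends up looking roughly like this. A sketch: the exact log messages and error handling details are how I'd write it, not necessarily the repository's code (imports: log, net).

    func main() {
        ln, err := net.Listen("tcp", ":8080")
        if err != nil {
            log.Fatalf("failed to listen: %s", err) // nothing to do but exit
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                log.Printf("failed to accept connection: %s", err)
                continue
            }
            // Hand this client off to its own goroutine, then immediately
            // loop back around and accept the next connection.
            go handleConnection(conn)
        }
    }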
So we add this loop around ReadRequest. This next bit is a little funky: we have to know when the user gets rid of the connection, when the browser closes the connection on us and goes away. We have to know that somehow so that we can exit; otherwise we leak resources, we leak the memory and everything in the goroutine. The way we do that is that now we're actually going to look at the error that ReadRequest gives us, because it can give us an EOF. The way it says that the connection has closed is that ReadRequest returns a nil request and an EOF, which is not strictly an error case, but it's how you know that it's done. All right, does anybody need more time to type? Good. So now we're just going to cut and paste. This is the exact same code, no modifications, from part one: from the net.Dial, to writing the request out, to the read response, to the resp.Write. We're not going to touch that yet. You can just put it in the middle of your block, down where the "more code goes here" is, and then we can just do this. All right, I'll give it a second to make sure people have got that, because we're about to compile again.

So great, build part two, same as before. Your directory might change depending on where you're at in your file structure, but run go build. If you have continued on from part one, you might get a build error saying you've imported a package you're not using, in which case you have to update your import line. I don't think that actually fires yet, but part three will definitely have it. So let's see. Once you get this built... let me find my terminal again, the part two final... part two... I'm running my web server somewhere, I think. Great, we have a response here. But now the fun part is going to be verifying that we're actually getting concurrent behavior. If you know the HTTP protocol, you can do something like typing GET / HTTP/1.1 and hitting enter twice. Now, HTTP/1.1 defaults to keep-alive, so in this case the connection is still open; we could send it another request and we'd get another response. So let's do that concurrently: a GET / HTTP/1.1 on this connection, and, where's my other terminal, a GET /bar on another. So now we have two clients actually talking to our proxy, and you can do the same if you wish. If you think about it, we went from building a single-threaded, serial proxy in about 14 lines of code to building a completely concurrent proxy in 20-ish, right? And this is some of the power of Go.
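Recapping this part, handleConnection ends up looking roughly like this. A sketch: the exact log messages and structure of the EOF check are my phrasing of what we just walked through (imports: bufio, io, log, net, net/http).

    func handleConnection(conn net.Conn) {
        // However this function exits, make sure the client connection goes away too.
        defer conn.Close()

        reader := bufio.NewReader(conn)
        for {
            req, err := http.ReadRequest(reader)
            if err != nil {
                if err != io.EOF {
                    log.Printf("failed to read request: %s", err)
                }
                return // EOF: the client closed the connection, so we're done
            }
            // ... dial the backend, write req, read the response, write it back ...
            _ = req
        }
    }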
If you were going to do the same in something like Perl, you absolutely could. You'd have to use a module, you'd have to pull in, you know, POE or Danga::Socket or something else, and you'd have to know that module's specific way of implementing async I/O or concurrency. In Python you could use the threading module, or Tornado, or Twisted, or whatever else. So I'm not arguing that Go can do things other languages can't, just that it does them very easily, that they're built in, and if you're doing a one-off little network service, it's so easy. Did everybody get it compiled? Did it work? Yes? A couple of thumbs up, people following. Fantastic. Any questions before we move on? Awesome.

Let's go back to our slideshow here. So, great: this new version is faster. Just by doing this little bit of concurrency, it goes up to about 2,000 queries per second. It turns out re-establishing all those connections to clients and everything is pretty slow. For this benchmarking, to give some background, I'm just using ab, Apache Bench, on the local machine, so it's not really a great benchmark, but it works. And obviously we're still not actually doing anything useful with our proxy; we're just proxying connections to a backend, which is great, but not really what we're trying to do. So let's actually gather statistics about requests. This is going to be a simple iteration of the idea, and you'll see how you can extend it, but it gives you the picture. So let's think: what kinds of statistics could we gather about a request? Well, there are response codes, response sizes, there's timing information like time to first byte and total response time, there's information on clients; there's a whole host of things you can get. Some of these you can get out of access logs: request sizes are in access logs, total time can be in access logs. There are other things that don't show up in access logs and are much easier to get from a proxy. But for today, we're just going to do request size, or response size, more accurately.

So you can either load up part 3 slash main.go and get a pristine starting point, or continue from where you're at if you're happy with your code, or load part 3 final slash main.go if you want to follow along with the code visually. I'm going to repeat that a couple of times; hopefully it works for somebody. So now we're back at the top of the program. Below the import section and above your main function, we're going to define some global variables. It kind of makes sense: all of our proxy's users are doing things concurrently, but if we want to gather statistics, we probably want global statistics. I don't really care whether user X had usage pattern Y; I want to see what the entirety of my website is doing, what all my traffic looks like, and the easiest way to do that is to have global statistics. So in our case we're creating a map where the keys are strings and the values are integers. In other languages this is called a dict, or a hash, or an associative array; in Go it's called a map. And then we're going to create a mutex. We'll talk a bit more about mutexes, because if you've read Effective Go or other things about Go, you might be wondering why we're using one. The next thing we have to do is actually initialize our map. Everything in Go is initialized to its zero value, and for maps, channels, and slices the zero value is nil, which means you can't use them. So we actually have to create it, and we have to use make
because it's a built-in data type. The initialization function: I'll mention this if you haven't seen it before. init is something that runs when the program starts, so you can think of it like main, except any module can have an init function, and it gets executed at the beginning of execution. There can only be one main function, because there's only one entry point to the program, but there can be a lot of init functions, and they all get executed; I think it's in a non-deterministic order, but don't quote me on that. So, great: we have a global map, we have a mutex, and we initialize our map so that we can use it.

The next thing we have to think about is, okay, we have to put some data in this map; we have to actually gather statistics about what's happening in our proxy. So we're going to have a function that does that. You can add a new function above main or below main, it doesn't really matter. I called it updateStats; you can call it whatever you want. It takes a pointer to an HTTP request object and a pointer to a response object, because from our perspective this happens when the proxy has gotten the response back from the backend: we want to collect some data about that transaction, save it to our map, and then move on with sending the response. You've seen this pattern before with the defer. We have a mutex, so we lock it and then defer the unlock. This is a really useful pattern if you're dealing with locks, because it's really annoying to diagnose locks that are held and never released; you get deadlocks and problems. But if you defer the unlock here, it doesn't matter how this function exits; it could crash and panic, and that mutex will still get unlocked. So from a best-practice perspective, if you're ever dealing with mutexes or locks, the deferred unlock is really the way to go; it'll save you a lot of trouble.

The rest of this is really straightforward: we get the current value of request bytes for a path, add in the response's content length, put it back in the map, and return it. Unlike in Python, if you index into a map with a key that does not exist, you get back the zero value. In this case we have integers as our values, so we get back zero; if you have strings, you get back an empty string; if you have channels or something else as your value type, you get back nil. So you don't have to check whether a key exists before you update the map.
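Sketching out what we just described: the globals, the init, and the update function. The names requestBytes, requestLock, and updateStats are whatever the tutorial code actually calls them; treat these as assumptions (imports: net/http, sync).

    var (
        requestBytes map[string]int64
        requestLock  sync.Mutex
    )

    func init() {
        // Maps start out nil, so we have to make this one before using it.
        requestBytes = make(map[string]int64)
    }

    func updateStats(req *http.Request, resp *http.Response) int64 {
        requestLock.Lock()
        defer requestLock.Unlock() // always released, no matter how we exit

        // Missing keys read as the zero value, so no existence check is needed.
        bytes := requestBytes[req.URL.Path] + resp.ContentLength
        requestBytes[req.URL.Path] = bytes
        return bytes
    }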
I may be getting ahead here, or overthinking it, but would this process of updating stats be better as a separate goroutine with a channel?

Great question. So let's talk about mutexes and Go; you're only slightly ahead, and I was hoping somebody would ask that, because it's really a great thing to talk about. If you go read a lot of the blog posts, people say: Go has channels, channels are thread safe, you don't have to deal with locking, you don't have to deal with all this stuff. Channels are great; I want to write Go in a very Go way, and channels are very Go. In fact, if you read a lot of the blog posts by Rob Pike et cetera, they say share memory by communicating, don't communicate by sharing memory, and clearly, in this case, we're communicating by sharing memory. Channels have things that they are really good at. Channels are really good for cross-goroutine coordination: if you have a worker queue and a bunch of workers, using channels to distribute work to them is a fantastic application. You don't have to deal with locking or with contention on that channel; Go does all of that for you in the runtime. They're also great, as mentioned, for passing ownership: if you have something you want to give to somebody else, passing it through a channel works really well. Mutexes are, well, if you've done other engineering you've dealt with mutexes; they're good for locking things.

When it comes down to it, use whatever is simplest. We're all trying to write software here, and especially if you're writing production-quality software that other people are going to have to work on and use, the goal in my mind is not to use whatever is cool in the language. A channel would totally get the job done; you're right. We could have a goroutine that listens on a channel, send stats requests down that channel, and have that goroutine own the data structure, and that would probably work fine. But in this particular case, I think: we have an update function, it updates a data structure, there's only one thing touching it; a mutex felt a little simpler for this implementation. Like a lot of the things I mentioned, it's kind of six of one, half a dozen of the other. If you find yourself using locks in a dozen places, and then you have locks to protect your locks, and a read-write lock here and a mutex there, and you're debugging race conditions and things like that, you're probably doing Go wrong, right? On the other hand, if you're thinking, well, I could totally use a channel here, but it's not necessary, then you don't have to. Keep it simple; find what is expressive for your problem. Does that answer your question?
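Just to make that comparison concrete, a channel-flavored version of the same idea might look like the sketch below: one goroutine owns the map, and everyone else sends it updates. Note that getting the running total back out (which we'll want for a header in a moment) needs a reply channel, which is part of why the mutex felt simpler here. All of these names are hypothetical, not the tutorial's code.

    type statUpdate struct {
        path  string
        bytes int64
        reply chan int64 // where the owner sends back the new running total
    }

    var statChan = make(chan statUpdate)

    // statsOwner is the only goroutine that ever touches the map, so no lock is
    // needed. It would be started once, e.g. "go statsOwner()" from init or main.
    func statsOwner() {
        counts := make(map[string]int64)
        for upd := range statChan {
            counts[upd.path] += upd.bytes
            upd.reply <- counts[upd.path]
        }
    }

    // Callers share memory by communicating: send the update, wait for the total.
    func updateStatsViaChannel(path string, n int64) int64 {
        reply := make(chan int64)
        statChan <- statUpdate{path: path, bytes: n, reply: reply}
        return <-reply
    }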
Cool. You know, I'm not a, what's the word, zealot when it comes to languages. I use whatever gets the job done so I can move on to the next thing. I happen to like Go, but I'll do whatever I need to do to get it done.

So, right, let's go back to our code. This is the read-response and write code you already have; we've already got these two bits, and we're just adding a call to our updateStats here. We have to get the response back from the backend before we can actually update our statistics, because we have to know how big the response is before we can put that into our structure. So we call updateStats and we get the current byte count. And in this case, one of the things you do in proxies is modify the data: you modify your request or your response. Don't be evil. In our case, we're going to modify the response and add a header that says how many bytes have been requested on this path, or rather, how many bytes have been returned on this path. Pretty straightforward. All right, I'll give it a second for anybody who's typing, and then we'll test it. So now: connection closed, because I've been sitting here idle; there's the error handling we added a while ago for failed-to-read-request, malformed request. So now we build part three, and that's running, and now let's go back to our curl. If we run curl against it, you'll see here's the header that we added, showing that our stats are working. If we do it again and again... we can do something like this and fire off a whole bunch of requests, cancel that, run it again, and now it's done 77 megabytes of slash. And if we give it a different URI... So it looks like our data structure is working; we're collecting global statistics about paths and how big the responses are. Obviously you could do anything here, and probably not publish the stats back to your users, but you could. We actually did this in our load balancer for LiveJournal: we put a header in there that was basically how long you had to wait for a backend, and how many other people were waiting in line as well, back when people used to complain about how slow the site was. People wrote plugins for their browser that would pop it up and say, you waited four seconds for that page to load. Not really a great thing, but it was fun.

All right, so we have our testing. Let's get a little more wild with our proxy. Obviously, at this point, for every request we're still establishing a new connection to a backend, and that's inefficient. So let's look at connection pooling, which is a pretty commonly used thing in services. What we don't want to do is create a connection for every user, because if we have 100,000 users connected to our website with long-lived connections, we don't necessarily want 100,000 connections to our backend that are just sitting there idle most of the time. That's not an efficient use of resources. We probably want a pool of backends: keep our connections to clients, and whenever a client sends us a request, get a connection from our pool, use it, and then put the connection back in the pool. That's the most efficient way of doing this kind of thing. I mentioned the ephemeral port issue; has anybody actually run into problems with ephemeral port exhaustion in the past? Great. If you're ever benchmarking something and it goes really fast for about 10,000 or 20,000 requests and then stops, and 60 seconds later it goes really fast for another 20,000 and then stops again, you're probably running into ephemeral port exhaustion.
If you want more info, you can google it. It's fun and interesting when it happens in development, less so when it happens in prod. So how are we going to build a connection pool? Well, if you think about the algorithm I just described, it's basically a queue. You have backends, you put them in a queue; when a client needs a backend, it pulls one from the queue, uses it, and puts it back in the queue. So we're going to look at doing queues in Go, which means we're going to get into channels. We're also going to take some time here: we've been building these bufio readers and writers and using them, which means we're dealing with a connection, a socket object, a bufio reader object, and a bufio writer object. I don't really want to pass all three of these around all over the place, so we're going to create a structure to put them in, and that's our top part here. Oh, and since I didn't say it out loud: if you want to follow along, we're in part four. You can open part four slash main.go if you want to type code, or part four final if you want to read code.

So we start with our structure here, and we're using an embedded type. Has anybody played with embedded types in Go? One hand, two hands, great. I'm going to assume you've done object-oriented programming in other languages; yes, some heads nodding, perfect. If you subclass, if you had a Foo object and you subclassed it with a Bar object, and Foo has a method on it, a method Close, and then you call Bar.Close, it ends up calling Foo.Close if Bar doesn't have one. It's this idea that methods get inherited by subclasses. Go does not have subclassing or object-oriented programming in that sense; Go has types and interfaces and methods and things. But one thing Go has is what's called embedding, or type embedding. When you define a structure or an interface, you can put an anonymous type at the beginning. In this case, you'll notice in our structure that the bottom two, the reader and the writer, each have a field name, Reader and Writer, and then a type, but the top one, net.Conn, does not have a field name; it just has a type. What we're basically telling Go is: hey, we're going to create this Backend structure, and it's basically a wrapper around this other type called net.Conn; it's an embedded type. It effectively lets us, when we create a Backend in a little bit and use it, call methods on that Backend which then get called on the net.Conn. So it's a way of simplifying your methods, and it's a way of allowing us to pass Backends to things that expect net.Conns. It's sort of a subclass. So we're going to embed a net.Conn, and we're going to add a reader and a writer.

Then we're going to add a backend queue channel. Channels are pretty awesome in Go, and we're going to use one for our queue here. We're going to say that this is a channel of pointers to Backend structs. We have to update our initialization function, because as with maps, you have to initialize channels; you have to make them. So we make this here, because it's a global channel, and we set it to a size of 10. In essence, when you create a channel you have two options: you can create an unbuffered channel or a buffered channel.
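Here's roughly what that structure and queue look like; the field names are my assumptions about the tutorial code (imports: bufio, net).

    type Backend struct {
        net.Conn // embedded: a Backend can be used anywhere a net.Conn is expected
        Reader   *bufio.Reader
        Writer   *bufio.Writer
    }

    var backendQueue chan *Backend

    func init() {
        // A buffered channel of size 10: it can hold ten idle backends before
        // sends start to block.
        backendQueue = make(chan *Backend, 10)
    }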
The way I think about this is: if I'm going to give you something, if I wanted to hand you this remote, you have to take it from me. That's like an unbuffered channel, in the sense that if I want to hand this to you, I have to wait until you are ready to take it; I'm blocked. If I say I'm putting this into the channel, I'm blocked until the other person is ready to take from the channel. There's no buffer. A buffered channel is more like a mailbox. If he has a mailbox, I can go put this in the mailbox, I know he'll be able to get it later, and then I can go about my business. But as with any mailbox, there's a size limit. You define how big of a buffer you want for your channel, and once that buffer is full, once your mailbox is full, you can no longer put things in it, and then you block. So in this case, we're creating a backend queue of size 10, which means it can hold 10 backends before it starts to block.

Now we're going to get a little hairy with the code, because if you think about how this queue is going to work, we basically have two operations: we want to get a backend from the queue, and we want to put a backend into the queue when we're done with it. So let's think about getting a backend from the queue. Logically, there are two paths: either there's a backend in the queue already and we'll just use that, or there's not and we'll make one, which is what we're doing right now, just making backends all the time. So you're going to see some of that code. I like small functions; I like moving behavior into small little units that are easier to think about. So in this case we're going to have a getBackend function, and it returns, like any good Go function, a value and an error. This is a more complicated, more interesting bit of code, so let's break it down into our two cases.

You've probably seen select statements in other languages, where you write select, case of this, case of that, and it goes top to bottom until one of the cases is true, and then it runs that case. Same idea in Go: you say select, you give it cases, and it blocks until one of those cases is possible, and that's the part to remember. In this case we have no default, so Go will block here until one of these cases becomes possible. Let's look at the top one. This is the syntax for reading from a channel: we have our backendQueue channel, we want to pull data out of it, so we say case be receives from backendQueue. If you read from a channel that has nothing in it, you block; if there's an empty channel and you say read, you block. So in the ideal case, when we call getBackend, this fires, there's a backend in the queue, we return it, we move on, we're done; it returns backend comma nil, because there was no error.

Now, in the other case, let's say there's no backend in the queue. Well, we could just immediately create a backend, which is what we have been doing, and that would be fine, but I wanted to demonstrate a common paradigm. In this case it costs us resources to create a backend: it's a new connection, it takes time to set up, you have to do the TCP handshake, and you're not even sure yet whether there's a web server on the other side or whether you're just talking to Linux, which has optimistically accepted you and put you in the listen queue. There's a penalty to creating a new backend. We know that, and so we're going to say we're willing to wait a little bit to see if a backend becomes available, to see if somebody else finishes with one and puts it back in the queue.
So that's the construction. Any questions about it? In the back: "Sorry, could you go through that one more time?" Yeah, totally. You come into the select and a backend is not available, so you kick off the timer and you're waiting, and the question is whether Go comes back into the select before the timer expires. Right: select in Go is not like in other languages, where select runs top to bottom, case, case, case. Go is concurrent, or parallel if you want to think of it that way. I believe it technically evaluates the cases in random order, but it keeps evaluating all of them until one becomes possible. You can also have a default section, and if you do, then when none of the cases are ready the default fires immediately. So another way to write this: the original implementation here had the first case and then a default, and the default was to create a new backend. But in this particular situation that's less efficient than waiting a little while for a backend to come back. The other edge case is that if you're reading from three different channels and all three are viable, Go will pick one at random; it's defined to be random. Does that make sense? Perfect.

OK, so that's getting a backend from the queue. The other part I mentioned was putting a backend into the queue, and you might think, oh, we just put it back into the channel and we're good, but there are actually two cases there as well. They're much simpler. queueBackend takes a backend and doesn't return anything, because ideally it's never going to fail. Option one: put the backend in the queue, and in that case there's nothing else to do; we're done. Option two: imagine 100 people hit the website at the same time. We don't have 100 backends available, so we create 100 backends. Now those users go away; they're done. We have 100 backends, but our queue is only 10 long, so we can't put 100 backends in the queue, and we don't want to. We actually want to allow them to die off. So we use the same timer trick: I'm willing to hold on to this backend for a period of time, in this case one second, and if within that second the queue empties out or a slot becomes available, we put our backend in the queue and we're done. If a second goes by and that hasn't happened, we close the backend and move on.
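The corresponding sketch, under the same assumptions as above:

```go
// queueBackend tries to return a finished backend to the pool. If the pool
// stays full for a whole second, the connection is closed and allowed to
// die off rather than blocking the caller forever.
func queueBackend(be *Backend) {
	select {
	case backendQueue <- be:
		// There was room in the queue; nothing more to do.
	case <-time.After(1 * time.Second):
		be.Close() // resolves to the embedded net.Conn's Close
	}
}
```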
So how do we use this code? The top bit is your existing code: the net.Dial, the new reader and writer, write the request, read the response. And this is the new code, with better error handling. We call getBackend, it returns something to us, and then we use that backend, be.Writer. Remember how earlier I said we were writing directly to the connection and not through buffered IO, because we hadn't created a buffered writer? We've now created one for writing, so we're going to use it. The caveat, the trick, with buffered IO writers is that you always have to flush. If you don't remember to call Flush, there will be a trailer of bytes sitting in the buffer forever, and you'll wonder why your proxy doesn't work anymore. If you're only getting half a page, that probably means you forgot to flush. The final bit is that we re-queue the backend, putting it back in the queue. Note that we spin this off in a goroutine. Remember, we're in the middle of handleConnection here; we are handling requests from the user, and we don't really want to block that goroutine ("thread" is the wrong term) holding on to a backend that we may or may not be able to put into the queue, because if the queue is full we're going to block for a second. So spin it off in a goroutine: there's no return value you care about, you just want this thing to happen asynchronously at some point in the future, so let it deal with itself. This is a pretty common pattern you'll see in Go code: when you have something you want done but you don't need to care about when it happens, you can just spin it off in a goroutine. If you do care, and you still don't want to block, you can spin it off in a goroutine and give it a channel, then go look at that channel later for the result.
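Roughly what that looks like inside the handler. This is a sketch that assumes the request has already been parsed into req and that client is the user's connection (add "net/http" to the imports; names are illustrative, not the part four code verbatim):

```go
// proxyRequest forwards one already-parsed request to a pooled backend and
// relays the response back to the client.
func proxyRequest(client net.Conn, req *http.Request) error {
	be, err := getBackend()
	if err != nil {
		return err
	}

	// Write the request through the backend's buffered writer...
	if err := req.Write(be.Writer); err != nil {
		be.Close()
		return err
	}
	// ...and always flush, or the tail of the request sits in the buffer.
	if err := be.Writer.Flush(); err != nil {
		be.Close()
		return err
	}

	// Read the backend's response and copy it out to the client.
	resp, err := http.ReadResponse(be.Reader, req)
	if err != nil {
		be.Close()
		return err
	}
	if err := resp.Write(client); err != nil {
		be.Close()
		return err
	}

	// Re-queue asynchronously so this goroutine never blocks on a full pool.
	go queueBackend(be)
	return nil
}
```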
So, great. Now we can play with part four. There's nothing really to demonstrate with this one; there's no visual output. Ugh, can't type if I'm not looking at my keyboard. Whoops, I was about to go run it in the wrong window again. It's running, it still has our bytes here, so it's functioning. If you want to play with it, you can add some print debugging and it will tell you what it's doing.

So now we have a proxy that handles concurrent incoming connections. It can handle tens of thousands of parallel connections at the same time, it has a connection pool on the back end so it makes efficient use of resources when talking to servers, and it collects some global statistics on requests. What else could we do? There's still one place that needs a bufio writer: writing the response back to the user. You can play with that if you want; I will say that in benchmarking it's roughly a 2x improvement to fix it, because the HTTP library writes out in a very inefficient way, making many calls to write, something like 20 calls for a header. So you definitely want buffered IO there. There's also a problem with our connection pool: we put connections into it and never get rid of them unless we have traffic, so if you have a low-traffic website you'll end up with old connections that just sit there and expire, and then you'll send some bad responses to users because they got a bad backend. There's also an optimization you can do by keeping the queue warm: saying, oh, my backend queue is empty, I'm going to put in some backends. That's a pretty easy thing to do with a goroutine; you just have a goroutine that checks whether the queue is empty and spins up some backends. You could add a lot more statistics and things. The current version is about 150 lines of plain Go in a file, which is fine but not great. When I started writing this talk I wrote an initial version that I had to pare down quite a bit to present; if you look in the final directory, you'll see an implementation, which may or may not be any good but is fun, that does backend pre-warming, backend expiration, and a lot of the other things I just talked about that I don't cover in the talk.

So we're going to switch gears slightly. Before we do, any questions so far? Great. We're going to talk about RPCs. I was planning this for an hour thirty and then found out I had an hour forty, so we're not exactly low on time yet, but we'll look at how you can do RPCs in Go. Go actually has built-in functionality for adding RPC clients and servers to your application really easily, so for fun we can add that to our proxy. This is in the part 5 directory; there's a main.go and a part 5 final, as per usual.

When you think of RPC there are obviously two sides: a server and a client. There's the side that receives requests and answers them, and the side that makes requests. We've done a lot of that with our proxy, but we've been managing it manually ourselves: we listened on a port, we accepted a connection, we read a request, we dealt with it ourselves. Now we're going to leverage some of Go's functionality for doing RPCs so we don't really have to deal with it. There are two parts to that. This is a mess of code here, but I'll walk through it. In essence, the RPC system in Go works by defining methods with a specific, explicit signature. Looking at our GetStats here: if you want to use Go's built-in RPC library, you define a method on a pointer to something, in this case RPCServer, and it has to take two arguments. The first is the input arguments, the second is a pointer which will hold the output arguments, and it has to return an error. That's the signature. In this case we're going to implement a method that returns our requestBytes global structure, because we want somebody to be able to connect to us and say, hey, give me your stats, all of them, not just the stats visible on one particular page. We don't take any arguments, so we define an empty struct, because you have to have something, and our return value is the Stats struct, whose RequestBytes field has the same map signature as our global map. We then define our RPCServer type. Go does a lot of things based on type names; in this case it's an empty struct, but it's a way of collecting behavior under a name. If you're familiar with Python, this is like writing "class RPCServer(object): pass", an empty thing with no behavior of its own. Empty structs are useful that way. Then we have our GetStats method, which is on RPCServer, takes no real arguments, and fills in a Stats struct as its reply. We have to deal with the lock, our mutex, because we don't want to race with updates to the request stats: we take the mutex and then we copy. It's very simple: we make the map in our reply, copy all the data in, and return nil because there's no error. And that's how you define them. You could then create new methods, func (s *RPCServer) DoSomething(args, reply) error, and they become exposed in the RPC system.
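A sketch of the types and the method. I'm assuming globals along the lines of "var requestBytes = map[string]int64{}" and "var statsMutex sync.Mutex"; the part 5 code may name things differently:

```go
// Empty is a placeholder for RPC methods that take no real arguments.
type Empty struct{}

// Stats mirrors the shape of the global request-statistics map.
type Stats struct {
	RequestBytes map[string]int64
}

// RPCServer is an empty struct that exists only to hang RPC methods on.
type RPCServer struct{}

// GetStats has the shape net/rpc requires: a method on a pointer receiver,
// taking an args value and a pointer for the reply, and returning an error.
func (s *RPCServer) GetStats(args *Empty, reply *Stats) error {
	statsMutex.Lock()
	defer statsMutex.Unlock()

	// Copy the global map under the lock so the caller gets a consistent
	// snapshot rather than a reference we're still mutating.
	reply.RequestBytes = make(map[string]int64)
	for path, n := range requestBytes {
		reply.RequestBytes[path] = n
	}
	return nil
}
```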
The second half is that you have to actually initialize the server. Back in our main function, you call rpc.Register with our RPCServer, because that's the name of the thing that has the behavior on it, the empty struct we defined, and then rpc.HandleHTTP. There are a couple of different protocols the RPC server can speak; this tells it to use HTTP. Then it's the same listen as anything else: you establish your listening socket on whatever port, and you run a goroutine that serves HTTP on that listening socket. In our case we're not running this on the same port as the proxy. You could, but you'd have to do some path separation to make sure the proxy doesn't try to serve the RPC paths, so here I'm just doing it separately. That's it; that's all you have to do on the server side. If you implement this in your proxy, you have a fully functional RPC layer that will return the statistics object.

On the client side, it's this. That's it, really. Well, you have to have the same types: the empty struct and the Stats struct we defined in the main program we also have here. But then you dial HTTP to the endpoint and you call, and Go handles all of the marshalling. It handles figuring out, oh, this is a map we're passing back and forth, how do I put that over the wire efficiently, how do I discover what methods are available on the remote end. Go just does that for you; you don't need the really long definitions of functionality you might need in other languages. It's pretty straightforward.

And this one's actually demoable, if you have it and want to play with it; part 5, I think it's just called part 5. So now we have a server; part 5 client, now we have a client. If I run this, nothing happens because there's no data yet: index out of range, that's great. So let's run some data against it; one way of testing this is running wget's mirror mode against it. There we go, stuff's happening. Now if we run our client here, it's making an RPC call to the proxy, which is actively doing stuff, although I think it just finished, and you can see it's returning the data. Not a lot of code; easy to do.
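Sketched out, reusing the types above, with an assumed port and the imports "fmt", "log", "net", "net/http", and "net/rpc" (the repository's part 5 code will differ in detail):

```go
// Server side, called once from main. Port 8079 is just a placeholder,
// separate from the proxy's own listener.
func startRPCServer() {
	rpc.Register(&RPCServer{})
	rpc.HandleHTTP()
	l, err := net.Listen("tcp", ":8079")
	if err != nil {
		log.Fatal(err)
	}
	// Serve RPC-over-HTTP in its own goroutine so main can carry on.
	go http.Serve(l, nil)
}

// Client side: roughly what the part 5 client program does.
func fetchStats() {
	client, err := rpc.DialHTTP("tcp", "localhost:8079")
	if err != nil {
		log.Fatal(err)
	}
	var reply Stats
	// "RPCServer.GetStats" is TypeName.MethodName on the server.
	if err := client.Call("RPCServer.GetStats", Empty{}, &reply); err != nil {
		log.Fatal(err)
	}
	fmt.Println(reply.RequestBytes)
}
```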
We covered a lot of ground today. We've built a fully functional HTTP proxy with an RPC layer, and unless you're a super large website doing tens of thousands of queries per second, you could actually take this code, deploy it, and start gathering statistics on your requests if you wanted to. I wouldn't necessarily recommend that without some more battle testing, but it's fun.

So: I'm a proponent of Go, I think it's fantastic, but there are some gotchas if you're going to use it in heavy production. Go is still new as a language; it's only a couple of years old, and if you read the mailing lists you'll see the flame wars: Go needs generics, Go needs exceptions, Go needs whatever else it doesn't have right now. The team is being slow and methodical about language design; they don't want to add a lot of functionality until it's really clear what's needed and why. If you're someone who really needs the things it doesn't have, you might have to wait a while. Also, best practices: our industry as a whole is very new. This whole internet thing is newfangled, we're still figuring out which end is up and how to deal with it, and every week there's a new fascinating OpenSSL vulnerability or the like. We're still learning how to write code and how to test it, and with a language like Go we're still learning the best practices.

As for adoption: at this point I recommend it. At Dropbox, every request you make goes through at least one or two layers of Go. Our primary data storage engine is MySQL, but we have a data storage layer on top of MySQL to handle things like multi-region and failover, and that layer is written in Go. It's called Edgestore; there are talks available about it. We trust it; we run our high-production services through it and it works pretty well.

It has some gotchas, though. One of them is garbage collection. GC has never really bitten me much in Perl or Python, but in Go I've run into it a couple of times. When you're doing thousands and thousands of QPS, those ten allocations you do per query start to add up, and when you start to see 20-millisecond, 50-millisecond GC pauses, they become noticeable in your graphs. There are a lot of tips for dealing with that: keep your allocations in check, use free pools, and so on. If you've done Java in the past, you probably have a whole world of tricks you can reuse.

Libraries: the standard libraries are pretty good. We just implemented the HTTP proxy in about 150 lines of code and the RPC server in about 15, and you can do some really powerful things with them. On the other hand, they are just a couple of years old, so you will find some interesting edge cases. If you dig around in the proxy, you'll find it doesn't really speak HTTP/1.0 very well, which caused some problems with ApacheBench, because ApacheBench only speaks 1.0. Arguably you don't really need 1.0 much these days, but it's something to think about. Third-party libraries: well, you win some, you lose some. A lot of people got really excited about Go, wrote some libraries, and then went off to other things, and their libraries are unmaintained, so you have to be a little picky about the libraries you choose. Dropbox has released a lot of our internal libraries; we have a net2 and an http2 and some other things, because we found the core libraries were missing thread pooling or connection pooling and other stuff we wanted, so we implemented that on top. Definitely look for things that are updated and that people are using in production; there are a couple of companies out there releasing good stuff.

Also, this tends to bite everybody at one point or another: by default, Go only runs effectively in one thread. That's a simplification, but if you think, oh, with this language I'll spin up my goroutines and it's going to saturate all 16 cores, that's fantastic, you won't see that by default. You have to tell Go it's allowed to use up to 16 processors. You can do that, but you should benchmark: if you're doing something that will actually push 16 cores and you're making heavy use of channels, you're probably going to see some inflection points in performance, where it works really well with 4 cores, works okay with 8, and falls off a cliff at 12, or something like that. The overhead of communication as you increase the number of things gets to be pretty heavy.
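The knob in question is GOMAXPROCS. A minimal sketch, assuming the Go of this era (before 1.5), where it defaults to 1 unless you raise it in code or via the GOMAXPROCS environment variable:

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Pre-1.5, GOMAXPROCS defaults to 1, so goroutines interleave on a
	// single core unless you opt in to more.
	runtime.GOMAXPROCS(runtime.NumCPU())
	fmt.Println("now using up to", runtime.GOMAXPROCS(0), "CPUs")
}
```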
So, in short: the best practices from C and Java and so on are mostly the same. How do you design your memory usage, how do you think about allocations, how do you think about mutexes and things like that. Just because we have a fancy new language doesn't mean the best practices for writing code have changed. People sometimes come into Go thinking it's going to solve all these problems so they don't have to think about them anymore, and in some cases that's true, but in other cases it doesn't work that way; Go doesn't mean you can forget about everything forever. If you come from Perl or Python or Ruby, you will instantly find that the code you write takes maybe 1.2 times as long to write and runs 10 times faster. Out of the box, a compiled Go program is on average about 10 times faster than the equivalent interpreted-language program. So maybe you never have to worry about how fast your stuff is. But you might.

So, thank you. I love Go, and I hope you'll actually go out and write some. We have a few minutes for questions, and otherwise you can find me online here. There are also a lot of good things to go find: other talks, documentation. The slides are on GitHub, and everything else. So, questions?

"Just a general question about Go. My particular interest is in multiprocessing and distributed processing. It would be lovely to treat distributed processing as just another goroutine, to say, by the way, Go compiler, use these other 16 machines and run stuff on them as well. I'm not expecting that, but is there any reasonable facility to make that easy in Go at the moment?" Sort of. I'm trying to remember the name of the library. I think everybody could hear the question: can channels run over networks? Yes. There is a fantastic library that basically lets you extend channel functionality across the network. libchan, maybe? That might be it. But yes, some people have done that and implemented it, and at that point you can build your one little program, still use the same channel functionality, and have it run across a bunch of machines. I haven't had any need to play with it, and I don't know too much about how it handles the remote side, but definitely look up libchan and some of the stuff Docker has been doing; people have definitely been going in that direction.

"I had a question. You collect your own statistics; is that a comment on expvar, the built-in package for exposing variables? You've done your own struct and mutex locking. Is that because you don't like expvar?" You know, I don't think I've played with it too much. For the demo case I'll have to look into it. "expvar does basically exactly what you've done, and adds a web handler." Ah, okay, nice. There are a lot of things in Go like that. Do you know when it was actually added? "That might have been in the past year or two." Oh really? Okay, I'll look into it.
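For reference, this is the pattern the questioner is describing, a minimal expvar sketch rather than code from the tutorial (the port is a placeholder):

```go
package main

import (
	"expvar"
	"log"
	"net/http"
)

// An expvar.Map is safe for concurrent use, so it replaces a hand-rolled
// map plus mutex, and importing expvar registers a /debug/vars handler
// that dumps every published variable as JSON.
var requestBytes = expvar.NewMap("requestBytes")

func main() {
	requestBytes.Add("/some/path", 1234) // e.g. bytes served per path

	// Visiting http://localhost:8081/debug/vars now shows the stats.
	log.Fatal(http.ListenAndServe(":8081", nil))
}
```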
More questions? Thank you all for waiting for the microphone. "I've heard that the garbage collection stuff is going to be fixed in 1.5. Do you know anything about that?" Yeah. It's been a common complaint about the performance of the GC, and whether it's perfect, whether it can detect cycles and break them and everything. They've been making a lot of progress with it, and yes, they're going to improve it a lot in 1.5. I don't know if it's going to solve all the problems; it's a really hard problem. Even Java, which has been around for a while, hasn't solved it, so I think we have a ways to go. I don't know a lot of the details about what they're improving, but I think things like cycle counting and that sort of thing are better.

Question: "In your experience at Dropbox, are you finding any problems with the static compiled-ness?" Yeah. The only problem we've had with static compilation is that when we were upgrading from 10.04 to 12.04 machines, we ran into issues where a binary compiled on 12.04 would use libc versions that didn't run on 10.04, and when we tried to distribute it to 10.04 it wouldn't work anymore, because libc is actually still dynamically linked; it's not statically compiled in. So that's one gotcha: everything else is statically compiled in, but not libc, and you have to think about that. It also means that if you run on machines with vulnerable versions of libc, you should upgrade them. Other than that, though, we've had pretty good luck with having a build machine and tightly controlling what's on it, so that when we push a binary out we know exactly what version of everything is running. That's been pretty good for us.

"So within the Go community there's obviously this explosion of people writing their own versions of different things, and one thing new people find out very quickly is that there are a million and one different frameworks for writing REST APIs. Given that I'm assuming Dropbox uses REST APIs, what would you recommend as something reasonably good and worth trusting?" We don't actually use any. Well, all of our internal APIs are pretty much protobuf: internally we built an RPC layer on protobufs so that it's compatible with our Python, our C, our everything else, and it's efficient over the wire. At our scale we don't really deal with REST or JSON-based APIs internally. We also don't use the built-in Go RPC library, because we tied ours in with our metrics system, so we can tag RPCs with where they came from, where they're going, and how long they're taking, and put that data into an Elasticsearch database so we can do analysis on it. So in that sense we're not using the Go RPC libraries or any of the JSON or REST API frameworks. The one place we are using something like that, we're just serving it using net/http and, I think, the net/rpc/jsonrpc module.

Any other questions? In the back. "I see in your godropbox GitHub repo there are lots of '2' versions of things in there. How do you see that stuff being adopted into, say, the mainline? Is there a path there?" A lot of that stuff we're not really pushing the standard library to adopt, because some of it is kind of special-cased to our needs. But some of it, like our sort2 module: it has sortable slices for uint64, uint16, and so on, where the standard sort package only has a string slice, an int slice, and maybe one other. If you want to sort int32s, you have to define it yourself, so we just defined a bunch of those. That's one of those things where I think they don't want to take it upstream because of the argument about generics: maybe they'll add generics and have sort just work on that, instead of upstreaming 45 different typed slices just for sorting.
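For context, "you have to define it yourself" looks roughly like this with the standard, pre-generics sort package:

```go
package main

import (
	"fmt"
	"sort"
)

// Int32Slice is the boilerplate the standard sort package makes you write
// for each element type it doesn't cover out of the box.
type Int32Slice []int32

func (s Int32Slice) Len() int           { return len(s) }
func (s Int32Slice) Less(i, j int) bool { return s[i] < s[j] }
func (s Int32Slice) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

func main() {
	xs := Int32Slice{42, 7, 19}
	sort.Sort(xs)
	fmt.Println(xs) // [7 19 42]
}
```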
Some of the other ones: we're not really fixing bugs in the standard libraries; those get fixed pretty quickly. It's more that we're adding functionality that integrates better with our internal tooling, or things like the memcache client we have in there; they're not going to upstream a memcache client. So. Anything else? Are there any other questions? All right then, thank you so much, Mark Smith, and thank you all for coming. Thank you so much.