Welcome to Contain Your Enthusiasm: fast and simple Go plus Docker development. As a matter of housekeeping, this QR code will take you to these slides, so if you'd like to find the original version of them, please follow it. Go is an extremely pleasant language. It compiles fast, it makes testing very easy and also fast, and it makes debugging very easy, and there's a whole ecosystem of tools that plug into debugging Go. Then there's Docker, and Docker also makes development easy. It makes cross-OS work easy: I can literally start on my Mac laptop and use Docker to gain access to an Ubuntu 20.04 development environment, trivially, in a single command line. And Docker layer caching makes builds fast. I have a project I work on from time to time that takes 54 seconds to build without Docker layer caching; with it, that comes down to 0.5 seconds, very nearly a 100x speedup. So when we bring Go and Docker together, there are some goals we want to achieve. First, we want to take Go code and turn it into a Go executable in a Docker container. This is actually not that hard. But we'd also like fast rebuilds, and for reasons we'll go over shortly, that is hard when you bring Go and Docker together. We'd like to be able to docker-run tests. Say, for example (and this is an example from my own development life), you're working on a Mac on an executable that is really intended to run in a Linux environment. You obviously need to be able to test that executable, but you're not going to test a Linux executable on your Mac directly; you need to be able to test it in a Linux-based Docker container. You want to be able to debug all the things as they run in your Docker container. And you'd like to be able to hack on the dependencies of the particular executable you're working on.
One of the things about Go is that it's a beautifully modular language, and as you use it more and more, you discover that certain things are true. The first is that it really, really wants you to have one module per repo. The next is that it really wants you to have one executable per repo, and obviously one executable per Docker container. Now, you can do things that differ from this, but that's very much the natural way to work with Go. As you dig in, you quickly find occasions where there is some dependency you're working on. Say you have an SDK and an executable, and you'd like to hack on the SDK and the executable at the same time. Doing that in a world where everything runs in your Docker container, to get the cross-OS support you're looking for, is an interesting trick. The last thing is that we'd like to keep everything idiomatic. Both Docker and Go have very powerful idioms for how you work, and those idioms are a big part of what makes them such natural and pleasant environments. So whatever we do, we'd like to keep everything as simple as possible and stay as close as possible to the standard idioms, so we don't end up with out-of-control messes. But as you bring Go and Docker together, there are some rough spots. The first one is what I refer to as cache collision. Go does a couple of things involving caching to make your builds fast. The first: if I have a go.mod file listing the modules I depend on, as in this example, then when I do a build, Go will download the source code for all of those modules and put them in its module source cache, which lives at $GOPATH/pkg/mod.
It populates this source cache so it doesn't have to re-download those modules every time you rebuild; it only has to download them when you change versions. Not having to download every time radically speeds things up. And then you've got the binary cache. Here's an exercise I highly recommend, because it's hugely instructive; we're only going to show a little bit of it here. If you type go build -x, it will show you all the things going on behind the scenes of go build. It's fascinating. We're going to focus on one particular thing, which is that you end up seeing commands that look like this: compile commands. You'll see this compile command is for linux/amd64, there's a whole bunch of other flags that go with it, and it's compiling clientset.go. One of the things Go does with this is build a binary artifact cache. It takes a hash of the command line and of the file contents of clientset.go, and together those produce a hashed file name, which names the resulting binary artifact. What this means is that the next time we rebuild, if this source file has been built before with the same arguments, meaning we're still building it for Linux and we haven't moved anything around terribly much, then Go doesn't have to rebuild the binary file; it just pulls it from the cache. This can take a situation where you might have thousands of different things and dependencies to build down to just a linking problem. Now, then you go look at Docker. Docker also does caching, and caching is awesome. Take a very standard Dockerfile: FROM golang:1.15.3-alpine3.12, so you've got a container; you set a work directory; you copy your source code in (COPY . .); and you do a go build. Super straightforward. Docker will go through this in steps, building layers as it goes.
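A Dockerfile along the lines of the one just described looks like this (the output path is illustrative):

```dockerfile
# Step 1: the base image
FROM golang:1.15.3-alpine3.12
# Step 2: set the working directory
WORKDIR /build
# Step 3: copy all the source code in as a layer
COPY . .
# Step 4: run the build
RUN go build -o /app .
```

Each instruction produces a layer, which is what gives Docker its caching opportunities, and also its invalidation hazards.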
In the first step, it just fetches the original container you're based on. In the second step, it changes the work directory to /build. In the third step, it copies your source code in as an additional layer. And in the fourth step, it actually runs the build. The point is that at each layer, it's updating the contents, overlaying them on what came before. So if I do a Docker rebuild, some magical things happen. First, it's obviously the same base container as before, so no real work to do for step one. For step two, it can use the cache: resetting the work directory is not a big deal. For step three, if the source code I'm copying in is the same source code I copied in before, it can literally reuse the cached layer; it doesn't have to produce a new one. And because that layer is cached, when it goes to run go build, it can use the cached result of the last go build, because the source code hasn't changed. The result is a very, very fast rebuild, that 0.5-second versus 54-second case I showed you. So pure rebuilds are very fast. But what happens if you do a Docker rebuild with a source file change? Again, we start with exactly the same Dockerfile, but we've changed one source file since last time. We still use exactly the same step-one image, and we can still pull the work-directory setting from the cache. But when we go to copy in the source code, we discover that the layer we had last time (the 63dd layer on the slide) is no longer valid, because the source code is different. We've had a cache invalidation on this layer. And because of that invalidation, when we get to step four, running go build, we can't reuse the cached layer from last time either; that's also invalidated. Once a layer has been invalidated, everything that follows it in the cache is invalidated too. So we have to go redo this work.
Now let's look at what consequences this has for a go build. I'm going to skip over steps one and two, because they're frankly extremely boring, and go straight to copying the source. You copy the source in, and of course that updates the Go source in that layer. But when you get to the go build, a couple of things happen. First, it does the download to construct your source cache, so it downloads all the modules at this step, every time you have a source code change. And then, as part of building, it also builds the binary cache. So for a one-line source change, what does this really mean for us? It means your Go source comes in and invalidates the cache because it's different source code, which invalidates the Docker cache for the go build line, which means you can't use the module source cache. You've got to re-download everything, even though it may not have changed at all. And your binary cache also has to be completely rebuilt, even though it hasn't changed at all. So we have this mismatch between how Go caches to go fast and how Docker caches to go fast, and that mismatch causes all kinds of headaches. In particular (this is an example from a medium-large command project I'm working on), a one-line change at a time to docker build was 27 seconds with a Dockerfile like this. Sad panda: 27 seconds. It's amazing how quickly that starts to wear on your soul every time you have to rebuild, because you need to build for Linux and run on Linux but you're working on a Mac. It's just awful. So the question is: what can we do to make this better? There's a partial solution that's extremely popular, and I'm certain you've seen it before because it's used everywhere. If you haven't been using it, you really should be.
The idea is: look, downloading the dependency sources takes a long time. So what if, between the work-directory step and the copy-the-source-code step, we just copy the go.mod and go.sum files into our Docker container build? Just those two, just the things that tell us what our dependencies are. Then we run the handy command go mod download, which downloads those dependencies. What does this look like in terms of caching behavior? We still have the copy of go.mod and go.sum happening, just like before; these haven't changed, though, because we haven't changed our dependencies. We still have the run go mod download step, but this also hasn't changed, because our dependencies haven't changed. So the source cache gets reused from the step-four layer, and then we copy our source code in and do our binary build. Now, our source code getting copied in is different, so that invalidates the step-five layer, which means we have to redo step six. And redoing step six means we have to rebuild the entire Go binary cache, even though we may have only changed a single file. We may have just changed main.go, but we've got to rebuild all of our dependencies and everything in our project. This is kind of a drag. In the medium-large command project I've been playing with, this one-line change now has a real time of 11.85 seconds. Way better than the 27 seconds we were seeing before, but this is sort of an emotionless panda, because it still starts wearing on your soul. So here's a more complete solution. There's a handy little tool I've put together called imports-gen, and it's driven by go generate. It's very straightforward. The first line here is just a very handy bash line that goes to a temporary directory and does the go get install of imports-gen.
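The partial solution just described, copying only go.mod and go.sum ahead of the source so the download layer stays cached, is sketched here (output path illustrative):

```dockerfile
FROM golang:1.15.3-alpine3.12
WORKDIR /build
# Copy only the dependency manifests first...
COPY go.mod go.sum ./
# ...so this download layer stays cached until the dependencies change.
RUN go mod download
# An ordinary source change invalidates only the layers from here down.
COPY . .
RUN go build -o /app .
```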
This is a super handy way to install things and make sure you always have the imports-gen you want, and then it just runs imports-gen. You may wonder why this is a bash line, and the reason is the GOOS setting, GOOS=linux: as it turns out, in my particular case I'm building Linux containers, so I want imports-gen running for Linux even though I'm running it on my Mac. Now, when I type go generate, it generates a source code file and sticks it in internal/imports/imports_linux.go, because that's where my gen file was. And the imports file literally just takes every single package in my dependency tree (for my entire module, this includes everything I depend on, and everything that anybody I depend on depends on) and assembles an imports file. The little underscore basically says: hey, I'd like you to import this, which from the point of view of the binary cache means you need to build it, but I'm not really going to use it in this file. There are lots of handy reasons to do this, often involving manipulating the init function, but for us, what we're really doing is ensuring that all this stuff gets built even though we're not actually using any of it in this particular file. So what does this end up doing? You'll want to make a very small change to your Dockerfile for the full solution: you still copy in go.mod and go.sum in step three, but then for step four, rather than doing go mod download, you copy in your internal/imports directory (which copies in that imports_linux.go file) and then you run go build ./internal/imports, which builds just that one directory. So what does this actually look like in terms of what happens to our layer caching?
We've still copied in go.mod, we've gotten down to that step, and everything is still in the same place. Now, when we copy internal/imports in, it just copies in the internal/imports directory. This only changes when you rerun go generate, and even if you ran go generate every time, it only actually changes when the packages you're using change, and you can choose when to run it yourself. So that layer can be brought in from cache. The next thing is to run go build ./internal/imports. The interesting thing about this is, number one, because every package you use has its module downloaded, your module source cache is going to get updated. Now, I have a sneaking suspicion, based on the behavior I've seen, that Go is actually incredibly smart about this, meaning it can work not just at the level of module dependency but at the level of package dependency, because the number of modules that get downloaded through this pattern is substantially smaller, and the download substantially faster, than what I get with go mod download. So not only do you get your module source cache, it's actually more efficiently constructed than with the partial solution. And then, because we're running go build ./internal/imports, it will construct the binary cache for every package in your dependency tree. So every package that is not immediately part of your own source code already has a binary artifact in the cache in your container. What that means is that when you get down to copying in your one-line change to your source code, that one-line change is the only thing that invalidates the cache.
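Putting the full solution together, the Dockerfile fragment looks something like this (output path illustrative):

```dockerfile
FROM golang:1.15.3-alpine3.12
WORKDIR /build
COPY go.mod go.sum ./
# Instead of go mod download, copy in the generated blank-imports
# package and build just that directory; this populates both the
# module source cache and the binary cache in cached layers.
COPY internal/imports internal/imports
RUN go build ./internal/imports
# Only the layers below get invalidated by an ordinary source change.
COPY . .
RUN go build -o /app .
```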
So even if you make a one-line change, you don't have to download your dependencies anymore, and you don't have to rebuild all your dependencies; you just have to rebuild your immediate artifacts and then do your linking step. Your go build is a much, much faster step as a result. To give you some idea of how much faster, with the same medium-large project and a one-line change, with the source and binary caches being preserved by Docker layer caching this way, we're down to a real time of 3.22 seconds, which is our happy panda over here. So if you look at this, we go from a sad panda at 27 seconds, then we get about a three-fold speedup down to a medium-scale panda at 11.85 seconds, and finally we get to a happy panda with another three-fold speedup to 3.22 seconds. There's a real difference in actually, fully solving the impedance mismatch between Docker caching and Go caching. Just to review, it's very simple. You add an internal/imports/gen.go file looking roughly like this, you run go generate, and then you add these two lines to your Dockerfile after the point where you copy in go.mod and go.sum; and you don't have to do go mod download anymore, because go build ./internal/imports takes care of that for you. So this is how we achieve cache harmonization. Next up, you'll recall we were talking about being able to test. I don't know how everybody else works; I've never been a huge proponent of the classic test-driven development model. But what I can tell you happens continuously in my life is: I write code, I test code. I write code, I test code. It's a constant back-and-forth, an infinite loop for me. So you want to be able to do this, and do it quickly. And again, it's a very small change to your Dockerfile. First, we switch to using multi-stage builds in Docker, and to do that, we name our first stage "build".
Then below we say FROM build AS test, and we add a command to run our tests. So whatever command we're building, we build it in the build stage, and then we run our tests in the test stage. And those tests run not as part of the build, but literally when we go and run this target as our Docker container. So you can literally just say docker run. I highly recommend --rm, because it means you don't accumulate lots and lots of containers you don't really care about. And then there's this fun little line. It basically says: the ID of the container to run is whatever the output of docker build -q is. The -q option, instead of showing you all the layers as Docker goes, just outputs the final container image ID. And --target test says: rather than building all the way to the end (we'll get back to that in a second), only build up to the test target, which is this target right here. Now, the reason the target matters is that you obviously don't want the final output of your container build to be a thing that runs your tests. So at the end I construct a final stage where I just copy the resulting binary from my build stage into a runtime stage and give it an entry point. With this Dockerfile, you can say docker run --rm $(docker build -q --target test .) and that will run your testing, or you can just omit --target test and it will build and run your actual application, in this example a Hello World application. So that makes testing super, super easy. Now, the next thing I usually run into is debugging. I will confess I am a debugging addict. I probably use the debugger a bit too much, but it makes it super easy to tell what's going on in your code and to sort out and figure out a large class of problems.
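A minimal multi-stage sketch of what was just described, with a named build stage, a test stage, and a lean runtime stage (image versions and paths illustrative):

```dockerfile
FROM golang:1.15.3-alpine3.12 AS build
WORKDIR /build
COPY . .
RUN go build -o /app .

# Test stage: the tests run at docker-run time, not at build time.
FROM build AS test
CMD ["go", "test", "./..."]

# Final stage: a lean runtime image containing just the binary.
FROM alpine:3.12
COPY --from=build /app /app
ENTRYPOINT ["/app"]
```

Running the tests is then: docker run --rm "$(docker build -q --target test .)". Omitting --target test builds and runs the application itself.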
And it eliminates the general tendency to over-log your brains out, which is going to screw your life up in production. Bringing Go in a Docker container together with debugging is something lots of people have written about, but nobody's really talked about how to make it super easy. It always ends up being this bizarrely contrived thing that typically involves radically violating Go idioms, and it just feels uncomfortable. So here's a really simple way to do it. First, you add to the build stage of your Dockerfile a really simple installation of Delve. We add this layer early, because that way you always win at the game of layer caching: once you've built it once, you never have to fetch and build Delve again. For those who don't know, Delve is the Go debugger. So, a very simple thing to add there. The second little trick is that in whatever your executable's main.go is, you can use this little package, debug. It's the world's simplest tool. When you call debug.Self, it looks for an environment variable that is very, very bespoke to this particular executable; so if you have many executables running, they use different environment variables. If that environment variable is found, debug.Self simply execs Delve with the appropriate arguments to rerun exactly this binary, and you fall into a debugging mode listening on whatever port you specified. If it isn't found, you hit something that outputs a hint. That hint, generally speaking, you don't want to suppress, because among other things, when you do your docker run of the tests, it will print: setting env variable DLV_LISTEN_HELLO_WORLD to a valid dlv listen value will cause the dlv debugger to execute this binary and listen as directed. So it tells you exactly which environment variable to use; this one presumes a command named hello-world.
If our command was named forder, it would be DLV_LISTEN_FORDER. There are some things you can tweak to get even more control over this if you'd like, but out of the box it's pretty simple. So if you come back and run docker run with the environment variable DLV_LISTEN_HELLO_WORLD set to :50000, and make sure you pass the port through, the result is that your command executable runs in your container listening on port 50000, which is accessible from your host (my Mac laptop, in my case), and you can just connect your debugger and go. It makes it super, super easy to debug these things without a very complicated setup, because everything you need is already there, and debug.Self is utterly harmless if you don't actually want debugging to take place. The last thing is adding test debugging. You've got your Go test files, and sometimes you want to attach your debugger to the tests; what we just showed was how to attach your debugger to your running executable. In this case, you can just say: look, we'll add one more stage called debug, and we'll derive it from test, so it's still got everything test has in case we add more things, and we'll just have it run dlv test instead of go test, and it will fire up listening on port 40000 in this example. This way, if we want to debug the tests, we can run docker run, pass through port 40000, and use --target debug, and my actual test code will listen on port 40000. And if we really want to go crazy and debug both the tests and the executable, we can pass the environment variable for the executable and forward its port, as well as forwarding the port for the test code, and use --target debug. So we slowly build up to the point where we can debug absolutely everything in a very natural way.
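The debug stage, plus the early Delve installation in the build stage, might be sketched like this (versions, paths, and the port are illustrative):

```dockerfile
FROM golang:1.15.3-alpine3.12 AS build
# Install Delve early so this layer is cached essentially forever.
RUN go get github.com/go-delve/delve/cmd/dlv
WORKDIR /build
COPY . .
RUN go build -o /app .

FROM build AS test
CMD ["go", "test", "./..."]

# Debug stage: run the tests under Delve instead of plain go test.
FROM test AS debug
CMD ["dlv", "test", "--listen=:40000", "--headless", "--api-version=2"]
```

Debugging the tests is then: docker run --rm -p 40000:40000 "$(docker build -q --target debug .)", after which you attach your debugger to localhost:40000.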
And if you go back and look at the Dockerfile, at the end of it all, we still come back to building a really lean runtime stage which has just our executable. None of this additional machinery screws up in any way the actual executable you end up building at the end of the day. Which brings us to the last set of things you're going to want to do at some point when you bring Go and Docker together. Go is really good with dependencies. Go modules are super nice; you can specify exactly which modules you depend on. And inevitably, when you do this, you find yourself in some sort of weird situation where you want to go hack on one of them. In my case, I've got a bunch of commands that use a common SDK, so I often want to go hack on the SDK. In this example we don't have many dependencies, because I didn't want to be too confusing, but you can easily see where this would be a thing you'd want to do. Now, let's talk first about what doesn't work. The thing I often end up doing when I'm not working with Docker is to keep all my Git repos in one directory and use the replace directive to point one of my dependencies at a ../ path, a directory that contains a checked-out version of the repo for that particular dependency. This obviously doesn't work, and you'll see why when you get to Docker: saying COPY ../logrus is not going to work, because Docker literally will not let you break out of the confines of the directory in which it's run. You can go play all kinds of crazy games with moving your Dockerfiles to different places and so forth, but that's really not going to solve your problem.
But here's what does work, and it works really nicely, because it never involves editing your Dockerfile after you adopt it. We keep a directory called .local, and we literally just drop a .gitignore into it that says nothing except a README should ever be checked into this directory; through the wonders of being able to put .gitignore files in subdirectories, that works nicely. Then, any time I have a dependency I want to hack on, I just check its repo out into the .local directory. Now I can keep in my Dockerfile a directive to COPY .local .local inside my working directory. This is something you'll want to do after your go.mod and go.sum, but before anything that actually builds anything, because if my go.mod and go.sum have a replace directive and the .local directory is not present, my go build ./internal/imports is going to fail horrifically. This one-line change is completely harmless if I don't actually have any dependencies in .local, because, again, who cares? But if I do work on dependencies, it means I don't have to change the Dockerfile and then, oh gee, remember to change it back; nor do I have to build up my CI to protect me against somebody who changed the Dockerfile so they could do this. You just have one standard place where you put things, one standard way you deal with it, and you're done. Anything you're doing locally is purely local. So we've now gone through and met our goals. We've got Go code going to a Go executable in a Docker container, and we've got fast rebuilds. Docker-run tests are very easy and straightforward. Debugging, both of the tests themselves and of the executable, is very easy and straightforward.
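Assuming an SDK dependency is being hacked on, the whole .local convention fits in a few lines; the module path example.com/sdk is a placeholder:

```
# .local/.gitignore: nothing but this file and a README is ever committed
*
!.gitignore
!README.md

# go.mod: point the dependency at the locally checked-out copy
replace example.com/sdk => ./.local/sdk

# Dockerfile: after go.mod/go.sum are copied, before anything builds
COPY .local .local
```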
We've got a clear way to hack on dependencies that is both vaguely Go-idiomatic and Docker-idiomatic; the idioms are pretty much preserved across the two environments, we aren't doing anything super bizarre, and we don't have to edit things unnaturally at weird times. So thank you very much for attending the talk. Again, this is the QR code for these slides; please feel free to go look at them. They're in Google Slides, so feel free to borrow from them, and thank you so much for attending.