Hey, everyone. Can you guys hear me all right? Yeah. All right. So this is Gophers Riding Elephants: writing Postgres tools in Go. Who am I? I'm a senior engineer at Procore. I work on availability and performance, though lately I've been shifting mostly toward security, and I've been writing Go actively for about two years.

So who is this talk for? I want to say up front that I'm not going to talk much about writing applications in Go that interface with Postgres. This is mostly aimed at operations folks, DBAs, and database engineers, and the tooling around that work. Some of what I'll cover carries over to application development, but it's not the focus.

First, a quick overview of Go. Go is an open source programming language built around the idea of making it easy to build simple, reliable, and efficient software. Go was created by Google, and it was built for Google as well. That matters when looking at the language and the community: there's a big focus on reliability and on being good for teams, which gets distilled down to simplicity, and on being good for building the kinds of things Google needs to build, which you see heavily in the standard library with its excellent networking and cryptography packages.

Another really important piece is an article by Rob Pike, one of the three co-creators of Go, titled "Less is exponentially more," where he makes the point that Go is purposely lacking certain features. It's not that the designers don't understand those features exist; they made an explicit decision to leave them out, to keep the language small and simple, and so that there's a single way to do a thing. For example, there's a for
loop in Go, and that is the only way to loop. It's structured so you can express different kinds of loops with it, including infinite loops, but there's no `map`, no `while`, none of that other stuff. And of course, Go is statically typed, compiled, and garbage collected.

It looks a little something like this, which I hope you can read: a hello world, with "world" in Mandarin characters, since Unicode is supported by default.

So why Go with Postgres? I think there are three main reasons, in no particular order of importance. The first is performance. Go is pretty fast. This is a somewhat contrived example, but it's the best visual I could find: the y-axis is requests per second to a JSON API over HTTP. Go serves about 21,000 requests per second, with the next one down being Node at 14,000, and the third being Vapor, which is Swift on the backend. So it's pretty fast. The garbage collector is solid and fast, too. There's been an extensive amount of work over the last three or four releases (Go is on a six-month release cycle, so roughly the last two years) on making the garbage collector really, really fast. And concurrency is a breeze.
So in this example, we've got a function say that takes a string, loops five times, sleeps for a hundred milliseconds each time, and prints the string passed in. To launch something in the background in what's called a goroutine, you just put the keyword `go` in front of the function invocation. You can think of a goroutine as something like a thread, even though it's very much not a thread. Goroutines are always concurrent and potentially parallel, meaning the Go runtime will try to take advantage of all the system resources available on the machine you're running on. If you've got sixteen goroutines and four cores, and they're all purely CPU bound, then four goroutines will run simultaneously. And so you get that staggered output typical of concurrent programming.

The second reason, after performance, is reliability. Go is statically typed, and it's simple. This next example makes the point that in Go you can do a lot with few lines of code while retaining readability. We've got a struct, EventProcessor, very similar to a C struct, with an event queue, which is a channel of events; you can think of channels as first-in, first-out queues. There's an Add method on EventProcessor that takes an event and adds it to the queue, and a Start method, which is an infinite for loop that pops an event off the queue and runs its Process method in the background in a goroutine. This is missing the declaration of Event and its Process method, but other than that, this is really all you need to have this kind of functionality in your tool. Of course it's not a full-fledged background worker system or anything like that, but if you just want these small pieces in your tools, this is it, or almost it.

Testing is simple and
built into the standard library and the tools that ship with Go. You name files with the suffix `_test.go`, and when you run `go test`, it runs those files. It has some really cool features, like a built-in race detector: if you add the `-race` flag, without making any changes to your tests, it will run the race detector. Tests run significantly slower as a result, but that's the trade-off. It looks for things like concurrent writes, or a concurrent write and read, to the same place in memory, and it will tell you the specific line of code where that occurs. It's very helpful when doing concurrent programming.

Next: error handling instead of exceptions. Go does not have the concept of exceptions. Instead, everything that can fail returns an error as its last return value. A common thing you'll see is something like this: myFun takes no arguments and returns a string and an error, and the first thing it does is call a function run, which also returns a string and an error. You'll see this constantly in Go: if the error is not nil, handle the fact that you got an error, which in this case means returning an empty string and the error; otherwise nil was returned, there was no error, so continue on, in this case returning the string and nil. This allows for very explicit handling of errors at the points where they can occur. And at the most simplistic end of the spectrum you'll see something like this: a common convention in the Go community is that if a function shouldn't return an error, but should instead panic when an error occurs, its name is prefixed with Must, and inside it's just: if the error is not nil, panic.

So the third reason, after performance and reliability, is ease of use. First off, the tooling.
gofmt is a code-style formatter that ships with the Go programming language. Usually people bind it into their IDE or editor; for me, it runs on save. To use it from the command line, you run `go fmt` on your file, and it just rewrites and formats it. And even if you don't agree with the style choices the Go team went with, the consistency throughout the community is awesome. For example, they chose tabs over spaces. I prefer spaces in everything else, but one of the great things about this tool is that I don't even have tabs configured in my editor in any way; I just run gofmt and it replaces them for me. That consistency is awesome, because before doing a lot of Go I was doing a lot of Python, and the Python community is extremely fragmented in its code styles. There's the PEP 8 document that most people don't follow all the way, and anything written before it doesn't really follow it at all.
So you've got tons of different styling throughout that community, whereas 99% of the Go code you'll see is formatted exactly the same.

Testing we already talked about. godoc is a code-comment-to-HTML documentation generator, very easy and very nice. `go build` builds your binaries and `go run` runs them, and when you use `go run` it feels very much like an interpreted language. And there's tons of editor and IDE support.

Then there's familiarity. Depending on your opinion of these languages, this might not be a good thing; for me it was. Go has a lot of stylistic similarities to C and Python, with a little bit taken from JavaScript.

The library support is awesome. Tons of small, medium, and large organizations are creating Go tooling and libraries and open sourcing them, along with awesome individuals. And it's extremely easy to install them: you set up the GOPATH environment variable as part of your installation, and then you can just run `go get` with the GitHub URL and that installs it.

You can distribute a single binary anywhere. In this project, if I run `go build`, I get a dbduke binary, and since I'm running it on a MacBook Pro, it's a binary built for 64-bit OS X. If I set the GOOS environment variable to linux, I get an ELF 64-bit executable, and if I change GOARCH as well, I get an ELF 32-bit executable. And notice these are statically linked, so cross compilation is a breeze too. At Procore we run 64-bit Linux everywhere, so I run this command on any of my Go projects and I've got a binary I can just scp up to the machine and run, or package up and ship out.

So: performance, reliability, and ease of use. To take a step back, here at Procore
we're mainly a Ruby on Rails shop for our web application, so our operations and security teams use Ruby a lot, and we have tons of Ruby scripts and bash scripts for interacting with Postgres. A lot of those will stay around, and scripting languages are awesome; I don't want this to sound like I'm dissing them. But in a lot of the contexts I'll be talking about here, where we needed something a little more than that, Go was a great option: we were able to retain a lot of the ease of use of scripting languages while gaining a lot in performance and reliability. And if you're coming from C, C++, or Java, the classically compiled languages, there are plenty of reasons to be using those languages, and I don't want to argue that you should stop. The argument here is that there's a lot to gain in ease of use while retaining performance and reliability.

So let's get into interacting with Postgres from Go, with a little side trip first. In the standard library there's a package, database/sql (Go uses slashes for sub-packaging), which provides the core API for interacting with SQL databases. You implement a driver that actually talks to the database, and database/sql provides the API on top: opening and closing connections; Begin, Rollback, and Commit for transactions; Exec for executing a statement; Query for making a query; QueryRow for a query that will only return a single row; plus things like pinging the database, built-in connection pooling, and so on.

My favorite driver for Postgres is lib/pq. You just run `go get` on it to install it, and you use it a little something like this: at the top, you import it the exact same way you installed it, and you get your connection parameters.
The connection parameters can be a string or a provided config object, and you call sql.Open with the name of the driver and your connection parameters, and that returns a DB object. With that, you can call QueryRow or any of those other methods. In this example I've got a variable, result, that I want to put the result into. QueryRow returns a Row object, which I call Scan on, passing in that variable. So the SELECT 1 puts 1 into result, and Scan returns an error; if the error is nil, I continue on, and I've got my result.

That's about as much as I'm going to talk directly about the raw API integration, but if you're interested, there's a great website that goes into real detail about interacting with all the common SQL databases from Go. I recommend checking it out.

Now for some real-world examples. The first is pgnetdetective. This is a tool I wrote at Procore for analyzing Postgres network captures, specifically the kind you get from tcpdump, and it's open source for you to check out. So let me tell a little story. A little over a year ago, my boss dropped a graph into our Slack channel. It showed network traffic for our production database, with the y-axis in megabytes per second and the blue line being the bytes out of the database. You see this spike up to about 105 megabytes per second going out of the database. What is going on?
And so we start looking around, and we find this crazy rainbow graph of the aggregate bytes into our background worker servers. This seems to be where the traffic is going: from the production database into these background worker servers, in massive spikes. The y-axis is megabytes per second again, and it spikes up to about 70 megabytes per second going into these background worker servers. What is going on?

We look at the slow query log and don't really see anything. We turn on full query logging and run it through pgBadger to start looking around inside the database; still nothing. So, okay, let's take a network capture on one of these background worker servers. We open it up in Wireshark, start looking around, and see nothing indicative of these huge amounts of data. Okay, let's take a step back. What do we actually want to accomplish here? We need to figure out which queries coming into the database are resulting in massive amounts of data going out. And we sort of think, okay:
Let's take a capture on the production database and try to correlate queries coming in with data going out. So I built a little prototype in Scapy, a Python library for packet manipulation. It was super easy to build and it worked great, but it was really, really slow. I was testing with a capture of about 20 megabytes, and it was taking 20 or 30 minutes for that small capture, just because of the nature of what I was writing; there's a lot of looping. And on that production database, a capture produced about a gigabyte every 10 seconds, so a good sample size of about two minutes meant a 12-gigabyte file. We just needed something faster, so we decided to rewrite it in Go.

The first thing I needed to find was something resembling Scapy for the Go programming language, and there's a great library created by Google called gopacket, which provides packet processing capabilities for Go. What it let me do is open that capture file, lazily load and loop through it, and say: if the destination port is 5432, meaning the packet is going into the database server, and the TCP payload starts with a P or a Q, then it's either a parse packet (a P, meaning a prepared statement) or a query packet (a Q). Either way, that's a query coming in, so normalize the query, increment its counter, and grab some metadata about it. Else, if the source port is 5432 and it's an ACK,
it's a response going back out, so save the destination IP from the ACK and the size of the payload. What we can then start to do is correlate the queries coming in with the sizes of the packets going out. We ran this and got output that looked like this: a catalog query, a schema query trying to get the schema of this table, drawing_log_imports, and the total was about 170 megabytes, with roughly a 1-to-1,000 query-packet-to-response-packet ratio. This was very confusing for us. It was funny: when I first dropped the results into our chat, my boss said, "Nah, there's a bug, no way that's the actual result." So I went back and looked through it, and there were some small bugs, but I fixed those and re-ran it, and it was the same result. It was very confusing why that would be what was causing all this data to go out.

So to remind you: Procore is a Ruby on Rails shop, and if you're not familiar with Rails, the ORM it defaults to is called Active Record. Active Record is interesting in that it doesn't store the schema in code; instead it queries the database to get the schema of each table. So when a process starts up and interacts with a model, the model will query the database, get the schema, cache the schema, and use it. On Rails applications you will commonly see these catalog queries, but they're super fast and they happen infrequently, because they're cached; a process only needs them the first time it interacts with a model. But there's something called Resque, a background worker processing system for Rails written by GitHub, that forks the process every time it runs a job, and one of the problems with that is the fork loses that schema cache. So every time a job runs, every model it interacts with has to query the database. So when we start getting
ten-thousand-plus jobs per hour, lots of tables touched per job, an average of 20 catalog calls per table, that's a lot of data to fetch, and so it spikes. We were seeing it hit what at the time was a one-gigabit limit from the database. Tons and tons of data. For that and many other reasons, we're moving to Sidekiq, which is thread-based, so it's able to actually persist that schema cache.

So with pgnetdetective, where did Go win? Performance, absolutely. Out of the box, before I even started trying to make it faster, it was significantly faster than what I was doing in Python. The other piece is the community, which is a big part of ease of use. I needed to find a library for doing packet manipulation. Before I worked with Scapy, I had tried to find one in Ruby; and when we needed something faster, since we do a bunch of Rust at Procore, I looked for one in Rust, and I couldn't find a library other than straight-up libpcap bindings, which I really didn't want to work with. So gopacket was a huge win for Go. Again, this is open source; check it out and let me know if it doesn't work for you.

The second example is dbduke. dbduke is not open source and won't be; it's just way too specific to Procore. But as I talk about this stuff, it's not that it's super secret proprietary stuff.
It's just very, very specific to Procore, and I don't think there'd be any value in us open sourcing it. But if you're at all interested in seeing pieces of code from this tool or anything, let me know.

To set some context: we restore staging, QA, and testing databases frequently — frequently as in often, and a lot of them — and it's important that they restore successfully, because when a restore fails, productivity stops for QA and/or for engineers. And when they do fail, the process of kicking off these restores by hand is faulty. What that looked like was: someone on the ops team SSHes into the machine, creates a tmux session, lists out the cron jobs for the postgres user (because they're slightly different depending on the database), copies and pastes one, runs it, and then just detaches from the tmux session. It takes a long time, and there's a lot that can go wrong in that process. And when these restores failed, they would sometimes fail for very easily recoverable errors.

So what we decided is that we needed a tool for making restores of Postgres databases manageable and fault tolerant. It calls out to pg_restore for the actual restore process, but handles all the other stuff surrounding it.

To talk quickly about the manageability piece: the idea was to run dbduke as a daemon with jobs, rather than as a specific script that just runs, so that we can easily use something else to kick off a job, since it will run for a long time. This is not anything complicated or cutting edge by any means; the point is that doing it was extremely easy in Go. You have the core daemon process running, and when it needs to run a job, it just starts a goroutine that runs in the
background, and you can use channels to easily communicate back and forth between the goroutine and the main process.

For fault tolerance, the idea was to treat restores as a state machine and recover from failure states. For us, this was the simple state diagram we used: these databases that are constantly being restored are "in use," then "ready to restore," then "restoring," then "ready for use," and from there they go back to being "in use." The failure state is going from "restoring" to "failed to restore," and we want to bring it back to "restoring." Again, none of this is cutting edge by any means; rather, writing Go code the way Go wants you to write it allowed this to happen very easily, by just handling the errors as they come. That's the thing: as the program goes through these different states, you can easily see in Go what the different errors are that can occur along the flow, and just handle them properly. So I wanted to quickly talk about what that actually looks like in practice.
On the most simplistic side, there's erroring out. In dbduke, if it can't connect to the Postgres database, a log.Fatal will print the log message and exit, and that can alert the ops team.

Then there's logging a warning and moving on. Take `DROP SCHEMA IF EXISTS _migration`: that _migration schema is something used by Slony when we're doing major version upgrades. If we were in the middle of that process when the dump being restored was created, and the restore can't drop that schema, that's okay. Don't blow up and fail the whole restore just because it can't drop that schema; that's totally fine. Just log a warning so we know about it, and continue on.

Then, a little more complex: retrying with a timeout, with or without a backoff. I say "with a timeout" explicitly because one of my lessons learned writing dbduke and working with restoring databases is that you always want a timeout. I've had bugs where it tried to restore a database over and over, really fast, for seven hours straight, and just totally hosed the file system as well as the database. So that part is pretty necessary; with or without a backoff is up to you. In this context, it's a method that blocks until the database is no longer in the "in use" state. The code is very simple: check the state, and if the state is "in use," go into a for loop, sleeping for 15 seconds, quitting if the timeout is exceeded, getting the current state, and breaking out once we're no longer "in use." For both of these pieces, manageability and fault tolerance, none of what I'm doing here is particularly hard or anywhere close to cutting edge; rather, there wasn't anything crazy about doing it.
It was just writing Go code the way Go wants you to write it. That's the thing: once you understand how Go wants you to write it, you just start getting these pieces for free from how the language fundamentally works.

So where did Go win with dbduke, beyond that? Error handling was a huge piece for reliability: going through these state transitions, you can see directly, at every line of code, that if a potential error can occur there, it's captured and handled explicitly. And concurrency was a big piece for ease of use: wanting to run these background jobs, you just put the keyword `go` in front and it runs in the background.

In conclusion, going back to that comparison of scripting languages to Go: I feel there's a lot to be gained in reliability and performance while retaining the ease of use of scripting languages, and you start getting these things for free when it comes to reliability. And coming from the compiled languages, C, C++, Java, you get to retain a lot of that performance while gaining hugely in ease of use and reliability. Go is by no means an end-all-be-all language; being adamant that any of these things is "the best" is silly. These are all just tools in our toolbox. For me, in this context of building tools for interacting with Postgres, Go has just been a true delight, and I hope you give it a try and like it, and I hope this helps.

One last thing: if you like what I've been talking about, you want to build cool stuff with cool people, and you like these pictures, Procore is hiring. Thanks, guys.

Did you have a question? [Inaudible question] Yeah, totally. And maybe I should have shown more of them. Yeah, that's a good point, thank you.
Yeah, because what it is with that table specifically is that it's a very wide table, so doing that catalog query a lot can have some impact. But that alone wasn't nearly enough to bring that marker up; it was that the next entries down were all catalog queries as well. So yeah, that was more of a bad design on the part of the slide. Does that make sense? Yeah.

[Inaudible question] Sorry, the what? No, there's no implementation for that at the moment. Any other questions?

[Inaudible question] Gotcha. Okay, so specifically for unit testing: the unit tests for dbduke just use the standard library, plus a library I didn't talk about called Testify, which does some asserting; it basically just makes it a little nicer to do assertion testing within your unit tests. But dbduke is mostly integration tests, because of the nature of the process: there's a bunch of Go code, then it calls out to pg_restore, then there's a bunch more Go code, so really testing that whole flow requires integration tests that actually do it.

[Inaudible question] Yeah, we actually did use Postgres for that part. The mocking is in places just for the calls out to something called Consul, if you're familiar: a distributed key-value store. dbduke uses Consul a lot for managing state, so that throughout the system it can know the state of currently restoring databases, or whether they've finished. That part is mocked within the unit tests, but not the actual calling out to Postgres.

Any other questions? Awesome. Thanks, guys.