My name is Eric Radman. I do a mix of system administration and application development. I've been using Postgres since, I don't know, 2002, 2003, somewhere in there. And I have a specific interest in the processes we use to develop software, which is what we're going to talk about today: test-driven development. In Richard Hipp's 2014 keynote in Ottawa, he contrasted SQLite with PostgreSQL. PG is client-server, SQLite is serverless, and so on. In the conclusion of his talk, he made a compact recommendation that describes the perspective you need in order to use Postgres effectively. He said: think of PostgreSQL as a programming language. This is an astute observation. When used as a programming environment, Postgres becomes part of your application runtime, and as such, applications written to take advantage of this runtime require an approach to unit testing that verifies this functionality at the right level. Unit testing is a software engineering discipline put forth by Kent Beck, which states that difficulty in testing is a symptom of poor design. Kent's articulation of this principle can be summed up like this: design that is conducive to testing also depends on structural components that are at once easy to comprehend, easy to use in different contexts, and easy to modify. Now, for most programmers that I know, the constraints of unit testing are perplexing. But I've found that non-programmers sometimes get this if I explain it this way: the best way to write software that can be adapted and maintained is to write software that is testable. One way to make testable software is to write the test first. Now, this sounds like a formula, but you, the programmer, are still making decisions. One of the significant challenges in this process is to find what Robert Martin identifies as architecturally significant boundaries. Finding architecturally significant boundaries, to a large degree, is what this talk is about.
So with that in mind, let's look at patterns for the design of a system that leverages a database. Here is a typical set of rules that you might find proposed as a guideline for unit testing. Unit tests cover a unit of functionality. To use the metaphor of human language, the goal is not to identify sentence fragments, but whole paragraphs that represent a coherent operation. And finding the right level to zoom into is about finding a point of reference that takes into account the logic of the problem you're trying to solve. Now, in order to communicate the unique role that unit tests play versus system tests or regression tests, people have devised rules for describing how this should work, what you can and cannot do, and this is a typical set of rules. So let's try to state these axioms positively. I'm going to go in reverse order. Number five: it's a self-hosting test harness. You run git clone, make test; that's all you have to do, apart from maybe installing the necessary prerequisite packages. Number four implies dynamic allocation of port numbers and other global shared resources. Number three: the tests should run independently of the contents of the file system, and you do this by mocking calls such as open, read, and seek. Number two: the tests run locally, hence they are indifferent to the state of the network. They work just as well on your laptop on an airplane as they do when it's docked at your office. So what is the motivation behind number one? Why shouldn't a unit test talk to the database? Well, a database has typically, or at least historically, meant a number of things that are pretty hostile to unit testing. It's a global shared resource. It's persistent storage. It's high latency. It's inaccessible without special licensing and the one person who is authorized to install it or knows how. And it's only available on the particular platforms that the vendor supports. These characteristics are all generally true of a commercial database.
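Rule four, dynamic allocation of global shared resources, can be made concrete with a tiny sketch. This is not from the talk; it's a minimal Python illustration of how a test harness can ask the operating system for an unused TCP port instead of hard-coding one:

```python
import socket

def free_port() -> int:
    """Ask the kernel for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))   # port 0 means "pick any free port"
        return s.getsockname()[1]
```

There is a small race here (another process could grab the port before your server binds it), which is why real harnesses typically retry on bind failure.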
And parenthetically, I would add that, as such, commercial databases are hostile to agile practices because they do not facilitate fast, repeatable tests. None of these problems is very difficult to overcome with open source software. So what do we do? High latency? Use a startup procedure that's optimized for latency. Dynamically allocate global resources and generate configuration at runtime. And because Postgres is designed to scale up and not down, it does put significant demands on the test harness. We need to initialize state in a file system. We have to establish client-server access. You don't typically find that in a unit test. And the multiprocess architecture of Postgres holds shared memory. Now, web developers in particular have tried to solve this problem with a different approach, and that is treating SQLite as a drop-in replacement for Postgres. This is tempting. SQLite is fast, it doesn't require anything on disk, and it might be a correct representation of what's going on in production. So the ability to use SQLite as a fake instance of Postgres is useful. What is the trade-off? You're giving up on the functionality that Postgres gives you. So will SQLite do? Well, yes. And the temptation to use SQLite is, let's call it, good, because it means that you're going to write tests that use the database. By the way, those alarming benchmarks that you see are just from running the Django test harness using SQLite and then Postgres. I have no idea why SQLite is so much faster there; I didn't bother to investigate it. But let's take it for granted that SQLite performs pretty well, and note that depending on your stack, drivers, and libraries, other databases may vary. Okay. So how many of you know what the PL in PL/pgSQL stands for? Probably some number of you here. What do you think? Programming language, right? No, it's not programming language. It's procedural language.
And that is in contrast with standard SQL, which is what? A declarative language. That's a very unusual thing. There are a few declarative languages in our world, not that many; regular expressions, for example. SQL is unusual in how expansive it is as a declarative language. So if SQL is a language, how do you write tests for an application that interacts with the database? I'll give you an opinion that's kind of common, from James Sadler in 2005. Here's what you do: you encapsulate all your database access behind an implementation of an interface called MyDatabaseServer, or whatever. The mock will have no logic; it will simply return the data that's required. Now, my interpretation of this advice is this: reduce SQL to the smallest set of operators that you can emulate, the smallest set of logical constructs that you can mock, and then test each code fragment using those. So, could you test an application that used embedded JavaScript this way? Maybe. But aside from the very painful process of emulating bits of the JavaScript VM, you'd be creating poor tests for one very important reason: you are limiting your use of JavaScript in your application to the means of combination you are able, or willing, to mimic in your tests. Now, remember the thesis: difficulty in testing is a symptom of poor design. If you are having trouble testing, then you're probably doing damage to your production code as well. So what's the alternative? "Mock high" is a term I learned from Ryan Davis, and it means you only use mocks to detach from unreliable resources. If you do this, you have adopted what I would call co-extensive unit testing: a unit test is commensurate with its individual functional units. Hence, the five rules that I stated before could be redrawn like this. It covers all data manipulation logic. It runs on a host with or without networking. It is agnostic to file system layout.
It can be run without requiring locks on specific resources, and it is portable and deterministic. The subtext of what I am calling co-extensive unit testing is that you do not normally mock for speed. In other words, if an operation seems to be slow, you don't mock it just because it seemed to be slow. What do you do? You figure out why that operation is slow. So what does this look like? I'll read you a little bit of testimony from Kurt Sampson, also from 2005. He said: sure, one approach is not to do these tests. Another is to figure out how you can ensure the availability of the resource. A few years ago, it even required moving an entire project from Oracle to PostgreSQL so that everybody could have their own database server with as many schemas as they wanted. Not only did the database stop being a critical resource that was blocking testing, but in fact, this brought us into the agile world, making database changes almost as easily as other code changes. The only tweak I would like to make on that is "almost as easily": why not as easily? So if you use your own local copy of Postgres, you can write tests that are more comprehensive and composable. Use the full power of the database and shift logic back and forth between the application and the database as it makes sense. To this end, I started a project called ephemeralpg.org, and the notion here is that writing unit tests that leverage a Postgres database is not hard, and that the barrier to writing these tests is that it's not immediately apparent how to construct the test harness. You want a new database, and you want it right now; how do you get it? So I'll switch to the terminal and show you the basic invocation. It's just pg_tmp, and I'll add -t to say: give me back a TCP port number. There, I have a new database. You just learned everything you need to know about that interface.
pg_tmp has a limited shared memory footprint, and it automatically garbage collects your database after 60 seconds, unless you specify the -w option to set the timeout yourself, and it waits until the last client has disconnected before it tears the database down. So it spins up a background process that will periodically check to see whether the connections have gone away. Now, anyone could work out a pattern for using initdb and pg_ctl to start a database, but a naive approach does not result in very responsive startup times. pg_tmp is an off-the-shelf script that tries to get this as right as possible. Errors are plumbed through, so that if it doesn't start you will hopefully see the reason why. But hopefully it does start all the time, and it can usually do it in about a tenth of a second, because the first time you fire it off, it initializes another database as a background task, which it will then recycle and use on your next invocation. It starts on a socket by default; you can allocate a TCP port, as we saw, using the -t option. So for this next part, I'm going to switch back to my terminal, and we're going to look at what it's like to do some development using this tool. We're going to buzz through a little application that I put together called TagWatch. The purpose of this little application is to be a little HTTP server that you can register as a webhook on GitHub or Bitbucket. The idea is that when someone pushes a new tag to the repository, the webhook will post to this little server, and we'll be able to collect information in the database and do things with it. So let me open up the essential parts. All right, we have five files there, and I'm going to start up a little process to monitor the files as we go. Let me give you an overview of what we're looking at here. The makefile is really the extent of our test harness, isn't it?
Right, we're saying URL equals `pg_tmp -t`. What do we do next? We have our new database, so we load up a schema, and then we set an environment variable and run our tests. What does the application look like? This is a little Sinatra application. It reads the DATABASE_URL environment variable, and if you didn't supply that, it blows up. It matches only one pattern, which is slash: it expects something is going to post to it, and it jams the payload of that post, request.body, straight into a field called payload. That's it; that's the whole application. Let me skip over to the last file so you can see what the schema looks like. We have a commit table with several fields: a created timestamp; the payload, which is just going to be the raw post that was given to us; the provider, which is usually Bitbucket or GitHub in this case; the repository, usually formatted as username slash repository; and then the tag, which will usually be something like the project name, a hyphen, and some version number. So that's the schema. All right, what does the test runner look like? Well, this is a fairly typical minitest example in Ruby. What does it do first? It truncates the commit table in between each test so that we have a clean slate. And we can see here that we have a little bit of data: repo is null. This is a bit of false data that Bitbucket and GitHub would never send, and we're going to post it with the content type application/json and then make a few assertions. We're going to assert that the payload JSON field does in fact have something with a key called repo, and that the provider, repository, and tag will all be none, because of course the payload is bogus. And then lastly, we have the created time. We'll come back around to that, because if we run that now, it's going to fail; it's going to use the current time.
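The harness just described can be written down as a small makefile fragment. This is my sketch, not the talk's actual file; the file names schema.sql and test_tagwatch.rb are hypothetical stand-ins:

```make
# Hypothetical sketch of the test harness described above.
# `pg_tmp -t` prints a connection URL for a fresh, temporary database.
test:
	URL=`pg_tmp -t`; \
	psql $$URL -f schema.sql && \
	DATABASE_URL=$$URL ruby test_tagwatch.rb
```

The whole trick is that the database URL is minted fresh on every run, so nothing is shared between invocations.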
So if we want to write a test that is going to accept an actual payload from Bitbucket, let's say, what is that going to look like? Well, it's largely going to look like a copy and paste of the bogus payload, except for one important detail, which is that we're going to read in a JSON file that is an actual post from Bitbucket. So this is a real example. And before I move on, let's go look at what that looks like. This is a fairly typical message from a web service these days: deeply nested, no particular order to it. You can see lots of things. Now, I was interested in three things: I want the name of the tag, I want the repository, and I want what the service is. We're going to have to dig through here to find that. If we look at GitHub, it's a little better. We don't have the insane nesting, but it's just this huge pile of links with an order that's not particularly perceivable to me. So to make sense of this, I assembled a little utility that mirrors the JSON path selectors that Postgres has. If you were to try to pick off some key path in Postgres, this utility takes the whole JSON file and flattens it out for you in a line-oriented way, so that you can grep for things. So let's look at what the Bitbucket example looks like. All right, do you see what it's doing? You see something like repository, scm, and then in bright white, hg. That's the key path: I can copy and paste these key paths into Postgres to select things out of this JSON document. And because I'm flattening this dictionary and creating these key index paths, we can grep for things. I can grep for a tag and see what the paths are going to be: what path would I have to use if I were to select this value in Postgres? Okay, so here's our test for Bitbucket. It's going to read the Bitbucket JSON file. Let me go ahead and save that. And what we should see down here is a failure.
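The flattening utility itself isn't shown in the talk, but the idea can be sketched in a few lines of Python. The output format here, comma-joined key paths mirroring the `'{a,b,0}'` path arrays that Postgres's `#>` operator takes, is my assumption about what such a tool emits:

```python
import json

def flatten(value, path=""):
    """Recursively yield (key_path, leaf_value) pairs for a JSON
    structure, so the whole document can be grepped line by line."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path},{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{path},{index}" if path else str(index))
    else:
        yield path, value

# Example: the nested path repository -> scm flattens to "repository,scm"
doc = json.loads('{"repository": {"scm": "hg", "links": [{"href": "x"}]}}')
for key_path, leaf in flatten(doc):
    print(key_path, leaf)
```

Each printed line is a complete key path, so `grep tag` immediately shows every place a tag appears and the exact path needed to select it in SQL.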
So we expected the value "bitbucket", and we didn't get a value, all right? There are two places we can solve this problem. One is in the application: I could put logic in the main file to parse this out. The other place we can put it is straight in the database. And there are a couple of reasons you might want to do that, probably more than I'm even thinking of. The most obvious, I guess, is that if you have multiple applications stuffing data into this database, it stays consistent. And this is a Postgres conference, so we're going to do it in the database. What are we going to do? We're going to create a trigger, an insert hook, that's going to try to process this string. What was the test that failed? It was: expected "bitbucket", okay. So I'll give you a shortcut. There's nothing actually in the JSON files that Bitbucket or GitHub send that tells you who they are. It just happens that there are links scattered all throughout that contain the string https:// followed by bitbucket or github. Sorry, this is real-world data. I'm just going to use a regular expression and pop that off. And then we'll go back to our test and see what it's complaining about. All right, so we had another failure, which means what? It means the previous one worked. So the next one is the repository, and I expected to see my username followed by the name of my project. So let me go grep for "repository", okay? So that's Bitbucket. Let me see what GitHub did. A lot of text on the screen there. All right, we actually do have something that looks nice, which is repository, full_name. This is going to be one of those rare cases where the data was exactly in the form that I wanted. So what are we going to do in this commit trigger? We're simply going to pop off that key, repository full_name. And another test passed. All right, so what is it asking for now?
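The provider-detection trick can be illustrated outside the database. In the demo this lives in the insert trigger using Postgres's substring function; here is the same regular-expression idea sketched in Python, where the exact pattern is my guess at what was typed in the demo:

```python
import re

def detect_provider(payload: str):
    """Infer the hosting service from the URLs scattered through the
    webhook body; returns None when nothing matches."""
    match = re.search(r"https://(bitbucket|github)", payload)
    return match.group(1) if match else None
```

In SQL the equivalent would be something along the lines of substring(payload from 'https://(bitbucket|github)'), which returns NULL on no match, exactly the behavior the bogus-payload test asserts.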
The tag, all right. So we're going to do the same process over again, and I'm going to grep for, let's see, "tag". And what do we have? We have three different paths, and none of them is very good. It's like: wow, key, key, some index in a list, some other random key. I'm just going to pick off one of those, and again, I'm going to have to use a regular expression to get at the actual piece of data I want. But that's no problem; Postgres has very good substring functions, and that's what we'll use here. So we'll go ahead and pop off that crazy path, which I literally copied and pasted from that little tool, and then find the pattern for matching the actual tag. And hooray, I have a service that works with Bitbucket now. Let's move on to GitHub. This shouldn't look very different. And again, we're ignoring the time. The GitHub test is really an almost exact copy of the Bitbucket one, except, of course, that we are reading in the GitHub file instead of the Bitbucket one. So, okay, I just enabled this. I'm going to go ahead and run it and see what fails. And by the way, in case you haven't noticed the URLs: it's spinning up a new database every time. Fresh state every time. So if part of what my application needed to do was to compile a plugin, a .so, and load that along with Postgres at startup, that's no problem; it's a fresh state on everything. You can test not only triggers and functions, but any shared library that you could create for Postgres. All right, so it missed the tag name. Oh, that's great, isn't it? Because that means the provider matched: somehow it knew that it was GitHub, because our previous regular expression matched somewhere else. It got the repository name right; it just doesn't have the tag name right. All right, so let's go grep for that. Let's see: github, tag. Okay, another regular expression somewhere. I guess I'm going to pop it off of ref, because that looks like the easiest place to get it from.
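Extracting the tag from GitHub's ref field is a one-line pattern match. In the demo this lives in the trigger as a Postgres substring call; here is the equivalent logic sketched in Python, relying on the standard `refs/tags/...` shape of a GitHub tag ref:

```python
import re

def tag_from_ref(ref: str):
    """GitHub sends tags as refs like "refs/tags/project-1.0";
    strip the prefix to recover the bare tag name."""
    match = re.match(r"refs/tags/(.+)", ref)
    return match.group(1) if match else None
```

A branch push (refs/heads/...) yields no match, so the function returns None rather than a bogus tag.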
Now, this is a little bit of a conflict, isn't it? Because Bitbucket had a different rule. So I'm just going to take whichever one happens to match. For this, we'll use coalesce and merge the rule for Bitbucket and the rule for GitHub together. This one should match GitHub; this one should match Bitbucket. I guess I don't want that semicolon, do I? All right, I'm going to go ahead and save that and see if it passes. Coalesce types, boolean mismatch... oh yes, okay. All right, very nice. Okay, so there's one more piece of this, isn't there? Which is: how do you deal with the date-time function? Because I want to assert that everything in my database works. I'm asking the database to do work here, and I want to make sure that it works. One of the things we did is say that the timestamp defaults to now(). Well, Postgres has a feature that is available in very few programming environments that I'm aware of, at least not this way, and that is that you can set a search path to define a new schema, a new namespace, for functions. So if you need to mock out now(), you simply create a new schema, which I'm just calling test, and then define a function that will shadow the other one. And so here is a stub: it returns a fixed value. That's all we have to do; there's nothing else. Now, whenever we call now() in SQL, we're going to get our fake function back. Okay, well, there's one more thing I have to do, which is that in the test harness, my little makefile, we need to shove in that fake. Okay, so did that work? Let's go find all these assert-equals. It happened to be that January 1st was the date I put in. Let's just uncomment all these assertions for these three tests and see if they pass. Indeed they do. now() is now shadowed in the database, and we can see that because we can use it. And by the way, you can change that any time you want; it was just set search_path. Yeah. How does that get communicated to the tests?
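The now() shadowing described above can be written down as a small SQL fixture. This is a sketch under my own naming (a schema called test, an arbitrary fixed timestamp), not the demo's exact file; the key detail is that the test schema must be listed ahead of pg_catalog in the search path for the shadow to win:

```sql
-- Sketch: shadow the built-in now() with a deterministic stub.
CREATE SCHEMA test;

CREATE FUNCTION test.now() RETURNS timestamptz AS $$
  SELECT '2017-01-01 00:00:00+00'::timestamptz  -- fixed value for assertions
$$ LANGUAGE sql;

-- pg_catalog is searched first unless it is listed explicitly,
-- so put the test schema in front of it.
SET search_path = test, public, pg_catalog;
```

Because a column default such as DEFAULT now() is resolved when the table is created, the stub has to be in place before the schema is loaded, which matches the load order described in the talk.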
Just because I loaded it first. Well, there were two things: I loaded it first, but I also altered my user, right? By default, pg_tmp creates a database owned by your current user, so it's implicitly logging in as you, and the search path is applied when you log in. And because it's my user who loaded the schema, the schema also follows that. That is how that works. So let's go back and talk about some of what we just saw. If the unit tests use a real database, and not some fake, imaginary construction that kind of looks like a database if you squint, then you almost certainly have a nearly complete set of initialization fixtures that will work for a production database. The schema that we're using for our unit tests is probably almost exactly what I would use in production. The only thing you're going to need, if you want to extend that schema or that test harness to run system tests, is some function that grabs the current state of a system and augments that schema. This is one of those rare cases where unit testing actually results in something that looks almost exactly like what a system test would be. So, I used truncate. There is another method: you could do begin and rollback, but that doesn't quite work for anything very complex in between each test. And that is because begin transaction may be executed multiple times, but it doesn't nest; if you type commit or abort, it's going to roll back or commit everything. I'm kind of open to ideas here. Also, savepoints are kind of the opposite of what you want: you want an outer envelope that says roll back all of the transactions that were running before this. Maybe a savepoint is almost the right thing, but not quite, and I don't want to restart a server, right?
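For reference, the per-test reset used here amounts to a single statement. The table name follows the demo schema; RESTART IDENTITY is my addition, for the case where you also want sequences reset between tests:

```sql
-- Reset table state between tests; simpler and more predictable than
-- trying to wrap the application's own transactions in an outer rollback.
TRUNCATE "commit" RESTART IDENTITY;
```

Unlike begin/rollback, this works even when the code under test opens and commits its own transactions.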
And as we saw, use the search path to mask functions in the system catalog, or anywhere else that you want to mock in your tests. So maybe we should do just one more thing, which is to go back and see how far this particular set of tests gets us if we want to do manual testing. One of the things I think is surprising to people about unit testing is that it actually gives you 90% of what you need to do manual testing; it gives you a great vantage point. So let's see if this little web thing we just made even really works. I'm going to spin up a new database, load the schema off of disk, and start up the application server, remembering the PID so I can shut it down. So I have a new database here, and I'm going to go ahead and look at the connections, and I can see tagwatch.rb is connected. So the little application is connected to the database. And I'm going to run curl and see if I can literally just post this JSON file the same way that GitHub would, and see if that works. And it does; it says okay, sweet. So let me go ahead and query the database. It worked, it sure did. The trigger fired on the insert, and it has normalized some of our values. All right, let's shut that application server down. So we're writing unit tests, but they actually do a great job of giving us a head start on a really convenient way to run other sorts of tests. Okay, so what about SQLite? Do we need SQLite to build responsive tests? We do not. Now, it is difficult to draw a fair performance comparison between different applications that do different things, but I collected some benchmarks from two in-house applications that I maintain, which I think will provide some useful insight. And I have to apologize:
I really wanted two open source projects that were almost the same but written using different methodologies, and I wasn't able to find that. I hope these are useful, though. The numbers are from two applications I maintain. Both of them are written in Python, both use psycopg2 as the database adapter, and the numbers include Postgres startup time. And what I have found is that if you drop your object-relational mapper and just use SQL, you easily pick up a 5x speedup over using SQLite, even if you count the startup time of Postgres. How fast is SQL? It's fast enough; it's very good. It's possible to become overly obsessed with the speed of unit tests, but there is a steep price to pay for tests that run longer than 30 seconds. After two or three minutes, they are only useful for catching regressions, and regression tests are valuable, but if they take minutes to run they are not useful for the kind of test-driven design we're interested in here. In my experience, and I've heard this from others as well, there is a direct correlation between the time it takes to run tests and the time it takes to deliver a new feature. The reason I think this is true is that if your tests are slow, you tune out, you wander off, you check your email, and you start multitasking. So here's the good news: you don't have to choose between rapid feedback and testing against a real database. Load schemas off of disk and build your application using this language, a language called SQL. It's going to perform very well. So, there are two ways of describing the role of a database administrator or a relational data architect. The first is by the things we do. The tasks or roles listed here are not comprehensive, and they are not the job description of anyone in particular. But I hope they are representative of everyone in one way: this is a supportive role.
Now, the headquarters of the company I work for is located on 45th Street in Manhattan, and if I arrive there by way of Port Authority, I usually walk right through Times Square, where marketing consultants have continuously launched a barrage of marketing capsules intended to hit every person on the street. One might think that nothing in Times Square could be offensive, since it wasn't designed for anyone in particular, and yet I cannot help but feel misunderstood in that space. Why? Well, you see, all the demographic data in the world cannot see beyond the tasks of each day to the things we are about. Running automated tests on an ephemeral database reduces impediments to testing and gives developers the ability to explore solutions enabled by the database. So we are giving other developers access to things that will help them; we're facilitating their work. Now, Rich Hickey has aptly described a database as something that provides leverage over data. Key-value stores, file systems, and other facilities with an API are very useful, but it is the programmatic capabilities of a database that make it pivotal to an application ecosystem. By making it easy for your team to employ the full set of features available in Postgres, you are enabling your team to simplify an application stack. In this session we started with the proposition that difficulty in testing is a symptom of poor design. This axiom is a statement about how software engineering works. A meaningful test strategy provides feedback on what? On architecturally significant boundaries. I will close with an illustration from another kind of creative endeavor. In 1998 I picked up my first point-and-shoot camera, a very clumsy, utterly forgettable device by Kodak. In 1999 I bought the newly released Nikon Coolpix 950, and I figured that photography requires at least some mastery of lighting. So I practiced with a TTL flash (that's through-the-lens metering), studio umbrellas, and reflectors.
I carried this little camera everywhere over the next six years, but made very little progress in capturing memorable images. Very little progress, that is, until I picked up a Canon Elan 7 with a cheap 50 millimeter lens. Within weeks I was taking photos that I am pleased to present to this day. Does this short story of mine make sense? Can changing gear compensate for mediocre skill? No, of course not. But in large part, the intelligence we apply to an endeavor is in the selection of the processes that in turn form us. We create tools, and these tools in turn provide the vantage point for the systems we design, and ultimately they shape our own imagination. Thanks for listening. I think we have five minutes for questions. You can run it as many times in a row as you want. It creates a temporary folder in /tmp, and what happens is that the first process that gets to a pre-initialized database folder removes a file, so that the others can see that someone got there first. So yes, you can run them in a tight loop and in parallel. Yes? Restore point, yeah. I said savepoint won't work because it's the opposite of what we need: a savepoint works within a transaction. The point there was that after each test is run, I want to reset the state of my tables again. Okay, I actually hope I'm wrong, because that sounds wonderful. The problem with what you just described is: what if you're testing code that is also starting transactions and aborting them? Then you've just destroyed your entire savepoint. So yeah, I've done this kind of testing without truncate, but that assumes the application I'm testing isn't doing fancy transaction handling. Yes? Yeah, you're right. That thought never crossed my mind; I don't know how you would do that. Oh, it does? Yeah, I use Django, and I didn't know how that actually happened, certainly. Go ahead.
Yeah, that's a good point. This is a 110-line shell script, so anything that you need to do, you can modify in there. And it writes out a config file with basically the minimal footprint that I could reasonably give it, also turning off some features you really don't care about, like fsync and full page writes; I'm going to throw the thing away anyway. Mm-hmm, right. Yes. Another thing this utility does is that it doesn't wait for the database to shut down, for example; various things like that. Yeah, I think this is a fairly common task, which is why I thought it would be worthwhile to see if we can just script this once, because this is actually nothing like setting up a regular database; the parameters are all different. Yes? Yes, go to eradman.com, and I have an article that links to that little script. I don't remember what the article is called. The temp file, so let me see if I can rephrase your question: you're saying that I have this temporary database? Yes. Oh, those are just in my repository, right? Yeah. Yes, yes. Yep. Mm-hmm. I don't know if that's really, I mean, the concern of pg_tmp is just to spin up the database. The test runner that I create around it can do what it wants in order to populate that database or choose where it fetches data from. If I'm missing your question, come back after and I'll see if I can take a better swipe at it. Are we doing okay on time? One minute? Anything else? All right, thank you very much.