 Okay so this talk is going to be held in English, right? Yes. And that means you probably, nobody probably has a problem with that. Unless you do then please ask the German English-speaking person next to you to translate. They are now your translator. Yeah, welcome to the Rube Goldberg database, Postgres and all the nice features and things you can do with it. If like me you used Beamer again, okay? I touched nothing. Well it looks like the Beamer has died on us again because the light is out. You have the picture back there. Last time we inserted some Q&A but at the beginning of the talk this is going to be tough. Okay, so the technical is informed. So you have the feed back there? You have it on your screen? Okay. So you can hold the talk for the stream but not for the people actually here. But everybody has phones, you can just see the stream. Yeah, let's see whether this works. Yeah, let's give it five more minutes. Yeah, let's resume the stream in like five minutes or so and hopefully the Beamer will be back up then. And in the meantime we're gonna have like room party here and the stream is gonna have some party music. We're gonna miss out. Yeah. Yeah, Leute, die fragmentiert bitte ein bisschen. Da sind noch freie Sessel und da kommen noch Leute. Und einer ist da vorne noch frei. Danke, danke, ihr seid großartig. Ich hab sie auch verschreien müssen. Ich hab vorhin erzählt von Puten Beamer und bam, kurze Update, es ist noch nichts passiert. Jemand ist in dem Raum da oben. Update, we're good to go. Okay. So, stream wieder da. Yes, also welcome zum Vortrag postgressen, was man damit cooles machen kann, wenn man ein bisschen tiefer eintrifft. Und damit meine ich, this talk is gonna be held in English. So I should also talk English. Yeah, Postgres and the nice things you can do with it. If you started like most people with MariaDB or MySQL, and then you'd start procedures and function with it and then got to know Postgres and switch to Postgres, but didn't have time anymore to go in depth there, then this talk is for you. So, yeah, you have a lot of things to do with Postgres. So you have some experience. Thank you. Okay, so welcome. So I realize, if you have to explain your title, it's probably not a particularly good title, but I've noticed a lot of people don't know what a Rup-Golberg machine is. So a Rup-Golberg machine is some kind of contraption to perform a rather simple task in an indirect and overly complicated way, usually involving some chain reactions. This is one of the original sketches by Rup-Golberg himself. And by the end of this talk, you'll hopefully have some idea on how you can build some pretty Rup-Golberg-esque things using Postgres. Okay. To start, let's just look at some built-in functions of Postgres, so we kind of get a feel of what we're dealing with here. So you've probably seen them before. They have a name and you supply arguments to them in parentheses, and one of them, like, let's pretend we have some table called users, and you choose it as an ID, a username, and we can then, for example, use the function lower to convert a string to lower case, which is kind of obvious. Yeah. And there's one thing I'll be using in the next few slides, which might not be obvious. If you're not actually getting any data from any table or something, you can just leave out the from clause. So if you just have some expression you want to evaluate something like 2 plus 2, but also if you just call a function, you can do it like this. You can just select and write the expression. So just so the next example to make sense. So we've already seen lower. Lower takes an argument of the type text in a lot of other languages that would be called a string, and it also returns a text. And yeah, that's what you expect it to do. There's, for example, also abs for absolute. You give it some numeric value, and it gives you the absolute of that value. So if you give it a negative number, you get the corresponding positive number. And the way I've written these down on the slides kind of suggests something. It suggests these functions have static types, and they do. So the types of the arguments and of the return like you have to define them when you define a function. That kind of raises one question. What happens if we use the wrong type, right? So what happens, for example, if we try to get the lowercase version of the number five? So okay, we get an error. That makes sense, I guess. And there's a lot of text here that's kind of saying, okay, the function lower for integer does not exist. And there's a hint where it says like, oh, you might do typecast or something. There's one thing this isn't saying. It isn't saying like that it's inherently illegal or something to try to lowercase a number. It's just saying it couldn't really know what to do with this. And this suggests one thing which is functions can be overloaded. So you can have multiple different definitions of a function for different types of arguments or also for a different number of arguments. And depending on what you then provide, Postgres will call the appropriate implementation. So let's look at a built-in function that has some overloads. There's the function round. You give it some numeric value and it returns another numeric value. So for example, if you run this number, point five will be rounded up and you get 1312. And there's a different overload of this function which actually takes two arguments. It takes one first argument value and s, I don't actually know why s, but that's kind of the precision to which you want to round. So if we take this already kind of crude approximation of pi and round it to two digits of precision, this is what we get. And depending on whether we call it with one argument or two arguments, it will call the correct implementation or we'll complain that it doesn't have an implementation that matches the types. Okay, let's look at one third last function which is coalesce. It's kind of weird one. You can use it to kind of provide default values. So let's say we have some table called product. Each product has a name. It might have a description. It might also have a long description. It might have neither. And you can use coalesce kind of to default back to other values. So what coalesce will do, it will return the first argument that is not null. So if description isn't null, it will return description. If description is null, long description isn't null, it will return long description. If both of those are null, it will return the string, no description. And yeah, one thing we can see with this function is, and I've kind of written it down that way, functions can be variadic. So you can have functions which take a variable number of arguments. One last thing to note, there's something that kind of looks similar but works in a different way. That's aggregate functions like sum or average. They don't operate on a single value, but kind of on a set of values. So for example, if you use group by, then calculate the sum of some value for each group. They are implemented in a slightly different way. You can also implement your own aggregate functions, but we won't be talking about that. Okay, building your own functions, how are we going to do this? There's a bunch of different languages available for this. There's for example, PL, PGSQL, that's the Postgres procedural language. You can also use Python or Perl for example, and there's a bunch more languages you could use. Some of them built-in, some of them available as like third-party packages. But yeah, for this talk, we'll stick to PL, PGSQL, because I think it's easy enough to understand, and it is available by default, so you don't have to enable anything, install anything. It's just right there. PL, PGSQL is a superset of SQL, so you can just have SQL queries in there in your program. You can just have a select or an insert in your program. You don't need to have any special handling of that. But in addition to the standard Postgres SQL, it also has things like statements or loops, stuff like that. Okay, so let's actually look at an example of why you might want to have a function, your own functions. So the example we'll be looking at is, let's say we have some tables which contain recipes or something like that. They don't really have recipes, but it's just an example. So let's say we have the table ice cream recipes. Each ice cream recipe has a name of the type text and a column ingredients, which has the type text array. So it has an array of text values. So for the first line, the name is the string chocolate, and the ingredients are, like, don't get confused by the curly braces. That's kind of the Postgres way to write down arrays. It's the array of the string milk, and then the string sugar, and then the string chocolate. Okay, let's say we don't particularly like Hazelnut. So if we want to get all of the recipes that don't have Hazelnut in them, we can do something like this. So kind of the interesting thing here is the third line. And what we're doing here, it's kind of a weird syntax, but that's basically the Postgres way to check whether an array contains a certain value. You do, like, you do the value equals any and then give it the array. So basically, another way to say it is, does the array ingredients contain the string or, like, the element Hazelnuts or any element equal to Hazelnuts. Yeah, and this probably does what we wanted to. We get the one recipe back that doesn't have Hazelnut in it. Okay, let's say we have a second table. And let's pretend like there's a slight difference here. Let's pretend this is for legacy reasons. We can't change it. It doesn't really matter. Which is in this case, ingredients isn't actually an array of text, but it's just one long text. So for the first recipe here, it has the name cheesecake, and then it has ingredients, which is just one long text with the, like, flour comma space, sugar comma space, and so on. Which, yeah, probably not a particularly good way to store ingredients, but we stuck with this. So if we want to filter out anything with Hazelnut here, we need a different query. So one way to do it, and again, not a great way, but certainly one way is to use, like, a fuzzy string comparison by using the like operator. And these percent signs are wildcards. What would be, like, an asterisk in regular expressions, for example. So what this will do is it will match any row where the string ingredients does not contain the substring Hazelnut. You kind of think about where that might go wrong. It doesn't go wrong in this example, so that's fine. Okay, but, okay, what could we do with this? Instead of doing these two different queries, we could actually write a function. The idea is we can create a function. We'll call it contains. And as its first argument, we'll give it some container, and as the second argument, some element. And we wanted to tell us whether this container contains this element and return a boolean value. And we want to support different kinds of containers, different types of containers with function overloads. Okay, let's look at how we would do that for the text case for the, where we're doing this with string comparison earlier. There's a lot going on here, so let's go through it part by part. The first two lines are the signature of the function, where we give the name of the function, and the arguments, and the types of the argument. So we're saying one day function called contains with first argument called container of the type text, and the second argument called element also of the type text. And this function will return a boolean. Then there's these weird dollars. These dollars seem to throw people off, but they don't actually do anything interesting. They just delimit the part that is actually written in the language we're using to define the function. That's just basically string delimiter here. And this is the actual code of our function. We don't need to worry about the begin and end things. That's kind of the structure of Postgres, PL, PDSQL functions. And the last bit is just we need to tell which language we've written this in, because there's a bit of different languages available. And the code bit here, these vertical lines are string concatenation. So what we're doing here is we will do a comparison of container, like, and if we give it the argument, hazelnut again, for example, it will just create a string concatenate a string percent hazelnut percent again, as we saw earlier. Okay. So this works, trust me on that. Let's now do the same for an array of texts. Now there's not that many things that changed, actually. The types of the arguments are a bit different. Now we have a container that is not just a single text, but an array of text. And we're doing a different kind of comparison. But it's basically the same comparison as we saw earlier for the query. Yeah. And the rest is the same. Okay. And with this, we can write a different query where we use this function. So we're doing, again, kind of doing the same thing as earlier, we're getting all the recipes, which do not contain hazelnut. And this is the query for cake recipes. But the nice thing is this same kind of query also works for ice cream recipes because we just need to swap out the name of the table and contains the right implementation of the contains function will be used, whether the container is a text or an array of text. And of course, this can be also reused in other queries or extended for more container types, stuff like that. Okay. So what are our takeaways from this? These can help make our queries more generic, which can actually, for example, be helpful if we dealing with a query builder or something that needs to generate queries that work for specific tables. In the generic case, we need to actually know which type the ingredients have something like this. If we do it like this, it just needs to know that the contains function is going to work for this table. But it doesn't actually need to know the specifics, which can make this a bit easy in some cases. And of course, with functions in general, they let you write complex logic once and use it in a bunch of different places. As I mentioned earlier, you can also have SQL in there. So there's some cases where this can help you reduce round trips. So for example, if you have some select query and then some later query where you're using the result of the select, if you do that in your client code, you have to wait for the select to go over the network to get the result back, then do some processing in your application, then emit the new query. Whereas if you do that in a Postgres function, it can all just happen on the database and you don't have that network latency. And yeah, as I mentioned, they're much more powerful than we saw earlier. You can do inserts, updates, leads. You can also do schema definition language stuff, like creating tables, deleting tables, stuff like that, which kind of leads us to the downsides. They may do kind of unexpected stuff. So usually, you might have functions which do data modifications when you don't really expect it. And of course, to see whether the function do something like that, some knowledge of PLPDSQL is needed. Yeah. And I guess kind of the main points I think that kind of helped combat this is if you are using custom functions in your database, you should write some pretty good documentations for it to say what your function actually does and how it's intended to be used. And if your function does some data modification, delete something from a table, it should probably be obvious from the name. Okay. Let's move on to triggers. Now triggers are a way to automatically call a function upon some operation in the database, something like an insert or an update or delete. And with these trigger functions, you can, for example, intercept queries and reject them if some you decide they shouldn't go through there invalid. Or you can also do stuff like modify the values that an update will actually need to. And yeah, one interesting thing for triggers for updates, you'll have access to the values before and after the update. So what would have been in the table before you run the update query and what would be in it after if it goes through. And let's again look at an example. Here we'll look at some like doing a change lock. So the idea is to write all the updates we do to some table to a second change lock table. So our main table will call it article. Each article has an ID and a body of the type text. And there's a second table article change lock where we will be using the same IDs and at a timestamp and an old body and the new body. So for each update we do to article we want a record in here with the body before the update and after the update. So this is the function we're going to use. There's again quite a lot of similarities to what we saw earlier in the basic structure of it. We have a function with no arguments in this case and the kind of weird special magic return type of trigger. So Postgres knows what to do with it. And then again he's kind of the main code. And kind of the only thing we're doing here is do an insert into the article change lock table, into the columns ID, time, old body and new body. And as I mentioned you have access to the values before and after the update. And that happens through the old and the new namespaces, I guess. And what we'll insert into the articles change lock is we'll just use the new ID, we'll just say we aren't going to change any IDs. So that would be bad. We use the current timestamp, that's a variable that will always contain the current time of the transaction. And the old dot body, so the value of body before the update and new dot body, the value of the body after the update. And then we'll say return new. It doesn't really matter in this case, but yeah. So this is our function. Now we want it to be automatically called for updates. So how do we do that? We say create trigger. We give the trigger a name, we'll call it log update. We specify when it or which conditions it's supposed to be run. So we say after update. And this could also be something like before update to intercept update queries or something like after delete might also be a thing you can do. Or actually one kind of weird option is to do instead of. So that will lead to the usual update logic not actually applying at all. And you're kind of left to do whatever you want with it in your trigger function. We specify the table on which we want this trigger to live. And in this case, we want this trigger to run for each row. So if we do an update that affects a lot of rows, our trigger function will be called once for each of the rows. Yeah. And we say it should execute the function log article update. Okay. So let's say this is the article table before our update contains just one post with the text hello dpn. Then we do an update and we change the body to goodbye. And this is what the two tables will look like this after we did this. So as we expect, the articles table now contains like the article 42 now contains the body goodbye. But we also have a record in the article change lock table, which contains the timestamp, the before and after values of the body. Okay. So what are our takeaways from this? We can move some pretty complex functionality into the database. Of course you could also do this kind of behavior in your client application by like checking the values before and then like writing both to the actual table and then to the change lock table. But of course, like this, we do all of that in the database and you can use this to ensure some pretty complex consistency criteria as well. You're not limited to what you can do with constraints, but you can basically do arbitrary checks on the queries and decide whether this is like something you want to consider valid or not. And what I mean by ensure a consistency across clients is you can't forget to call it trigger, right? If we have the change lock example and you have different applications writing to the same database, you might end up in a case where there's some edge case where something doesn't actually get written to the change lock. With triggers, you can't really run into that case. And the last bit is a bit of lie. You don't get anything for free. But what I mean by this is you get the same guarantees as you would in a transaction. So in another word, the trigger functions run in the context of the statement that caused them to be run and these are atomic. So if an error occurs in your trigger function, the whole statement will be rolled back. So you can't end up in a case where you've changed the article to tables, but it hasn't ended up in the update because something went wrong. Okay, but again, there's some downsides. It's kind of spooky. Like you usually don't expect updates in one table to affect other tables. And that might make things a bit harder to debug and test. But to be fair, in my experience, having code that heavily interacts with databases, testing that is already kind of tricky. So this doesn't make it much worse, I think. But of course, this can also negatively impact your performance. So because these are run in the context of the statements, if you do something that just takes a long time, some calculation that is complex or something that just does a lot of changes, that adds latency to your initial query. So especially compute expensive things probably shouldn't be happening in the database trigger. So yeah, let's finish with some closing thoughts. With great power comes great Rube Goldberg potential. So you can build some pretty weird and wonderful things with this. But you can also build some really confusing things with this, I think. But yeah, I'm excited to see what people come up with because I feel like this is a bit of an underused, underappreciated feature. But kind of what I think you should do before considering this is look at some other prosperous features that exist. There's a few cases where you can use views or materialized views to actually get kind of similar results, not the example we looked at, but other things where you might consider using triggers. And before you define your own function for something, you should probably check the built-in functions that exist and whether you can make it happen with those. There's a few really good ones in there. But yeah, I think this, using triggers and functions can be super powerful and I think it's a really elegant solution for some kinds of problems where you'd have pretty complicated code in the client that does some pretty weird transactional stuff where you really have to watch out that you don't have any weird things at cases where you miss something, which are made a lot easier with prosperous triggers and functions, I think. Okay, that's it. Have time for one, maybe even two questions. If someone has a short question, raise your hand and yeah, I will run and get you the hopefully now charged microphone. Nope, you'll get mine. Do these trigger support recursion or is it possible to run into issues with that? I think the question was, do the trigger support recursion and do you run into issues with that? Well, in general, functions in Postgres do support recursions and of course, yeah, you can run into problems with that. I don't think there's any tail recursion optimization or something like that in there. Although you usually wouldn't call a trigger function directly, but you'd usually have it called through a trigger. But yeah, you can run into problems with that. Yeah, so is there one more question, maybe? Yeah, okay. Is it Turing complete? Yeah, yeah, yeah, yeah, Postgres functions totally Turing complete. All right, nice. One more. I was expecting one more thing. In my world, the other recipes but the ice cream, I would have expected something like string split by commas and then something contains functionality available. That would be the much cleaner way to do it. Yeah, you're right. So there is actually built in function to split a string with a sort of like a longer certain character. And that would be the better way to do it. Yeah, this was just to have the two different queries and just kind of directly translate them into easy to understand functions. But yeah, this isn't how you should build your recipe database. All right, that's it. Thank you very, very much. Yeah, thanks. Thank you all for being here.