Welcome to today's PostgreSQL World Webinar, Aggregates in PGX: An Adventure. We're joined by James Blackwood-Sewell, Senior Developer Advocate at Timescale, who will discuss PostgreSQL's functionality for creating custom aggregates, why creating and maintaining a performant aggregate function in C can be a high barrier for many would-be contributors, and how PGX can help with creating, testing, and deploying aggregates using Rust. My name's Lindsay. I'm one of the PostgreSQL Conference organizers, and I'll be the moderator for your webinar. A little bit about your speaker: as I said, James is currently a Senior Developer Advocate at Timescale, based in Sydney. Before that, he meandered through the PostgreSQL, Kubernetes, and Prometheus landscapes as the chief architect at GiroTech. Welcome, James. And with that, I'm going to hand it off. Take it away, James.

Thank you. Yeah, so as Lindsay said, this is called Aggregates in PGX, and it's loosely structured as a bit of an adventure, which we'll talk about a bit later. My name's James Blackwood-Sewell. I'm a Developer Advocate for Timescale. I live in Sydney in Australia, although at the moment, as you might notice from the light in the background, I'm actually in France. I love to talk about Postgres and Rust and Kubernetes and lots of other things, like observability and OpenTelemetry. I work at Timescale. I've actually been there for maybe three and a half months, so not long at all, and being a developer advocate is quite a new thing for me. As Lindsay said, I used to be the chief architect of a company, so I'm kind of learning as I go over the past few months. You can find me on Twitter as James Sewell, on the TimescaleDB Community Slack and in the TimescaleDB Forum, and on LinkedIn as James Blackwood-Sewell. So jumping straight into it, what is PGX?
So PGX is a relatively new framework that lets developers extend Postgres. You would use PGX where you would normally extend Postgres using C; PGX lets you do that using the Rust language instead of the C language. So if you want to write a Postgres extension, a new Postgres aggregate, or a Postgres background worker, you can do that in PGX using Rust. I suppose the million-dollar question is, why would you do that? C already exists. And that's because PGX makes things safe, it makes things fast, and it makes things easy. What PGX actually does is expose the C API via safe Rust code, so it removes many of the opportunities you'd have for crashing or breaking your database. And that's quite an important thing when we're talking about interfacing with Postgres via C, because there are many chances to crash or break your database: you could do that by calling functions in the wrong way, or by managing your memory incorrectly, which could cause a segfault. That kind of stuff can be destructive for your database; you can bring it down. PGX is also fast. Because it's compiled Rust code, it runs really close to the speed of C, and many, many times faster than code in a PL language like PL/pgSQL or PL/Python or PL/Java. I say close to the speed of C, which we'll look at later: obviously, because Rust is providing some extra safety guarantees, there's a little bit of overhead, but it's a very small amount, and I think a pretty good trade-off in this case. And PGX is also easy. It helps out with the development process: it can auto-create SQL objects, it can help you run your code, test your code, and package your code, and it can even help you get started with some template options.
And that's not something, as far as I'm aware, that really exists in the Postgres C world, where different developers who code for Postgres in C have their own ways of doing things, which is great once you know what you're doing. But for people who don't know what they're doing, PGX is a great way to get into writing low-level code for Postgres in a safe, fast, and easy manner. So why Rust? Or, I suppose, what is Rust, for those who don't know it? It's a modern, general-purpose, compiled programming language. People used to say it was a systems programming language, but in reality it's just a general-purpose language that can be used effectively for systems because it's compiled. It's blazingly fast and memory efficient: it doesn't have any runtime or garbage collector, you can run critical services in it, you can write it on embedded devices, and you can easily integrate with other languages, one of which is C, through FFI. It's got a rich type system, so it's strongly typed, and it's got an ownership model that guarantees memory safety and thread safety. The memory safety is probably the more interesting part for Postgres: it basically enables you to eliminate many classes of bugs at compile time. So you compile your code, the Rust compiler comes along, it looks at your code, and it works out whether your code is safe, whether you're writing code that isn't going to cause any problems, and then it will compile it. If you were to do something like try to use an object after you've dropped it, then Rust won't compile that code. It's got really great documentation and a really great compiler with friendly error messages: when you compile and it comes back, it'll actually tell you what to do and direct you to some documentation. And it's got really great tooling.
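To make that compile-time ownership checking concrete, here's a tiny standalone Rust example (mine, not from the talk) of the kind of bug the compiler refuses to build:

```rust
fn main() {
    let s = String::from("hello");
    let t = s; // ownership of the String moves from `s` to `t`

    // Uncommenting the next line makes the program fail to compile:
    // the compiler reports "borrow of moved value: `s`", because `s`
    // no longer owns the data. This is the use-after-move/use-after-drop
    // class of bug being eliminated before the program ever runs.
    // println!("{}", s);

    println!("{}", t);
}
```

No runtime check, no garbage collector: the program that would misuse `s` simply never compiles.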
The number one thing it's got is a package manager called Cargo, which, I mean, I'm not a professional developer, but Cargo is the best package manager for any language that I've encountered. And it's got all the usual things like multi-editor support, auto-completion, type inspection, auto-format, all that kind of stuff you'd expect from a modern language, a lot of which C has as well, of course. Why do I personally use Rust? I use Rust because I think it makes me a better programmer. Well, first of all, I come from an architecture background, and before that I came from a DBA background, so in a room full of developers or programmers I wouldn't call myself a programmer. I definitely write lots of code. I've got a lot of background in writing shell scripts; I then moved on to Python and wrote a lot of Python code, some of which still runs in production today. But I probably wouldn't describe myself as a professional developer. The reason that I moved to Rust and kept using Rust is that it made me look back at my previous code and think, oh my gosh, what was I doing there? It's made me think about the way I write my programs in a much more critical manner. As an example, in Python I might not have been catching all of the errors which could have happened; in Rust, Rust will come up and say, you're not catching this error, you need to do something with it. And I find it fits my mental model really well. As someone who hasn't had years and years of working with C, it makes it possible for me to switch into a domain where rigorous memory management is needed. I know lots of people who work with Postgres are completely comfortable managing their own memory with C. I am not one of those people, so that's why I use Rust and PGX. Basically, it allows me to be confident that I'm doing the right thing while removing as much of the mental load of having to do it as possible.
So what do we mean when we say safe? We mean a bunch of things. The main ones are: we're translating Rust panics into Postgres errors, so if something goes wrong in Rust and something panics, we're going to abort the Postgres transaction, but we're not going to terminate the process. We're going to maintain Rust's drop semantics, even in the face of a panic or an error. So if we've got an object or a variable in Rust, when it gets dropped, Rust is going to do the things it's meant to do, even if there's been a panic in Rust or an error in Postgres. And when code is called from Rust which isn't safe (you can actually call the C API from Rust directly if you want to), you need to do it with an unsafe block. You need to specifically put a block around it that says, hey, this code's unsafe, I know what I'm doing, and then you can run it. The inverse of that is there's a lot of code where the API has been wrapped; you can run that without an unsafe block, and we can guarantee that code is going to be safe as you run it. So basically, if you do it wrong, it won't compile. So where did PGX come from? There's a person and a product called ZomboDB. ZomboDB is an extension for Postgres, which is very good, which enables you to use Elasticsearch as an index from within Postgres. It was written by a guy called Eric, who goes by the name of ZomboDB online. He wrote it in C, and it was great, and really performant as far as I understand, but he decided that he basically didn't want to maintain or write C anymore and wanted to try going in a new direction. So he started using Rust. There was actually a sort of project that came before this one, and he was in that community at the same time. And then he decided he wanted to rewrite ZomboDB using Rust code.
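As a minimal illustration of that unsafe-block mechanism (plain standalone Rust, nothing pgx-specific): dereferencing a raw pointer, which is what much of a raw C API boils down to, only compiles inside an `unsafe` block.

```rust
fn main() {
    let x: i32 = 42;
    let p: *const i32 = &x; // creating a raw pointer is safe...

    // ...but dereferencing it is not: the compiler can't prove `p` is
    // still valid, so this line only compiles inside an `unsafe` block,
    // where the programmer explicitly takes responsibility. This is the
    // "I know what I'm doing" escape hatch described above.
    let y = unsafe { *p };

    println!("{}", y);
}
```

Safe wrappers, like the ones PGX provides over the Postgres API, do this dance internally so the caller never has to write `unsafe` at all.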
He created PGX because he decided he wanted his own new framework for that rewrite, so he created this thing called PGX in 2019, and things took off from there. He's now got ZomboDB completely implemented in Rust. PGX is really stable at this stage; it wraps a large amount of the Postgres API. Obviously not everything, but it's getting there. And in fact, since then, PGX has come under the ownership of TCDI, which is the company that's its new custodian. Eric now works for TCDI as part of that move; he's working on some stuff for them, but he's also brought PGX under their ownership. The product's staying open source. From my perspective, as someone who doesn't work at TCDI but is a contributor to PGX, the main difference is that there's now a whole team being paid to work on PGX, whereas before it was just Eric and one or two other people. So I think the move's been a really positive thing. And at Timescale (an obligatory link back to my employer), we use Rust and PGX in a couple of places. We have a thing called hyperfunctions, which are functions we make that wrap up time-series functionality you might want to use, like statistical analysis; they're written in Rust with PGX. And the Promscale database extension (Promscale is one of our observability products) is written using PGX as well. Personally, I contributed the background worker code and the shared memory code to PGX. So what can PGX do? What can it help you with? Apart from being safe, fast, and easy, what are you going to get from using it? You can develop functions in Rust, including stuff like aggregates and triggers. You can map between Postgres and Rust types, including storing Rust types in the database.
You can access the database through SPI, the Server Programming Interface, which is, I suppose, the main way that C functions access the database. You can create new Postgres config options, which are called GUCs in Postgres, so you can create new ones for your extension. You can work with Postgres memory contexts, or actually, a lot of the time PGX will work with memory contexts for you, so you don't have to worry about Postgres's concept of memory contexts. Postgres uses something called a memory context to store variables in different regions of memory that are cleared at different times; one particular variable might be freed at transaction end. You don't really have to deal with that stuff when you're using PGX. You can create background workers, which run in the background and execute code. You can create executor and planner hooks and transaction callbacks, so you could implement, or re-implement, something like pg_stat_statements if you wanted to. You can read and write some types of objects into Postgres shared memory. And most of all, you can remove some of the mental load which comes with writing C code that's going to run in Postgres. I asked Eric for a quote when I first did this presentation, and he said our goal is basically to make Postgres and Rust easy and fun. On top of that, PGX can help you develop, too. You can quickly create new extension templates with cargo pgx new. You can create the SQL files needed to install your extension, so the days of manually coding up SQL files to install an extension are gone; it does that for you when you run cargo pgx schema. And you can install, configure, compile, and initialize all the versions of Postgres that you need for your particular project (the last five major versions, I think) with cargo pgx init.
It'll install those into your home directory, compile them, configure them, and get them ready for you to inject your extension code into so you can do testing. You can run your extension inside one of those Postgres versions and interactively test it with cargo pgx run, which will drop you into a psql prompt. You can unit test your extension across multiple Postgres versions with cargo pgx test. Then, right at the end, if you want to deploy it, you can create installation packages for your extension with cargo pgx package. So it's really helping you from creating an extension right through to pushing it out to production. So if we just look at a really basic example (the second half of this talk is the adventure part, so we're going to be doing some live coding), this is just implementing a sort of hello world. Basically, lines five to eight on the slide are the payload: that's creating a new function which is going to return a hello-tester string, and when I create this extension in Postgres, I'm going to be able to run select hello_tester(), and that function will run. And then there's some test framework stuff down the bottom, which we'll look at a bit later. If we wanted to implement the max aggregate, it's a bit more complicated, but still pretty easy, and definitely a lot easier than it would be if we were writing it in C. We'll look at that a bit more later, because that's what we're going to be talking about: aggregates and PGX. And if you want to implement a background worker, which is a Postgres process that runs in the background and is able to do things like connect to the database via SPI (and, because you're using Rust, can actually do anything you want to do), we've got this much code. It's still only about 30 lines of code, and it's fairly easy to see what it's doing. So I talked about an adventure before.
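For readers of the transcript who can't see the slide, the hello-world payload being described looks roughly like this. This is a sketch assuming the pgx crate's macro names from around this era (pg_module_magic!, #[pg_extern]); it builds inside a cargo pgx project, not standalone.

```rust
use pgx::*;

// Required once per extension: sets up the Postgres module plumbing.
pg_module_magic!();

// Exposes this Rust function to Postgres as a SQL function, so after
// CREATE EXTENSION you can run: SELECT hello_tester();
#[pg_extern]
fn hello_tester() -> &'static str {
    "Hello, tester"
}
```

The test framework stuff mentioned at the bottom of the slide comes back later in the talk, when cargo pgx test is run.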
I suppose let's start the adventure part. So as I said, I've been at Timescale for, I keep wanting to say three months, but maybe it's been somewhere between three and four. The first few things I did in my days at Timescale were: one, meet the team, virtually of course, and marvel at how everyone's really nice (and actually, we're hiring at the moment if you're looking for employment). Number two was get up to speed on the products and tools that we use and start thinking about some content ideas. And number three was browse the Timescale GitHub repo to have a closer look at the code bases and find issues people have been reporting, which is always a pretty solid idea when you join a company. And one of the things that I saw right off the bat was this person, and I have no idea who they are, who created an issue in the Timescale GitHub repo. If we actually open it up, they're saying they want to add min-timed and max-timed aggregate functions. They say: first of all, thank you for your work on TimescaleDB. In our application, we calculate a minimum and maximum value; we also want to know at which time it happened, the first occurrence or the last occurrence of the extreme value. So basically, they're saying they want an aggregate where I can run it and it'll do, let's say, a max, but it won't hand me back just the maximum temperature; it'll hand me back the time at which that maximum temperature first occurred. They've actually got some SQL code to create a test for this, and some SQL code here that's implemented: what they've actually done is implement a custom aggregate function in PL/pgSQL. And if we go down to the bottom, it says: but it turns out the custom aggregate function is way slower than the solution above.
I read that writing it in C would greatly improve performance. So I thought that maybe TimescaleDB users need this kind of calculation and maybe the Timescale devs could help. Personally, they're not sure they want to take the effort and risk of writing it in C, but you guys are surely more proficient. So I suppose the point of this talk is to show that with PGX and Rust, you don't actually need to be super proficient to write this kind of aggregate; it's actually fairly straightforward. So, hang on a second: there's someone asking for help in our public GitHub, the problem they're asking about can be solved with PGX and Rust, and it's got the potential to be the topic for a talk, and also the potential to allow me to make one of my first talks a live coding disaster. So I thought, let's go for it. Now we're just going to go through the steps of what this person, or we, would need to do to implement the aggregate. We're going to call it max_timed, and just to make sure we're on the same page: if I run max_timed on some data, it's going to return the maximum value from the data, but also the timestamp of that maximum value. If we have a look at the code they provided, they're creating a type here, so it's going to be returning a custom type, instant_value, which is an instant and a value: the time and the maximum value. PGX can't actually use custom types at the moment, so for our test case, we're going to be returning JSONB with those two values inside it. So first of all, we're going to set up PGX. We're going to have a terminal and run through this, but we're going to install cargo-pgx. If you want to install Rust first, which you'd obviously need to do to get PGX working, I've got the command down there, but I'm not going to run it, as it would take quite a while to compile and install Rust.
Once you've got Rust installed, this will give you the cargo package manager, and you can run cargo install cargo-pgx. Then you can initialize all your Postgres instances with cargo pgx init, and then we can go on to creating a new project. Actually, before we dive into a console, I might just ask if there are any questions here.

Yes, there is. We have one question so far, and it is: are PGX and PGXS, which has been part of PostgreSQL for a long time, related?

No, they're not. And in fact, one thing I should have said at the start is that there's actually another framework called pgx as well, to further confuse matters, which is the Go one; it's a client library that allows you to connect to Postgres. But no, there's no relationship between the Rust PGX, the Go pgx, or PGXS.

Okay, so then I have a question: how did they all get named? Like, how did that happen, given that it is so confusing?

That's a good point. I can't answer that question because I'm not the one that named them, but I believe Eric just really liked the name PGX and went for it. It is a little bit confusing at times, but they are quite different use cases, in that one is written in Go and is about a client-side connection, and one's on the server side and in Rust. But yes, it's always a bit upsetting when multiple things have the same name.

Sure is. That's all the questions we have for now. This is a great reminder for attendees to get your questions in as they come to you. And James, back to you.

Cool. So if we open up the terminal. This is just a terminal. I've installed Rust already, Rust gives me the cargo package manager, and then I'm just going to run cargo install cargo-pgx. cargo-pgx is actually a plug-in for the cargo package manager. So I'm going to run that, it's going to have a think about it, it's going to see it's already installed in this case, and say you don't need to install it. Then I'm going to run cargo pgx init.
Doing that previous command gave me this pgx sub-command of cargo. I'm going to run that, and you can see what it's going to do: it's going to download five different versions of Postgres, so 10, 11, 12, 13, and 14, the newest dot-revision of each of those. It's going to download that code, unpack it, configure it, and compile it, then install it as a version I can run, into the .pgx directory in my home directory. While that's happening, because obviously that's not going to happen in the blink of an eye, if we switch over to our other tab, we can explore some of the other functionality we've got. We can do cargo pgx new and then give it a name, so we can say hello_world. It will create a hello_world directory for us, and inside, in the source directory, we will have a hello_world function. This is exactly the same as we saw before, but it's just showing that you can create a template if you want to. Unfortunately, what you can't do is create a template of an aggregate. I've actually got a PR open to get that going at the moment, but it needs a bit more work before we can put it into production. In the future, I'm hoping, if the PR gets accepted, that you'll be able to do something like cargo pgx new --template aggregate hello_world_aggregate, and it'll create an aggregate example, and you can just jump in there and make minimal changes rather than having to write the whole thing from scratch. At the moment, you can do cargo pgx new -b for a background worker and it'll make a background worker, but right now it's background worker or bare function only. So if we flick across into an editor, what we can have a look at now is this weather.sql file. This is just showing some of the sample code we've got to do some testing once we get our aggregate up and going. Basically, we're just making a weather_data table; it has an instant, a temperature, and a city.
We're creating an instant_value type, which we're going to be using for the demo type that the GitHub user created. Then we've got the aggregate implementation that the GitHub user created, which we're going to benchmark against. And then we're copying a whole bunch of data into the weather_data table. It's a whole bunch of cities (I want to say in America, but I'm not a hundred percent sure about that); a whole bunch of locations with the maximum temperature for a particular hour, it turns out. Then, if we go into this max_timed directory, which is what I created earlier, we have our aggregate code, which we'll go through. At the top, we're always going to have use pgx; that's Rust's way of saying we want to use the PGX library. We've got, on line three, pg_module_magic, which is just a macro that sets up the Postgres environment and enables the extension to talk to Postgres. Then we're defining our state for the aggregate: we've got a struct with a max and a timestamp inside. In Rust, the Option there just means that it can either be an i32, so a 32-bit integer value, or it can be nothing, so the value None; and we're doing the same for the timestamp. And then we're going to define our actual aggregate. We've got this pg_aggregate attribute attached to an implementation, so we're basically saying: this is an aggregate, we're implementing the aggregate code, and we're telling PGX we want our aggregate to be called max_timed. The state of the aggregate, the state that the aggregate passes around, is using the Postgres internal model, and the arguments for the aggregate are a timestamp named timestamp and an i32, a 32-bit integer value, called max. And then, finally, we've got a finalize value which says that when we finish running the aggregate, we're going to pass out JSONB.
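For readers following the transcript without the screen, the heart of that aggregate state can be sketched in plain, standalone Rust. The names and the i64 timestamp stand-in here are my own, not the exact pgx code on screen, which uses pgx types and a real Postgres timestamp:

```rust
// Standalone sketch of the max_timed aggregate state and its per-row
// transition; field names and the i64 timestamp stand-in are assumptions.
#[derive(Default, Debug)]
struct MaxTimedState {
    max: Option<i32>,
    timestamp: Option<i64>, // stand-in for a Postgres timestamp
}

impl MaxTimedState {
    // Called once per row: the strictly-greater comparison means a later
    // equal value does NOT overwrite the timestamp, so we keep the time
    // of the FIRST occurrence of the maximum.
    fn step(&mut self, value: i32, ts: i64) {
        if self.max.map_or(true, |current| value > current) {
            self.max = Some(value);
            self.timestamp = Some(ts);
        }
    }
}

fn main() {
    let mut state = MaxTimedState::default();
    // (timestamp, temperature) pairs; the max of 30 first appears at ts = 2.
    for (ts, temp) in [(1_i64, 10), (2, 30), (3, 30), (4, 20)] {
        state.step(temp, ts);
    }
    println!("max = {:?} at ts = {:?}", state.max, state.timestamp);
    // prints: max = Some(30) at ts = Some(2)
}
```

The real extension then has a finalize step that serializes this state out as JSONB, which is the return type just mentioned.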
And if you remember, the reason for that is that we can't return a composite type with PGX at the moment. There's actually a bunch of work happening right now to make that possible, so I think if you check back later this year, maybe even in a month or two, that should be implemented. Then we've got our state function. This is the function that gets called for every row our aggregate processes. For people who aren't super familiar with aggregates: you create an aggregate and it's got a state function; you give it a bunch of rows, and it runs the state function for each row to do some sort of comparison to work out what's going to happen. In our case, this here is the payload of the aggregate: if the value coming in from our argument is greater than the one in our internal state, then set the internal max to that value and set the timestamp to its timestamp. Obviously, the second time it sees that value it's not going to be greater, it's going to be equal, so it's not going to set the timestamp again. Then we've got a finalize function which is just taking our internal state and pushing it out as JSONB. Once we get composite types going, you won't actually need that function at all. But then what we can do is jump back into our terminal. Actually, are there any questions before I move on to actually running the code?

Yeah, so two questions, two and a half questions, shall we say, have come in. The first is: can this patch into existing Postgres installs?

So I think that question is asking: can I install PGX-managed extensions into existing Postgres installs? We're not really patching, I suppose; it's not like we're changing the Postgres code. We're using the Postgres extension mechanism; we're extending Postgres by putting .so files into it. And the answer is yes. We've actually got, I'm not talking about it in this talk, so I don't use it very often, but we have cargo pgx install.
So you can actually install it into a Postgres that is defined by the pg_config in your path, and that will install it into that version of Postgres rather than into the .pgx Postgreses. Whether or not you should be compiling code and injecting it into a running Postgres probably depends on whether it's a development machine or a production machine, but the ability's definitely there.

Great. And then someone else followed up that question with: if so, what is the earliest PostgreSQL version?

All right, okay. If we jump back over to the other terminal, the earliest Postgres version at the moment is 10.21, so we don't go earlier than that. We're just supporting the current major version and the four below it, for a total of five versions. So no, you can't get PGX to push into 9.6 or something like that at the moment. To do that would actually be non-trivial, because you'd have to go and pull in all the 9.6 header files. It's definitely possible, but it would be a bunch of work on the PGX side.

Awesome. And then a little bit of a broader question came in: every experience with a different programming language has a cost. How long did it take you or others to learn Rust at various levels of proficiency?

It's a really good question, and I suppose that is kind of the hidden iceberg of my talk, which I normally mention at the end. Obviously, what we're doing here is a trade. I would say the target audience for PGX is people who are familiar with Postgres, who have at some point in their career wanted to extend Postgres, but haven't been able to because they can't code in C. You used to have one option, but now you have two. You can either learn how to code in C and then implement the code yourself, and if you do that, then that's great, that's really noble of you.
It's probably pushing the Postgres project forward more than it otherwise might, because you can then go on to commit to Postgres and contribute code to Postgres. Or you can use PGX to do it. Personally, because of the way that I know I write code, I find it very comforting knowing that PGX is removing, at compile time, many classes of errors which can cause critical problems. So I like PGX for that reason, but obviously I had to learn Rust to get that. Rust isn't the easiest language to learn, although I would say it's easier if you're coming from something similar to Rust. I've seen people pick it up very quickly, and I've seen some seasoned programmers struggle with it for a long time. My hope would be that to write a Postgres extension with PGX, you need to know Rust, but you don't have to be an expert in Rust. Obviously, that becomes less true the more complex your extension is. So yes, that's actually a really good question, and you do obviously have to learn this new language, if you don't know Rust, to use PGX.

Awesome, thank you. That's all the questions we have for the moment.

Cool, okay, let me jump back into this other tab. We can now do cargo pgx run. Oops, right, I'm in the wrong directory, so we're going to go back into the max_timed directory, which is the one that includes the source code we looked at before. So this is the code we were looking at just before: cargo pgx run, and it's going to compile that code. Rust can be a little bit slow to compile, so don't be fooled by the fact that it compiled in like two seconds; I had actually pre-compiled that code. But it will compile the code, and it will probe the code to detect what the entities are. You can see there it's discovered three SQL entities: two functions and one aggregate.
It will make the SQL code that we need to create that extension, and it'll inject all of that into a PostgreSQL instance. In this case, it's using PostgreSQL 13, because that's the default. It'll create a database (in this case, it's using an existing one), and it will drop us into psql. And then, I'm going to show that it doesn't exist yet. We don't have any extensions; I've got a max one, which is installed just for some of the benchmarking later on. And if we do a \d, we've got nothing here. If we look at the functions, we've got the ones that came from max. We create extension max_timed, and then we've got this max_timed PGX aggregate. Then what I'm actually going to do is load the weather data that we talked about earlier by loading that weather.sql file. Now we've got this public weather_data table. We run our select, and we get back the max value and the timestamp at which it happened, in this case as JSONB. Any questions there?

Nope. Just one observation: that 144 pages of C code extending PostgreSQL compiles in less than three seconds.

Yeah. I think that's comparing the compilation time of C to Rust. If so, then yeah, C does compile a lot faster than Rust. If I was to compile all of this stuff from scratch, it would have been a lot slower. But when I say a lot slower, I'm talking like 30 seconds, which isn't that slow on a new Mac, but it's enough time that I'd be sitting here twiddling my thumbs looking at you while it's happening, which is why I didn't show it.

Okay. So now we've got this max_timed PGX aggregate installed. What else can we do? We can actually run cargo pgx schema, and that's going to show us the schema that it's going to create. This is what gets injected into our Postgres instance: it's showing the functions that it's creating and the aggregate that it's creating.
And it's got a bunch of comment code showing how it's mapping things internally. This is actually a real killer feature, because historically when you create functions or aggregates or triggers or whatever, you had to go and write the SQL file that creates them as part of your extension. You don't have to do that with PGX — it just does it for you, which is really great. Another thing we can do, which is pretty neat, is cargo pgx test. This is taking a bit longer because it's having to recompile something for some reason — maybe something got invalidated — so you'll probably get a more accurate representation here of how long it takes. As I said, it's enough time that you have to sit here watching it, but it's not really that much time. Almost there. What this is actually going to do — and we'll jump in and have a look at the code in a second once it's run — is fail this first time, which we're going to talk about. It's running some tests against our code and then telling us what the outcome was, and we can see our tests have failed. If we jump back into Visual Studio Code and go right down the bottom, we'll see we've got these tests down here. Inside our tests, we're creating a table called test_table, which is similar to the test data we've got, we're inserting some values, and then we're running our function and saying what we think we should get back. So we're testing our aggregate from within our code, and we get the ability to run that — this fits in really well with deployment pipelines, that kind of stuff. I suppose the first question anyone's going to ask is: why on earth is that failing? And the reason it's failing is that I'm actually in the Australian time zone, so it's not comparing the timestamps correctly, because the timestamps coming back are in UTC.
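To illustrate that failure mode outside of Postgres: a timestamp stored and returned in UTC won't equal the same wall-clock time interpreted with a Sydney offset. This is a hedged plain-Rust sketch with hand-rolled offsets — no timezone library, and the constants are illustrative rather than taken from the demo:

```rust
// Why a UTC-stored timestamp fails a comparison against a local-time
// expectation. Timestamps are epoch seconds; the Sydney offset is hand-rolled.
const AEST_OFFSET_SECS: i64 = 10 * 3600; // UTC+10, ignoring DST

// Epoch second for a wall-clock time written down in Sydney: take the same
// wall-clock time read as UTC, then subtract the zone offset.
fn sydney_wall_clock_to_utc(epoch_if_read_as_utc: i64) -> i64 {
    epoch_if_read_as_utc - AEST_OFFSET_SECS
}

fn main() {
    let stored_utc = 1_622_505_600_i64; // what the aggregate hands back
    let expected = sydney_wall_clock_to_utc(stored_utc);
    // The test's expectation is off by exactly the zone offset, so it fails.
    assert_ne!(stored_utc, expected);
    assert_eq!(stored_utc - expected, AEST_OFFSET_SECS);
}
```

The fix in the talk is to pin the test harness's time zone via configuration, rather than comparing UTC values against local wall-clock expectations.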
And the point of that is just to show that you can do neat things like have configuration options that will be used for your test harness. Then we can jump back here and run cargo pgx test again — it'll take a bit longer, and then it will come back and say, hopefully, that it's been successful. Cool. And then, other things we can do — I'm just checking how much time I've got left: about 20 minutes. One of the things you might be wondering is: how fast does this stuff actually run? Okay, this is something I didn't show before — this is just the finishing of compiling everything up. We would run pgbench here; I really thought I had a command for it somewhere, but it seems I don't, so I won't actually run pgbench because it would take me a while to work out the command again. But I've got some results here. So this is where we are: we've run the project, we've tested the project, and we're going to talk about packaging it in a second — but first, some benchmarking. If I run the PostgreSQL C max function a thousand times with pgbench, against a table of 21k records, it runs in about 10 milliseconds. PGX, if I just implement the max function that we saw before — so not max_timed, just the max — runs in about 11 milliseconds. The SQL version — max_timed, but implemented in SQL — runs in about 19 milliseconds. If I use the max_timed aggregate with PGX, that runs in about 13 milliseconds. If I do grouping with max_timed in SQL, it's about 650 milliseconds, so it blows right out. If I do grouping with the PGX version, it comes down to 35 milliseconds. And if I use the PL/pgSQL version that was submitted by a GitHub user, that runs in about 115 milliseconds. So we're running fairly close to what the C code would be. I've no doubt that if I implemented this in C, the C version of the max_timed aggregate would be slightly faster than the PGX one — maybe like 12 milliseconds versus 13 milliseconds.
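The big gap in the grouped case (about 650 ms in SQL versus 35 ms with PGX) comes from the custom aggregate making a single pass over the table, keeping one (max, timestamp) pair per group, instead of effectively re-scanning per group. A rough plain-Rust sketch of that single-pass shape — the function name and data layout are invented for illustration, not taken from the demo code:

```rust
use std::collections::HashMap;

// Single pass over (group_key, value, timestamp) rows, keeping the max value
// and its timestamp per group — the shape a GROUP BY custom aggregate takes.
fn grouped_max_timed(rows: &[(&str, f64, i64)]) -> HashMap<String, (f64, i64)> {
    let mut out: HashMap<String, (f64, i64)> = HashMap::new();
    for &(key, v, t) in rows {
        out.entry(key.to_string())
            .and_modify(|e| {
                if v > e.0 {
                    *e = (v, t); // new max for this group: keep its timestamp too
                }
            })
            .or_insert((v, t));
    }
    out
}

fn main() {
    let rows = [("syd", 21.0, 1), ("syd", 25.5, 2), ("mel", 18.0, 3)];
    let g = grouped_max_timed(&rows);
    assert_eq!(g["syd"], (25.5, 2));
    assert_eq!(g["mel"], (18.0, 3));
}
```

One scan, O(groups) state — which is why the aggregate version scales so much better than a per-group subquery once grouping is involved.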
But the point isn't that we're better — it's that we're very close to the C code. And obviously this aggregate isn't really doing anything, so the more work you have inside the aggregate, the less noticeable the overhead is going to be. Any questions there on the benchmarking stuff? Nothing has come in. Okay. So one final thing we can do is run cargo pgx package. I've actually got Postgres installed in another place on this system, so I'll run cargo pgx package with minus c pointing at that pg_config. What that's going to do is still compile the code, but using the pg_config located from a different instance — so not a PGX-managed Postgres instance. It's going to compile the code, but it's not going to drop me into psql. What it's going to do, inside the target directory, inside the release directory, is create this max_timed-pg13 directory — I'll do a tree on there. So it creates an installation tree that I can then use to copy the code onto that server. That pg_config could have come from a different server or from this server — obviously they'd have to have the same architecture. I'm running a Mac M1 here, so I'm compiling for Mac M1; if I wanted to push it up to a Linux x86 server, I'd need to compile on a Linux x86 machine. But once I had that, I could tar up that directory, copy it to the root of the other server the pg_config came from, and untar it, and it would install my max_timed shared object and my max_timed SQL and control files — which make up the extension — onto that machine, ready for use. So just to recap: we've created a PGX environment from scratch, we've created a new PGX project, we've edited some code, we've run the project manually, we've tested the project, and we've packaged the project for deployment. I think it's safer, faster, and easier.
At least from my historical DBA point of view — I'm not someone who's been working with C in Postgres for a long time — it beats using C. Will we put this aggregate into Timescale? Probably not, because if you add the correct index to work with, then we see something pretty strange happen — well, not strange, pretty expected. The Postgres max_timed aggregate goes all the way down to 0.3 milliseconds while the PGX one stays at 13, and the grouped Postgres max_timed goes down to 17 milliseconds versus the PGX version's 35 milliseconds. So, as you can see, creating the right indexes is probably better than writing a custom aggregate. But having said that, when I gave a similar talk to this in Prague quite recently, there were I think 150 people there, and afterwards four separate people came up and said: that's the exact aggregate we want, because we're trying to do this with a lot of grouping, and we just need an aggregate because of the way we're writing queries. So it's definitely something that's useful. It probably won't be going into Timescale, but the demo code is in a GitHub repo if anyone wants to download it and play with it, which I'll pass on afterwards. So what are some real-world examples of things we've made with PGX? In my previous job before Timescale, at GiroTech in Sydney — we were a Postgres consulting and support company — we used PGX in production a bunch of times, for things that we would never have been able to do, or never would have wanted to do, in C; we wouldn't have gone anywhere near them. We had an extension which helps implement Oracle's synchronous replication semantics. I would suggest that people don't need those, but this particular customer was adamant that they wanted them. So it's things like maximum availability, maximum performance, and maximum protection — which are really marketing constructs, but they are ways of backing off synchronous replication when machines go down.
So rather than saying "if my synchronous replica goes down, I'm just not going to do any writes", it allows you, in some circumstances, to back off and keep allowing writes to the primary. Another extension spawns a background worker which wakes up on a loop, looks at the Postgres configuration settings, and sends any changed settings to an external interface, which can then be logged in Grafana as an annotation. So you can be looking at your database throughput in Grafana and see a vertical line when a config setting has changed — what the setting changed from and what it changed to. That was implemented because we were doing a lot of testing. We had another one that uses a custom GUC setting to specify rules for reducing the OS priority of some database sessions — it essentially re-nices database sessions, which was a hard requirement for that customer. It's probably not the absolute best way of doing that, but it worked out for them. And another one tracks sub-transaction counts to work out when you're headed for the limit of 64 per transaction, which there isn't really any way to track explicitly in Postgres. If you've got a Postgres connection and it starts up 74 sub-transactions — which almost universally happens because an ORM isn't working very well — that's really bad: they get written to disk and it will slow down your whole system. There's a bunch of articles on the internet about that. So we implemented a way of turning on this extension, which would then tell you which commands, or which SQL statements, or which sessions were causing that to happen. Cool. That's about all I've got, I think. Any more questions? Awesome, thank you so much. No other questions have come in; I think we'll give folks one moment in case anything pops up. You're getting clap emojis, which I can only assume means great job. While we're waiting, I'll just flick this up.
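For a sense of what that sub-transaction extension has to keep track of, here's a hypothetical plain-Rust sketch of the bookkeeping — the names `SubxactTracker`, `begin_subxact`, and `end_subxact` are invented for illustration, and the real extension hooks into Postgres internals rather than counting like this:

```rust
// Postgres caches a fixed number of sub-transaction IDs per backend; past
// that, lookups spill to disk and slow everything down. This sketch just
// tracks the depth per session and flags when the limit is crossed.
const SUBXACT_CACHE_SIZE: usize = 64;

struct SubxactTracker {
    open: usize,
    overflowed: bool,
}

impl SubxactTracker {
    fn new() -> Self {
        Self { open: 0, overflowed: false }
    }
    // Called when a SAVEPOINT (or ORM-issued sub-transaction) starts.
    fn begin_subxact(&mut self) {
        self.open += 1;
        if self.open > SUBXACT_CACHE_SIZE {
            self.overflowed = true; // past here, performance degrades
        }
    }
    // Called on RELEASE/ROLLBACK TO SAVEPOINT.
    fn end_subxact(&mut self) {
        self.open = self.open.saturating_sub(1);
    }
}

fn main() {
    let mut t = SubxactTracker::new();
    for _ in 0..74 {
        t.begin_subxact(); // e.g. an ORM issuing 74 SAVEPOINTs
    }
    assert!(t.overflowed);
    println!("overflowed: {}", t.overflowed);
}
```

The extension's value is attributing that overflow flag back to the sessions and SQL statements that caused it.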
So Timescale has actually got a State of Postgres survey open at the moment — it's open till June 30th. So if you're interested in helping us understand the Postgres landscape, it would be really appreciated if people could fill out the survey. All the data is going to be anonymized and available to download if anyone wants it, and also presented in, I think at this stage, a whole bunch of Grafana panels. One last question and then we'll go: what was the nastiest surprise you ran into? The nastiest surprise I ran into when using PGX was when I was implementing the shared memory support. I sort of understood how the Postgres model for shared memory worked in principle, but I found it really hard to map that back to working in a safe way, because you're fundamentally talking about something that's not really done in a lot of programs these days, so it was quite hard to talk to Rust experts about how it worked. It was just quite a hard problem: trying to safely encapsulate Rust objects and put them into shared memory in such a way that they can be read back by Rust in another process. Awesome, thank you so much. So with that, thank you so much, thank you to all of our attendees for spending some time with us, and thank you James for taking us on this little adventure. I hope to see everyone at future Postgres conference webinars. So with that, have a great morning, afternoon, or evening, and we will see you soon. Thank you.