 Hello everyone and thank you for tuning in for this new age engineering episode about PostgreSQL extensions. I'm Adam Furmane, I'll be your host for today and today with me we have two fantastic guests, Itaï Brown, CTO of Metis and Álvaro Hernandez, founder and CEO of Ongress. Thanks folks for joining. Before we actually move on to the topic would you like to introduce yourself a little bit? Itaï, how about you? Sure, so my name is Itaï, I'm co-founder and CTO of Metis. I work with databases more than 20 years, large banks, worked at Microsoft in the UK as an advisor of some of the top customers, moved back to Israel, opened a professional services company all around databases and then a startup that extract data from database and put it in mobile. It was later acquired by ServiceNow and now Metis, which we'll talk later, also helped developers to tackle database related problems. Cool, sounds good. Álvaro, how about you? Well, thank you. You already introduced the headline of what I am. Actually a lot of people called me Álvaro in Spanish is Álvaro, both worked for me. I've been basically working with Postgres for more than 20 years already. Postgres has been my database, go-to database for almost everything. Now I founded a company called Ongress. Ongress means on Postgres. So it's pretty obvious what we do, providing both professional services and products and services for Postgres. Obviously I love Postgres. I'm also a well-known member of the Postgres community. I founded a nonprofit organization in Spain to help Postgres develop in the world called Fundación Postgres. I've been doing databases that Postgres has been the database that I started with and the database that I've always worked with because Postgres always fulfilled my needs. So the time I work with other databases is either because I'm doing research which I love doing and I try to know all the spectrum of all potential possibilities around the world but also or because I'm helping some customer migrate to Postgres. Sounds good and as you can see folks pronouncing names is the hardest part in computer science. Go let us move on to the topic of our meeting today. So we gather just to discuss like extensions to Postgres, why they are successful, useful, helpful and other stuff. So let's start with that. Why even talking about extensions, what's so good about them folks? May I just start with a short story when I don't do Postgres for 20 years unfortunately more like four and when I moved from MSQL to Postgres I asked the front of mine who did the process before I said what do I need to know, what are the big differences and he said extensions that's the power of Postgres and ability to extend functionality but you have to understand that unlike other databases when you just have the standard edition enterprise edition and this is it, you have hundreds actually more than a thousand extensions and as a developer or a DBA part of your role is to know what to install, when to install, how to install. So it brings lots of power but with great power comes lots of responsibility and you need to know what to do about it. They said this is a mean, a new change. I said okay good now I'm aware about it and let's see what it means. Absolutely, it's been one of the most loved features of Postgres considered across all the history of Postgres and this is not just factual, there's data supporting actually there's a great yearly report that timescale runs and they ask many questions by the way 2023 edition is running right away so you can go and fill it in and extensions always come as one of the top three most loved features of Postgres. So what are essentially extensions and where so loved and so important that can even change the decision to go with another database or not like you say detail right. So extensions the way you define them is kind of like plugins for your browser you have a browser Chrome or Firefox whatever you prefer and you know that you can install extensions sorry plugins on it and they enhance the functionality in many potential different ways right from adding something to the context button on the mouse to performing something on the background to take a screenshot whatever it is right. So Postgres extensions are pretty much like this it's a it's a general framework very loosely established so you can do this basic things as adding one function programming PLPG SQL the stored procedure language for Postgres to just the new function a new data type to all the way down transforming Postgres into a distributed database right and everything in between. So extensions are what makes Postgres database that is yes not Postgres and that's it end of the story but rather Postgres with many other features many other capabilities that you can pick and select and install into your database and use them at will. So it changes it's a game changer and it makes Postgres behave in so many different ways with many of the different capabilities. Obviously you can develop your own extensions and it greatly enhances the flexibility and the in general the extensibility of Postgres which is also what defines Postgres database mainly. And this is something unique to Postgres what about other open source databases do they have something similar? There's nothing really like extensions there's there's other databases for example MySQL has pluggable storages right with Postgres you can create a pluggable storage as an extension also for the main part but the way of distributing code as an extension essentially what we need to think about an extension is that it prevents you as long as you're called your variety that you're creating out of Postgres can't fit within the extension framework and by the way framework is a loose term because it's not like a super well-defined framework but if things can fit as an extension you don't need to fork Postgres. So many other databases to add some kind of functionality they need to fork the database to create a you know separate branch whereas with extensions in you can fit within the extension idea and most use cases fit within the extension idea then you're done you don't need to fork anything you need to create different variety. So give me some examples our listeners probably are interested what can I do with those extensions what are your best picks for the extending Postgres SQL? So as you said like all over the place I mean if you think of I don't know monitoring you can get deep insights to how the execution plan work and there's some statistics information and how the buffer cache works on the other hand if you are a developer and you need geographic information to create maps and routes then there is an extension called PostGIS by the way some of them are commercial and managed by companies some by the market they're for time series and I think when we finish the session and I want to just share some a list of more than a thousand extensions in the most popular ones and I just named a few in Alvaro I'm sure you can just add another 50. Definitely well there's a number one used extension which is a Pitchestad statement which is a simple one but quite important that helps you monitor query performance right almost everybody uses Pitchestad statement but it's far not perfect at all but it's a good extension to get started. PostGIS you mentioned it already is critical for any any GIS turns Postgres into probably the most advanced GIS relational database right so it's it's pretty it's critical for that as you mentioned there's also extensions that are are the goal of some some companies right like timescale for example which turns Postgres into a time series database there's there's two versions of this extension one Apache 2 and 1 timescale license which is a different license commercial license but it's transforming Postgres into an optimized time series database with a lot of really really great features I mentioned before that an extension can turn Postgres into a distributed database and one such example is Cytus an extension also which actually has a very interesting story because it started as a fork of Postgres so Cytus long time ago it was a fork of Postgres it was complicated to bundle everything into an extension but then they figure out the way and they refactored Cytus into an extension and today it's an extension that Cytus got acquired by Microsoft and even improved further this extension and make everything that open source by them not everything was by before that and and basically can charge your database so you can have a coordinator and multiple workers and your data will be split across all the workers and you can basically scale out Postgres but this is another one that I love and I use extensively there's other extensions that are quite important there is extensions for crypto for you know doing any cryptographic function like pgcrypto a lot of people use it for encryption on the database a little bit manager yourself but you know and then there's kind of a small tiny extensions that are convenient all over the place for example there's extensions for UIDs right UID OSP and and there is extensions for for many use cases like I don't know you mentioned it there's approximately a thousand extensions there's no official repository or anything like that but it's something we can talk along the way through this conversation but so we cannot give an exact number of how many extensions are but I guess 1k is a good this is a good estimate I think maybe peachy vector is a good example of the extension okay just a reminder maybe six months ago we didn't even know what embedding in chat was only few know about it suddenly everybody wants to use embedding and vector databases and yes our database is dedicated for that or you can use Postgres with the peachy vector and the fact that you can not only install it easily but literally every month I read about oh we improved that we have new enhancement we have now better performances we support this we support that and you can the fast the sorry the rate of new features and new versions of the extensions that's a very very powerful and I think a great example of how Postgres can win using those extensions that yeah in my sequel there are some geographic data types but you can't compare them to the breadth and depth of and and the rate of like new versions of Postgres being released with new features so probably two good examples yeah I think this is following the model of out of tree development right we can trace here a parallel with for example in Kubernetes the CSI drivers the container storage interface at the beginning most of them were in tree meaning that they need to get updated with the cadence of Kubernetes development and they kind of need maintained or the expectations of those drivers were kind of leaning on the main Kubernetes development group instead of maybe the authors of those drivers which sometimes are very cloud you know cloud vendors specific now they are most of them are off or all of them out of tree and they are iterated differently they're iterated at the vendor space at this provides much more flexibility and a different set of expectations right and actually it's also important to consider to understand what are the expectations the extensions may be developed under the the same criteria in terms of security performance than Postgres codebase less or more right so it's it's also important that there's there's also a great variety there but you mentioned a great extension out super popular pg vector because of this reason and it's evolving very quickly I don't know we can bring many old examples there is great extensions for stabilizing query plans for example one one one issue that sometimes occurs in in heavily OLTP databases is that you may experience a plan fleet plan fleet is a query that is running usually according to a given plan which is good for you always using some indexes doing some basic joins whatever and it's responding within a few milliseconds and then suddenly this query starts taking three seconds to respond and and obviously your database goes down is essentially I mean if this database is executed 50,000 50,000 times per second right and then what happened is that the statistics were a little bit off and Postgres flipped the plan to another one but so there's an extension for this kind of use cases and others which is called pg hint plan and it's an extension that essentially allows you to set up some comments on the query and says you are going to use this plan for this query which might lead to suboptimal plans sometimes but at least stable ones so you avoid plan flips there is a hyper pg for estimating indexes right so if you use this extension you can see the impact of potentially creating an index without creating an index because index creation is not cheap you can involve a lot of IO right so you can do it concurrently with operations it's not a blocking operation but it may involve a lot of IO you don't want to and of course also hinders a right performance so you cannot just create indexes because you want so you can test them with hyper pg there's extensions for security things like pg audit for example I also mentioned pg crypto that also works in space pg audit basically traces your logs and and give you hints of events that have happened on the database like there's almost in any space extensions for postgres and I don't need to remember that you can write your own custom extensions if you want to deploy business logic again that on top of what the market release is kind of like if you want to do some operation close to the database you can run select statement and I don't know you have for example a very complicated JSON with some proprietary schema and you want to extend it and and run queries on top of that you can wrap all of this logic as an extension and deploy it on your postgres so that's another very powerful tool as in system architect you need to consider that as well actually let me make a comment here there was extensions again they're very generic thing they can mean many things there's ways of adding things to postgres but you can add these things in many different ways two main categories of extensions are those that are compiled that essentially are written typically in ccode and they they hook into postgres extension points called hooks and they replace some internal code of postgres with the extensions code these extensions need to be compiled be deployed in a specific way and they're actually loading loading code into your database those are harder let's say to deploy more careful to manage can affect the stability of the database potentially you need to think more carefully about them but there's another category of extensions which are your sql anything that you can write in sql or plpd sql or any procedural language you're using in postgres can be also packaged an extension so these extensions or some like their extensions can be thought of as a convenient way to distribute code to distribute your ddl and there was a few years ago some some code that we wrote to essentially help package your ddl as extensions and deploy exclusively as extensions because then it's exactly what you said in tie everything is packaged and then under a single unit which is actually version and then you can say oh you know i split my ddl into these three modules there's a common set of functions that i use for all the company's projects you can have a base layer we call this common and then there is this for this functionality this for this functionality and all of them are deployed as three extensions on my postgres database and i create extension create extension a create extension b create extension c and i'm done i i've got all my ddl on my database and this is also a very convenient way just to pack your applications that's simple this is also an extension and speaking from the user perspective how do i use the extensions what do i need to do to install them which platforms do i need to use what providers maybe um okay it really depends on the provider kind of like say if you want to start from scratch you open your you go to docker hub you download the an empty pg 15 then there is some manual process of the create extension and sometimes you need to bring the code because it doesn't even have the code on that docker image um so that's kind of like the naive simple answer if you want to use docker most of the audience i think would use platforms such as such as aws rds superbase where part of the power they give the users is the ability to more easily create those extensions so this is kind of like one question how easily can you do that when it come to uh stacker just like one tick of a box and and i'm sure varo can explain more something very very important to understand that if you work with a pg as a service one of these managed service you have to be aware of what extensions they offer and what extensions they don't we in medis encounter a problem and aws do not support the extension we needed this is it end of so you can't argue you can't apply you can't do anything about it when it come to the guy the great support that stuck us to open the ticket we spoke with them a week later we had this another extension i think i value it there's like 200 extensions already or something like that and and if needed yes add more so with superbase there are a short list of extensions but with a single click you can add them so it really really varies to one platform to another yeah you put it very well let me just formalize from a more technical perspective there is essentially three potential steps you need to execute to create a run an extension drive an extension to postgres the first two bring a little bit optional the first one is bringing the extension go to the file system uh so the extension is some some files right depending on its nature is going to be different but it's going to be some files that needs to be present in the in the file system which database can access has access to right and how you bring the these files to that file system what you said it depends on the provider if you're running postgres yourself it's on you for example if you're using a docker container you're probably going to need to build a custom docker image based maybe on the official postgres image then you need to compile the extension and add your own code it's not super complicated but obviously this is a little bit of friction if you're installing postgres like we've had to get you know with dbm or rpm packages some extensions are packaged some are not there is a mechanism for called pgxn where you can where it pulls the source code of the extension and compiles it on your system as long as you have all the dependencies that the extension needs which is also on you so basically you can download an extension compile it it will break and you'll figure out how to fix it right it's a little bit also involved if you're running on a managed services provider then the extensions are already there for you you don't need to do anything but some extensions as you said it's a many extensions are not available we can discuss also about the numbers i can give you some numbers um so this is the first part bringing the extension to the file system the second part is that some extensions need to be added to some configuration parameters some extensions need to be preloaded by postgres into shared memory area and you need to tune a parameter called shared preload libraries so many extensions that do especially the big let's call it the big extensions they need to be loaded via this mechanism and changing this parameter requires a restart of your database they're going to bring it down bring it up back it will take a few seconds of downtime for your database so it's something that for for you know production database you need to handle with care maybe you also need to tweak all the configuration parameters about the extension also adding them to postgresql.com but probably these parameters may not either restart or you'll do it only at once once these two steps are completed again the first one depends on the provider the second one depends on the extension sometimes you need this sometimes you don't then the user experience from this point is very simple you just connected the database that you want to create the extension on some extensions are database dependent some extensions are kind of global so you can pick any database and then just do create extension and then the name of the extension and that's it finally now your extension is available if you use something like pysql you can list the extensions with the backslash dx and you can also list the available extensions in the file system calling a function called pg available extensions pg underscore available underscore and what about updating the extensions so there's there's also command to update the extension but again first you need to make sure that they're present in the file system maybe you also need to update the parameters of all the extensions also depend on your use case and finally you'll you'll update the the extension also with our with our command line there is all the extensions come with some SQL control files that they provide the update path from version to version okay is there any like c i c d for those extensions so i can get them updated automatically probably not and and and it's an it's an important effort for example the the main problem might happen when you do a pg upgrade like if you're upgrading the postures major version one thing you need to make sure is that all the extensions that you're running today will be available in the next major version you're upgrading to them they work the same way and this is not always the case so well any major version upgrade is a is an operation meaning you need to take it careful carefully and requires testing and but this should involve absolutely your extensions there is a project i mentioned just before pg xn this project that will fetch the source comfort of extensions from a repository and try to compile on your local environment there is a project from the same project called pg x docker pg xn or pg xn docker that essentially is helping setting up what you're saying like a cacd where you will start with a base image of a postgres and add some extension and run this on a on a cacd fashion but it will also require if the extension has dependencies it will also require for on your site to improve this and to to get the extensions to compile first place only a few days ago i needed to run some docker container with your pg cron and it's not part of the base in the docker image and a few good people say like this is like the the minimum postgres and something called postgres base with some pre-package extension that you probably might need and even if you don't they can stay there that's fine but we have to always remember the core team of postgres responsible for maintaining the engine and the owner of each extension needs to maintain it so what when i look on an extension before designing whether or not i want to start using it i see how many stars it has who the owner how often they they are updated how quickly they respond to change requests that's a very important part of your consideration and again when you switch to pg15 you have to make sure everything works cool so you mentioned that not all platforms support all extend extensions right what is the reason what is the risk of bringing more extensions to the system well i would say there's there's two main considerations right as a provider you need to make first one is security um if the extensions is just you know SQL code there's there's nothing risky there other than the quality of the code but if the extension is an extension written in C that replace some function pointers within the postgres source code to provide additional behavior and get slowed into the memory space of postgres itself it's something that can literally bring down your database can literally corrupt your data and cause mayor harm to postgres itself right so i'm misbehaving extension can do that so that's a potential security risk when you're loading an extension you should know what you're doing and the extension should be assessed from a security perspective security and and also stability right so that's one the other one is long term that i think or i know that most cloud providers take uh into significant consideration which is the long term stability of the extension i know in detail for example amazon aws has a very very deep uh sense of customer responsibility in terms of maintaining services that are created so amazon's philosophy is that once you publish a service basically they will try not to deprecate it they deprecate it right then we're at least maintained like for a decade the same happens with extensions once they provide an extension there might be a customer using it and nine years down the road this customer say ah i want to upgrade this extension to this given version so there's also a point of who is maintaining the extensions and you know as a vendor you need to make sure that what happens if this provider stops developing the extension yet my customers still want to use it am i able to support this or not so i think those two factors are the main decision drivers especially for cloud managed services whether to add an extension or not the risks of you know blowing your database and it's not that this happens frequently by the way it sounds maybe that some of the audience may be like scared like okay i will never use a postures extension not at all actually i've never seen this case it's kind of a theoretical exercise but it's a theoretical exercise that needs to be done right and and the other one is the long-term maintainability of extensions cool so in case of ongress so alvaro you said that you have what 200 extensions in your platform how do you deal with this long-term stability as you call it so start rest let me make a very quick introduction it's a platform for running postures on kubernetes right it's a it's kind of a give lab for postgres meaning that you have everything that you need for postgres it comes with connection pooling high availability monitoring logs management graphical interface everything that you need to for for postures operation and obviously extensions are a great and most important part of what we do but start rest is not a managed service so we don't have the same you know criteria into putting into this as a cloud service because we are not providing the service the something you run on any kubernetes cluster could be on the cloud could be on-prem it's up to you how you run your kubernetes cluster we just provide you the means for running automated postures on top of it with kind of production production quality right so the extensions that we provide we need to rely on the upstream vendors if they stop maintaining an extension we will also stop maintaining that extension but we don't we're not servicing this promise of a service so i mean we will be sorry about that saying happening but we cannot control that fate no nor we have the hands anyway to do this right we're not aws yet so we we just select the extensions not taking into account this criteria we we actually as you said we are very close to supporting 200 extensions and i think it's also interesting to to to give numbers here so there is approximately 1000 extensions out there for postures as i mentioned before right take it or limit this is the order of magnitude now out of those out of that thousand extensions there is around 50 or 60 that are called country that essentially come with postures package so almost everybody supports these 50 or 60 extensions it's it's just they're just there they're not supposed to have the same quality as the postures package itself but they come with postures so mostly everybody supports this 50 or 60 so the question is how many of the rest of the extensions are supported this i call it third party extensions so if you look at the aws with rds and aurora which is the cloud provider that as far as i know supports a large the highest number of extensions it supports between 80 and 90 which means essentially 20 30 mostly third third party extensions no more than that the others are called core contract you call not all of them are supported by rds neither so there's like around 40 third party extension support if you look at cloud sequel and as your they are around 60 to 70 so less not much less but less i don't have numbers for other providers right now but they're all in this same area around 60 to 80 extension supported most of which are country anyway which are you know well almost everybody else supports in stagras we developed some specific software for being able to dynamically load extensions into postures so we don't need to build container images with them we can just download them directly we created a repository of extensions and we're close to supporting 200 and this number is just where we are today it will be 300 next year or 400 we will see but we are trying to constantly add new extensions there and we again we don't need to care about those long-term maintainability because it's just our choice to provide whatever the market provides and the user who's going to run the service it will it will be the one who makes the decision on whether to add a given extension or not oh cool it's high how do you use extensions in metis then we know those extensions are great you mentioned some of the names now what about the actual use cases what do you achieve with your service and extensions well like two totally different question first which extension do we use internally and what is a platform that want to protect the developers can say about extensions so we use a few extensions internally again everything related to monitoring the pg star statement and pitching back for progression everything that can help you to know what went wrong and kind of like bear in mind that because databases are complicated and because sometimes it's hard even for companies with good qa to track all changes at some point sooner or later the production database is on fire and you need to know why and you need to know which questions to ask so if at this point only then you start asking yourself which extension should i use how am i going to use that that's way too late so you need all of those extension in advance and also bear in mind some of the extension collect data so you want this data to be collected so you can see some some changes and this very same functionality is that we ask and provide we use it internally so again pg star statement pg buffer we're playing with hypo pg a hypothetical index says a great example is a pg plan store that help us looking into the execution plan so you can finally understand what exactly is happening as all the developers remember sql is declarative it just said select stuff on this table here are the logical conditions yeah but how exactly the engine processing information mainly doesn't use any index for whatever reason and only the execution plan can give you this information so we have a curated list of extension that we recommend using we raise alerts to all of our customers if they are not using those extensions and once they start using those extensions we bring insights not just like here is the raw data select star because everybody can do that just reading the documentation kind of like help helping you to understand what exactly is the information you see and alerts to proactively warn you before your database production database start encountering all of those problems okay cool so if we were to leave our audience with some specific call to action what extensions would you recommend to install in the very first place why was the rationale i would say pg plan uh sorry pg star statement no brainer because that shows what's going on in the the history of the table activity database activity pg uh buffer cache because if something is going is wrong with the memory i think you should have pg cron underhand because i can't imagine a production database without some kind of recurring jobs to monitor to delete all data to proactively search for for problems some of them are done externally but sometimes you want to do it internally if you can pg plan store will also show you the execution plan a hypo pg is also recommended just to avoid actually building the index as Avaro said might be very time consuming but you can just assume this in the index exists would you use it in that particular query so on top of my head these are the top extensions if i think of another one i'll write it down in in the blog post alvaro anything you want to add on top of that yeah i fully agree with all the ones that you you mentioned then depending on your use case obviously there's going to be all the extensions you're going to be using as a general one i could maybe think of also pg stat k cache that also helps you get a gather metrics from the ios sub system right um maybe pg hint plan can also be interesting to prevent these plan flips as i mentioned before but then it really depends on your use case there's there's one that i'm using a lot which is called post it comes with postgres also called postgres fdw it's a foreign data wrapper the foreign data wrapper is the mechanism that postgres has to access data over the network from all the sources and this particular one allows you to access data from another postgres and this can be used for many purposes including even for example for for transparently sharding postgres across all the nodes to access data from another node um there is there is a particular extension that i think it's worth mentioning not necessarily because you need it to have installed today but it doesn't harm either which is called pg tle and and tle means to trust language execution environment and it's um it's a mean of running extensions in a safer manner than what we've been discussing here it's essentially kind of a way of loading extensions without having to explicitly touch the file system so you can load functions over the call to the function to pg tle itself and this allows to load more extensions in a safe way into postgres so this maybe is also not about one to half but really it really depends on your use case if you're going to do some sharding you may load sizes right if you want to do time series you may want to load timescale if you want to use text search you may want to use a tgrm three-gram extension if you want to use additional data types for for some 3d functions you may use a load cube extension like there's so many extensions um that that you can go and look into them okay so that creates a great list that we'll share with the audience in the description let's now switch gears folks and blend this episode uh it's hi where are you now i live in hypha at the north of israel cool and what footplace would you recommend over the for the people around the area when when people come to hypha they usually want to see the behind shrine which is like a world heritage place uh unfortunately all around it are tourist traps so you don't want to eat there you want to walk a little bit toward the sea and go to the lux restaurant of chef ala musa modern israeli arabic food really really high quality cool alvaro how about you well uh you're asking me where i am now now i am in madrid but just for a few hours after that i'll be flying to south power in brazil but okay i guess i can answer the question as of literally now in madrid and in madrid there's there's uh obviously a huge community from from latin america from all our brother countries and one of my favorite foods from from latin america's peruvian food is really really rich very very great mixture of asian actually and latin american food so peruvian food is among the the best foods i've ever tasted and there's a really really great peruvian restaurants in madrid one that comes to mind is probably my favorite one it's called pi pi in in plaza del peru so actually very appropriate and this is a delicious delicious fusion food that you can have there absolutely recommend cool thank you for these recommendations and let's call it a day then so thank you for attending this episode about extensions to posgresiquel for our audience if they wanted to reach out to you what would be the best way it's high to ask you additional questions go to our website metisdata.io is at the top corner you will see our a link to our discord channel and you can take a conversation from there cool alvaro how people can reach you well probably the best way is to find me on twitter um i will not pronounce my twitter handle because not easy to pronounce in english but i guess we can add it to the show notes but find me on twitter i also run my personal website on aht.es alfajoteltango.es easy to locate and there is all the information contact information about me but also all the talks i have done they're recorded there slides there's more than 120 talks as of now so there's a lot of material there and there's contact contact information about me and there's obviously my company twitter ongrasink and website ongras.com easy to find me anyway um just a little search thank you for that so if you do have any questions feel free to ask them directly and being that said we would like to mention that soon you'll be able to run a metis with stackgres so we are working hard on our integration so stay tuned for our official announcement when it's ready and being that set thank you for tuning in and see you next time in the next episode follow us on social media look me up on twitter at metisdata and stay safe see you next time cheers cheers always a pleasure and a great flight thank you