I didn't realize that I had to compete with Star Wars out in the hall, so if you have any trouble hearing me please let me know. And as for the critics down the back row: if I get something wrong, just hold it off until the end of the session, okay? I've got one objective for this session, and that is that I want everyone in this room to walk out at the end equipped to have a conversation with your manager or your customer on Monday about why they should be doing data warehousing on PostgreSQL. If you go out of here with four ideas on why it's a great thing to do, then I've achieved what I wanted from the session.

It's traditional at these events to say a little bit about yourself. Well, that's me. More to the point, that was me three years ago, because three years ago I ran Microsoft's big data operation across the whole of Asia. I was out there selling SQL Server, and selling MPP SQL Server, right across Asia, to the sorts of customers that you're working with now. But I like this little quote: there's no zealot more fervent than a reformed sinner. That's me, I'm reformed, and let me tell you why. When I was last working with Microsoft we kept coming up against this company called Greenplum, and if there's a God out there, it is the God that gave Greenplum to EMC, because on that day Greenplum stopped kicking our butt and for some reason turned their focus inwards. In every POC, in every sales opportunity, they were beating us, and when I looked into the reasons they were beating us, I kept seeing this name PostgreSQL coming up over and over again. At the same time I was meeting with customer after customer across Asia where, with the volumes of data they were dealing with and the architectures they were dealing with, SQL Server was becoming too expensive. Instead of being a tool that helped them cut the cost of Oracle, it was actually a tax on
getting the work done in developing economies. As I started looking at this, one day I just woke up, like Paul on the road to Damascus, and had this vision that said there's got to be a better way. That was the day I decided I was moving on from Microsoft, and I was going to take a very, very hard look at this product called PostgreSQL. Eventually, having done that, we started a business called Ajilius, which is in the business of data warehouse automation.

So, who in this room has worked with a data warehouse? Okay, that's only about a third of you, maybe a quarter, and that's what I expected, so let me do a little bit of a scene setter on why data warehouses are important. First: that big rock is Oracle, and that little person is the PostgreSQL guy. When you are trying to switch people off Oracle to PostgreSQL, you are Sisyphus pushing that massive great rock uphill, and at the end of every night it rolls down to the bottom, and you start again and push the rock back up. And you know why that is? That's because of this horrible word called applications. To switch a database from Oracle to PostgreSQL usually means rewriting applications, and that's hard, because application developers have gone to a lot of effort to make Oracle work properly in their applications, and application vendors have certified their products on Oracle. So even if you convince a customer to switch, you still have to go through the problem of ISV certification, which is where you want SAP or PeopleSoft or whoever it might be to say: we will certify and support that application you used to run on Oracle, now that you've put it on PostgreSQL. I've got as much chance of walking out the door, buying a 4D ticket, and retiring rich on Monday as I have of the average ISV doing that. But there is a better way, and that way is peace and serenity and business intelligence. Why is business intelligence and data warehousing a great candidate to move? Because it doesn't have applications:
you haven't written transaction processing applications, you haven't written screens and forms and reports and business logic and all of that type of thing that sits on top of your database. What you have is a product like Qlik or Power BI or Tableau or Business Objects, or even Excel, that runs on its own and connects to a database. So all of a sudden your problem shifts from porting a database and a massive application to porting just a database, and that's a much easier problem to deal with. Data warehousing and business intelligence is where the next hundred thousand corporate users of PostgreSQL are going to come from. Think about that as I talk through the rest of this session.

Why do you need a data warehouse? Well, the first reason you need a data warehouse is if you want to integrate information from multiple systems. If you want to take information from your ERP system, your HR system, your general ledger system, pull it all together, and make it available for unified reporting, that's a use case for a data warehouse. The second one is where you want to integrate internal and external data. Now, people often start to talk about sentiment analysis and things like that. That's rubbish, okay? Even if you get sentiment analysis, most of the people in this room work for customers who might see one tweet per week if they're lucky, and you can't do sentiment analysis off the back of that. If you're NTUC, maybe you can; if you're SingTel, maybe you can; but most of us don't work for organizations like that. When I talk about internal and external data, what I'm really talking about is customers and suppliers. If you need to interchange data with people, if you want to bring in information about forward order loads or whatever it might be, and integrate that with information from your organization, that's another use case for a data warehouse. But then we get into the two most common reasons, and the first of those is to
make sure that we're talking about one version of the truth. How many times have you seen in an organization that in one system your user ID is Ron D, in another system you have the ID 823456, in another system you're known as Ronald James Dunn, and in another system you're known as Fred, because the person who did the data entry couldn't figure out how to spell your name and just made something up? Happens a lot, right? So this is why we use a data warehouse: so that we can bring together and cleanse all of that data from all of those different systems and package it up, so that regardless of what the source system calls me, when I do reporting I'm known as the one entity, Ron Dunn.

The last one is a little bit harder to grasp, and this is about analyzing data in historical context. Who did you work for five years ago? And who do you work for now? Two different companies, right? This is historical change in data. If I wanted to count the number of employees that work for Fujitsu right now, for example, I would do a count across my database and I would get a number. But that's not necessarily the answer if I wanted to know how many people worked for those companies five years ago, and this is where we need to keep the history of data over time, such that I know not only your current employer but who you used to work for. Where it comes up in organizations is restructures: we shifted a product from one category to another last year, or we've restructured our organization and now China reports into GCR rather than APAC. If we need to be able to compare data at rolled-up levels at different points in time, we need slowly changing dimensions in a data warehouse in order to have that historical context. So we need a data warehouse.

The other thing that's changed in the industry is the traditional BI architecture. Ten years ago, when you went out and bought Hyperion or Business Objects or Epic or a
product of that type, you bought a product that talked directly to a database. Sometimes you might find that you needed a little bit more performance for aggregated queries, so you put another product in here, which was an OLAP server, such as SQL Server Analysis Services, for example. But in each case, whenever I ran a query in the front-end tool, that query was executed against a back-end data source. That's ten-year-old architecture. The current architecture looks like this, where Qlik or Tableau or Power BI each have their own embedded data engine sitting within the product — and in fact, as a small point of interest, both Qlik and Tableau use PostgreSQL as their embedded data engine. What happens in these products is that periodically — it might be once per day or once per hour or once per minute — you refresh that product's cache from the back-end database, and your interactions then work with the product's cache. Why is this important? Because it means that databases that might not necessarily have the very highest scale or the very highest performance are now more than adequate to deliver business intelligence, analytics, and visualizations. So all of a sudden the lights should start to come on: where once someone told me I needed a Teradata in order to process an extremely large volume of data fast enough for business intelligence, now that I have these caching products, maybe I can go back a step, and PostgreSQL isn't such a bad product after all when it comes to the data warehouse.

And that's what I want to talk about now. I want to talk about four common problems in data warehousing and what it is that PostgreSQL does really, really well to make it a great platform for a data warehouse. I'm going to illustrate that with a couple of little photographs from around Singapore. Three years ago I was living in Singapore, and for those of you who aren't from here: if you were fortunate enough to be here in August, you would walk around the streets to the
smell of burning. It's a festival called the Hungry Ghost Festival — a bit like the European night of the dead, for example — where people come back to visit, and you'll see burning happening in the street. People will be burning replicas of cars or houses, or simply money, or offering food, or whatever, to make the people who have come back more comfortable. I have a son — well, he was three years old then; he's six now — and he watched the people doing this, and it must have sunk into his head in some way, because one night we were sitting at Dunman Road Hawker Centre eating our dinner, and he wanders off. As he wanders off we heard a lady say no, no, no, and we looked up, and he'd taken his mother's coin purse and thrown it into the fire. So that's what you call throwing away money, and that's the key to the first of these stories, which is about the cost of data warehousing.

I'd like to show you two graphs. This is the on-premise graph; it represents the comparative cost of running SQL Server versus PostgreSQL. When I say on-premise, this is the licensing cost alone for an eight-core server. When you think about it, the typical environment is going to run an AlwaysOn scenario, for example, so it's going to run two actives, so you can double those prices. We're looking at Enterprise Edition, which is what Microsoft will recommend for a data warehouse workload: $55,000 US. We're looking at Standard Edition — well, if you were looking at, say, a terabyte, you might be told that Standard Edition is satisfactory: $14,000. PostgreSQL: zero. No-brainer so far, right? Big savings to be made there. But what that's ignoring is the cost of the associated hardware and so on that's necessary to run your platform. Now, you're possibly wondering why I'm not showing Oracle in there. No point. Oracle's lost the market as far as on-premise data warehousing goes; that's why they're making such a big push for cloud at the moment. Here across Asia, most companies
that you need to look at as targets for data warehousing migrations are sitting on this platform, and there are hundreds of thousands of them across Asia at the moment. So let's move to the next one, which is cloud-based, because AWS RDS actually gives me a great basis for comparison with PostgreSQL, including hardware. What I did here was look at a db.r3.2xlarge, which is an 8-core processor with 64 GB of memory, roughly the equivalent of the machine I sized here. If I do that for Enterprise Edition on a three-year term, all paid upfront, which is the cheapest way you can buy it, the three-year cost is just on $100,000. If I did it on Standard Edition: $48,000. If I did it on PostgreSQL: $9,000 — and that's everything to run your data warehouse in the cloud at that point, on PostgreSQL. So all of a sudden the economics really start to look good here, and this is takeaway number one that I wanted you to have for your conversation on Monday morning when you talk to your boss or to your customer: Mr Customer, wouldn't you like it if I could cut $100,000 from your IT budget next year? Pretty good argument for the little bit of services work that's going to be necessary to shift that data warehouse.

The next picture: my second-favorite airline. My favorite airline is Asiana, out of Korea, but Singapore Airlines has for a long time been probably the world's best airline, and this is a picture of their A380 when it first came into production and deployment here in Singapore. The A380, for those of you who haven't flown in one, brings with it one significant problem, and to put it in database terms, it's ingestion. Gates at an airport are the things that these planes hang off, right? And gates are on slots; a slot is where you park your airplane, and slots are very, very expensive. The longer you have a plane sitting there, the more you get charged. This is a problem for an A380, because an A380 carries a very large number of people, and to get all of those
people onto a plane as quickly as possible, they needed to come up with a new gating system to direct the people into the various areas of the plane, get them on and off quickly, so that they could keep the slots turning over and not have to pay a fortune for parking an A380 at an airport. And this brings me to the next part of the problem, which is loading.

Loading is how we get data into a data warehouse, and you'll typically hear it called one of two three-letter acronyms: ETL or ELT. Most of you have probably heard ETL. That stands for extract, transform, and load, and it says that I extract data from my source system, I do some messing around with it, and when I'm happy with its shape I put it into my data warehouse. ELT stands for extract, load, and transform. This works really, really well with bigger volumes of data, because you get your data out of your source system, you load it into your data warehouse, and then you mess around and clean it up using the power of the data warehouse engine. So ELT is — let's call it a v2 of ETL — a better approach to the process of getting data from point A to point B, where point B is your data warehouse.

So I'm going to demonstrate what is nice about PostgreSQL when it comes to loading data. Samia mentioned that I was going to show something about our product — I'm not, really. Okay, I am going to show a part of our product now, and I'll talk to anyone later, but this is about PostgreSQL, not about Ajilius, so I'm not going to explain everything that's in the user interface. Basically, our application has an interface that looks like this, and we're ingesting data into our system from third-party databases, repositories, whatever it might be. I've defined some metadata, and on disk I have a very simple equivalent, which is a table coming from a Microsoft retail inventory data set — you'll see more about this in a moment. In our application you click here and you say show me the scripts, and we've generated a bunch of DDL and a bunch of Python scripts that handle all of the ELT process. So I'm going to create that table in my database, and while that's happening — it just takes a couple of seconds, because I haven't hit the database in a while — let me scroll down, and right here is the important part of this line. Unfortunately my screen's a little bit condensed, but it's basically this statement. Now, is there anyone in the room who has worked on the COPY feature of PostgreSQL, from the PostgreSQL contributors? Can I kiss you afterwards, please? Oh — sorry, sorry. See, now he backs down. Okay, I would love to hug the person who invented COPY, because COPY is one of the great competitive weapons of PostgreSQL, one that's being picked up by other databases in the meantime. There are two reasons for that. One, it's very flexible with the types of data it can ingest. But number two is this little word here that says STDIN: the ability to stream data from another process, that may be on another machine, is just a fantastic benefit when it comes to simplifying ELT within PostgreSQL.

Let me give you an example. I'm going to click this load button and it's going to execute that script. It's just waiting for a couple of seconds — this particular one will take about 18 or 19 seconds to run — but what I'd like you to look at is down here where it says script success, just about where it says waiting for localhost, in about three, two, one... okay. Now, that was a little bit slower than usual, and I'm not 100% sure why, but let's look at this number here: 8,013,099 rows in 22 seconds. That normally runs in 18 and a half seconds on this machine, which equates to 450,000 rows per second. That is a great volume of data to stream into your data warehouse.
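To make the COPY-based ELT pattern concrete, here is a minimal hand-written sketch — not the generated script from the demo. The table, column, and file names are all illustrative assumptions:

```sql
-- Illustrative staging table for a retail inventory feed (names assumed).
CREATE TABLE stage_inventory (
    store_id   integer,
    product_id integer,
    stock_date date,
    on_hand    numeric
);

-- The L of ELT: COPY ... FROM STDIN streams rows from another process,
-- possibly on another machine, without landing a file on the server first.
-- From a shell, anything that writes CSV to stdout can feed it, e.g.:
--
--   gzip -dc inventory.csv.gz |
--     psql -h dw-host -c "COPY stage_inventory FROM STDIN WITH (FORMAT csv)"

-- The T of ELT: transform inside the warehouse, using the engine's power.
-- fact_inventory is an assumed target table.
INSERT INTO fact_inventory (store_id, product_id, stock_date, on_hand)
SELECT store_id, product_id, stock_date, on_hand
FROM   stage_inventory
WHERE  on_hand IS NOT NULL;
```

The STDIN streaming is the point: extract on the source machine, pipe across the network into the load, then clean up with set-based SQL inside the warehouse.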
Let me put eight million rows in context. Eight million is the number of passengers that passed through Changi Airport in four months. If we kept every passenger record from there, we could put them all into PostgreSQL in around 20 seconds. Eight million is the number of passenger vehicles — passenger, not commercial — that Toyota built in 2014. We could take the vehicle record of every car that Toyota built and put it into PostgreSQL in less than 20 seconds. This is huge.

We have a data warehouse customer in Italy, a company called D2. D2 is a high-end fashion label for young women around the ages of 19 to 24 — they're not exactly my demographic, put it that way. They have an interesting scenario, because they're a collection-based fashion house. That means they don't have products that run for a year; they might have one dress of a particular type, they might have five pairs of shoes of a particular type. When they go through their ERP process and build out their planning data warehouse, they're rebuilding the data warehouse completely every hour, and this is for a global business. Every hour they do a complete rebuild, and it takes them five minutes, so they're always working with absolutely current data, where everything could change within their data warehouse. This is probably, hands down, the best feature of PostgreSQL when it comes to data warehousing: the performance of COPY is just fantastic.

So let's go back to the presentation, because there's one other thing I'd like to talk about. When people talk about ingesting data, everyone says to me: why don't you use FDWs, foreign data wrappers? Well, they're okay, but it's a balancing act, right? And for me the balance is on the COPY side rather than the FDW side. FDWs are great in two scenarios. One is where you want a federated database arrangement, where you want to be able to, within a transaction — and I mean a business transaction, not a database transaction — perhaps get a piece of information from
this system, bring it into your application, update it, and then send it back again. They're great for that type of operation. They're great for casual use, where you want to bring some data temporarily into a PostgreSQL database. But they have some pretty severe limitations at the moment. They've got limitations around authentication; they're not as fast as a COPY statement; if you're running across a network you can't compress, and there's a slower method of network transmission that has to happen there. So FDWs, in my mind, are not quite there yet as an ingestion method. Our testing has shown that COPY is still a far better solution in our application. When FDWs do get there, then we might rethink that decision, but right now, on balance, look at COPY rather than FDWs when it comes to ETL and ELT.

Okay, one more thing that I really, really like in the product, and here I'm going to switch to a database editor. This is something called a block range index, or BRIN. Now, I spoke earlier about our database comprising facts and dimensions. Facts are the numbers we want to look at; dimensions are how we slice and dice them. So if I have a fact that is a sale: it occurred on a date — that's a dimension; I bought a product — that's another dimension; and I bought it in a particular store — that's another dimension, right? One of the challenges in data warehousing is that when you've loaded your dimensions — you've loaded every one of the million products you have, and the changes that happened to them that day — you then want to be able to access them very quickly when you do joins to your fact table. So indexing of dimensions is important. But if you're dealing with very large dimensions, such as every customer of DBS Bank or every subscriber to SingTel, having an index on those tables is going to slow down your ETL process, because you have to update the index at the same time as you do your data ingestion. So the typical approach in data warehousing is to drop your indexes, update your dimension, and then re-index your tables.
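That drop-and-rebuild pattern looks roughly like this — a sketch with made-up table and index names, not code from the talk:

```sql
-- Pay the indexing cost once, after the bulk load, instead of row by row during it.
-- dim_customer and its index are illustrative names.
DROP INDEX IF EXISTS dim_customer_bk_idx;

-- Bulk-load the dimension with no index to maintain.
COPY dim_customer FROM STDIN WITH (FORMAT csv);

-- Rebuild once the data is in place.
CREATE INDEX dim_customer_bk_idx ON dim_customer (customer_bk);
```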
Let me give you an example of that. I'm going to create a little dummy table here, as a select from that table we just loaded. So I'm creating a table called brin_test, and we'll execute that; this will take six to eight seconds to run — hopefully it doesn't take ten like it did last time. Just a little bit slower... okay, 8.9 seconds; we're running a little bit slower today for some reason. Now what I'm going to do is a count, so that we can see how many rows are in that particular table. And please, dear God, top of the list: will someone fix PostgreSQL count(*)? It is hands down the one head-shaker. When anyone from another database looks at PostgreSQL, they say: why is it so damn slow to count? Okay, I've read the docs, I've read the wiki, I know the explanations — but just fix it, make it appear to go fast even if it's not a hundred percent accurate, just fix it. So we've got eight million and a few rows there, which is what we had before.

So now I'm going to re-index the business key column in that table. Well, I'm not really, and the reason for that is time, because it would take three to four minutes of silence to run that particular query, which is CREATE INDEX — a B-tree index, in this case — on that particular table. It would fire away, and eventually I'd have an index sitting there. Not great, because this shows that a large proportion of my ETL time is going to an index rebuild. So some clever person came up with this thing called the block range index, and what it does is look at every page in the database and, for the index column, store the minimum value and the maximum value on that page. It means that rather than inspecting every row on a page and asking do I need this particular row or not, it can now say: okay, I don't even need to look at that page, because the value I want is outside the range that's there. So you can throw away a huge amount of the data that you previously had to look at when you were using an ordinary index.
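A self-contained version of that experiment, if you want to try it yourself — this uses a generated table rather than the demo's inventory data, so all the names here are assumptions:

```sql
-- Build a table whose indexed column correlates with physical order,
-- which is the case where BRIN shines (e.g. a date column loaded in sequence).
CREATE TABLE brin_demo AS
SELECT g AS id,
       date '2015-01-01' + (g / 10000) AS tx_date   -- roughly ordered dates
FROM   generate_series(1, 1000000) AS g;

-- The BRIN index stores only the min/max of tx_date per block range.
CREATE INDEX brin_demo_tx_idx ON brin_demo USING brin (tx_date);

-- Compare the index footprint with the heap; the BRIN index is tiny.
SELECT pg_size_pretty(pg_relation_size('brin_demo'))        AS table_size,
       pg_size_pretty(pg_relation_size('brin_demo_tx_idx')) AS brin_size;
```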
So let's take a look at what that means. We'll run that, and it's only going to take maybe 15 seconds, and we've indexed our entire table. This is a huge saving in ETL batches. If you're looking at an overnight batch window, where all of your processing has to be completed before 6 a.m. so that the people who come in early in the morning can do work on your data warehouse, every second you save in your ETL processing is important, and here, in 12 seconds versus the four minutes it was previously, we've re-indexed that table. But even better, there's another benefit that comes from it. If I come here and run this query, I'm taking a look at some of the details of the objects I've created there — ah, darn, I didn't create the other one, okay. What's interesting is that the table itself is 713 megabytes. If I'd run the query that created the B-tree index, it would have been about 800k; in this case we've got 40k in the BRIN index. That's going to sit very nicely in memory without taking up too much space; it's a high-performance index. This is a really, really good strategy for star schema data warehouses and the joins that come out of them. So, another feature that I really love about PostgreSQL. Let's go back to the presentation.

Okay, here's another photo of Singapore. For those of you who are visitors, you might have used the train — that's the Singapore MRT, run by SMRT. SMRT for many years was the best train system, hands down, across Asia. In the last two or three years, particularly the last two, SMRT has been having some reliability problems, and this is a photograph from last year of an outage where trains between Bishan and Yishun were out of service. Now, this is all being remediated at the moment, but the reason it came about was one thing: there had not been enough maintenance done on SMRT over that long period when it was the premier train service in Asia. It'll come back — it's being worked
on right now — and it'll be back as the best. But things go wrong when you don't do maintenance, or you don't allow for maintenance, within your environment. My daughter looked at me putting this presentation together and she said: Dad, don't use a selfie. Okay, it's not me, it's not me, I promise — I don't have a shirt like that. But what this does show is that sometimes we can burst our limits, and the next demonstration I want to do is about a feature that I think is an unsung little hero in PostgreSQL. I call this one preventing data bloat.

So let me go back to that editor I was using just a moment ago. I want to come down to here and create a test table, and this table is going to have three columns: a name, a currency, and an amount. It doesn't mean anything — just a text field, a character field, and a number. So I'm going to create that particular table, then I'm going to put some data into it, and I'm going to show you what those rows are. Okay, we put some values in there: we've got Ron Dunn, Australia, $100; Keegan Chew from Singapore, $9,000. Now, what often happens — the number one reason that data warehouses break — is that the designer of the data warehouse declared their column like I've got up the top there, varchar(30), or had a numeric field that they declared as, say, decimal(10,2), and as a result of business growth or some other change in the environment you've got an overflow. Let's take a look at some classic examples. This is a real name; this is a gentleman who was pivotal in India in the fight against British colonialism. Let's just try and run this one, and we see this horrible error: value too long for type character varying. That's a bit of a problem. Here's another one — this is my wife, in fact, who's from Hanoi — and we're going to put in $10,000 US, which is like 22 billion in Vietnamese dong, right? If we insert that into our database... whoops, we've overflowed that particular number. Or third — perhaps even
worse — is where I want to put in a rate associated with my dear friend Herman Tan from Jakarta, and I want to put in a value here which is 123456.7891, because for any rates to do with Indonesian rupiah you need to go to four or six decimal places. In this case, well, that worked — but look what happened when I do that select: we've got .79, because that was a comma-two field. Now, each of those is a problem for maintenance in the data warehouse, and this last one's probably the worst, because you've lost significant data and you didn't know it.

This is where PostgreSQL has one absolutely, insanely beautiful feature — I would wash the feet of the person who did this, that's how good it is — and that's these unconstrained data types, which say that I can declare just text or numeric, and it doesn't care what goes into it. So let me recreate that table. The amount of work this has saved me in growing people's data warehouses, you would just not believe. I can sleep at night because I know I'm not going to get a numeric overflow in someone's database, thanks to this feature. Okay, let's put some of that data back. In this case we'll take these three rows here, we'll put them into the table, and then I'll run that query again, and what you see here is that it worked. Look over here: we've got zero decimals, two decimals, four decimals. It is storing the value that it got from the underlying data store. This is just a feature that is of immeasurable value when you're doing business intelligence and data warehousing, and this is the third one that I want you to think about for your customers. Okay, let's go back to the presentation — a couple of minutes to go. Yep? Yes — leave it to the database to figure that out; don't put that in the schema. Okay. In a transaction processing system — and thinking back to something that CK talked about this morning — in a transaction processing system, your job is to get one record as quickly as possible and write one record as quickly as possible.
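The demo's schema change amounts to this — illustrative names and values, following the talk's examples:

```sql
-- Unconstrained declarations: no length, no precision, no scale.
CREATE TABLE amounts (
    name     text,      -- no varchar(30) to overflow
    currency text,
    amount   numeric    -- no numeric(10,2) to round silently
);

-- A four-decimal Indonesian rupiah rate survives intact, as does any name length.
INSERT INTO amounts VALUES ('Herman Tan', 'IDR', 123456.7891);

SELECT amount FROM amounts;  -- returns 123456.7891, nothing lost
```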
Some elements of performance might be real concerns there; in a data warehouse, I don't care. Yep — we're not going to optimize for the character length; we're going to store the number of characters you give us, and we're still going to hop to the next value right there. Even better, varchar and text are exactly the same. Yeah — and it's not like SQL Server's text, which isn't an indexable primary key column; text just works in the database, it's fine. You're right that you have to properly control your application, but for the common business problem — someone changed a field in a source system or something like that — this is a great solution.

Okay, next one. Singapore's nearest neighbor is Malaysia, and the way you get to Malaysia is through something called the Checkpoint. The Checkpoint is the funnel of death, particularly on the eve of Hari Raya, which is when this particular photograph was taken, when every Malaysian with a motorbike or a car wants to get back home for family reunions and celebrations, and it can take hours and hours and hours to get from Singapore to Malaysia or back again. And this brings me to the last part of the conversation, which is scale. The reason I call it scale is that one of the reasons this doesn't work is that it funnels into a little bottleneck: they just don't have enough counters or gates to handle the number of people coming through the borders at that particular time of year. So, what happens in your database when you don't have enough scale capacity? These moves are really, really easy as a result of a decision to use PostgreSQL, and these databases — Redshift, Postgres-XL, Greenplum, and Vitesse, which we've heard more about today, in particular its DeepGreen variant — are so easy to move to from PostgreSQL that it's just a breeze. I'd like to do one more little demonstration, though, which is from
our product, because we build data warehouses through metadata. As I showed you before, we capture information about what it is that you want in the data warehouse; you tell us in a declarative manner. But we have a little feature here that says if you built your data warehouse on PostgreSQL, you can do something called migrate, and you can pick another platform here — in this case we'll say Redshift, for example — and we'll call that particular data warehouse PG Day Asia, and call that one Chinook PG Day, okay, and we'll migrate it. That's now rewritten all of the ETL code and the migration jobs to pick up your data warehouse that was on PostgreSQL and migrate the entire thing to Redshift, and have it up and running in five minutes once you've migrated all of your data. Don't believe me? Let's take a look. We'll go and look at the load scripts in that particular case — list the load tables — and if we have a look at that inventory job we did just a little while ago... is that a good example? No, a better example would be a dimension. Here we'll look at the scripts, and see that in this case we're generating distribution and sort keys for Redshift, because we have that knowledge about what the application should look like. So migration is yet another great reason to be using PostgreSQL for your data warehouse, because you have a path ahead of you no matter how big you grow within an organization. It is not lock-in; you're completely open on where you can go once you've built your data warehouse that way.

Okay, I was a little bit ahead of myself — I went and did that demonstration one slide early. Let me come back and do a bit of a wrap-up at the end of the day. This is one of my favorite places in Singapore, somewhere called Pulau Ubin. Pulau Ubin is an island just off the coast of Singapore. Until last year — I think they've gone now — they were starting to close down the last of the villages that actually lived on that island, but it's a
beautiful spot, a great spot to go for relaxation if you've got some time free, and it gives me some pleasant thoughts when I look at it. As we start to wrap up this discussion of PostgreSQL data warehousing, these are the four things that I wanted to equip you with for your conversation on Monday, so that you can go to your boss or to your customer and say: here are four reasons we should be putting our data warehouse on PostgreSQL.

The first one is value. You are going to save at least $100,000 by putting your database on PostgreSQL rather than SQL Server. I've shown you the maths; you can show them the same thing.

The second one is performance. You get better load performance than in a SQL Server environment; if I'd had a bit more time I might have done a comparative test, but I wasn't set up for it today. 450,000 rows a second is nothing to be sneezed at, and you could take all of your company's data and ingest it in a very short space of time.

The third one is maintenance. Those context-free data types are just fantastic for avoiding the most common problem that comes up in data warehouse management, which means that the callbacks and so on that you might get as a result of having your data warehouse on that platform are minimized.

The last one, the one that I really like, is scale and freedom. No matter how big you get, your PostgreSQL data warehouse has a path to another destination. It's not a massive rewrite; it's not a radical "oh my gosh, where do we go, let's start again, we're completely incompatible". It's a step away from some of the biggest platforms that are out there, and having the freedom to choose is what I really like about this platform and about what I've been hearing over the last couple of days.

Let's go back to that little view of Pulau Ubin. Call to action: this is the last slide that I've got, and this is what I want you to do. I want you to go and talk to a SQL Server customer this week about slashing the cost of data warehousing and BI. If
they'd like to do that, and if you'd like to talk to us about how we can make it even faster to get their data warehouse up and running, by all means feel free to contact me. But in the interest of your customers, go and do this. Thank you, everybody, for your time today. Okay, we've got time for a couple of questions.

One of your very first few slides indicated comparative costs. Yes? For that edition, I think, just the software licensing went up to somewhere around $50,000 per year, and you listed Postgres as zero. Would I be correct to assume that that $50,000 actually includes support costs? No, no. Microsoft licenses on the basis of what they call L and SA. L is the licensing cost; the so-called support cost Microsoft calls Software Assurance, and it has another couple of things in it. That was L-only pricing that I put there, and it was list pricing. Most big customers are on something called an Enterprise Agreement, which will have tiered discounting depending on where they are, but for mid-sized customers that's how much they're paying. No, I haven't; that's just the database licensing.

Any other questions? Yes. Let's say I have a huge amount of data, and that's why the storage cost is quite important; in this case, is it better to keep it in a Postgres database, or should I just store it as compressed CSV? Wow, that's a tough use case. If it is structured data, put it in the warehouse, because the cost of storage is getting cheaper and cheaper. If your major problem is the cost of storage, think about moving it to Redshift: a thousand dollars per terabyte per year on three-year reserved storage for slow processing. But I think you're probably going to be more satisfied keeping it in the warehouse. What's a huge volume of data for you? Ah, nothing, don't worry about it, no need to change. I heard an example today of a 100-plus-terabyte warehouse from someone who was speaking earlier, I forget who it was. One of our test data warehouses is five terabytes. Just don't worry about it; that's a nice number to
run on a PostgreSQL data warehouse. Yes, please. A lot of us have been working with Postgres for a long time, and it's just refreshing to see somebody come to it brand new and see a lot of the benefits that I think we get used to and don't think about, because we just live with it. You come here from the outside, and it's a breath of fresh air; you point out things that I think a lot of times we take for granted. Yeah, thank you for that. But look, it's a great product to be involved with, and it's a great community. I've really enjoyed it, and I'm really pleased that I made the decision to come to PG Day and to meet a lot of you; I think this is one of the strengths that the product has going forward. So fantastic, well done. Any other questions? One last question, perhaps? No? Then I'll hand it over to you for the break.
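The back-of-the-envelope numbers behind the wrap-up and the Q&A (the 450,000 rows-per-second load rate, the roughly $50,000-per-year SQL Server licence figure, and the $1,000-per-terabyte-per-year Redshift storage price) can be sanity-checked quickly. This is a minimal sketch: the three constants are the figures quoted in the talk, while the workload sizes in the usage lines (a billion rows, a 100 TB warehouse, a three-year horizon) are made-up examples, not numbers from the session.

```python
# Rough arithmetic behind the talk's cost and performance claims.
# The three constants are the figures quoted in the talk; the workload
# sizes used below are illustrative examples only.

LOAD_RATE_ROWS_PER_SEC = 450_000       # demonstrated PostgreSQL load rate
SQLSERVER_LICENSE_PER_YEAR = 50_000    # quoted list price, L-only licensing
REDSHIFT_STORAGE_PER_TB_YEAR = 1_000   # three-year reserved storage price


def load_time_hours(rows: int) -> float:
    """Hours needed to ingest `rows` at the demonstrated load rate."""
    return rows / LOAD_RATE_ROWS_PER_SEC / 3600


def license_saving(years: int) -> int:
    """SQL Server licence cost avoided over `years` (PostgreSQL is $0)."""
    return SQLSERVER_LICENSE_PER_YEAR * years


def redshift_storage_cost(tb: float, years: int) -> float:
    """Reserved Redshift storage cost for `tb` terabytes over `years`."""
    return tb * REDSHIFT_STORAGE_PER_TB_YEAR * years


if __name__ == "__main__":
    print(f"1 billion rows load in about {load_time_hours(1_000_000_000):.2f} hours")
    print(f"Licence saving over 3 years: ${license_saving(3):,}")
    print(f"100 TB on Redshift for 3 years: ${redshift_storage_cost(100, 3):,.0f}")
```

At those rates a billion rows loads in well under an hour, and two years of avoided licensing already clears the $100,000 saving claimed in the wrap-up.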