Thank you. Thanks for sticking around toward the end today. I want to talk about our new database, Timescale. I'm the CTO of Timescale, and I'm also a professor of computer science at Princeton, so I'll try not to get too academic in this talk.

The interesting thing, if you think about it, is that databases originally emerged in an era of relational data, going back to systems like System R from IBM in the 70s. Since then we've started seeing them used in much broader areas, and one of the things we've seen emerging is that time-series data is everywhere: IoT, financial data, DevOps. And the interesting thing about it is that the volumes of data we're starting to see are much greater than anything before. That's happened because of a bunch of technology trends. We have more ways to capture data, from sensors and IoT devices to servers; we have fatter pipes to backhaul all this data to centralized data centers; and we now have cheaper storage to keep it over longer periods of time. And time-series analysis is powerful: you can look at the past to understand what happened, understand the present, and in many applications do things like predictive analytics to plan for the future.

The problem, and what I'm going to talk about today, is that time-series applications are different from what we've seen before. Think about a banking application; I'm going to come back to this. Even if you're Bank of America, you might have something like a hundred million users, so the primary table of your database might be a hundred million rows. They could be long rows. But think about a streaming application. Again, I'm at Princeton; you wouldn't think a campus would have that many different streams of data.
Princeton's campus has a hundred and eighty thousand streams coming off of its buildings alone. And the amount of data we're now starting to see: even this room, depending on how new it is, probably has a lot of things that are connected. So we're seeing streams of data coming from everywhere, at high volume, and we're trying to use them for real-time analytical applications. Not only that, we often want to tie these time-series streams together across the enterprise: anywhere from DevOps, to product and marketing, to customer service. So the question is, how do we build infrastructure to make this easy?

Well, you could say, just use a database. We actually have a funny story about why we started doing this. We're not that new a startup; we were initially building an IoT platform, so we really faced this problem ourselves. We wanted to handle a high volume and high velocity of data, and the problem we encountered is that a lot of existing databases weren't suited to this. With relational databases, Postgres in particular (and I'm very much a fan), you get something easy to use, solid, reliable, with a great ecosystem, but typically hard to scale. On the flip side, we have all these new NoSQL databases, which if anything sell their scalability, but you give up a huge amount of query power and the ability to support many types of queries. So in some sense we felt this pain: we sat around trying to build this type of platform and nothing out there fit what we needed. So we set out to build it ourselves, and eventually we decided that was the bigger opportunity, which is why I'm here to talk about Timescale today.

In doing so we needed a whole bunch of different properties. We wanted our database to be time-series centric.
We wanted it to support this large volume and velocity of data. We wanted it to support basically full SQL, to perform very complex and performant queries, to be easy to integrate into existing workflows, easy to operate, and to take advantage of a rich ecosystem like the one Postgres has. That's what we decided to build, and that's what I want to talk about. Interestingly, Timescale is actually built and packaged as a Postgres extension, so you can incorporate it into your existing Postgres database (just CREATE EXTENSION) and it can run alongside the rest of your data.

So what are the three properties we actually wanted? First, horizontally scalable: we wanted to support high write rates both within a single node and across a cluster of nodes, and to perform a bunch of distributed query optimizations for complex queries. Second, reliable, and for a lot of that we're able to leverage the 20 to 25 years of engineering that Postgres has. Third, easy to use: again, it supports full SQL. We didn't want a "SQL-like" language of the kind some NoSQL databases offer, which are really good if you have a key-value lookup, but throw in a complex WHERE clause and they can no longer handle it.

So you could ask: why haven't people been building this, why hasn't it been done before, why is this hard to do in the first place? Why doesn't Postgres solve this today?
I think this really comes back to the roots of what these databases were developed for. As many of you know, they were initially designed to solve the problems of online transaction processing, OLTP. One way to think about it is that they were designed for a particular workload that is different in nature from time-series workloads. By not trying to solve the general problem, but the specific problem of time-series databases, we were able to make a bunch of architectural decisions you wouldn't make if you were solving the general case.

In particular, look at the workload of a traditional database, and think again about that banking application. It's much more common that people move money in their bank than that the bank gets a new user. So whether you have a million customers or a hundred million, moving money from one account to the next is updating one row: debiting from one row and crediting another. What that means is that in traditional OLTP workloads, updates, not inserts, are common. And in particular, you often get random access patterns across the various rows of your database. There might be some locality, but more and more, people send money all around the world. So this leads to two challenges.
It leads to challenges for scaling on a single machine, and for scaling across multiple machines; and we'll see why this is different with time-series data. One of the big problems of scaling on a single machine is that this individual table starts to get quite large, and as your table grows, your indexes no longer fit in memory. So what often happens, even when you're just trying to insert (let alone query), is that maintaining these indexes in real time means you're constantly swapping back and forth to disk. The problem is that these writes go to random entries, which effectively means you're updating random locations in your B-tree. So you have two options. If you use a hard drive, every one of these inserts implies a seek, which is five or ten milliseconds each time. Or you use a flash drive; SSDs are supposed to be faster, but if you look deeper at what happens with SSDs, while reads might be a bit faster, writes can only happen at page granularity. So if you try to update one cell on an SSD, what it's actually doing is reading an entire page from flash (about 8K in modern SSDs), modifying that one cell, and then writing the whole page back.
So that means if you want to update basically four bytes, your I/O cost is 8K in and 8K out of the disk, and there are a limited number of write cycles an SSD can perform. People have attempted to address this, and there have been a number of interesting approaches in the last decade to this update problem, for both hard drives and SSDs. If you look at a number of newer architectures, such as LevelDB or the new back-end for MongoDB, they use a data structure very different from a traditional relational database, called a log-structured merge tree. What this is basically doing (remember, the problem we had before is that small writes became very expensive) is trying to eliminate small writes altogether. In some sense this goes back to log-structured file systems like LFS from the 90s: you keep the latest updates in memory, and only after you've built up a large enough batch, sorted in memory, do you write it as one big batch to disk, where it stays as a sorted table. These on-disk tables are immutable, so you don't have the update problem.
The problem is that now you have all these overlapping batches, so you need another garbage-collection process where you merge them together, re-sort them, write them out, and so forth. And unlike a B-tree, there's no single great index over all your data, so you end up needing to keep a lot of index state in memory to figure out which of these batches actually contains your data. In fact, even updates and deletes are just diffs: you never actually replace data, even deleted data; you just write a tombstone to the latest copy in memory and then write that to disk. So you end up building very big indexes, and you need to maintain them in memory. And that's where the cost comes out. LSM trees are very efficient in their use of the underlying storage, but the cost is really weak support for a lot of things you might want from a relational database, like secondary indexes. There are various ways people have worked around that. Sometimes people put a thin veneer of SQL on top, which basically translates into a full table scan on disk, or they pack what appears to be a bunch of different secondary indexes into the way they encode the key. For example, InfluxDB has the problem that the index grows with the cross product of the cardinality of the columns. Let me unpack that for a minute. Say you have a million different devices and you want to keep an index on them, and you add a thousand locations, and ten firmware versions.
That's still a pretty small database, but it means your in-memory index is going to have 10 billion entries. You're going to quickly find yourself running out of memory, and as you know, memory is about 20x the price of SSD and about 100x the price of hard drive. So that wasn't really giving us what we wanted, although it's a great way to make SSDs more scalable for OLTP workloads.

That was one of the challenges people have addressed for scaling up on single machines. The second part is: how do we scale out to many machines? The old trick, and we hear about this a lot, is partitioning or sharding. You take one server that holds the whole primary-key space, and as you add more servers you split it up, further partitioning the workload and moving some of the data, and you supposedly elastically scale out to however many machines you want. In practice you don't do it lexicographically like I show here; you use some type of hashing of the key space, or something richer like consistent hashing. Now, this is basically the hammer we have for scaling, but if we go back to OLTP workloads, most of these updates are still to random keys. So what actually happens is you often need transactions across these different partitions, across nodes. So we return to much heavier-weight mechanisms, like two-phase locking across machines, in order to run any type of transaction. And if you want this to be highly available, each of these partitions is itself replicated. If you look at Spanner, there are actually two levels of this: every time you grab a lock, that's an operation against a consensus group of three nodes.
So doing one transaction might translate to something like 27 operations against 27 different underlying nodes. This seems hard, and you could say, well, maybe that's why we've had such trouble building truly distributed databases for so long.

What I'd like to pose is that time-series data is different, so you can architect for it quite differently. In particular: OLTP workloads are primarily update-heavy, while time-series workloads are insert-heavy. In OLTP we often write to random locations in these tables, while in time series we typically write to the recent time interval. Think about streaming data from all those hundred thousand sensors, or financial tick data: you occasionally get old updates, but most of what comes into your database is the market now, the current state of the world. In OLTP we have transactions to multiple primary keys, while in time series the data is naturally partitioned both across time, because of this property, and across space: those financial tick streams, or the sensors themselves, form a natural partitioning of your data.

So if we reflect on what this might mean for our architecture, we can start to see some glimmers of how we set about designing Timescale. Go back to how we're actually doing writes. In a streaming application we're collecting data from many sensors; new data is coming into our database all the time. This is what I meant by it giving you a natural partitioning across time and space. So from the beginning, at its core:
Timescale is designed around a type of multi-dimensional partitioning, usually two-dimensional: one time dimension, and one spatial or primary-key dimension. Now you could say, doesn't Postgres 10 give you partitioning? Don't a lot of databases give you partitioning? One of the differences is that we don't think of this as hard-coded hashing over the primary ID space. On the write path, the system gives you the illusion that this is just one giant table, and on the insert path it dynamically figures out whether a row fits an existing partition or whether new partitions need to be created, all completely transparently to the user. And not only do you not want to define the partitions statically, you want the system to adapt its partitioning based on observed size, because the amount of data you're collecting can change over time. So, as I'll talk about, you want the partitions not only aligned by time, but also sized the right way, so you can keep the recent intervals efficiently in memory and avoid that swapping-to-disk problem we ran into before.

The way we actually implemented this uses Postgres itself. Internally we call these chunks, but they are actually all child tables of a parent table.
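As a concrete sketch of this two-dimensional setup: the table and column names below are purely illustrative, and the exact create_hypertable signature has varied across Timescale versions, so treat this as a sketch rather than exact current syntax.

```sql
-- Load the extension into an existing Postgres database.
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- An ordinary-looking table: one time column, one "space" key.
CREATE TABLE conditions (
  time        TIMESTAMPTZ      NOT NULL,
  device_id   TEXT             NOT NULL,
  temperature DOUBLE PRECISION
);

-- Turn it into a hypertable partitioned on time and, here,
-- hash-partitioned on device_id into 4 space partitions.
SELECT create_hypertable('conditions', 'time', 'device_id', 4);

-- Reads and writes still target the one logical table;
-- chunk creation and routing happen transparently underneath.
INSERT INTO conditions VALUES (now(), 'device-17', 68.5);
```

The point of the illusion is that nothing in the INSERT names a chunk; the engine decides, per row, which chunk (child table) it lands in.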
The nice thing about this design is that we let you see all these individual tables under the covers, and you can manage them like normal Postgres tables, though for most purposes you just manage the parent table, and a lot of settings are inherited by the child tables. But as I'll describe, we also do a bunch of distributed query optimization, so our query path is not just stock Postgres with child tables.

This is also how you scale out: the same mechanism gives you a way to partition the data across many servers, although we also allow you to do things like make time intervals sticky to the same nodes, which commonly gives performance benefits for the standard types of queries we see. So again, think of this as dividing your data by time and space.

To follow up on your question: we want to handle a lot of this automatically. Our partitioning is both opinionated and automated. New chunks are automatically created on the insert path, and the system handles a bunch of potential deadlock issues and out-of-order-data issues. Local indexes are often a conjunction of time and various values, and based on cardinality we index them in slightly different ways. The other thing, which I was alluding to in answer to your question: unlike standard database partitioning, ours has a particular flavor in that we also want chunks to have an approximate maximum size. We also take into account that late data can arrive in your system. So you want to think about what happens during normal time periods, but also give yourself some buffer and handle the fact that data can arrive out of order, or even early.

In particular, we size chunks so that the indexes of recent chunks can be maintained fully in memory. What this means, of course, is that on the insert path we can update these indexes purely in memory without swapping to disk, which gives us a lot of performance benefit. But we also want the chunks time-aligned. The reason time alignment is important: if you partitioned only by size, you'd run into the issue that data arriving early could close out a time interval, and if other data then came in, an individual chunk could become too large, so you have to balance that carefully. Time alignment also lets you do nice things like retention policies (delete data more than a certain age), because that translates into dropping entire chunks, which is dropping tables, which is dropping files on disk. That's much more efficient than deleting rows in place; effectively we run without vacuuming these tables, because with approximate data retention everything becomes a table drop rather than an in-place delete.

The second thing this gives us: because we manage the chunks and keep various constraints about the data found in each chunk, this lends itself to a number of distributed query optimizations. A basic one that gives a lot of performance benefit is to avoid querying certain chunks at all, via a form of constraint-exclusion analysis. For example, in a simple query, because we know the time constraints on these various tables, we never have to touch the chunks outside those time intervals.
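For instance, a query like the following (schema and names purely illustrative) only ever touches the chunks whose time ranges overlap the predicate; every other chunk is excluded at planning time rather than scanned.

```sql
-- Only chunks covering the last 24 hours are scanned;
-- all older chunks are pruned by constraint exclusion.
SELECT time, device_id, temperature
FROM conditions
WHERE time > now() - interval '24 hours'
  AND device_id = 'device-17';
```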
There are two natural types of queries you often see in time-series analysis. One slices across a number of different sources but within some time bound; another digs deep into one of your data streams, maybe looking at one device over a much longer time period. Either way, this form of constraint-exclusion analysis means we never need to touch the other tables, or even contact the other servers in a distributed setting. And because all of the servers act homogeneously, any of them can support both the insert path and the query path; each has knowledge of the constraints on all the tables in the cluster.

We also use an optimization that Postgres has, a type of online merge append. For example, if you ask for some data ORDER BY time DESC LIMIT 100, Postgres can take multiple chunks or partitions and walk through them incrementally. It doesn't need to do a full table scan first, compute the ordering, and apply the limit at the end. Instead it can go through multiple chunks in parallel, look at the last value that comes off each chunk, and compare them against each other: take the greatest, let that chunk produce its next value, compare that against the others, and keep going until the limit is hit, then stop. This particularly saves you time when you have a rich predicate, where getting the next value off a chunk could involve scanning a lot of that chunk.

One way we extended this, which is very particular to time-series analysis, is to make it work with time aggregates. You can ask for an aggregate over a time period, grouped by hour, ordered by hour descending: give me the maximum temperature for a particular device for every hour over the last 24 hours. This makes sure we only incrementally process individual chunks in order to compute that result.

There are a number of other distributed query optimizations we do: performing partial aggregates on the distributed data, avoiding full scans, and so on. For example, a common monitoring query is "tell me the last K readings from every device in my system." That's really hard to express efficiently in SQL; it can easily become a full scan back to the beginning of time. We provide optimizations, and store metadata, that make this efficient, because again our focus, in both the architecture and the types of analysis, is time-series data, which is different from general OLTP.

[In answer to a question:] Yes, we currently run on 9.6; we're an extension for Postgres 9.6. As I was alluding to, one of the key design choices was to give users the illusion of a single table. When they perform a SELECT, it looks like they're interacting with a single table, and on insert they can insert a single row or a whole batch of data, and just treat it like a single table.
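A sketch of the kind of time-aggregate query this enables; time_bucket is Timescale's time-bucketing helper, and the schema and names here are illustrative. The planner can satisfy the ORDER BY ... LIMIT by walking chunks newest-first instead of scanning the whole hypertable.

```sql
-- Max temperature per hour for one device over the last day.
SELECT time_bucket('1 hour', time) AS hour,
       max(temperature) AS max_temp
FROM conditions
WHERE device_id = 'device-17'
  AND time > now() - interval '24 hours'
GROUP BY hour
ORDER BY hour DESC
LIMIT 24;
```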
As I said before, every node acts similarly, so if you have a cluster with N servers, you can give a batch to any of them. The node that accepts the insert will not only figure out how to route the data to the right tables; it will take your batch, separate it into sub-batches, and insert each sub-batch into the right table with a minimum of locking, doing a number of things to avoid deadlocks. Furthermore, the engine automatically closes and creates chunks as part of the insert path. This is not something that runs after the fact, occasionally vacuuming up your data, splitting it out, and garbage-collecting; it happens inline.

The last key design point, which is perhaps more appropriate for a different audience, is that we built up from Postgres. The advantage is that you can connect to it and query it like it's Postgres, and manage it like it's Postgres. As an extension, we basically tie into the query planner and execution engine with the various hooks that Postgres extensions give us. But underneath, each of those chunks is an actual Postgres table, and can be managed in much the same way. What that means from an administrative point of view is that a lot of things become easier: we can use all the existing techniques for replication, including some of the logical-replication work we saw for Postgres 10. We'll certainly check that out; it seems like much of it might tie directly into the system without a lot of effort. You get checkpointing, backup, fine-grained access control: all of this we really inherit from the great ecosystem Postgres has.

And the final thing: most time-series databases went down the line of building NoSQL systems and defining their own query language, which means that for every third-party system you want to connect, you have to write your own integration. And these are often used in applications that integrate with workflows across the whole organization: your DevOps people might want to stick Grafana on top, or write to it via something like collectd, while your analysts or product people might want to use Tableau to understand their data. All of that just works out of the box, even in the distributed case, because we just speak Postgres.

OK, the last thing Postgres gives us is that, unlike most time-series databases, it allows us to avoid data silos. I know this is strongly a Postgres community, but when you adopt some NoSQL time-series database, you typically have two choices. You can denormalize your data at write time, keeping a lot of metadata in every row of your database, which is very inefficient, expensive if your metadata mappings change (you have to go back and update everything), and operationally difficult, because you need to know the metadata up front: you almost have to do the join in application space before the data is even inserted. Or, and this is what a lot of people seem to do, you create separate databases: your metadata lives in a relational database, your time-series data in a NoSQL database, and you push the problem of joining the data onto application developers.
That works, but it's a pain for developers; one of the whole reasons we have a database is to handle these data-management problems. It's also a real pain for your business users. If I want the time-series data for, say, Mike's sensor, or customer service wants data for John's toy, my time-series database doesn't know anything about John. I need to go to one system, get the UUID for John's toy, then use that UUID to query my time-series database. Because Timescale is built on top of Postgres, you can just use SQL joins against relational tables. You can perform the joins inside the database itself, or via the foreign data wrapper interface to an external database. How you deploy depends on the data: if it's rapidly changing and can be large, you'd store that relational table on one of the nodes and join against it there; if it's relatively small, or quite immutable, you can use triggers to materialize the table on all the cluster nodes and perform the joins locally.
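A sketch of the kind of metadata join this enables, with a purely illustrative schema: a small relational table of device ownership joined directly against the time-series data, instead of a UUID lookup in one system followed by a query in another.

```sql
-- Relational metadata table: who owns which device.
CREATE TABLE devices (
  device_id  TEXT PRIMARY KEY,
  owner_name TEXT NOT NULL
);

-- "Give me John's readings from the last day" as one SQL join.
SELECT c.time, c.device_id, c.temperature
FROM conditions c
JOIN devices d ON d.device_id = c.device_id
WHERE d.owner_name = 'John'
  AND c.time > now() - interval '24 hours';
```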
We do support joins between hypertables, just not efficiently yet; it's something we're working on. And because we're an extension, creating these tables and migrating to them is quite easy. Like I said, we're currently on Postgres 9.6, and after you install the extension, you just create a table (it looks like a normal Postgres table) and call one line, create_hypertable, with the name of the table, the column that is your time index (you can use not just timestamps, but ints and so on), and a partition key, your space identifier. Right now we have you specify the number of partitions you'd like to build from it. Migrating data is similarly easy: just INSERT INTO this table from your old table, or use a COPY command or pg_dump. If your old data has this schema, that's basically sufficient for moving the data in.

What we've found is that a lot of people start with a Postgres database, because it's great, even when they have time-series data, and eventually they hit a certain scale and say, oh, this no longer works for us, let me figure out some crazy Mongo setup that makes it work. What we're saying is: now you can continue to scale with Postgres. Use the same schema in Timescale as you used before, and migrate with this simple command.

I did want to leave you with some performance graphs. The interesting thing is that we really designed this from the start for a clustered setting, and what we came to realize is that there are a lot of people for whom a single node is good enough.
So We we didn't necessarily know because we weren't really thinking from the start about scaling on a single mode But interestingly we found that it actually gave you really nice performance benefits Even if you are only even if your data set is is only large enough to need a single node in particularly These are for different reasons. So in clustered Versions you can actually get performance benefits because you know, you're reducing your latency by paralyzing these queries over many nodes and You could also reduce network traffic because a bunch of our distributed query optimizations in the way we push down Different queries to the nodes means that for example You're not you need to copy the full tables back to the to the to the one node that issues the query and join them There you can do aggregation push downs the way you do partitioning You're basically localizing group eyes onto nodes and so and some of those you know those time aggregates again You're minimizing the amount of data that you need to send back But even on single servers we get benefits for really three ways One of the biggest ways we get them is based on the way we carefully sized chunks as I talked about before But also some of the optimizations though like merge your pens and group eyes because they're localized to the chunks You again avoid touching a lot of the data and you could and you could also paralyze cross disks so Let me let's look at actually what happens with vanilla Postgres This is postgres 9.6.2 running on a fairly beefy instance running on Azure with premium SSD storage and Postgres is working, you know Pretty good at this is inserting rows one by one. 
I have batched numbers in a couple of slides, but one by one you're getting about 14,000 inserts per second, where each row here has about 10 columns. Unfortunately, once you get to about 50 million rows — which could be really large for a relational or OLTP database, but is on the smaller side for many time-series applications — what you see is that performance drops off, and the particularly striking thing about it is the rise in variance. These points are the average insert rate over each 20-second period. Sometimes you still get around 12,000 inserts per second, but there are a lot of periods where you're down to a couple hundred inserts per second. On average you might be down around 8,000, but if you're thinking about a production workload, what you often really care about is not just the average but how it behaves in the bad cases — because whenever you're down in those periods of a couple hundred inserts per second, your queues are growing deeper and deeper behind you. So this kills you a lot more than the average suggests. This is Timescale's performance in comparison. This graph is from two or three weeks ago; this morning one of our engineers wrote to say we're up to about 12 and a half thousand. We're actively narrowing this gap via a number of chunk-based optimizations. As an extension, our fast path is implemented all in C and a bunch of the management code is implemented in PL/pgSQL, and we're optimizing various parts of that to narrow the gap. But the striking thing is that this is basically constant performance — I think we ran this out to about 250 million rows — with very small variance. Now you could say, well, you're only using four gigabytes of memory.
Let's throw some more memory at it — say you raise it to 16 gigabytes on your one node — and that pushes out the problem, but you're still going to run into it: you run out of memory, you start swapping to disk, and Postgres again goes over the cliff. So on a single node you can scale by throwing more memory at the machine, but that's just pushing out your problem, and the question is whether you can think of a different architectural design that really solves it. That's what we think Timescale gives you. Now, this was writing row by row. In a lot of production settings you might have some type of queuing system in front of your database — you might be writing data in through a logging or queuing system like Kafka, or through some streaming engine that's pushing data to it — and typically, for efficiency, you'd insert more than an individual row at a time. So instead — and this is what some of the other time-series databases have used for their published numbers — here we're inserting data in batches of 10,000 rows. Now, in absolute terms, you've gone up from about 15,000 inserts per second to about 140 or 150 thousand.
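A sketch of what batched inserts look like in SQL, again against the hypothetical `conditions` table — a multi-row `INSERT` (or a `COPY`) amortizes per-statement overhead across many rows:

```sql
-- Multi-row VALUES: one statement carries many rows.
INSERT INTO conditions (time, device_id, temperature) VALUES
  ('2017-03-01 00:00:00+00', 'dev1', 20.1),
  ('2017-03-01 00:00:01+00', 'dev2', 21.7),
  ('2017-03-01 00:00:02+00', 'dev3', 19.8);
  -- ...in practice, thousands of rows per statement

-- Or COPY, which is typically even faster for bulk loading:
COPY conditions FROM '/path/to/data.csv' WITH (FORMAT csv);
```

In the queuing setups described above, the consumer drains a batch from Kafka (or similar) and issues one such statement per batch rather than one per row.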
These are 10,000-row batches, and with about 10 metrics per row that's roughly 1.4 million metrics per second. This again pushes out the problem a little with Postgres, but we again see the performance cliff at around 50 to 100 million rows, where throughput really starts dropping. Out here Postgres is getting, I think, on average 8,000 inserts per second, while Timescale has continued to maintain basically constant throughput across the whole period. That's the insert path. We're also doing a number of query optimizations — and I think it's still early in terms of what we can achieve there — but at a high level, as far as we've found so far, we perform no worse than vanilla Postgres, and there are certainly many cases where we perform significantly better. Some of the benefits come from localizing GROUP BYs, and this last one, where we get almost a 5x improvement over vanilla Postgres, is the time-aggregation query I was talking about before: select the per-minute maximum reading, group by minute, order by minute ascending, give me the last few entries. You can ask this over any time interval — you know, what are the readings from the last five minutes of February? It's a general kind of optimization we can do, and even on a single node it gives you significant performance benefits. Now, I do want to say that we designed this with a particular use case in mind — we wanted full SQL, we wanted complex queries — so I think it's good to be completely frank about when you should not use Timescale.
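In plain SQL, that time-aggregation query might look something like this — a sketch against the hypothetical `conditions` table, using standard Postgres `date_trunc` for the per-minute bucketing:

```sql
-- Per-minute maximum reading over a chosen window,
-- e.g. "the last five minutes of February":
SELECT date_trunc('minute', time) AS minute,
       max(temperature)           AS max_reading
FROM conditions
WHERE time >= '2017-02-28 23:55:00+00'
  AND time <  '2017-03-01 00:00:00+00'
GROUP BY minute
ORDER BY minute ASC;
```

The time predicate is what lets the planner exclude chunks outside the window, and the GROUP BY can be computed locally per chunk before the results are merged.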
So when should you not use Timescale? You could, but there are other systems that are more optimized for certain cases: if you have very simple read requirements, if you really just want a key-value store, or if you want to do something that column stores are really designed for. This is very much aligned with what David was talking about earlier today — if you just want to roll up a single column, a columnar format will be much more efficient. Similarly, because of the columnar format, if your highest priority is compression, an architecture like that might matter more — although, as we also heard earlier today, there are ways to improve the compression of Postgres's basic row storage. For a lot of the same reasons, if you have very sparse or unstructured data, Postgres — and therefore Timescale — is probably not appropriate. And, you know, this is one of those "what is your greatest weakness" answers that you turn around: if you're happy with experimental infrastructure, you don't need something like Timescale. But if you want full SQL; if you have complex predicates or aggregates; if you want joins or relational data; if you need to think about scale-out storage; if you want rich indexing over data that doesn't fit in memory; if you're dealing with mostly structured data; and if you want to leverage the Postgres ecosystem — then we think Timescale lets you have your cake and eat it too for time-series data. So, three weeks ago — we'd actually been running a hosted version for about six to eight months with some customers, but a few weeks ago we open-sourced Timescale, released under the very permissive Apache 2 license. You can get it on GitHub right now. The beta release is single-node only, with clustering coming soon. There's also a fairly technical white paper.
I think we'll actually have some copies around here if anybody is interested, but you can also find it on our website. Otherwise, check it out, download it, give us some GitHub stars if you want, and I'm happy to take questions. Thanks.

So there are two different things. I think you should be able to define your partitioning key on — just to be clear, I used the word "spatial" because I like "time and space," but we're not talking about something like a GIS column, just a second dimension. So I think you should be able to define this on a composite column, but remember there are two different things. It's primarily a scaling feature: the reason for the partitioning is just to give you this natural scaling. You're of course always able to define your WHERE clauses on any of these columns, so you're not limited by the partitioning — you can index any column, and we support Postgres's full set of indexes. The reason you choose your partitioning column a certain way is that it enables that constraint-exclusion analysis. The only reason to do it is if it's a very natural axis for your queries — you often ask about devices, or if you're yourself a multi-tenant company your partitioning key might be your own clients, or your financial instruments. So it's really about performance, and you can index any column.

Yep — that's what we do, we do the latter. So we've rewritten — I should say the guy who actually wrote this is standing in the back, Matt, and he can give you the nitty-gritty — but like I said, we rewrote part of the query planner and execution engine.
We are on your insert path: we take the inserts and calculate which chunk each one should go into — so in that respect we're probably doing the same thing as Greenplum. We do keep metadata that tells us exactly where and how to execute each query. There are similarities with what Citus is doing, though I think in general they're trying to solve the broader data-warehousing problem — talk to them, they're nice guys, we talked to them yesterday. They might be trying to do things like global indexes on tables, which you might want for a lot of traditional data-warehousing queries but which isn't natural for time-series analysis. So there are really two aspects to this. One is the time-series part, which gives you what we've shown so far; but architecturally, what we're also developing now is a lot of extra support for the types of queries you might do with time series that are particular to that setting — point-in-time analysis, merging time series, a lot of things that are, again, specific to time series.

We'll see — we obviously haven't looked at that closely yet. I mean, in general, their partitioning — like I said, we are opinionated, the way a lot of web frameworks say they're opinionated frameworks. We have a very particular use case in mind, and most database partitioning doesn't have that same time component, or that's not the only thing it's solving. So we're doing a bunch of things that are much more specific, or advanced, in the way we do partitioning. There might be some underlying mechanisms there that we end up rewriting some of our stuff to use, but I don't think they would at all replace the kinds of mechanisms we're using for partitioning. We do not — so that is one way in which we are different from a traditional database.
We do not enforce something like a global uniqueness constraint, and so we do not support global indexes, because in time series that generally doesn't make sense. You're not saying, "I want to insert this UUID only once into my database"; what you're usually saying is, "I'm collecting a stream of data from this UUID."

None of the benchmarks here were run with OS tuning, as far as I know. As an open-source package, we expect people to install this locally, and we'll of course come out with some recommendations on what we think is appropriate — but the performance numbers you saw came from a stock cloud VM.

So the question was: how does Timescale compare with other time-series databases in terms of memory, storage, speed, and so on. I joke that this is a really multi-dimensional answer that depends on your usage scenario. For example, a column store will generally do single-column rollups faster, and you'll probably get better compression. But once you have lots of different columns in play, some systems sometimes can't even support the query. For example, in Cassandra a WHERE clause can turn into a full table scan under the covers; in Influx you can't specify an ORDER BY after a GROUP BY; and Influx, again, doesn't index numerical columns at all.
So if you query on a numerical column, that's going to be a full table scan. The answer, then, is that it very much depends on the query: if you find yourself running complex queries, Timescale is going to be good for you; if all you want are individual rollups, a column store will probably be more efficient. So far we've published some of our benchmarks against vanilla Postgres; in the background we're obviously benchmarking other systems, and we'll put up some numbers in the near future.

Hold on — let's get a new person. With most SSDs — I guess the question is about what happens if you're updating a B-tree. Here's the thing: the whole page is going to be marked dirty if you update any bit in it, and when you write it back to disk, it's going to write the entire thing. You're right that, in some sense, with a B-tree structure you update in memory when you can — but there's a reason a lot of modern architectures have moved toward things like LSM trees, and this is one of them. The B-tree has, in a sense, moved the problem from the flash memory up to the OS: if you want to update one cell, you have to read in that B-tree page, update it, and then, if it gets evicted from main memory, write the whole block back to disk. So a few-bytes update still turns into a page write; no matter what, you're paying the cost of a page write that was caused by updating one byte. Even though you're only trying to update one byte, you're going to see it as an 8K write — whether it's on the flash or in the B-tree, you're still going to see a page write.
And that's at least one of the reasons LSM trees have changed that around. But again, one of the reasons we designed things the way we did is to keep those blocks in memory, based on the workload patterns, so we're not constantly swapping them to disk.

We plan to open-source our clustered version. The replication mechanism is not something we've released yet — we're actively working on it — but it's going to be perhaps unsurprising. There are two different ways: one is that, if you're okay with asynchronous replication, you use Postgres's existing replication; the other is that we'll probably add some support for two-phase commit to give you really active high availability. If we end up doing active two-phase commit, we're going to build some of that complexity ourselves — that's actually where a lot of our background is from. As an academic, my background is all in distributed systems, thinking a lot about the various ways you do consistency.

One thing I said we do not yet support is joins between two hypertables, which in the clustered case would be distributed joins. Sorry — let me rephrase: we support them, they are just not efficient. Currently not. But let me add one caveat, going back to the time-series constraints: a join that has a predicate would still take the constraints in that predicate into account. So if a distributed join, for example, relates two hypertables by time, you would actually exclude a lot of the chunks that would otherwise need to be shuffled — you wouldn't be shuffling your entire table; it would be narrowed down by the constraints we were able to read from the query. Does that make sense?
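A sketch of the kind of time-constrained join being described, assuming two hypothetical hypertables `conditions` and `alerts` — the time predicates are what let chunks outside the window be excluded before any shuffling happens:

```sql
-- Join readings to alerts for the same device, restricted to one week.
SELECT c.device_id, c.time, c.temperature, a.severity
FROM conditions c
JOIN alerts a
  ON  a.device_id = c.device_id
  AND a.time      = c.time
WHERE c.time > now() - interval '1 week'
  AND a.time > now() - interval '1 week';
```

Without the WHERE clause, every chunk of both tables would be a candidate for the join; with it, only the chunks overlapping the last week need to participate.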
So if you want to do a distributed join on the last week's worth of data, that's a much smaller problem than a distributed join over all of your hypertables across all time.

No — there are many. Yes, if you look at this picture, you can have many tables and many partitions on each server. And again, one of the reasons you do that is this time notion: it allows you to eliminate partitions, but it also supports things like your data retention policy. We support data retention by just dropping partitions — which is a different problem from what you get with most partitioning. If we want to drop week-old data, we drop those partitions, so we avoid all the problems with vacuuming.

I don't — let me get back to you on that. Our other engineers will be able to answer your question better.

Yep. "I'm going up and down from various sources, and I want to relate them to each other — not necessarily relate them to the forward-marching dimension." I'm not sure I understand your question. So the question was along the lines of: time always marches forward, but the other dimension, the one used for partitioning, might not. I guess I can't fully answer your question, but let me try to answer something, and afterwards you can talk to me if it doesn't. Typically, this partitioning is not defined strictly on offsets — I'm just trying to give you an example here — you're generally defining it over the key space. So it defines the entire universe of that key; it's not like you're saying, "let me track these 50 sensors."
It's more like, "I divide my ID space into portions, and all the existing sensors hash into that space" — so of course you can add more or fewer sensors over time. In terms of analyzing back and forth across time: if you can express it in SQL, you can run that type of query on it. Beyond that, I'm not sure I actually understood what you were asking.