Hi everyone, welcome, and thank you for coming to my talk today. Maybe a couple of you just listened to Burton speak in the room next door; he's my colleague at Chartbeat. He works on the data science team, while I'm a platform engineer: we build the systems that Chartbeat runs on. Chartbeat is a pretty cool company. We're a scrappy team of about 75 in Union Square, and we build a web analytics product used by tons of big publishers across the world. Any time you load up your favorite publisher's site we're probably on there, unless you use Ghostery, and we are tracking you. We sort of pioneered this idea of engaged time: we think publishers should care about the amount of time people spend on their articles, reading their content, as opposed to just paying attention to page views, which of course is what most metrics are based on.

At Chartbeat we handle really big data: 300,000 requests per second, about 50 billion page views a month. We like to think we have the most data per engineer, since we're only about 25 or 30 engineers. But we have really big data problems, and picking databases is a problem for us, because they can be either expensive or not performant. Sometimes we find a happy medium; other times we don't. One particular database, out of the many, many databases we host across our 800 EC2 machines, costs fifteen grand a month to operate on its own: Mongo running on three i2.8xlarge instances, which I think is the largest instance class we can get on the older AWS EC2 generation. Pretty expensive, very beefy machines; Mongo likes to have big machines to work correctly. I should preface that this is Mongo 2; Mongo 3 has some performance improvements.
We could probably scale that down a little bit, but in any case, with the database we've built, we're able to get the same sort of performance guarantees, and handle and process the same data, for about two grand a month instead of fifteen. Pretty cool.

So what is Wade, this thing we're talking about today? It's a distributed, fault-tolerant, horizontally scalable database: lots of words there. And it's a database framework; I'll talk in a minute about exactly what that means. It provides strong consistency and high throughput by using a replication strategy known as chain replication, which came out of research done about ten years ago at Cornell, and there have been a couple of other implementations that use chain replication; we'll talk more about what that means in a bit. Out of the box, Wade supports something called command replication: when replication happens, we don't actually send the objects across the network, we send little pieces of information that tell the next node in the chain what to do with that object. We can also solve what's called the read-write-update loop, a problem we often face when building on databases; we're not the first database to solve it, but we make it pretty easy to do. Ultimately, clients and users of the database write query and update commands in Python that live on the database itself.

Wade is built in Python, for better or worse. It's a database, so you'd hope it's really performant; we've decided, for now, since it's more of an academic project, to keep it in Python so it's easy to understand by people who aren't used to writing databases. Other folks use Scala and Erlang and all kinds of fancy languages to build their stuff; we're saying, let's keep it simple.

To frame what Wade is and what it's not: it's not an all-in-one solution. That's why we call it a framework. To give you an idea of what that means, let's talk about a database we might be more familiar with: MySQL. What are the parts of MySQL? We've got the drivers and clients, the things we interact with client-side to talk to the database. There's a part of MySQL that understands things like replication, making backups, and partitioning the data: the administrative layer, which also handles user permissions and so forth. There's the query parser and the planner, the SQL language and grammar (a superset of standard SQL) on top of which MySQL is built. We also have things like stored procedures, and last but not least, the storage layer that understands the data on disk as it's written; in MySQL's case you have multiple options for choosing your storage engine.

Now, where does Wade fit in with this? It's a loose analogy, so it doesn't map out exactly, but basically, with Wade you have to implement a couple of features that would normally be given to you. Namely, we have to define, in Python, the methods that mutate our data on the servers, and we are responsible for writing the data to disk; if we want, we can keep it in memory, or use something else. Now, laying out the data on disk, you might be saying, "Oh, do I have to keep track of file pointers and locations and binary data?" Maybe, but there are a lot of great utilities out there that help you do this, like LevelDB and RocksDB, the kinds of things you hear about when choosing backends for databases. You can use those tools with Wade to help you design your system.

So without further ado, I want to do a quick example of how this thing works in practice. Wade is a distributed database framework, and since it's a framework, we've created an implementation of it: an in-memory key-value store. Again, with Wade you can build lots of different things; this is just the one that's easy to demo.

In our example here, on the right-hand side I have two nodes that are sitting there doing nothing at the moment, and I'm going to restart them right now. It's distributed, with replication factor two, so we have two copies of our data. I configure the nodes and tell them to talk to each other so they know about each other, and on the left-hand side I have my IPython terminal ready to instantiate a client to talk to the database. The interface is pretty simple. I've written some code we haven't seen yet that lives on the server, and we're interacting with that code right now. We're going to send a couple of commands to the server: one to get some data, and one to set some data. As I said before, everything in Wade is either a query or an update method. Can y'all see this? Yeah. So we are querying the data: we have client.query, and our method name is "get"; this is something I created. I named it the get method.
I named it the get method That zero number represents the partition ID Wade relies on partitioning But we'll talk more about that in a moment But right now we're just talking to the zero partition And then we have our arguments for our get function, which is just a JSON dictionary in this case We just have one key or the one thing that we're trying to do which is get a value So we're just giving it the key which is Adrian and then last but not least we have a little bit of metadata Which is sort of a tag to identify the application to the database right now We're not doing anything with this but in the future This might be useful to say have metrics break down by application at the database layer We often have this problem where when we're trying to debug performance issues Knowing who the colors of our database is pretty useful and looking at things like network traces doesn't really give you much information So we're gonna try to get our value and there's nothing there I just restarted my database. There's no data because it's all in memory now Let's actually set some data. So the syntax is very similar except now it's an update method The name is set. We have our zero partition We have our arguments to our set method, which is a key and now a value of 12 for whatever reason and then a debug tag Again included we set that value and now we get it and it's 12 and on this on the right hand side here We can see our two nodes. I put some strategic print statements there To kind of help us along as we can see both nodes in our cluster got that update and that was replicated Any questions so far? 
Okay, so this sounds kind of like magic, so let's look very briefly at what the actual key-value store looks like. We're implementing an interface, the store interface, that we give to Wade; we register this thing with Wade, and it runs on the database. So when we execute that update command called "set", it executes this function here with these arguments. For now, let's ignore the object ID and the object sequence stuff; really we just care about the args. As we can see in this line here, we're doing self.data[obj_id][k] = v; that's our key-value store. Remember, this example is just in memory, and it's a toy example: you probably wouldn't want to replicate an in-memory key-value store (maybe you would), but you could easily initialize this with a LevelDB database instead, talk to LevelDB, and create a wrapper around it. So that's our key-value store; it's pretty simple, and I hope it gives you some insight into what Wade does. Behind the scenes, Wade is handling things like replication and message passing between the nodes.
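A minimal version of the store just described might look like this. The method and parameter names are my guesses at the shape of Wade's store interface, not its exact API; the one detail taken directly from the talk is the `self.data[obj_id][k] = v` assignment.

```python
class InMemoryKVStore:
    """Sketch of the demoed store: user-written update/query methods
    that Wade registers and runs on each node. Names are illustrative."""

    def __init__(self):
        self.data = {}  # obj_id (partition) -> {key: value}

    def set(self, obj_id, obj_seq, args):
        # Update method: runs on every node as the command replicates.
        # obj_seq is Wade's per-object sequence number; ignored for now.
        self.data.setdefault(obj_id, {})[args['key']] = args['value']

    def get(self, obj_id, args):
        # Query method: in real Wade, reads are served from the tail.
        return self.data.get(obj_id, {}).get(args['key'])

store = InMemoryKVStore()
store.set(0, 1, {'key': 'adrian', 'value': 12})
print(store.get(0, {'key': 'adrian'}))  # 12
```

Swapping the dict for LevelDB, as the talk suggests, would just mean changing these two method bodies to call a LevelDB wrapper.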
It also handles partitioning the data, so long as we tell it what partition to write our data to, and it gives you that nice, easy-to-use interface.

Wade is built on top of something called chain replication, which I mentioned at the beginning. This handy excerpt from the paper summarizes what it is: an approach intended for supporting large-scale storage services, databases, that exhibit high throughput and availability without sacrificing strong consistency guarantees. Most databases we're aware of give us one or the other: high throughput or strong consistency. Something like SQL you can tune to be very performant, but it's best at consistency; it gives you strong consistency always, and that's the main feature we know and love. Something like Riak, if you've ever heard of that database, has extremely high throughput and is massively horizontally scalable, but it's eventually consistent, so we lose the strong consistency. We can ask Riak for strong consistency, but with a major performance penalty.

Before we talk about chain replication, let's look at something we're more familiar with, known as the primary-backup strategy of replication. Drivers, clients, and so forth interact solely with what's called the primary node: writes go to the primary node, and then the primary node sends those writes to the backups, almost always in parallel.
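That parallel fan-out has a latency consequence worth noting before we get to chains: a primary-backup write completes when the slowest backup responds (the max), while a chain-replicated write, as we'll see, pays the sum of per-hop latencies. With made-up numbers:

```python
# Illustrative per-replica (or per-hop) latencies in milliseconds.
latencies_ms = [2.0, 3.0, 1.5]

primary_backup = max(latencies_ms)  # backups are written in parallel
chain = sum(latencies_ms)           # write flows head -> body -> tail

print(primary_backup)  # 3.0
print(chain)           # 6.5
```

The chain gives up some per-request latency in exchange for the throughput and consistency properties discussed below.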
This is how Mongo works, for example; Mongo calls these primaries and secondaries. It's pretty common in databases in general; if you've ever used a database, you've probably heard of this. As you can see here, there's a little backend box which represents the actual storage engine you might be using, if it's pluggable. One thing to note is that the reads aren't replicated, obviously, but they go only to the primary. I have a dotted line here on the right-hand side that says "reads from backup", and there's a big caveat there: if we read from the backups, we lose strong consistency guarantees. You could imagine a situation where a write went to the primary, and a read right after that went to a backup, but before the backup had acknowledged the write: you'd be reading stale data. In certain scenarios that's perfectly okay; you know your data model, you know if that's acceptable, but you do lose strong consistency.

With chain replication, on the other hand, we always get strong consistency, and we get high throughput. It works a little differently: instead of the backups being siblings of each other, we have a strict ordering of nodes. The driver sends its update commands, the writes, to the head of what's called a chain. The three colored nodes here represent a chain, and writes are sent through the chain, each node forwarding on to the next in order. So the head always receives the message first, and the tail is the last to receive it. The tail is actually the arbiter of truth: it's the thing that actually commits the data, and it's the thing from which we read the data. So the tail is kind of like the primary here; it's where we decide whether a write is successful or not, because that's where we read from as well.

The benefit is that each node in the chain can be processing something while more messages are coming in from the top, so we get really good throughput, but we sacrifice latency a little bit. In the primary-backup model we write to our replicas in parallel, so the time it takes for a request to succeed is the maximum latency among the replicas; in chain replication, it's the sum of the latencies across the head, body, and tail.

Now, the replication factor is configurable, both with Wade and with chain replication in general. In this case we have replication factor three: three replicas, three copies of our data. We could have two, in which case there'd be no body, just a head and a tail, and with Wade you can even run with one node, where the node simultaneously acts as both head and tail for the command, which is pretty cool. I think that sums it up.

As I mentioned before, Wade does partition your data. It's pretty common that you have to tell the database how to partition: databases like Kafka and Redshift (name a database, a lot of them) have this idea of partitioning. In Wade, partitions are called objects, which I'm thinking maybe we should change, because it makes you think of key-value stores, which is not what this is. They're named objects because each one is basically a replicated state machine; a chain is a replicated state machine, and those are what we call objects. So an object is sort of like a chain: the ordering of nodes is a chain, and multiple objects can live on the same chain, if that makes sense. We also have to tell Wade how many partitions we want, kind of like with Kafka if you've ever worked with it: we're kind of making up the number of partitions.
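Since the client names the partition on every call, you typically derive it from the key. This is my own sketch of one common scheme (a stable hash modulo the partition count), not something Wade provides:

```python
import zlib

NUM_PARTITIONS = 3000  # fixed for the system's lifetime, so overshoot

def partition_for(key: str) -> int:
    # crc32 is stable across processes and runs, unlike Python's
    # randomized built-in hash(), so every client routes the same key
    # to the same partition/object.
    return zlib.crc32(key.encode('utf-8')) % NUM_PARTITIONS

p = partition_for('adrian')
print(p == partition_for('adrian'))  # True: deterministic routing
print(0 <= p < NUM_PARTITIONS)       # True: always a valid partition
```

Any stable hash works; the important property is that every client agrees on the mapping.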
Right now one of our production systems runs a few thousand partitions, around three thousand or so, and that seems to work just fine for us. The number of partitions, or objects, is fixed for the lifetime of the system, so we say: put a lot in, in case you need them.

Let's dig a little deeper into chain replication and talk about what happens when a message actually goes in. If those arrows weren't clear before, hopefully it's clear now as we follow this cycle: the message goes in at the head, travels down to the tail, and then the acknowledgment comes back out through the head, and we have to make the full trip through the system for the write to succeed. Chain replication has a two-phase-commit sort of thing going on: when messages come into the system and reach the head, they are not immediately written to disk, as with primary-backup; rather, they're added to a pending set in memory. We're recording the intent to write the message, but we don't actually do so until the message reaches the tail. Again, the tail is the arbiter of truth; it represents the known state of the system. Once the tail says, "this write was successful, I've committed it," it responds to the body; the body says, "cool, you were successful, I'm going to commit my data," removes the command from its pending set, and returns to the head, and so forth, until eventually it goes through. So on the left-hand side, as the message comes into the system, we are pre-committing, and once the message is on its way out of the system, we are actually committing, presuming it was a successful write.

Sure, what's up? So, if something goes wrong: I'll tell you in a little bit exactly what happens, but just know that if something goes wrong, this is fail-safe, and there's an algorithm to detect what's wrong. It'll make more sense in a little bit.
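The pending-set dance can be modeled in a few lines. This is a toy single-node model of the pre-commit and commit steps, with invented names; a real node would also persist the committed state and forward downstream.

```python
class ChainNodeState:
    """Toy model of one chain node's bookkeeping: intents go into a
    pending set on the way down the chain; commits happen on the way
    back up, once the tail has acknowledged."""

    def __init__(self):
        self.pending = {}    # seq -> command awaiting the tail's ack
        self.committed = {}  # seq -> command durably applied

    def receive(self, seq, command):
        self.pending[seq] = command  # way in: record intent only

    def tail_acked(self, seq):
        # Way out: the tail succeeded, so commit and clear the intent.
        self.committed[seq] = self.pending.pop(seq)

node = ChainNodeState()
node.receive(1, ('set', 'adrian', 12))
print(sorted(node.pending))                          # [1]: pre-committed only
node.tail_acked(1)
print(sorted(node.pending), sorted(node.committed))  # [] [1]: now committed
```

If the ack never arrives, the command simply stays pending, which is why a broken chain stays consistent, as discussed below.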
You had another question over here somewhere? Oh, yeah, we're going to talk about that now, because it is a fair question: how does this work exactly? How do we actually order the messages, send them through the chains, and know whether it's appropriate to remove a command from the pending set and store it, whether it's okay to commit?

Basically, we have to introduce a timeline. As I mentioned briefly, this is a replicated state machine. We have some state we're operating on (the object values, or whatever our key-value store, or rather whatever Wade, has been asked to do), and we need to make sure that the commands we've created, which live on the database, are executed in the same order on every node. If they execute in the same order on every node, given the same inputs, we get the same output value on every node in the system, and everything stays in sync.

So when a message comes into the system, it reaches the head, and the head is responsible for assigning an incrementing number called the sequence number of that command. It's an incrementing number per object that represents the nth update of the object's state. For example, when the object first starts out, its sequence is zero: it has never been updated. If we then make ten updates to it, say we call our set command ten times, the sequence number is now ten. The head receives a message from a client and assigns it a sequence number, and once that's assigned, every node downstream in the chain, the body and the tail, checks what it knows the current sequence number to be and makes sure the new command's sequence number is exactly one more than the one it knows about. It says, "Hi, I'm the body, I just received an update from the head, the update has sequence 10, I currently have sequence 9; 10 is good, 10 is the next logical step." But if I were the body and got a command with sequence 11, it'd be, "What happened to 10? I missed command number 10; something's wrong with the system." It's possible at that point that the head and the body have diverged, in which case we need to execute some algorithms to restore the state of the system, basically using the tail. Again, remember the tail is the source of truth: we ask the tail, "Hey, what's the latest sequence you know about?" and the tail's state overwrites that of the other nodes.

This is the more complicated diagram of all that. I don't want to spend too much time on it, but the idea in Wade is that the nodes themselves are generic: they ask a series of questions to determine the state of the object. As we see here, we have three boxes. The little green box at the top is a client or an upstream node; it could be either, it's generic. And below the body of the node there's potentially a downstream node. So the message enters the system; we have some command at the top here, a set command, for object zero, which is the partition. We've got our arguments ("adrian"; actually these are bad arguments, it would be a key and a value and the other stuff we talked about before), and then our object sequence, which may or may not be assigned yet, depending on whether it came from the client or from an upstream node. The message enters the system and we check if it's valid. We ask if the object sequence has been assigned: if it has, the message came from an upstream node; if not, we're the head. Then we ask, are we the head of the chain? If we are, it's our responsibility to assign a sequence number, so we ask both the store and the pending set, "Hey, what's the current sequence number you know about for this object?", add one to it, and that becomes the new sequence number. Now our object has a sequence number attached to it, and we add it to the pending set. The node then asks itself if it's the tail of this chain. If it is the tail, we're done: we know we can commit, which is why we go straight to the commit step. If we're not the tail, we're not the end of the chain, so we need to forward the message onwards; that's why we potentially have a downstream node. If the downstream node responds successfully, we also commit the message; if not, we return a rejection upstream, either to a client or to another upstream node. Any questions there?

I should point out that the sequence number, while I'm talking about it right now, is not something you need to fully understand to use Wade; Wade just asks that you keep track of it. So, revisiting the key-value store, we can look at our get and set methods again. In our set method we have this object ID, that's the partition, and the object sequence, which again is that ever-incrementing number. All Wade asks you to do is store that object sequence and be able to retrieve it. In this case we have a very simple seq-map business here, storing the current sequence by object ID; that's pretty simple. I've omitted some methods you'd need to implement, to, say, give Wade the object sequence when it asks for it, but those are simple too.

So, enough about chain replication; any questions about chain replication before I move on? Yes. So, let's suppose a network partition occurs somewhere in the chain, such that the tail is still accessible from the client, but the head may not be. In that case, if the data makes it to the tail, that's good; the data is correct.
If the data doesn't make it to the tail, then it doesn't commit. What would have happened is that there would have been pending commands in the head and the body, but they'd never have been removed from the pending set; they'd still be pending, so the state of the system is still consistent.

Well, let's be clear here: in this diagram of chain replication, when the client writes a message into the system, that request does not return until every node in the chain has committed. It's not like I send a message and the head immediately responds, "Great, I'll handle everything asynchronously." It's all synchronous.

The availability part is something that's a little bit questionable; I haven't talked about it much, but chain replication does provide availability if you implement it as the paper describes. I've skipped a big step here, which is that in the original chain replication design, all messages are actually proxied by a master node, and that master node can detect whether a chain is up or down. So I said messages go to the head, but in chain replication, at least as the paper describes it, messages first go to a master node, which then farms them out to the head. That's how you maintain high availability: the master node knows immediately, upon writing, whether or not the chain is up, and can then execute a series of commands to bring the chain back up. It's really interesting stuff, but it's not quite part of the conversation here. You had a question?
You had a question Um, I honestly haven't we're we're still in mongo to we tried to upgrade to mongo 3 But we had a lot of issues doing that It's probably just unique to our particular case in our data model But you know, I don't want to say that our thing is better than mongo because mongo is a great database Rather, I want to say it's comparable to mongo. So for us we were able to get really good performance and We were able to beat mongo, but that's not to say that it's the best option Mongo is a great data general purpose database. Wade is a little bit heavier Lifting you have to write specific code that lives on the database. There's a lot more code. You have to write You use Wade when you need this sort of like I have very specific query pattern that I need to optimize for Does that answer your question? So We talked about that so enough about chain replication Let's talk about another feature of Wade that you can build in yourself This doesn't necessarily come with it, but it's easy to do with Wade is To eliminate what's called the read write update loop So oftentimes especially if we're talking about mongo if we need to make a change to an object That involves two round trips to the database We need to first read the object onto the client side serialize it stream it over the network Mutate it somehow and then write the object back to the database. So this can be kind of expensive We are putting potentially network strain because we're having to travel this huge object potentially huge objects across the network And then, you know, we have this sort of decoupling of where the storage is and where the code is and so forth But with Wade what we can do instead is that since the storage the update commands themselves live on the database The database can actually keep the value on itself and we just send the command that says Execute my function mutate. 
So this is very comparable to, say, stored procedures in SQL, which you might be familiar with, although more powerful in one way, because instead of SQL we write Python.

This has some benefits. The object doesn't have to travel the network, so we no longer saturate it. And another subtle thing we solve here is an issue of consistency: two clients could read the same object at the same time, then write it back, each overwriting the other's concurrent write. Does that make sense? If the mutation happens on the database, in a specific atomic fashion, then we don't have to worry about multiple readers racing to save the same object, and we don't have to do any locking. So we have a lock-free way to update arbitrary data. That was the whole read-write-update stuff.

Something similar to that, though distinct, is command versus value forwarding. In the case we were just talking about, we avoid streaming the object between the database and the client; here, we're talking about avoiding streaming the object between the nodes during replication. When a command enters the database and is being replicated, we have two options. Again, the mutate method lives on the database. Option one: we mutate the object on every node, and just forward the command on between the nodes: "Here's my set command, just do it," and the next node says, "Okay, I'll execute the set command again." That's what's happening on the left-hand side: every node executes the mutate method. On the right-hand side we have something called value forwarding: instead of sending the command over the network, we send the value, which is probably what we're more familiar with, and how regular databases without server-side stored-procedure-type features typically work; we just set the value.

The benefit of each is sort of subtle. On the left-hand side, because we're sending commands over the network, they might be really small compared to the values or objects we're operating on, in which case we avoid a lot of network traffic by not having to read the object and send it across the network. The downside is that every node has to do the same computation, so if your update is very expensive to compute, maybe something like the NLP stuff Burton was talking about, you may not want to do that on every node; you might want to compute it once and have the other nodes just store the resulting value. So there's a trade-off about which one is better in which scenario. Right now, Wade supports command forwarding; we assume that's more useful in most scenarios. There might be situations where value forwarding is useful, chiefly when computing the value is very expensive. But on the left-hand side, again, we save having to send the value across the network; we just get to send a nice, tight little command.

Okay, any questions there about command versus value forwarding? What about sharding? Yes, it is sharded. So, the command and value forwarding stuff is built into Wade; you don't have to think about it.
It's just given to you Does that make sense? So you don't you're not in charge of it. You just The only thing you have to tell weight is what partition of a specific object belongs to so you have to tell it Hey, my update command is going to the zero partition or the third partition or the 10th That's all you have to do is just choose the partition Okay, I think I might be this understanding of question Maybe Well, I'm gonna say that for now the migration to another language is probably Not gonna happen. It might happen if this thing becomes a thing, right? And that's a big if For now Python is great one thing that even though we said, you know, Python may not have performance Well, it can have performance if we see libraries and stuff like that We can gain an immense performance using things like level DB to back your back Wade is an excellent choice All the level DB or at least most of the clients are written in C So yeah, you get to write Python and talk to C Wade is also built on top of a framework called Pi UV, which is based on lip UV Which is in a very efficient event processing framework for that We're using extensively code is a little messier because we use it, but we gain immense performance Any other questions about command versus value forwarding? Okay so Last but not least I kind of mentioned it before Chain replication has this idea of a master or proxy node Wade our design is a little bit different. I don't know if it's right or wrong But instead of having a master or proxy node the client directly talks to the head of the chain avoiding that extra step and What we've done is say it is built an asynchronous process called the overlord which talks to the chains itself and detects failures So Basically when a chain when the commands in a chain might be out of order because based on the object sequence You know the node receives an object sequence and it's like sorry. That's not the one I'm gonna accept it's not one more than the current one. 
I'm going to stop accepting updates. Well, that's great: it means the database doesn't go into an inconsistent state. But it's also kind of bad, because it means the chain just stops working; the writes just stop happening. So we have this other process, the overlord, that's periodically checking all the chains and trying to write to them. If there are any chains it can't write to, for whatever reason, it starts to execute a series of commands to reconfigure the chain. So, for example, if something happened to the body of a chain while the head and the tail are still available, it will just remove the body and make the head and the tail the new chain. But you might want to add a new node in there to replace the body, and so on, so you have a lot of choices there. This is a part that hasn't received a lot of love yet, so we definitely need to think more about how it works. Currently it works; it's just not the most performant thing. It takes a while to bring a chain back up from being unavailable, and in order for us to be available the way chain replication talks about, we're going to need to solve that problem. Any questions there on the overlord?
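A toy sketch of that overlord loop, assuming it probes each chain with a write and drops dead body nodes on failure. The names here are hypothetical, and the real reconfiguration also has to handle bringing replacement nodes in.

```python
class Chain:
    def __init__(self, nodes):
        # nodes[0] is the head, nodes[-1] is the tail
        self.nodes = nodes

def probe(chain, alive):
    # A chain is writable only if every node in it is up.
    return all(node in alive for node in chain.nodes)

def reconfigure(chain, alive):
    # Drop dead nodes but keep the survivors in chain order,
    # mirroring the "remove the body, link head to tail" repair.
    chain.nodes = [n for n in chain.nodes if n in alive]

alive = {"node0", "node2"}                  # node1 (the body) has died
chain = Chain(["node0", "node1", "node2"])
if not probe(chain, alive):
    reconfigure(chain, alive)
print(chain.nodes)  # ['node0', 'node2']
```

After the repair, the former head and tail talk directly to each other; re-adding a body later means resyncing it from the tail, since the tail holds the true state.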
I'm going to skip this, but that's basically the gist of what the nodes think they are: here are the nodes listed. This is my little local example, and here are my chains, or my objects, or partitions. We have two, zero and one, and they're composed of nodes zero and one, and one and zero, and the ordering of that gives you the head and the tail: the first node is the head, the second node is the tail.

I'm going to skip past this briefly, but this is basically how we decide where our objects and partitions live. There's a lot of good research that was done on how to help you determine where your partitions should be; there's some work here you can look at. And that's pretty much it.

Our client, if you've heard of gevent, is gevent-compatible. Under the hood it is multi-threaded, so if you don't have gevent, it will use Python's threading library. I know threading is scary in Python, but it's okay if you're just waiting on I/O, which with a socket you are. I haven't used Twisted before, but we've used the client in a variety of different applications. For example, we've used it on web servers and it works: it writes in parallel, and you can share the same client object with all your workers; you can instantiate one client and use it on your web server.
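The local example described above, two partitions spread over two nodes, with the first node listed in a chain as the head and the last as the tail, might be sketched like this. The dictionary shape and the hash-based partition choice are my own illustration, not Wade's actual configuration format.

```python
import zlib

# Static topology: two nodes, two partitions. In each chain list,
# the first node ID is the head and the last is the tail.
topology = {
    "nodes": {
        0: ("localhost", 10001),
        1: ("localhost", 10002),
    },
    "chains": {
        0: [0, 1],   # partition 0: node 0 is head, node 1 is tail
        1: [1, 0],   # partition 1: node 1 is head, node 0 is tail
    },
}

def partition_for(object_id: str) -> int:
    # Stable hash so the same object always maps to the same partition.
    return zlib.crc32(object_id.encode()) % len(topology["chains"])

def head_of(partition):
    return topology["chains"][partition][0]

def tail_of(partition):
    return topology["chains"][partition][-1]

# Writes go to the head of a partition's chain; reads go to its tail.
pid = partition_for("article:123")
write_node = topology["nodes"][head_of(pid)]
read_node = topology["nodes"][tail_of(pid)]
```

The client only has to supply the partition ID with each command; routing to the right head or tail follows from the topology.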
Does that make sense? How is the latency? I can't really say; I don't have a lot of comparison numbers with me. But just based on the way it works, the theory behind it, Wade is going to have a higher latency. Reads will be fast, because reads go to the tail; that's just a lookup on one node, really simple, like pretty much all other databases. For writes, if you recall from our chain here, let me pull back a little bit: we have to wait for each of those red lines before we get a success back to the driver, whereas with the other database it's the same thing except those red lines are in parallel, so it's whichever one responds first. With that model you get slightly lower latency. The difference, though, is that you trade off latency for throughput. Basically, the solution to getting more writes per second is to have more writers: you spin up more writing processes. So instead of one client writing 200 requests per second (I'm making these numbers up), you'll have two clients writing a hundred requests per second each, but overall you still get 200 requests per second.

Okay, what's up? So that's something I didn't really talk about here. Just know that the object sequence number I talked about gives us a very good notion of state, and that the tail always represents the true state of the system. So suppose in this example that something bad happened to the body. In that case we would just remove the body and tell the head and the tail: hey, by the way, instead of there being a body, you're just talking to each other now. That's really simple. Now suppose you wanted to add the body back in. What we do, because the tail is the source of truth, is this:
We would just tell the tail: send all your data to the body.

Okay. Right now there are ways to minimize the downtime, but there is some, and the downtime is per partition. That's why we feel you should choose a lot of partition IDs: if you have a lot of partitions, each one is smaller, and if each one is smaller, the system can more expeditiously bring it back online. So, like I said, if you have two partitions, it takes forever for one of them to come up; if you have a thousand, each one can come up quickly and you get better availability. It's something we're still tackling, because it does feel like it should be better than it currently is.

Yeah, but I mean, it's pretty good; it's still all right. It may not be as good as some other databases, but again, we still get high throughput and strong consistency, which are great. Any other questions from the crowd?

Correct. Basically, the reason why at Chartbeat we can get away with this is that the clients themselves don't actually write data to us; that all happens asynchronously. Reading the data is the hard part for us.

Okay, you probably saw our dashboard at the beginning; I kind of glossed over this, which I think is actually kind of cool. You see this line here, this graph thing. This, by the way, is Gawker; they're a strategic partner of ours, and this is their dashboard. I don't remember when I took this screenshot, but we've got some of their top articles here: "Pokémon Go Is a Government Surveillance Psyop Conspiracy." At the time they had about 11,000 people on their website, and this graph here shows the number of people on their site over the current day. I took this around 11 a.m.
So they hadn't reached the peak of their traffic yet. But the database that powers this graph is the thing I was talking about, the thing we're replacing. We have several thousand customers with several thousand web pages that we keep track of, so it's a lot of data for us to keep track of, and Mongo just wasn't doing it for us. That's nothing against Mongo; our particular data model just needed something more hand-tuned and customized.

Correct. Some. Yes, yes, exactly. Correct, correct. So for us at Chartbeat, the size of our data is bounded to some extent because it's a time series, so we roll data off. Unfortunately, right now this is not resizable, which is bad, I know. But for now, it is what it is.

Yeah, no, it's more of my own insecurity. There are a couple of considerations: under the hood, the database itself is storing Python dictionaries mapping data from partition ID to something, so if you chose more than max-int partitions, it would not fit in memory. So there are some practical limitations.

Correct. So for all the objects, the code is basically written such that if the object doesn't exist yet, it uses a default value of zero, and if it does exist, it actually writes the data.

So this is kind of my configuration. This is static: you choose this thing, you create this file, and then this is the truth henceforth. Now, in the case of reconfiguration, the overlord might change some stuff, and this will change, but this is sort of the canonical truth of the system. That was a good question. Any other questions?
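The default-value-of-zero behavior mentioned above might look roughly like this read-update-write helper. The function shape is illustrative, not Wade's exact interface.

```python
def update_engaged_time(store, key, delta):
    # Missing objects default to zero, so updates never have to
    # special-case "does this object exist yet?"
    current = store.get(key, 0)
    store[key] = current + delta
    return store[key]

store = {}
update_engaged_time(store, "article:123", 5)   # first write: 0 + 5
update_engaged_time(store, "article:123", 7)   # then 5 + 7
print(store["article:123"])  # 12
```

Because the update command lives on the database side, this read-write-update loop runs inside the chain rather than round-tripping through the client.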
No. Yeah, it's really not an issue whatsoever. For better or worse, in our current production system, for whatever reason, we decided to do a replication factor of two, which, I know, if you have sensitive data you'd say, oh my god, you need three. But with two, we don't even have to worry about that latency, because it doesn't exist for us. Any others? Thank you, guys.