trying to learn some American accent. It didn't really work out; it was too hard. I don't know how you guys learned it, but anyway, you'll have to deal with my Indian accent. I am Bhaskar. I work for Apple, and I've been working on FoundationDB layers for what feels like the last 100 years. Before that, I worked on Cassandra: I did an internal Cassandra storage engine (it was a bad idea), and before that, some bulk load improvements. Before going into the Document Layer, I want to spend a little bit of time on the FoundationDB layers concept. Ben already explained it, so I'll speed through it. If you look at a typical database stack, it has three components: a core engine, a transaction module, and a storage engine. This is a very theoretical, conceptual view; usually the lines are blurry and the components overlap quite a bit. The core engine is the one that decides the data model for any database, whether it is a document database or a SQL database, and it is the one that implements query planning. The transaction module guarantees the ACID properties, and the storage engine provides the replication and durability guarantees; these two are usually independent of the data model. So the problems we solve in the bottom two layers are usually the same whichever database you deal with: they depend on what kind of transactional guarantees the database provides, not much on what kind of data model it supports. FoundationDB tries to solve the bottom two-thirds of that stack, at scale. FoundationDB has serializable transactions and very strong isolation guarantees, and it has all the usual database features: backups, multi-datacenter support, and so on. We've talked about what FoundationDB is good at; now let's look at the API.
The API for FoundationDB is very minimalistic: it is a raw key-value store API. Unlike SQL or document databases, it doesn't have joins, aggregations, or secondary indices. But the API is powerful and minimal enough that you can use it as a basic building block and build interesting, complex data models on top. This is where the promise of layers comes into the picture. A layer sits on top of FoundationDB and builds a data model on top of the FoundationDB API. Because FoundationDB solves most of the hard database problems, a layer only has to worry about its data model, and not about everything else FoundationDB already solves. As the persistent state is stored in FoundationDB, layers are usually stateless, and they can be built either as libraries or as stateless microservices. As long as a layer uses transactions properly, concurrency is taken care of by FoundationDB. Because of that, if you want more throughput, you can just run more instances without worrying about concurrency or scale, so it's quite easy to scale as well. And this goes without saying: layers inherit all the FoundationDB features. If you have multi-datacenter support and replication, everything just happens seamlessly. There are some utility layers that are so critical they actually ship with the FoundationDB language bindings: the directory layer, which provides a keyspace abstraction on top of the FoundationDB API, and the tuple layer, which gives us an order-preserving data type encoding. There are also a couple of layers from the community: a Linux NBD block device layer, and a JanusGraph layer, which I think we'll hear about after this. In the same spirit, we have the Document Layer. We think this is the first step from us toward the promise of FoundationDB layers; in the future, we'll see more layers from the community, and hopefully from us as well.
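To make the tuple layer idea concrete, here is a toy sketch of what it does: encode typed tuples into byte strings whose lexicographic order matches the tuple's natural order, so range reads over byte keys behave like range reads over tuples. This is not the real FoundationDB tuple encoding (the real one, in `fdb.tuple` in the Python bindings, handles negative integers, floats, nesting, and more); the type tags and widths here are invented for illustration.

```python
def pack(parts):
    """Encode a tuple of strings/non-negative ints into an order-preserving
    byte key. Simplified illustration only, NOT the real tuple-layer format."""
    out = b""
    for p in parts:
        if isinstance(p, str):
            # type tag, UTF-8 bytes, NUL terminator
            out += b"\x02" + p.encode() + b"\x00"
        elif isinstance(p, int):
            # fixed-width big-endian so numeric order matches byte order
            out += b"\x03" + p.to_bytes(8, "big", signed=False)
    return out

# Keys for two documents under the same "collection" prefix:
k1 = pack(("employees", 1, "name"))
k2 = pack(("employees", 2, "name"))
assert k1 < k2                                # ordering follows the primary key
assert k1.startswith(pack(("employees",)))    # shared prefix => one range read
```

The prefix property in the last assertion is what makes the directory layer's "group everything under one keyspace" abstraction work: reading a whole collection is a single range read on its packed prefix.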
Well, good. The Document Layer implements a document database API on top of FoundationDB. And it's not just any document database API: it is MongoDB wire compatible. Because it is wire compatible, you can use any standard MongoDB driver or framework to connect to the Document Layer. As far as the application is concerned, it looks like a MongoDB server. It does a lot of things better, but that's a different point; it looks like a MongoDB server. Whatever else you can say about MongoDB, people really love the MongoDB API: it is very easy to use, it is very quick to onboard, the query language is very rich, there are lots of features on top, and there are lots of frameworks out there. Combining that with the strengths of FoundationDB, we really think this is a very good and easy way to get onto FoundationDB. The API itself holds no surprises: you all know MongoDB, and this is the MongoDB API. It stores JSON documents and it is schemaless. I'm not going to spend any time on the API because there is huge documentation out there, and lots of tutorials and videos, on the MongoDB API. Let's look at the feature set. The Document Layer is not a drop-in replacement for MongoDB for every application out there. We started with a core set of features: it supports CRUD operations, a large set of query and update operators, secondary indices, and transactions. These transactions are FoundationDB transactions exposed through the Document Layer API. You might not find popular features like change streams or MongoDB aggregations here; I guess those are the two big ones. We also don't have all the index types, like full-text indexes. The project is just starting up; we are working on it, and it's going to happen very soon. So what makes the Document Layer special?
Well, we keep hearing since this morning that anything to do with FoundationDB starts with strong consistency guarantees, and the Document Layer just inherits that. It has very strong consistency guarantees, and it happens seamlessly: your application doesn't have to specify what read concern, what read preference, what write concern, or what consistency level it needs; you don't have to carefully manufacture all that. It is always consistent, out of the box. There are no locks: it is optimistic concurrency control, with no database-level locking at all; it is completely lock-free. Then scaling: FoundationDB is a distributed database, and it does the sharding dynamically. There are no static shard keys. Again, this is both better and simpler for the application. With a statically sharded database like MongoDB or Cassandra, it happens quite often that you choose a shard key, and after six months or a year, once your data has grown quite a lot, you realize, hey, my shard key is wrong; now I have to change my shard key, and guess what, I have to migrate all my data again. Here there is no shard key, so you don't have to worry about how to distribute your data; FoundationDB takes care of it. Well, I've said so much about how good the Document Layer is; I have to back up all those claims, and I think the best way to back them up is to go a little bit into the design. I want to talk a little bit about how the core execution model works, and also how the storage model works; these two things, at a high level, explain all the claims. With a single-node SQL database like MySQL or Postgres, we are used to the norm that everything is a transaction. Transactions are not some slow feature you have to use very carefully, only when there is no other way around. Everything is a transaction with something like Postgres or MySQL.
Even when you run a statement in Postgres without starting a transaction, Postgres automatically starts a transaction, runs the statement, and commits immediately after; each statement runs in its own transaction. The Document Layer works in the same spirit: everything is a transaction, because in FoundationDB everything is a transaction. Either the application explicitly says, "I want to start a transaction, run all these statements, and commit," which groups the statements together; or, just like existing MongoDB applications, you keep issuing requests separately, and each request starts a separate FoundationDB transaction. We call these implicit transactions. Let's see what they actually look like. The green dot represents a MongoDB request. Say the Document Layer receives a single request, outside any transaction context. To serve it, the Document Layer has to perform several operations on FoundationDB. Take a simple MongoDB insert: it might need to read metadata from FoundationDB, check whether there is a duplicate document, insert the document, and also update the secondary indices for that collection. All of these become separate FoundationDB operations, but all of them happen under a single transaction. That gives serializable consistency for the whole request. And obviously, if there are any conflicts, the Document Layer takes care of retrying. So that is how the transactions work. But there is a catch: FoundationDB transactions are short; they have a five-second limit. Obviously, we can't fit everything in one. Imagine a MongoDB request that touches every document in a collection; I could be running a request to give a 10 percent bonus to all my employees.
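The implicit-transaction idea for a single insert can be sketched in a few lines. Everything below is an in-memory stand-in, assuming invented names (`store`, `insert_one`, a hard-coded metadata default): the point is only to show which operations the layer bundles into one FoundationDB transaction, not the layer's real code, which uses the FDB client's transaction objects and its retry loop for conflicts.

```python
store = {}          # stand-in for the FDB keyspace; a real layer uses FDB transactions

class DuplicateKey(Exception):
    pass

def insert_one(collection, doc):
    """One MongoDB-style request -> one transaction: the metadata read,
    duplicate check, document write, and index maintenance all commit
    (or get retried on conflict) together."""
    pk = doc["_id"]
    # read collection metadata (here: a hard-coded default index list)
    meta = store.get(("meta", collection), {"indexes": ["name"]})
    primary = (collection, pk)
    if primary in store:                     # duplicate-document check
        raise DuplicateKey(pk)
    store[primary] = doc                     # write the document
    for field in meta["indexes"]:            # update secondary indices
        store[("index", collection, field, doc[field], pk)] = b""

insert_one("employees", {"_id": 2, "name": "Bob"})
assert store[("employees", 2)]["name"] == "Bob"
assert ("index", "employees", "name", "Bob", 2) in store
```

Because all four steps share one transaction, a concurrent writer either sees the whole insert (document plus index entries) or none of it; that is the serializable consistency the talk claims.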
A request like that has to touch each and every document, and depending on how big my company is, it might not finish in five seconds, so it might not fit in one transaction. So the Document Layer does the obvious thing: it splits the request into multiple transactions. The guarantees here are not as strong as for short-lived requests, but it still guarantees consistency at the individual document level. Even if the consistency for long-running requests is not as good as for short-running ones, I think it is still better than the competition. Then there are explicit transactions. The application can say, "hey, I want to start a transaction, run these requests." This is actually the best way to get even better performance than implicit transactions: whether you want it or not, the Document Layer starts a transaction anyway, so if that's going to happen anyway, why not amortize the cost by running multiple requests under one transaction? Even though the data model is a document data model, a lot of the design principles here really remind me of the SQL days. When you're running explicit transactions, the same principles apply as when you write an application directly on top of FoundationDB: it is optimistic concurrency control, so you have to worry about conflicts and you have to worry about retries. The explicit transactions we have right now are not yet compatible with MongoDB 4.0 transactions. This is something we are actually working on; it is going to change very soon. The implementation we have right now is tied to the client connection. This is, again, to do with not being compatible with MongoDB transactions, and it is probably the only feature that is not compatible with the existing MongoDB behavior. Then, the storage model. FoundationDB does the storing.
So we don't have to worry about persistence or replication or anything like that, but we do have to worry about how we map documents onto FDB keys, and that is quite important. Take a sample JSON document; in MongoDB, _id is the primary key. Say we have a sample employee document, and let's see how we store it. The Document Layer stores a single document across multiple FDB keys. That does two things. First, it lets the Document Layer support larger documents, because each FDB value can't hold more than 100KB of data; by spreading a document across multiple keys, we can support documents larger than that. Second, if you want to update just one field in the document, you don't have to reread the entire document and write the whole thing again. The way the keys are formed is quite important. Each key has a prefix with the collection name (usually it's a directory prefix, but that's not important here), so you can group all of your collection data under one keyspace. The key then includes the primary key, so all of the keys for one document are grouped together, and at the same time all the documents are ordered by the primary key. So here we have multiple employee documents, all ordered by the primary key; _id is the primary key. If I want to read all employees in the collection, that boils down to doing a get_range on the employee prefix (get_range is a FoundationDB API call). In the same way, if I want to access an employee document by its _id, because _id is part of the FDB key, I can just prepare the prefix using the collection prefix and the primary key, which becomes ('employee', 2), and that accesses the employee record of Bob.
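The storage mapping above can be sketched with plain Python tuples standing in for tuple-layer-encoded keys. The `flatten` and `get_range` names, the single-level field paths, and the dict-backed store are all invented simplifications (real documents nest, and the real layer uses FDB's actual `get_range`); the shape of the keys is the point.

```python
def flatten(collection, doc):
    """One document -> several KV pairs, each key carrying
    (collection, primary key, field path)."""
    pk = doc["_id"]
    return {(collection, pk, field): value for field, value in doc.items()}

kv = {}
kv.update(flatten("employee", {"_id": 1, "name": "Alice", "title": "manager"}))
kv.update(flatten("employee", {"_id": 2, "name": "Bob", "title": "developer"}))

def get_range(prefix):
    """Stand-in for FDB's get_range: all keys sharing a tuple prefix, in order."""
    n = len(prefix)
    return sorted((k, v) for k, v in kv.items() if k[:n] == prefix)

# Read the whole collection: one range read on the collection prefix.
assert len(get_range(("employee",))) == 6
# Read one document by primary key: a range read on (collection, pk).
bob = dict(get_range(("employee", 2)))
assert bob[("employee", 2, "name")] == "Bob"
```

Note how updating a single field is just one key write here, and how the document-level and collection-level reads are both prefix range reads, which is exactly what the key layout was designed for.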
This is good as long as you only care about predicates on the primary key, but we also have predicates like "give me all employees with the name Eric." For that we need secondary indices, and we do have secondary indices. A secondary index has to maintain a mapping from the index key to the primary key; the index key here is name, and the primary key is _id. So this is how it looks: the key has the index prefix, then the index key, which stores the name, and then the primary key. The primary key could be stored in the value instead, but we don't want that, because unlike the primary index, secondary keys are not unique; you can have multiple documents with the same value. Here we have multiple Erics. If you don't keep the primary key as part of the key, then manager Eric is going to overwrite developer Eric, which is not great. And in the value, we don't store anything at all. If there is a query like "give me all documents with the name Eric," it first goes to the index space, does a get_range on Eric, gets all the primary keys, then goes back to the primary space and reads all the documents. We could avoid doing these two sets of gets if we stored the document, or the fields of the document we care about, as the value; those are covered indexes. We don't support them yet, but that's something we can do. So what is this storage model giving us? In spite of what we call primary keys and secondary keys, when it goes down to FoundationDB they are all just normal keys: primary indexes and secondary indexes are treated the same way, and they are sharded the same way. This is a big difference from lots of other NoSQL databases, where secondary index sharding is very closely tied to the primary index, so a query on a secondary index usually has to go to each and every shard. You'll be familiar with this if you've seen how Cassandra or MongoDB usually do it.
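The two-step secondary-index lookup just described can be sketched as follows. The key shapes `("index", field, value, pk)` and `("employee", pk)` and the `find_by_name` helper are illustrative stand-ins, not the layer's real on-disk format; note how keeping the primary key inside the index key lets the two Erics coexist.

```python
# Primary keyspace plus index keyspace in one stand-in store.
kv = {
    ("employee", 1): {"_id": 1, "name": "Eric", "title": "developer"},
    ("employee", 2): {"_id": 2, "name": "Bob",  "title": "developer"},
    ("employee", 3): {"_id": 3, "name": "Eric", "title": "manager"},
    # (index, field, value, pk) -> empty value; pk in the key, so
    # duplicate names don't overwrite each other
    ("index", "name", "Bob",  2): b"",
    ("index", "name", "Eric", 1): b"",
    ("index", "name", "Eric", 3): b"",
}

def find_by_name(name):
    # Step 1: range read on the index prefix to collect primary keys.
    pks = [k[3] for k in kv if k[:3] == ("index", "name", name)]
    # Step 2: point reads in the primary keyspace for each document.
    return [kv[("employee", pk)] for pk in sorted(pks)]

erics = find_by_name("Eric")
assert [d["title"] for d in erics] == ["developer", "manager"]
```

A covered index would store the needed fields in the index entry's value, collapsing step 2 into step 1; as the talk says, that is possible in this layout but not supported yet.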
In those systems, any query on a secondary index usually has to touch each and every shard. That's not the case here, because secondary index sharding works the same way as primary index sharding. And we get all these features without setting up any shard keys. Then, the indices themselves. As explained before, index updates always happen together with the actual document update, so indices always stay consistent with the primary document; there are no exceptions. Indices are distributed as well, which is good. Index rebuilds: this is something we are focusing on at the moment. MongoDB customers know how painful index rebuilds can be; I didn't even think what I saw was possible. For a MongoDB index rebuild, the application team actually has to go to the operations team and ask them to carefully bounce replicas one by one, so the index can be rebuilt on each individual replica separately. That's not the case here. There are lots of other improvements we are doing, but I guess I'm running out of time. Building layers on FoundationDB can be quite easy because of all the guarantees FoundationDB gives, but at the same time there are challenges. It is optimistic concurrency control, so you have to worry about avoiding contention; a colleague from our team is giving a talk on that this afternoon, which I strongly suggest going to, it's a very interesting one. Caching: because we want to run multiple instances of the Document Layer, any data we cache can work against us, because now you have to worry about concurrency yourself; it's not covered by FoundationDB's transactional guarantees. And any code you write on top of FoundationDB has to be idempotent, because of unknown commit failures. The last one is very deceiving; I put it in one line: query planning and optimization. People have spent decades making these things better. We have a basic model that kind of works.
We are working on making it better and better. Future improvements: these are the things coming very soon. We want to make our transactions compatible with MongoDB's, so that they work in all kinds of deployments; someone from our team is working on it and will probably commit a PR this week. Quite a few index rebuild improvements are coming, and our metadata design could be improved quite a bit. On features: aggregations and change streams. A lot of people ask me about change streams; I understand it is a very desired feature, and it is something we want to work on around the beginning of next year. Also spatial indexes and full-text indexes; spatial indexes probably very soon. Community. The Document Layer is now open source. I should have put a link here, but you have Google; you can Google it. It is open source under the Apache v2 license, with no restrictions; you can run it as a service if you want to. I'm not pointing fingers, but you can run a service if you want to. Give it a shot. Please give feedback on the forums: if you like it, let us know; if you don't like it, definitely let us know, because we can improve it. Raise issues if there are features you really want to see soon, and contributions are welcome. We are really excited about this project, and we think we can really build a community around it. It is written in Flow, FoundationDB's actor-model extension of C++. Flow is a very fun language to work in; trust me, I did Java until two years ago, and Flow is good. It's good. That's all from me.