So what is DynamoDB? Jean-Tiare already spoke a little bit about it. It is Amazon's NoSQL database. It's new: it was introduced in January. At Ludia, we have been using it for a couple of games in development since, I think, March, and we've been doing a lot of work on it. As you saw in Jean-Tiare's presentation, we developed DynamoDB Mock, and we did something else called DynamoDB Mapper, which I will be talking about in a few minutes. Do any of you know or have any of you used NoSQL databases? Okay, that's cool, so I don't need to explain a lot of that. A NoSQL database is not relational, and it trades off some of the power of SQL, in this case a lot of it, for performance and scalability. There are many applications where you don't need the whole power of the SQL language, but you do need speed and you do need scale, and NoSQL databases are built from the ground up for that. Unlike MongoDB, which is a document store, DynamoDB is a key-value store. It's based around tables, which contain items, which are basically Python dictionaries with a few limitations. But unlike document stores, there is no nesting allowed and you can't have complex queries; it's fairly simple. Basically, the only thing you need to specify is the schema for each table, which, as we will see now, is much, much simpler than a SQL schema, because in a SQL schema you need to specify every attribute of every row, and each row has to be exactly the same. In DynamoDB, there is one required primary key, which can be either a simple hash key only (for example, one string, the user ID for a user) or a composite key, which is one hash key plus one range key. For example, for blog posts, you would use the user ID as the hash key and the timestamp of the post as the range key.
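The composite-key behavior described above can be sketched in plain Python, with no AWS calls. The attribute names (`user_id`, `timestamp`) are just the blog-post example from the talk, and `query` is a toy stand-in for DynamoDB's real Query operation:

```python
# Toy table of blog posts: user_id is the hash key, timestamp the range key.
posts = [
    {"user_id": "alice", "timestamp": 1700000300, "title": "Third post"},
    {"user_id": "alice", "timestamp": 1700000100, "title": "First post"},
    {"user_id": "bob",   "timestamp": 1700000200, "title": "Hello"},
    {"user_id": "alice", "timestamp": 1700000200, "title": "Second post"},
]

def query(table, hash_key_value):
    """Mimic a DynamoDB Query: match on the hash key, sort by the range key."""
    matches = [item for item in table if item["user_id"] == hash_key_value]
    return sorted(matches, key=lambda item: item["timestamp"])

titles = [p["title"] for p in query(posts, "alice")]
print(titles)  # ['First post', 'Second post', 'Third post']
```

Specifying only the hash key gives you all of one user's posts, already ordered by the range key, which is exactly the access pattern the schema is designed around.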
That allows you, for example, to select all the blog posts from one user by specifying only the hash key, and you automatically get all the posts sorted by the range key. And that's the only thing you need to specify in a DynamoDB schema. Everything else is optional: you can add as many attributes as you want, within a few limitations, and you can change them on the fly. Once you've started using the database, there is no downtime involved in adding stuff to your items. So, yeah, DynamoDB rocks. It is awesome. It is fast. Okay, well, until today this was a very good argument in favor of it, and I hope it will continue to be so afterwards. It is entirely managed by Amazon, so when Amazon goes down, so does part of DynamoDB. Oh, that's a good point: as I was saying, it is entirely managed by Amazon, and yes, unlike the rest of Amazon Web Services, it kept working today. That's awesome. DynamoDB rocks. So it has the same advantages as EC2 and the other services: the entire time to install and deploy is about one minute per table. You don't need to spin up new instances, and you don't need to install any software. That can be troublesome when you are running unit tests, as Jean-Tiare explained, but in production it is good. And the really, really good thing is that it scales linearly with money. Every operation you can make in DynamoDB has a very predictable cost, rated in read or write units, and you pay for those units, plus of course for the space you use to store your data. What it comes down to is: you want more, you pay more, and you get more. Each item you write to the database costs you one write unit per kilobyte, rounded up, and each item you read costs one read unit per kilobyte, rounded up as well. The base pricing is shown on the slide.
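The per-kilobyte rounding just described is simple enough to sketch; `units_for` is an illustrative name, not an API call, and it shows why even tiny items cost a full unit:

```python
import math

def units_for(item_size_bytes):
    """One read or write unit per kilobyte of item, rounded up (1 minimum)."""
    return max(1, int(math.ceil(item_size_bytes / 1024.0)))

print(units_for(100))    # a tiny 100-byte item still costs 1 unit
print(units_for(1024))   # exactly 1 KB -> 1 unit
print(units_for(1025))   # one byte over -> 2 units
```

The rounding is why keeping items small matters: an item that is just over a kilobyte boundary doubles its per-operation cost.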
So, yeah, it's 10 write units or 50 read units for one cent per hour, and you can get up to, I think, 10,000 read and write units per table, and possibly more if you ask Amazon nicely. And of course you pay for storage: the first 100 megabytes of each table are free, and above that it's $1 per month per gigabyte. And since it is not only a key-value store but an eventually consistent key-value store, you can go even faster, or, to be more accurate, use fewer read units, if you are ready to accept data that may be a few seconds out of date. You just don't specify the consistent_read=True flag, and you get double the performance but slightly stale data. The underlying access protocol is HTTPS, so it's simple and stateless, and it is accessed by pure Python code provided by the Boto library, the standard stuff you use to access all the Amazon web services. There is much less black magic involved than with SQL databases, which all have underlying C drivers and weird stack-inspection things. And you can monkey-patch it with DynamoDB Mock, which is very useful for unit and functional tests. But, yeah, DynamoDB also sucks. It has a lot of limitations which can be troublesome. For example, I said there were two types of schema: either one simple primary key or one composite primary key. That's all there is. So if you need a composite key made out of three or more elements, you can't; you have to hack around it. Also, the hash key, which is used as the primary key in simple-key schemas, is in fact anything that can be hashed, so any string or number. The downside is that it will never be auto-generated for you: there are no auto-incrementing keys. Which, as it turns out, is much less of a problem than you would think, but it's still there.
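The eventual-consistency discount works out like this (a sketch of the arithmetic from the talk; `read_units` is an illustrative function, and the halving for eventually consistent reads is the rule being described):

```python
import math

def read_units(item_size_bytes, consistent=True):
    """Read cost: one unit per KB rounded up; half that if you accept
    eventually consistent (possibly slightly stale) data."""
    units = math.ceil(item_size_bytes / 1024.0)
    return units if consistent else units / 2.0

assert read_units(3000) == 3                      # 3 KB, strongly consistent
assert read_units(3000, consistent=False) == 1.5  # same item, eventual consistency
```

In other words, for the same provisioned throughput you get twice the read rate if a few seconds of staleness is acceptable.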
And of course, there are no relations between tables. Each table exists basically in a void; there is absolutely nothing but your code to ensure the consistency of your data across tables. There are also exactly five operations. So when I said we lose a lot of the power of SQL: yeah, we lose a lot of it. GET retrieves one item. PUT writes one item. (There are batch versions of those, of course.) DELETE deletes one item. QUERY can only be used on tables with composite primary keys: you specify the hash key and then one constraint on the range key, or no constraint at all if you want, for example, all the blog posts by one user. And finally there is SCAN, which is the closest thing we've got to a SELECT operator, but SCAN always goes through the entire contents of your table. It is not indexed and therefore very expensive, because all those operations, of course, use up read and write units. Also, there are exactly six data types: number, string, binary, number set, string set, and binary set. That's it. Numbers can be ints or floats; I don't remember the exact specs, Jean-Tiare has them, he's been studying them for DynamoDB Mock. And there are no lists. Those are sets, not lists, so no ordering and unique elements only. Strings and sets must be non-empty, so if you want an empty string or an empty set, you have to represent it as a missing attribute. And there is no nesting whatsoever: sets contain only scalar values, so you can't have a set of sets. Finally, you access DynamoDB through a low-level-ish API, which is Boto. You saw what it looks like in Jean-Tiare's presentation: you're basically working with Python dicts, and all the above limitations are exposed fairly directly. The no-empty-strings-or-sets thing is the biggest problem, and yeah, it can be troublesome to work with.
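The "empty values become missing attributes" workaround can be sketched like this. The function names are illustrative, not anyone's real API; the point is just the round-trip:

```python
def to_dynamo(item):
    """Drop the attributes DynamoDB would reject (empty strings and sets)."""
    return {k: v for k, v in item.items() if v not in ("", set())}

def from_dynamo(raw, schema):
    """Re-create missing attributes as empty values of the schema's type."""
    return {k: raw.get(k, typ()) for k, typ in schema.items()}

schema = {"user_id": str, "nickname": str, "badges": set}
stored = to_dynamo({"user_id": "42", "nickname": "", "badges": set()})
print(stored)  # {'user_id': '42'} -- the empty values simply vanish

restored = from_dynamo(stored, schema)
print(restored["nickname"] == "", restored["badges"] == set())  # True True
```

This is the kind of bookkeeping you end up writing by hand against raw Boto, and it is exactly what a mapper layer can hide for you.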
And finally, the only way to get integrity of any kind in DynamoDB is through conditional writes, because the whole protocol is stateless: nothing prevents other programs from accessing your data at the same time as you. The only way to make sure that the item you are writing hasn't changed since your last request is a conditional write, which is basically "write this item if and only if its old attributes in the database have the following values", or "if and only if it doesn't exist". So of course, we are the Python community, we can make it better, and that's why at Ludia we wrote a little thing called DynamoDB Mapper. I started work on it, and Jean-Tiare has been heading its development for the last few months. It is open source, LGPL version 3; you can find it on Bitbucket here, and it is on PyPI as well. It is an object-document mapper for DynamoDB, so it allows you to work with Python objects instead of raw dicts. The way it works is that you specify a more complete schema for your classes, which is something we totally stole from MongoKit. I've got an example here: the metadata for a Doom map. You have the name of the table where the items are stored, the name of the hash key, and the name of the range key if there is one; if there isn't, you just leave it as None. Then each of the attributes in this dictionary shows up as an attribute on instances of your class. So the map number is an int. The name is a unicode string, and I forgot to mention that the strings stored by DynamoDB are actually unicode strings: they are stored as UTF-8 and you get them back as unicode objects. And you've got a set of cheats here. If you want, you can also specify default values for any or all of those attributes. So for example, in a Doom map, obviously, we allow the standard IDKFA, IDDQD, and IDCLIP cheats by default, because, you know, that's how it works. We also support empty strings and sets; it's handled at the mapper level.
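The slide's Doom-map schema looks roughly like this. To keep the example runnable without AWS, the tiny `Model` base class below is a toy stand-in I wrote for this sketch, not dynamodb-mapper's real implementation; the class-level metadata style (`__table__`, `__hash_key__`, `__range_key__`, `__schema__`, `__defaults__`) follows what the talk describes:

```python
class Model(object):
    """Toy stand-in for the mapper's model base class (illustrative only)."""
    __defaults__ = {}

    def __init__(self, **kwargs):
        for name, typ in self.__schema__.items():
            default = self.__defaults__.get(name, typ())
            setattr(self, name, kwargs.get(name, default))

class DoomMap(Model):
    __table__ = "doom_map"
    __hash_key__ = "episode"
    __range_key__ = "map"       # set to None for a hash-only table
    __schema__ = {
        "episode": int,
        "map": int,
        "name": str,            # stored as UTF-8, exposed as a text string
        "cheats": set,
    }
    __defaults__ = {
        "cheats": {"IDKFA", "IDDQD", "IDCLIP"},
    }

e1m1 = DoomMap(episode=1, map=1, name="Hangar")
print(e1m1.name, sorted(e1m1.cheats))
```

Every key in `__schema__` becomes a typed attribute on instances, and anything not passed to the constructor falls back to its default, which is the MongoKit-style behavior the talk mentions.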
You don't have to worry about those: you just use your empty string or empty set, store it in the database, and it comes back the way you stored it. All the standard operations are supported: get, save, delete, query, scan. They have fairly natural APIs; get, query, and scan are class methods, so it's MyClass.get(hash_key=whatever, range_key=whatever_else), and you call the other methods on the objects themselves. We even added support for extra data types. I mentioned the lack of auto-incrementing int keys; we implemented them using an extra little thing DynamoDB has called atomic counters. That's out of the scope of this presentation, sorry, but basically it means you can get auto-incrementing int keys. Honestly, though, I feel you should avoid using those, because most of the time you will have something else that uniquely identifies your objects. For example, at Ludia we make Facebook games, so we can use the Facebook user ID as the key in the user table; it's already assigned by someone else. We also support datetime objects, because they look much better than a raw timestamp. They are stored as W3CDTF, the standard date format: year, month, day, hours, minutes, seconds, time zone. They are all UTC, of course; unfortunately, we don't have support for other time zones yet. The good news is that since this format sorts just like a string, we retain ordering, so you can use datetimes as range keys as well. And we added support for lists and dicts, which is done through JSON serialization. It's slightly hackish: the lists and dicts show up as strings in the database itself, but you retrieve them as actual lists and dicts. You just can't run filters on their contents. And you can even nest them.
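Two of those type tricks can be sketched in plain Python (the function name `w3cdtf` is illustrative; the sorting property and the JSON round-trip are the points being made above):

```python
import json
from datetime import datetime, timedelta

# 1. W3CDTF datetimes: zero-padded year-to-seconds strings sort exactly
#    like the datetimes they encode, so they still work as range keys.
def w3cdtf(dt):
    return dt.strftime("%Y-%m-%dT%H:%M:%S+00:00")

a = datetime(2012, 6, 29, 9, 30, 0)
assert w3cdtf(a) < w3cdtf(a + timedelta(days=1))  # string order == time order
print(w3cdtf(a))  # 2012-06-29T09:30:00+00:00

# 2. Lists and dicts ride along as JSON strings inside the table,
#    but round-trip back to real Python objects.
rewards = [{"coins": 10}, {"coins": 25}]
stored = json.dumps(rewards)          # what the database actually sees
assert isinstance(stored, str)
assert json.loads(stored) == rewards  # what your code gets back
```

The JSON trick is why you can't run DynamoDB-side filters on list or dict contents: as far as the database is concerned, they are opaque strings.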
We support conditional writes through a flag called raise_on_conflict, which raises a ConflictError if the object has changed in the DB since the last time you retrieved it. That is very useful when you've got multiple servers accessing the same database. And we have a special case of it, OverwriteError, for new objects that you create directly and want to insert into the database while making sure their primary key isn't already in use. We even added something called transactions, because for conditional writes we noticed that the most common use case is: get something from the database, modify it, try to save it with the conditional-write syntax; if it works, carry on, but if it failed because the object changed in the database since we retrieved it, restart from the beginning and try again. So we extracted that common pattern and called it a transaction. One important thing to note (sorry, I'll get back to it in a second) is that these are not SQL transactions. They offer somewhat similar semantics, but they are much less powerful, so don't use DynamoDB if you're a bank, by the way. You define a transaction by implementing a class and giving it getter/alterer pairs. The first function retrieves your object from the database, and the second applies modifications to it. It does not save to the database; the saving is handled by the transaction itself, and it is retried until successful. And the good thing about this is that one transaction can alter multiple objects at the same time: you just provide multiple pairs. I'll have a case study in a few minutes. So again, these are not SQL transactions: you don't get rollback, you don't get atomicity across tables, and you don't get any integrity constraints between tables, because at the actual DynamoDB level each table exists on its own. We've also got on-demand migrations, which we stole from MongoKit.
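The get / alter / conditional-save / retry loop just described can be sketched like this. The in-memory versioned `store` is a toy stand-in for DynamoDB, and all the names here are illustrative, not the mapper's actual classes:

```python
class ConflictError(Exception):
    """Raised when the stored item changed since we read it."""

store = {"user:42": {"version": 1, "xp": 100}}

def conditional_save(key, item, expected_version):
    """Write only if the stored version still matches (a conditional write)."""
    if store[key]["version"] != expected_version:
        raise ConflictError(key)
    item["version"] = expected_version + 1
    store[key] = item

def transaction(key, alter, max_retries=5):
    """Retry the get -> alter -> conditional save loop until it sticks."""
    for _ in range(max_retries):
        item = dict(store[key])           # getter: fetch a fresh copy
        seen_version = item["version"]
        alter(item)                       # alterer: mutate, but don't save
        try:
            conditional_save(key, item, seen_version)
            return item                   # the write stuck: done
        except ConflictError:
            continue                      # someone else won: start over
    raise ConflictError(key)

result = transaction("user:42", lambda u: u.update(xp=u["xp"] + 50))
print(result["xp"])  # 150
```

With multiple objects, the loop simply runs one getter/alterer pair per object before committing, which is why a single "transaction" can touch several items, but without any cross-table rollback if one of them fails permanently.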
And, yeah, here's a case study. At Ludia we are currently developing a multiplayer bingo game based on a high-profile brand which I am unfortunately not allowed to reveal. Yeah, yeah, I know, I know. It is a free-to-play game with microtransactions, and we are expecting about 300,000 daily active users, so there are a lot of concurrent users. But when you're playing this kind of bingo game, you're actually playing with at most 150 other players, so there isn't a lot of interaction between the users. We also have a REST backend API, which is perfect because it allows us to deploy as many servers as we want without changing the client code much, and all those instances share one set of DynamoDB tables. And since everything runs in the cloud, we get fairly decent performance between our instances and DynamoDB. Except when it crashes. So, right, the comments are gray, sorry about that, but these are two actual classes we've got. This is the Room class, a fairly simple model with a hash-only primary key, the room ID, which is the name, or rather the theme, of the room. We've also got a display name which is more user-friendly. Then whether the room is enabled or not: yeah, we support Booleans, because, you know, they're just ints. Some extra information here: a set that says what actually counts as a bingo in this room. And you can see we've got lists here, because, for example, this one is the amount of energy it costs you to buy one card, two cards, three cards, four cards, and so on. Ordering is important, we need ordering, so we're using a list. And we've got another list here for rewards. It references IDs in other tables, which is purely by convention: all the underscore-ID attributes, we have to maintain those manually and make sure nothing breaks. Use unit tests. And the next one is both a transaction and a table with a composite primary key.
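Since the slide isn't readable here, the Room model described above looks roughly like this as plain data. The attribute names are guesses reconstructed from the talk, not the real class:

```python
room = {
    "room_id": "pirates",           # hash-only primary key: the room's theme
    "name": "Pirate Cove",          # user-friendly display name (hypothetical)
    "enabled": 1,                   # Booleans stored as plain ints (0 or 1)
    "bingo_patterns": {"line", "x", "full_card"},  # set: what counts as a bingo
    # Lists where ordering matters: index 0 = energy cost of 1 card,
    # index 1 = cost of 2 cards, and so on.
    "card_costs": [5, 9, 12, 14],
    "reward_ids": [101, 102, 103],  # IDs into other tables, by convention only
}

assert room["card_costs"][1] == 9   # buying two cards costs 9 energy
assert bool(room["enabled"])        # an int doing duty as a Boolean
```

Note that nothing enforces that `reward_ids` actually point at existing rows elsewhere; as the talk says, that consistency lives entirely in your code and your unit tests.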
So yeah, when a user completes a game, we post several pieces of information, for example the amount of experience the user gained. These are handled as transactions: as you can see, the transactors are here. The getter simply fetches the user from the database, and this one adds all the new stuff to the user. Then we just commit the transaction and it works. And that's also