Good afternoon everybody. As Demeter said, my name is Joe Drumgoole. I am Director of Developer Advocacy at MongoDB, and what that means to most of you people in the room is that I can give you money for your meet-up events. So if you follow me at @jdrumgoole on Twitter, I'm happy to help fund meet-up events across Europe, especially around Python. MongoDB doesn't have to feature, but we like to talk a little bit about it; that's part of the package. We do talks like this all over Europe. I did one at EuroPython a couple of years ago. This is just a basic introduction for people who are not familiar with MongoDB, although I am assuming some familiarity with Python. I've been using Python since 2006. I actually built a business in 2006, a SaaS backup service built around Python and Django. I dabble in it. I'm not an expert, but I can show you some tricks with MongoDB that should make your life a bit easier. So for those of you who aren't familiar with MongoDB, it's a document store. That means it stores JSON documents. You should be familiar with JSON if you've done any kind of programming at all. One of the excellent things about programming with Python and MongoDB is that JSON documents look exactly like Python dictionaries, so there is a one-to-one equivalence between the types that you're going to use directly in Python and the objects that you can store straight into the MongoDB database. This means there are no wrapping classes and no extra code. You can just use the Python objects directly, and that turns out to be very straightforward. If you were using C# or Java, you'd have to put a wrapping object around it, and that makes life just a little bit more noisy in terms of the code base. We get similar benefits from Node.js, but my personal view of Node.js is: that's just too crazy for me. I can't work that stuff out at all.
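That one-to-one mapping between dictionaries and documents is easy to see with nothing but the standard json module. A minimal sketch (the field names here are made up for illustration):

```python
import json

# A plain Python dictionary, exactly the shape MongoDB stores as a document.
user = {"name": "Grace", "languages": ["Python", "C"], "active": True}

# The JSON text is structurally identical to the dictionary literal.
text = json.dumps(user, sort_keys=True)
print(text)  # {"active": true, "languages": ["Python", "C"], "name": "Grace"}

# And it round-trips back to the same dictionary: no wrapper classes needed.
assert json.loads(text) == user
```

The same dictionary, nested lists and all, is what you hand straight to the driver later on.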
Callbacks are for somebody else, a better programmer than me. So when you store JSON in a database, it's obviously not pure textual JSON. That would be too expensive in terms of encoding and decoding; you're not going to do that. And of course, you've got to stuff this onto a wire and send it over a network to the database in the first place. That's what the Python driver is designed to do. So we actually encode it. We encode type and length information, so we understand that things are strings or nested documents, that things have arrays nested inside them, that things are integers. We also understand geospatial coordinates, although I'm not going to demo geospatial queries today. And the way it's stored is BSON. BSON is our own standard. The BSON spec is an open standard; you can contribute to it. It's effectively a binary encoding of the JSON representation. So if I show you the BSON for a hello-world document, you can see there is a size at the start. Then there's a type field; in this case it's a string, so it's a two. Then there's the field name and the field value, hello and world, and it's terminated by a null at the end. Obviously these can get more complicated as you get nested documents and arrays and so on. I just want you to understand that this is what's being sent across the wire, but that's the last time you need to worry about BSON in your life as a Python programmer. For you, you're going to be working with dictionaries and lists; they're the key types you're going to use. Now, when you download MongoDB, as I hope all of you will do, you're just going to install MongoDB on your local desktop. On Windows it installs as a service; on the various Linuxes and variations like OSX you can just run it as yourself. I'm actually running it on this desktop myself, although it's a Windows desktop, just so you can see the log files. But the production deployment of MongoDB is what we call a replica set.
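Back to that wire format for a moment: the hello-world document is small enough to encode by hand with the standard struct module. This is only a sketch of what the driver produces for this one document, not code you would ever write yourself:

```python
import struct

def bson_hello_world():
    # One element: type 0x02 (UTF-8 string), name "hello", value "world".
    name = b"hello\x00"
    value = b"world\x00"
    element = b"\x02" + name + struct.pack("<i", len(value)) + value
    # Document: int32 total size, the elements, then a terminating null byte.
    body = element + b"\x00"
    return struct.pack("<i", 4 + len(body)) + body

doc = bson_hello_world()
print(doc)       # b'\x16\x00\x00\x00\x02hello\x00\x06\x00\x00\x00world\x00\x00'
print(len(doc))  # 22, the 0x16 size you can see at the front
```

The size prefix, the type byte, the null-terminated name and the trailing null are exactly the pieces described above.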
Obviously with a single node database like MongoDB, if the node dies, the data goes away. We don't have an independent log file on single node databases the way you do in a relational database. Instead, we keep whole replicas of the data. When you build a replica set, you build three instances of MongoDB running on three separate nodes with three separate disks, and then you join them together into a replica set. I'm showing a replica set with three members here. You could have up to 50 members in a replica set; again, not many people do that, because that's a lot of nodes to manage. Writes go to the primary. The primary is designed to take the write operations; you cannot write to a secondary, but you'll see how that's managed in a couple of seconds. Once the writes are made to the primary, they're then effectively copied to the secondaries via an internal log called the oplog. The cluster, the replica set, manages all this for you. All you've got to do is set it up, and you'll see our MongoDB Atlas database in the cloud will do all this for you with one click; you don't have to do this. If you want to set this up locally, there is a Python package called mtools that will allow you to set up a complete replica set with one command on the local desktop you're using. So it's all set up, all the writes are going to the primary, everything's fine, but what happens if you have a failure? Well, in normal operation, the three nodes connect to each other using a heartbeat that tells each node the other nodes are alive, and replication streams from the primary to the secondaries. Remember, you read and write from the primary for consistent read and write activity. If you don't mind a little bit of eventual read consistency, you can choose to read from a secondary.
And of course, this works out great if you're running a distributed cluster where you've got users in New York and London and Basel and you want to eliminate that wide-area loop when you're reading data. In those situations, it often doesn't matter that you don't have the most up-to-date records. So in normal operation this all works away, and again, we set this up and manage it for you. But imagine the virtual machine that's running the primary dies for some reason. And we know nodes die; that's just the life of a node. Somebody kills it, it runs out of disk space, it gets jammed because some process runs and uses too much CPU. Well, eventually the heartbeat that the other two nodes send is going to not get a response. At that point the remaining nodes, and there must be a majority of the nodes remaining for this to happen, will have an election. It's like any election: you can't have an election if you don't have a majority of people participating. The election effectively decides which node has the most up-to-date data, and it uses, for those of you who are into this kind of thing, a consensus algorithm which is a small variation on the Raft consensus algorithm. It depends on the size of the cluster and how many nodes there are, but it takes a couple of hundred milliseconds, and eventually they will elect a new node, and that node will spring to life as a new primary. What happens to the clients while this is happening? Well, the Python driver, the client library that you're going to use, collaborates with the cluster to ensure that no writes are lost. So even if there's a write in flight to the old primary when it dies, the driver will restart that write automatically and recover, and make that write idempotently on the new primary. So there's no way to lose data. If, on the other hand, you lost the whole cluster, if your data center caught fire: it happens, I know, not very often, but it does.
What will eventually happen is your client will time out; you'll get a server timeout. That happens after 30 seconds by default, and then you've got to do something else: my database is down, what do I do in that situation? But with multiple nodes and the ability of the primary to move to the nodes that are alive, that's going to happen much less often than in a single node database. Now, those of you who have been watching astutely will realize that if all the reads and writes are going to a single node, there's a point at which that node is going to saturate, right? So will it scale? How does it jump from a single node to millions of transactions, millions of users? Well, it can, and we do it with sharding. Effectively, we run a partition of the data on multiple replica sets, and we use a separate set of daemons called mongos to route the reads and writes to those nodes. Going into sharding is a whole talk in itself. Trust me, it works, but you don't have to trust me. You can trust Fortnite. Fortnite has 25 million users, 10 million active users. They run on a sharded MongoDB cluster in Atlas, and trust me, that's a high workload. So enough of the slideware; let's actually see it in action. So I've got IPython here. I actually wrote from datetime import datetime already, because every time I did this demo I forgot to import datetime and later on a call would fail when it needed the class. So I want to actually connect to a database. I've got a server running here; it's kind of shrunk down, but there's a database server running there, getting connections and whatever. So I'm going to make a client. I need to import the PyMongo library first, so import pymongo, and then I just need to make a client object. The PyMongo library handles connection pooling, security, encryption if you're using it, and also encoding and decoding into BSON. That's why you don't need to worry about it.
So we do pymongo.MongoClient. And if you look at the client object, it's actually pointing at localhost 27017. By default, all servers start on port 27017, so as long as you don't specify anything, it's all going to work. If you want to change that, you can; it's a --port argument on the server. Now I still need to make a database, so I'm going to look at the client object and just make a database off the client. This is the beauty of MongoDB: it's very simple to spin up new databases. So we're going to make a database called test, and then we're going to make a collection. Think table when you think collection, but instead of rows, we're going to have JSON documents. So we're going to make a collection and call that test as well, because we just have no imagination in MongoDB. And if I look at the collection, you'll see it's got a client and it's got two things, a database and a collection. Now inserting into a collection is just as easy as making a dictionary. So I can do collection.insert_one, let's get that typo out, and I just put in a dictionary. I'm going to make an explicit dictionary here with a username field, and the username is jdrumgoole, and we close the curlies. Nobody should do a demo in Python without IPython, because it closes your curlies for you. It's going to return a result object which basically says that's been inserted. And we can actually look at that object: we can do collection.find_one. We just use find_one because we know we've only got one document in this collection, and there it will pop up, and you can see username, jdrumgoole. But what's this strange _id? This is added by the Python client library, and it's created on the client, so there's no round trip to the server. The ObjectId is your unique primary key for every object that you insert. We do this insert now; let's just do it again. And if I do a find rather than a find_one, because find will return all the documents in the database,
I'm going to get a cursor. That's kind of annoying because I'm in the Python shell. Normally other programmers have to use the Mongo shell, which is a JavaScript environment, but Python has this cool REPL, and because Shane Harvey and co. have done such a great job of building the library, we can look at this stuff, and we get a cursor. I could do something really ugly: the cursor is an iterator, so I could call next on it and get the object. But that's a drag; who wants to do that? I wrote a Python package just to make this stuff easier, called mongodbshell. Actually, I want to get a particular object from this shell, so I'm going to do from mongodbshell import MongoDB; this is like a super object that saves you some typing. Let's get rid of those curlies. So now I'm going to make a new client object called MongoDB, and I can pass it the database I want and the collection I want, with the right quotes. Now if I look at the c object, it's kind of similar, but it's just a bit more legible: it shows you the URL, the database name and the collection name, and it doesn't put all the other cruft in there. And now I can do c.find, and bingo, we get, I've done some inserts already, we get the objects displaying. So mongodbshell effectively wraps the cursor and displays it out, and it adds pagination and other nice stuff. It adds those line numbers; they can be turned off. We're going to play around with both of these in the future. So that's fine in the shell, but let's have a quick look at what an actual program looks like. Here's the simplest Python program you can write for MongoDB: it literally just pings the server with an isMaster command. So I'm going to run that, and you'll see it'll just produce this JSON document, which is basically system information about the server: isMaster is true, local time, etc., etc. So you just get a bunch of data about the server. That's the simplest program you can write. But what if I wanted to make a lot of
documents? So I'm going to build this other program, and let me just put this in a slightly more legible format; enter distraction-free mode. This is going to make a pile of documents. It imports the usual things: we get datetime, we're going to make random strings, and we're going to make an article function which just returns a document that has a bunch of random fields in it: an _id, a title and so on. Note that the _id field, which we saw generated automatically previously, can also be overridden to insert your own unique ID. Because _id is always indexed, it means you can save yourself an index if you already have a unique ID for the database. We're going to make a user function which does the same thing. Then we've got a PyMongo client, and we're making a database called EP2019; I'm going to just change that because I don't want to overwrite the demo database I'm going to use. We're going to drop the users and articles collections, and then we're just going to insert. And we do something here that is an important performance improvement: instead of doing insert_one, we're building lists of users, appending to them, doing the same for articles, and then inserting 500 at a time with insert_many. Why do we do that?
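The answer is round trips, and the batching idea itself fits in a few lines. This is a plain-Python sketch of the concept, not PyMongo's actual implementation; insert_many does its own chunking internally:

```python
def chunks(docs, batch_size=500):
    """Yield successive batch_size-sized slices from a list of documents."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]

# 1200 documents become three batches, i.e. three round trips
# to the server instead of 1200 individual inserts.
docs = [{"_id": i} for i in range(1200)]
sizes = [len(batch) for batch in chunks(docs)]
print(sizes)  # [500, 500, 200]
```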
Well, because each insert requires a round trip to the server, and if you have to do one round trip for every document, that's going to take a long time. With this model you can insert any number you want. The nice thing about insert_many is you can give it as long a list as you want, and it will internally chunk it if it exceeds the batch size that it can use; the default internally is about 1000 documents. I'm setting it to 500 here so you can see feedback as you insert. Clearly, if you're going to run an insert with a million documents, you're going to stall your Python program for quite a while, and there are async versions of this library, which again I'm not going to get into: if you look at Motor, the async version of PyMongo, it allows you to do all this stuff asynchronously. Having run this, we're just going to spin it up, put it back into non-distraction mode, run that again, change it to many docs, and it's going to chug away, whacking those articles in. So I'm just going to connect to a similar structure I built already. Let's create articles, which is off db.articles, and we're going to create users, which is off the database again, db.users. Now of course if we do articles.find, we're going to get a cursor; let's just clear that up, control-C out of that, and go back. articles.find, and we got our cursor back. So we're going to make our articles view using mongodbshell: we make that a MongoDB object, with EP2019 as the database and articles as the collection, and the same thing with users. Now we can do articles_view.find, and we'll get paginated documents. So that's insert, that's query. What about update? How do we update an article? Well, there's an update_one we can use, so we can do articles.update_one, and again we can look for an article, because I know
the _id, and it's article 100, and then we're going to do an update operation with a $set. A $set basically sets a field, or if the field doesn't exist, it adds it. We're going to add a comments field, and that's going to be an empty array; we close all our curlies, and one more. And again we've got this problem that we can't see what we've inserted, but that's okay for now; we're not going to do too much about that. We're going to have a quick look with articles_view.find_one, looking at the _id, and we can see... did I do the insert? Hmm, let's try that again. Ah, look, articles is still on test; that explains it. articles = db.EP2019.articles, okay, just got to set that database up again, and then we will get to articles. Then we can do articles.update_one, and we want to pick our _id, which is unique, article 100, close curly, and then we do the $set operation as before: we set the comments field to be an empty array, close the curlies, close the other curlies, close the bracket. Okay, let's just rename this to articles_view and make articles the actual object on the db, and then we re-run articles.update_one and we get a result back. Then we can do articles_view.find_one with _id article 100, thank you, and you'll see the post date there, so the comments field should be in there. I'm not going to fight with this anymore. So now what we do is append to those comments. We can say articles.update_one, open the curly again, _id article 100, and now we're going to do a $push, which actually appends to the array. So $push, and we open a comments entry, and then we can effectively specify another document here, so it would be like username is joe and body is hello. And now if you look at articles, making sure it's the right database, articles_view.find, look at the actual object we've created, which is
article 100... something's wrong there. _id, article 100... okay, I seem to have messed something up there, but effectively you'd do a $push, which adds a comment to the document, and that gives you an update operation. Simple deletes are of course just articles.delete_one, and you'd specify an object with an _id, just article 100; oops, typo again, articles, and it will just return the result of delete_one. So what have we shown today? Well, you've seen how to create a database and a collection, how to find one and find many documents, how to insert a single document and insert many, and also how to update those documents, although the update demo didn't quite work out; that's just my fat-finger typing. I haven't shown you how to check performance and add indexes, because we kind of ran out of time, but I'm going to show you one more thing. You can build all of these clusters in the cloud with MongoDB; you don't have to build these clusters manually. You can set this stuff up at cloud.mongodb.com. You can create your own clusters at any level and any scale, it's pay as you go, and you can turn the clusters off and save yourself money. And if you use the code hack100, you can go to the top-level organization when you create it (it starts you down at the lower levels; you have to go up to the top level to get to the billing tab), go into the billing tab, scroll down, apply credit, and put your hack100 in there, and you get $100 of free credit to use MongoDB. That's MongoDB in a nutshell: free to use, and the easiest way to use MongoDB in the world is with Python and PyMongo. Okay, thank you very much.