Hello everyone and welcome to this UK Data Service webinar about MongoDB. I'm Marguerita. I'm a communications officer working for the UK Data Service, and presenting today is Peter Smyth, a research associate working with the University of Manchester, but also working for the UK Data Service on the training side. Right. I will hand over to Peter now.

Okay. Thank you, Marguerita. Welcome to this webinar on MongoDB. So, what we're going to talk about today: I'm going to start with a very brief background of where MongoDB has come from and where it is today. I'm going to look at some terms that we're going to use throughout this webinar. Some may be familiar, some will be MongoDB-specific, some you might just not have heard of before, so hopefully I'll give you some kind of feeling for what I'm talking about as we go through. I'm then going to give you a demonstration comparing storing data in MongoDB with storing it in a relational database system. And then we'll look at some examples of how we might use MongoDB to query data, and at the end, very briefly, I'll show you how you can use MongoDB in an R environment or a Python environment, which, depending on your background, you may find more suitable.

So, MongoDB background. First developed in 2007, it became publicly available in 2009, so that's only about seven years ago; very new. It's a document-based database system. Now, I've put that in green to remind you that you possibly don't know what that means, but it's covered in the glossary of terms I'm coming to later on, so don't worry too much about that. It's the most popular non-relational database in use today. It's open source, it's free to use, you can download it, you can install it yourself on your desktop. You can also have far more commercial, enterprise-sized systems which are formed of large clusters of machines all running MongoDB. It's schema-less; we'll cover what a schema is later on.
NoSQL queries: that means you do not use something called SQL, to be covered later, to query the data, to get data out of the database and into your application or onto your screen. It's excellent for horizontal scalability; by that we mean it can handle very large collections of what MongoDB calls documents, and very large collections means big data.

The reason it's termed horizontal scalability as opposed to vertical scalability is this. In a traditional relational database system, if you run out of room because your data gets too big, what you used to do is say, well, we'll get a bigger server and more memory and bigger hard disks and we'll continue running like that. Unfortunately, that doesn't really scale very well, because you hit limits on cost and on how big you can physically make a server. The alternative approach to that, and it's very similar to the Hadoop-type approach of using servers in a cluster, is you just have an ordinary-size server. If it fills up, you get another ordinary-size server, put it next door to it and share the database between them. This approach works very well in MongoDB because as you keep adding servers, it has very little effect on the overall performance of the database and of running queries. Whereas on the relational side, you can sort of do that if you really want to, but there's quite a serious degradation in performance.

For more information, you can look at this site here, which runs a monthly update on how popular databases are, and just to prove it, here is the release for June 2016, so out of date tomorrow, but for the time being, you can see here MongoDB at number four, below only the major relational database systems, Oracle, MySQL and Microsoft SQL Server, and then there are more relational ones below that.
Cassandra is another NoSQL database, referred to as a wide column store, and Redis down here at number 10 is also a NoSQL database, and that's a key-value store, so within NoSQL databases there are still different types. MongoDB, a document-based system, is what we're going to be looking at.

So, glossary, as promised. Right, relational is a database system which uses tables to store data, and defines relationships between the tables to relate data in different tables. A table is rows and columns, just like in an Excel spreadsheet. Non-relational: no predefined relationships between the data. If appropriate, you can define relationships, but the user has to do that themselves in code.

SQL stands for Structured Query Language. It's a programming language in its own right. There is a standard version of SQL, but most of the relational database providers like Microsoft and Oracle tend to have their own slightly different version of SQL in order to work with their databases, but it's all essentially SQL and it's used to write queries to extract data from relational databases. So then, obviously, we have NoSQL. That isn't a query language or languages; it's really referring to databases which do not use SQL to query the data. So, as we saw in the earlier slide where you've got the different types of NoSQL databases, each one of those will have its own method of querying its own data. So, although SQL is a language, NoSQL is more of a concept, rather than an actual language in its own right.

Structured data is typically data that can be easily recorded in a table structure. So nice and ordered: you define the columns and you put data in rows below those column names. Unstructured data you can't store in a table. The types of items normally referred to here are things like PDF files, audio files, that sort of thing. And then finally, we have semi-structured data. I've defined this as not easily stored in a table.
You can sort of shoehorn some of this into columns if you really wanted to, but it's not ideal. Semi-structured data is really the type of data we're going to be dealing with today when we look at MongoDB, rather than truly unstructured data. Unstructured data can be stored in both relational databases and non-relational databases like MongoDB if you need to do so.

So, let's look at some of the actual terms that we're going to be using. In MongoDB, we have databases, as we do in a relational database. MongoDB has collections, whereas a relational database refers to tables. Within the collection, we have documents, whereas within a table, we would have rows. The document is split up into a series of fields in MongoDB, and in the relational database, the table has columns which make up that table. Shard and partition refer to what I was saying before about scalability. When you start splitting a MongoDB database or collection up across servers, it's referred to as sharding. And similarly on a relational database, when you're using multiple servers, people talk about partitions. The last three just reinforce what I was saying before: NoSQL is a MongoDB term, but not exclusively MongoDB; SQL is relational database. We've got semi-structured data in MongoDB and structured data in a relational database. MongoDB is non-relational, and a relational database clearly is relational.

So, I just want to show you, before we get onto the demonstrations, the systems that we're going to be using today. We're going to use a Windows command line just to import some data; I'll show you how to import data. We're going to use the MongoDB shell. Now, the MongoDB shell comes as part of a MongoDB installation. It's really just a command line which allows you to write MongoDB commands and what are called JavaScript commands. JavaScript is a programming language in its own right.
It's very popular with web developers, and it's used by MongoDB to incorporate its own database commands in a shell which allows you to do proper programming. So, you have all the normal structures associated with JavaScript programming.

We're also going to use something called NoSQL Manager for MongoDB. Now, this is a third-party Windows application. It's available from this place here. Unlike almost everything else we've spoken about in this series of webinars, this isn't actually a free product, although you can get a free 30-day trial. After that, you can get licenses, for which they're quoting $79, about 60 pounds, I suppose. But the 30-day free trial is probably going to be enough for you to get a feel for whether or not MongoDB is going to be of use to you.

Installing MongoDB on Windows: I'm not going to show you that today, but we will provide you with either a guide or a video of doing it on the UK Data Service website. It's actually incredibly easy. There are only a couple of inconveniences which you have to deal with. Certainly all the demonstrations I'm showing you today are running on my desktop here, and I haven't had any trouble at all.

So, the first demonstration I'm going to show you is loading data into a table in a relational database and then doing the same thing in MongoDB. And really, this is just for comparison. Just bear with me a minute while I find my place. Okay, so this is the demonstration. This is a CSV file that we're going to load into both a relational database and MongoDB. It's just a standard CSV file. You've probably seen many of these before. It's got a header line at the top with various fields, or column names, in there. And below that, it's got rows and rows of data.
And I've just color-coded that so you can get an idea of what goes with each field up here. There are a few more fields missing off the end there. Okay. Now, if I want to put this into a relational database, it's going to be a two-step process. First, I need to define the schema, that is, the structure of the table. I've got to say what all the columns are. I've got to say the order in which the columns are going to occur in the table. And then I have to actually load the data into the table.

Okay, so the first step involves creating a table. Now, I appreciate a lot of you may not know any SQL or use SQL, but just take it from me: this is a very simplified version of creating a table for almost any of the large relational database systems, in this case Microsoft SQL Server. So what we have is a CREATE TABLE. We've got a name for this table, sales order details. In the blue here, we've got all the different column names. And we also have to tell the relational system the type of each of these columns. So sales order ID is an integer. Carrier tracking number, despite the word "number" in its name, is varchar(50), which is effectively a string, so that's text-type data. And here, decimal: that's something with a decimal point in it, effectively. Okay, so that has to be done before I can load data into the table. Once I have done that, there's a utility we can run which says load this data into that table. And when you've done that, what you end up with is, not surprisingly, a table with data in it. That's exactly what you wanted. So that's perfectly okay.

Now, if we look at loading the CSV file into MongoDB, MongoDB has a command line utility to import data. You can do it in either JSON format, which I'll come to in a minute, or CSV format, as we've got here. There's also an export utility if you want to take data out of your MongoDB system. This is an example of the mongoimport command.
I've split this up over several lines just for clarity in showing you. In reality, you'd have to do this all as a single line when you're typing it in. And what we've got here is this collection, which equates to the table name, sales order details; I'm using the same name here. We specify what the database is; I'm calling that sales. And here I'm just saying, oh, this is a CSV file and there's a header line at the top of that file. And this is where the file of data is stored. Notice I am not defining any tables here as such. I'm not saying what these columns are and so on and so forth. And that is the key difference. Because when I run that, there may be no collection called sales order details, there may not even be a database called sales. But when I run that, the database and the collection will automatically be created for me if they don't exist.

And what do you get after you've done that? You get a series of documents in MongoDB. And each document, you recall, equates to one row in a table. So what I'm showing here, between this opening curly bracket here and this closing curly bracket here, represents one MongoDB document, or one row if it was a table. And you can see here, I've got sales order ID and the value, exactly the same as I had in my CSV file. So the columns have been converted into these field names down here, and the values for the first row from the CSV file are being shown down here. The only slight difference is this first line up here, this underscore ID, and this funny thing up here, which is effectively a unique identifier. When you load anything into MongoDB, whether it's in bulk like this or individually, every document is given a unique ID field like this. If you want to provide your own, you can do that, but if you don't provide one, as we didn't in our case, MongoDB includes one for you.
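The import step described above can be sketched in plain Python, without a MongoDB server. The options the speaker describes correspond to mongoimport's `--db`, `--collection`, `--type csv`, `--headerline` and `--file`; the sketch below only imitates the outcome, turning each CSV row into a document keyed by the header line and adding an `_id` when none is supplied. The column names and values are invented for illustration, and the `uuid` value is merely a stand-in for the identifier MongoDB generates.

```python
import csv
import io
import uuid

# Hypothetical CSV content: column names and values are made up for this sketch.
CSV_TEXT = """SalesOrderID,CarrierTrackingNumber,OrderQty,LineTotal
1001,AAAA-BBBB-01,1,10.50
1001,AAAA-BBBB-01,3,31.50
"""


def csv_to_documents(text):
    """Turn each CSV row into a dict keyed by the header line.

    No schema is declared anywhere: the field names come from the header.
    Values stay as strings here; this sketch does no type inference.
    """
    documents = []
    for row in csv.DictReader(io.StringIO(text)):
        doc = dict(row)
        # Stand-in for the unique _id that MongoDB adds when none is provided.
        doc.setdefault("_id", uuid.uuid4().hex)
        documents.append(doc)
    return documents


documents = csv_to_documents(CSV_TEXT)
for doc in documents:
    print(doc)
```

Each printed dictionary plays the role of one document, that is, one row of the original file with its column names repeated as field names.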
And most of the time, you can probably ignore it, because unless you've deliberately supplied it yourself, it's not going to make much difference, okay?

Now, at this stage, you think, well, what's the advantage of doing it this way? Why would I want to use MongoDB? Well, on the face of it... sorry, I should have mentioned on the previous slide: this format is called JSON. The curly bracket is the beginning and the end of the document. And then you get this general pattern of a field name, a colon, and then some value. And if it's character data, it puts it in quotes. So this is the general format for JSON data.

So why use MongoDB? So far, we're just using JSON objects, and it seems it's going to increase the overall size of the data. The reason for that is that every document in this collection is going to have those field names repeated on every single document. So that's bound to bulk up the size of the data. On the face of it, you're thinking, why bother? Well, the reality is that not all data is as structured as that CSV file. That actually came from a relational database into a CSV file in the first place, so naturally it'll go back into a relational database quite happily.

But what if we've got something a bit more complicated, like a tweet? Tweets are naturally returned to the user in the form of a JSON object. So what we've got here is just a part of a tweet which has been collected in a stream or whatever. And you can see from here, well, this top part here, quote status, quote default, that's not a problem. That would be my column name and that would be the value in one of the rows, and the same here and the same here. The difficulty comes when you get down to entities, which starts here with the name entities and ends down here with this closing curly bracket. Everything in between is part of the field named entities.
So you can immediately see that within that one field name I've got a whole host of other things. Specifically, in this section down here, I've got an array, denoted by these square brackets, of sub-documents, because I've got curly brackets at the beginning and end here. And within there I've got, effectively, field names of this sub-document with values; I've got IDs. I've also got something called indices, which is another array. Now, I know that array only has two values, but up here I've got an array which has no values. It could have three, four, five values; I just don't know. The point is that no two tweets are necessarily gonna have exactly the same structure. So you can't easily store that in a single table structure. You might have a different structure depending on the tweet. So some fields may be missing, fields may be in different orders and so on.

If you wanted to do that in a relational database you'd need many, well, several tables, let's say, in order to hold all the sub-components of the different parts of that tweet data. And it's very complex to set up. And it would also be very slow to place the data from the tweets into all of the relevant tables. So maintenance would be a problem. In MongoDB, however, all you're gonna do is store the whole tweet, okay?

Oops, last line. Yes: regardless of the complexity of the documents that you've stored, and the fact that it may vary between two different documents in the same collection, you query it in exactly the same way in MongoDB.

Now I think that is enough, so let's go on to some of the demonstrations. Just bear with me a minute. In the demonstrations we'll look at commands from the MongoDB shell, but I'll actually be using the NoSQL Manager to do that for convenience. And then at the end I'll show you using MongoDB from an R script and from a Python script. So what I need is MongoDB, okay.
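The nested shape of a tweet, as described above, can be pictured as a Python dictionary containing sub-documents and arrays. This is only a sketch: the field names follow the ones mentioned in the talk (`entities`, `indices`, `user`), but the values and the remaining fields are invented, and `get_path` is a hypothetical helper that imitates the dot notation used later in the queries.

```python
# A made-up, heavily shortened tweet: one sub-document ("user") and one field
# ("entities") holding an empty array and an array of sub-documents.
tweet = {
    "id_str": "1",
    "text": "example tweet",
    "user": {"name": "example user"},
    "entities": {
        "hashtags": [],  # an array with no values
        "user_mentions": [
            {"screen_name": "someone", "id": 42, "indices": [0, 8]},
        ],
    },
}


def get_path(document, dotted_path):
    """Follow a dotted path such as "user.name" through nested dicts and lists.

    Returns None when any step of the path is missing, which is how this
    sketch copes with documents that do not all share the same structure.
    """
    current = document
    for key in dotted_path.split("."):
        if isinstance(current, list):
            if not key.isdigit() or int(key) >= len(current):
                return None
            current = current[int(key)]
        elif isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return None
    return current


print(get_path(tweet, "user.name"))
print(get_path(tweet, "entities.user_mentions.0.indices"))
print(get_path(tweet, "geo.coordinates"))  # field absent in this tweet
```

A second tweet could carry more, fewer or differently nested fields and the same helper would still work, which is the point being made about storing the whole tweet as one document.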
This is the overall interface for the NoSQL Manager for MongoDB. What we have is our databases down here and our collections down here, and in this main part here is where we can write queries. So if I created a shell there, I could write a query in here, which I'm gonna do in a minute. Well, I'm not actually, because I've already pre-set some of these up. Right, these are the scripts we're gonna run today.

What I want to do is start off with some very simple commands, and we'll build it up and try and show you how these things look in terms of results and so on. So, first command: I'm gonna use the local database. You can do simple things like, well, what is the list of commands you can use? I won't run that one; it's quite a long one. I can certainly get a list of what collections I've got in local, which I can also see from down here as well. If I wanted to delete a collection called new collection, which I currently don't have anyway, so I won't bother running that, I could drop a collection down there.

If I want to create a new collection and put a document into it, this here represents the document. I've just hand-coded that. Here, new collection is what my collection will be called; it doesn't exist at the moment in this database, local. The fact that it doesn't exist isn't a problem, because when I run this command, this one statement here, it will actually create that collection and insert that record, that document, into it. Okay, and what you get back is an acknowledged, true, and then it tells you what object ID it's given that one record. So even the results coming back from MongoDB are couched in terms of JSON format, which is quite handy. It doesn't make any difference here.
I mean, it's quite readable here, but if this was done programmatically and you actually wanted to check whether something worked or didn't work, it's quite handy being able to get the information back like this, where you can test whether or not that acknowledged equals true and things like that.

So, having put that in there, just to show it's there, I'm gonna use new collection. So I'm telling it what collection I'm interested in, and findOne, with these empty brackets down here. This is where I would put search criteria in if I wanted any, but in fact, if I don't put any in there, I'm just saying show me the first one. And if I run that, what comes back is what I've just put in there: F name, Peter; L name, Smyth. I can run another insertOne here. And here, again, I'm putting another document in, but I've got two different field names. Does that bother MongoDB? No. Are they both still there? Oh, I've obviously run this twice. It's in there twice now. But again, it doesn't check, it doesn't care if your field names are the same as or different from other records in here. You'd probably have to know if you're gonna do a search and you want to search on full name and things like that. But in terms of storing the documents, it's not a problem whether they've got the same or different sets of fields in there.

Okay, if we go on to slightly more complicated ones, now I'm gonna use the Twitter database. And in the Twitter database down here, I've got this one collection called Brexit. Not quite as topical as it was when I created it, but never mind. And in there, I've got a collection of tweets. If I just hover over that, you can see there, it's got 55,547 tweets in there, occupying 261 megabytes. So this one's a fair-sized file. And just to prove that I got that right, what I can do here is say, on the find, I can add the dot count and it will tell me exactly what I've just found out, 55,000 records in there. What I'm interested in is this field called Geo.
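The insert and find behaviour described above can be imitated with a toy in-memory model. To be clear, this is not MongoDB and not its API: `ToyDatabase` is a hypothetical class written only to illustrate three points from the demonstration, namely that a collection springs into existence on first insert, that the insert returns an acknowledgement you can test in code, and that documents with different field names sit happily side by side. The second document's fields are invented.

```python
from collections import defaultdict
import itertools


class ToyDatabase:
    """A toy stand-in for a database: collection name -> list of documents."""

    def __init__(self):
        self.collections = defaultdict(list)
        self._ids = itertools.count(1)

    def insert_one(self, collection_name, document):
        """Store a copy of the document, creating the collection if needed."""
        stored = dict(document)
        stored.setdefault("_id", next(self._ids))  # generated id if none given
        self.collections[collection_name].append(stored)
        return {"acknowledged": True, "insertedId": stored["_id"]}

    def find_one(self, collection_name):
        """Return the first document, or None if the collection is empty."""
        docs = self.collections.get(collection_name, [])
        return docs[0] if docs else None

    def count(self, collection_name):
        return len(self.collections.get(collection_name, []))


db = ToyDatabase()
result = db.insert_one("newcollection", {"fname": "Peter", "lname": "Smyth"})
# A second document with entirely different (hypothetical) field names.
db.insert_one("newcollection", {"fullname": "Peter Smyth", "city": "Manchester"})

if result["acknowledged"]:
    print("inserted with id", result["insertedId"])
print(db.find_one("newcollection"))
print(db.count("newcollection"))
```

In the real shell the equivalent steps are the insertOne, findOne and count calls shown in the demonstration; the toy simply makes the "no schema check" behaviour visible without a server.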
Geo in a tweet will tell you if there is any, well, geo-type information, i.e. whether the location was specified when the tweet was sent. In the majority of cases, it isn't. So I've got 55,547 tweets; if I do a count of those which actually have Geo not equal to null, i.e. there is some geo information, and run that, I get 53. So that's about 0.1% of tweets, is that right? A very small fraction of tweets have this information, anyway, but that's a manageable number. So I'm quite happy to take it on to this next one. And here I'm just saying, I still want Geo not equal to null. I'm not specifying anything else. So it's gonna return all 53 records, or documents, which have some geo information. And it's returning one row per document here, and it's returning everything in that tweet. So if I just scan along here, everything in that tweet is in there, and you can see by the fact that some of them are bigger than others that these don't all have the same structure or the same field contents.

One of the things I can do in this environment (you can't actually do this in the shell itself; this is the added value of using the NoSQL Manager) is look at the data returned in different formats. So this is showing each of the records in a nicely formatted way. And if you just look at how long this tweet is, it starts up there, it gives you a record number by itself, and runs all the way down to line 97 or 98. If I look at the second one, there's no guarantee... well, I'm over 200 now. So this record is clearly far, far larger than the previous one, because the fields in there don't have to be the same. What I can also do here is get it to show me this information as a table. And here, well, ID, created at, that's pretty straightforward. And a lot of these fit into tables quite nicely, but you get to a point, when you get down to user, where user itself has documents within it.
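The "Geo not equal to null" condition used above is written in MongoDB's query language as a small document, `{"geo": {"$ne": null}}` (with `None` in place of `null` when written in Python). The sketch below shows how such a filter behaves, using a hypothetical `matches` function that handles only plain equality and `$ne`; the three sample tweets are invented. A field that is missing altogether is treated like null, which mirrors how MongoDB evaluates this particular condition.

```python
# Invented sample data: one tweet with geo set to null, one with a location,
# and one where the field is absent altogether.
tweets = [
    {"id_str": "1", "text": "no location", "geo": None},
    {"id_str": "2", "text": "with location",
     "geo": {"type": "Point", "coordinates": [53.48, -2.24]}},
    {"id_str": "3", "text": "field missing"},
]

# The filter document: "geo is not equal to null".
GEO_FILTER = {"geo": {"$ne": None}}


def matches(document, query):
    """Return True if the document satisfies every condition in the query.

    Only two kinds of condition are understood by this sketch:
    {"field": value} for equality and {"field": {"$ne": value}} for inequality.
    """
    for field, condition in query.items():
        value = document.get(field)  # a missing field is treated as None
        if isinstance(condition, dict) and "$ne" in condition:
            if value == condition["$ne"]:
                return False
        elif value != condition:
            return False
    return True


with_geo = [t for t in tweets if matches(t, GEO_FILTER)]
print(len(with_geo), "of", len(tweets), "tweets carry geo information")
```

Counting the matches corresponds to the count in the demonstration, and listing them corresponds to the find that returned the 53 complete tweets.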
And so in the table, all it can do is try and give you everything in that sub-document. So you see at this point that the notion of using a table falls over somewhat, because it can't really represent the data in table format.

I'll just miss out a few; I'll go on to this one. What I'm asking for here is the same query, I want Geo not equal to null, but here I don't want the whole record returned, I just want you to tell me what the user dot name is. I'm using the dot notation because user is a sub-document, of which one of the fields is called name. And I also want to see the text of that tweet. So if I run that one, you can see here, I'll go back into the other format. Here, all I've got is the text and, within user, I've got the name. Now, the way this works is that by default, it will produce all of the fields in the record. If I start saying I want user name by putting a one against it, that means I specifically want user name, at which point it says, well, that's all you're gonna get, user name. So if I want text as well, I've got to say text, one. Everything else is not returned, the exception being the ID field, which isn't really ours anyway. And there, if you don't want the ID field, you've got to explicitly turn it off, which is what I've done. There's no ID field listed in there.

Okay, I'm gonna move on. In this script here, all these commands here, we're actually gonna look at this sales order header table. This is the one which we loaded up before. So this is in the sales database, which is down here, and sales order header. Double click on that; again, this is just MongoDB. The manager here is providing you with this information, and I can actually look at the data here, and I can look at that in a table format. And not surprisingly, this looks perfectly okay in the table, because that's effectively what it came from.
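The projection rules described above (name the fields you want with a 1, and switch `_id` off explicitly) can be sketched as follows. The projection document `{"user.name": 1, "text": 1, "_id": 0}` is the kind of thing passed as the second argument to find; `project` is a hypothetical helper that imitates the effect for inclusion-style projections only, and the sample tweet is invented.

```python
def project(document, projection):
    """Apply an inclusion-style projection to one document.

    Fields marked 1 are kept (dotted paths reach into sub-documents);
    _id is kept unless it is explicitly set to 0.
    """
    result = {}
    if projection.get("_id", 1) and "_id" in document:
        result["_id"] = document["_id"]
    for path, include in projection.items():
        if path == "_id" or not include:
            continue
        keys = path.split(".")
        value = document
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break  # path not present in this document: skip it
            value = value[key]
        else:
            # Rebuild the nesting so "user.name" comes back as {"user": {"name": ...}}.
            target = result
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return result


# Invented sample tweet.
tweet = {
    "_id": 7,
    "text": "hello",
    "lang": "en",
    "user": {"name": "Ann", "followers_count": 3},
}

print(project(tweet, {"user.name": 1, "text": 1, "_id": 0}))
print(project(tweet, {"text": 1}))  # _id comes back because it was not turned off
```

The first call returns only the text and the name nested inside user; the second shows that `_id` is included by default.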
There's no problem with showing this in the table because we know there are no sub-documents or anything like that, which is kind of convenient for us when we run these scripts against it. So again, I could do the count, which I won't; that would just count the records. Here I'm saying I want customer ID of 29825. So there I get... that was a findOne, so it's only returned one record, nicely formatted, for that customer ID. I can ask for sales order ID with a value, and here I've just used find, so it will find all of the records where the sales order ID equals that. Again, there is only one; had there been more than one, I'd have got more than one. Again, here I can find all the records where the customer ID is 29825. I won't run that, I'll run the next one, where I'm running the same query in terms of the conditions or the selection criteria, but here I just want you to return the sales order ID, the order date, and the total due, and there we are, 12 records with just the fields requested.

Now, looking at this is all very well, but generally you want to do something more meaningful with it. This shell is exactly the same as the proper shell, which I didn't show you. This is the proper shell; you can see here I've just done db.getCollectionNames(). It works in exactly the same way. You just don't get the same options for displaying the results. Okay, so I'm running exactly the same commands, but I can also run JavaScript code in there. So here, in this line here, I'm declaring a variable called SOHCursor, and this query here is exactly the same as the one before, and if I run that, then the results are placed in that variable, so nothing appears down here. If I want to see what's in that variable, I can run that, just say SOHCursor, and it looks exactly the same. But what I really want to do, having collected this information, is do something with it. I can do anything I like with it.
In this case, I'm just gonna print it, to look a bit similar, but I don't have to print the whole record. Again, within each document in that collection that has been returned, I can say I just want the sales order ID, for example. So if I run this code here, oops, then all I get back is a list of those sales order IDs.

This one down here is a lot more complicated. I think I've got this on the slide, so I'll try and explain what I'm gonna do in here. I'm effectively gonna get data from two different collections. So initially, I'm gonna read the data I want from the first table into a cursor, which I've just shown you an example of. I'm gonna select the fields I know I'm gonna use, and I'm gonna iterate through the cursor using that while construct as I've done before, using the selected field value as the selection criterion for a find on a second table. I'm gonna store those results in a second cursor. I'm then gonna go through the second cursor and put part of the first set of results with part of the second set of results.

So, back to the code. And that's what this is doing here. It's my first cursor being set up, exactly the same query as I had before. And then I'm gonna iterate through that cursor, and at this point here, I'm gonna select the sales order ID and assign that to a variable called sod ID. And then I'm going to create my second cursor by running a find on sales order details, which is the second table. And for the sales order ID, I want to put in the value of that variable as it currently is. So when I run all of that, what I get is, from the first table, the sales order ID, and from the second table, the detail ID, for that specific customer ID, 29825. So that's a bit like joining tables together in code. This one I don't think I'll run. We'll move on to aggregation now.
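The two-cursor "join in code" described above can be sketched in Python with plain lists standing in for the two collections. The field names (`SalesOrderID`, `CustomerID`, `SalesOrderDetailID`) are assumptions based on the names spoken in the demonstration, the data is invented, and `find` here is a small local helper that imitates an equality-only query returning a cursor-like iterator.

```python
# Invented stand-ins for the sales order header and sales order details collections.
headers = [
    {"SalesOrderID": 1, "CustomerID": 29825},
    {"SalesOrderID": 2, "CustomerID": 29825},
    {"SalesOrderID": 3, "CustomerID": 11111},
]
details = [
    {"SalesOrderID": 1, "SalesOrderDetailID": 10},
    {"SalesOrderID": 1, "SalesOrderDetailID": 11},
    {"SalesOrderID": 2, "SalesOrderDetailID": 12},
    {"SalesOrderID": 3, "SalesOrderDetailID": 13},
]


def find(collection, query):
    """Imitate an equality-only find: yield documents matching every field."""
    return (
        doc for doc in collection
        if all(doc.get(field) == value for field, value in query.items())
    )


def join_in_code(customer_id):
    """Pair each of a customer's order ids with the ids of its detail lines."""
    pairs = []
    header_cursor = find(headers, {"CustomerID": customer_id})  # first cursor
    for header in header_cursor:
        sod_id = header["SalesOrderID"]  # value carried over to the second query
        detail_cursor = find(details, {"SalesOrderID": sod_id})  # second cursor
        for detail in detail_cursor:
            pairs.append((sod_id, detail["SalesOrderDetailID"]))
    return pairs


for order_id, detail_id in join_in_code(29825):
    print(order_id, detail_id)
```

The structure is the same as in the shell script: an outer loop over the first cursor, and for each document a second query whose criterion is taken from the current document.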
Aggregation in SQL terms means grouping things and adding up or whatever. So the simplest example is probably just a count, where you count the number of records, or you can take a field and sum the values in that field over some grouping of the records. So in this example here, we use the aggregate function. And we say the grouping that we're interested in is the sales order ID. And then for each sales order ID, I want you to create a field called total, which is the sum of the line total field. And then I want you to give me a field called count, which is going to be the number of records within each sales order ID. So it's very similar to a grouping-type SQL query, which is fine if you know SQL, but probably doesn't mean that much if you don't. So I'll explain it like that.

Okay, so if I run that, it takes a while to run. You can see how many records it's done. And then it displays the results, and again, each ID is now going to be a unique value. So for this sales order ID, the total value of that sales order was 189.97 and there were three items in that sales order. So you can do groupings and aggregations. I've just used sum (when you sum the value one, it's effectively a count), but there are other options here, like averages and so on and so forth.

What you can do in the latest version of MongoDB, which you couldn't do in any of the earlier versions, is actually perform a proper join on data. So here I'm doing a join on the sales order ID, between the sales order header table and the sales order details table. I think actually I have a slide of this. So here, where we're joining things, we've got sales order header and we've got sales order details. Within sales order header, there is a field called sales order ID. And equally, within sales order details, there is a field called sales order ID.
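The grouping just described, a total and a count per sales order ID, can be sketched in Python. The `PIPELINE` constant shows the general shape of a `$group` stage as it would be passed to aggregate; the field names `SalesOrderID` and `LineTotal` are assumptions based on the names spoken in the demonstration. The function below computes the same thing by hand over invented data, so no server is needed.

```python
from collections import defaultdict

# Shape of the aggregation described in the talk (field names are assumptions):
# group on the sales order id, sum the line totals, and sum 1 per document to count.
PIPELINE = [
    {"$group": {
        "_id": "$SalesOrderID",
        "total": {"$sum": "$LineTotal"},
        "count": {"$sum": 1},
    }}
]

# Invented detail lines.
details = [
    {"SalesOrderID": 1, "LineTotal": 10.0},
    {"SalesOrderID": 1, "LineTotal": 20.5},
    {"SalesOrderID": 1, "LineTotal": 5.25},
    {"SalesOrderID": 2, "LineTotal": 8.0},
]


def group_by_order(documents):
    """Hand-written equivalent of the $group stage above."""
    groups = defaultdict(lambda: {"total": 0.0, "count": 0})
    for doc in documents:
        group = groups[doc["SalesOrderID"]]
        group["total"] += doc["LineTotal"]  # plays the role of {"$sum": "$LineTotal"}
        group["count"] += 1                 # plays the role of {"$sum": 1}
    return [
        {"_id": order_id, "total": g["total"], "count": g["count"]}
        for order_id, g in groups.items()
    ]


for row in group_by_order(details):
    print(row)
```

Each result row has the grouping value in `_id`, so every sales order ID appears exactly once, as in the output shown in the demonstration.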
And what I'm going to ask it to do is, for each of the sales order IDs in here, collect together all of the records, or documents, in the sales order details collection which have the same value for the sales order ID. And that's what this query is doing here. So, from: this is the second table. The local field is sales order ID in sales order header, and the foreign field is sales order ID in sales order details. And with as, I'm saying this is actually going to create a new field called sales order details. The fact that the collection is also called that doesn't make any difference at all.

So if I run that, what I get is this, and this isn't the best way of looking at it. If I look at this in the text view that I've got up here, what I've got for the first record returned is sales order header information. This is all just standard information from sales order header. Until I get down to here, at the end of it, where it's added in my sales order details. This is the field I named. And within there, it has created an array of all of the matching records from the sales order details collection where the sales order ID matches. In fact, that's an array which is going to have 12 records in it, one for each one.

So that's a way of actually creating records, or documents, which contain sub-documents. And that was one of the ways, or still is one of the ways, in MongoDB where you avoid the need to have relationships defined: by including within the main document the sub-documents which would have been found had you done a query on a relational database system, where you want, in this case, all of the detail records associated with the header record. They're all stored within the same document, which can speed up searches when you're running them. What you can also do, a very similar thing, is down, oops, okay, so that's a very similar one there. And then I'm putting that into a cursor called cursor join.
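The join described above uses the aggregation stage `$lookup`, whose four settings are `from`, `localField`, `foreignField` and `as`. The constant below shows the shape of that stage with collection and field names assumed from the talk; the `lookup` function is a hand-written imitation over invented data, showing how each header document gains an array of its matching detail documents.

```python
# Shape of the $lookup stage described in the talk (names are assumptions).
LOOKUP_STAGE = {
    "$lookup": {
        "from": "SalesOrderDetails",          # the second collection
        "localField": "SalesOrderID",         # field in the header documents
        "foreignField": "SalesOrderID",       # field in the detail documents
        "as": "SalesOrderDetails",            # new array field added to each header
    }
}

# Invented stand-ins for the two collections.
headers = [
    {"SalesOrderID": 1, "CustomerID": 29825},
    {"SalesOrderID": 2, "CustomerID": 29825},
]
details = [
    {"SalesOrderID": 1, "SalesOrderDetailID": 10},
    {"SalesOrderID": 1, "SalesOrderDetailID": 11},
]


def lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Attach to each local document an array of the matching foreign documents.

    A local document with no matches still appears, with an empty array.
    """
    joined = []
    for doc in local_docs:
        combined = dict(doc)
        combined[as_field] = [
            other for other in foreign_docs
            if other.get(foreign_field) == doc.get(local_field)
        ]
        joined.append(combined)
    return joined


settings = LOOKUP_STAGE["$lookup"]
for doc in lookup(headers, details, settings["localField"],
                  settings["foreignField"], settings["as"]):
    print(doc)
```

The result is one document per header, each carrying its detail lines as sub-documents, which is the embedded structure the speaker goes on to discuss.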
And down here, I'm using that cursor to actually create a new collection called sales join two. Before, you were just seeing the results of that query, but with this one, which I won't actually run, I'm creating a new collection called sales join two, which I've already got here as a collection, and that contains the results of the query. Again, if I expand that, you can see at the bottom here the sales order details and all of the records for that first one, and so on. So it's a way of joining the tables together and creating new collections. Okay, I think that's covered the general select queries with a couple of search criteria, and we've looked at aggregates and we've looked at joining. So now, just to finish off, what I'd like to do is show you in a very simple way how we can use R and Python to run this sort of thing. So this is RStudio. It's very simple, and the reason it's very simple is because you can use this library called RMongo, but it's not actually brilliant. I'll try and demonstrate the reason if I just run that code there. So what I've done is I've connected to this database called peter and I've run a query, dbGetQuery. These are functions contained within RMongo. I give it mg1, which is that connection, then the collection, and an empty criteria there, which means return everything. So query one is going to be defined as a data frame, an R data frame, containing the results of this query. And showing up here, I've got query one, six observations of 11 variables. If we look at that, it's a table, because the data in there is tabular, or structured, data. So that works perfectly well. You can take it from there in R to do whatever you may need to do with it. The problem comes if I try and do this with my Twitter data. So this is the database Twitter, and Brexit is my collection. ID string is a unique identifier in a tweet.
So this should only return one document, or rather the query two data frame should only have one observation in it. But in fact, if I look at this, it's got four. And frankly, the data in here is pretty well all over the place. Now, in fairness to the providers of this RMongo interface, they do actually tell you about this. Let's just flip through the slides. The joins that I've mentioned, using the lookup, we did a minute ago. And you can do that in code if you want to; I showed you the results of that anyway. For the R package RMongo, you can connect quite happily. That's not a great problem. The problem is it can only deal with limited, simple, effectively relational tabular structures. And it does actually warn you of that, so that's not bad. If we look at Python, again, PyMongo can be installed. You can connect to the system in a very similar way. You get a dictionary object back. And again, it's quite convenient for holding the data returned from MongoDB, because it quite happily deals with JSON formatted data. And because they're closer in the way they're constructed, the commands in PyMongo are very similar to those you find in the Mongo shell. So let me just show you PyMongo. I'm running this in a Jupyter notebook. I've got Python running in the background, or Jupyter running in the background. There's a first set of lines of code here. I'm importing PyMongo. I'm importing MongoClient from that. I'm connecting the client, I'm connecting to the database, and I'm connecting to the specific collection, the Brexit collection. So if I run all that, you don't get any result from that because it's all done in the background. But if I run find one, to find the first one in there, you can see down here that complete tweet as I was getting before. And there are several others; these are very similar to the ones I was running before.
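
The PyMongo steps described here can be sketched roughly as follows. This is a sketch under assumptions rather than the notebook from the webinar: the connection URI (a local server on the default port), the lower-case names `twitter` and `brexit`, and the helper names `first_tweet` and `mentioned_names` are all hypothetical. The import of PyMongo sits inside the function so that the dictionary helper can be tried without a MongoDB server.

```python
# Projection: 1 keeps a field, 0 drops it; dotted names reach into sub-documents.
TEXT_AND_MENTIONS = {"_id": 0, "text": 1, "entities.user_mentions.name": 1}


def first_tweet(uri="mongodb://localhost:27017/", projection=None):
    """Return the first document of the tweet collection as a Python dict.

    Needs PyMongo installed and a MongoDB server at `uri` (both assumed).
    """
    from pymongo import MongoClient  # imported lazily; see the note above

    client = MongoClient(uri)                 # connect the client
    collection = client["twitter"]["brexit"]  # assumed database and collection names
    return collection.find_one({}, projection)


def mentioned_names(tweet):
    """Pull the user-mention names out of a tweet dictionary."""
    mentions = tweet.get("entities", {}).get("user_mentions", [])
    return [mention["name"] for mention in mentions]


# Made-up document shaped like the projected result, to show the dict handling.
sample_tweet = {
    "text": "Example tweet text",
    "entities": {"user_mentions": [{"name": "Alice"}, {"name": "Bob"}]},
}
names = mentioned_names(sample_tweet)
```

With a server running, `first_tweet(projection=TEXT_AND_MENTIONS)` would return a single dictionary, and ordinary dictionary access such as `result["text"]` then picks out the part you want.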
So here, I'm finding one and I only want to display the text value. I'm finding one here where I want the entities user mentions name only, and so on and so forth. So if I just take this one at the end, I'm going to run this command to return the text and the user mentions name, but I'm actually assigning that to a variable. Again, if I run that, I don't appear to get anything returned. But if I print that, I do, of course, get the information I wanted, the text and the user mentions name. And I don't have to print all of it or make use of all of it. I can select the part of it I want in that cell there and just return the text. And this is just to prove the point that it is, in fact, a dictionary object in Python. So again, once you've got the data back, you've got the whole of Python to do whatever it is you need to do with the returned data. So, back to the last few slides. If you want more information on MongoDB, the official site is MongoDB.com, and that includes links to the official documentation. The documentation has certainly been updated for 3.2. It's really good documentation. If you have used SQL and you're thinking of moving to MongoDB, they have sections in the documentation giving examples of SQL type code and what the MongoDB equivalent is. There's a MongoDB University where you can take self-study courses. You can find the community areas, where you can find your local user group, and there are support forums listed for you there. And then, of course, if you just want to search, there's always Stack Overflow, which has a MongoDB area, or you can just Google the question that you've got. Books: there are lots of books, but the same caveat as usual applies to them. They're not necessarily based on the latest version of MongoDB. 3.2 only came out in March and I don't think you'll find a book with that in.
So The Definitive Guide will give you more than you probably need. This one down here, NoSQL with MongoDB in 24 Hours, is quite a basic book, and it also includes details of how to use JavaScript in the shell. That is it. So thank you. Thank you for listening. Bye-bye.