Hey, everyone. I think that was overselling my master's in data and analytics. But hey, everyone, thank you so much for coming. It's the last session, but you're all here, thank you so much. I'm Sangarshanan, and this is "Build a Database with Python". So, is it a good idea or is it not? Let's find out in this talk.

Cool, so I'm going to be using a lot of code in this talk, and most of it is inspired by DDIA, which is Designing Data-Intensive Applications. It's an amazing book by Martin Kleppmann, so check it out if you haven't. The code and the slides will all be available at the QR code, so you can just look at the code there, because it might be a little small on the screen and a little hard to read. It would be much easier that way.

Cool. So before we start building a database with Python, let's take a step back and come on a journey with me to my past. It's a little flashback, so I'm going to tell you a story, and imagine it in black and white, because, you know, it's from the past. I was an intern at the company I'm currently working at, and I was in a meeting where they were discussing back-end systems and how we were going to build an application, and they were deciding on which database to use. And I was an intern, right? So I was all excited and I was like, "Guys, you know what we should do? We should use these cool DBs!" I was just mentioning random names: Spanner, CockroachDB, Dolt, this Git-for-data thing.
"We can use amazing things, let's do it!" And the senior person in the room immediately recognized that I was the new intern, pulled me aside after the meeting, and gave me a little piece of advice which has kind of stuck with me until now. He told me: whenever you want to build an application, a back-end system, and you have to choose a database, go with Postgres. And if you don't want to go with Postgres, give me three really solid reasons you are not going to use Postgres for this. And if you can convince me with three reasons, then, you know what, I'm going to start debating with you, because I'm still not convinced. Until then, don't even talk to me.

And I felt weird about that, because I thought that devs loved playing around with the cool new toys: a new thing pops up and we just start using it. But with databases that doesn't happen, right? We have these industry standards that have stuck around since the 70s, which is as old as my dad. Which is nuts. We like to use cool new stuff, so why are databases sitting around for so long? What's so awesome about them? What's so mystical about them that a single database that's existed for so long became the standard and continues to be the standard? Actually, the short answer is the community of people who maintain and develop it for us. But still, it's standing the test of time, and software normally doesn't stand the test of time.

So "it's a standard" is not a good enough reason to build a database, is what I learned by talking to sane people. Here are some other reasons why you can go and build a database on your own. The first reason is that you can break down abstractions. Databases are like black boxes, right?
You don't know exactly what's happening inside. You write a query, it gives you a result. You just spin up a Postgres instance with docker run, you write a query, you insert data, you query it, it comes out. But what's actually happening? A lot of people don't understand what's happening; they just work with it, and it works out of the box. But when you start working with databases, you face these bad performance issues, right? People write bad queries, and bad things happen, really crazy things, like a one-in-a-million chance where two things running in parallel clash. Those are the really weird bugs that sometimes happen in databases. And if you write good queries, you have fewer worries. I work as a data engineer, and good queries make me really happy. And it's fun, it's a lot of fun. So building databases is fun.

Cool. So, as this Victorian man once said: what make a database of make it a database? When I started out wanting to build my own database, here's what I thought a database looked like. That's me, and I thought there was a base that I talked to, and there is data. I don't know where that is, and I don't know what the base is made of, but this is essentially what a database looked like to me. Maybe it is, we'll find out.

Cool. So the first part we need to figure out is the base. The base is how you communicate with your database, how you talk to it, and we need to figure out how we're going to do that.
We're going to use our terminal for that. And to talk to a database from the terminal, we're going to build something called a REPL, and Python is awesome in this way: you can build a REPL really, really easily. I'm going to use something called prompt_toolkit, which makes it very easy to spin up a REPL. So this is the code I was talking about. What this code essentially does is spin up a while-True loop in your terminal, so it creates a REPL for you. And then there's this mumbo jumbo: if you want to quit the while-True loop you have to press Ctrl-D or Ctrl-C, so I have these exit conditions for that. And essentially what happens is there's a function called the query executor. The query executor executes the SQL query that we give to the database and returns the results. So we're going to look at what exactly this query executor does and how it can be used to talk to your database.

So what is SQL?
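The loop described above can be sketched roughly like this. `execute_query` is a hypothetical stand-in for the talk's query executor, and I'm using plain `input()` so the sketch has no dependencies; swapping it for prompt_toolkit's `PromptSession.prompt` would give you history and line editing for free.

```python
# Minimal REPL sketch. execute_query is a placeholder for the real query
# executor; a real one would parse, plan, and run the query.
def execute_query(sql: str) -> str:
    return f"executed: {sql}"

def repl(read=input):
    while True:
        try:
            line = read("db> ")
        except (EOFError, KeyboardInterrupt):  # Ctrl-D / Ctrl-C exits
            break
        if line.strip().lower() in {"exit", "quit"}:
            break
        print(execute_query(line))
```

The `read` parameter is just dependency injection so the loop can be driven by something other than a live terminal, which also makes it trivial to test.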
SQL is Structured Query Language. It's just the language our database is fluent in. And there are so many dialects of SQL that it puts me to sleep. It's just a query language, and there's so much SQL to explore, but I'm going to define a really simple grammar for my own database. So this is going to be my simple grammar for now. We're going to have DDLs, DMLs, and DQLs. What does that mean? DDL is data definition language, which I can use to define something in my database. Then I can use DMLs to manipulate something in my database. I also have wildcards, like *, and then I have my query language keywords, like FROM and WHERE. So this is essentially the grammar for my SQL.

So we're going to look at the query executor next. We have now figured out how to talk to a DB: we're going to use a REPL, we have the query executor, and we have SQL. What this query executor does is parse your query and try to understand it. Once it understands it, it plans the execution, and once it has planned the execution, it just executes it and gives you the results. Simple. Let's look at what each of those steps is doing.

Cool, so first I start by writing a very simple query, right?
It's going to be SELECT * FROM customer WHERE name = 'Jojo', and yes, that's a JoJo reference. Now the query gets parsed, and as it gets parsed, our database knows that SELECT is a DML keyword, * is a wildcard, FROM is a DQL keyword, and customer is an identifier, which is essentially my table reference. And then for WHERE name = 'Jojo' you have the left-hand side of the operator, the right-hand side of the operator, and the operator itself. So your query executor parses this and understands: this token is that thing, and I have to do this.

Once you have this grammar applied to your query, you can create a plan. For the query plan, a database creates something called an abstract syntax tree. This might sound awfully similar to what programming languages do, and that is true: that's essentially how compilers work. They parse, build ASTs, generate a plan, and then execute it. It's the exact same concept. So it generates this tree for you, based on the query I just wrote, SELECT * FROM customer WHERE name = 'Jojo'. Once it has that plan, it just has to go and execute it. It knows it has to select all the data, which is a bad idea, by the way. Do not write SELECT * queries. Do as I say, not as I do: select only the columns you actually want.

So the database now knows what to do, right? It knows the query, so it just has to go and execute it, and it does. But where is my data?
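The parsing step above can be sketched as a toy tokenizer. The category labels follow the talk's DDL/DML/DQL split; everything unrecognized is treated as an identifier. This is my own minimal take, not the talk's actual implementation.

```python
# Toy tokenizer for the tiny SQL grammar defined above.
KEYWORDS = {
    "CREATE": "DDL", "INSERT": "DML", "SELECT": "DML",
    "FROM": "DQL", "WHERE": "DQL",
}

def tokenize(sql: str):
    tokens = []
    for word in sql.replace("=", " = ").split():
        upper = word.upper()
        if upper in KEYWORDS:
            tokens.append((KEYWORDS[upper], upper))
        elif word == "*":
            tokens.append(("WILDCARD", word))
        elif word == "=":
            tokens.append(("OPERATOR", word))
        else:
            tokens.append(("IDENTIFIER", word.strip("'")))
    return tokens
```

Running it on the example query labels each token exactly as described: SELECT becomes DML, * a wildcard, FROM and WHERE become DQL, and customer, name, and Jojo become identifiers.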
We just discussed how to talk to a database, so our base part is kind of clear. Not fully clear, but it's there. Now we have the actual data part, which is the fun part. To start with, I thought, why not just save the data to memory? No big deal. So first I defined an in-memory database: a hash map for every table, and I insert a new key every time someone wants to insert something. When I execute this, I define my database, I have test, which is my table name, I call the insert method with the column name and the actual value, then I insert age, and when I do a db.select it gets the value out for me. This is actually a really simple database that I just wrote as a single class in Python. It's technically a database, but there's one thing missing: persistence. If my class is run again, the data is gone. Poof. It's in my RAM, it's not persisted anywhere. So, to fix that: remember to flush.

What we can do is, rather than writing to a dictionary, just append to a file. And I encode the records so that I write binary rather than text, to save space. And when reading records, if we encounter a key that does not exist or has been deleted by someone, we return something called a tombstone. For my tombstone I'm using an actual tombstone emoji, which is a bad thing to do, because it occupies a lot of space in your database. So don't do that. But it's fun. It's a tombstone.

Cool. So what I loved about having data in memory was the ability to look things up super fast, and the flexibility of hash maps, dictionaries in Python, because they're super easy and flexible. But the problem is that my file is going to grow really, really big as I keep appending new things, and I have to go through the records one by one if I want to find a single row, which kind of sucks.
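The append-only file store with tombstones can be sketched like this. The class and method names are my own, and the record format (one `key,value` line per write, encoded to bytes) is a simplification of whatever encoding the slides use; reads scan the whole log and keep the last value seen, which is exactly the O(n) problem described above.

```python
# Append-only file store sketch: inserts append encoded "key,value" records,
# deletes append a tombstone marker, reads scan the log (last write wins).
TOMBSTONE = "\U0001FAA6"  # the headstone emoji from the talk -- fun, but wasteful

class AppendOnlyDB:
    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # make sure the data file exists

    def insert(self, key, value):
        with open(self.path, "ab") as f:
            f.write(f"{key},{value}\n".encode())

    def delete(self, key):
        self.insert(key, TOMBSTONE)  # deletion is just another append

    def select(self, key):
        result = None
        with open(self.path, "rb") as f:
            for line in f:  # O(n): every record in the file gets scanned
                k, _, v = line.decode().rstrip("\n").partition(",")
                if k == key:
                    result = v
        return None if result in (None, TOMBSTONE) else result
```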
I don't want to do that every time. So how do we fix this? We fix it with a cache. We're going to bring back our in-memory dictionary, but rather than making the hash map our primary data storage, we're going to make it kind of like a cache. We're going to keep our data on disk, and then have a hash map that stores references to the data on the disk. What I just described is an index: an additional data structure that we maintain, which makes our reads much faster at the cost of one extra write on every write to the database. So every write goes to the file and to the index, and when you want to look something up, you just consult the index and it tells you exactly where the data is. You don't have to go through every single line of the file.

Cool. So a hash index is essentially just a dictionary, and it looks like this. If someone's looking for key 2, they can seek directly to the ninth byte offset in the data file. They don't have to go through offsets zero to nine one by one; that sucks. They just go to nine directly and get the data they want. They're happy.

So this is the implementation of the hash index. It might look really bad on the screen, so the code is at the link and you can take a look at it there. Cool. So with hash indexes, if we know the offset of a record, we can get it really fast. But the problem, again, is that we have append-only logs, right?
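The hash index on top of the file store can be sketched like this; class and method names are my own, not the slide code. The dict maps each key to the byte offset of its latest record, so a read seeks straight to the record instead of scanning.

```python
# Hash-index sketch: the in-memory dict maps key -> byte offset of the
# latest record in the data file, so reads seek instead of scanning.
class IndexedDB:
    def __init__(self, path):
        self.path = path
        self.index = {}               # key -> byte offset of latest record
        open(path, "ab").close()

    def insert(self, key, value):
        with open(self.path, "ab") as f:
            self.index[key] = f.tell()            # record starts at end of file
            f.write(f"{key},{value}\n".encode())  # the "extra write" is the dict update

    def select(self, key):
        if key not in self.index:
            return None
        with open(self.path, "rb") as f:
            f.seek(self.index[key])   # jump directly to the record, no scan
            _, _, value = f.readline().decode().rstrip("\n").partition(",")
            return value
```

Note that because later inserts simply overwrite the offset in the dict, a lookup always lands on the most recent version of the key, even though all the old versions are still sitting in the file.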
The log keeps getting written to, our file gets too big, and we're eventually going to run out of disk space. To solve this, we're going to break our files down into segments. We close the current file, start appending to a new file, and just keep accumulating segments. Which means we will also have duplicate inserts, right? The same record can get inserted twice if someone wants to replace it. What we do to fix this is run a background process that combines these segments into a single segment. That process runs in almost all databases that use indexes like this, and it's called compaction. Compaction in databases essentially means you throw out duplicate keys and keep only the most recent update of every key, and you do it in the background so that the user doesn't notice.

So this is what compaction is doing, and this diagram is straight from DDIA. Compaction generally makes segments much smaller, and we can also merge several segments together at the same time. What happens is that segments are never modified in place; the result is written to a new segment. You have these two old segments, a new segment gets created, and the old segments get deleted, or "vacuumed", as databases call it. After compaction we switch our read requests over to the new segment and ignore the old segments, which get vacuumed or deleted.

Cool. Awesome. So we have solved lookups; they're super fast.
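The merge-and-compact step can be sketched as a single function. This uses plain-text records for readability and assumes segments are passed oldest first, so later values overwrite earlier ones; in a real system this would run in a background thread while reads continue against the old segments.

```python
# Compaction sketch: read segments oldest-to-newest, keep only the latest
# value for each key, and write one fresh segment. The old segment files
# can then be deleted ("vacuumed").
def compact(segment_paths, out_path):
    latest = {}
    for path in segment_paths:              # oldest segment first
        with open(path) as f:
            for line in f:
                key, _, value = line.rstrip("\n").partition(",")
                latest[key] = value         # newer writes overwrite older ones
    with open(out_path, "w") as f:
        for key, value in latest.items():
            f.write(f"{key},{value}\n")
    return latest
```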
We have append-only writes, which is awesome. We also have crash recovery now, because we have an in-memory data structure plus append-only logs, so we can recreate our state if something happens. And we also get concurrency, by the way, because we have a single writer writing to the files, and you can read in parallel, since a single consistent state gets created, right? You can read the segments in parallel and you'll get the same state back.

But there are two problems here that we still have to solve. One is that there can be a lot of keys, which is a problem for the in-memory hash map, and the other is range queries, which hash indexes are bad at. To fix the range query problem, we're going to sort our segments by key. And we can get sorted segments not by just appending to a file and keeping a hash map, but by using a balanced tree. There are many balanced trees out there; what I'm using is a red-black tree, a self-balancing binary search tree which can hold sorted data in memory. If you want to know more about in-memory balanced red-black trees, you can check out other Indian men on the internet who can explain them much better than I can. Essentially, it maintains a sorted structure in memory, so you can keep data in memory, sorted, and look things up really easily.

But again, we don't have persistence, because our tree is in memory. It's a data structure, right? It's in memory. So we need to convert this in-memory tree into something that can persist in a file. To do that, there's something really cool that people came up with a very long time ago, and it's still a standard in the industry right now, something that's really awesome.
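As a stand-in for the red-black tree, here is a tiny memtable kept sorted with `bisect`. It has the same observable behaviour the talk cares about (keys stay sorted, range scans are cheap), though inserts here are O(n) list inserts rather than the tree's O(log n); it's an illustration, not a replacement.

```python
import bisect

# Sorted memtable sketch: the key list is kept in sorted order on every
# put, so range queries are just two binary searches plus a slice.
class MemTable:
    def __init__(self):
        self.keys, self.values = [], []

    def put(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.values[i] = value          # overwrite an existing key
        else:
            self.keys.insert(i, key)        # keeps the key list sorted
            self.values.insert(i, value)

    def range(self, lo, hi):
        # Everything in [lo, hi], already in sorted order.
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        return list(zip(self.keys[i:j], self.values[i:j]))
```

Flushing such a sorted structure to disk gives you sorted segments, which is what makes the range queries above possible.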
That's the B-tree. What we do is take the nodes of our tree and convert them into fixed-size blocks, or pages, because that's how things are written to disk. We use pages and blocks so it conforms to how the OS writes data, and each page contains pointer references. It looks like this: we have a root node, we have keys, and we have references between the keys. The number of references per page is called the branching factor, and we have child nodes. It's kind of like a tree, but on our disk: we have nodes, we have references, and we branch out. A quick fact here about the branching factor: if we have a branching factor of 500 and four levels, we can store up to 256 TB of data. Which is awesome: a small tree that can store a lot of data.

I'm not going to break down in detail how inserts and updates work in a B-tree; there are a lot of tutorials for that. But essentially, it tries to insert into a page if there is space, and if there is no space, it splits the page into two and inserts into the appropriate half. So it always maintains sorted order.
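That 256 TB figure checks out as back-of-envelope arithmetic, assuming 4 KB pages (the page size DDIA uses for this example; the talk doesn't state it here): four levels of branching factor 500 address 500⁴ leaf pages.

```python
# Back-of-envelope check of the 256 TB figure, assuming 4 KB pages.
branching, levels = 500, 4
page_bytes = 4 * 1024                           # 4 KB per page
capacity = branching ** levels * page_bytes     # bytes addressable at the leaves
print(capacity // 10**12)                       # -> 256 (terabytes)
```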
So say you're looking for where 251 should go: the 100-to-200 range is down one branch of the second level, the 200-to-300 range is down another, and you just keep traversing the tree until you find where the data should go, and you do your inserts and deletes based on that. And if there is no space, you split the page, and now you have more space, still sorted. That's how it works. B-trees have been around for a long time, since the 70s, and they're the basis of almost every relational and non-relational database that exists. With a B-tree, the operations make sure that the tree stays balanced and always has a depth of O(log n), never more than that.

So B-trees solve a lot of problems for us, but we can do more than just this. Rather than overwriting a page, we can create a copy of the page and just change our reference to it. And by doing that we can do something really, really cool: MVCC, which is multi-version concurrency control. Rather than overwriting, we create a copy and change the reference, so that if a transaction is happening, it has access to both the old pages and the new pages that branch out from the old pages. It's kind of like Git, right? You have these, what do you call them, commits that come out of a page, right?
It's just like you have multiple commits, as everyone does, and if you're writing something, you have access to only your commit, so the data is always at the state you created it in. If you want to know more about this, talk to me; I'm super excited about MVCC, so I can talk to you about it all day.

Also, we can have something called the WAL, the write-ahead log, which databases use: every operation on the B-tree gets appended to an append-only log, so that you can recreate the tree whenever you want. And there are several amazing things about databases that I haven't broken down. You can implement ACID transactions; I haven't even gotten into that yet. My database is completely not ACID, it's very naive. You can also have transactions; that's not something I have implemented or started on yet, that's another possibility. It kind of gives you goosebumps. You can make it distributed as well. Right now just one person can use the database; you could make it distributed, where multiple people can use the database at the same time and the same state gets returned every time. And you can do that with MVCC implemented, because you have multiple references, so it's linearizable: every operation to the database can be traced and placed on a single linear timeline, even though it's happening in parallel. That's another interesting concept, which I would love to talk about if you all want to.

So there are so many things you can do in the future with a database, but to start with, this is how you go about building one. That's what I wanted to talk about, and I am done. Thank you so much for coming and listening to my ramblings. Any questions, just shoot.
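The copy-instead-of-overwrite idea behind MVCC can be sketched like this; the names are my own invention, and a real database versions pages inside the B-tree rather than whole maps, but the principle is the same: a commit copies the current state, so readers holding an older version number keep a consistent snapshot.

```python
# Copy-on-write snapshot sketch in the MVCC spirit: each commit builds a
# new version of the map instead of mutating the old one.
class VersionedStore:
    def __init__(self):
        self.versions = [{}]                  # version 0: empty store

    def commit(self, updates):
        snapshot = dict(self.versions[-1])    # copy, never overwrite
        snapshot.update(updates)
        self.versions.append(snapshot)
        return len(self.versions) - 1         # version number for readers

    def read(self, key, version=None):
        if version is None:
            version = len(self.versions) - 1  # default: latest committed
        return self.versions[version].get(key)
```

A reader that pins a version sees the world exactly as it was at that commit, no matter how many commits land afterwards, which is the "access to only your commit" behaviour described above.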
I'm here. Again, anyone online, don't hesitate to ask questions.

This might be a silly question, as my database knowledge is nonexistent. Well, now I know a little bit. When you talked about compaction, you showed a nice example where you had lots of repeated things, so it was pretty obvious that you were going to get something smaller. But what if there's not much to compact in the two segments? Will you merge them and get a bigger segment, or will you not compact them?

It kind of depends on the size. We have a fixed size for our segments, so if compaction is going to make the segments smaller, we create new segments. But duplicates only happen when there are new inserts, right? If there are no new inserts into the database, we don't need compaction, so it doesn't compact. It runs, but it doesn't compact. Yeah. Any other questions? Any questions online?

Hi, thanks for the interesting talk; that was a lot of information coming through, and some of it is still processing. I have a question about your last slide. You showed the hands emoji. Is it praying, or is it a high five?

Oh, this one? It's a prayer, thank you.

Could have been a high five.

It could also be a high five. Thank you. Anyone else?
I will stop for several minutes, so don't be shy. If anything is burning in your brain about databases, this is the guy to ask.

You can also reach out to me after the talk anyway; I'm just going to be roaming around, so yeah.

So just a quick question regarding write-ahead logs. I can't really remember exactly, but basically, is it serialized, only one writer that writes to that log at a time? And that's why you can reproduce history when you read it, if you reconstruct the log later on after a crash?

Yeah. So in the standards that are out there right now, there are two different kinds of logs: Postgres has the WAL and MySQL has binlogs. Essentially, it just writes every operation that happens to the B-tree, and it's a single writer thread. Even though operations can happen in parallel, we have a logical timestamp for every operation, and each one gets appended to an append-only log. So wherever your database crashed, you can just re-run those operations top to bottom, in the order they were written, and your state for the B-tree will be recreated and you can run queries on it.

What's the best database?
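The replay-after-crash behaviour described in that answer can be sketched in a few lines. The entry format here is my own invention; the point is only that a single, serialized writer order makes replay deterministic.

```python
# WAL replay sketch: the single writer appends every operation in commit
# order; replaying the log top to bottom rebuilds the exact same state.
def apply(state, entry):
    op, key, value = entry
    if op == "put":
        state[key] = value
    elif op == "delete":
        state.pop(key, None)

def replay(wal):
    state = {}
    for entry in wal:      # strictly in the order the writer appended them
        apply(state, entry)
    return state
```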
I did not say Postgres is the best database; if I said that, I am sorry. I just mentioned Postgres because that's what I have been told by my mentors, people who know things better than I do. It's one of the standard RDBMSs that people use, at least. It can be both Postgres and MySQL; I really mean Postgres slash MySQL. If you want to write an application, just use Postgres or MySQL for it, because almost anything you can think of that's not super niche can be represented in a relational manner. Even when you think you need to scale and want to use something else, you can usually represent it in a relational database and write far more performant queries on an RDBMS. So I just meant RDBMSs; I mentioned Postgres, but it can be Postgres or MySQL.

Oh yeah, one of the constraints I faced was building this in Python, which is a problem I haven't touched on in these slides, even though this is a PyCon. I realized it's kind of difficult to do OS-level operations in Python. If you want to work with pages and things like that, I mean, you can do it in Python, but to implement a proper B-tree you can't really get fine-grained access to the pages and the disk, things like mmap, the way POSIX gives you access to pages in your file system. Python abstracts that away into modules it lets you use, like the os module. That's why databases are written in C and C++. So if you want to write a production database, maybe think twice before you use Python. That was a constraint I got around by writing a very hacky implementation of what I thought was a file system, but it wasn't, so it barely works. It's a toy implementation.
It's a toy database, not a production database. It was just a learning experience for me.

So you basically built a REPL for this toy database. Do you also have an API, so you can use it from Python code, or is it just the REPL?

What, sorry? I didn't catch that.

Basically, can you use it from another Python app? Do you have, like, a client?

Oh, got it. So I did not implement a client for it, because it doesn't run anywhere like on a server; it's just a REPL interface. But that would be fairly easy to do: I would just run it somewhere and add something like a JDBC connector on top of it, just a driver. Yeah.

My follow-up question would have been about what kind of isolation level you offer, but if you have only one client anyway...

Oh yeah, that's what I touched on on the previous slide. What if there are several people using it through different interfaces, right? Because right now there's just one REPL that someone can open. That's where ACID comes in: you have to make sure that modifications to the database happen in a serialized way, so that you can place them on a timeline. That's something you have to ensure. It's a big problem, right?
Yeah, it's a big problem. I think that's one thing you can solve by creating a copy every time. Whenever someone queries the database, you take a snapshot of the state of the database right now, and they make changes to that snapshot. If someone else writes another query, they make changes to their own snapshot, and if somebody commits a transaction, anyone who queries after that commit reads from that commit. So everyone always has access to the state as of the time their query started, and if there's a mismatch in state when somebody writes to the database, you just say, you know what, it timed out, something like that. That's a very cheap way to ensure there are no clashes. That's one way I tried to do it.

Oh, so that's what I defined previously in my grammar. If someone writes a query, I just tokenize it, and based on the keywords I've defined, I know that SELECT means they want to select something. It's a fixed grammar which cannot be changed, so someone can't write "give me this" and have me give them that.

I don't think there are any questions online, so let's thank Sangarshanan one last time. Thank you, everyone.