Hello everyone, thank you very much for joining, and a warm welcome to my talk, a walkthrough of Django internals. For all of you joining this talk, I am expecting that you have done some projects in Django, so that you get the most out of it. So let's get started.

My name is little mystery. I have around 10-plus years of experience in the industry. I have led teams in full-stack web development, DevOps, data engineering and microservices. Some of my past roles included solution architect and chief architect, where I had to innovate in technology in such a way that it creates value for the business. That is where I got interested in entrepreneurship, and I started a company called DGQ 2 Technolabs, where we help companies around the world with digital product development, DevOps, data engineering and microservices.

So let's start with a simple question: how does Django start? Django starts with a simple command, python manage.py runserver. That is how we start the HTTP server. Here runserver is a management command in Django that spins up the HTTP server.

Here is a 360-degree view of the things runserver does internally to spin up the HTTP server. It looks through all the apps and finds all the possible management commands, in the Django codebase as well as in the installed apps. It then parses the command-line arguments, loads the settings, configures logging, loads the app configurations from the individual apps, loads all the models in all the apps, and in the end it starts the HTTP server.

Now let's see the detailed view of the same. Whenever we run python manage.py anything, whether it is migrate, makemigrations, runserver or any custom user-defined management command, a class called django.core.management.ManagementUtility gets instantiated. It is the front door for starting any kind of Django
management command, and that class has an execute method, which gets called. Django uses CommandParser, which inherits from Python's ArgumentParser. For those who don't know, ArgumentParser is a utility provided by Python itself that converts command-line arguments into key-value pairs. Django overrides its parse_args method to make the error messages more relevant.

Later, Django loops over all the apps in the codebase, and it also goes into Django's internal apps such as admin or core. In each one it tries to find a management package, inside that a commands package, and all the modules inside that are the possible management commands. So Django prepares a list of all the possible commands it can have, and then checks it against the command that was entered. If it finds a match it goes ahead; if it cannot, it raises an exception. Django also tries to be a little intelligent here. For example, if I have mistyped the command after python manage.py, Django tries to predict: did you mean runserver? It matches the string against the list and comes up with this suggestion.

Once that is done, Django tries to load the settings. There is an environment variable called DJANGO_SETTINGS_MODULE, which is the import path for the settings, and Django tries to import that module to load them. In case it cannot find the settings module at that path, it raises an exception. Django settings are lazy by nature, in the sense that Django will not actually load any setting unless we try to access it.
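The "did you mean" lookup described above can be sketched with Python's difflib module. This is a minimal sketch under stated assumptions, not Django's actual code: the command list and the UnknownCommand exception are hypothetical stand-ins, and only difflib.get_close_matches is a real standard-library call.

```python
from difflib import get_close_matches

# Hypothetical list of discovered commands; Django builds its own list by
# scanning the management/commands packages of every app.
COMMANDS = ["check", "makemigrations", "migrate", "runserver", "shell"]


class UnknownCommand(Exception):
    """Hypothetical error raised when no command matches."""


def fetch_command(name, commands=COMMANDS):
    """Return the command name if it exists, else raise with a suggestion."""
    if name in commands:
        return name
    message = "Unknown command: %r." % name
    # get_close_matches ranks the candidates by string similarity.
    suggestions = get_close_matches(name, commands)
    if suggestions:
        message += " Did you mean %s?" % suggestions[0]
    raise UnknownCommand(message)


try:
    fetch_command("runserve")
except UnknownCommand as error:
    print(error)
```

The point of the sketch is only the shape of the flow: exact match first, then a similarity-based suggestion attached to the error.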
So, for example, unless we access settings.DATABASES or settings.INSTALLED_APPS, Django will not actually load the settings. That is the lazy behavior it implements. For example, I have imported settings from django.conf, and at the time I access settings.DATABASES, Django dynamically imports that module, loops over all its attributes and loads them into a class called LazySettings, which holds all the key-value pairs.

At the time of loading the settings, Django also runs some checks. For example, two particular settings cannot be set together, or if some setting is set differently from the way it should be, Django raises a warning. These kinds of checks are generally implemented at the time of loading the settings. The lazy behavior is implemented by overriding Python's special methods such as __getattr__, __repr__, __setattr__ and __delattr__.

There is one more way. If we don't want to load the settings from a module, there is a method called configure, which accepts key-value pairs, and we can load the settings that way as well. To make that work, we have to call it before the settings are loaded, before the server actually starts. Once we have configured with the configure method, we cannot configure again; Django raises an exception saying that you cannot configure twice, because you have already configured with that method or, for example, already loaded from the module, so you cannot set new values once again. Later on, Django calls django.setup.
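The lazy loading and the configure-once rule described above can be sketched roughly as follows. This is an illustration only, not Django's LazySettings: the class name, the DEMO_SETTINGS_MODULE variable and the fake demo_settings module are all assumptions made for the example.

```python
import importlib
import os
import sys
import types

ENV_VAR = "DEMO_SETTINGS_MODULE"  # hypothetical stand-in for DJANGO_SETTINGS_MODULE


class LazySettingsSketch:
    def __init__(self):
        self._values = None  # nothing is loaded until the first access

    def _setup(self):
        module_path = os.environ.get(ENV_VAR)
        if not module_path:
            raise RuntimeError("Settings are not configured.")
        module = importlib.import_module(module_path)
        # Copy only the upper-case names, as settings conventionally are.
        self._values = {
            name: getattr(module, name) for name in dir(module) if name.isupper()
        }

    def configure(self, **options):
        if self._values is not None:
            raise RuntimeError("Settings already configured.")
        self._values = dict(options)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. for settings.
        if self._values is None:
            self._setup()
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


# Register a fake settings module so the sketch is self-contained.
fake = types.ModuleType("demo_settings")
fake.DEBUG = True
fake.INSTALLED_APPS = ["debug"]
sys.modules["demo_settings"] = fake
os.environ[ENV_VAR] = "demo_settings"

settings = LazySettingsSketch()
print(settings._values)        # still None: nothing imported yet
print(settings.INSTALLED_APPS)  # first access triggers the import
```

After the first access, calling settings.configure(DEBUG=False) raises RuntimeError, which mirrors the "cannot configure twice" behavior from the talk.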
django.setup loads the apps and the models from all the apps. How it happens is that there is a class called django.apps.registry.Apps, and this class stores the AppConfig of every individual app. The app configs can be found in each app's apps.py, where we will find a class that inherits from AppConfig. An apps.py can have multiple AppConfig classes, but one must be marked with default = True. In case Django finds two classes with default = True, it raises an exception and exits.

Later on, Django loops over all the classes in models.py and tries to find out which of them inherit from django.db.models.base.Model. If it finds them, it loads them. In the end, the app config's ready method gets called. In the ready method we can register signals, custom signals, and we can also write some custom logic, for example checks on what our models should look like. So we can write any sort of logic there, and it is called at the end of app loading.

In the end, the runserver management command is run. That module can be found in core/management/commands/runserver.py, and there we will find that it leverages Django's basehttp module. That module contains a lot of classes and definitions for how to run an HTTP server, so there is a whole separate module for it. Django leverages socketserver's ThreadingMixIn: whenever another class inherits from it, the inheriting class is empowered to spin up a TCP or UDP sort of server with threading support. Django also leverages wsgiref's WSGIServer, which in turn inherits from HTTPServer, which is also from Python itself, and that is how the whole HTTP server gets started. The implementation is multithreaded.
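The combination of ThreadingMixIn and wsgiref's WSGIServer mentioned above can be sketched with the standard library alone. This is a minimal sketch, not Django's basehttp module: QuietHandler, app and demo are hypothetical names chosen for the example, and the server binds to a random local port.

```python
import http.client
import socketserver
import threading
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer


class ThreadedWSGIServer(socketserver.ThreadingMixIn, WSGIServer):
    """WSGI server that handles every request in a new thread."""

    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep the demo output clean


def app(environ, start_response):
    # Report which thread served the request.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [threading.current_thread().name.encode()]


def demo():
    """Serve one request on a random local port and return the body."""
    server = ThreadedWSGIServer(("127.0.0.1", 0), QuietHandler)
    server.set_app(app)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/")
        return conn.getresponse().read().decode()
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the name of the worker thread that served the request, which is not the main thread; that is the per-request threading the talk refers to.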
By default, whenever runserver starts, threading is enabled, so to serve each request a new thread is spawned by Django.

Now, everybody who has used Django has seen that whenever we modify a file and save it, Django reloads itself. How does that work? Django has two ways to do it: one is StatReloader and the other is Watchman. StatReloader is what Django uses by default. What it does is that when Django starts, it goes into sys.modules, gets the list of all the modules and stores each file's modified time. One thread is spawned in the background, and every second it checks whether any file's modified time has changed. If it finds a change, it reloads.

The other approach is Watchman, which is far more performant. To use it we have to install the pywatchman library. Instead of a thread looking at the modified time of lots of files in a loop, Watchman leverages inotify, FSEvents and kqueue, which are provided by the operating system. They fire an event to the Django process; Django captures it and reloads.

So we have seen how runserver works in Django, and we now have an HTTP server up and running. Now let's see how a request works in Django. We have the server up, and what I am doing here is simply sending one POST request to this URL, with a Content-Type header and raw key-value data. If I look at it at a very low level, at the TCP packet level, my raw data looks like this. I got this data from Wireshark, a packet tracing tool. If you see here, that curl request got converted into this kind of text format, and somebody has to parse it so that Django can make sense of it.
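Going back to the StatReloader idea described above, the modified-time polling can be sketched as follows. This is a simplified illustration, not Django's autoreload module: the function names are hypothetical, and the demo fakes an edit with os.utime instead of running a background thread.

```python
import os
import sys
import tempfile
from pathlib import Path


def watched_files():
    """Yield the source file of every imported module that has one."""
    for module in list(sys.modules.values()):
        filename = getattr(module, "__file__", None)
        if filename:
            yield Path(filename)


def snapshot(paths):
    """Map each existing path to its last-modified time."""
    mtimes = {}
    for path in paths:
        try:
            mtimes[path] = path.stat().st_mtime
        except OSError:
            continue  # file vanished or is unreadable; skip it
    return mtimes


def changed_files(before, after):
    """Return the paths whose modified time differs between two snapshots."""
    return [path for path, mtime in after.items() if before.get(path, mtime) != mtime]


def demo():
    with tempfile.TemporaryDirectory() as folder:
        source = Path(folder) / "views.py"
        source.write_text("x = 1\n")
        before = snapshot([source])
        # Simulate an edit by pushing the modified time forward.
        os.utime(source, (before[source] + 5, before[source] + 5))
        after = snapshot([source])
        return [path.name for path in changed_files(before, after)]


print(demo())
```

A real reloader would take snapshot(watched_files()) in a loop, sleep for a second between rounds, and restart the process as soon as changed_files returns anything.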
Django can then process it and return the values. So how does that work? There is an HTTP client; it can be anything, a browser, curl or any other kind of HTTP client. It sends the HTTP request to our web server. Web servers are things like Gunicorn, uWSGI and runserver; there are a couple more, but these are pretty much the standard ones. They understand HTTP, so they know how to parse this kind of raw request. They then communicate with the Django codebase with the help of the WSGI protocol, they respond over the same protocol, and then the client gets the response.

In the main package, where we see settings.py, we will find a wsgi.py file. There we will find a function called get_wsgi_application, and we will see that it returns a WSGIHandler object. It has two methods, __init__ and __call__. __init__ is called when runserver starts, and __call__ is called whenever there is a request. environ and start_response are the two arguments passed to __call__.

How does that work? environ converts the raw request into nice-looking key-value pairs like these. If you see here, Accept is a header, but it arrives as a key, HTTP_ACCEPT. Likewise Cache-Control is a header, and it arrives as a key too. There is wsgi.input, which is a stream, so calling read on it gets us the raw body data. It also gets us wsgi.multithread, a couple more environment variables, and a lot of this kind of data that can be leveraged to build the response.

start_response is what we use to begin the response to the server. What I am doing here is setting a Content-Type header of text/plain and a status of 200 OK, calling start_response with the status and the headers, and hello world is the body I am returning.
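The environ and start_response contract described above can be sketched with a bare WSGI callable. This is a minimal sketch of the protocol, not Django's WSGIHandler: call_app is a hypothetical helper that plays the role of the web server, and setup_testing_defaults from wsgiref.util fills in a plausible environ.

```python
from io import BytesIO
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Request headers arrive as HTTP_* keys; the body is a stream.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)  # not used further in this sketch
    status = "200 OK"
    headers = [("Content-Type", "text/plain")]
    start_response(status, headers)
    return [b"hello world"]


def call_app(app, raw_body=b"", **extra):
    """Play the web server: build an environ, call the app, collect the reply."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update(extra, CONTENT_LENGTH=str(len(raw_body)))
    environ["wsgi.input"] = BytesIO(raw_body)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return captured["status"], dict(captured["headers"]), b"".join(chunks)


print(call_app(application, b"name=abc", REQUEST_METHOD="POST"))
```

Any WSGI server, whether Gunicorn, uWSGI or runserver, does roughly what call_app does here, only with a real socket on the other side.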
The call method also does a couple of other things. It calls get_response, which is available on the WSGI handler itself. That matches the route, checks whether the route is OK or not, chains through all the middlewares, then executes our view, gets the response and returns it to the client. This code here is very simple; in the Django internals we will find lots of error handling, try/except blocks, where Django wants to make sure that if DEBUG is true the end client sees the errors, and if it is not true the client never sees the exceptions.

Now let's come to a very important part of Django: let's see how the ORM works. The Django ORM is powered by these classes. Model is what all of our Django models inherit from. When we do Model.objects, objects is a Manager. QuerySet is the class whose object is returned whenever we do a filter query. Then there is the Query class, which holds the filter data. For example, if I do objects.filter and say name equals abc, that name-equals-abc data is stored by the Query class. The SQL compiler compiles our query, whether it is a filter, a create or any other sort of query, into a good-looking raw SQL query. And there is a DatabaseWrapper, which communicates with the database through the database driver to get the result. That is how the whole ORM works.

Now let's see in a bit more detail how that works. For our example I am defining two classes. One is College, which has a name and an address. I am also defining one more model called Student, which has a name and a number, and college, which is a foreign key to the College model. Later, if I print models.College.__dict__, it shows me all the attributes attached to that class. We did not add those, right? We simply defined the fields, but a couple more attributes, such as _meta and DoesNotExist, all these
things are attached. So how does that happen in Django? Let's see. Generally our Django models inherit from django.db.models.base.Model, and Model uses django.db.models.base.ModelBase as its metaclass. That metaclass has a __new__ method, which is called when the model class is created, that is, when the module defining it is imported. So at the time of runserver, when we are loading the models, that is when this __new__ method gets called. That __new__ method in the metaclass has all the logic to attach all of these attributes to the class.

There is some terminology in Django here: add_to_class and contribute_to_class. What the metaclass does is loop over all the attributes of the model and check: do you have a contribute_to_class method? If it finds one, Django calls it, and that contribute_to_class method attaches the attribute to our model class. add_to_class is the term Django uses when the model itself wants to add something to itself.

There is _meta, which is an instance of django.db.models.options.Options. It contains a lot of utilities that help Django form the raw query: get me all the related names, get me a specific field, get me all the possible managers, get me the default manager, and a lot more.

A model instance also has _state, which is an instance of ModelState. It stores a couple of things: db, which is the database we are using from settings.DATABASES, and adding, which is used for validations.
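The add_to_class and contribute_to_class mechanism described above can be sketched with a toy metaclass. This is a deliberately simplified illustration, not Django's ModelBase: _meta is a plain dict here, the reverse accessor is just a string, and the Field and ForeignKey classes are hypothetical stand-ins.

```python
class Field:
    def contribute_to_class(self, cls, name):
        # The field registers itself in the model's metadata.
        self.name = name
        cls._meta["fields"].append(self)


class ForeignKey(Field):
    def __init__(self, to):
        self.to = to

    def contribute_to_class(self, cls, name):
        super().contribute_to_class(cls, name)
        # Dynamically add a reverse accessor to the related model.
        accessor = cls.__name__.lower() + "_set"
        setattr(self.to, accessor, "reverse accessor for " + cls.__name__)


class ModelBase(type):
    def __new__(mcs, name, bases, attrs):
        contributable = {
            key: value
            for key, value in attrs.items()
            if hasattr(value, "contribute_to_class")
        }
        plain = {key: value for key, value in attrs.items() if key not in contributable}
        cls = super().__new__(mcs, name, bases, plain)
        # Attributes nobody wrote in the class body get attached here.
        cls._meta = {"fields": []}
        cls.DoesNotExist = type("DoesNotExist", (Exception,), {})
        for key, value in contributable.items():
            cls.add_to_class(key, value)
        return cls

    def add_to_class(cls, name, value):
        if hasattr(value, "contribute_to_class"):
            value.contribute_to_class(cls, name)
        else:
            setattr(cls, name, value)


class Model(metaclass=ModelBase):
    pass


class College(Model):
    name = Field()
    address = Field()


class Student(Model):
    name = Field()
    college = ForeignKey(College)


print([field.name for field in Student._meta["fields"]])
print(College.student_set)
```

Note that name is no longer a plain attribute in College.__dict__ after class creation, while _meta, DoesNotExist and student_set appeared without being written in the class body.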
Suppose that we have initialized the model with some kwargs. For example, in the last slide we saw the College model, so I initialize a College with name equal to abc and address equal to bbc, and later on I call save on that model. How does Django know whether it needs to do an update query or an insert query? If adding is true, Django creates an insert query, but if adding is false, Django does an update query. That is how it works. Whenever we get a model object back as a response from the database, adding is false, so on save Django does an update query.

To gain more performance we can also do a select_related on the foreign keys, and then the objects for the foreign keys are stored in fields_cache.

There are a couple of methods that are very important in Django. There is from_db, which gets called whenever we get a response from the database. Model also implements a couple of methods that allow certain operations: we can compare for equality, convert a model to a string, print it, hash it, and also serialize it, in the sense that we can pickle and unpickle models.

So what does a model have? A model has fields. All the fields, such as CharField or IntegerField, inherit from django.db.models.Field. As we saw, the metaclass loops over all the attributes and checks whether each has a contribute_to_class method, and fields do have one. Some fields use contribute_to_class to dynamically add new methods. Take the related names: student_set was a related name that got added dynamically. How did that work? Our foreign key has a contribute_to_class function, which gets called, and it adds all the related keys or
related names to the model. For example, if somebody is using a DateField, we get get_next_by_ followed by the field name, which is dynamically added by a contribute_to_class function.

There are certain other methods on the field that are also very important. These two methods, get_internal_type and db_type, are used at migration time, to add a new field or to create a new model. For example, say we are creating our own new field. What the database wrapper has is a mapping from Django's known fields to database column types. For example, a CharField should be a varchar with the max length, and an AutoField should be a serial. This mapping is stored in the DatabaseWrapper class. When get_internal_type is called, ideally it returns one of Django's known types, so Django gets a real mapping for the database. But in case we are creating a custom one, for example an IP address type, just as an example, and there is no such type available in Django, this data type mapping returns None. Django still needs a real database type to do the migration, and in that case the db_type method is called, where we can simply return the real database type. For example, for an IP address field we can simply return the IP address type, and what is returned is used at the time of the migration.

There are a couple more methods, such as from_db_value. For example, we got a response from the database and we want to tweak something in that value before passing it up front. Say there is a DateTimeField and we are getting a UTC time from the database and want to convert it to our time zone; from_db_value can be used for that. get_db_prep_save is the method that is called just before we save the data into the database, so we
can modify certain values if we would like to. All of these methods are generally useful when we are creating our own field; in general we don't need to touch them.

Now let's see the whole CRUD operation, how the whole flow works in Django. Here, if you see, debug.models.College.objects.create: objects is a Manager, and create is the method that actually creates a new row in the database. Manager is a class that inherits from a class which is created on the fly, dynamically. How it works is that there is BaseManager, and it has a method called from_queryset, which takes a QuerySet class as an argument. Internally it loops over the QuerySet class and picks the methods that are public, in the sense of not starting with an underscore, and it also looks at a monkey-patched attribute on the functions called queryset_only, which should be false. If these two conditions match, the method is eligible to be part of the manager. So, indirectly, most of the functions, such as filter, all and so on, actually come from the QuerySet and get added to the manager.

Whenever we call the create method, the save of the Django model is called. create returns the model object. We can define multiple managers, and one of them serves as the _default_manager.

Now let's see another example. I have created the same instance, models.College.objects.create, passing name and address, and later I am just printing the attributes and also getting the type of that instance. If you see the type of the instance, it is a model instance. It's a typo on the slide, I'm sorry; ideally it should be models dot whatever, I
mean, the path is wrong, but it is a College instance. It is the same instance. Now see, ideally name should be a field, right? But I am getting a value. address is also a value, when ideally I should be getting a field class object or something like that. How is that happening in Django?

As we saw on a past slide, Django models have a metaclass called ModelBase. When the class is created, the metaclass takes the field objects off the class and registers them through contribute_to_class, so the class itself is left with just a bunch of methods and practically no plain field attributes. When we initialize the model, the instance is formed dynamically from those registered fields and the values we passed in, and that is why we get the real values instead of the field objects.

Now let's see the Query, which holds the values the compiler needs to form the raw SQL queries. There is a class called django.db.models.sql.query.Query, which is the base class, and it is inherited by multiple query classes in the django.db.models.sql.subqueries module. Inside it we will find classes such as InsertQuery, AggregateQuery and UpdateQuery, and each class is used for its own kind of operation. Every class has its own methods: for InsertQuery there is insert_values, for UpdateQuery there are add_update_values and add_related_update, and there is add_filter. What all of these methods do is accept certain values and simply store them in instance variables, and that is all they do on each call. So generally, whenever we run a create or filter query, these methods get called internally, and they simply update
the values in the instance variables. That's it. All of these classes, InsertQuery, UpdateQuery and so on, also have a class variable called compiler, which stores the name of the compiler that needs to be used to convert the query into a good-looking raw SQL query. All the compilers in Django can be found in django.db.models.sql.compiler, and there is an individual compiler for each type; for example, there is SQLInsertCompiler to create insert queries, and a couple more that I will skip here. All of the compilers have an as_sql method, which takes the data from the Query instance and converts it into a raw SQL query. There is also an execute_sql method on all the compilers, which leverages the database wrapper internally, executes the query and gets the result.

Now let's see one more scenario, with filter. filter returns a QuerySet object. A QuerySet holds multiple model objects; it is a container for the objects. QuerySets are lazy by nature, in the sense that if I execute this, no database operation is done until I perform some operation on it: print its output, run a loop on it, take its length, convert it to a Boolean, or get a specific index from it. Only when I perform one of these operations will Django actually do a database operation.

One more thing we must have noticed is that whenever we print a Django QuerySet, we see a limited set of objects. Why that happens is that Django doesn't want to kill our application in case there are millions of records in the database. So whenever we print, Django adds a limit of 21 and gets a limited set of records.

QuerySets also have a cache.
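The laziness, the print limit and the result cache described above can be sketched with a toy container. This is an illustration under stated assumptions, not Django's QuerySet: FakeDatabase and LazyQuerySet are hypothetical classes, and the database is a list of dicts that counts how often it is queried.

```python
class FakeDatabase:
    """Stand-in for the database wrapper; counts how often it is queried."""

    def __init__(self, rows):
        self.rows = rows
        self.hits = 0

    def select(self, conditions, limit=None):
        self.hits += 1
        matched = [
            row
            for row in self.rows
            if all(row.get(key) == value for key, value in conditions.items())
        ]
        return matched if limit is None else matched[:limit]


class LazyQuerySet:
    REPR_OUTPUT_SIZE = 20

    def __init__(self, db, conditions=None):
        self.db = db
        self.conditions = dict(conditions or {})
        self._result_cache = None

    def filter(self, **conditions):
        # Chaining only merges conditions into a new object; no query runs.
        return LazyQuerySet(self.db, {**self.conditions, **conditions})

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = self.db.select(self.conditions)

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __repr__(self):
        # Printing fetches at most 21 rows and shows 20 of them.
        limit = self.REPR_OUTPUT_SIZE + 1
        if self._result_cache is not None:
            data = self._result_cache[:limit]
        else:
            data = self.db.select(self.conditions, limit=limit)
        if len(data) > self.REPR_OUTPUT_SIZE:
            data[-1] = "...(remaining elements truncated)..."
        return "<LazyQuerySet %r>" % data


db = FakeDatabase([{"id": number, "name": "abc"} for number in range(30)])
colleges = LazyQuerySet(db).filter(name="abc")
print(db.hits)        # chaining alone did not touch the database
print(len(colleges))  # evaluation happens here
print(db.hits)
```

Looping over colleges a second time reuses _result_cache, so the hit counter stays the same, which is the caching behavior the talk turns to next.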
Once we have fetched the records from the database, let's say I run a loop and get 1000 records, and then I run one more loop, Django will not do any database operation again. That is the cache it has. Sometimes we have millions of records, or a very huge amount of data in the table, and we don't want our application to be killed. For that there is the QuerySet.iterator method, to which we can pass a chunk_size. It fetches the values from the database in smaller chunks, say of 100. The drawback is that it does not cache the results, so as many times as I run the loop, Django will do a database operation.

django.db.models.sql.query.Query is the class used to form the select query. It has a couple of methods such as add_filter, add_q, add_select_related, add_annotation, add_extra and add_ordering. They don't do anything special; they simply add or update values in the individual instance variables. In turn there is the SQLCompiler, which takes the Query instance as an argument, forms the raw query and leverages the database wrapper to execute it. In all the other operations too, we will find that the compiler has as_sql and execute_sql; it is the same steps behind the scenes in Django, and only the type of the Query class and the compiler changes.

Now let's see how chaining works. We can see there is a filter, then I am doing another filter, and I can also do an update. How does that work? The filter method returns a QuerySet object, and a QuerySet itself has a filter method, so simply QuerySet.filter gets called again. Internally, whenever we do this kind of chaining, Django does no database operations. This filter returns a QuerySet, and then address__contains gets added inside the Query's variables. No database operations are happening, and again, whenever we try
to do an operation, the SQL compiler is called: as_sql converts the query into raw SQL, and later the execute_sql method executes it with the help of the database wrapper. That's it.

Now let's see how update works in Django. update does a real database update query and returns the number of records updated. It does not send any signals, such as the pre or post save signals that Django has. Django uses django.db.models.sql.subqueries.UpdateQuery as the class that holds the data. It has a couple of methods, such as add_update_values, add_related_update and more, to store the data the compiler needs to generate the raw queries. There is SQLUpdateCompiler, which is used to generate the update queries.

Now let's see how delete works in Django. delete does a delete query, but unlike update it actually sends the pre and post delete signals for all of the objects that are deleted. Here, if we see, I am doing a filter with name equal to abc college, but my raw query looks like this: a delete from debug_college with a where clause on debug_college.id. How does that work, and why is Django doing it? Django is doing it because it wants to send the pre and post signals around the delete. So Django has the django.db.models.deletion.Collector class, which does the filter query against the database, gets the list of objects that need to be deleted, and forms a query with only the ids, in the sense that it deletes by primary key. This class also sends the pre and post signals. There is SQLDeleteCompiler, which forms the raw SQL query.

What will happen to my related objects? That depends on the value we gave to on_delete at the time of defining the foreign keys. Among the options available are CASCADE, PROTECT and RESTRICT. In case
of CASCADE, yes, my related objects will be deleted, because Django cascades the delete to them. PROTECT will not allow the operation; that object cannot be deleted. RESTRICT will check whether there is any object related to this object. If it finds one, it will not allow the delete, but in case it cannot find any related object, it allows the delete. Generally, for better performance, raw queries should be used, but the catch is that there won't be any pre or post signals.

Now let's see the database wrappers. Database wrappers are different for each database. For example, if I want to see the Postgres one, django/db/backends/postgresql/base.py is where I will find the Postgres database wrapper. If I want the MySQL one, then again it is backends/mysql. So this is different for each of the databases. Why is Django doing it? Because every database is different, every database has a different syntax for doing things, and every database supports different features. That is the reason they are separate.

The wrapper provides methods such as get_new_connection to create a new connection. It also contains the mapping from Django's known types to db types. This example is from the Postgres backend: there is a class variable called data_types, which is a mapping. If I define an AutoField, in the database it will be a serial data type; a BinaryField is a bytea. Certain other things are also defined, such as how the syntax looks if I am doing an exact lookup: it will be = %s, and %s will be replaced by the real value. contains is LIKE %s, and then there is pattern matching, which is again defined here. This is how Django actually builds the queries.

There is a class called DatabaseFeatures. Not all databases are the same, as I just said.
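The data_types and lookup mappings described above can be sketched as two dictionaries plus string formatting. This is a rough sketch using only the entries named in the talk, not the real Postgres backend: PostgresWrapperSketch, column_type and where_clause are hypothetical names, the CharField template is an assumption about how the max length gets filled in, and the values in current Django versions may differ.

```python
class PostgresWrapperSketch:
    # Django field type -> column type template (subset named in the talk).
    data_types = {
        "AutoField": "serial",
        "BinaryField": "bytea",
        "CharField": "varchar(%(max_length)s)",
    }
    # Lookup name -> SQL fragment; %s is filled in later with the real value.
    operators = {
        "exact": "= %s",
        "contains": "LIKE %s",
    }


def column_type(wrapper, internal_type, **params):
    """Resolve a field's internal type to a column type, or None if unknown."""
    template = wrapper.data_types.get(internal_type)
    if template is None:
        return None  # a custom field would have to supply its own db type
    return template % params


def where_clause(wrapper, column, lookup):
    """Build the condition for one lookup, leaving the value as a placeholder."""
    return "%s %s" % (column, wrapper.operators[lookup])


print(column_type(PostgresWrapperSketch, "CharField", max_length=100))
print(where_clause(PostgresWrapperSketch, "name", "contains"))
```

Keeping these tables on a per-backend class is what lets the same ORM call produce different SQL for Postgres and MySQL.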
Some databases support some features and some don't. So there are lots of class variables in DatabaseFeatures that are Boolean in nature, and at the time of forming the raw query the SQL compiler leverages this DatabaseFeatures class, or it restricts some of the operations. For example, if I do an unsupported operation on a QuerySet, that is when Django raises an exception. So that is the thing with DatabaseFeatures.

Now let's see the class called DatabaseOperations. Every db has a different syntax for running certain kinds of queries. For example, the SQL to set the time zone can be different: in MySQL it might be one syntax and in Postgres another. So all of these standard methods are there. Take set_time_zone_sql: it is a function that returns a query like SET TIME ZONE %s, and the %s will later be replaced by the value. The class contains lots of functions of this kind, such as datetime_cast_date_sql. In some databases the cast can be a function applied to the column, and in others it can be different syntax; in Postgres, for example, it can be a double colon followed by the data type. All of those things are pretty much defined in these methods.

So that's it. Thank you very much, guys. Any questions?

Thanks for the talk, it was really good. Did you find it difficult to put the talk together with Django changing through different versions, or were the internals you discussed common to the different versions? I know there is Django 2, Django 3, and I think there is Django 4 now. Was it difficult, like a moving target, to put together a talk such as this? Because Django changes, like Python, as time goes by, so
what could be an internal in one version of Django is not the same in the next version.

Yeah, so, in general, when the version changes there can be some changes; it depends on what is coming in the future version. But I have not seen lots of changes. I have seen a very small set of things changing, I would say 20% or so, not the whole set of things. And if a whole new set of features is being added, Django also makes sure that other features don't break, so they keep it safe. Your existing code is not going away in one shot. They also give a warning that something is going to be deprecated after this or that version, so we will not find that the code is suddenly gone. We will find it for some time, and then after one or two versions we will find that it has been modified. I hope I answered your question.

Yeah, definitely. Thank you.

OK, thank you.