I need to say a little bit about state modeling in games, that is, how we represent the data that makes up the game logic part of our games. It's something I talked about in an earlier video, but I think the prescription I gave in that video was out of date, not really up on the latest practices, and it may not have been terribly clear, so I want to correct that. Say you have a game like Super Mario. In this game you have a bunch of things we can model as data: Mario himself has an X and Y coordinate, fireballs have X and Y coordinates, there's momentum and so forth. Take a momentary freeze frame of this game, or any game, and you can just start enumerating all the things that make up the game state. The question is how do we model that state, how do we organize it? The most obvious approach is to use what's called the object model of data, which is not necessarily about object-oriented programming. It just means we organize the data in terms of data objects that reference each other, such that you get this big hierarchy, or actually in practice this big graph, of things that reference each other. So when it comes to modeling Mario, we end up with a big composition hierarchy or graph, where we have an object representing the game itself, and that has a bunch of fields: a list of the monsters, an object representing the player, Mario himself, and game state like whether the game is paused, that sort of stuff. These objects in turn are potentially composed of their own objects. Now this object model is obviously a workable solution, and it's also the obvious solution, but there are a few downsides. I think the main one comes down to how flexible the state structure is.
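
To make the object model concrete, here is a minimal sketch in Python of the kind of composition hierarchy described above. The class and field names (`Game`, `Player`, `Monster`, `Fireball`, `Vec2`) are made up for illustration; they are assumptions, not taken from any real game.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vec2:
    x: float = 0.0
    y: float = 0.0


@dataclass
class Player:
    # Hypothetical fields: Mario's coordinates and momentum.
    position: Vec2 = field(default_factory=Vec2)
    momentum: Vec2 = field(default_factory=Vec2)


@dataclass
class Monster:
    kind: str
    position: Vec2 = field(default_factory=Vec2)


@dataclass
class Fireball:
    position: Vec2 = field(default_factory=Vec2)
    momentum: Vec2 = field(default_factory=Vec2)


@dataclass
class Game:
    # The root object owns everything else, directly or indirectly.
    player: Player = field(default_factory=Player)
    monsters: List[Monster] = field(default_factory=list)
    fireballs: List[Fireball] = field(default_factory=list)
    paused: bool = False


game = Game()
game.monsters.append(Monster(kind="goomba", position=Vec2(12.0, 0.0)))
game.player.position.x += 1.5  # reach state by walking the object graph
```

Notice that every piece of code that touches state has to know the path through the graph (`game.player.position.x`), which is where the rigidity tends to come from when the structure is reorganized.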
In practice, when we create any program, but especially games, we don't really know exactly what we want ahead of time. We're incrementally working out features, adding stuff, removing stuff. What we don't want is for our data structure to tie us down, such that any time we add something new, remove something, or move stuff around and reorganize, that one change has repercussions for the rest of the data and also for our code logic. As much as possible, if I add some new thing into my game state, I don't want to have to rewrite any existing code. Now for various reasons, the object model of data isn't all that fantastic measured along that metric. It's simple and obvious, but it's not particularly flexible in that way. In contrast, the relational model of data tends to be more flexible. The gist of the relational model is that we organize our data into tables, where each row is an entry, a thing made up of the properties represented by the columns, the attributes as we might call them. The way we organize our data into tables is governed by the rules of normalization, or the normal forms. The idea is that if we follow these rules, our data will have the virtues of four properties laid out by E.F. Codd, the creator of the relational model. The way he expressed these benefits is fairly academic, but I can translate quickly. First: "to free the collection of relations from undesirable insertion, update and deletion dependencies." The gist of that one is that if you follow the rules of normalization, you're removing redundancies in your data, so you don't have the same facts expressed more than once, such that when we add new entries into these tables, or remove entries, or modify an entry, we don't have to do it in multiple different places.
If we're updating one single fact in our database, that fact exists in just one place, so we only have to update it there. The second one: "to reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs." That's the flexibility I was talking about. The idea is that if you follow the normalization rules, you should be able to add new tables, remove tables, or modify existing tables, like adding new columns, without having to rearrange everything else. Number three: "to make the relational model more informative to users." I think all that really means is that the relational model is somehow more intuitive and more natural, that it has a more natural correspondence to the way we think about the data. That one's really arguable, I think, but maybe it's sometimes true. And then the last one, number four: "to make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by." The gist of that is that when we design our database, on the first pass, we don't want to bias the performance of querying one part of the data versus querying the others. We want equal performance for similarly complex operations: if I look up 1,000 elements in one table, it should take about as long as it takes me to look up 1,000 elements in another table. However, there does come a point in real-world programs where we care much more about certain kinds of queries than others, so maybe this property isn't exactly what we want. Maybe we do want to bias our data structure to be more efficient for some parts than others, because those are the things we need to do more frequently and more critically in the application. So there comes a time when we make certain tradeoffs about efficiency, but the relational model itself is attempting to keep everything kind of even.
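
To illustrate the first of those properties, the one about update dependencies, here is a small Python sketch. The monster data and the `max_hp` attribute are invented for the example; plain lists and dicts stand in for tables.

```python
# Denormalized: the fact "a goomba has max_hp 1" is repeated on every row.
monsters_denormalized = [
    {"id": 1, "kind": "goomba", "max_hp": 1},
    {"id": 2, "kind": "goomba", "max_hp": 1},
    {"id": 3, "kind": "koopa", "max_hp": 2},
]

# Changing that one fact means touching every row that repeats it.
rows_touched = 0
for row in monsters_denormalized:
    if row["kind"] == "goomba":
        row["max_hp"] = 3
        rows_touched += 1

# Normalized: the fact lives in exactly one place, keyed by kind.
monster_kinds = {
    "goomba": {"max_hp": 1},
    "koopa": {"max_hp": 2},
}
monsters = [
    {"id": 1, "kind": "goomba"},
    {"id": 2, "kind": "goomba"},
    {"id": 3, "kind": "koopa"},
]


def max_hp_of(monster):
    """Look the fact up through the monster's kind."""
    return monster_kinds[monster["kind"]]["max_hp"]


# One update, and every goomba sees it.
monster_kinds["goomba"]["max_hp"] = 3
```

In the denormalized version, missing one row during the update would leave the data contradicting itself; in the normalized version there is nothing to miss.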
So those are the purported benefits of the relational model. The question is, what are these normalization rules? Well, depending on whom you ask, there are at least six rules of normalization. In practice, the most critical rules, and the ones people abide by most of the time, are rules one through three, and not necessarily four, five or six. The rules build on each other in order: rule two assumes rule one, and rule three assumes rule two, so if you're conforming with rule three, then implicitly you're already conforming with rules one and two. For most people's purposes, conforming with rules four, five and six is more trouble than it's worth. The most bang per buck is in rules one through three. So what are these rules exactly? I won't go into them in detail here, but the gist of it is that, first off, we need to put attributes in the right table. Maybe that's not very helpful put that way, but that's what rule number three boils down to. In other words, when you include a property in a table, make sure that property is really a direct fact about the thing described by the table. If you have a table describing the planets, make sure each property is directly a property of the planet itself and not of some secondary thing, like something that's a component of the planet. If a planet has an atmosphere and there are all sorts of things about the atmosphere you want to store in your data, then you probably want a separate atmosphere table. Also, according to the first normal form, all the values we store in columns have to be scalar, basically meaning they can't be lists or other kinds of collections. They have to be individual values like a number or a string. When we have lists of things, when we have many things, that's what having multiple rows in our tables is for. A table is made up of multiple rows.
That's how we express our multiplicitous data: as multiple rows in some table somewhere. The remaining implication of normal forms one, two and three is that when we have something we would most intuitively think of as a one-to-many relationship, this thing having a relationship with many other things, we invert that and express it in our tables as a many-to-one relationship. Imagine a player has an inventory, and that inventory is basically a list of stuff, but we don't have lists in our tables. So instead of having a player table where an entry points to the many different things in that player's inventory, we have a separate table where each item in an inventory is a row, and each row points back to the player. We're inverting the relationship: instead of a player with a list of references to many inventory objects, those individual inventory objects point back to the player. That's how we express one-to-many relationships. When it comes to many-to-many relationships, we express those in a separate table. Say you have a bunch of people in your game and you want to express in your data which people are friends with whom. The obvious way to do it in the object model is that each person has a list pointing to all of their friends, but we don't have that sort of thing in our tables. So instead we have a table where each entry is just a record of one of these relationships. It's typically a two-column table that records a reference to one person and a reference to the other person.
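
Here is a rough Python sketch of those two patterns, again with lists of rows standing in for tables. The table names, column names and sample data are all made up for illustration. One assumption to flag: this sketch stores each friendship once and reads it in both directions, which is only one of the possible conventions.

```python
# Each "table" is a list of rows, and every value in a row is a scalar.
players = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]

# One-to-many, inverted: each inventory row points back to its player.
inventory_items = [
    {"id": 10, "player_id": 1, "item": "sword"},
    {"id": 11, "player_id": 1, "item": "potion"},
    {"id": 12, "player_id": 2, "item": "shield"},
]

# Many-to-many: a separate two-column table, one row per friendship.
friendships = [
    (1, 2),
    (2, 3),
]


def inventory_of(player_id):
    """Collect the items whose rows point back to this player."""
    return [row["item"] for row in inventory_items
            if row["player_id"] == player_id]


def friends_of(player_id):
    """Read each friendship row in both directions (assumed convention)."""
    result = []
    for a, b in friendships:
        if a == player_id:
            result.append(b)
        elif b == player_id:
            result.append(a)
    return result


print(inventory_of(1))
print(friends_of(2))
```

The player rows never change when inventory or friendship facts are added; those facts live entirely in their own tables.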
This way of expressing that sort of relationship can be a little unintuitive, especially because in this example, friendship, we think of it as a mutual relationship that goes both ways: if A is a friend of B, then B is also a friend of A. So do we record that in the database twice? In this table, do we say A is a friend of B and B is a friend of A, or do we just express the relationship one way and consider it equivalent for both? That's a sort of icky question that arises when we follow this pattern. So it's not a totally ideal way of expressing this data, but that's actually a problem we already have in the object model, where each person potentially lists all of their friends, and so you have the same redundancy of the relationship being expressed both ways. Note the advantage, however, that in the relational model this relationship information is a separate fact about our person objects. So when we add new relationship facts of some nature about our persons, we don't have to modify the existing person table. We're just adding new tables. The interesting question, though: if this model is so flexible and ideal in that sense, why is it not what we use in games? I think there are basically two answers. The first is performance. Certainly we wouldn't use off-the-shelf relational databases like MySQL or MSSQL or Oracle or whatever, because those are separate programs that we talk to over a network connection. That's definitely not conducive to high-performance games, where we're reading and updating a bunch of data objects many times a second, like 30 or 60 times a second. I don't think you could really do it even with an in-memory database like SQLite, which runs as a library inside your program. Even then it isn't really conducive to high performance, because relational databases have different goals in mind.
They're about concurrent access to a large mass of data, probably in excess of the sort of thing we need to represent in games, because most games don't represent all that much data, even the most intensive games. They have a lot of data in terms of textures and assets and so forth. But the actual state data is relatively small compared to even fairly trivial web apps, when you consider how much user data those store in things like content management systems, purchase histories and that sort of stuff. So maybe these performance problems are just a matter of tradition, of coming from different development backgrounds where there are different concerns and different needs. Maybe it's conceivable that we could have an in-memory database that followed the relational model strictly and was up to snuff with the performance requirements of, if not the highest-performance games, at least moderate-level games. But to my knowledge, no such thing has ever been done or even attempted. Aside from performance, the other sticking point with the relational model in games, I believe, is that in games, relative to other sorts of applications, we tend to need many, many ad hoc things. The nature of games is that we tend to have many more one-off exceptions in terms of data. I know that in a lot of business applications there are lots of ugly ad hoc rules in the business logic, like, oh, on Tuesdays when it's a full moon we have to do our taxes this way, that sort of thing. There's a lot of ugliness in non-game applications too. But in games in particular, the data itself has to be ad hoc, where there's this one part of this level where there's this one monster that's special in this one way, that sort of thing.
And so I believe that in response to this need for the flexibility of the relational model, combined with the need to define many ad hoc things in the course of game development, game developers around the early 2000s started embracing a way of data modeling built on what are called components and entities. The gist of components is that they are effectively like tables in a relational database. An entity, then, is a record that potentially spans multiple tables. It's like a row in a table, but it's a row that potentially belongs to multiple tables simultaneously. An entity is a thing that's made up of multiple components. There isn't a lot of formal literature about this stuff. It's mostly a bunch of informal blog posts that people point you to and you can read up on. But from what I've read, a common prescription among developers is that with components and entities you should follow rules that are very much like the normalization rules. Most importantly, the properties of a component should be scalar. They shouldn't be lists or other kinds of collections. I also see developers suggesting that, generally, when you're expressing a one-to-many relationship, you should invert it and express it as a many-to-one relationship, with each thing of the many referencing the one. Now, the degree to which you should normalize your data in this pseudo-relational model is one of those things that I think people are not totally clear on. There's a lot of debate, and it probably just varies with your use case, because in practice, when people use the relational model in conventional relational databases, they often deliberately denormalize for the sake of optimization. Yes, we're adding redundancy into our data, but there are certain kinds of queries that we need to be really efficient, where we need to get at certain data a lot and very quickly.
And so we sometimes need to denormalize. Well, it's the same deal with these components and entities: sometimes, for the sake of performance, we're going to be denormalizing. Also in connection with components and entities, you'll sometimes see the term system used to refer specifically to the functions that loop through every entity with a particular component and update those entities. In some versions of this component-entity-system pattern, the prescription is that you do all of your updates, all of your processing of entities, with these system things, these functions that loop through all the entities with a particular component. So it's sort of like having a pseudo-relational database where we're doing one primary kind of query, iterating through every element of a particular table, except in this case we don't have tables, we have components. And that's where I think there's a lot of disagreement about this whole pattern: how strictly should you restrict yourself to just that kind of query, that kind of looping through entities by component? What if you want to do other things? Sometimes in your game logic you need to compare the data of one entity with another entity that lives totally elsewhere in your data and has a totally different set of components. That does happen in many games: there's logic that depends upon disparate elements. How exactly you handle that is something I don't think there's a clear prescription for, and opinions vary. So that's an interesting question to resolve. It's not something I have enough experience with to give my own prescription, so I'm just going to leave the question open there.
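
For a rough picture of components, entities and systems, here is a minimal Python sketch. It is one possible arrangement under my own assumptions, not a standard design: entities are bare integer ids, each component type gets a dict keyed by entity id (playing the role of a table), and the names `Position`, `Velocity`, `create_entity` and `movement_system` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


# One "table" per component type: entity id -> component row.
positions = {}
velocities = {}

_next_id = 0


def create_entity():
    """An entity is nothing but an id; its data lives in the component tables."""
    global _next_id
    _next_id += 1
    return _next_id


def movement_system(dt):
    """A system: loop over every entity that has a Velocity and update it."""
    for entity, vel in velocities.items():
        pos = positions.get(entity)
        if pos is None:
            continue  # has Velocity but no Position; nothing to move
        pos.x += vel.dx * dt
        pos.y += vel.dy * dt


# Mario has both components, so he spans both tables.
mario = create_entity()
positions[mario] = Position(0.0, 0.0)
velocities[mario] = Velocity(2.0, 0.0)

# A static block only has a Position, so the movement system skips it.
block = create_entity()
positions[block] = Position(5.0, 1.0)

movement_system(0.5)
```

Giving an entity some one-off behavior is a matter of adding a row for it in one more component table, which is how this arrangement accommodates the ad hoc cases. Logic that needs to compare two unrelated entities does not fit the single-loop shape of `movement_system`, which is exactly the open question above.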