Hi, are you enjoying EuroPython so far? Okay. Are you getting inspired? Yeah? So I guess we're going to apply everything we've learned to next week's work, right? No, no fucking way. It's very difficult. No matter how incredible these conferences are and how much we keep learning at them, there's always some blocker that prevents us from doing these amazing things in our jobs. I think the task of every speaker here should be to warn you about how much shit you're in, and that's why I'm here. I'm going to present the actual problem in all its clarity and, hopefully, offer you a way out. That's essentially the talk.

But first of all, can you raise your hand if you deal with data at work? It's a stupid question, right? And instead, is anyone here talking about CPU at work, optimization, that kind of stuff? Okay, not that many. I know it's a loaded question, because engineers used to raise the most hands for the second question, not the first one. Over the last 20 years or so, lots of companies have sprung up for which data is a bigger challenge than CPU. The tool that probably exemplifies this the most is Python itself, where we've given up some control over memory management and that kind of stuff, and in exchange we get much more ability to glue things together.

So the problem, essentially, is that companies nowadays deal with data rather than CPU: its complexity, its amount, how difficult it is to deal with, and its speed of change. It used to be the case that engineers became experts by mastering algorithms and data structures. Nowadays we become experts by mastering the selection of the tools that are most appropriate for the task. I'd say a true expert engineer is someone who can identify the general-purpose tool that is needed for a particular task.
However, we engineers face most of our engineering challenges using tools that I feel, and I think you'll agree with me, are a bit inflexible when it comes to changing them and combining them with what already exists. It is as if we keep using the same data persistence tools that we chose at the very beginning. That's the deal: whatever you choose at the beginning is what you're stuck with. In a sense, it's as if we are provided only with a hammer, and obviously we're puzzled when we realize that not every problem in front of us looks like a nail.

So let me show you a concrete example of what I mean. Given that literally every company seems to be creating a social app nowadays, let's have a quick thought experiment. With the tools that you use at work, how would you describe in code the user class of a social media app? Try to envision that for just a sec.

Let me show you what I think came up in your head. Probably it's something a lot like this. Maybe it's not the tool that you use, maybe the technology is different, but essentially this is it. This is called the active record pattern, and I'm going to show you in no time why I hate it so much.

It's incredible, nevertheless, that active record, however much I hate it, and probably that's the reason I hate it so much, is that, mind you, it's SQLAlchemy. It's Django. It's Pony ORM. It's Tortoise.
It's Peewee. Everywhere, every ORM out there, at least in Python and probably elsewhere as well, uses this very same pattern to teach beginners how to develop websites.

I would say engineers are not necessarily wrong when they choose the active record pattern, but it only works well when the problem is not very complex. The tool is easy to build with and easy to understand, and that's a good thing for developers who are just starting out, but it only works well over time if you maintain a direct relationship between the objects and the database.

So even though tools like Django, Flask and the like have become the de facto standard of web development, they became so only because they made specific design choices aimed at emphasizing speed of development. In a sense they worked, they succeeded, and they became prevalent because they are quick-start frameworks for developers. Past that cutover point, however, we run into trouble. So what exactly is wrong with these kinds of tools?
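
The slide shown at this point is not part of the transcript. As a stand-in, here is a minimal sketch of what an active-record user class tends to look like, written against the standard-library `sqlite3` module rather than any particular ORM. The table layout, the `handle` column and the module-level `conn` are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical in-memory database shared by every model instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, handle TEXT)")


class User:
    """Active-record style: one object is one row, and it saves itself."""

    def __init__(self, handle, id=None):
        self.id = id
        self.handle = handle

    def save(self):
        # Persistence logic lives inside the domain class itself.
        if self.id is None:
            cur = conn.execute(
                "INSERT INTO users (handle) VALUES (?)", (self.handle,)
            )
            self.id = cur.lastrowid
        else:
            conn.execute(
                "UPDATE users SET handle = ? WHERE id = ?", (self.handle, self.id)
            )
        conn.commit()

    @classmethod
    def get(cls, user_id):
        row = conn.execute(
            "SELECT id, handle FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return cls(handle=row[1], id=row[0]) if row else None


user = User("salvador")
user.save()
```

The point to notice is that the class and the table are the same thing: changing how users are stored means changing `User` and everything that calls `save()` or `get()`.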
Most engineers, even experts (it's not a matter of seniority), realize that they are stuck there: they find themselves unable to make progress. Even though they were told to choose boring technology in order to avoid unknown unknowns, they fall into problems that are way too familiar. That's essentially the problem with decision-making under uncertainty: it's dominated by information that you don't have yet. What matters is what hasn't happened. It's like that quote by Steve Jobs about connecting the dots, which you might recall: you can only do that in retrospect.

So, to give some context on what I meant by the social media app and why this is a problem: Raffi Krikorian, who used to be VP of Platform Engineering at Twitter, has a great talk called "Timelines at Scale" in which he essentially claimed that the problem Twitter faced as they were scaling wasn't the number of people writing new tweets into the system, but the inability of the system, at scale, to show the home timeline page to a lot of people who follow a lot of people. Let me explain what I mean by that.
So there are at least two ways in which you can architect a social media app. One is essentially the one we just got with the active record pattern, which makes it easy to write into the database. You write a new tweet, it gets stored in the database, and it's related, by the user's primary key, to the user who posted it. Through the follower/followee relationship, the many-to-many table, we're able to show this information to every user when they go to the home page.

The other way is to make things easy to read. The way they achieved that at Twitter was to maintain some sort of cache for each single user, a sort of mailbox, if you want a clear metaphor. Every time a tweet was written into the system, they fanned out that tweet and copied it into every follower's timeline, so that the next time a follower went to the home timeline page, what they would get from this cache is all the tweets, already ordered. Essentially, what Twitter did was to control more granularly how data is read from and written to the database.

So the problem is that the soundness of this choice comes down not to what happened at the beginning, or how easy it was to develop at the beginning, but to the ratio between how many tweets are read and how many tweets are written on the system. The fact that the writes were on the order of thousands, whereas the reads were on the order of hundreds of thousands, made the key difference.

So the question remains: once you know that information, how would you go about migrating from the active record pattern to this other strategy? You can't. That's essentially the problem: you can't, unless you just throw it away and start over. You see the problem, right?
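
To make the two strategies above concrete, here is a toy sketch that uses plain in-memory dictionaries instead of a database. It is not Twitter's implementation; every name in it (`follow`, `post`, `mailboxes` and so on) is made up for illustration.

```python
from collections import defaultdict

followers = defaultdict(set)   # author -> readers who follow them
following = defaultdict(set)   # reader -> authors they follow
tweets = []                    # append-only log of (sequence, author, text)
mailboxes = defaultdict(list)  # reader -> precomputed home timeline


def follow(reader, author):
    followers[author].add(reader)
    following[reader].add(author)


def post(author, text):
    tweet = (len(tweets), author, text)
    tweets.append(tweet)              # strategy 1: the write is one append
    for reader in followers[author]:  # strategy 2: fan out on write
        mailboxes[reader].append(tweet)


def timeline_on_read(reader):
    # Strategy 1: cheap writes, but every read scans and filters the log.
    return [t for t in reversed(tweets) if t[1] in following[reader]]


def timeline_from_mailbox(reader):
    # Strategy 2: the work was done at write time, so the read is a lookup.
    return list(reversed(mailboxes[reader]))


follow("ana", "bob")
post("bob", "hello")
post("carl", "ignored by ana")
post("bob", "again")
```

Both functions return the same timeline for `"ana"`. What differs is where the cost is paid, which is why the read/write ratio decides which strategy is sound.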
Active record makes it impossible to accommodate new information about how the system is used. Your only way out is essentially to throw it away and start over, which is obviously not a good idea, because in the meantime what you're left with is a system that hasn't scaled. The problem has already arrived at your door, and your only solution is to build something in parallel to try to accommodate the load as far as you can. Then you have the migration, with two systems operating at the same time, and in the meantime active record is unable to serve the home timeline queries.

Hopefully people recall what this is, right? This is exactly that very problem. This is what happened to Twitter, and this is what we are bound to as software engineers every single day because of the active record pattern. The fail whale, as it came to be known, showed us exactly what the root of the problem is: when the requirements of the database, or the data system, so to say, changed, active record prevented us from moving from an unsuitable architecture to one that is more suitable for the task.

Engineers who choose active record may be optimizing for quick wins, but they do so at the expense of more rigidity at the data system level. They get to develop a relational data model that is fast to implement, but they're stuck with it. It's okay if you use active record for simple projects, I don't have anything against that. But in a larger sense, we cannot dedicate ourselves to solving the impedance mismatch between objects and a database. We cannot consecrate the mediocre state of a system and just avoid touching it because, yeah, I don't want to bring production down. And we cannot hallow the great rewrite as the only solution that we have to accommodate an early database decision that proved to be not good, so to say.

So, from this humble stand, what I can do is ask you to join me in declaring these systems free and
independent from all allegiance to early database decisions.

What I want you to have instead, rather than depending on an architecture that is rigid from the very beginning just because you needed something quick, is this: I want you to aspire to engineer systems that are modular, where the value of the technical platform comes not from the capabilities you are provided with, but from the ability to migrate to something else that might be more suitable in the future. In doing so, decisions about how the data system works can always be reverted when new information comes along. In a world where we constantly iterate in order to find product-market fit, it just stands to reason that we do the same with data systems: we keep the ability to change our minds later on, when the new information comes along.

This is what the concept of persistence ignorance essentially means: we architect the system in such a way that the rest of the application behaves as if all the data were in memory, and we find a way to encapsulate all the database-related, data-system-related decisions in one single place, so that everything else is abstracted from the fact that there's even a database to begin with.

So that's essentially the talk that I'm going to give today. My name is Salvador, and this talk is called "Working in Units", which is the name I came up with to describe this way of doing things. Maybe it will provide you with wins that are less quick than you would expect, but in exchange you get something that I believe is a very specific need that all the people here have: the ability and the flexibility to change data systems.

In order to do that, I'm going to do something that I do believe is controversial: I'm going to invert the relationship between the database and the objects. So this is essentially what is going on.
Rather than declaring, with active record, how the database operates, we're going to specify exactly, and with precision, what it means to go from the database to the objects in our domain. With this imperative approach we admit probably a bit more complexity because of the extra layer, but in exchange the two levels, object and database, operate relatively independently of each other. So essentially we're shifting from one pattern to another, called data mapper.

Presumably you haven't heard of it. Even experts often aren't aware of this design pattern, specifically because of the special effort you need to make to treat these two levels as coherent with each other but at the same time independent. It puts pressure on your expertise to manage this relationship rather than just declare it and forget about it. But in exchange we get to treat the classes in our domain as something entirely different from database rows, and that allows us to do all the sorts of tricky things that we are being taught during this conference and are so inspired to apply right away. Unless we architect things to be independent from the database, we are bound to treat objects in our code as if they were rows in the database, and then there's no way that some fancy trick with object-oriented programming is going to work. It's bound to fail.

So the class that is going to handle this relationship, rather than just declare it, is the one called the repository.
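
The inversion described above can be sketched in a few lines. This is an illustrative toy, not the code from the talk: the domain class knows nothing about storage, and two plain functions spell out, imperatively, how a domain object becomes a row and back. The `following_csv` column is a hypothetical storage layout chosen only to show that the row shape and the object shape are free to differ.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Domain class: no table name, no columns, no save() method."""

    handle: str
    following: set = field(default_factory=set)

    def follow(self, other_handle):
        # Business rule expressed on the object, independent of storage.
        if other_handle == self.handle:
            raise ValueError("users cannot follow themselves")
        self.following.add(other_handle)


def user_to_row(user):
    # Domain object -> storage shape (hypothetical column layout).
    return {
        "handle": user.handle,
        "following_csv": ",".join(sorted(user.following)),
    }


def row_to_user(row):
    # Storage shape -> domain object.
    names = row["following_csv"].split(",") if row["following_csv"] else []
    return User(handle=row["handle"], following=set(names))
```

Because the mapping is ordinary code, the storage layout can change (a join table, a cache, a document) by rewriting these two functions while `User` stays untouched.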
Essentially, the repository is just a class that centralizes the methods through which we obtain instances from the database and update the database with new ones. I put up an example. This is not really production-like code, but the essence of the class is that you have the ability to create stuff in the database and get stuff back from the database, always in terms of domain classes rather than database models.

However, the repository pattern on its own is not enough. Because of this extra layer, you somehow need to address the problem of concurrency. If it were just the repository, we would have to be constantly trying to keep things in sync, and that would defeat the purpose of behaving as if everything is in memory: you would be clearly exposed to the idea that there's concurrency going on and a database involved.

So rather than figure skating on the edge of concurrency, what we're going to do is present a context manager called unit of work, which obviously is the reason this talk is called "Working in Units". In it we define what is going to happen when a commit or a rollback happens. But it can do more than that: it can open the connection, it mitigates the concurrency problems with isolation levels, and it can also write things down very efficiently. At the end of the context manager we write everything down to the database and we're good to go.
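
The example from the slides is not in the transcript, so what follows is a minimal sketch of the two pieces working together, using only the standard library. `Tweet`, `TweetRepository`, `UnitOfWork`, the `tweets` table and the temporary database file are all assumptions made for illustration. Isolation levels and batched writes, which the talk mentions, are deliberately left out.

```python
import os
import sqlite3
import tempfile
from dataclasses import dataclass

# Hypothetical throwaway database file so each unit of work can reconnect.
DB_PATH = os.path.join(tempfile.mkdtemp(), "tweets.db")
_setup = sqlite3.connect(DB_PATH)
_setup.execute("CREATE TABLE tweets (author TEXT, text TEXT)")
_setup.commit()
_setup.close()


@dataclass
class Tweet:
    author: str
    text: str


class TweetRepository:
    """The only place that knows tweets live in a SQL table."""

    def __init__(self, conn):
        self.conn = conn

    def add(self, tweet):
        self.conn.execute(
            "INSERT INTO tweets (author, text) VALUES (?, ?)",
            (tweet.author, tweet.text),
        )

    def by_author(self, author):
        rows = self.conn.execute(
            "SELECT author, text FROM tweets WHERE author = ? ORDER BY rowid",
            (author,),
        )
        return [Tweet(*row) for row in rows]  # domain objects, not rows


class UnitOfWork:
    """Context manager: commit on success, roll back on any exception."""

    def __init__(self, path):
        self.path = path

    def __enter__(self):
        self.conn = sqlite3.connect(self.path)
        self.tweets = TweetRepository(self.conn)
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.conn.commit()
        else:
            self.conn.rollback()
        self.conn.close()
        return False  # never swallow the caller's exception


with UnitOfWork(DB_PATH) as uow:
    uow.tweets.add(Tweet("ana", "hello"))

try:
    with UnitOfWork(DB_PATH) as uow:
        uow.tweets.add(Tweet("ana", "never stored"))
        raise RuntimeError("something failed mid-transaction")
except RuntimeError:
    pass

with UnitOfWork(DB_PATH) as uow:
    stored = uow.tweets.by_author("ana")
```

Only the first tweet ends up in `stored`: the second block raised, so its insert was rolled back. The calling code never touches a connection or a SQL string.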
Writing everything down at the end is essentially what makes it so efficient. But more importantly, the rest of the code is abstracted from the idea that there's even a database. What I mean by that is that if the user posts a tweet, the unit of work opens up the connection and relies on the repository to move this information back and forth from the database. In this case it might even be posting it to the mailbox of tweets that I was mentioning before. Then it goes back, so that the next time someone goes to the Twitter home page, we open the connection with the unit of work, go back to the repository that is associated with that context, and it might retrieve that information either from the cache or from the database.

Which is the final twist of this situation. In Twitter, and in any social media app, there are two kinds of people, so to say. There are the people who are followed by millions and hundreds of millions, and, well, there are millions of others like me, I guess. The problem is that if you try to mailbox tweets from these widely followed people, you're going to run into problems with resource intensity. There's no way you can copy-paste tweets into the mailboxes of hundreds of thousands of followers. So instead, you merge that information directly from the database at read time, which is way more cost-efficient. So these people are getting a different treatment just because they're celebrities, not only in terms of the real world, although that's the case, but also in terms of how the information they post on Twitter gets fanned out to the rest of their followers.

The tricky thing about this, and what's so remarkable, is that we've achieved precisely the scenario that was so ideal.
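
As a rough illustration of the hybrid treatment described above, the sketch below keeps the whole decision inside one repository class. It is an in-memory toy, not Twitter's design: `TimelineRepository`, `CELEBRITY_THRESHOLD` and the integer `clock` used for ordering are hypothetical names and simplifications.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 2  # hypothetical cutoff, tiny so the demo stays small


class TimelineRepository:
    """Decides per author whether to fan out on write or merge on read."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> readers
        self.following = defaultdict(set)   # reader -> authors
        self.mailboxes = defaultdict(list)  # reader -> fanned-out tweets
        self.by_author = defaultdict(list)  # author -> their own tweets
        self.clock = 0                      # stand-in for a timestamp

    def follow(self, reader, author):
        self.followers[author].add(reader)
        self.following[reader].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def add_tweet(self, author, text):
        self.clock += 1
        tweet = (self.clock, author, text)
        self.by_author[author].append(tweet)
        if not self.is_celebrity(author):
            # Ordinary accounts: copy into every follower's mailbox.
            for reader in self.followers[author]:
                self.mailboxes[reader].append(tweet)

    def home_timeline(self, reader):
        merged = list(self.mailboxes[reader])
        for author in self.following[reader]:
            if self.is_celebrity(author):
                # Celebrities: merge their tweets in at read time.
                merged.extend(self.by_author[author])
        # set() drops duplicates if an author crossed the threshold later.
        return sorted(set(merged), reverse=True)


repo = TimelineRepository()
repo.follow("ana", "star")
repo.follow("bo", "star")
repo.follow("ana", "friend")
repo.add_tweet("friend", "hi")
repo.add_tweet("star", "big news")
timeline = repo.home_timeline("ana")
```

Callers only ever use `add_tweet` and `home_timeline`, so switching an author from one treatment to the other, or changing the threshold, touches nothing outside this class.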
Given a new piece of information, I believe it's sensible to see that the implementation of this hybrid approach will lie entirely within the context and the boundaries of the repository. It's not going to be scattered across the whole system. Which leads me to believe that this is exactly the core advantage of this design: the ability to change your mind without changing too much code.

So for those of you who are interested in reading more about this, or even, as I said, getting inspired to try some of these ideas, I would suggest these three resources. The first one is "Architecture Patterns with Python", in which most of these ideas are discussed in the context of domain-driven design. The second one is "Designing Data-Intensive Applications", a page-turner if you ask me. The book is all about how to navigate the landscape of the new data systems that have come along over the last 20 years, and it will inform your decision-making about what the best data system for the task is. And last but not least, the good old "Patterns of Enterprise Application Architecture", in which the design patterns that I've discussed are described in more detail.

So yeah, I'm going to be available on the Discord channel, but you can also ping me on Twitter, and I also have a Substack in which I discuss most of these ideas in the context of payments and the travel agency industry. So thank you so much.

So thank you very much, and we have a bunch of time for questions, so just queue up at the microphones. Actually, while you do, maybe can you describe what's basically the difference between the unit of work pattern and a session in the database?

So that's actually a very good question, honestly. What's so interesting about this question is that the session is indeed a unit of work. SQLAlchemy, for some reason, used this pattern in the beginning.
It's called classical mapping in the documentation, you can look it up. But they ditched it, or at least they are now masking it with some metaprogramming so that they can offer the active record pattern. I believe what happened, essentially, was that they realized that unless they provided this active record option for younger or more beginner developers, they couldn't reach market adoption, in a way. So they had to deal with this pattern. But the question remains: what happens when I want to move beyond hobbyist projects? SQLAlchemy provides you with the option, but the documentation is kind of in the background, you know what I mean?

Hi, thanks for the talk, it was nice. I have a question about the internals of the pattern. You mentioned at the beginning that you would like the business logic to operate as if all the objects were essentially in memory, and to do it via domain-defined objects and types. So whenever you open this context manager for the unit of work and then manipulate objects, you need to somehow track the changes in order to efficiently reflect them in your storage system when the unit of work is done.

Yes.

So how would you go about that?

So, I mean, there's a whole, not industry, but a whole group of people dedicated entirely to this idea. It's essentially what I was describing with the problem of figure skating on the edge of concurrency. If you're trying to make changes while someone else is making changes from a different server or a different thread, you run the risk of producing race conditions, or the other stuff that might happen when you've opened the connection but haven't closed it yet. What happens while this stuff is going on, right?
So there's the documentation on SQLAlchemy, for instance, which is the one that I'm most familiar with, and I can share it with you after the talk. But it's all about isolation levels, which essentially means that the session should be clever enough to figure out how transactions exchange information with each other at the moment of the commit and how they interact when they run together. I can share more information with you later on, don't worry.

Right, but I wasn't asking specifically about SQL and relational databases. What happens if your data store is something else, like files or whatever, and you want to save the changes that you made to the objects in memory? You need to know what the changes are, so how do you know that?

So, in terms of document stores, I'm not really familiar with NoSQL databases, I'm not really an expert, but I'm sure there must be a way in which you can do that. The thing with document databases is that they put more pressure on the application code, and integrity is handled by the application rather than by the database, so to say. So I would say that, long story short, it's your job to make sure that that's the case.

Okay, thanks.

Hi, thanks for the talk. Do you have any advice? Let's say you wanted to implement this in an existing project, but one with loads and loads of database access. How would you start?

Okay, so first things first, I really liked the talk that you gave us yesterday. So the problem is essentially how you migrate from what you have to this, and I honestly don't think I have a good answer at this point.
What I would say is that I would start by trying to replicate this in what you have. Eventually these concepts should emerge somehow naturally. Without more specifics about the system you're operating in, I cannot give a proper answer. But yeah, I encourage you to try, and maybe you can give a talk next year about how you did it.

I actually was one of the technical reviewers on that book, the first one, and I have tried, and we have done some of this.

Okay.

So we have a big Django code base with like 27,000 modules, many, many, many models, and yeah, we definitely got some way. But I don't think we've got to the point of being able to have repositories. We've got use cases, but no further.

I'm looking forward to having a conversation with you so you can explain to me what you did and how it worked.

Okay, we have time for one short question, like one minute.

Hi, hello.

Hi, how's it going?

Cool, yeah, very well. I enjoyed your talk. For years we Python developers, superior Python developers, have ridiculed Java developers and C# developers for writing a bunch of boilerplate code to transfer a user as in the database user, to a user as a business user, to a user as a front-end presentation layer user.

Yes.

What you are suggesting here is essentially going back to that concept, at least partially, more or less. That's the assumption. So how would you respond to this critique? Because I don't feel like it's entirely true, but I would like to hear it from you: how is what you are proposing different from what these inferior Java and C# developers have been doing for years?

Okay, thanks for your question. I think it's a good question, because it shows that there's been a failure to understand the difference between how databases operate and how object-oriented programming can happen within that context.
And that's essentially what happened: without the ability to use domain classes truly as objects and apply Python to them, you are incapable of doing all the stuff that you mention, and that's why it proved inferior. So I'd say that rather than going back to what you're describing, what these ideas allow you to do is what you were actually trying to do in the first place but failed to do.

That makes sense to me. Thank you.

Thank you very much.

Let's get it