 Thank you. It's wonderful to see so many people here for a domain-driven design topic in Python. I've been involved in programming with Python since 2000, so that makes 18 years now. I think I came in around 1.5.1 or something, so a long time ago. I've been working in domain-driven design projects in Python and in other languages for about the last decade. So I'm here to share with you some of my experiences from working with Python and DDD. There's a lot of content in this talk, and you'll see why in a moment. I'm not sure we're going to get through all of it, but I'm going to rattle through it, and you can come and talk to me afterwards if you need clarification on anything I talk about. I'm going to start with a quick introduction of DDD, domain-driven design. What is it anyway? I'm going to move on to some strategic DDD topics, ubiquitous languages and bounded contexts. Then we're going to go to the other end of the scale and talk about the smallest parts, the smallest simplest patterns in domain-driven design, value objects and entity objects. Then we're going to move on to aggregates, which is a topic that some people find very confusing at first when they first get into DDD. We're going to shift gears and talk about software architecture and how the architecture of an overall application can be centred on a domain model. Then if we have time, I'm going to talk about repositories and domain events. So nearly all the content you're going to see today is extracted from a training course that my company Sixty North delivers. I don't normally promote commercial offerings at community conferences. I'm getting an exception today because I'm going to promote Pycon UK in Cardiff later this year where we are giving this course for free for two days as part of the Pycon UK conference. So if what you see today interests you, you just need to come to Pycon UK and you can get the full two-day experience. 
So let's get back to the programme and think about, well, what is DDD? A health warning first, though. There are no answers in software design or modelling, but there are plenty of choices to make. And whenever we design a software system or design a domain model, it's all about making choices. And there aren't necessarily any right answers. So I'm not going to give you right answers today. Discovering good answers for your context is something that's very much up to you. What we've noticed is that domain-driven design is often encountered in classic enterprise computing projects, which are nearly all written on the JVM stack, in Java, or on the .NET stack, in C#. And some other languages have picked up quite a big DDD community, like F#, for example, but we see relatively little DDD activity in Python, which is a shame because Python is a great fit for DDD. There's nothing that is happening in DDD that we cannot implement in Python. Domain-driven design is about distilling a problem domain down to its essence, and Python allows us to produce very low-ceremony solutions, and so they're actually an excellent fit for each other. So we're going to talk about how to bring these things together, and hopefully you can get inspiration from this and go away and think about using DDD in your own projects. So one approach to learning DDD is to go out and buy the original book by Eric Evans, which is, what, 15 years old now, from 2003? It's a very heavy blue book. It's almost completely indigestible unless you've already been through a DDD project. So this is not the right place to learn DDD. It's a great book to read when you've been through a couple of DDD projects, ideally one of which has failed, and then you will understand what Eric Evans was talking about. So this is definitely a book that you will come to in your journey with DDD, but it is not a good starting point. 
Later in the talk, I will refer you to some much better starting points for getting to grips with DDD. So DDD is a philosophy for bringing software closer to the problem domain. We are coming up with software solutions that are driven by the problem rather than driven by technology, by involving domain experts, and it's a systematic approach which involves some strategic design practices. I'm not going to really talk about those today. It also involves a bunch of principles and some guidelines, rules and patterns, which are largely what I am going to talk about today. DDD is not about specific technologies. It's very largely technology agnostic. Technology is, of course, important, but it isn't the driver. It's not a heavyweight method. It doesn't imply that we need to do waterfall and do all of our design up front. It's completely possible to use it in an agile context. And also, it's not always appropriate. Anybody who stands up at a conference and tells you that you're doing it wrong unless you do it their way is a charlatan. All the advice you get at software conferences is context-dependent, and there are plenty of contexts in which DDD is not appropriate. So you should be aware of the contexts in which DDD works well and be prepared to use it in those contexts, but be prepared to search for other types of solutions in other contexts. So let's take some pictures from Eric Evans, the father of DDD, if you like. This is very difficult to read, so we'll zoom in on part of it. Actually, we'll go back and zoom in on this part. So the practices and patterns of DDD are really broadly divided into two areas. There are the strategic DDD patterns, which have a broader context. They're very influential on how a project turns out or how a product turns out, but they are not really related to the kinds of decisions you're making every day as a programmer, as a developer. I'm going to briefly talk about some of these, particularly bounded context and ubiquitous language. 
Then there are the tactical DDD patterns, which are the things that as a developer, as a programmer, you're dealing with day in, day out. You're creating entities, you're creating value objects, you're working with aggregates. So, as programmers, these are the things that you'll interact with much more frequently. So before we get into the tactical aspects, let's just quickly look through ubiquitous language and bounded contexts. So, one definition of domain-driven design, by Vaughn Vernon, is that DDD is primarily about modelling a ubiquitous language in an explicitly bounded context. Now, that is actually a very deep statement, and it makes a lot of sense to me because I've been doing this for a long time, but I'm going to try and unpack that for you and what it means. Now, all domains have a language, which people working in that domain use to talk about the things in that domain, including DDD. This is a kind of meta ubiquitous language. When I'm giving this talk, you'll hear me use words like domain and design and language and system and context, because this is the ubiquitous language of DDD. But any domain that you're working in will have its own language, and the experts working in that domain will have their own language. One of the key things we need to do in a DDD project is to learn that language, and rather than imposing our own language on the domain, take the language from the problem domain into our solution. So, this is called the ubiquitous language, ubiquitous meaning used everywhere or being everywhere. So, one of the key objectives of DDD is to create a supple, knowledge-rich design, which calls for a versatile shared team language, and that's not just shared amongst developers, that's shared amongst everybody involved. It also calls for lively experimentation with the language. It takes time to discover the language, and you won't get it right first time. 
You need to experiment and discover and refine and be prepared to change decisions you've made earlier. The ubiquitous language is context dependent. If you're doing fighter aircraft avionics, that's going to have a very different ubiquitous language to sending out people's utility bills. A ubiquitous language is valid within a context, within a limit of applicability, and that is called the bounded context. I'll come on to that in a moment. So, the language is used with the domain experts, it has a shared meaning, it's unambiguous, with a single meaning, and, this is really important, the language is expressed in the software model. We use those words when we're writing the software, and this helps reduce the distance between the problem domain and the solution domain. And because the solution is very close to the problem, any change that happens in the language, because the domain will evolve over time, should be reflected as a change in the software, as a change in the model. So, we need to strive to use the ubiquitous language when we're naming things in code, when we're naming things in databases, when we're naming things in tests. A bounded context is the limit of applicability of a ubiquitous language, and many larger domains will have multiple bounded contexts within them. If you work for an energy trading company, there will be an energy trading bounded context, but there might be some kind of systems administration bounded context where different things are going on, and different languages apply in those areas. So, it's very easy to come up with words which mean different things in different contexts, like the example I have on the slide. One of the failure modes of DDD projects for people who are new to it is that they try to model everything as a single, grand, unified model, rather than many smaller models with the narrower scopes of bounded contexts. 
So, when you're modelling, you should definitely exhibit a bias towards separate smaller models rather than one grand unified model for your domain, because if you build a grand unified model of the domain, the larger domain, it will be suboptimal for everybody, and nobody will have a good time working with it, and you'll spend a lot of time wondering why the things that are going on in the real world don't map very neatly onto the things that you've expressed in your software. So, models will exist within bounded contexts. You should prefer to have multiple bounded contexts for anything other than trivial domains. The context is the scope of the language. We have separate segregated models and independent autonomous implementations. You don't even have to write them all in Python. Maybe another language is appropriate for some of these models. Ultimately, these bounded contexts of language will align with technical components. If you identify three bounded contexts in your domains, it's a really good idea to have three systems. If you try to map that onto two systems or four systems, you're setting yourself a difficulty. Ideally, they should align with technical components and be loosely coupled. Then you're going to have to do some work, which I'm not going to talk about further today, in context mapping in terms of how concepts in one domain map to another, and a context map is both a conceptual thing, but ultimately it needs to be a piece of software that is able to move information and share information between these domains, mapping from concepts in one domain to concepts in another. For example, you might have a user in one bounded context, but that might be a customer in another bounded context. They are clearly very related concepts, but you might wish to model them differently. Let's go back to the other end and look at value objects. These are just about the simplest things that we, the simplest pattern in tactical domain-driven design. 
Value objects are used to measure and describe things in the domain. We're trying to measure or describe characteristics like a currency value or a telephone number. Equivalent values are interchangeable. We don't actually care about the identity of the object. When we're dealing with value objects, we care about the value of the object. They're comparable by value, and they lack an intrinsic identity. If we have a string which contains someone's name and another string which contains the same name, the fact that they're equal is fine. The fact that they are actually different string objects is irrelevant. They are a name. They're self-contained, and normally they're encapsulated and owned by entities, which we'll come on to next. It's a very good practice to make your value objects immutable, which means that instances of them can be shared. We have some really nice value objects in Python already, such as time and string. They are immutable and shareable. Generally speaking, though, you should avoid using naked value objects like integer or float or string. You should try to build some kind of domain-specific abstraction around those types. The reality is that your actual problem domain does not have things in it called string and does not have things in it called integer. It does have something in it like a currency amount or an amount of money or a duration. So strive to make them immutable. The easy way of doing that is to lean heavily on the built-in immutable types in Python: tuple, frozenset, str, et cetera. Validate them in the initialiser, dunder init or dunder new. You might be using dunder new here because you can share instances that are immutable. You can do interning. Provide getters with property. Remember to override the equality operator. In Python 3, I don't think you need to override the inequality operator any longer. You get that for free. Give them a useful dunder repr. 
Think about whether rich comparisons are necessary to implement for that type. Quite often they are. Consider preventing mutation by overriding dunder setattr et cetera to prevent people mutating the object. Of course, always give your types a useful dunder str. Consider a dunder format as well. That's quite often useful for many of these value types. Use side-effect-free functions to produce new values. If you do need to modify a value object, rather than having a mutating method on it, have a method which returns a new value object with a modified value. Datetime is a good example of that in the standard library. Datetime is immutable. You can create new datetimes by replacing the day with a new number, but it doesn't actually mutate the datetime object. It returns you a new datetime with all the other numbers the same, but the day is different. That's a very good approach to follow. Consider using what I call named constructors, using class methods, so that it gives you an opportunity to express the domain language, the ubiquitous language, in your class when you're creating things. I've already mentioned interning with dunder new. Here's a very simple value object. It's an email address. This is quite a naive implementation, but it's simple enough to show on a slide. I have a class method, which is a named constructor. I'm going to construct an email address from a string. I have a dunder init, which accepts a local part and a domain part. They are the parts either side of the at symbol. I have a dunder str and a dunder repr. They're not very interesting. I've implemented dunder eq for equality and dunder hash, which needs to go along with dunder eq. I have two properties for read-only access to the local part and the domain part of the email address. I have a single method here, replace, which allows me to either replace the local part or the domain part, or both, and notice that it returns a new email address object. 
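The slide itself is not reproduced in this transcript, so what follows is only a minimal sketch of a value object along the lines just described. The names `EmailAddress`, `from_text`, `local`, `domain` and `replace` are assumptions chosen for illustration, and the validation is deliberately naive.

```python
class EmailAddress:
    """A deliberately naive email address value object (illustrative only)."""

    @classmethod
    def from_text(cls, address):
        """Named constructor: build an EmailAddress from text such as 'a@b.c'."""
        if "@" not in address:
            raise ValueError("Email address must contain '@'")
        # Split on the last '@' so the domain part never contains one.
        local_part, _, domain_part = address.rpartition("@")
        return cls(local_part, domain_part)

    def __init__(self, local_part, domain_part):
        # Validate once, here; the object is never changed afterwards.
        if not local_part or not domain_part:
            raise ValueError("Both a local part and a domain part are required")
        # Lean on a built-in immutable type to hold the state.
        self._parts = (local_part, domain_part)

    def __str__(self):
        return "@".join(self._parts)

    def __repr__(self):
        return "EmailAddress(local_part={!r}, domain_part={!r})".format(*self._parts)

    def __eq__(self, other):
        # Compare by value, not identity.
        if not isinstance(other, EmailAddress):
            return NotImplemented
        return self._parts == other._parts

    def __hash__(self):
        # Must be consistent with __eq__.
        return hash(self._parts)

    @property
    def local(self):
        return self._parts[0]

    @property
    def domain(self):
        return self._parts[1]

    def replace(self, local=None, domain=None):
        """Return a NEW EmailAddress with one or both parts replaced."""
        return EmailAddress(
            local_part=local or self._parts[0],
            domain_part=domain or self._parts[1],
        )


original = EmailAddress.from_text("alice@example.com")
moved = original.replace(domain="example.org")
print(original)  # alice@example.com
print(moved)     # alice@example.org
```

The original object is untouched by `replace`, so it can be shared freely between entities.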
You can fake mutation, if you like, by always returning new objects. Value objects are very simple to build in Python. The thing is to focus on immutability. Most of your model can be implemented in terms of value objects, usually. It's a very good place to start. The reason immutability is so important is because it makes it so much easier to reason about and test and debug, because there's essentially no behaviour in them beyond some initial validation. Let's move on to entity objects, which are the next level of sophistication. Entities are different because they represent things that are distinguishable by identity. Even if they are completely equivalent in value, the fact that one of them is a different instance to another one is an important thing. The identity needs to be stable over time, so an entity object cannot change its identity, and it needs to be unique within the bounded context of that particular model. For entities, life cycle is important. It's important when the entity is created, it's important how the entity evolves over time, and it's important if the entity is deleted. That's not something we often do, or deactivated, which is more common, is to say this entity is no longer in use. They're more often mutable, although immutability is always to be preferred, and they tend to be composite. They tend to be made of other entities and value objects. In Python, it's important to distinguish between creation of an entity, which only ever happens once for a particular entity, and instantiation of that entity as an actual live object in the system. You might register a user with your system as an entity, and that registration only ever happens once, but that user might be put into a live Python object millions of times over the lifetime of their interaction with the system. We need to distinguish between creation and instantiation. 
Creation needs to happen via a factory function, which needs to establish all the invariants, all the things that have to always be true about an entity. It probably needs to either create or get hold of a unique ID for that entity from somewhere. I've discovered to my cost that you should really strongly prefer what are called factless IDs, not an ID that comes from the domain. Just cook up an ID somehow. It could be an integer, it could be a UUID, a GUID, whatever, but don't make it some fact from the real world, because things that you think are immutable in the real world often aren't. So prefer an internally generated ID, and often the factory can just be a module-scope factory function sitting next to the entity class. Dunder init, on the other hand, is used for instantiation of the entity. That might get called many times over the lifetime of that particular entity, each time creating a different instance of it. Dunder init needs to accept the ID and all the state. It's already validated by the factory at this point. Dunder init tends to be very simple. It's generally just assigning to attributes. Consider maintaining a version, an object version, an entity version, within the entity, and possibly also an instance ID. It is very important that dunder init always succeeds, because you might be calling dunder init when you take an existing entity out of storage, and you never want that to fail. It needs to be in a valid state. So I generally use an entity base class. I'm not going to read this in detail, partly because it's too small on my screen here. But notice that the dunder init here is very straightforward. It's just assigning to attributes. It's not doing any validation work, because we're assuming that validation work has already been done elsewhere, in the factory. 
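The split between creation and instantiation can be sketched roughly as follows. This is not the code from the slides: the `Entity` base class, the `Customer` entity, the `register_customer` factory and the discard marker are all assumptions made for illustration, with a UUID standing in for the factless ID.

```python
import uuid


class DiscardedEntityError(Exception):
    """Raised when a discarded entity is queried or mutated."""


class Entity:
    """Hypothetical base class: identity, version and a discard marker."""

    def __init__(self, entity_id, version):
        # Instantiation only: just assign, no validation, so it cannot fail.
        self._id = entity_id
        self._version = version
        self._discarded = False

    @property
    def id(self):
        return self._id

    @property
    def version(self):
        return self._version

    def discard(self):
        # Mark as beyond use rather than trying to delete the object.
        self._discarded = True

    def _check_not_discarded(self):
        if self._discarded:
            raise DiscardedEntityError("Attempt to use {!r}".format(self))


class Customer(Entity):
    def __init__(self, customer_id, version, name):
        super().__init__(customer_id, version)
        self._name = name

    @property
    def name(self):
        self._check_not_discarded()
        return self._name

    @name.setter
    def name(self, value):
        self._check_not_discarded()
        if not value.strip():
            raise ValueError("Customer name must not be empty")
        self._name = value
        self._version += 1  # every successful mutation bumps the version


def register_customer(name):
    """Factory: creation happens once, establishes invariants, makes the ID."""
    if not name.strip():
        raise ValueError("Customer name must not be empty")
    return Customer(customer_id=uuid.uuid4(), version=0, name=name)


customer = register_customer("Ada Lovelace")                     # creation
same_customer = Customer(customer.id, customer.version, customer.name)  # instantiation
```

Here `register_customer` runs once per customer, whereas `Customer(...)` may be called many times, for example whenever stored state is loaded back into a live object.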
Nearly everything else that's on here is machinery to do with version keeping and a marker for whether the entity has been discarded and really put beyond use, which is generally a better technique than actually trying to delete the entity. That's an abstract base class, and I tend to carry something similar to this around. Here is an actual entity, which subclasses entity. This is a customer. We call the superclass constructor. We assign an attribute. And we have some getter and setter properties. And notice that the queries and mutators check the liveness of the entity, whether this entity has been marked as discarded, because if so, they need to fail. Notice the factory function there on the right, register customer. That gives us an opportunity to express the domain language again. We have two opportunities to express the domain language now. Once in the entity name as a noun, the class name, and once in the name of the factory function. It isn't just something like make entity, or sorry, make customer. We actually can say register customer, because that's what we're actually doing. That's maybe the language that's used in the domain. At the most basic level, domain models are constructed from graphs of entities, shown in the rectangles here, which own value objects. The immutable value objects may be shared. So the entities are identifiable, they have a life cycle, they're probably mutable, and they're composite, and the values measure and describe quantities; they're equivalent, interchangeable, self-contained, and preferably immutable. So this is very simple, and almost any Python system, whether it's built using DDD or not, will involve a domain model something like this. So what DDD brings, really, is what happens next, when we get to aggregates. So let's look at aggregates. So here's the previous picture again. What is an aggregate? 
Well, you might look at this and think, well, it's just a cluster of closely related entities, and that's partly true, but it's much more than object clusters. Although an aggregate may have more than one entity, often aggregates will have only one entity, and there are good reasons to prefer having only one entity. Aggregates really are consistency boundaries. We are going to require that the model is always consistent within an aggregate, but we are going to allow the model to be eventually consistent between aggregates. So it turns out that in your code, aggregates don't really exist. You don't have to write any code for your aggregate. It's really a convention that you follow about how you use your entities. So this is a part where people get a bit confused, because we have this very, very important idea, the aggregate, which isn't necessarily reflected as a separate class in the code. But this idea of consistency boundaries is really the most important idea here. You'll notice that every aggregate, shown in this kind of orange colour, has one entity which is special. That's called the aggregate root entity, and it is the entity that is responsible for maintaining consistency within the aggregate. Because it's responsible for maintaining consistency, it is the entity which hosts mutating commands on that aggregate. You can put mutating commands on the child entities within the aggregate, but then it gets more awkward to maintain consistency amongst the other entities within that aggregate. The aggregate is responsible for maintaining its own consistency. So while it's generally fine to put query methods on entities wherever they are in the aggregate, I find it works well to only put the mutating command methods on the aggregate root, because then it's in a good position to maintain consistency within the aggregate. Notice also that the aggregate root is special because it is the target for any inbound references to that aggregate. 
We are not allowed to have deep references to entities within another aggregate. If we want to refer to a specific sub-entity, we must provide enough information in that conceptual reference to go to the aggregate root and then navigate within the aggregate to a particular child. So that's a very important idea. The inter-aggregate references are to the root, and all inter-aggregate references are by root entity ID, ideally this opaque, factless ID that we talked about a moment ago. This is important because then I can instantiate part of my model, one aggregate, without having to instantiate all the other aggregates that it depends on. It's fine to use regular Python references within the aggregate, and it's fine to have the whole aggregate instantiated simultaneously. But between aggregates we need to use an ID. Generally speaking, the factory is responsible for creating a whole aggregate, so it may configure an entity with particular child entities at the time it is created. The factories facilitate aggregate creation and allow us to express the ubiquitous language. They hide the construction details, such as entity ID generation. Getting the aggregate boundaries right is really tricky, and you probably won't get it right first time. You need to iterate. Look for conceptual wholes which can be instantiated, queried and modified independently of other parts of the model. Look for parts of the model that it is useful to do work on alone. Use the delete rule of thumb. I'll come on to that in a moment. And as I mentioned already, we enforce consistency at all times within an aggregate, and allow eventual consistency between aggregates, possibly with asynchronous updates. Try to keep aggregates small. Many, or perhaps even most, should have only one entity. From a Python point of view, I find one module per aggregate is a pretty good way of organising things. 
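As a rough illustration of these conventions, here is a sketch of a single aggregate in a single module. Everything in it is an assumption made for the example: a hypothetical `Board` aggregate root owning `Column` child entities, a `start_board` factory, and work items that belong to some other aggregate and are therefore referred to only by ID.

```python
import uuid


class Column:
    """Child entity inside the Board aggregate: exposes queries only."""

    def __init__(self, column_id, name, work_item_ids=()):
        self._id = column_id
        self._name = name
        # References to OTHER aggregates are held by ID, never as objects.
        self._work_item_ids = list(work_item_ids)

    @property
    def name(self):
        return self._name

    @property
    def work_item_ids(self):
        return tuple(self._work_item_ids)  # read-only view


class Board:
    """Aggregate root: hosts every mutating command and keeps consistency."""

    def __init__(self, board_id, version, name, columns):
        self._id = board_id
        self._version = version
        self._name = name
        # Within the aggregate, ordinary Python references are fine.
        self._columns = list(columns)

    @property
    def id(self):
        return self._id

    @property
    def version(self):
        return self._version

    def column_named(self, name):
        for column in self._columns:
            if column.name == name:
                return column
        raise ValueError("No column named {!r}".format(name))

    def schedule_work_item(self, work_item_id, column_name):
        # Invariant enforced by the root: a work item is on the board once.
        if any(work_item_id in c.work_item_ids for c in self._columns):
            raise ValueError("Work item is already on this board")
        column = self.column_named(column_name)
        # The root reaches into its own child; both live in the same module.
        column._work_item_ids.append(work_item_id)
        self._version += 1


def start_board(name, column_names):
    """Factory: creates the whole aggregate and hides ID generation."""
    if len(set(column_names)) != len(column_names):
        raise ValueError("Column names must be unique")
    columns = [Column(uuid.uuid4(), column_name) for column_name in column_names]
    return Board(uuid.uuid4(), version=0, name=name, columns=columns)
```

Because the board only stores work item IDs, it can be instantiated and modified without loading any work item objects at all.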
I might have multiple entity classes within that if it's a more complex aggregate, or if it's a very complex aggregate I may end up promoting that module to be a package in its own right. That's generally how I find myself organising the code. As I said already, the command methods should be on the aggregate root, with only query methods on the child entities. It's okay for methods to accept transient references to other aggregates, as regular Python references, but you shouldn't keep hold of those things. It's okay to use them for the lifetime of a single method call, to do some work, but you shouldn't keep hold of those references. Above all, you should remember that your models need to be useful and not necessarily realistic. I've seen a lot of time wasted on producing incredible models of reality, which turn out not to be very useful for the actual problem the system is solving. So try to keep things as simple as you can, but as complex as they need to be to solve the problem. So I talked about how finding aggregate boundaries can be tricky. I'm going to show you quite a simple example. Think of something like Trello, software for a Kanban board. Let's think about some entities that are involved here. We have a work item. In my model here, I'm going to allow work items to exist independently from the Kanban board they are on. We have a board and we have some columns on the board. So we have three entity types here. It's a very simple model. And the board owns some columns and the columns own some work items, maybe. And thinking about where the aggregate boundaries could be here, we could do it this way. But this kind of feels wrong, doesn't it? You wouldn't model it like this because you wouldn't separate the columns from the boards into different aggregates, because you need to keep those things consistent with each other. But I've already said that work items can be allowed to exist separately. 
This might be okay. I could go for this. But I think the right answer, if there is one, is something like this. The board and the column are separate entities, but they live together within the same aggregate. Work items are their own aggregates because they're allowed to exist independently. A simple trick to follow is to use the delete rule of thumb and ask, well, if I start deleting things, what else necessarily needs to be deleted? So if I delete the whole board, it doesn't make sense for me to keep the columns around. But it might make sense for me to keep the work items around. So the delete rule of thumb is quite an effective way of getting at least a first-pass stab at where your aggregate boundaries need to be. We would allow the internal reference within the aggregate there, the board to the column, to be a regular Python reference, but we'd want the columns to know about the work items by work item ID. Of course, once I've finished on my whiteboard, then I might want to make this more formal. But I wouldn't often make the effort of drawing pictures like this. Any questions on aggregates before I go on? I'm just going to take a couple of questions now because there's more to come. I can repeat the question. The question was, how prevalent is the use of UML in domain modelling? My answer to that is that UML sees hardly any use in any project, including DDD projects. I would say it's well under 10%. I have this particular diagram because it comes from a training course. On a real project, would I draw this? Probably not. I have worked on projects where we build UML models after we've written the code, for documentation, but I would not recommend building models like this and I certainly wouldn't recommend trying to generate any code from the model. I think you're just wasting your time. We can still be quite agile with this and the model will still need to evolve quite quickly, perhaps. 
If you're dragging some horrible diagram behind you while you're doing that, you're just costing money. I wouldn't overestimate how long it takes to do some of this stuff, and I would go through cycles like this that last a few days. I'm not talking about people doing weeks and weeks and weeks of design here. Small bounded contexts, small aggregates, you can work on parts of the system, you can divide the work up. You can delegate to people. You don't need to have some big design effort at the front for this to be a useful technique. For the video, the question was: how should I find the borders between microservices? I'm going to mention microservices in a moment because I'm going to talk about software architecture now. Can I come back to that shortly? So, let's talk a little bit about software architecture. Nothing we've seen here requires complicated frameworks or proprietary infrastructure or fancy architectures or enterprise anything. A lot of DDD is associated with the kind of heavyweight enterprise computing. It doesn't need to be like that at all. I find it very interesting the way we approach building things in the Python community compared to the way things are built in other communities in which I've also been involved. I think to some extent we tend to build models in spite of some of the frameworks we have to work with. Like I said, DDD is a choice, and sometimes using those frameworks is the right thing to do, but I would also like you to think about the alternatives, and that's really what this talk is about. It's about getting you to think about some of the alternative approaches. So, this picture here is from, I think, not the DDD book itself, but a shortened version of it that came out in 2004. If you look at what this architecture is about, this layered architecture, we have a user interface, which these days tends to live in the browser. We have an application, we have a domain model and we have some infrastructure, databases and actual computers, I guess. 
And then you match that up against, for example, a Pyramid web app, where we have some JavaScript front-end in the browser, we have Pyramid, and we've got your model, which is maybe implemented in terms of something like SQLAlchemy, or a Django app, where we have the browser and the model and we use the Django ORM. So, more than 10 years... Well, probably around 10 years ago now in the DDD community, everyone was crazy about using object-relational mapping technologies to build their models. I was at a conference in London about five years ago and Eric Evans, the man behind DDD, came out with this wonderful phrase which I just think is fantastic. The object-relational mapper takes two brilliant ideas, object-oriented programming and the relational model, and incapacitates them both. And I think this is very true. You end up making such horrible compromises for your object model and your relational model in order to get them to work nicely together. So, okay, this is fine. I can stand up here and rant about the problems of using object-relational mappers or modelling frameworks, but what's the alternative? Well, people have been talking about the alternatives for a very long time. Bob Martin talks about the clean architecture. Notice in the middle of this he has entities and around that use cases. You can think about controllers. And then look what's in the outside blue ring there. We have the UI on the outside, which you would expect, but we also have the database on the outside, which you maybe don't expect. We tend to think of the database as kind of being in the middle, underneath everything else. They've moved the database to the outside. Alistair Cockburn, another signatory of the Agile Manifesto, proposed the hexagonal architecture. He talked about ports and adapters. The application is in the centre. But again, look at where the database is. It's on the outside. My favourite presentation of this is Jeffrey Palermo's onion architecture. Again, domain model in the centre. 
Then we have domain services, application services, and the user interface on the outside. But again, infrastructure and the database are also on the outside. This approach is called externalising the infrastructure. It places the model at the centre. Your application is not about databases. It's about some problem, some domain problem you are solving. That should be the most important thing in your software. The database is just something you need to do that. It's just a tool. The alternative to expressing a model in terms of somebody else's framework, over which you probably have little or no control, is just to use plain old Python objects. Just write some Python code and own all of it. Don't become beholden to somebody else's technology. If your project is successful, it's going to be around for a long time. It's probably going to be around for longer than many of the technologies you might choose to use. So as a software architect, you should think about that. It's a risk. So mitigate the risk and don't depend on other people's frameworks where you don't need to. We can have a pure Python domain model, and we can prefer not to build that domain model in terms of persistence frameworks, because persistence isn't a domain concept. Go and talk to your users about persistence. They won't even know what it means. It doesn't feature in your ubiquitous language. What on earth is it doing in your domain model?
So I've talked about bounded contexts and a bias towards separate models. Let me just mention how to answer the microservices question. In terms of microservices, I would draw it like this, based on Jeffrey Palermo's diagram from a decade ago, and say each of these microservices represents a bounded context. It is self-contained. It has its own model. It doesn't care about the other models. It communicates with other bounded contexts using messages. So they can be written using different technologies. They can be written using the same technology.
But each one is optimised for a different part of the greater domain. Does that make sense? Does that answer your question? Good. I really am going to get through 100 slides in 60 minutes.
Okay. So let's think about repositories. I've said that we should consider not building models in terms of persistence frameworks. That doesn't mean you shouldn't. Think about your context. There are a very large number of use cases where building a CRUD app using your favourite object-relational mapper is exactly the right thing to do. And I'm not railing against that. I'm just trying to open your eyes to some of the alternatives here. But ultimately, we do need to store the data, right? It's not enough just to hope that our computer is turned on and never gets turned off. So we need to have repositories, indicated by the red circles here. A repository is somewhere where we can put an aggregate and go back and get it later. How do we know which aggregate we want? Well, in the simplest case it might be just by its ID. Maybe it's referred to by another aggregate. We have these ID references between aggregates, so one aggregate can go to a repository for another aggregate and retrieve it by ID. Or it might be some more complex domain-specific query. So repositories store aggregates and they retrieve aggregates. We usually have one repository per aggregate type. They are an abstraction over the persistence mechanism, and they are very architecturally significant. So generally we want to be able to take an aggregate that we've instantiated and put it into a repository, get hold of one based on some criteria, and persist changes to some store, whatever that store is. And possibly also remove aggregates from a repository if they're no longer required. It's very tempting, at this point, to get involved in database transactions and things like that. You should resist that temptation.
Transactions are something that belongs in the application layer, not in the domain model. The domain model shouldn't be concerned with database transactions. Why? Well, go and ask your users whether database transactions are a thing in their domain and how they talk about that in their domain language, and they'll have no idea what you're talking about. So it doesn't belong in the model.
There are many ways of building repositories. We can have what are called collection-oriented repositories. In the Python sense, you can think of a repository that behaves like a dictionary: imagine building a class which has an interface that looks like a Python mapping, a dictionary, and we can go and fetch things by ID. How it actually pulls those from storage is an implementation detail. So it's certainly possible to make dictionary-like repository interfaces. Or we can have what's called a persistence-oriented design, where we put an aggregate in, we remove an aggregate, and there's a much more explicit notion of saving. There are other repository types as well. I'm not going to talk about event sourcing today, but that is certainly an option here. Probably too popular an option. So collection-oriented repositories are very easy to use, but they can be quite complex to implement, because you're trying to put a dictionary abstraction on top of some machinery that might be quite complex. And they're a good fit if you're using something underneath like SQLAlchemy, which can be an intrusive dependency. Persistence-oriented repositories are much simpler to implement. They're a good fit with NoSQL stores like MongoDB, but they do require some diligence on behalf of the application programmer, because you have to remember to actually save the stuff into the repository. And as I mentioned, there are other options. So generally, we will have an abstract repository inside the domain model. Why?
Because the different aggregates need to be able to get hold of each other somehow. So we need to have a repository interface in the domain model, but we defer the implementation of that repository to outside, in the infrastructure layer. And because of that relationship, that dependency inversion, repositories need to be instantiated even further out, in the application layer. There are some pictures coming up that make this more evident. So you need to consider testability. In fact, this is very good for testability, because you can build fake or very simple in-memory repositories for testing. That's testing your domain model, although not testing your larger application.
Some advice here. It's very tempting to... Thank you. It's very tempting to build general-purpose queries, something like what I have at the top right here: repo, employees with, and then some lambda that says age greater than 60. That seems really nice to use as a programmer. But you're not really expressing the domain language here, and also it's very difficult to implement that in terms of different repository implementations. How do you convert that lambda to a SQL query? We have technology to do that, but it's probably not a technology you necessarily want to get involved in. It's much better to take the opportunity to express the domain language and have a repository query method which, on the employees repository, is something like eligible for early retirement, because that's the actual question that's being asked. Okay?
So in terms of how I might organise code for something like this, I would generally have multiple packages. I'd have an application package, which is the outer layer, if you like; an infrastructure package, which contains concrete repositories; and then a package per bounded context, which actually contains the domain model. So they have this kind of dependency relationship.
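To make the repository ideas above a little more concrete, here is a minimal sketch. It is not the code from the slides or the course. Every name in it (`Employee`, `EmployeeRepository`, `InMemoryEmployeeRepository`, `eligible_for_early_retirement`, the age threshold of 60) is an assumption made for illustration. It shows an abstract repository that would live in the domain package, a simple in-memory implementation of the kind you might use in tests, and a query method named in the domain language rather than taking a lambda.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from uuid import UUID, uuid4


# --- domain package (all names here are hypothetical) ---

@dataclass
class Employee:
    id: UUID
    name: str
    age: int


class EmployeeRepository(ABC):
    """Abstract repository: the interface lives in the domain model."""

    @abstractmethod
    def put(self, employee: Employee) -> None:
        """Store an employee aggregate."""

    @abstractmethod
    def employee_with_id(self, employee_id: UUID) -> Employee:
        """Retrieve an employee aggregate by its ID."""

    @abstractmethod
    def eligible_for_early_retirement(self) -> list:
        """A query expressed in the domain language, not as a lambda."""


# --- infrastructure package: a concrete, in-memory implementation ---

class InMemoryEmployeeRepository(EmployeeRepository):
    EARLY_RETIREMENT_AGE = 60  # assumed rule, taken from the "age > 60" example

    def __init__(self):
        self._employees = {}

    def put(self, employee):
        self._employees[employee.id] = employee

    def employee_with_id(self, employee_id):
        return self._employees[employee_id]

    def eligible_for_early_retirement(self):
        # A SQL-backed repository could answer the same question with a
        # WHERE clause; callers would not need to change.
        return [e for e in self._employees.values()
                if e.age > self.EARLY_RETIREMENT_AGE]


# --- application package: the concrete repository is instantiated out here ---

repo = InMemoryEmployeeRepository()
alice = Employee(uuid4(), "Alice", 62)
bob = Employee(uuid4(), "Bob", 35)
repo.put(alice)
repo.put(bob)
names = [e.name for e in repo.eligible_for_early_retirement()]
```

The domain code only ever sees `EmployeeRepository`, so swapping the in-memory version for one backed by a database should not touch the model.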
So the application depends on the infrastructure, and the infrastructure depends on the domain, not the domain depending on the infrastructure. This is the key dependency inversion in the clean architecture and the onion architecture. So if you're using something like SQLAlchemy, this is probably the exact opposite of what you're doing today. Here's another picture, superimposed on the onion.
So I have just under 10 minutes left, so let's crack on and talk briefly about domain events. We've just done like an entire day of training, without the exercises. So now we're on to day two in the last five minutes. Let's see how we get on. Domain events and modelling time. So what is a domain event? It's capturing the memory of something interesting which happens in the domain. Now, interesting means interesting to the domain people, not interesting to software people. So if your disk fills up, that's not a domain event. It's something your logging needs to know about, but it's not a domain event. If a new customer registers with your system, that's a domain event. So generally speaking, aggregates will emit events. This allows us to model time explicitly. They're significant to the business. They're first-class members of the domain model. It's just something that was really missed back in 2004 at the beginnings of DDD, but DDD has become very event-centric recently. Of course, it's the foundation of event sourcing and projections and process managers, which I'm not going to talk about today, but you have to have domain events for these things to work. They allow us to establish causality between things. Of course, if you have them, they are wonderful for logging, although that's not their primary purpose. Given the time, I'm going to go to this one. You should name your events in the past tense: money deposited, not deposit money. They should be immutable value objects themselves. Why should they be immutable?
You don't want people going around, screwing around and changing attributes of something that's just happened. It needs to be an immutable record. It's good practice to get a timestamp in there somehow, monotonic, hopefully, and to include the aggregate root ID of the aggregate that created the event, and any information required to navigate to child entities. I generally define my event classes as nested classes within the entity, the root entity. You may consider organising them in the same way we do with exception classes, so that you can listen to particular events based upon the inheritance hierarchy. Prefer a publish-subscribe messaging system rather than subject-observer. Something like PyPubSub within your bounded context of the application is fine. You may want to republish those messages to a message bus for other bounded contexts to listen to and act upon, but within your single bounded context, it's probably going to be quite a simple Python application. You can just use very simple technologies for that. I'm not going to talk about that because I'm going to be out of time shortly.
I do want to cover this though: descriptive events versus prescriptive events. It's very tempting to write methods which modify the model in some way and then emit an event which describes what's just happened. That's the kind of natural way to write things, and you probably do that a lot. The problem with that is that what you actually did and what you just published don't necessarily match. Those things can get out of sync. One of them gets changed, you forget to update the other one, and then your events aren't a reflection of what actually happened. A much better approach in your command methods which mutate the model is to describe the mutation you would like to make in an event and then apply that event as a way of mutating your model. Then those things cannot diverge; they have to be the same. Probably the last bit of code here, descriptive events on the left.
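The slide itself is not captured in this transcript. As a rough sketch of the two styles, and not the code from the slide, something like the following might serve. The names `NameChanged`, `DescriptiveCustomer`, `PrescriptiveCustomer`, `_when_name_changed` and the list-backed `publish` function are all assumptions for illustration; a real system would publish through a publish-subscribe library rather than a list. The event is a frozen dataclass, so it is immutable, named in the past tense, and carries the aggregate ID and a timestamp.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import UUID, uuid4

published = []  # hypothetical stand-in for a publish-subscribe system


def publish(event):
    published.append(event)


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned
class NameChanged:       # past tense: something that has happened
    aggregate_id: UUID
    name: str
    # Wall-clock time; note this is not guaranteed to be monotonic.
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


class DescriptiveCustomer:
    """Descriptive style: mutate, then separately describe the mutation."""

    def __init__(self, customer_id, name):
        self._id = customer_id
        self._name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value                     # what actually happened...
        publish(NameChanged(self._id, value))  # ...and what we say happened


class PrescriptiveCustomer:
    """Prescriptive style: the event itself drives the mutation."""

    def __init__(self, customer_id, name):
        self._id = customer_id
        self._name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        event = NameChanged(self._id, value)
        self._apply(event)  # the only place the state changes
        publish(event)

    def _apply(self, event):
        # Dispatch on the event type to the function that does the work.
        self._handlers[type(event)](self, event)

    def _when_name_changed(self, event):
        self._name = event.name

    _handlers = {NameChanged: _when_name_changed}


customer = PrescriptiveCustomer(uuid4(), "Ann")
customer.name = "Anna"
```

In the prescriptive version the new state is read from the event, so the published event and the change to the model cannot drift apart.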
You can see I mutate the underscore name attribute to the new value and then I publish an event. In the alternative approach, in the name setter here, I create the event object and then I apply that event to self. That is what actually causes the mutation to happen, and that dispatches to the function at the bottom of the screen which actually does the work.
Just to summarise, in an hour we've actually covered quite a big chunk of everything in the DDD book, and it will take you a lot more than an hour to read it. You've had a bit of a flavour of what's in there. We've covered domain events, entities, value objects, aggregates, factories, repositories, the layered architecture. We haven't covered services, and there are lots of other things which sit around DDD that aren't on this diagram, like process managers: how do we model long-running processes that affect the model, where a user may come and go during the course of that process? We've done pretty well in an hour, I think.
I'm just going to close on some references for you to go and look at other things if your interest has been piqued by what I've said today. There is the blue book. Remember what I said earlier? It's not the best starting point. A very lightweight starting point is Vaughn Vernon's book Domain-Driven Design Distilled. That's about a centimetre thick. The blue book is about five or six centimetres thick. If you choose to read only one book, read this one: Implementing Domain-Driven Design by Vaughn Vernon. The tweet there has a suggested approach to reading this book, and it's one to four. Then go away and try to build something, and come back and use the rest of the book as a reference. I'm going to promote a few wonderful conferences in this part of the world and further afield. There's an amazing conference in Amsterdam called DDD Europe. It's just incredible. If you are interested in implementing DDD or understanding DDD in any programming language, you should really go to that.
I hope to be presenting some Python stuff there next year. There's a much smaller conference in London called DDD Exchange, organised by Skills Matter. That's also good. If there's anybody here from further afield, from the US, there's a new conference called Explore DDD, which is in Denver. This is the second year it's running. I have not been, but I've heard wonderful things about it. Of course, remember that our training course is available commercially, but also on a much less commercial basis at Pycon UK in September. If people want to work through much of what they've seen today, but with a lot of exercises and coding, and in a much more practical and less theoretical presentation than I've had opportunity to give you today, it would be wonderful to see you there. Thank you very much.
Thank you for your wonderful talk. I believe we have time for two quick questions or a big one.
Thanks for the talk. It was really good. I have a question about how that relates to REST APIs, where you want to give a user as much control as possible over how the user queries entities in a domain model, and it relates to your example of converting a lambda into a SQL query or something.
Right.
So in REST APIs, your goal is to give an API consumer as much power as you have, but then you would end up with your repositories having hundreds of options for how they query the data.
I don't have a really good answer to that. There's obviously a trade-off involved there in terms of how much power you give the client of the system and your opportunity to express the ubiquitous language in the system. So if you need that kind of ultimate flexibility on the query side, then maybe this is not the right approach. I mean, the most flexible query interface to a system like that is SQL on a database, right? So, you know, one view is that you're describing a database, right? I have a bucket full of data, and I want ultimately flexible querying on it, okay?
So, you know, there is clearly a trade-off to be made there. But you're right, and you're right about the lambda example. I'm not saying the lambda example cannot be made to work. It clearly can be made to work, and I've done it, because I've made that mistake and then had to live with it. It just requires more work in the concrete repository to actually understand what that query means. Yeah?
Yes. You say that the domain doesn't use the infrastructure layer. It's the opposite. But the domain needs to instantiate entities.
Right, but when an entity is instantiated, it's just created in memory. It's just an object. It's not until that entity is put into a repository that you actually need to interact with the infrastructure layer. The exception to that is entity ID generation. You may want to use an ID that comes from something like a database, in which case you may need to talk to a repository in order to get hold of that ID. That's one of the reasons I've developed a strong preference for using UUIDs for my entity IDs, because I can just do that in software without any dependency on the infrastructure.
If you are... For example, in the column you want to... In the Kanban example, you want to retrieve the tickets. Probably the column will have a method that is tickets in that column, and this belongs to the domain.
They would be in a... The tickets are another aggregate type, and the column may be able to go to the abstract ticket repository to get hold of those things. That then dispatches onto code in the infrastructure layer, which is responsible for creating that, for filling out the ticket. It depends whether you are creating a new ticket for the very first time, in which case you talk to the factory, or whether you're getting hold of a ticket that maybe was created six years ago but you're pulling out of storage, in which case you go to the repository. Factories are for entity creation, once.
Sadly, we don't have more time for questions, but I'm sure that Robert will speak with you. Let's thank Robert again, please.