So firstly, thank you for coming to the session. Either you guys are really tired of Venkat, who's doing a talk on the other side, or Henryk is so boring. But anyway, thanks for coming along. Is anyone here who was at the talk that I gave at the management conference? No? OK. Because some of the examples I'm going to put up were discussed at that talk, but that was more from the perspective of how managers can deal with this. This is going to be more at a technical level. We're going to fly at different altitudes, some quite high up, some quite low down, and we'll see where this takes us. So who am I? I'm Aslam Khan. I'm from South Africa. I live in Cape Town, and I've been paid to write software for 20-plus years now. Not always successful in doing it, but fortunately they still gave me a paycheck. For the past eight to 10 years, I've been an independent consultant, basically. My blog is, this is just fluctuating. This is just terrible. My blog is at f3yourmind.net, as in free your mind. If you have kids that are struggling to sleep, you can point them to that. It helps. And if you want to get hold of me on Twitter, that's where you'll get hold of me. Is it fine? Just carry on? OK. So I'm going to walk you through a few case studies. These are companies that I've worked with. I've blurred a lot of the images, et cetera. The detail is irrelevant, actually. These are things that I've observed over many years, and I've been trying to see if there is a common thread across all of them. It's software that has spanned many years, regardless of what generation we were actually developing the software in. So some of it is client-server. Some of it is desktop apps. Some of it is web apps. Some of it has now started going into this cloud stuff and things like that.
So I'm just trying to look and see what the common patterns are, or patterns is the wrong word. Common things that I've just observed. So the first one is, this is freaky. The first one is what I've called the Emperor's new clothes. This is a customer that I've worked with for many, many years, less so recently. When I first walked in there a few years ago, I asked someone on the development team, just give me a picture of what your software does. This is a company that produces a product. It's used by multinationals to help them with foreign purchases and supply chain management and customs and excise and all sorts of crazy bureaucratic processes that people must follow just to import and export goods. Very large footprint in South Africa and, to a certain extent, in the US and Europe. And so when I asked them to show me a picture, this is what they gave me, right? It's blurred, so don't squint. It's deliberately blurred. The writing is irrelevant, okay? So I asked them, do the colors mean anything? And they said, yeah, that's version one. We called it the black box, okay? Version two was called the red box. Version three was called the blue box. And the one at the bottom wasn't version four. It was some weird integration thing that they had conjured up. Because what was happening, and we'll discuss why it happened, was that it's great to have different releases and versions, but some of their customers had all the versions, right? They were unable to deprecate an older version for a new one. And so they conjured up this thing, and they called it some integration stuff, and it actually just barely managed to keep the data in sync between them. But they used some really grandiose methods to do that. So how does this happen? How on earth do you end up with three versions of your product sitting at the same customer? Well, that's not the most interesting part for me. Let's just start digging a little bit deeper.
The first version was about 800,000 lines of code. This was all Java development. There were about eight developers involved in the first version. The second version, the red box, was about 650,000 lines of code and about six developers, right? And then the blue box was about 700,000 lines of code and about 20 developers. But here's the kicker. The first version was 65% complete, in functionality for the customers, when they decided to do the next version. 65% complete. And that I had to really eke out of them, right? It was, how complete is this? What percentage? Oh, we don't know. And then eventually, just by comparing it to what the other software was doing, we established it was roughly 65%. The second version, with six devs and about 20% less code, was 80% complete. And the one with 20 developers, right? This one was 50% complete. 700,000 lines of code already, and it hadn't yet even reached the functionality of the previous version. So when management asked, what should we do, I said, stop version three. I mean, what would you do? Every version from scratch, right? So my advice, immediately, just looking at this, was stop version three and just continue doing version two. What's the big deal? There's no magic around this. It's a simple decision. And all of that was over seven years of development. So who's working on a code base that's over five years old? Right, is it enterprise software, is it product? Enterprise software, less than five years or more than five years? Enterprise software, software you're writing in, like, your insurance company, where you're writing software for the business itself, generally lives much longer. Seven years, and this team was doing XP. And they were pretty decent about it. They had tests, they had reasonable coverage, they were doing the right things in many respects, except this is what they were doing. Part of it is what I call the archaeology of Java.
The flavor at that time, for the black box, was EJB 1, right? Which they regretted, and Struts and JDBC and stuff like that. Then they moved on, and guess what happened next: it was EJB 2 and Swing. Then they said, this JDBC stuff is an absolute nightmare, so let's just go with an object database. We don't want to do this object-relational mapping stuff. Let's just get rid of that. So that's where the reduction in code size came from. They got rid of the object-relational stuff, right? When I started analyzing the code and asked, what kind of code do we have, all the SQL had shrunk, and that was the reduction in code, okay? And then these guys, come on, at that time it was EJB 3, it was Wicket, and even though Hibernate and EJB 3 were pretty closely aligned, they wanted Hibernate. For this thing in the middle, they started off with the flavor of the time, and what people wanted then was these enterprise service buses, some kind of thing to do orchestration and all sorts of stuff. So they chose ServiceMix, which was open source. For some weird reason, they were convinced that it doesn't work, so they chose OpenESB. They didn't like that either, for whatever reason. It doesn't work for us, okay? Works for many people, doesn't work for them. Then they went with Mule. They most probably have something else now, right? But can you see what the problem is here? There's no change to the design. It's just changing tech, and the rewrites are predicated on what was cool at the time, without any significant improvement on what they were actually attempting. So they were just replacing the technology. And just the act of replacing the technology has incurred the kind of code bloat that we've seen, right? So I dug a little bit more into this enterprise service bus stuff that they were doing. And it gets worse. Basically, they were just reaching straight into the database of each version. They didn't bother with anything else, right?
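To make that last point concrete, here is a minimal sketch of the difference between an integration that reaches straight into another version's storage and one that only depends on something that version publishes. Everything in it is an assumption for illustration: the names `RedBox`, `OrderFeed`, `orderRows` and the purchase-order ids are hypothetical, and an in-memory map stands in for the real database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustration only: two ways for an integration layer to read another version's data. */
final class IntegrationSketch {

    /** Hypothetical published interface: the only thing outsiders are meant to rely on. */
    interface OrderFeed {
        List<String> orderIdsChangedSince(long version);
    }

    /** Stand-in for one product version and its storage. */
    static final class RedBox implements OrderFeed {
        // Pretend this is a database table. Its layout is an internal detail.
        final Map<String, Long> orderRows = new LinkedHashMap<>();

        void save(String orderId, long version) {
            orderRows.put(orderId, version);
        }

        @Override
        public List<String> orderIdsChangedSince(long version) {
            List<String> changed = new ArrayList<>();
            for (Map.Entry<String, Long> row : orderRows.entrySet()) {
                if (row.getValue() > version) {
                    changed.add(row.getKey());
                }
            }
            return changed;
        }
    }

    /** Reaches past the boundary: knows the row layout, so it breaks when the layout changes. */
    static List<String> syncByReachingIn(RedBox redBox, long since) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> row : redBox.orderRows.entrySet()) {
            if (row.getValue() > since) {
                changed.add(row.getKey());
            }
        }
        return changed;
    }

    /** Stays at the boundary: depends only on the published interface. */
    static List<String> syncThroughInterface(OrderFeed feed, long since) {
        return feed.orderIdsChangedSince(since);
    }

    public static void main(String[] args) {
        RedBox redBox = new RedBox();
        redBox.save("PO-1", 1L);
        redBox.save("PO-2", 5L);
        System.out.println(syncByReachingIn(redBox, 2L));
        System.out.println(syncThroughInterface(redBox, 2L));
    }
}
```

Both calls return the same data today. The difference is what each one is coupled to: the first breaks as soon as the storage layout changes, the second only when the published interface changes.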
One guy comes up to me and says, do you know anything about transaction managers? Right, Java transaction managers? I said, yeah, a little bit. He says, well, I'm trying to find one that works. I said, what do you mean? He says, well, you know, this is our situation, right? We've put this JMS queue at the top, and I'm just trying to get these XA, extended architecture, transactions to work across these things. And I said, but we know that that kind of thing really doesn't work, okay? You can't roll back on a JMS queue. I mean, a lot of the vendors will tell you that they're close to it, but it's not gonna happen, right? So he says, well, I've tried the JBoss transaction manager, and then he tried a few others. And I said, you know what, eventually you're gonna have to buy one, because you've exhausted all the open source alternatives, right? And if you just think about it, maybe that's not the problem. Maybe what you're trying to do is the problem. So perhaps there's a better way to get the data and keep it synchronized, as opposed to, you know, a modern-day two-phase commit. So I dug more, same guys, dug more into the code base, right? And this is the one that I call the disappointed pattern. There was a part of the code base where they wanted to model the relationships between customers, suppliers, and any other parties that are involved in a trade across continents, right? That's what they wanted. And there are many parties involved: logistics, shipping, banks, the actual supplier, the retailers. Anybody in between that had an interest in this piece of cargo had to be modeled. And there are lots of different rules around it. So this is what they had. In the red box, right, that was version two, it looks like that. In version three, which was incomplete, it looked like that. And if you put them side by side, what do you see? It's similar, very similar.
This one is an incomplete version of that one, right? There's no difference, actually. So I asked them, what is this based on? I mean, what are you attempting to do? And they said, well, you know, we're actually implementing Martin Fowler's party pattern, which is an analysis pattern. You can read Martin Fowler's Analysis Patterns book. The party pattern exists. And amazingly, this module was called the party module, right? So they decided to name the module after the pattern. But then I said, this is what I understand that analysis pattern to look like from a UML perspective. It's just three entities with some kind of composition going on. So how did you get from this to that, from an implementation perspective? They said, well, you see here, there's inheritance. And if you look carefully here, there's lots of inheritance, okay? I said, yeah, but you know, this is organization and person. They're very specific roles. They said, yeah, well, we have lots of organizations. So that's how we implemented it, okay? And they were gonna continue doing that. But anyway, if Martin Fowler was in the room, he'd be disappointed, you know? The pattern itself, if it had a human form, would be crying. The other issue there is that it's an analysis pattern, meant for analysis of a domain, not necessarily for design. So here's another one. This is from an enterprise company, an insurance company that I worked with. And I call this one agile weight gain, right? So I walked in and I said, please show me some stuff about what you're doing. And they said, no, we have this really nice layered architecture, right? So ignore the words, right? They said there's some stuff at the bottom which is very infrastructural. It's an enterprise company, but for some reason they thought they would change app servers like every six months. So they wrote some code to protect them from app server change.
But we know that in an enterprise, once you've chosen an app server, it's gonna be about 17 years before you can change it again. I mean, the motivation you'd have to go through would be phenomenal. So this is all infrastructural stuff. Then they had something they called web, persistence, reporting. Some domain stuff, services, all these layers. And ignoring what the layers were and which position they were at, ignoring all of that, hats off that they attempted to get some kind of structure to the architecture, some kind of layering, right? And then I started digging into this a little bit further. At that high altitude, it looks great. The lay of the land is fantastic. And they genuinely believed it was fine until I started digging in. At a namespace level, at a package level, in one small part of it, a tiny part of it, I saw this. So let's ignore the numbers for a minute. But you can see the cyclic dependencies here, right? They're all over, okay? Ignore the names, but check those numbers out. This is a count of the number of dependencies from one package to the other. 95,000. You know how many lines of code you need to create 95,000 dependencies? 43,000, 10,000. This reminds me of the sizes of organizations in India. In South Africa, when we talk about scaling up, we go from 50 to 100. Here, when I ask people how many developers are in the organization, it's, oh, 10,000, 5,000. This feels like that. I think if we took those services companies and told each person to write one line of code, they wouldn't be able to do that. They wouldn't be able to create all those dependencies. You don't have enough people to create those dependencies. So, huge disaster. Just to give you an idea of what's going on, I looked inside that admin package. There was one particular class; they called me in to say, we have a performance problem on this piece of code that is actually responsible for calculating the value of portfolios across all our mining clients. And it just takes long.
So what they did was just schedule it with big gaps in between and tell the users, you'll get yours on Tuesday, and you'll get yours on Thursday, and you'll get yours on Saturday. That was the answer. Eventually they realized they had run out of days and gaps. So come back, let's look at this problem. This little picture here on the side, this one. That's the dependency matrix of what? Of the constructor. The constructor of this class that does the calculation of portfolio value. What's it gonna take to actually instantiate that class? 90 dependencies. They had seen this really nice pattern that says that if you have exploding or telescoping parameters, roll them up into another object. So they rolled them up into an object. That object became really big. Then they said, that's not gonna work, and they just passed in a map of values, keys and values. And they tried to instantiate 90 dependencies. The constructor's call graph goes eight levels deep. By the time the constructor exits, it has gone eight levels deep before it has instantiated itself. This is another dependency matrix. It's their so-called main piece of code, the main method in that class. 4,500 lines, top to bottom. The call graph goes 13 levels deep. One transaction. I'm not kidding you: right at the end of the method, just before the return, is the commit. So you know where the performance problem is, right? I mean, that's insane. You know, it's a piece of code where I was doing this Eclipse thing where you hold down Ctrl and click through stuff, and I didn't realize I was always in the same piece of code. I was in the same class. I was just moving up and down in the same class, clicking through, okay. I mean, this is crazy stuff. Here's another one. A company that builds some hardware and software and stuff like that, vehicle tracking. I mean, by now you're most probably getting bored with this. So it's just, what the hell is going on here, right?
This was an output from JDepend, a static analysis at package level. Not class level, package level, okay. This code is about three years old. They've been doing Scrum for about two years. Very successful in the process. Even the previous one, a little bit less on the agile side, had about one year of Scrum. But when I started asking them what this is, and if you look very carefully, there's one, two, three things here, and one, two, three things on that side. So I asked them, what is this doing? It's Java servlets on that side. They take stuff coming off the query string or the body. They make POJOs out of it, plain old Java objects. They do some transformation to support a web service contract. And that's the web service that they call on the other side, which happened to be something hosted on a .NET platform. This is an integration tool. But look at the train wreck in the middle. These are the questions I ask. How does this happen? It's unlikely that any of these people sat down one day and said, let's create the worst application. Let's try to design this in the worst possible way. It's not like they went home and at dinner their kids asked them, what did you do today? We just bastardized this design. It felt so good. What do you do, dad? I just messed up some stuff, man. Really? You should try it sometime. It's a great feeling. This thing doesn't happen overnight. You see this weight gain? It's incremental change. It's genuinely incremental. It did not happen yesterday with the dosa. Okay, it did not. It's slow. It sneaks up on you. It's insidious. Unless you're taking care of it, you're going to put on weight. Code is the same. All of those case studies exhibit this, right? But then my wife tells me, you know what? Like Venkat was saying, I have the same problem. She just hasn't bought the treadmill. You know, you should be doing this exercise, look at you. And in my eyes, I look fantastic. I seriously think I'm fantastic, you know?
Until I wake up in the morning and they look at me, and my son says, you're going to work looking like that? What do you mean? I thought I was fine, you know? So anyway, all of these people were completely unaware that over that period of time, the code, the design, had completely sprawled out of control. Completely sprawled out of control. Why? Because as you're working as a developer, you're generally working with such a localized, blinkered view of this piece of code. You just do enough to get it done, ship it out, move on, that type of stuff. And what happens is that the entire integrity of everything gets destroyed. So I started thinking about why this happens. How does it happen? What is it that we can do to make it better? And then I started reading back on some older stuff. And there are some questions that defy time. Here's one. It doesn't matter when we are in software development, I genuinely believe this is a question that we'll ask forever. How do we break up this large system into smaller modules? How do you know whether you should slice it up like that or slice it up like that? How do you know? What informs you? In the cases I've shown, it's just been completely accidental or coincidental. But that's one question. Another one: which modules are subordinate to others? How do you know that blue must depend on gray and gray depends on green, or gray depends on blue and green? How do you know? How do you design? How do you make that choice? Anyone? Say Lala? Business requirements. Business requirements. Use of patterns. Interesting one. Yes? No refactoring. The virus. Yeah? If you don't know this virus, the virus is not years, the virus is just the virus for two weeks or maybe two months. So a little bit shorter than years. Yes. Then you don't end up like this. But at the outset, that's a tough question. I find it something that I really struggle with. At the outset, how do I know that this class belongs to this little package?
He's talking about separation of concerns. So maybe this does data access, and I'll put it with the rest of the data access stuff. We'll talk about that just now. So how do you decide what data each module needs? Or what data one module needs from another module? How do you decide that? Here's another one. How do you know what data to pass between them? It's just, oh, well, I've got this class. I know it's got these properties, so we must need all of those things. So we'll just pass all of them. How do you know you need to pass all of it? Do you really need to pass all of it? Those are decisions that we tend to make quite accidentally. That's what I've found. People do it quite accidentally, in the flow of the moment. Let's just try this out. Without care and deliberate thinking about, is this relevant? Is this necessary? Is this appropriate? I'm not talking about over-analysis here. I'm just saying that perhaps we need to actually pause, consider this data structure that's being passed around, and ask, is it appropriate? How many of you have written DTOs which are a direct map of something in the domain to something that the web tier wants? And you've used reflection. Or, in C#, you'll use something like AutoMapper and it'll just copy it over. Struts had this thing that copied properties from one to the other, right? Like a bean copier. The worst invention. So when is a module a good module? When is it a bad module? Those are things that we want to know. When is this decent? When is it not decent? So those are all of the questions. But we can't talk about that if we don't know what a module is. So here's a great definition I came across. It's not some esoteric thing. It's right down at the level at which we work. It's a bounded, contiguous group of statements. That's all it is. It's this piece of code, like that. It's contiguous. It's not disparate. It's not spread out all over the place.
It's in one contiguous chunk. Importantly, it has a single name. Even more importantly, given that it has this name, we can actually refer to it by its name. Now this sounds so trivial, right? But just think about that for a second. A contiguous piece of code, having a name by which we can refer to it. That's a module. It doesn't matter what altitude you're talking about. Whether it's a method, whether it's a class, whether it's a package, whether it's some larger component, some JAR, some DLL, it doesn't matter. Now, this is not the black box from the first case study. This is in general. We say, oh, this is a black box. What do we mean by that? What's the first property of a black box? It's black. It's black, which means that it's opaque. We can't see inside it. Which means that if we can't see inside it, we only have one choice. We have to rely on its interface. So given that we can't see inside this thing, the only way we can use it is by relying on its interface. Bad interface, bad black box. How many times have you taken a piece of code that someone else has written, and you've had to look inside the class implementation in order to use a method that is public? If there's no one here, I'm going to leave right now, because the problems are solved, right? So those classes are not black boxes, even though they've got private, protected and public methods, which means that the interface is insufficient. It's opaque, but it's insufficient. Right, so that's the problem. In order to exploit a black box, a module, we can only do it through its interface. The train wrecks that we saw earlier had interfaces that we could not exploit, which resulted in us going and picking a little bit from there, and a little bit from there, and there, and there, and constructing it here so that we could just get going. Right, the interesting thing is that black boxes don't exist on their own, right?
Here's something, it's called a normal connection. A normal connection between two black boxes starts at the boundary and ends at the boundary. Interface edge to interface edge, right? And this, I love this term, is called a pathological connection. A pathological connection is one where the connection starts inside, beyond the boundary, and/or ends inside the boundary. So if you just think about that quite logically, right? Only black boxes, genuine black boxes, can have normal connections. If you have a pathological connection, it is not a black box. It's as simple as that. If you had to reach deep inside to get something, it's not a black box. You've broken the boundary. You're not exploiting the interface. So it's a really, really simple thing to think about. And all of those systems that I've spoken about have exhibited exactly the same thing. They violated this very, very simple principle. Very simple idea, tough to execute though, right? So in general, what we're basically saying is that good black boxes exhibit a high degree of cohesion. That's what we're talking about, okay? I think cohesion is the forgotten twin of coupling. When I talk to teams, they all wish for loose coupling. They all say, but this is so loosely coupled, et cetera. But you ask them to show you the cohesive parts, and they can't describe which parts are highly cohesive. It's the forgotten twin, right? That's the way I think about it, because they work hand in hand. You know, high cohesion will automatically give you low coupling. Low cohesion will give you high coupling. That's just the way they work. There's an interplay between those two, right? Whoops. There's a fantastic quote, which I'll read out to you. Oh, where am I going? Go back. Right. Go back. Done that. Okay. Right. So here's this thing, and I'll tell you who wrote it in a while. It says both coupling and cohesion are powerful tools in the design of modular structures.
But of the two, cohesion emerges from extensive practice, real-world practice, as more important. If you think about cohesion, the advice here is that through practice, people have realized that if you work towards cohesion, you'll get the coupling that you want. You'll end up with a better structured design. So then it leads us to all these things that are connected together. It forms something called an architecture. So what is an architecture? It's a result. It's whatever is produced when you make significant design decisions, in the large and in the small, right? It's reflected in code. In the small, it could be methods, classes, packages. In the large, it could be components, systems, third-party things. The thing about software design is that it's so beautifully fractal in nature. Things you see in the small, you see in the large. These principles will transcend all of those layers. At different altitudes, you'll see the same thing over and over again. It's like natural beauty. But only if you make it so. Here's how Kent Beck describes that, and it was actually quite recently that he wrote something along these lines. It's when parts have beneficial relationships. And he goes quite extreme about it. He says, if this line of code is next to this line of code, and they have a beneficial relationship, then I don't necessarily want to do extract method and pull it out, because it's more important to keep it in line. They have a beneficial relationship when they're close together. As opposed to blindly just taking out your refactoring tool and saying extract method, extract method, extract field, et cetera. It's about the beneficial relationships of parts. That's what we're after. So when we do these types of things, we're after beneficial relationships, which means that modularity is more than just our obsession with reusability. It's about how to establish these beneficial relationships between the parts.
If we have that, as opposed to an obsession with reusability, I think we'll end up with designs that exhibit a higher degree of beneficial relationships, as opposed to trying to see where we can reuse things. Reusability will elevate itself through different levels of abstraction out of that. So let's look at these guys, right? We could contest here that this is a black box, because we've got three on that side, three on that side, known interfaces. We can genuinely contest that, okay? But inside, some poor guy has got to maintain this code. Someone really has got to maintain this, okay? And if you look at it as well, the number of lines emanating out of that side and the number of lines entering on this side will really tell you that it's a creaking design, even if you treat it as a black box. I call this object-oriented goto. That's what I call it. There's no structure. Everything can call everything else. It's goto. We might as well just have goto that class, goto this class. It's the same thing. It's no different. So the question again is whether to slice it up like that, or to slice it up like that, okay? Here's the bad news. Some guys wrote that it is all but impossible to simplify significantly the structure of an existing system through after-the-fact modularization. That's depressing. That is so depressing. It's like going to your manager and saying, you know this mess we've made, forget it, there's no hope, we're just gonna carry on. We have to live with this mess. That's what it's basically saying. I have a slightly different take on this, right? I say it's not impossible, but it's gonna cost you a whack, regardless of which way you slice and dice it. It's gonna cost you more to correct this than what it cost you to write it. Orders of magnitude more, in my experience. If it cost you a million dollars to write, maybe 10 million to fix it over time. And it'll take you time to do that.
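One rough, mechanical way to see whether everything can call everything else is to look for cycles in the package dependency graph. What follows is a minimal sketch of that check using a depth-first search. It is an illustration only, not the analysis tool used on these code bases, and the package names `web`, `domain` and `persistence` and the class name `PackageCycles` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustration only: finds one cycle in a package dependency graph via depth-first search. */
final class PackageCycles {

    /** Returns one cycle as a list of package names (first == last), or an empty list if none. */
    static List<String> findCycle(Map<String, List<String>> dependsOn) {
        Set<String> done = new HashSet<>();      // packages fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();    // packages on the current search path
        List<String> path = new ArrayList<>();   // the current search path, in order
        for (String start : dependsOn.keySet()) {
            List<String> cycle = visit(start, dependsOn, done, onPath, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        return Collections.emptyList();
    }

    private static List<String> visit(String pkg, Map<String, List<String>> dependsOn,
                                      Set<String> done, Set<String> onPath, List<String> path) {
        if (onPath.contains(pkg)) {
            // We came back to a package already on the path: that stretch of the path is a cycle.
            List<String> cycle = new ArrayList<>(path.subList(path.indexOf(pkg), path.size()));
            cycle.add(pkg);
            return cycle;
        }
        if (done.contains(pkg)) {
            return Collections.emptyList();
        }
        onPath.add(pkg);
        path.add(pkg);
        for (String dep : dependsOn.getOrDefault(pkg, Collections.<String>emptyList())) {
            List<String> cycle = visit(dep, dependsOn, done, onPath, path);
            if (!cycle.isEmpty()) {
                return cycle;
            }
        }
        path.remove(path.size() - 1);
        onPath.remove(pkg);
        done.add(pkg);
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        Map<String, List<String>> tangled = new LinkedHashMap<>();
        tangled.put("web", Arrays.asList("domain"));
        tangled.put("domain", Arrays.asList("persistence"));
        tangled.put("persistence", Arrays.asList("web")); // the layer at the bottom calls back up
        System.out.println(findCycle(tangled));
    }
}
```

A check like this only says that a tangle exists. It does not say which way to slice the system, which is the harder question.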
So those quotes come from this book, Structured Design. 1963 to 1965 was the time period in which they gathered material out of practice to write that book, right? It's by Yourdon and Constantine. At some anniversary event for that book, Kent Beck pulled it out and said, this is the physics of software design. Okay, here's another book. It's been quoted several times, even by myself: The Mythical Man-Month. It feels like it's a book for managers. It's a book for us guys, really. It absolutely is a book for us. There are concepts in there which, if we applied them, would make life a whole lot better. Here's a crazy book, which has probably reached cult status among a certain community: Structure and Interpretation of Computer Programs. It's basically, if you were at Kavelin's talk last night for the dinner he was talking about, less than the regions offered. And he mentioned something called the metacircular evaluator. It's in that book. It's a crazy heavy read. It's something that you should read at least once in your life. But in our generation, these are the books that we've rested on, right? Design Patterns, the Gang of Four book. Who's not heard of that? It's okay. It's not embarrassing. It's fine. I mean, I came across it quite late in my life. There's one sitting behind you: Test-Driven Development: By Example, Kent Beck. Implementation Patterns, Kent Beck. Who's not read Implementation Patterns and is a Java developer? You should read it. It's about 200 pages. It's most probably the best book on how to structure your code at the implementation level. Domain-Driven Design. Owen mentioned it yesterday in his talk. Eric Evans. Who's not read this book? Okay, many people say it's object orientation done right, by example. Heavy book, right, hefty. There's actually a really nice practical book that came after that, done by a friend and colleague of mine called Jimmy Nilsson, from Sweden.
You see, Rebecca, some stuff comes out of Sweden that's known to the rest of the world. And it's called Applying Domain-Driven Design and Patterns, with examples in C#. Clean Code, Uncle Bob's book. You guys have been using that, right? And there's a lot more. So I'm calling that old school and I'm calling this new school. So what? Absolutely, so what? So what if we have old and new? Why don't you just spend time here and ignore the rest? These guys have built upon that, so why should I even bother reading that? There's one simple reason. Even though these have built upon that, these have largely been concentrated in this paradigm called object orientation. The books on that side transcend object orientation. They are principles that actually go through lots of different paradigms. Even Structure and Interpretation of Computer Programs is largely about functional programming. There's a hefty word in the middle which says, given the following ideas, this is how we could construct an object-oriented system. So if you wanna dig deeper, right, if you wanna understand where this came from, so that you don't just try things and use them literally, use them in vain, like the party pattern, then I think you should go back and read some of that. I can tell you that Structured Design has things in there that will probably feel irrelevant, but it's a crazy book as well. The first piece of code they show is seven lines, and they are all blank. This is my code. It's seven lines, all blank. It's the start of my module. Then they stick in one thing that says, call this. So we know that somewhere from here we're gonna call that. Crazy stuff. And this, for me, has been one of the most influential books that I've come across in my life. I read it when I was 19 years old, and it was completely wasted on me. It's called Zen and the Art of Motorcycle Maintenance.
It's a fantastic software read, but I think you need to reach a certain stage in your life before you appreciate it. Which stage that is, and at what age, is up to you. But I'm going to read some quotes out, right? So basically this book is about this guy that goes on a road trip with his motorcycle. So you look where you're going to and where you are, and it never makes sense. But then you look back at where you've been and a pattern seems to emerge. You look backwards and a pattern emerges. And if you project forward from that pattern, then sometimes you can come up with something. Isn't that wonderful? So looking back, you find a pattern, it emerges, and then you can project that forward and it makes life better. So what I'm talking about here is going beyond the books that we've been working with, going a little bit further back and bringing it forward again. Here's another one. I've noticed that people who have never worked with steel have trouble seeing this: that the motorcycle is primarily a mental model. Whatever we produce is a mental model. It doesn't matter what software it is. It's a mental model. It's something that has a coherent mental model. The problem with the case studies is that they've lost the coherency of that mental model. Steel can be any shape you want if you are skilled enough, and any shape but the one you want if you are not. When you look directly at an insane man, all you see is a reflection of your own knowledge that he's insane, which is not to see him. Although motorcycle riding is romantic, motorcycle maintenance is purely classic. Robert Pirsig, who wrote this book, lived in India. Either motorcycles weren't happening then, but there's nothing romantic about motorcycle riding in India. And this one here is one that I really appreciate. Who can really forget the past? What else is there to know? So we've got these fantastic pieces of work. Why do we ignore them? So Fred Brooks does something.
The common thread across all of this is a lack of conceptual integrity. And he contends that. I will contend that conceptual integrity is the most important consideration in system design. It is better to have a system omit certain features than to have one that contains independent, uncoordinated ideas. Basically, if you get a requirement as a developer, and it breaks a coherent mental model that you have of the system, whether you're using a metaphor or not, either your mental model is wrong or the requirement should not be there. So if you get a flood of ideas that break your mental model, it tells you that your original mental model is a mess. It's cheaper then to stop and restart. The guys behind you did not do that. The mental model was fine. They just switched tech. And then later on, you see the mental model destroyed over time as well. So a lot of this stuff can come down to, that's great for new greenfield stuff. What do we do with brownfields? This is a very popular book. Every team that I've seen who's worked with a code base that's over a year old ends up buying this book eventually. It's a great book. It's Michael Feathers' Working Effectively with Legacy Code. It's an easy, practical read, and it's so hard to implement. It is insanely complicated to try to implement those things, because the code bases you work with are so messed up. So I've taken a slightly different approach, which is I do myth-busting before refactoring. Is MythBusters popular here? You guys aware of this Discovery show? MythBusters is where they prove or disprove certain myths. That's what I do. So this is what I do with brownfields. Before I even start refactoring, trying to clean it up, I actually look at the data. This is a case where in the past four weeks I've been working with a customer. It's an ISP hosting company, and this is part of the billing data.
And they wanted to rewrite part of the billing system. So I analyzed the data. I wrote code to do this. The billing process does things that are quite typical: send out an invoice, get paid. Send out an invoice, they didn't pay, you send them a reminder, et cetera. So all I did was trawl through the data and build up the unique sequences, or cases, of steps in the workflow. We did send an invoice out. We did do this. I just created a signature for the workflow, for each single use case. Then I counted up the number of customers that actually fell into that particular workflow, and I created a histogram. So this long pillar here was on-time debit order payments. The next one was on-time credit card payments. And the next one was those who got a first reminder. This bit here was a long tail of approximately 90 unique scenarios. 90, right? There's code for every one of those scenarios. There has to be code for every one of those. Where did it come from? There has to be code that supports that long tail. So let's validate what's going on. It doesn't stop there. Then I said, when did these scenarios occur? How far back did they actually occur? When was the last one? And that long tail on that side, right, all happened anything from four to eight years ago. The long tail. It was just ancient data sitting in the system, and code supporting it for things that don't happen again, right? The top five scenarios were from the last billing run. So, another confirmation: before we even start refactoring, let's look at the revenue, right? For the top scenarios, revenue is very high. For the long tail of 90 scenarios, revenue is very low. As a developer working with this team, this is the first time we've brought this information up to management level. You know what it does for us as agile developers? Management can't argue with you. It's fact. It's indisputable.
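To make the myth-busting step concrete, here is a minimal Python sketch of that kind of once-off analysis. It is not the code from the engagement: the event names, the sample data and the `summarise` function are all made up for illustration, and it assumes one billing case per customer. It builds a workflow signature per customer, then reports how many customers share each signature, when it last occurred and how much revenue it carried.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical billing events: (customer, workflow step, date, amount paid).
events = [
    ("c1", "invoice_sent", date(2012, 5, 1), 0),
    ("c1", "debit_order_paid", date(2012, 5, 3), 100),
    ("c2", "invoice_sent", date(2012, 5, 1), 0),
    ("c2", "debit_order_paid", date(2012, 5, 4), 120),
    ("c3", "invoice_sent", date(2012, 5, 1), 0),
    ("c3", "first_reminder", date(2012, 5, 20), 0),
    ("c3", "card_paid", date(2012, 5, 22), 80),
    ("c4", "invoice_sent", date(2006, 3, 1), 0),
    ("c4", "manual_adjustment", date(2006, 3, 9), 0),
    ("c4", "cheque_paid", date(2006, 4, 2), 15),
]


def summarise(events):
    """Return (signature, customers, last_seen, revenue) rows, most common first."""
    # Group each customer's steps in date order (assumes one case per customer).
    cases = defaultdict(list)
    for customer, step, when, amount in sorted(events, key=lambda e: e[2]):
        cases[customer].append((step, when, amount))

    counts, last_seen, revenue = Counter(), {}, Counter()
    for case in cases.values():
        # The signature is simply the ordered list of steps the case went through.
        signature = " > ".join(step for step, _, _ in case)
        newest = max(when for _, when, _ in case)
        counts[signature] += 1
        last_seen[signature] = max(last_seen.get(signature, newest), newest)
        revenue[signature] += sum(amount for _, _, amount in case)

    return [(sig, n, last_seen[sig], revenue[sig]) for sig, n in counts.most_common()]


for row in summarise(events):
    print(*row, sep=" | ")
```

Dump rows like these into a spreadsheet and chart the counts, and the pillars and the long tail show up straight away.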
This is the reality of your business. These are all the use cases that have sneaked up insidiously over time and affect the integrity of our system. So what do you do? Management realizes that the long tail costs more. They've got more people employed to process the long tail, for the day it happens again, than for the automated stuff right at the front. And we're maintaining this code. And we're doing the next feature, incorporating the long tail. It's crazy. So given this knowledge, you are now in a position where you can bust myths about those crazy cases. Because in that discussion, people came up and said, no, we have these weird cases. But now you can prove when it happened and how frequently it happens. So you can now delete code that has low frequency, et cetera. And then you can start refactoring just like Michael Feathers says to. It's a lot easier when you start by cleaning out the cruft. But just deleting code in itself is a crazy refactoring. Or you're now in a position to say, let's create the smallest possible system on the side and build in some migration. What is fantastic about this is that we had courageous management. Management made a decision, right at C-level executive, to say that we will take the customers that fall into the long tail and work with them to move them over to this other way of working. You can't do this alone. If you drop those use cases without corresponding operational intervention, you're actually heading for a disaster as an organization. So myth-busting is what I like to do before refactoring. And it's simple code. It's once-off code. It doesn't even have to be clean. Write it. Get some pictures out. Those are done with things like Excel. Just spew out values, put them in a spreadsheet, and look at it. It's not complicated stuff. So remember these guys? There were about 15,000 lines of code. It came down to about 500 lines of code.
One developer, pairing with a domain expert, went from development-centric to operational-centric. Yesterday Venkat spoke about sharper tools. We used a sharper tool here. And it was only for this and nothing else. We never used the tool again. One week of myth-busting, one week of writing code, side by side. We ended up basically coming up with the right level of abstraction to describe the problem at hand. And we elected, this was a choice, to write it as a Ruby DSL running on the JVM with JRuby, side by side. This is what some of the code looked like. All of that object-relational mapping code, all of those entities that were being persisted, ended up as code. So I don't have to read through that. But basically, it just describes the structure of a particular organization and the rules associated with it. Notice that we're not chasing the DSL. We're chasing the abstraction. The DSL was an implementation choice. So I know DSLs are cool and have a cool feeling around them these days. But that was the last decision. The first thing that we were after was, how do we model this in an appropriate way? So basically, the advice I give to teams is don't run around with pointy sticks. You're going to poke yourself in the eye. You're going to poke someone else in the eye. A surgeon with a scalpel has a very steady hand: small cuts, very localized. No callbacks in scalpels. We were talking about that yesterday. Surgeons don't do a cut with the scalpel and wait for a callback. So, cohesion in the small. We'll just go through some things to help you get more cohesion. Cohesion in the small, at class level, in the domain-driven design book: this is the concept of aggregates. So here's a customer class, a phone number class. But check this out. I've broken the object graph here, because I don't have a reference to the customer. I've got a reference to the customer ID. I've deliberately broken the object graph here.
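The slide itself is not reproduced here, so what follows is only a rough Python sketch of the shape being described, not the code that was shown. The class and field names (`CustomerId`, `PhoneNumber`, `PhoneCall`, `customers`) are assumptions for illustration. The phone number holds the customer's identity rather than the customer object, and it holds its own calls directly.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CustomerId:
    value: str


@dataclass
class Customer:
    id: CustomerId
    name: str


@dataclass
class PhoneCall:
    dialled: str
    seconds: int


@dataclass
class PhoneNumber:
    number: str
    # Identity only: the object graph deliberately stops here.
    customer_id: CustomerId
    # Hard reference: the calls live inside this cohesive unit.
    calls: list = field(default_factory=list)

    def record_call(self, dialled, seconds):
        self.calls.append(PhoneCall(dialled, seconds))

    def total_seconds(self):
        return sum(call.seconds for call in self.calls)


# A stand-in for a repository: customers are looked up by ID when needed.
customers = {CustomerId("42"): Customer(CustomerId("42"), "Example Customer")}

line = PhoneNumber("021 555 0100", CustomerId("42"))
line.record_call("021 555 0199", 60)
line.record_call("021 555 0123", 30)
print(line.total_seconds(), customers[line.customer_id].name)
```

Because `PhoneNumber` never reaches into `Customer`, you can reason about it, test it and persist it on its own, and fetch the customer only when a use case really needs it.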
But at this one, with the phone calls, I actually have a hard reference to that. It's a distinct association. The consequence of this type of thinking is that as you move around your object graph, you're creating cohesive units around these things. You're breaking up your object graph. You're creating cohesive units. You can reason about those things and actually do sensible things with them, as opposed to having a sprawling object graph that is just calling into everything else. It feels weird to do that the first time. It's immensely satisfying once you actually implement it. Okay, so this comes from this idea of aggregates by Eric Evans in Domain-Driven Design. In the Structured Design book, there's an entire chapter dedicated to cohesion. We were speaking earlier about separation of concerns, et cetera. They come up with seven different levels of cohesion. The first one is coincidental, and that's without thought. It just happened, right? Logical: you put all your data access objects together. It feels right. Temporal: oh, all the startup code goes here, all the shutdown code goes here, et cetera. Procedural: you look at a particular workflow and you say, oh, this particular complicated branch, I'll put that to one side, okay? Communicational, sequential and functional: this is the first time, from a cohesion point of view, that you actually start thinking about the problem. The ones before that are just about the plumbing. Communicational means that it's cohesion where the computational unit operates on the same input or output data. Sequential is that the output data is used as input to the next one. Last night, Kevlin Henney was talking about pipes in the Unix world. That's pipes. And functional, which is a crazy thing the way it's written, is that every processing element is part of the function itself, so it feels very circular. Then they go on to say, well, it feels so circular that basically everything that's not one of the above is functional. So it's completely self-contained.
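As a small illustration of two of those levels, here is a hedged Python sketch. The functions and data are invented for the example, not taken from the Structured Design book. `invoice_report` is communicational: its parts sit together only because they read the same input. `billed_total` is sequential: each step's output is the next step's input, like a Unix pipe.

```python
# Communicational cohesion: both calculations are grouped because they
# operate on the same input data, not because one needs the other.
def invoice_report(invoices):
    total = sum(invoice["amount"] for invoice in invoices)
    overdue = [invoice["id"] for invoice in invoices if invoice["days_late"] > 0]
    return total, overdue


# Sequential cohesion: the output of each step feeds the next one.
def parse(lines):
    return (line.split(",") for line in lines)


def to_amounts(rows):
    return (float(amount) for _, amount in rows)


def total(amounts):
    return sum(amounts)


def billed_total(lines):
    # Reads right to left like a pipeline: parse | to_amounts | total
    return total(to_amounts(parse(lines)))


print(invoice_report([
    {"id": "a", "amount": 10, "days_late": 0},
    {"id": "b", "amount": 20, "days_late": 3},
]))
print(billed_total(["a,10.5", "b,4.5"]))
```

Each of `parse`, `to_amounts` and `total` on its own is closer to functional cohesion: one job, nothing else in it.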
You could think of it as closure of operations of a particular type. So go back to your code and look at this type of thing and just see which of those you're actually exhibiting more of. Right? This stuff is strong cohesion that focuses on the problem, and this stuff is weak cohesion, ignoring the problem, just thinking about plumbing. Bounded context is basically cohesion in the large, right? That's where we're trying to find natural contours throughout a system, and we pull them together. And Evans comes up with quite a few patterns. A shared kernel, which is an agreement to share the heart of something. Customer-supplier, which is negotiation of the contract of the interface. Anti-corruption, where you can't change the interfaces, so you have to transform. Conformist is where there's no choice, we just have to do it. Separate ways, and big ball of mud: how to stop the mud from oozing out. But here's the most important thing about this: it's deeply influenced by collaboration. Now, this is the first time I've seen someone put down some patterns around cohesion, basically, and say how it is affected by the maturity of the team and how much control you have of the code. It's crazy. If you have a highly mature, collaborative team and high control of the code, you can go with certain particular patterns. And if you don't have highly collaborative teams, then there are other patterns that will suit you. I'm going to actually skip a particular part and get to this. So in functional programming, there's a notion of higher-order functions. Regardless of what paradigm you're working in, higher-order functions are some of the glue. There's a guy called John Hughes, who was trying to emphasize the value of modularity in functional programming. And he says modularity means more than modules. Our ability to decompose a problem into parts depends on our ability to recompose them.
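Here is a minimal Python sketch of higher-order functions acting as glue, in the spirit of that argument. The helper names `fold` and `compose` are my own for this example; only `functools.reduce` comes from the standard library. Small parts are written once and then recomposed into new behaviour without being rewritten.

```python
from functools import reduce


def fold(combine, start):
    """Glue: turn a two-argument combiner into a function over a whole list."""
    return lambda items: reduce(combine, items, start)


def compose(*functions):
    """Glue: compose(f, g)(x) == f(g(x)), for any number of functions."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions, lambda x: x)


# Small parts, each defined once.
total = fold(lambda acc, x: acc + x, 0)
product = fold(lambda acc, x: acc * x, 1)


def squares(items):
    return [x * x for x in items]


# Recomposed into something new, with no parts left over.
sum_of_squares = compose(total, squares)

print(total([1, 2, 3]), product([1, 2, 3, 4]), sum_of_squares([1, 2, 3]))
```

The decomposition into `total`, `product` and `squares` is only useful because `fold` and `compose` make it cheap to put them back together.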
You know, it's like buying something, taking it apart, putting it together, and you've got a few parts left over. To assist modular programming, the language must have the appropriate constructs. You are limited by the tools that you have. If you can't do it in that way, don't force it. That's what the guys were doing with those crazy sprawling designs. So choosing the right tools, ones that give you the opportunity to glue things together and slice them apart, et cetera, is absolutely essential. The theory behind it, the principles behind it, go back way into the past. We've packaged them up really conveniently into the present. With the languages and tools that we have, we don't have an excuse not to get it right. It just takes time, right? There's always this unavoidable tension. You're fighting functionality with complexity or simplicity. And sitting in the middle, right, is this. You have the number of modules on one axis and the cost on the other. So it's pretty easy to figure out that as your modules increase, coupling increases, and size, et cetera, comes down. Low cohesion. But look at this thing in the middle, which is the total cost. The sweet spot is somewhere in the middle. It's this delicate balance between low coupling and high cohesion, which means don't aim for one extreme, right? Don't aim for one extreme. There's something sweet in the middle. And that's the lowest cost point. You won't figure it out until later on. So to end off, Constantine and Yourdon again: not having goto-like code does not offer any value if the basic architecture is flawed. So thank you for your time. Thank you for going through this journey with me into the past. And I hope there are one or two little tidbits that you've picked up that you could apply to your code base.