parts of the Agile ecosystem or the engineering practices around agility. I think we've solved a lot of problems in the Agile space around project and process and estimation, but there's still a lot of room for improvement and sophistication in the engineering practices. And this is probably the most advanced of the engineering practices: how do you do emergent design for real on projects? Now, some of you may have come to my talk a couple of days ago on rampant emergence, the long-term effects of this, and these talks are kind of backwards, because this one is basically how to do these techniques and that talk was the implications of them. But Naresh wanted that as part of the management conference and this as part of the technical conference, and that's why they ended up the way they did. So I'll be talking about emergent design this morning. There is an article series I've written under the heading of evolutionary architecture and emergent design on IBM developerWorks. These are free articles; you can go and read them to your heart's content. There are 19 installments. There's a link there that gives you the table of contents of all of those, which is hideous and awful, so I've improved that with technology: there's a bit.ly link you can follow, and that covers the things I talk about here and a few other things in more detail. My agenda for this morning is to first tease out what software design is and do a proper comparison between software engineering and other kinds of engineering. I'm going to talk about the distinction between architecture and design. Then I'm going to talk about things that make emergent design hard, things that make it easier, and then some details about how to take advantage of these things once you've found them.
I'm going to start with the poetry of Donald Rumsfeld, who was famously defense secretary in the US for a while, and he very famously said at one point: there are known unknowns; that is to say, there are things we now know we don't know. But there are also unknown unknowns, things we do not know we don't know. It turns out he was talking about software here. This is the really deadly thing in software. It's not the known unknowns, because you can plan for those: we've got to have security; I don't know the details, but we can do some planning for that. It's the unknown unknowns that always nab you in software, the things you didn't even know you had to plan for up front. This is the thing that really poisoned the idea of big design up front in software. It always fails because of these unknown unknowns that you cannot possibly anticipate. It turns out that this is really just a reflection of a much broader principle, which is that the future is very, very hard to predict, and that's exactly what you're doing. If you embark on a gigantic architecture exercise before you've tried to start solving any part of the problem, you're trying to predict the future in all these myriad details that you've got no chance of getting exactly right, or in some cases even getting all that close. I'm going to talk a lot about software design in this talk, so we should probably have a real definition. I'd really like to answer this question: what is software design? Fortunately, someone has already answered it for us. In the fall of 1992, Jack Reeves published a famous essay in the C++ Journal called What Is Software Design? It is still available and still worth reading; you would believe it was written last week, because it's still very, very accurate in what it talks about. In this essay, he tries to do a proper comparison between traditional engineering and software engineering.
Now, we have a lot of really shallow comparisons between these, about bridge building and buildings and gardening and mountain climbing and all these metaphors. What he wanted to do is a deeper comparison between these two styles of engineering. He reached a really interesting conclusion as part of his essay: that the final goal of any engineering activity is some type of documentation. If you're a civil engineer and you are tasked with building a new bridge over here, what is your deliverable for that work? Is it the bridge? No. It's the plan for how to build a bridge at this location, with these constraints, with this scalability, and you will turn that over to a team of expert bridge builders who will actually pour concrete with rebar and make that bridge a reality. Once the design effort is complete, the design documentation is turned over to the manufacturing team and they manufacture it. Now, they have a little bit of feedback with the original civil engineer, but mostly this is a one-way handoff. But this is one of the places where we see a stark difference between traditional engineering and software engineering, because manufacturing for physical things means you're stamping out atoms, and it turns out that atoms are relatively hard to reconfigure once you've got them stamped in a particular shape. In other words, refactoring physical things is hard, and so there's a really strong desire to get the design exactly right up front before you start stamping it into atoms, because it's really hard to fix after that. But we don't deal with atoms in the software world; we deal with bits. It turns out that bits are very malleable. You can change them any time you want. They're very, very soft rather than hard. But now you have to ask yourself the question: if the final effort of design is documentation, and it gets turned over to a manufacturing team, then who is the manufacturing team in software?
What's the part of the software process that takes something and makes it manifest in the real world, that moves electrons through wires and accesses networks and all those things? In software, our manufacturing process is compilation and deployment. So if that's the case, then what is our design document? It turns out our design document, our blueprint, is the source code, the complete source code for the project, because if you make a change to one line of the source code, you have to re-manufacture the whole thing. But here's a real difference between traditional engineering and software engineering, because in traditional engineering, stamping out atoms and manufacturing things is very, very expensive. In fact, one of the reasons that civil engineering became an engineering discipline was not to build safer bridges. It was because it's too expensive to build a bridge and roll heavy stuff over it to see if it collapses or not, to determine whether it's a good bridge. And so they built up a lot of engineering techniques around mathematics and other sorts of predictive abilities, so they didn't have to build the thing, because it's so expensive to manufacture things. But in software, our manufacturing is cheap. We can make changes to our design any time we want, and it doesn't have that same kind of negative downstream impact that traditional engineering has. And so Jack Reeves says that given that software designs are relatively easy to turn out and essentially free to build, an unsurprising revelation is that software designs tend to be incredibly large and complex. I think software designs are among the most complex things created by humans right now, and they're also very unlike other kinds of engineering artifacts. For example, our tolerances are impossibly small compared to other engineering disciplines. Let me give you an example of that. Let's say that this hotel, this big nice beautiful hotel, let's say that, arbitrarily, it took 200 people a year to build this place.
I don't know what it actually is, but let's just say that, and imagine a piece of software that would take 200 people a year to build. In this building, if the electrical socket on the wall didn't completely cover the hole in the wall, the building won't fall down. But if you make a similarly small mistake in a piece of software, it will collapse and stop working. It will fail. Now, of course, you can spackle in that hole and re-manufacture the whole thing almost instantly, but our tolerances are impossibly small compared to physical things. Aeronautical engineers, when there's a problem with an airplane wing, when a wing falls off an airplane, typically go to where the wing was attached to the airplane and start looking for problems. But for us, we can have something that creates a little instability here that doesn't manifest until 200,000 lines of code further down the line. We don't have the same locality of problem that traditional engineering often has, where the stress point is the part that's broken, because instabilities can be created in other places. We also don't have the same kind of economy of scale that traditional engineers have. The Golden Gate Bridge has on the order of a million rivets in it, and you can bet that people who understand the structural characteristics of bridges can tell you what contribution those million rivets make to the overall structural integrity of the Golden Gate Bridge. They're all consistent, and so you can take that value, times some number, and get reasonable numbers out of it. In software, we may have hundreds or thousands of pieces too, but they're all uniquely handcrafted. We've never figured out a way to create really generic parts that can plug in, and so we have all these unique little things that we've created for specific problems. In the traditional engineering world, they have spent an enormous amount of effort on predictability.
We want to be able to predict the characteristics of these physical things, because it's so expensive to build those physical things just to verify those characteristics. And in fact, in the 1950s and 60s, IBM invested an enormous amount of time and effort to try to come up with a calculus for software. They figured, hey, this is just an engineering discipline like civil engineering; we should be able to apply the exact same kind of rigor to it. And it turns out they're not nearly similar enough. But it turns out we don't need this predictability in software. In fact, chasing it is not a good use of time, because we have an advantage over traditional engineers: our manufacturing process is free. It doesn't cost us anything to manufacture our code. So why don't we just manufacture it and test it under realistic conditions to see if it's going to hold up in the real world? And we can simulate things very easily in software, because everything is soft. And so testing really is the engineering rigor in software. The same kind of rigor you get from statics and dynamics and mathematics in traditional engineering, we get on the opposite end, reactively, with tests. So we don't have to be proactive. We can afford to be reactive, because our manufacturing process is so cheap. Testing is the thing that lets us take this jumble of unique parts and isolate each part from the others and make sure that we understand its characteristics exactly before we incorporate it into all the other parts that are going to make up our piece of software. And so Jack Reeves concludes his essay by saying that software may be cheap to build, but it is incredibly expensive to design, because every part of the software process is really part of the design process. The little preliminary sketches you make on napkins are part of the design process. Use cases and sequence diagrams are part of the design process.
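As a minimal sketch of that idea of testing as the engineering rigor, here's a hypothetical Python example. The part being tested (a vacation-accrual rule and its rate) is entirely invented for illustration; the point is that we isolate one part, pin down its characteristics with tests, and only then incorporate it into the rest of the system.

```python
# Hypothetical example: a small "part" whose characteristics we verify
# with tests before wiring it into anything else. The function name and
# the 5% accrual rate are invented for illustration.

def accrue_vacation(days_worked: int, rate: float = 0.05) -> float:
    """Accrue vacation days as a fraction of days worked."""
    if days_worked < 0:
        raise ValueError("days_worked cannot be negative")
    return days_worked * rate

# Each test isolates one characteristic of the part.
def test_accrual_is_proportional():
    assert accrue_vacation(100) == 5.0

def test_negative_input_is_rejected():
    try:
        accrue_vacation(-1)
        assert False, "expected ValueError"
    except ValueError:
        pass
```

The tests are the reactive equivalent of the civil engineer's statics calculations: instead of predicting how the part behaves, we manufacture it for free and check.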
But they're not enough of the design process to say it's complete, because now you've got to get into the source code. So writing code is part of design. Debugging is part of design, because fixing bugs changes the source code, and that is the design document you're creating. Every single thing in the software world is part of this design space, making it very, very expensive. Meaning that if you spend a lot of time up front trying to come up with a speculative design that doesn't really work under the real-world constraints you find yourself in when you start building your piece of software, you've wasted a lot of very, very expensive, valuable time and gotten no benefit from it. So I want to make a distinction here between architecture and design. You'll probably notice that the article series I wrote for developerWorks is called Evolutionary Architecture and Emergent Design, and that's on purpose. In fact, Rebecca, who's doing a talk after me, is the one who convinced me that there's no such thing as emergent architecture. Let me dig into that just a little bit. This is a very abstract picture, but it actually tells an interesting story: the difference between architecture and design. There's a famous paper that Martin Fowler wrote in 2003, which is still available on his website, called Who Needs an Architect? It's about the role of architects on agile projects. In that paper, he gives several different extant definitions of software architecture, and he ends up giving my favorite definition of software architecture, the most accurate one I think I've ever seen, which is that the architectural elements in software are the things that are hard to change later. You can look at your software stack and ask each thing, are you hard to change later, and determine if it's architecture or design. So your database server is clearly architectural; so is your computer language.
The web framework you're using is architectural, because it's going to be hard to change that later. But the way you use that web framework, the way you use its workflow pieces, the way you use its validation pieces, those are part of design, because those are relatively easy to change later. This very abstract little blocks picture actually conveys a little bit of that sense, and the reason why you can't have emergent architecture: architecture is the foundation. You have to have some architecture to start with. You can't start with no architecture and build it up. So the trick here is, how can we start with architectural elements, and have as few of those as possible, and have more design elements, which are relatively easy to change later? That's one of our goals: to try to make more things design-like, so we can change them more easily later. And that is, in fact, the distinction between evolutionary architecture, where you have to start with something, versus emergent design, where you can afford to do a lot less up front, because the problem will inform you, and it's not so expensive to make changes. So let's talk about design, and in particular, emergent design. I'll give a definition here from Webster. Something that is emergent arises or emerges out of something that covers or conceals it, and comes to light. That's one definition. Another definition is something that suddenly appears, arises unexpectedly, and calls for prompt action or urgency. The two definitions here very nicely correspond to the two key parts of emergent design that I'm going to dig into in this talk. And here's the first one. Let's say that at work you have a problem that's conceptually like this, a kind of tangled knot of concepts and connections between them. And you think about this while you're at work, working on this project. You think about it while you're driving home. You think about it while you're mowing your lawn.
When your spouse is talking to you and you're supposed to be paying attention to them, you've still got a background thread thinking about this problem. It just keeps bugging you; you think there's got to be some sort of sanity in there. And you think about it and concentrate on it until it finally resolves itself into some sort of reasonable structure that makes sense. That's really an exercise in finding abstractions and patterns that are already there in your code. Now, when I say the word patterns, that's kind of a loaded term in the software world. I'm not talking about patterns with a capital P here, like Design Patterns: Elements of Reusable Object-Oriented Software, which is a very useful concept, but it's very generic. What they did in the Design Patterns book is say, look, we've identified these things that show up in pretty much every piece of software we've seen, and we've created this big giant catalog for you to reference. Those are not the kind of patterns I'm talking about. I'm talking about patterns with a lowercase p, what I'm calling idiomatic patterns: patterns that are unique to your application or your suite of applications that nobody else in the world cares about. You'd never write a book about them, because nobody cares. A perfect example of this is that you probably have a specific way within your company that you handle authorization and authentication for the parts of your applications. You probably do that consistently across your entire enterprise, or you desperately wish you did. That's an important thing for your business: how do we handle this particular common use case? That's an idiomatic pattern for your business. Nobody outside your business cares that much, because they have their own way of doing it, but it's very important to you. I further subdivide these idiomatic patterns into two subcategories: technical and domain idiomatic patterns.
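As a hedged sketch of what a harvested idiomatic pattern might look like, here's the authorization example in Python. Every name here (User, requires_role, the role string) is invented for illustration; it stands in for whatever your company's actual convention is, pulled out of the copy-pasted handlers it used to live in.

```python
from functools import wraps

# Invented stand-ins for a company-specific authorization convention.
class User:
    def __init__(self, name, roles):
        self.name = name
        self.roles = set(roles)

def requires_role(role):
    """The harvested idiomatic pattern: one reusable authorization check
    instead of the same if-statement repeated in every handler."""
    def decorator(handler):
        @wraps(handler)
        def wrapped(user, *args, **kwargs):
            if role not in user.roles:
                raise PermissionError(f"{user.name} lacks role {role!r}")
            return handler(user, *args, **kwargs)
        return wrapped
    return decorator

@requires_role("billing-admin")
def view_billing_codes(user):
    return ["BC-001", "BC-002"]
```

Once the pattern is harvested like this, the next application in the suite gets the company's authorization behavior by reusing one piece of code rather than re-deriving it.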
Technical patterns are things like validation or security or transactionality, any sort of reusable behavior that you would like to be able to harvest and reuse at some point in the future, because it does something very elegantly, or does it very simply, or it's just the best way you've found to do it. And domain idiomatic patterns are things like business rules, shared functionality that you discover from a technical standpoint. It's not that uncommon in companies where, for example, marketing and sales are doing a lot of things that overlap and don't really realize it, and the technical people realize that and find some common elements they can harvest there. So when I use the term this way, patterns really describe effective abstractions, effective ways that we solve problems that we would like to leverage in the future any time that problem or a very similar problem pops up again. And so that's one half of this idea of emergent design: finding and harvesting idiomatic patterns. The second part of this has to do with the lean concept of the last responsible moment. Most of you are familiar with this; the idea of the last responsible moment is, how can you defer important decisions in a safe way? And of course the concept here is that the longer you can delay the decision, the more knowledge and context you are going to gather about the real, true nature of the problem, not the problem you thought you were solving before you started working on it. Those things are never the same. You think about it, but once you actually get into the code and get elbow-deep into it, things always change. And so the idea of the last responsible moment is that the longer you can wait, the better your potential decision. But of course the hazard here is waiting too long and actually causing yourself big problems. This is part of the magic of emergent design. The really tricky part is how you determine exactly when that last responsible moment is.
And I don't know; I don't have silver bullets. I'll give you some pointers on how we find this stuff, and toward the end I'll give you a case study of a project that I think pretty much nailed the last responsible moment almost exactly. But that's still a tricky thing. The other thing I would mention about these two pieces of emergent design is that finding and harvesting idiomatic patterns really pertains more to existing brownfield code bases, whereas the last responsible moment is very much a greenfield concept. Both work in both cases, but they lean in those directions. I'm gonna talk about some things that make emergent design possible, but before I do that I'm gonna talk about something that is unfortunately way more common, which is things that make it hard. I'm gonna talk about three very common things that we see in software projects that really hamper your ability to do something like emergent design. And the first one of these is the nature of complexity in software. It turns out software has two kinds of complexity. There is the essential complexity of a particular problem, which is just the inherent complexity of what this problem is. Every problem has a certain amount of complexity that is inherent to the problem itself. But unfortunately we also have lots and lots of accidental complexity in the software world, which is all the externally imposed ways that software becomes complex. It sometimes seems to me that this is where the true innovation happens in software: figuring out new and unique ways to introduce accidental complexity on projects. It's amazing how much craziness is on projects a lot of times due to accidental complexity, something that shouldn't be there but is, because of historical reasons. And a lot of being able to do emergent design is being able to identify what is accidental versus what is essential. I've got a couple of examples of this. This is of course on a spectrum.
A classic example of essential complexity comes from the original XP project. The original Extreme Programming project, the C3 project, was a project to do a payroll system for an auto manufacturer in the US. And the manufacturer has facilities in multiple cities, multiple states, scattered all over the country, and the goal was to write a payroll system for all of them. It turns out that one of the manufacturing facilities had a really clever union rep who managed to get the first day of hunting season off as a paid holiday for everyone, just in that facility. That was really good for him, because he managed to get something for his employees that nobody else had. But it made the software more complex, because that's one of those little changes that ends up finding its way all over the place: now, when you accrue work days, and when you accrue vacation days and sick days and personal leave days, there's always this extra condition to see if it's this facility versus all the other ones. And the developers would have loved to go to the union rep and say, you know what, that sounds like a really good idea, but it's gonna make our software too complicated, so I'm sorry, you can't have that. We don't have that luxury, and so they had to figure out a way to solve this. This is the essential complexity of the problem. As you start creeping a little further to the right, you end up with something that I've encountered a lot on projects: really fine-grained security on a page or form somewhere. Somebody says, I wanna control who can see and change every stinking field on this form or this page somewhere. And software developers say, all right, we can do that, and so you build an enormous amount of infrastructure to handle that really fine-grained control, and you have to build some infrastructure so somebody can come in and define who can see and who can access every stinking field on every stinking form.
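As an illustrative sketch of living with that kind of essential complexity, here's how the one-facility holiday rule might be kept in a single place instead of scattered through every accrual calculation. The facility name and the hunting-season date are invented for illustration; the real rule obviously lived elsewhere.

```python
from datetime import date

# Invented stand-ins: one facility negotiated the first day of hunting
# season as a paid holiday. The essential complexity can't be removed,
# but it can be confined to one function instead of leaking into every
# vacation, sick-day, and work-day accrual.
HUNTING_SEASON_OPENER = {2024: date(2024, 11, 15)}  # hypothetical dates
SPECIAL_FACILITY = "flint-plant"

def is_paid_holiday(facility: str, day: date) -> bool:
    if facility == SPECIAL_FACILITY and day == HUNTING_SEASON_OPENER.get(day.year):
        return True
    return False  # plus the company-wide holidays, elided here

# Every accrual calculation asks this one question instead of repeating
# the facility check itself.
```

The condition still exists, because the problem demands it, but each calling site sees one uniform question rather than a special case.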
And you finally roll this out to your users and let them see it, and they kinda turn their nose up at it: wow, that's really hard, I have to do that for every page? I don't wanna have to do that. It surprises me that there's not more workplace rage in the software world, because you've poured all of your heart and soul into this elegant implementation and then your users are not that crazy about it. But this actually points to another problem in software: very often users request things that they have fantasized about but have never actually seen for real. And sometimes your fantasy life is richer than your real life, and when you see it in front of you it's not nearly as cool as you thought it would be. A lot of times people ask for things without understanding the implications of what they're asking for, and when it gets delivered to them it's way too complicated, or way more cumbersome, or not what they wanted after all. So that becomes a form of accidental complexity that's being driven by requests from people who don't understand the true nature of what they're asking for. And you keep creeping further to the right and you end up with things like EJB and BizTalk, which is not to say these are inherently flawed technologies, but the way a lot of people use them ends up being flawed, because these are heavily architectural elements. These are extraordinarily hard to change later. And so now you have this deadly question at the beginning of the project: are we going to need declarative distributed transactions at any point along the way? And if you say yes, it's like, okay, well, let's start with EJBs and build all that behavior in at the bedrock architectural level. But it adds so much ridiculous accidental complexity to your project that a lot of times the project fails before you even get to the first release, because there's so much stuff to slog through before you even need that capability. You're taking on an enormous amount of accidental complexity.
This distinction, of course, is not new at all. Fred Brooks made it back in 1975 in The Mythical Man-Month, and we still haven't learned to fully evaluate it. The second thing that makes emergent design really hard is runaway technical debt. This is a metaphor from Ward Cunningham that you're all very familiar with, I'm sure. It's the idea that there's a delta between the most perfect design you could have, a code base where no amount of attention is too great to make it as perfectly honed as possible, versus all the compromises that you're forced to put in it because of external forces. The most classic driver of technical debt is schedule pressure: we've got to get this stuff done. But another driver of technical debt is the boat anchor anti-pattern from the C2 Anti-Patterns Wiki, which is: we've bought this expensive piece of equipment or this expensive framework, and you have to use it on every project, whether it makes sense to do that or not. Tying these two things together, I was at a client at one point helping them debug a problem they were having. It was this resume-driven-development kind of project: they had EJBs, and they had Spring and Hibernate being instantiated inside the EJBs, and it was going through Cocoon to produce XML, and XSLT to produce a portlet that goes in this portal frame, and they were having problems with the stack blowing up on the application server. At one point somebody said, yeah, and our seven users don't like it very much either. And I said, this thing has seven users? Yeah, well, it's eventually gonna be like 10 users. And I looked at what it does: you go in and fill out three forms, and then click a button, and it comes back with a billing code. That's the whole thing. I said, how much have you invested in this project so far? Somewhere around 1.1, maybe 1.5 million dollars so far. That's craziness. Why are you doing that?
Well, the corporate mandate says everything goes through Cocoon, everything is EJB-based, and so we're just following the corporate guidelines for what we're supposed to do. And so I went home and, over the weekend, rewrote it in Ruby on Rails. It took me about two hours. That made them change their mind: maybe we shouldn't standardize so much on this really complex architectural stack, particularly when we're building simple little things somewhere. Martin Fowler, our chief scientist, has actually created a quadrant of how you end up with technical debt, with the categories reckless and prudent, deliberate and inadvertent. Reckless and deliberate is: we don't have time for design, we've gotta get busy. You guys start coding; I'll go upstairs and see what they want. Prudent and deliberate is: we have to ship now and deal with the consequences later. Inadvertent and reckless is: what's layering? This is talking to developers who say, you know, it's really convenient to put all your code in one giant JSP so you can scroll up and down and see where all the global variables are defined. And inadvertent and prudent is: now we know how we should have done it. Hindsight is always 20/20 in the software world, and in fact Fred Brooks has an interesting observation about this. He said if you want an awesome piece of software, write it once, then throw all that away, and then write it a second time, and the second one will be awesome. Why is that? It's because the second time through there are no more unknown unknowns. You've already solved all the unknown unknowns; they're all known knowns now. And in fact, the second time you write it, you can probably estimate to the day when it's gonna be done, because it's easy: you've already done it once. Unfortunately, that's a really expensive way to write big software, to write the entire thing and throw it away, but it is a way to get really good software.
Technical debt is a reality in our world, and I don't think we should try to stamp it out, because it's not inherently a bad thing, just like credit card debt is not inherently a bad thing, but you have to know how to use it. So the real trick with technical debt is not eliminating it; it's negotiating repayment for it. You first gotta convince someone technical debt exists before you can start a conversation about repayment. Wouldn't you love to live in a world where your project manager comes to you and says, we need these three new features, and you say, nope, I'm sorry, you've reached your credit limit on technical debt. We're not gonna implement any more features until you pay back some of this debt, and if you keep abusing this privilege we're eventually gonna declare technical debt bankruptcy on you and never listen to you again. It would be awesome to live in that world, but we don't. And that's the real problem: companies accrue technical debt and never really go back and try to address it, and it has long-term bad implications. But how do you convince some manager type of this? Well, here's one of my favorite mantras in the Agile engineering world: demonstration trumps discussion. You can talk about this stuff until you're blue in the face; you'll never convince someone until you can show them objective data about it. So I wanna show you a couple of ways, and there are actually myriad ways to do this, to illustrate technical debt to non-technical people. Before I do that, though, I need to take a very short digression and talk about a metric, a very old metric as it turns out: cyclomatic complexity, which measures the complexity of a method or function, giving it a numerical value. This was created by Tom McCabe back in the C days, back in the mid-70s.
Here's the formula: edges minus nodes plus two, where edges are the possible transitions through the code and nodes are the statements themselves. So if you look at a little method like this, you can draw it out in this kind of nodes-and-lines view, and the cyclomatic complexity of this is four minus four plus two, so this has a value of two. It's kind of boring when the numbers cancel each other out, so here's a slightly more sophisticated method, drawn out in the same nodes-and-edges view, and I can number them: eight minus seven plus two gives me a cyclomatic complexity of three, so this is one more complex than the one I showed you before. This now gives you a tool where you can compare two methods side by side and say which one is the more complex method. This is useful in several ways. One, it allows you to run a metrics tool and go find the most complex class in your code base. That's not to say it is the most accidentally complex, but it is the most complex, because you can find all the really complex methods in it. You can also, because this is a method-level or function-level metric, take the entire code base, run cyclomatic complexity over the entire thing, and then divide by the number of lines of code. That gives us cyclomatic complexity per line of code, which is what this graph shows. This is a graph from a real software project, a project for a public-facing media site in the UK. This is a snapshot in time (it's probably hard to see these numbers) from April 1st, 2006 to August 18th, 2007. Cyclomatic complexity per line of code is the graph line that you see there, and the little gray numbers at the bottom are the releases of this software (not the public releases, just the releases). Just by looking at this chart, can you tell what the first actual public release of this software was?
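The formula is simple enough to sketch directly. The two calls below reproduce the two examples from the slides (four edges and four nodes, then eight edges and seven nodes); the counts are from the talk, and everything else is just the arithmetic.

```python
# McCabe's cyclomatic complexity for a single connected control-flow
# graph: V(G) = E - N + 2, where E is the number of edges (possible
# transitions) and N the number of nodes (statements).
def cyclomatic_complexity(num_edges: int, num_nodes: int) -> int:
    return num_edges - num_nodes + 2

# The two examples from the talk:
print(cyclomatic_complexity(4, 4))  # 2: the first, simpler method
print(cyclomatic_complexity(8, 7))  # 3: the method with one more branch
```

For ordinary structured code the same number also falls out as one plus the count of decision points (ifs, loops, case branches), which is how most metrics tools compute it without drawing the graph.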
It turns out it's number three, because you can see technical debt manifesting right there. At the beginning of a project (this is a project for a client), the client is very, very nervous about getting to their ship date because they're not making any money; they don't start making money until this ships, and what that's driving is complexity, which they then kind of tamp down, and then complexity again, and there's a lot of volatility here in complexity per method. You can almost see the business analyst standing behind the developers going, is it done yet? What about now? What about now? Is it done yet? We gotta get this thing out or we're gonna lose all our money. And so they finally hit their date at release three, and Erik Doernenburg, the tech lead on this project, produced this much of the graph, up to release three, and took it to the client and said, okay, we need to do something about this, because the trends are disturbing; if we keep letting this climb, it's gonna be bad. But in what way is it gonna be bad? Technical debt manifests as a drag on velocity, because the more technical debt you have, the more stuff is in there that's a workaround for something that shouldn't be there. It's kind of like underbrush. If you're trying to walk through a field and there are really tall weeds, it really makes it hard for you to navigate and find the path that you're trying to be on. Technical debt manifests as a slowing velocity. In fact, some of you may have had the experience where two years ago, adding a new feature to your code base could be done in a couple of days, and now it takes a couple of weeks to add the same level of new feature, because you've got this giant pile of technical debt lying around, making it hard to do things like refactoring and changes to your code base. So Erik won his argument and they said, okay, let's put some effort in. 
And so 3.0 and 3.1 were them going in and primarily cleaning up technical debt. And the inertia from that effort carried all the way down to this little valley right here at release number seven. Then an interesting thing happened. The client decided, we want more features faster. And so what they did was add a second parallel development team working on a new set of features, and that's where we start leapfrogging releases here. What you're seeing now is the gradual creeping complexity of having to integrate two code bases together. So it is creeping up, but in a much more controlled way now. But notice the other thing that this gives you. Cyclomatic complexity per method has a particular dollar value for a company: this is how much it's costing us to write software per developer. And you can look at the amount of software versus the effort here, and the amount of software versus the effort here, and make an objective decision: is it worth having this second team? Because what are we trading off here? We're saying that we're gonna get a higher overall velocity, but we're gonna take a small hit on individual velocity. Because that's what this says: every developer is gonna be a little less productive, because there's a little more complexity than there was before. So every developer's gonna be slower, but we've got a lot more developers, and so overall that's gonna speed us up. Now you can actually do a real evaluation: is it worth it monetarily to have a second team, or are we better off going back to the original team and lowering our complexity? I think this is a great example of demonstration trumping discussion. The one that Erik produced was by hand. He actually wrote a little script that went to the first version, checked it out, got cyclomatic complexity, went to the second version, checked it out, automated all of that, mashed it all into a single comma-separated file, and then pulled that into Excel and graphed it. 
Erik actually does a lot of work on visualizations, and this is a good example of one of his works. And there's a little bit of effort to create this. Fortunately, someone has already created a lot of these things for you. Now, most of my talk is kind of Java-centric because I had to pick a language, but a lot of these tools exist in other spaces. Sonar is a Java-specific tool, an open source tool that gives you a lot of nice visualizations for a bunch of common metrics like cyclomatic complexity and coupling. They give you a sample instance where they've taken a bunch of open source pieces of software and run them through Sonar. And the nice thing about Sonar is it gives you a lot of results for very little effort. You basically install it and point it at your code base. The main reason I talk about it here is that one of the interesting things Sonar has is a technical debt calculator that'll actually take some of the criteria in your project and calculate what it thinks your technical debt is on that project. Fortunately, you can tweak these formulas a little bit, because one of the hazards in Sonar is that that number tends to come out a little bit on the high side. The last thing I want is for you to take this back to your code base, run Sonar against it, and have it come back and say it's gonna take one and a half million dollars to get your code base back to the point where it doesn't suck. And you're gonna take that to some manager somewhere and show them that, and then you're gonna jump out a window, because there's no way they can afford a million and a half dollars to make your code not suck anymore. So you may wanna tweak that formula a little bit. You wanna frighten them, but not enough to jump out a window. That's a really fine line there. You wanna get their attention, but you don't wanna frighten them to death. So it might take a little tweaking to do that. 
This is really good if you have managers that can understand numbers and can read words, but if your manager doesn't have time to read words, then it comes with pretty pictures too. And in fact, this is really valuable in software metrics: very often, raw numbers don't matter that much, but trends are everything. Cyclomatic complexity is a great example of that. I have a method that has a cyclomatic complexity of 12. Is that good or bad? I don't know, it depends on what it does. But if it's 12 now, and I check it again in a month and it's 20, and I check it a month later and it's 35, clearly it's going in the wrong direction in terms of complexity, and so I can try to tamp that down some. This is the kind of thing that you don't wanna see on a project. This is what Sonar calls Time Machine: the blue line is coverage, the green line is cyclomatic complexity, and the orange line is complexity per method. This is on the Struts code base. What this shows is that in September 2009, something horrible happened to the Struts code base. They imported a whole bunch of stuff from another framework that had terrible test ratios and giant complexity, and it tanked their whole code base for a while, and you can see it on this graph. This is really valuable for you as a company. Let's say that your company merges with some company that has really awful, dysfunctional IT, and then a year later somebody says, why is our IT in such a mess? You can go, oh, I can answer that question. Look, I've got a graph that shows that the day we merged, things went bad. Yes, so his question is, your manager's gonna say, well, your code works, so why do you care? And you don't, if you're never gonna make a change to that code ever again. So when your manager says it's fine, it works, say, okay, good, I never have to touch it again, awesome. I can move to a greenfield project and my life will be much better. 
It's only when you have to change it that this manifests, and the more you change it, the more technical debt is there, the slower it's gonna be, the longer it's gonna take, and the more complex it's gonna be to do even simple things. So that only holds if the code is, in fact, at end of life. Now, when is the end of life of code? Code ends life when you unplug the final server that has version control of that code and use a magnet to wipe the hard drive. That's when software is done, because nobody will ever make a change to it after that. Until you do that, somebody's always gonna wanna make a change to that software. The COBOL that prints paychecks: I guarantee you, back in the 1960s when they wrote it, they thought, that'll never be running in the year 2000. Guess what, COBOL still prints your paycheck. This is actually a much better picture. This is Spring Batch, which shows over time gradually escalating complexity as they add more behavior, but very consistent complexity per method and very consistent code coverage throughout. So this works well if you have managers that only like pretty pictures. If your manager is so ADD that you can't even get them to look at pretty pictures, you can show them pretty moving pictures. Yes, because it's doing more things. This is not accidental complexity, this is essential complexity: this code needs to do more things over time, and so it becomes more complex. That's an important point. Notice that complexity comes in two flavors. This is essential complexity: as it adds more actual behavior, you will see essential complexity go up. Complexity per method is steady, whereas the number of methods is growing. That's right. So if you have someone who doesn't even want to look at graphs, Sonar also has Gapminder-inspired motion charts. Here's an example of one of those. 
This is complexity over complexity per method in Struts, and you can see as my little ticker gets to September 2009, the little bouncing ball is going to sprint to the upper right-hand corner and turn beet red. That's a good indicator that something has gone wrong in your code base at that time. That is the numeric value over time. This is Gapminder style, so what you're seeing is time driving that little red ball, and that's the value of complexity over complexity per method. So Sonar's got a ton of nice things like this, and there are tools for other languages that give you some of the same kinds of metrics. There's NDepend in the .NET space that gives you some of these same kinds of metrics. These metrics exist across a lot of different languages. And that's the second thing that makes emergent design really hard: a lot of technical debt. I like that underbrush metaphor, because if you're trying to make your way through a jungle and there's thick underbrush, it's hard to see the real path that you're trying to follow, and it makes it very easy to stumble off on a side path that's not leading you anywhere interesting. The third thing that makes emergent design really hard is rampant genericness. This is generally viewed as purely a good thing in the software world: if we build lots of layers for extension into our design, we can easily build more onto it later. Which is true, but you end up adding complexity before it has to be there. This is what the Pragmatic Programmers called software entropy, a measure of the complexity in software: anything you have in your code base that's there for some future capability really manifests as technical debt and accidental complexity, because you're not using it yet, it's just in your way. 
And in fact, a lot of experience shows us that the things you add to your code base because you just know they're gonna need to be there by the time you get six months down the road either don't need to be there at all, or are so different that you end up having to make a major change anyway, so you haven't actually saved any time and you've just generically obfuscated your code. So this is my perspective on technical debt: when you add something to your project, that's when you start paying for it, and you only get payback when you start using that feature, and everything in between is technical debt. This also applies, by the way, to big giant frameworks. If you've got a big giant framework of some kind and you're not using two thirds of it, that's manifesting as technical debt, because it's stuff in your way in place of the things that you should be paying attention to. Okay, well, I've been ranting about things that make it hard, so let's talk about some things that make it easier. Let's talk about some emergent design accelerators. The first of these I wanna talk about is using TDD as a design tool, what I call test-driven design. I think that, when done correctly, TDD is as much or more about design than it is about having unit tests. I view TDD as a design aid that has a really cool side effect of leaving you with a bunch of tests, because design will emerge from the tests if you get out of their way, and you get less accidental complexity because you're getting a better atomic understanding of intent. But I wanna demonstrate this. So I wanna show you an example of this concept. I've actually written about this in that article series; there's a two-parter about this example if you wanna see more details, but I wanna show you the results from it. So what I wanna do is take a very simple problem and implement it two ways. I'm gonna implement it test-after and implement it test-driven and then compare the results. And here's the problem I'm gonna solve. 
Finding perfect numbers. This, it turns out, was numerology trivia back in the 1500s and 1600s. A perfect number is one where you take the sum of its factors, not including the number itself, and it equals the number. Six is a perfect number because its factors are one, two and three, and one plus two plus three equals six. 28 is also a perfect number: if you take the factors and sum them all up, not including 28 itself, it'll equal 28. There aren't that many perfect numbers in the universe, so that's what my code's gonna do, try to find perfect numbers. The first version I'm gonna write is the test-after version. And so here's the code. I have this perfect method that creates a list of factors. I add one and the number, because those are always factors of a number, and now, from two up to the number, if number mod i equals zero, meaning i goes into number with zero remainder, then this is a factor because it goes into that number evenly. I'll harvest that factor, then I'll sum them all up and apply my formula to see if it's perfect or not: the sum minus the number equals the number. And I write unit tests for this and it works, but it's really slow. So I look for opportunities to speed it up, and it turns out there's a really good one right around there, because there's an observation you can make about this problem domain: factors always come in pairs. Say my target number is 16. When I grab factor two, I can also grab factor eight, because two times eight is 16, which means I don't have to traverse all the way up the list of numbers. I would be much more efficient if I just went halfway up, but I can be even more efficient than that if I only go up to the square root, because that by definition, times itself, equals my target number. So if I go up to the square root and harvest factors in pairs, I get all of them and it's much more efficient. And so I change my code to do that. 
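As a sketch of what that test-after version looks like (the names here are my invention, not the code from the slide):

```java
import java.util.ArrayList;
import java.util.List;

public class PerfectNumbers {
    // Test-after sketch: harvest 1 and the number itself, then every i
    // from 2 upward where number mod i is zero, sum them all up, and
    // apply the formula: sum minus number equals number.
    static boolean isPerfect(int number) {
        List<Integer> factors = new ArrayList<>();
        factors.add(1);
        factors.add(number);
        for (int i = 2; i < number; i++) {
            if (number % i == 0) {   // zero remainder: i divides number evenly
                factors.add(i);
            }
        }
        int sum = 0;
        for (int f : factors) sum += f;
        return sum - number == number;
    }
}
```

This works, but as the talk says, it's slow: it walks every candidate up to the number instead of stopping at the square root.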
I now go up to the square root of the number and I add both factors symmetrically right here, but it fails a unit test. It turns out there's an edge case around whole-number square roots. Think about it: if my target number is 16, when I add four, I'm gonna inadvertently add four again. Well, it's easy enough to fix, so I'll come in and add a guard condition for that edge case: if it's not a whole-number square root, then I'll add the symmetrical factor. This works okay and it's acceptably fast, so I'm going to call it done. Now, for the TDD version, I'm not gonna show you each method; you've already seen the algorithm, and this has a lot more methods in it. This is very characteristic of a TDD code base: lots of very small methods that each do just one thing. That's true in this code base as well; it's a very, very highly factored code base. In fact, TDD code bases tend to have a particular shape. They're kind of pyramid shaped, where you have a lot of functionality written as very small methods that then aggregate together to build real functionality. In fact, I will maintain that if you show me cyclomatic complexity per method, I'll tell you whether you have a TDD code base or not, because in my experience, TDD code bases average 1.5 to two per method, and in non-TDD code bases it's between 10 and 15 per method. Yes, cyclomatic complexity per method for TDD projects is usually between 1.5 and two. On non-TDD projects it's usually 10 or higher. A lot of projects actually say any cyclomatic complexity under 10 is okay on this project; I think that's ludicrously high for a method. I want to see the one-to-two range, and most TDD code bases naturally fall into that because you're building very small methods. That's what cyclomatic complexity per method is measuring: very small, singular methods. That's exactly what TDD encourages. But an interesting thing happened along the way of building this TDD version. I realized something. 
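A sketch of that square-root version with the guard condition, again with invented names:

```java
import java.util.ArrayList;
import java.util.List;

public class FactorsAsList {
    // Walk only up to the square root, harvesting factors in pairs:
    // when i divides number evenly, so does number / i.
    static List<Integer> factorsOf(int number) {
        List<Integer> factors = new ArrayList<>();
        for (int i = 1; i <= Math.sqrt(number); i++) {
            if (number % i == 0) {
                factors.add(i);
                // guard for whole-number square roots:
                // for 16, don't add 4 a second time
                if (i != number / i) {
                    factors.add(number / i);
                }
            }
        }
        return factors;
    }
}
```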
Are the factors of a number a list of numbers? Does the order matter? Turns out it doesn't. What I'm really after here is a set of numbers, and a failing test showed me that, because it said I was expecting one, two, three, six and I got one, six, two and three. Made me realize, oh, dummy, these don't have to be ordered. I can make these a set. And so very early on in that code base I converted everything to a set. Remember that edge-case code? It turns out this problem has nothing to do with whole-number square roots. It has to do with the fact that I chose the wrong data structure up front. And this happens all the time. When I sat down to write this problem, I said, oh, factors are a list of numbers, and kept going. And then later I had a breaking unit test. Now, in a perfect world, I would have gone back and questioned every bedrock assumption I'd made since the beginning of time to see where the problem was. But you never do that, especially when you can see a quick band-aid fix like this. And so what happened was, by choosing the wrong data structure, I inadvertently made my code more complex than it should have been. And the way that I fixed that complexity was to add more complexity to the code. Very often the solution to accidental complexity is more accidental complexity, because you don't really understand what the root problem was. This is where TDD is incredibly valuable for design, because it forces you to vet every little piece before you put it into the large overall structure that you're dealing with. The other really nice characteristic of TDD is that you spend very little time debugging traditionally. Because what is debugging? Debugging is building this big structure and then basically walking through it with a flashlight trying to find bugs. And if you're walking through this huge structure, it's gonna take you time. When you're doing TDD, you never (or rarely) walk through big structures, because you're debugging at the brick level, not the building level. 
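To see why the set was the real fix, here's a sketch (my names again): with a Set, adding four twice for 16 is harmless, so the guard condition, which was pure accidental complexity, simply disappears.

```java
import java.util.HashSet;
import java.util.Set;

public class FactorsAsSet {
    // Same square-root harvesting, but factors are a Set: order doesn't
    // matter, and a duplicate whole-number square root is silently
    // absorbed, so no guard condition is needed at all.
    static Set<Integer> factorsOf(int number) {
        Set<Integer> factors = new HashSet<>();
        for (int i = 1; i <= Math.sqrt(number); i++) {
            if (number % i == 0) {
                factors.add(i);
                factors.add(number / i);  // duplicate when i * i == number; the Set absorbs it
            }
        }
        return factors;
    }
}
```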
And once you get the little bricks debugged, you have fewer bugs when you incorporate them into the overall building. So testing is one of the ways to enable emergent design. And this is very much a greenfield way of handling emergent design. But let's talk about a brownfield example: refactoring toward design. So this is a little method that comes out of a little toy e-commerce site. What it does is not important; that it's hideously complex is what's important. What this does is take things from a shopping cart and put them into an order with line items, talking to a database using low-level JDBC in the Java world. I don't really care about the details of what it does. The point here is that I would love to be able to see if there are some useful idiomatic patterns hiding in this code, but I can't tell, because it's just a jumbled mess right now. This code seriously violates what Kent Beck calls the single level of abstraction principle, in that you're jumping abstraction levels all over the place here. There's some really low-level JDBC stuff, there's some really high-level business-method kind of stuff, and it's all jumbled together. So what I wanna do is refactor this, and in the first pass I'm gonna just take the common elements and kind of slurp them up with extract method. I've got a question for you. How many of you, when you're writing a method, put blank lines inside your method as a kind of spacer to separate things? It's okay, everybody does it. You know what that blank line is? It's a cry for help: make me my own method. Because you separated it out. You put a blank line there to say, this stuff goes together, separate from all this other stuff. Why don't you just make it a method, so you can give it a proper name and test it? Blank lines are a smell in methods. They mean this should be another method. (And I took the blank lines out of here just to make it fit.) 
So that's exactly what I'm doing, and this is easy in your code base. All you have to do is search for blank lines, grab everything between them, extract method. But if you do that, you'll end up with something like this. Because your refactoring tool has a very specific contract with you: I guarantee your code will still work the exact same way when I'm done. And when you have local variables defined like this and you need to call out to other methods, the only thing your refactoring tool can do is pass them as parameters. So you end up with parameter explosion when you do this style of refactoring. But I'm a human. I understand the implications of moving things up to the class rather than keeping them as local variables, and I know that it's safe in this case to do that. So I'm gonna make another refactoring run that's partially tool-based and partially manual, and get this code to here. And now I'm getting somewhere, because now I can tell what this code does. It does some setup, it adds a couple of things, it completes a transaction, and if something goes wrong, it rolls it back, and finally it cleans stuff up. Yes? I can barely hear you, so speak up please. Java is really good at method invocation and inlining, so you should never worry about the cost of method calls versus the readability of your code. Is it just Java, or any language? Pretty much any language now; you shouldn't worry about that. If you're really worried about that, you should write everything in hand-tuned assembly language. There's no way it'll ever be faster. It'll take you 20 years to write a web application, the fastest web application you've ever seen. But 20 years from now it's probably not gonna give you much business value, given the intervening 19 years, so there's a trade-off there. You shouldn't worry about this. Worry much more about the structure of your code than any sort of performance problems. 
If you do find a performance problem, go back and fix it then. Don't prematurely optimize and end up creating a lot of technical debt just for that infrastructure. Thank you. So now I've actually got something, because if I replace those two specific lines of code, I have created a kind of template here that I can harvest and reuse for transactional data access. Yes, so methods do add complexity, but it's essential complexity, not accidental complexity, unless you are just creating one method per line of code. There are huge benefits to having well-factored methods that are cohesive and just do one thing, so that's something worth pursuing. Even if it adds a little bit of essential complexity to your code base, it pays off in that it isolates changes so you can go find them. So is it safe to assume that you need not bother about class-level complexity, that you are well off if you are just taking care of your method-level complexity? Well, no; his question is, do you just care about method level versus class level? You care about complexity overall. This actually does not have the damaging effect that you think it does on your code base. Every method has a cyclomatic complexity of at least one, because it does at least one thing, but this really doesn't cause a problem, because you're not going to create a method per line of code. The amount by which it increases your code base is well paid for by the factoring of your code base and the nice cohesive methods that get things done. Yes? So the question is, wouldn't it be difficult to test those methods? No, it's not at all difficult to test private methods. You can use reflection to test them. If you use Groovy, you can actually just call them directly, because Groovy, when you unit test Java code, ignores private (because private is annoying if you're trying to do testing), or you can make them public and then make sure by convention that no one calls them. So these methods also should be tested. 
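For the reflection route, here's a minimal sketch of testing a private method in plain Java; the class and method names are hypothetical, not from the talk:

```java
import java.lang.reflect.Method;

public class PrivateMethodDemo {
    // A class with a private helper we'd like to test directly.
    static class Validator {
        private boolean shortEnough(String name) {
            return name.length() <= 10;
        }
    }

    // Reflection lets a unit test reach the private method: look it up
    // by name and parameter types, switch off access checking, invoke it.
    static boolean invokeShortEnough(String name) throws Exception {
        Method m = Validator.class.getDeclaredMethod("shortEnough", String.class);
        m.setAccessible(true);  // bypass the private modifier for testing
        return (boolean) m.invoke(new Validator(), name);
    }
}
```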
So the private methods, whatever we've broken them out of, should be tested too? I agree. I believe that private methods should be tested. In fact, people who say that you should never test private methods, that you should only test the public methods that call them, are not doing TDD by definition. How did those private methods come into being if you didn't test first? If they say you should never test private methods, they're not doing proper TDD, unless they're writing big methods and then refactoring pieces out into private methods, but even then they're not really doing TDD; they're building big methods and then refactoring them afterwards. So I wanna come back and harvest this as an idiomatic pattern, but I wanna show you another refactoring example first. What I wanna do is take an existing code base out in the world, an open source code base, and see if I can use some metrics to go find some idiomatic patterns that haven't been discovered before. First, though, I need one more metric. We have cyclomatic complexity; now I want afferent coupling. Afferent coupling is a metric that tells you how many other classes reference you as a class. So if you have a class like this and you have six other classes that reference it, the afferent coupling for this class is six. Six other things refer to this class. In some ways, afferent coupling is a measure of importance, because if a lot of classes use this, it's probably pretty important. And cyclomatic complexity is a measure of complexity. So using these two in conjunction, you can go find the most complex, important classes in your code base. That's exactly what I'm gonna do. Here are the criteria I wanted for the code base I was gonna do this on. I wanted it to be open source, so that you could do this as well. 
I wanted it to have been around for a long time, so that you'd hope most of the important design elements had been chased out of it. I wanted it to be ubiquitous. And I wanted it to be in Java, since most of this talk is in Java. So I chose Struts, a very well-known web framework in Java. Thousands, probably tens of thousands of web applications have been written in this thing. It's well over 10 years old. It's in its second major release. You would hope that by now the major design issues have been chased out of Struts. Let's see. So I downloaded Struts and I ran a tool on it called CKJM. This is a freeware metrics tool for Java. It runs the Chidamber and Kemerer object-oriented metrics suite on Java code, which is a whole bunch of metrics. The two metrics that I want, afferent coupling and complexity, come out of CKJM, and so that's why I used it for this problem. Yes, open source, freeware, yes. Very, very small tool. It actually runs the entire Chidamber and Kemerer suite against Java code, so that includes both kinds of coupling, cyclomatic complexity, and a whole bunch of other metrics that I didn't include here. There's an object-oriented metrics suite from Chidamber and Kemerer that defines, I think, six different metrics for object-oriented code, and it runs all of them against Java code. This tool is specific to Java, but you can get cyclomatic complexity for every structured language on Earth: there's Ruby, PHP, JavaScript. You can't get it for SQL, because it's not a structured language. In fact, most code coverage tools give you cyclomatic complexity for free as part of the code coverage tool. And afferent coupling you can also get for every code base in the world; NDepend gives it to you for .NET. It's a very easily harvestable metric. And then there's the way that CKJM reports cyclomatic complexity: remember, cyclomatic complexity is a method-level metric. 
But CKJM reports things at the class level, because it's an object-oriented metrics suite, and so what it does is take the sum of the cyclomatic complexity of all the methods. That's the number it reports here; WMC means weighted methods per class. If I look at the Struts code base, it turns out the most complicated thing in all of Struts is this class called DoubleListUIBean. It has a cyclomatic complexity of 66, which is hideous. So if I wanted to go tackle the most complicated class in Struts, this is the one I'd go for. But then look across at the afferent coupling: this guy's only used by three other classes. So if I go to enormous effort and make this better, it only directly benefits three other classes. It may indirectly benefit others, but there's only a direct correlation here. What if I sort these by afferent coupling instead? The most important thing in the Struts universe is Component, which is not surprising, but I'm not looking for the most important thing, I'm looking for high combinations of numbers. I want to find important, complicated things, to see if they are overly complicated. And the one that catches my eye here is the one five down, UIBean, which has a complexity of 53 and is used by 22 other classes. So I started poking around in that class to see what's making it so complicated. And here it is. This is a method called evaluateParams. My favorite part by far of this method, I don't know if you can see this, is that the very last line of this method is evaluateExtraParams. It's almost like the developer just got tired of typing. It's like, oh God, I can't type anymore. evaluateExtraParams, okay, I'm done. This is hideously complex. And it turns out that if you start poking around in Struts, a lot of the complicated, important classes have an evaluateParams or evaluateExtraParams in them. So I got curious. I did a little command-line judo. 
It said, give me all the Java source files in Struts, and then, within those, show me which ones have a method declaration for evaluate-something-params. There they are. I have found an idiomatic pattern in Struts. And I think this is a great example, because 15 years ago, when they started working on Struts, there's no way that somebody said, you know what, in 15 years the most complicated part of the code base is gonna be how we handle URI parameters. That's such a simple thing. It's so simple, in fact, that they've done it over and over and over and over and over again. I wrote this up as part of the article series that I was talking about. And one of the Struts committers contacted me after the fact and said that he had independently, just recently, discovered the exact same thing, and he was undertaking to fix it by extracting that as an embedded framework. So if you download Struts now, there's a parameter-handling embedded framework. And by doing that, he removed hundreds of lines of code and tens of cyclomatic complexity points from the Struts code base, made the entire thing smaller and less complex, and preserved the functionality by taking all that scattered code and consolidating it. Nobody had gone and found that scattered code before we applied these tools to it. That's a good example of using just a few metrics tools to actually learn some things about your code base that are actionable, that you could do something about. So I've been talking about finding these things. What do you do after you've found them? How do you harvest these things and make use of them? There are a couple of different ways to do this. The most obvious way is to harvest it as an API, and that's exactly what I'm gonna do here with the idiomatic unit of work pattern that I created before. Remember, this was the target of my refactoring code, and if I take out those two specific lines of code, this becomes a kind of generic transactional data access. 
And so here is where I can use proper Gang of Four design patterns. This is really just the command design pattern, and so I can go and harvest that by creating a wrapInTransaction method. And now my addOrderFrom method is this code at the bottom. It's worth pointing out here: remember the first version of this method that I showed you, that took up the entire slide in nine-point font? This is the final version of that method. With all the reusable code boiled out of it and harvested somewhere else, this is what it really ends up being, which is: wrap this command in a transaction to do those two pieces of work. But actually, going back to what Venkat was saying earlier this morning, Java forces a lot of ceremony and noise on you. What I really wanna do is say these two things wrapped in a transaction, but I have to create this anonymous inner class with this execute method and all that other stuff. What if I wrote this in a more modern-syntax language, like Groovy? Here's that same code, but in Groovy anything between curly braces is a code block. And you can pass a code block as a parameter, and you can execute it just by putting an open-close paren after the variable name. Because Groovy has code blocks, lambdas, closures, you actually can get rid of the command design pattern entirely. The command design pattern is a band-aid for a language that doesn't have higher-order functions. If you have higher-order functions, you can boil away the command design pattern forever, and it shrinks to just this, where I'm passing functionality up to someone who has the context to execute it. So this is a good example of how, in a more expressive language, a lot of the boilerplate kind of melts away. I'll come back to this concept in just a second. The other way to harvest these idiomatic patterns is with annotations. These are annotations in Java, attributes in C#. 
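To make that boiling-away concrete before moving on, here is a small Ruby sketch of the same idea. The class and method names here are mine, not from any project code; since blocks are Ruby's native higher-order functions, the command object disappears entirely:

```ruby
# Hypothetical sketch: the command pattern boiled away with blocks.
# Instead of an anonymous command object with an execute method, the
# caller passes the work itself as a block to wrap_in_transaction.
class OrderGateway
  attr_reader :log   # records what "happened", standing in for a real DB

  def initialize
    @log = []
  end

  # Generic transactional scaffolding: begin, yield the work, commit,
  # and roll back on any error.
  def wrap_in_transaction
    @log << :begin
    yield
    @log << :commit
  rescue StandardError
    @log << :rollback
    raise
  end

  # The formerly giant method shrinks to: wrap these two pieces of
  # work in a transaction.
  def add_order(order)
    wrap_in_transaction do
      @log << [:insert_order, order]
      @log << [:update_customer, order]
    end
  end
end
```

Calling `add_order` leaves the log as begin, insert, update, commit; an exception raised inside the block records a rollback instead. The "command" is just the block.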
You can't harvest everything this way, but one of the characteristics you'd really like out of the patterns that you harvest is that they stand out from the code around them. Anything that you've found and harvested as an idiomatic pattern is very valuable for your company; you'd like for it to stand out. And a great way to make things stand out is to use annotations for them, because they look different from the code around them. So for example, let's say for bizarre historical reasons the name of my country class has to be 10 characters or fewer. I can create an annotation in Java that lets me decorate the domain class and say, make sure this thing is never greater than 10 in length. I create a little validator framework that basically takes a class, walks through each method, looks for a particular annotation and, if it's present, calls the validate method on it. Both of these at the bottom are abstract, so I supply the annotation type and the method, and I can validate things. And so if I want my max-length validator, I extend my validator; there's my validate method, which makes sure that it's no more than 10 characters and, if it is longer, throws an exception. Annotations can actually be pretty sophisticated. Here is one that says, make sure that if you add a new country to this region, it's not already there. So make sure that it's unique. I can use that same little validator framework to do that. The point of this is that annotations can actually get to the runtime values of collections and things like that that are running in your code, so that you can do verifications and other sorts of things that are pretty sophisticated. Here are some tests that show that these annotations work. The reason that annotations are so nice for this is that annotations add orthogonal expressiveness to your language. 
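Ruby doesn't have Java-style annotations, but a class macro gives the same declarative, stands-out-from-the-code quality. This is a minimal sketch of the idea, not the validator framework from the slides; every name in it is illustrative:

```ruby
# Hypothetical sketch of declarative validation as a class macro,
# analogous to a max-length annotation on a Java domain class.
module Validations
  def validates_max_length(attr, limit)
    validation_rules << [attr, limit]
  end

  def validation_rules
    @validation_rules ||= []
  end
end

class Country
  extend Validations
  attr_accessor :name

  # Declarative rule: for bizarre historical reasons, names are
  # capped at 10 characters.
  validates_max_length :name, 10

  # Walk the declared rules and raise if any are violated.
  def validate!
    self.class.validation_rules.each do |attr, limit|
      value = send(attr).to_s
      raise ArgumentError, "#{attr} exceeds #{limit} characters" if value.length > limit
    end
    true
  end
end
```

The declaration sits at the top of the class and looks nothing like the methods around it, which is exactly the stand-out quality you want for harvested idioms.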
They don't look like other code, which makes them stand out, which is exactly what you want: for the really valuable pieces of code you've written, you want them to stand out and look a little unusual. Developers consume a lot of annotations that framework writers create. I don't think we create enough of them ourselves, because when it's applicable, this is a really nice way to harvest behavior. But I can't resist showing you this. Sticky annotations in Ruby, and of course this works in JRuby as well. So consider this little problem. Here's a unit test for some code in Ruby. And let's say that this complex calculation thing takes a really, really long time. So long, in fact, that I don't wanna take the time hit for running it when I'm doing unit tests. I only wanna run that test when I'm doing acceptance tests. Now, Ruby, being a very syntactically flexible language, will let you conditionally define methods if you want. So I can come in here and say, if the environment variable BUILD has acceptance as a value, then define that test method, and it'll execute if that's true. But that's kind of ugly, especially if you need to do this for several methods; that would kinda make your class look very unusual. A much easier, more elegant way to do this is to create an annotation for it. So this says: only in an acceptance build do you run that. Here's the entire implementation of acceptance_only. This takes advantage of a Ruby language feature called a hook method, where you can hook into a class. And here I'm defining a method called acceptance_only which sets a private member variable to true or false based on the status of this guy. And then I write a method_added method. In Ruby, every time the class adds a method, it executes this method for you right afterwards. And in here I say, unless the acceptance build flag is true, then remove that method you just added. 
This is the computer science equivalent of the game whack-a-mole, where the things pop up and you hit them. So as soon as you declare this method, it strips it right back out if that annotation isn't true. And then it resets the acceptance-build flag to false, so that you have to put acceptance_only on top of every method that you want to run only in acceptance, and that works beautifully. Here's another example of that. Let's say that you had some Ruby code and you wanted to log every single thing that happens within a method. You can use hook methods for this as well. So I'll come in and create a log annotation. And now in method_added, what I'm gonna do is save the name of the old method, and then redefine the method to do all my logging stuff, and at the very end I'm gonna turn around and call the original method I saved. Does anybody know what this is called in Java, what I just did? Aspect-oriented programming. This is an after pointcut: after you do this work, then do my code. Actually, this is a before pointcut, because I'm doing stuff before I call my code. You know what they call this in the Ruby community? Monkey patching. It is the exact same mechanism. There's just a little bit of difference in terminology. It's called monkey patching because somebody originally called it guerrilla patching, like guerrilla warfare, but somebody misheard it as gorilla, like the primate. And so they thought it would be funny to call it monkey patching. All monkey patching is, is aspect-oriented programming in the Ruby world. But the language already supports it. You don't have to bolt on another framework with a post-compiler or bytecode weaver or any of that stuff you have to do in Java. The complexity to make this happen in Ruby is significantly less than in Java because the language supports it at the metaprogramming level. You don't have to bolt something on on the other side. This is an old example, but a classic one. 
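Here is a runnable sketch of that acceptance_only trick, reconstructed from the description above rather than copied from the slides, so the exact names are mine:

```ruby
# Hypothetical reconstruction of the "sticky annotation": a class
# macro sets a flag, and the method_added hook strips the very next
# method defined unless this is an acceptance build.
module AcceptanceOnly
  def acceptance_only
    @acceptance_only = true
  end

  # Ruby calls method_added right after every instance method
  # definition in the class.
  def method_added(name)
    return unless @acceptance_only
    @acceptance_only = false           # sticky for one method only
    remove_method(name) unless ENV["BUILD"] == "acceptance"
  end
end

class CalculationTest
  extend AcceptanceOnly

  def test_fast_path
    :ran
  end

  acceptance_only
  def test_complex_calculation   # whacked unless BUILD=acceptance
    :ran
  end
end
```

With BUILD unset, instances respond to test_fast_path but not test_complex_calculation; run with BUILD=acceptance and the slow test comes back.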
There's Struts harvesting something off a form, and there it is in Ruby on Rails. Expressiveness matters. It matters a lot, a lot more than most people give it credit for, because I agree with Jack Reeves. I believe that code is your design, and you want the most expressive medium you can find. This is like painting with crayons versus watercolors versus oil paints. It's harder to paint with oil paints, but you can get much more sophisticated results with oil paints than you can get with crayons. It's really important to understand where to apply power and when to use it, but you need the power there. Frequently in software that becomes metaprogramming and things metalinguistic in nature. So you should push for expressiveness as hard as you can. Now, I know most of you probably don't get to pick which language you write code in on a day-by-day basis; it's Java or .NET. But that doesn't mean you can't start taking advantage of this on the periphery. Most of the time they will let you control your own developer tools. So, a classic example of this: you're writing Java code and you're writing unit tests in Java. You should stop right now. It is much easier to write unit tests for Java code using Groovy. Groovy is well designed to unit test Java code. It has mocking built in. It automatically ignores private, so you can TDD private methods in Groovy without having to fiddle with reflection and all that other junk in Java. And these are unit tests; that code never goes into production. So why not use that? It's gonna make your life a lot easier. It's gonna make it easier for you to exercise, like Venkat said, because you've removed the friction from doing tests by using a more expressive medium. Build tools are another great example. I've come to believe that there should never be XML in a build tool, that Ant and Maven are irrevocably broken because they are too declarative. 
You need languages, because you always need the flexibility of a language at the build level, which is why I love tools like Rake. Gradle is actually a fantastic thing. It is the get-out-of-jail-free card for Maven, because it understands all your Maven stuff, but it is language-based rather than XML-based, so it's much more expressive and much more powerful. Why not start using that? Even if you're writing Java production code, there are lots of places on the periphery where you can start using more sophisticated tools and languages to make things go faster. This really comes down to abstraction styles. We're all very familiar with imperative abstraction styles, which encompass both structured and modular programming and object-oriented programming. But you hear a lot of people talk about functional programming now. That's a different abstraction. Solving problems in a functional programming language is gonna be different than solving problems in an imperative programming language. And so most people think that to switch abstraction styles you have to switch languages, but you don't. A great example of this is the concept of anti-objects. This came from a paper at OOPSLA in 2006, I think it was, a paper called Collaborative Diffusion. And the point of this paper was that the metaphor of objects can go too far by making us try to create objects too much inspired by the real world. This is the problem of: if all you have is a hammer, then every problem looks like a nail. If the only abstraction your language supports is object orientation, then you're gonna try to cram every problem on earth into object orientation, even when it doesn't make sense to do so. That's actually why we have aspects, because it turns out there are classes of problems that you just can't cram into the object hierarchy, and you need cross-cutting concerns to handle those things. 
The idea of anti-objects is to create something that does the opposite of what you think it should be doing, if that gives you a simpler solution. It's kind of like this optical illusion: is this a vase, or is it two faces? It depends on your perspective. And there's a fantastic example of this anti-object principle in how the game Pac-Man works. I have to warn you, I'm about to explain how Pac-Man works and it's not going to be as enjoyable anymore. So if you want to play Pac-Man and keep enjoying it, you should probably leave now. Sometimes knowledge comes at a cost, and there's gonna be a cost to this, because I'm gonna explain how it works and it's not gonna be nearly as cool when I'm done. So think about the original Pac-Man arcade machine, which came out in the early 1980s. It had less memory and processing power than some watches have now. And they had a really complicated problem to solve: how do you calculate the distance between two moving objects in a maze? They didn't have anywhere nearly enough processing power to do that. So they took an anti-object approach. Now, if you're a Java object-oriented programmer, probably your first instinct here would be to create a Ghost class and create a PacMan class and create a Fruit class and instantiate all those guys and calculate the distances between them. That's not what they did. They built all the intelligence into the maze itself. The maze is basically a state machine that executes rules for every cell in the maze, and they invented this concept of Pac-Man smell. When Pac-Man is sitting on a cell, it has maximum Pac-Man smell, and the cell he just vacated has maximum Pac-Man smell minus one, and it decays pretty quickly. And all the ghosts do is wander around semi-randomly looking for Pac-Man smell, and when they encounter it, they go to the next adjacent cell where the Pac-Man smell is stronger, and they can move slightly faster than Pac-Man, and that's the whole game. 
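The smell mechanism just described can be sketched in a few lines of Ruby. This is my own illustrative reconstruction of the idea, not actual game code:

```ruby
# Hypothetical sketch of the anti-object idea: the maze cells carry
# the intelligence, and ghosts just follow the gradient of smell.
class SmellMaze
  MAX_SMELL = 100

  def initialize(width, height)
    @width = width
    @height = height
    @smell = Array.new(height) { Array.new(width, 0) }
  end

  # One game tick: every cell's smell decays, then Pac-Man's current
  # cell is reset to maximum.
  def tick(pacman_x, pacman_y)
    @smell.each { |row| row.map! { |s| [s - 1, 0].max } }
    @smell[pacman_y][pacman_x] = MAX_SMELL
  end

  # A ghost moves to whichever adjacent cell smells strongest.
  def ghost_move(x, y)
    neighbors = [[x + 1, y], [x - 1, y], [x, y + 1], [x, y - 1]]
    neighbors
      .select { |nx, ny| nx.between?(0, @width - 1) && ny.between?(0, @height - 1) }
      .max_by { |nx, ny| @smell[ny][nx] }
  end
end
```

After Pac-Man moves from (2,2) to (3,2), a ghost at (4,2) steps onto the trail at (3,2): there is no distance calculation anywhere, just per-cell rules.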
They played a really nasty trick on you when they gave the ghosts eyes, because they can't actually see you. They can only smell you. If they'd given them noses instead, it would be a lot more clear-cut what they're doing there. But notice, the next time you play Pac-Man, you can run right up on a ghost, because they can't see you coming. They can only tell where you've been. They only detect Pac-Man smell, and you only leave that where you've walked across part of the game board. That's the whole game. That's a great example of flipping the problem to the opposite of what you think you should be modeling, and ending up with a much simpler solution. There are some other examples in the Collaborative Diffusion paper. They were trying to model in software drops of water in a bucket, and they were trying to model the drops. It was really complex, and then they ended up modeling the water, and it was much simpler. We did this on a project where we were managing leased rail cars, and we were trying to build all this logic to get rail cars to the right parts of the rail yard, and we realized that's dumb. We should write the abstraction at the track level. That was an anti-object approach, and it turned out to be way simpler worrying about the tracks rather than the actual cars that were running on them. So this is a good example of abstractions and of finding and harvesting idiomatic patterns in the brownfield world. But I wanna come back to the last responsible moment. As I said earlier, this is the tricky thing, and I can't tell you how to find it, but I can give you some clues. One of the things you can look for is that over time, a particular element of your piece of software, a particular component, a particular service, will have a particular job to do. And over time it will sometimes take on more responsibility. 
Every time there's an uptick in responsibility, whether that's more code or more scale or whatever it is, that's a good time to evaluate that component and ask it: has your last responsible moment arrived or not? So this is one of the tasks of an architect on an agile project: to have a pretty good idea of these components and where they are in terms of complexity, how much work they should be bearing, et cetera. Another great tool, which most of you are already familiar with, I'm sure, is spikes. Spikes are time-boxed experimental coding exercises that let you learn something about your code base. A lot of times you end up misestimating things because you don't have enough understanding of the true problem. Spikes are a way to accelerate that, so you create a time box of two hours or two days. I think the longest spike I've ever been on is a week, but during that spike we were trying to see: could an open source search engine replace the commercial search engine we had? So we had all these criteria we needed to try, so we spiked out one of each category, and by the end of the week we had pretty good confidence that yeah, we could replace the sophisticated commercial one with the open source one, because we understood a lot more about it. So here's a case study of a project that I think did an extraordinarily good job with the last responsible moment. This is for a site that ThoughtWorks has been working on for close to five years now. This is ove.com, online vehicle exchange. This is a site that allows car dealerships to buy and sell used cars at auction. This is a Ruby on Rails project that we've been working on in Atlanta, it turns out, for about five years, and the really interesting part of this is the evolution of asynchronous messaging within this application. Relatively early on, within the first six months of this application, the users decided they wanted status on uploads. 
This is an auto auction site for car dealers; it's not unusual for people to upload 50 cars at a time, and each car has 50 or 60 photos, because it turns out that not only do you not trust used car dealers, other used car dealers don't trust used car dealers either. So when they sell each other cars, they basically want a photo of pretty much every molecule of the car to make sure that it's all exactly what they say it is, and so every one of these listings has lots and lots of photos. It takes a long time to upload all this; nobody wants to look at an AJAX spinner for an hour, and so we want a progress bar, or we want the ability to start it and then come back and see if it's done. That's classic asynchronous behavior. And so one of the suggestions was, well, we should add a message queue, because that's the traditional way you handle asynchronous behavior. But then we did a couple of little spikes and found this little thing, BackgrounDRb, this little thing in the Ruby world that basically simulates a simple message queue backed by a relational database. Well, we did a couple of spikes and said, you know what, that's good enough. But the tech lead on this project realized that at some point in the future we might have to replace this. So he put it behind the Ruby equivalent of an interface. And at about the one-year mark, the second asynchronous requirement came in: timed things, like cron jobs, things that automatically kick off at a certain time of the day. We thought about an asynchronous message queue, but then realized we could create another instance of BackgrounDRb and it would work just fine. So we left it. And at about the two-year mark, another asynchronous requirement came in: things that run continually, things like updating counts and cached values. This whole time we'd been using a simple message queue backed by a database, but it was starting to creak a little bit under the amount of load we were putting on it. 
Also, one of the DBAs on the project had discovered that we were using his precious relational database as the backing for a message queue, and he started making this face at us every time he saw us, which we viewed as probably a bad thing. And so then we switched to a real message queue in the Ruby world called Starling. But what happened there? This goes back to what I was saying before, that you don't know what you don't know in projects. And a lot of times there's this kind of knee-jerk reaction to say, oh, we need asynchronicity, but we don't know everything we're gonna need asynchronicity for, so let's buy the fanciest message queue we can and then we'll grow into it. But that's deadly. Because if you don't grow into every single one of those features, all you're doing is purchasing technical debt. There are gonna be a lot of pieces of this thing that you don't need that are in your way; there's a bunch of configuration there; it instantiates a bunch of stuff that you're not using. It's not as efficient as it would be if it were single-purpose. So what happens in a lot of cases is you end up paying money for technical debt that you never justify. When you over-buy tools early on, you're buying technical debt that a lot of times you'll never actually get payback from. You're actually much better off trying to go as simple as you can and actually learn the nature of the problem. By the two-year mark, we knew pretty much exactly what we needed asynchronicity for in this application, and we picked the simplest thing that did the work we needed to get done. We didn't have to get a really fancy bells-and-whistles kind of message queue, because we understood the problem well enough to say, this is sufficient for what we need to get done. But notice what else we did there. Messaging infrastructure is traditionally architectural: hard to change later. 
But by isolating that behind a component, behind an interface, when it came time to change it out, it took one pair about a week to swap out BackgrounDRb for Starling, and it didn't affect anybody else on the project. What we managed to do was convert it from an architectural element into a design element, something that's easy to change later versus something hard to change later. This is the benefit of building well-factored, component-based systems: if you can get it right, you actually convert architectural elements into design elements, making them much easier to change later without having to do major infrastructural refactoring exercises. So to summarize, evolutionary architecture and emergent design require really good engineering practices. You really need TDD, you need the ability to do refactoring, you really need things like continuous delivery in a big way. And in fact, right after lunch, Rebecca Parsons, who's the CTO of ThoughtWorks, is doing a fantastic talk on evolutionary architecture and continuous delivery together that is exactly in this space. Very often, trying to predict the future leads to over-engineering, and over-engineering is a bad thing. And so this may sound like a subtle distinction, but what I prefer now is pro- and reactive over predictive. What I mean by that is: try to be proactive for the pieces of your application. Look at things like the asynchronicity and say, okay, let's put that behind a component. That's being proactive about it. Being reactive means that every time it takes on new responsibilities, you reevaluate it. Has it reached its last responsible moment? Is it now time to replace this with something more robust or more specific? Being pro- and reactive works much better than trying to be predictive, because being proactive and then reactive means that you're paying attention to the real things on the surface versus just trying to guess what's gonna happen in the future. 
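The "Ruby equivalent of an interface" that made that swap cheap can be sketched like this. The names and the trivial in-memory backend are hypothetical, just to show the seam; the real project talked to BackgrounDRb and later Starling behind it:

```ruby
# Hypothetical sketch of isolating the message queue behind one seam.
# Application code only ever talks to AsyncJobs; only the adapter
# classes know any vendor API, so swapping backends touches one place.
class AsyncJobs
  def initialize(backend)
    @backend = backend   # duck type: anything responding to #enqueue
  end

  def enqueue(job, payload)
    @backend.enqueue(job, payload)
  end
end

# One adapter per concrete queue. This toy in-memory one stands in
# for a BackgrounDRb or Starling adapter.
class InMemoryQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(job, payload)
    @jobs << [job, payload]
  end
end
```

Swapping the queue then means writing one new adapter and changing one constructor call; nothing else in the application notices, which is exactly the architectural-to-design conversion described above.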
So I'm very close, I'm essentially out of time, and lunch is being served, so I don't wanna keep you here over lunch, but I will be happy to answer some questions if you'd like. Yes? So the question is about processing XML, and you prefer to do it in Ruby, you said? Ruby, okay. So you can do it either way. I mean, it's six of one, half a dozen of the other, as long as you're not doing crazy things like shelling out to it or starting new processes, things like that. It's six of one, half a dozen of the other in that case; I don't see it being hard. Yes? I can just barely hear you, can you speak up? Yes, that is correct. There are two ways to do this. When you start a new project, by doing testing and things like that; with existing projects, by using metrics to harvest design elements out of them, okay. Yeah, but you see, before Java appeared, everybody was enthusiastic and programming languages were proliferating, and nobody was concerned with software architecture. Out of the blue, they realized that without software architecture you couldn't move on, and then you moved to software architectures, okay. I'm not throwing out software architecture. We still have architecture here; we have evolutionary architecture. No, no, I know, but in your presentation, you skipped that level. Rebecca is going to talk about that later; I'm only talking about design here. No, no, I know, but the way you are talking, it's like the way we talk. We need to talk about anything, okay, and then you have a language and you talk about everything. The idea would be that programming would be like this: you have a domain you need to implement, you have code, and things emerge, like when you talk, right? Yes. Okay, great. Okay, yes. As a part of this refactoring with the technical debt, instead of, say, five classes, I refactored to, say, 25 classes. Each class has a single responsibility. 
So one of the concerns which I hear from fellow developers is that it's very difficult to read that code compared to if I have, say, five classes. I have to jump multiple hops to read and understand that code. So how do we address that concern? Well, you've got IDEs now that make the code browser in Smalltalk look anemic. So it's very easy when you have factored code like that. You know, in any IDE, you can hold down control and click on a method name and it'll take you to that method. You can literally browse code. In fact, I think that having big, giant, long methods is much harder to read and understand than having well-factored methods with good names. It's much easier to understand a code base that has very small methods with explicit names than one with methods that are 20 or 30 or 200 lines long. Nobody understands what a 200-line method does. Not even the person who wrote it, not the person who has to deal with it, because there's just too much there. So I think you're much better off; I actually find it much easier to navigate code bases that have well-factored classes with lots of very small methods in them, because it's easy to trace down functionality. Yes? If you're doing test-driven development, then you write the test before you write the code. That's test-driven development. So in that case, aren't we facing the issue of what the implementation behind the test cases will be? Because in my actual code, if I take the help of some other class's method, I might need to mock that in my unit test. But until and unless I implement my actual business code, I don't know what methods I will be using or how many conditions there are. Which is what you use mocking for? Mocking or stubbing. For the things that are not implemented yet, but that you know you need, you use mocking or stubbing. 
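A tiny sketch of what that looks like in practice, with names I've made up for illustration: the collaborator doesn't exist yet, so a hand-rolled stub stands in for it while the test drives the caller's design:

```ruby
# Hypothetical sketch: TDD-ing a class whose collaborator isn't
# written yet, using a hand-rolled stub in place of mocking tools.
class TaxCalculator
  def initialize(rate_service)
    @rate_service = rate_service   # the not-yet-implemented collaborator
  end

  def tax_for(amount, region)
    amount * @rate_service.rate_for(region)
  end
end

# Stub standing in for the unwritten RateService; it only needs to
# honor the interface the test cares about.
class FixedRateStub
  def initialize(rate)
    @rate = rate
  end

  def rate_for(_region)
    @rate
  end
end
```

The stub pins the collaborator's answer, so the unit test verifies TaxCalculator alone; later, the real rate service just has to honor rate_for, whose shape the test already drove out.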
Okay, so if we do this in parallel, I mean, I write some test cases and then I implement the actual code and then I write the test cases again, is that against the TDD principle? Well, if you write code before you write the test, you are by definition not doing test-driven development. You're doing test-after development. Which has value, but it has no design value like test-driven development does. Test-driven development forces you to think about every little method when you implement it, so that you have fewer bugs at that level than you do in test-after code. Testing is valuable whether you do it before or after. It's a much more valuable design aid if you do test-driven development, because you're being forced to verify your assumptions about every method before it goes into the larger methods that are calling it. Yes? Well, let's say our project is very time-bound, and so during that time, how would we prefer reactive to predictive? I'm sorry, can you speak up? It's very hard. Like, we have a time-bound project, right? So how do we prefer reactive to predictive? We don't have time to choose and to be reactive to some tool like in your case study. You earlier used some other messaging queue system and then you found Starling, right? At that time you didn't have time to look around to see whether this was the right queue to use. So how do you find that at the right time? Well, finding the last responsible moment is really tricky. It takes a lot of experience. It takes understanding your domain and all the architecture. So that's why agile architects must be very hands-on and see and touch code all the time, because they have to have a really good feel for what parts of the code base are perfectly good and what parts are maybe reaching toward their last responsible moment. 
But it's always worth the time to do some research on something and not just blindly pick it, because you're probably buying a lot of technical debt, and it's gonna be hard to get rid of that later, particularly if it's some sort of architectural element, because then it's gonna be a huge refactoring effort to change it later versus doing some research up front. So I have had both roles, as a tech manager and as a developer, right? So I see the last responsible moment, and I get it as a developer; it makes sense for when you want to actually go for the decision. But to plan things, to budget things, as in what tools I need for the team, right? I know, you say it's tricky, right? Is there a way around it? Like you said, you need to be a real expert, always a hands-on developer, to make a decision at the last moment, right? But I also want, as a manager, to know what the costs are of the tools which my team is trying to bring in to do the project. So you're talking about the cost of the tools to do this, or the cost of the time to do it? The cost of the tools to start with, and obviously also the time, right? So what space are you in? .NET, Java, Ruby, PHP, Python? It's Python. Python? I don't know. I'm not in the Python space much. I know there's a lot of open source stuff in Python. I know all the metrics tools I talk about exist in Python, but I'm not in that space, so I can't make comments about tooling. In the Java, Ruby, Clojure, Scala, Groovy, .NET world, there are open source tools to do all this stuff, so you don't need to pay a penny for it. Python I would guess is the same, but I'm not in that space, I don't know. For .NET, there's NDepend, the letter N plus depend; it's actually a commercial tool, but it does afferent and efferent coupling at the class and package level. It has some really nice visualizations for things. 
There's another tool that works for .NET that's free called Source Monitor, which gives you some nice visualizations. It gives you cyclomatic complexity and some other stuff. It's freeware, only runs on Windows, but it's a freeware tool. Is that microphone on? Now there's another one, there's a better one. Yeah, there you go. But I have an opinion which I want to validate with you. Whether it's design elements or just ideas, we are actually thinking ahead and only then coding, right? That's the design part, that's fine. You actually, when you started talking into the microphone, you got quieter, so speak up, I can't hear you. Yeah, I think originally it was better. So is it okay now? All right. So what I was saying is that when we go with a test-driven development approach, the idea is that first we capture, whether in the form of design or tests, the requirements for the functionality that needs to be written. But the opinion that I have, and what I've practically seen, is that mostly developers end up writing the happy path really well, but they don't have the destructive mindset to write the design elements or the test cases in such a way that they try to break the system as well. So what I would say there is that you're actually conflating two kinds of testing that we think about: unit testing and functional testing. For us, unit testing is very much purely a developer tool, and it's all about the design and structure of code. Functional testing is about what the code does. And so even if we use the exact same tools to write them, like JUnit, we separate those as separate things, because functional tests are gonna be much more coarse-grained; they're gonna touch a lot of different pieces, multiple classes, talk to databases and stuff like that. So that's how we differentiate those from unit tests. We typically use a tool like Cucumber, which would be a good example of a BDD tool. 
Business analysts and developers sit down together and write the specifications in Cucumber. That then becomes the target for the developer. And so they take the failing Cucumber test, and now they use unit tests. They implement little pieces of that. So they may get 14 unit tests to pass to get one green over in Cucumber. And so that becomes the business specification they're working against, whereas the unit test is just about structure and design. Okay, okay, that's fine. Just one last question, on the term emergent design, right? A lot of the references we have been speaking about so far are from 2006, 2005 and those years, almost five to eight years back. So what is the latest? Is there anything happening in the latest trends, say about two years back or something, towards emergent design? Latest trends around that? That's what I'm trying to determine right now. In fact, one of the things I'm doing at ThoughtWorks right now is actually going and visiting ThoughtWorks projects all over the world and seeing how they've applied evolutionary architecture and emergent design. I'm gonna try to consolidate all that stuff, and that's gonna be my next book, probably next year. All right, then we'll meet you next year for that. I have a doubt. When you're talking about unit tests, we have 100% unit coverage. And when it comes to integration, when it comes to functional end-to-end tests, the coverage tends to be low because of the cost of maintaining all those things. What would be the ideal combination? Because I often enter into this situation; I used to follow TDD, but when it comes to integration stuff... I don't even look at code coverage past the unit tests, because it's irrelevant. There's no way you can get 100% test coverage on a functional test without doing a ridiculous amount of not very high value effort. 
So I care about code coverage on unit tests. I don't even look at it higher than that, at functional or integration tests, because those are more driven by business functionality than by making sure you're going down every path. But actually, what is the combination? Like, see, when we are doing the unit tests, we have something like 100% unit coverage or something like that. But when it comes to integration and when it comes to BDD, it's kind of vague, like how much coverage I need to have. Zero, I don't care. It's irrelevant. It doesn't tell me anything valuable. I mean, it probably would if I took the effort to actually make it work, but having tried this years and years ago, it is nowhere near worth the effort to get 100% coverage on functional tests, because it's just too hard at that coarse-grained a level to get down every path. You end up doing crazy things. So don't even worry about it past unit tests. The only reason you care about code coverage on unit tests is to make sure that every path has been taken in your unit code. If you've got that covered, then coverage on the functional stuff shouldn't matter a lot. And another thing is, when I was doing mocking and all that, people used to put more trust in integration tests than in unit tests. Like, when we say mock, they say, okay, why don't you integrate with the database and write an integration test? What is the thought process on that? Is the integration test something which is very critical? Depends on what you're writing. So typically we like the Mike Cohn testing pyramid, where most of your tests are unit tests, the next most are functional tests, and at the top are user acceptance kinds of tests. An integration test for us would probably be in the flavor of a functional test, because we want to make sure all the pieces fit together. But it really depends on what you're doing. 
And so if you have a code base that has a lot of very, very complex business rules, you're going to have a big, fat set of functional tests to make sure all those work. Whereas if you've got a CRUD application, you'll have virtually no functional tests, because it's CRUD. It's very easy to verify that things got stuck in the database, so we're all done. So I think it's more domain-driven than anything else. Yes? Neil, you talked about pushing things from the architecture layer to the design layer as far as possible, so you can convert things which are traditionally thought of as architecture into something which can be swapped. Yep, if you can. Do you have any guidelines for that, or any books or references that you can point to? It's brand new. I mean, this really ties in, and Rebecca's going to talk on this too; the real inspiration for this comes from the continuous delivery space, which is thinking about component-based architectures. And there seems to be a big trend happening in the world right now where everybody's going to microservices architectures, because it's just easier. It's just easier to deal with. It's easier to integrate. Everybody's going to microservice REST-based architectures, where you build lots of very small REST-based things that each own their own data. You integrate between those things with other small REST-based services. And what you're doing is creating those as components. And if you have nice semantic interfaces for those components, it's relatively easy to swap the implementation out and put another one in place. So you basically turn that into a design element by converting it into a component. So this is literally, I think, on the cutting edge of our thinking about software in the agile space right now in architecture. Continuous delivery kind of started that, and I'm trying to continue the R&D effort. 
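The "semantic interface" idea above can be sketched in a few lines of Java. All the names here are hypothetical: the point is only that consumers depend on the interface, so the implementation behind it (local code today, a REST call to a microservice tomorrow) can be swapped without touching the callers.

```java
// Hypothetical "semantic interface" for a component; callers depend only on this.
interface PriceService {
    int priceInCents(String sku);
}

// Original implementation: imagine it backed by a local database.
class LocalPriceService implements PriceService {
    public int priceInCents(String sku) { return sku.length() * 100; }
}

// Replacement implementation: imagine it delegating to a remote REST service.
// It must honor the same contract as the implementation it replaces.
class RemotePriceService implements PriceService {
    public int priceInCents(String sku) { return sku.length() * 100; }
}

public class ComponentSwap {
    // The consumer never names a concrete class, so the implementation
    // behind the interface can be replaced without changing this code.
    static int quote(PriceService prices, String sku) {
        return prices.priceInCents(sku);
    }

    public static void main(String[] args) {
        System.out.println(quote(new LocalPriceService(), "ABC"));
        System.out.println(quote(new RemotePriceService(), "ABC"));
    }
}
```

That swap is what turns an architectural element into a design element: replacing the component stops being a rewrite and becomes an ordinary refactoring behind a stable interface.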
So like I said at the beginning of the talk, I think this is the most fascinating thing that's happening in the engineering space in agile right now, and I'm really keen to dig more into it to see what else is there. So this question is, will some elements never move to the design space? Absolutely. In fact, in my rampant emergence talk, I basically said, if you've done a really good job of creating a component-based system and the product guy wants to pivot some behavior to something else, if you're lucky, you have an architectural element that you can just replace to give you that new functionality, and everything's good. But you're never going to be that lucky. Your product person, your business analyst, is always going to find something that slices through your architectural layers in awful ways. But there are ways to do that too. You can do feature toggles, you can do branch by abstraction. So there are ways that don't affect the architecture that can still support those kinds of behaviors. My question is related to non-functional requirements. You talked a lot about structural aspects, and you talked about functionality, how to realize functionality and make decisions at the last responsible moment on design. As you know, lots of projects fail because they fail to meet non-functional requirements like scalability, performance, et cetera. So many of the architecture decisions are taken because of NFRs. My question is, how do we handle NFRs in the context of evolutionary architecture and emergent design? So it's a very good question. His question is, how do we take care of things like scalability and non-functional requirements if you're doing evolutionary architecture? Notice I'm not saying no architecture. So very much up front, we will find out from the client or the users or whoever: what is the expected scale? What is the expected response time here? 
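The feature toggles mentioned above can be as simple as an in-memory registry that behavior pivots on, so a product change rides a flag instead of slicing through the architecture. This is a minimal sketch with hypothetical names (`new-checkout`, `checkoutFlow`); real projects usually back the toggle state with configuration rather than a map.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal in-memory feature-toggle registry (hypothetical names throughout).
public class FeatureToggles {
    private static final Map<String, Boolean> toggles = new HashMap<>();

    static void set(String name, boolean on) { toggles.put(name, on); }

    // Unknown toggles default to off, so new behavior stays dark until enabled.
    static boolean isOn(String name) { return toggles.getOrDefault(name, false); }

    // Behavior pivots on the toggle instead of on an architectural change.
    static String checkoutFlow() {
        return isOn("new-checkout") ? "new flow" : "old flow";
    }

    public static void main(String[] args) {
        System.out.println(checkoutFlow()); // old flow
        set("new-checkout", true);
        System.out.println(checkoutFlow()); // new flow
    }
}
```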
And we will actually add fitness functions into the code base pretty much before we've written any code whatsoever, just to do some very coarse-grained checks. So if we have a hard requirement that every page must load in less than a second, we'll put in a test on day one that says, for every page that loads, make sure it loads in less than a second, and fail the test if it doesn't. So we can find out right away, not six months down the road, that something's happened. And what we tend to do in that case, well, I hate the term non-functional requirements, because if they're non-functional, why are they there? I like quality of service better, because that's really what it is. So what we do is we put quality-of-service stubs in early on, and then as we get closer to release, we add more and more resources to that to make sure that we've got good coverage and we understand the architectural elements. And so we pay attention to that all along the way. We don't try to wait until the very end and then try to retrofit it, because you can't. It's architectural. Things like scalability and stuff like that are architectural. You want to put a small piece in now and let it evolve over time. That's the ideal. I'm asking for a specific example. Like, we know the requirement, for example, one second per page load. Then we would be able to, you know, make a design decision or an architectural addition. But often, in most cases, particularly when the requirements evolve, we do not know the requirements. As a business person, you can't ask me to predict the future or to do magic things. You've got to tell me up front, because it's engineering. I can't make magic things happen after the fact, unless you want to give me time. So there are certainly cases where you start something that's very small and it reaches a point where it can't scale anymore. Then that's a rewrite. But that's not the fault of evolutionary architecture. 
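The day-one page-load fitness function described above could look something like this sketch. `loadPage` is a stand-in for whatever renders or fetches a page (a real fitness function would hit the running application); the shape of the test, measure elapsed time and fail the build if it blows the budget, is the point.

```java
// A coarse-grained "fitness function": fail fast if a page takes too long.
public class PageLoadFitness {
    static final long BUDGET_MILLIS = 1000;

    // Stand-in for rendering a page; a real test would exercise the app.
    static void loadPage() {
        // pretend work
    }

    static boolean withinBudget() {
        long start = System.nanoTime();
        loadPage();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return elapsedMillis < BUDGET_MILLIS;
    }

    public static void main(String[] args) {
        if (!withinBudget()) {
            throw new AssertionError("page load exceeded " + BUDGET_MILLIS + "ms");
        }
        System.out.println("fitness function passed");
    }
}
```

Run on every build, this catches a quality-of-service regression the day it happens rather than six months later.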
You need to know up front some of these things. I mean, this is the scientific method. We need to be able to experiment and get results back. And if a business person comes to me and says, I need an application but I don't know what kind of scale it's gonna need, but I know it's gonna be really important, I'm gonna call bullshit on that. I'm gonna say, I'm not gonna write any software for you until you can give me some concrete numbers that I can actually test against. I want the ability to have objective measures for things. So that's what I would do there. I would push back on the, well, let's just build something simple and then we'll make it scale later. That's not a good solution. Unless you wanna say, okay, well, let's reserve a bunch of time to rewrite it once it hits a certain point and we'll retrofit it with something else. Other questions? Yes. You explained idiomatic patterns which we generally apply on projects, which may not be applicable to other projects. So you gave the example of annotations. Annotations are one type of idiomatic pattern. The validation, yeah? Sorry? What was the question about validations? Annotation-based examples, I mean, you gave annotation-based examples. Annotations are one type of idiomatic pattern. I mean, there are other kinds; a pattern is generally a problem and its solution. Yep. So the only two that I specifically call out here are APIs and annotations. There are probably some others, certainly depending on language, but those two are pretty language-agnostic, so that's why I stuck with those two. Okay. Yes. And I didn't touch at all on idiomatic patterns at the domain level, but domain-specific languages are fantastic for that, because they allow you to capture important things about the business in a language very close to the business, and capture those in DSLs, and that becomes a really good way to harvest business-level idiomatic patterns. But I don't have time to talk about that. Yep. 
Again, going back to handling complexity. You were talking about, you know, writing functionality and handling errors, for example exceptions and so on. You always end up seeing in code a lot more exception handling than actual functionality. And of course you can abstract a bit and so on. But have you seen any adoption of formal methods, where you have these preconditions, postconditions and invariants? Sorry, have I seen formal methods for? For handling complexity, like error conditions, even before you enter a method. So for example, you... Preconditions or something like that? Preconditions, postconditions. Some languages support that natively. Scala, for example, has a really nice require-style precondition facility. There's some code in the Ruby world too, called Handshake, that lets you set Eiffel-style pre- and postconditions for methods. So that's certainly a perfectly legitimate thing. The thing you have to watch out for is to make sure that those assumptions are always going to be true, because you might find yourself putting too much rigor in there and then having to go back and take it out later. But there's nothing wrong with that. What is the latest thinking, at least in Java, with respect to checked and unchecked exceptions? Oh, forget checked exceptions. What a horrible, horrible, horrible idea. If you look at every single language that's come after Java, they all swallow checked exceptions and turn them into runtime exceptions. So almost universally, not universally, but almost universally, checked exceptions were a terrible idea. So just convert them all to runtime exceptions and forget about that stupid signature, because it doesn't help anything. It's one of those things that sounded like a great idea, and then in practice it becomes this really horrible, cumbersome thing. And you can look at every language that's come since on the JVM; none of them have checked exceptions. 
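The "convert them all to runtime exceptions" advice above is the same pattern the post-Java JVM languages apply by default: catch the checked exception at the boundary and rethrow it unchecked, so callers aren't forced to carry `throws` clauses through every signature. A small sketch (the file name is a hypothetical path assumed not to exist):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class UncheckedWrapper {
    static String readFirstLine(String path) {
        try {
            return Files.readAllLines(Paths.get(path)).get(0);
        } catch (IOException e) {
            // Rethrow unchecked: callers no longer need `throws IOException`.
            throw new UncheckedIOException(e);
        }
    }

    // Demonstrates that the failure propagates as a runtime exception.
    static boolean wrapsAsRuntime() {
        try {
            readFirstLine("definitely-missing-file.txt"); // hypothetical path
            return false;
        } catch (UncheckedIOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapsAsRuntime());
    }
}
```

Java itself eventually acknowledged the pattern: `UncheckedIOException` was added to the JDK in Java 8 precisely to wrap `IOException` for use in lambdas and streams.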
In fact, mark my words, no language in the future will ever have checked exceptions. It was such a terrible idea that nobody's gonna do it again, I think. Now, Michael Nygard will argue with me about that. He wrote Release It! and he's big in the DevOps space, and he has cases where a checked exception actually solved a problem for him. But I say that if checked exceptions cause 1,000 problems and solve one, then I'd rather get rid of the 1,000 problems and deal with the other one problem in a different way. My question is about driving the quantification or the significance of TDD to the management or the team. Can you suggest how we can convey the advantages or the significance we get to senior management and to developers? So here's the ironic thing. A lot of times people ask me, okay, I understand that testing is a good thing; how much extra time do you budget on your project to do all this unit testing? Is it 10%? Is it 15%? Turns out it's 0%, because it actually allows you to go faster. People who believe that testing slows you down believe that there's a correlation between typing speed and developer productivity, and there's a really loose correlation, if any at all. It really has nothing to do with the number of lines of code you type. A great example of this is a study done by Dr. Laurie Williams of North Carolina State University, where she did an experiment. She took a group of inexperienced developers, college students, and a group of experienced developers, and had them do a two-month project. And the only difference was she got the inexperienced developers to do TDD and the experienced developers just wrote code. So, not surprisingly, the inexperienced developers got out to an early deficit, but by the end of two months both teams had implemented essentially the same amount of functionality, and the less experienced TDD team had 50% fewer bugs than the experienced developers. 
It does not slow you down, it speeds you up, because you know what you don't spend time doing in a TDD code base? Single-stepping through the debugger. You almost never debug stuff in TDD, because you've resolved all those things with tests before you have to debug them. There was a company called Sixth Sense who wrote this tool, which is kind of slightly evil, that actually monitored what users did in their IDE all day. And it was fascinating, because they found that teams that did TDD spent an order of magnitude less time in the debugger than teams that didn't do TDD. So that's the trade-off: spending a lot less time debugging, and actually writing code that we can now regression test, because debugging isn't regressionable but unit tests are. So it definitely makes sense to use TDD for new code that you're writing, right? But we have a lot of legacy code in our system. Code that was written like 10 years ago, it's not very well written, not very testable. So does it make sense to go back and write unit tests for those things? A lot of effort. No, it's not worth the effort. Here's what I would do. The goal of unit tests in a brownfield project is, well, what I would do is start a new policy saying: starting next Monday, our code coverage will always get higher. If it's at zero right now, I'll write a unit test on Monday and that'll be 0.0000001%, which is higher than zero, so that's good. And what you do is, every time you add new behavior, you write a test, and every time you fix a bug, you write a test. Because what you want to do is put tests around the most fragile parts of that existing code base, which is where there are bugs and where you're adding stuff. So you'll never get to 100%, but it'll always get higher from this point on, and that shows you're doing the best thing you can do against that aging code base: getting confidence in the things that are broken and the things you're adding. I think that's the best use of that. 
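The "coverage always gets higher" policy above is sometimes called a coverage ratchet, and its logic fits in a few lines. In this sketch the coverage numbers are fed in by hand; on a real project they would come from a coverage tool such as JaCoCo or Cobertura, and the check would run as part of the build.

```java
// A "coverage ratchet": fail the build if coverage ever drops below the
// best level recorded so far, so the number can only go up over time.
public class CoverageRatchet {
    private double highWaterMark;

    CoverageRatchet(double startingCoverage) {
        this.highWaterMark = startingCoverage;
    }

    // Returns true (and advances the mark) when coverage held or improved;
    // false means the build should fail.
    boolean check(double currentCoverage) {
        if (currentCoverage < highWaterMark) return false;
        highWaterMark = currentCoverage;
        return true;
    }

    public static void main(String[] args) {
        CoverageRatchet ratchet = new CoverageRatchet(0.0);
        System.out.println(ratchet.check(0.0000001)); // first test written: true
        System.out.println(ratchet.check(0.05));      // more tests added: true
        System.out.println(ratchet.check(0.04));      // coverage slipped: false
    }
}
```

The ratchet never demands 100%; it only enforces the brownfield policy that tests accumulate around the code being touched, which is exactly where the risk is.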
Even to add unit tests to existing code, would it make sense to just rewrite some parts of it to make it more testable? Certainly. Probably the big exception to that rule is when you encounter a 200-line method and you need to add some functionality to it. Nobody wants to deal with that. So typically what I would do in that case is write a very coarse-grained functional test that just looks at the overall state transition for that giant method. Now I have a safety net. Now I'll rip that thing apart and put it into smaller methods, and I can still make sure it does what it's supposed to do. Once I get to smaller methods, now I can add new functionality and write some unit tests around those. So I would opportunistically break down big ugly things as you need to start working on them. But the big ugly things that are there, if they're not broken, just leave them there until you have to touch them, and when you have to touch them, then start doing some reactive refactoring. Yes. [Inaudible question.] How can you address that issue? There are some tools out there that work very well to do that, but that's really just gotta be an intelligent developer going through it. And if you've got really good unit tests, it's easy to tell, but it's hard if you don't. I mean, the best thing to do is really optimize your build and go in and remove methods and try to build and see what happens. That's about the best you can do without specific tooling around a particular language or IDE. Hi, Neil. First of all, thanks for the fantastic session. It was really informative. Can you speak up? Yeah, first of all, thanks for the fantastic session. It was really informative. One of the practical problems that we are facing currently is, we are like five to six months into the development. So it's not generally available outside in the market, but we are still in the development phase. So we are five to six months into the development, and now we are planning to adopt TDD. 
So basically the question is, we have a code base where we have followed a test-after methodology. We have unit tests, we have some amount of coverage, 50 to 60%, but it's not done in a TDD way. That's okay. As you said, our cyclomatic complexity is well beyond 10. So that's the stage we are at. Now the practical problem that we are facing is, if we want to adopt TDD, how do we do it for the already existing methods? Well, don't worry about the existing unit tests. So when you do TDD, one of the cool side effects when you're done is that you have a bunch of tests. Well, you've already got those. So you've already kind of done the design for those. So I wouldn't worry about those. You start doing TDD for the new stuff. You may find design elements you want to change later, but I wouldn't go back and revisit all those. I'd just leave those unit tests in place, go forward from here doing TDD, and then retrofit things that you find may be accidentally broken. I don't think there's huge value, especially if you've already got tests; in fact, there's no value in going back and TDDing it just for the sake of TDDing it. The design flaws you'll find are probably so minor that it would be nowhere near worth the effort required to go do that. So basically, even in the next two to three months, say, suppose there are 100 methods now, and at least 50 methods we are going to change... So I would pick a date. Starting next Tuesday, everything is going to be TDD. Done. So that's the major thing, like for these 50 methods, whether we want to go ahead the TDD way or not, or for the new methods that we write, it's the better approach, you know. I'm a big fan of TDD because it gives you two benefits: it gives you a regression test and it tells you something about the code you're writing right now. So that's why I love TDD as a design tool. Thank you. Okay, I want to take one more question, because we're now officially half an hour after the session. 
I appreciate you guys hanging around. My question is, where do I draw the line between what is architecture and what goes into design? So the question to ask is: is it hard to change later? If it's hard to change later, then it's architecture. So whatever you're curious about: your rules engine, that's architecture, it's hard to change later. Your web framework: architecture. Your database: architecture. Language: architecture. How you use those things is usually design. Okay, so you're saying that what's much harder to change probably goes into architecture. Yes, architecture is the things that are harder to change later. They're the bedrock things that you've made assumptions about, because architecture gives you two things. It gives you scaffolding, so that you don't have to write a bunch of stuff, but it also gives you constraints. And so it's important to understand both sides of that, that you're getting constraints and benefits, and to trade those things off. Thank you. Okay, well, I want you guys to get some lunch, and there's only half an hour left in lunch, so thanks very much for coming. I hope you enjoyed it. Thanks.