 Thank you everyone for coming. This session is This Code Stinks. We have all seen this code before. My name is Larry Garfield. You may know me as Krell, online in most venues. Among other things, I'm a member of the Drupal Association Board of Directors. For Drupal 8, I'm working on the web services and context core initiative, so we're rewriting about half of Drupal, so small project. I was the co-author for Drupal 7 module development along with five other people, so I only did a little bit there. And around the office at Palantir, I am well known as a Nerf gunslinger, so don't get cross, don't get cross with me, I'll shoot you. Code smells. That's what we're talking about here. What is a code smell? A code smell is not a bug. A code smell is an indication of something in code that might be a bug. There are ways to help you find if you're doing something wrong in your code before you actually run into a problem. Determining what is and is not a code smell is often a subjective judgment and will often vary by language, developer, and development methodology. That is, you know, guideline for, you know, there might be a bug at some point. These are not rules, they're more like guidelines. So what we're going to do here is go through a couple of common problems, a couple of common warning signs that you can look for both in your own code and in code that you're reviewing from someone else. Who here has written code before at some point in some language? That's what I thought. Who's reviewed a patch on Drupal.org? Almost as many people. Who's going to have reviewed a patch on Drupal.org by the end of the week? There should not be anyone with their hands not up at the moment. So guidelines to look for both when writing code and when reviewing someone else's code. Whether it's a patch or, you know, do I want to use this module? Do I trust this developer? We're going to look at seven smells in particular. These are not necessarily, you know, there's not an exhaustive list of kind of red flags. Nor is it even the normal academic treatments. This is more of a Drupal centric approach to the question of bad code. So our first code smell, first sign that you're doing something wrong is the word and. By and I mean code that does multiple things. This is out of Drupal 7. The Drupal form submit function retrieves, populates, and processes a form. Okay. Populating and processing are actually separate functions. Retrieving is not. So you can't populate a form without retrieving it. You can't populate a form on its own, but you can't retrieve it without populating it and processing it. What if you just want the form? Then you actually can't. Because this function does too much. If a function does something and something, that's a problem. Because you don't have it down to one atomic action that you can mix and match. That's kind of obvious. The documentation will tell you that there's a bug. But this is the solar PHP client. This is the PHP library that pretty much everyone uses to talk to solar server. Both the Apache solar module and the search API module in Drupal 7 use this client. And there's a class in there that is the connection object to solar. And it has a method on it called add documents into which you pass an array of document objects. And this says, okay, add these to the index. Yeah, add these to solar. Great. Seems like a single action. Except in order to do so, you first have to render those objects, those solar objects into XML. Okay, no problem. We've got this handy utility method that does that. What's the problem here? Well, there's two problems. One, document to XML fragment is protected. So if you want to just render a document to XML, you can't. If you want to just render, you know, the entire payload of the add header and all the other stuff that goes in there that I left out to make it fit on the slide, for debugging purposes, you can't. It is physically impossible with this code to render a series of documents to XML without also sending them to solar. And of course, if there's a problem with them, solar will give you back a completely useless error message because it works that way. And this makes debugging impossible. I don't want to think about how many hours on this problem, not that long ago. And it's not always obvious. The correct solution here is to break this up. Document to XML fragment should be public. That's one atomic action. We should have a render documents method. And then all of this function should do is call render documents, get back that string, and then pass that on. So we can break that process up into pieces, each of which does only one thing. In the OO world, the fancy name for this is God objects, which are an object that does more than one thing, that is responsible for connecting to a database and formatting queries, that is responsible for connecting to solar and rendering solar objects, that is responsible for saving nodes and being a node, the actual node itself. That's bad because you want to be able to separate these actions and use them independently from each other. In particular, does anyone, everyone's familiar with the term composition? I hope. So that's, instead of having one huge object or one huge function, you break up the task into a series of smaller objects or functions, and then have a wrapper that leverages them. And to the outside world, it looks like a single piece, but it's actually not. That gives you a lot more flexibility by avoiding that word and. Number two, the word or, notice a pattern emerging? If you have a function that will sometimes do one thing or sometimes do another, and that may vary depending on what's passed into it, may vary depending on time of day, no. There are way worse examples than that. In fact, all I'm going to show here is the doc block for registry check code. This is the core of the registry in Drupal 7 that tracks where classes live on the file system so we can lazy load them, which is a very good feature. It's an absolutely critical feature to have. We're going to do more of that in Drupal 8, but this is how it works right now. So that first parameter is type, which is documented very nicely to say this is the type of resource we're looking for, which is either class or interface, or a constant that messes around with the cache. Wait, what? And the second parameter is the name of the resource we're looking up, like the name of the class, but sometimes null because it gets ignored if the first one is not a string. And we return, well, sometimes a constant and sometimes null and sometimes false, and yuck, I don't know. There's like three different operations shoved together into this one function. And so you can't actually predict what's coming out of it unless you know the implementation internals. And that's disgusting. That's a sign that you're definitely not factoring your code out properly. If you find yourself in that situation where a function does A or B, split them up into function A and function B, it's not that hard. But what if you actually need to share data between these two functions? This happens a lot in Drupal. Like, you know, we have this cache of classes we've used on a given page. You want to store so we can autoload them later. The answer there? OO, objects, shared data between different operations that are bundled together. Very simple. Here's what the registry should look like. Just very simple straight port. Make it an object. Give it a couple of separate methods. Each method is exactly one thing. Then we just set that up. Here's our autoloader. It's for the shutdown function. PHP supports this. It's very simple. And this way you know exactly what each of those methods is going to do. Each of those methods becomes simpler because all it's doing is taking a simple to find input and getting us and returning a simple to find output. It's much easier to document, easier to debug, easier to understand. And you handle shared data by just having, you know, internal properties to the object. Code smell number three. The word if. This is an actual code from a former boss of mine from a number of years ago. Overly complex code leads to overly complex bugs. Now where have we seen that before? So who's here has worked on comment module in core? I see exactly one hand go up and your brave, brave soul. So comment node view. I went through and took out everything in this function except the conditionals. This is the structure of this function. This is just Drupal seven core. What does this function actually do? It's one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13. Oh, now we're, I know we're in trouble. We've got 13 if statements here. And some of those have elses on them. I can't even follow the logic in this code, really. The fancy academic term for this is cyclomatic complexity, which is a measure of the number of independent paths to the source code. Fancy way of saying how much branching happens within this code. How many different ways can we get through here to reach a conclusion? Well, it's not 13 actually because you could have return if on one of them and something else another and some of these nested inside others. So it's actually probably somewhere around 30 possible code paths. I didn't actually count to some fancy algorithm for actually determining it. And God help you if you have a for each loop in here someplace with an if statements inside that. And of course, all of this depends on view mode, which is a string and node, which is an unstructured object with God knows what properties on it. So good luck. Correct solution here. Break this thing up. This is way way too complicated for any sane human being. You'll sometimes hear guidelines, especially in academia of your function should be small enough to fit on one screen. That's actually a stupid guideline. But what it's getting at is long and complicated routines should be broken up into smaller simpler routines. And that helps eliminate to these conditionals. This is actually an excerpt from the Linux kernel documentation for the coding standards there. The Linux kernel uses eight spaces or eight tab stops for its indentation. Drupal uses to the kernel uses eight. Now this causes a problem if your screen is not wide. And therefore, you know, when you get down to the code you're actually looking at, you've got about four characters on the edge of the screen to actually work with before you start scrolling. And Linus Torvald's answer to this in true Linus Torvald's fashion is, dude, what the hell are you doing with more than three levels of indentation in the first place? Okay, his style may need work, but he's absolutely right. If your code is that complicated and has that many layers of conditionals in it, it's too complicated for a mere mortal to understand. And sorry, you're all mere mortals. So am I. Closely related to this is runtime type identification. Fancy name for, you know, essentially hard coding your options for something. Let's say we want to, you know, consistently get the label of an entity, which on nodes is stored in the title property in comments and the subject property and so forth. This is fortunately not what Drupal Core does. But this, this would be a naive way of doing it. Because we have now hard coded our the existence of node user and comments into this function. And we cannot support more entities without hacking core. And we all know what happens when you do that. What core actually does is somewhat better. If somewhat uglier, this is the actual entity label function in Drupal seven core. Instead, we say, okay, grab the entity information, which is a from hook entity info, it's a big array, and grab the label callback function out of it and call that function. All right, we get the kind of flexibility that way, even get a fallback if one isn't specified. But it's kind of ugly and hard to read and actually rather slow because of all these, this array handling. Better approach, make it a method. Now, when you want to get the label for an entity, it's an object, it has a method, you call it. There is no conditional involved. That if statement, or that switch statement from back here, has been consumed into the language structure itself. It's now a fundamental part of the PHP language, polymorphism, which is really just a fancy way of saying, I objects can change what they are and you don't care. So at that point, we don't even need a utility function, we just call a method over done. Simpler, faster, less likely to have bugs. Number four, and I'm actually moving through this very fast, Drupal web test case. We here has written a unit test for Drupal. I'll bet you money you haven't. Unit testing, very good thing. Absolutely critical for good software. Unit testing is a method by which individual units of source code are tested. And a unit is the smallest testable part of an application. That is, you take a piece of code, break it down to the smallest possible self-contained piece, and make sure that piece works in a vacuum. You eliminate the variables around it, eliminate proper scientific method, eliminate variables and controls just as one little piece and make sure it works. Then when you put that piece together with other pieces, it's more likely to still work. This is an outline of Drupal web test case. Before every test it runs, we generate a complete database instance, we generate a complete language environment, we mess around with the global shutdown functions that registered with PHP, we create a files directory, we change some PHP any settings, we delete a whole bunch of global objects, set a bunch of other global objects, install Drupal, which easily takes a minute or two on its own, populate the registry table, install some additional modules that weren't part of the installer, I'm not entirely sure why. Rebuild all data structures after enabling the module. That comment is actually right out of the function itself, and I'm not entirely sure why we're doing that, to be honest. But then we go and run cron, then we simulate a login, which means we're actually doing a post back to ourselves within the web server, so we're making a web request back to ourselves and spinning off separate processes, and then we muck around with variable set, which is global. There's a reason why unit tests take so long to run, because we're treating Drupal itself as our unit. This is not a unit test. No, this is not a unit test by any stretch of the imagination. This is called system testing, which is conducted on the complete system. This is very good. This is important. System level testing is critical to make sure that your system as a whole works properly. But it tells you absolutely jack about whether or not individual pieces of code are doing what they're supposed to. What it's telling you is that Drupal does what it's supposed to, not that this function or the subsystem or this object are doing anything right at all. Just that's taken together, the bugs cancel each other out. That's just no good. Instead, little known class Drupal unit test case. This creates no fresh Drupal install, just sets up an empty database connection, and gives it off to you. And it's therefore about 1,000 times faster, I conservatively estimate. Actually, I'm making that number up, to be honest. It takes longer to post back to Drupal the web request to start the test than it does with tests that actually run. It's really that much faster because it's doing so much less, because it's not setting up the complete Drupal environment. Code you're testing with Drupal unit test case, you're testing just your code. This is good. This is how you know that you're doing something right, not that Drupal is doing something right. And in Drupal 7, the Drupal 7 branch is running. We have 10 times as many web test cases as we do unit test cases. We don't have unit tests in Drupal. We have system tests. We do not have unit tests. And that's the problem. Because if you cannot unit test it, your code is wrong. I don't know how, but I can't demonstrate that it's right, and neither can you. And if you can't show it's right, then it's wrong until proven otherwise. Judge mental? Yes. But that's how you get good code. If you see code that cannot be tested with Drupal unit test case, with a pure unit test, that's a smell. That indicates that this code has bits and pieces and dependencies dangling all over the place that you can't keep track of and could change based on God knows what without you knowing about it. How do you make a test code that is more unit testable? Step one, avoid globals. Really? Is anyone here still using globals in their own code? That's willing to admit it? One person. Okay, two people. Yeah. Please stop. If you're using a global, your code is not unit testable. Period. If you're using statics, your code is probably not unit testable. It might be. It depends on the static. But there's a good chance that it's not. By statics, I mean static caching variables or the Drupal static function. Drupal static is the bastard love child of globals and statics and has roughly that level of compatibility with unit testing. Dependency injection, which for those who have not heard me rant about it before, is a given piece of code should not call out to the system to get information needs. It should be given the information it needs. In the case of an object, that means you pass in other objects to it and then it can manipulate that data. In the case of a function, it means you pass parameters into the function. Very good example there. Who remembers the menu system in Drupal 5? Didn't it suck? You had to call arg1 all the time to check and see which node you were dealing with, which meant that you were hard coded to that path. Drupal 6 that shifted, so now you pass in a node object. That's dependency injection. It's very simple. And part and parcel of that is avoiding singletons. Singletons are a fancy global. They're something you cannot modify. They're a hard dependency. Every function call is a hard dependency because you have hard code in the name of a function into your code. Which also means deep function stacks. So if you have function A that calls function B, that calls function C, that calls function D, that calls function E, that calls function F, your function A depends on every single one of those and cannot be tested without them. This is a problem. Instead, do what we did before. Break it up into a series of separate functions and have a wrapper that does nothing but call each of them separately and wire them together. Or make it a OO and you can do essentially the same thing. It's just a more flexible way of doing it. Number five. Documentation. Why is documentation a code smell? Because you can't teach what you don't actually know. And if you can't teach it, you don't actually know it. And if you can't document it, you don't actually know what you're doing. And if you don't know what you're doing, I certainly don't know what you're doing. Drupal is actually really good in this regard. Drupal core is really good. I had to work hard to find a good example here. But this is the file transfer FTP class in Drupal 7. Tell me. What does this jail string do? I don't know. And settings is an array. I know that. But array of what? I haven't a clue. I have no idea how to use this function or how to use this method. And so I presumed that the person who wrote it has no idea either. If you can't document it, you don't actually know what you're doing. The return statement's great. Tell me exactly what I'm getting back and the conditions I'm going to get it back under. But these parameters? No. This is a bad code smell because it means you either don't know what you're doing or can't figure out what you're doing. And either way, it's too complicated to document, which means it's too complicated period. Too complicated to not have bugs. Date module. It's the only part of contrib I'm going to pick on today. What is object? Options are what? Name of what? In context. I don't even know what that is. The word context is at least four meanings in Drupal. And I'm sure this is a different one than all of them. There is no useful documentation here. Therefore, I have no clue what's going on. Therefore, the maintainer of this code has no idea what's going on. Because two weeks later, that maintainer is the same as me. They haven't read it in two weeks. God only knows what's going on here. Lack of documentation is a sign of a probable bug. Because if it wasn't a bug, it'd be simple enough for you to document. Lack of proper comments is a sign of laziness. Or that you don't understand what's going on. Or you're indifferent to it. Or that the code is so embarrassingly buggy, you're afraid to document it. Pro tip. If you're embarrassed about the code, that's a reason to document it to help deflect responsibility. Comment down here is he usually just documents his embarrassment. Yeah. Some of the most useful comments I have ever read or written are, here's what we're about to do. It's stupid. Here's why it's stupid and why we have to do it anyway. If you find yourself in that situation where you're doing something that doesn't make sense to you, that is an excellent place to document it. To show A, why it's being done so someone else doesn't waste hours and hours on it. And B, to show that other person reading that code that it's not your fault. What should you document? Every single function. Even if it's intended to be a private function, every single function should have a complete doc block. Every single method, even if it's private, has a complete doc block. Every class. Every property of an object should have a documentation block on it. Every single constant. Every parameter to every function and every method. I don't have globals listed on here. Does that mean we shouldn't be documenting our globals? No, it means we should be eliminating them entirely. We already went over that one. So yeah. Document, document, document. This is a code smell. This is a sign that your code is bad if you can't document it. And there'll be questions at the end I think. Number six, inappropriate intimacy. This is a code smell where one piece of code knows too much about another piece of code. Nobody knows too much. There's too much access to implementation details. This is a form of tight coupling. There's lots of different ways to code be coupled. So again, fancy academic term that refers to how tightly bound two pieces of code are. How interdependent they are. How likely one is to break another. And the highest form of coupling is when one piece of code depends on implementation details of another piece. Ideally, they're separated so much that there's just this intermediary that's handling traffic back and forth between them. That's not always possible. Ideally, you want to deal mostly in data coupling where you're passing parameters in so you need to know the API. That's it. Or control coupling. That's where one object is just telling another one what to do all the time. That's okay. But dealing with implementation details or God forbid shared globals means that you can't change one piece of code without changing another. It means when you ask what is the knock on effect of changing this implementation detail, the answer is I have no idea it's probably a 400 line 400k patch. I hate writing 400k patches. I hate reviewing 400k patches. But that's what you have to do when you have code that is too tightly coupled. If you want to optimize your code somehow to make it faster, to make it simpler, how much are you going to break the API? The problem is if you expose implementation details, those details become part of the API. And then you cannot change them without breaking your API every single version. Do we know any projects that have that problem? Are there any examples in Drupal? This form API, which is, you know, a complete, exposed data structure from the get go. So you can't change any of the properties even in the slightest without breaking every form. There's it's step child render API. Hook page altar is based entirely on having access to internal implementation details and modifying those directly. Which means you can't even change your configuration without breaking your API. Anyone tried to work with a field API? The way language is stored on a field? Yeah, if we try to clean it up for Drupal 8 to make it simpler, we're going to break that API too. I apologize in advance. One of our older ones. Yeah, hook node load. Nodes are just a bare object, which means if you interest in your own custom code, add a property to a node object and hook node load. And then later on try to change it because it's, you know, faster or easier to load or whatever. You've just broken your API. Probably the biggest code smell we have in Drupal is systems are just too interdependent because we expose implementation details. We don't actually have true APIs between different parts of the system. This is an example, just more concrete example. This is what the select builder part of the database layer in Drupal 7 would look like with exposed APIs or the exposed data structures. Let's see. First, define your field lists, then your table array. And wait, is it interjoined or is it joined? Wait, did that change? And when we list stuff in the where clause, we're using the alias of a table or we're using the table name. I'm not actually sure. And yeah, actually this interjoin, if we wanted to change it from just, from interjoin to just inner and work, move the word join just as part of the implementation, you can't do that without breaking the API on every single query. Fortunately, this is not the API in Drupal 7. This is, not only is it easier to read because it's better abstracted, but the implementation details are hidden. So we did in fact change whether it was inner join or inner that was stored in that data structure at some point, post code freeze in Drupal 7. I don't actually remember which direction we changed it nor does it matter because that implementation detail is hidden. Modules do not have an intimate knowledge of this query object, nor should they and therefore we're able to optimize the code without breaking stuff. Make sure you have provided the same kind of flexibility to people using your code. Best way to do that, interfaces, interfaces, interfaces. That doesn't have to be object oriented. If you're doing stuff with object orientation, build an interface and design to that on both sides. Even if you're doing procedural code though, you have a function, it takes parameters, internally it's a black box, it returns a variable. Period. And what happens inside that function is irrelevant to the caller. If it is relevant to the caller, then the caller is now intimately involved with your code and you cannot separate the two. You cannot unit test the two separately, you cannot change the API or the details of one of them without changing the API. Another easy trap to fall into, especially in Drupal. Number seven, impurity. Fancy name. A pure function, this is again I'm getting to a little bit of academia here, is a function or routine where if you give it a given piece of input, you will always get the same output and there will be no side effects. Side effects include I.O. That's writing to disk, printing to the screen and so forth. The great thing about pure functions is that they are very well behaved. You can unit test them with absolute guarantee that they're going to work next time. They work in the unit test, they'll work in the real world. You know if you call them multiple times, nothing's going to break. You may not get what you want, but they're not going to break. You're not going to break anything else in the system either. Side effects usually mean changing of state somewhere else. You can generally spot a function that is not pure by side effects. There's an object that gets passed into it to change. Is there a global that gets changed? Have we written something to disk? Have we printed something to the screen? Have we set a variable elsewhere out in another utility function someplace? Can we call this function 15 times without causing weird side behavior? If you have any of those, then this function is impure. Now not all functions have to be pure. This is why I don't write in a purely functional language because those are really hard to do state in. And let's face it, we're web developers, everything we do is about taking, is about IO, sucking data out of a database, writing it to a database. That's all IO. In that case, the goal is the side effect. The goal is that, you know, we've written something to disk. The goal is we've set some certain property on an object. We've altered a form structure or whatever. That's okay. Separate that functionality. Separate the side effect-driven functionality from your pure functionality. Leverage one from another. That's fine. But you want to be able to reuse that functionality in places you didn't expect, whether it's a function, an object, whatever. You want that flexibility. And if you don't have it, that's a sign that you probably should get it. Good example here. Drupal theme initialize in core. Let's see, we've got three globals, so we're already losing. One of them is the global user object, which is one of the dumbest ideas we have ever come up with in Drupal. And I don't know what the menu system has to do with it. But we're saying, okay, if we've already been called before this return theme, this does not actually mean there's no side effects necessarily. Because we can't guarantee outside of this function if it's been called. And then we are also setting, you're playing around with some statics. We already talked about how those were a problem. And then we're queuing stuff up into JavaScript. This function is extremely in-cure. Is that okay? I'm going to venture a guess and say no, because there's no actual IO happening here that we care about. What we're doing is initializing a system. Why is that system held in a global? Better approach. Okay, make it an object. You have a theme object that you statically cache in the theme function. It itself does not set anything globally. The theme class here, you know, the constructor contains most of that self-initialization code. The theme function, the theme method then, doesn't have to play around with the globals. It's tracking the theme key internally. So it's encapsulated. All the code is right there in one place. And that object itself does nothing to play around with global JavaScript. We're separating that part out into a separate call. So that theme object is now pure. The theme function, the getJS, the theme method, the getJS method, those are all now pure routines. And we can call them 100,000 times without breaking something else somewhere else in our system. And we know that because we know we don't have access to do so. Okay, the theme function is still calling Drupal edgeJS. And we still have to figure out what to do with this global user. But one patch at a time, please. So this way we've separated out our pure pieces and our impure pieces. And we can now unit test that theme class on its own. All of these tie together. All of these signs of codes that sucks. So let me actually pause for a moment here. Who here recognizes their own code in one of these slides? Yeah, I'm not surprised. All of us ran into this problem at some point. But there are, you know, conversely there are ways to recognize you're doing something right. And I've heard these good smells. Good smells are kind of the opposite of these bad smells. If you look at a piece of code, is it single purpose? Does it do exactly one thing and do it well? No side effects? No weird, you know, non repeatable behavior. Does it do one thing and do it well? Is all the logic for it contained in one place? Or is it scattered around 15 functions and three globals? If your code, if everything for this functionality is in one place physically and logically, you can fit it in your head at one time. That's a good sign. If you can just wrap your head around this one piece in isolation, that's a good sign. If you're just looking at the code, you can tell what it's going to do because it's not depending on random other pieces of information elsewhere around the system. It's not depending on, you know, what's the state of this global, which might have been altered by this altar hook over here and all that kind of stuff. If you can just look at it and say, aha, I get this passed in, I will get this out. That's a good sign. If you can do it 17 times with the rest of the system changing around you and still get the same result, that's a good sign. If it is well documented so that you can point out where that's true and more importantly where it's not. And if you have lots of if statements and you have to explain why, if you're doing one of these things that normally look wrong but it's actually the right thing in your situation, document that fact. That's a good smell. When I'm reading code, if I see, you know, a block of code that looks horribly ugly, but there's a comment block on it that says, you know, here's what we're doing and here's why, even if it's not blaming someone else, that's a good sign because it means whoever wrote that code thought it through. And they're also telling me what is going on and where those other side effects are going to be. That's a good sign. For some more, shameless self-promotion back in triple con Chicago, I gave a presentation on the converse of this session, which is good API design. You can check that out. Video is online. There's a couple of other good articles I recommend reading on code smells. These slides will be online, so you don't need to write it down. The first two here are talking about other code smells in a more academic fashion a bit, kind of their academic names. The third one here from Joel on software, it's an interesting proposal to use coding standards to indicate code smells. The actual example he uses is, you know, you always notice sanitize your variables from a user, right? He actually proposes using variable naming conventions as a way to create a code smell for unsafe data. I don't know if I agree with him, but it's an interesting thought. Is there somewhere in Drupal that we can make it easier to recognize when we're doing something stupid? Maybe. It's worth asking. And I encourage everyone to subscribe to the daily WTF. It's a great research project. Some people think it's a comedy site. It's a great research project of all the things not to do by example. So far, I've never seen any of my code end up on here. I've never seen any Drupal code end up on here. I hope never to do so. And I encourage you to make sure we never do so. But it's a great way to see things that you shouldn't do. And end of the main session. We do have a fair bit of time for questions. So if you have questions, just flag this guy down. I think we're done right over here first. A few days ago, you tweeted about another code smell about strong arm, using strong arm with the module to code smell. Can you speak about it? I think there's a bit of an echo on that microphone. A couple of days ago, I actually tweeted about the strong arm module to the Drupal module that lets you tie variables from variable into the features module. And I tweeted that it's a great way for finding code smells. Not that there's anything wrong with strong arm. I actually have not looked at this code in detail, so I'm not going to say anything about it one way or another. But it lets you look at what modules are doing with the variable table. And say, oh, look, this module is writing in five variables for every node type in the system. This module is writing in a variable for every node type. Why? If you see a module that is abusing the variable system, that's kind of a Drupal specific code smell. If it's using the variable system for it really should be a dedicated table, that's a code smell. And strong arm, because it gives you a GUI where you're poking through all of the variables that exist in the system so that you can flag them to get really should be somewhere else. So that's more what I was going for there. It wasn't a slide against strong arm. It's a slide against modules that are abusing the variable system. In your list of things that should be documented, I didn't see the file all by itself listed. And some files do declare some documentation about the entire file being a kidding of the file. Is it something that you have in your list? Again, there's a bit of echo. I'm not sure if the recording is going to catch the microphone properly. Should we also document files? Drupal's documentation would say yes. Personally, I sometimes don't document, put a doc block on a file if the file itself only contains one thing. So if a file just contains one class, then the class documentation should be sufficient. If it's a file that contains like here's all of the utility functions related to this subject area, then probably yes. You do want a doc block for the file to indicate why a function is in this file and not another file because it's part of this category. In Drupal, using the group tag to indicate this function is part of this category. But that's somewhat project specific. But in general, most of these recommendations are not Drupal specific as far as documentation. File, to some extent it depends which developer you ask. I will say this. You will not get yelled at for having a file doc block. You might get yelled at for not having one. I've never seen anyone yell that for actually having one. Other questions? I'm still going to echo from the microphone. Shall I speak more slowly? What approach would you take instead of having a global user variable? What would I do instead of a global user variable? If you'll forgive me a moment to make a small sales pitch, for the web services and context core initiative, one of the things we want to do is introduce a universal context system into Drupal, which in some sense is a global, but it's a global with very specific controlled properties and it's through that that you access information like currently active user. And the advantage there is because your only access to it is through that controlled environment, that interface, you can't modify a global object for something else. You can't change the active use of code. You can, you know, change it for your scope and children of your scope. But once your scope ends, any changes you've made kind of go away. So that's what I'm trying to replace the global user object with. For more information on that comes my core conversation session tomorrow. I forget the exact time, but all of the Drupal initiative owners are doing a status update. I'm giving a code sprint on Friday. So if you want to know more about how I plan to kill global user, come to those. I think we've got a couple over here. At any plans for removing the query builder and making people use stored procedures? Are there any plans in removing the query builder and using stored procedures instead? I'm sorry, the microphone is giving a lot of echo. At any plans in using stored procedures? Are there any plans for using stored procedures instead of the query builder? I don't have any. In large part because stored procedures in databases are extremely not portable, so we have to write those separately for every database we support. However, there was a core conversation this morning by Demi Antonio, who's one of the database co-maintainers, arguing that we should move to an automated storage approach with automated integrated materialized views for query and that might use stored procedures internally. That's still in very early discussion. I don't know what if anything's going to come of that. But I'd say from a code-smell perspective that's not really a smell one way or another. Stored procedures are a database-level tool. They've got their good uses and have any plans on implementing some code and standards so that we can remove smells from code? Are there any plans on implementing code and standards to remove smells from codes? Coding standards to remove smells from core. I have not proposed any in particular. Actually, I considered the use of a static methods on classes to be a code and not everyone agrees with me on that one, but I do encourage people to deal with objects not with static objects or static classes. I believe the documentation standards for Drupal do encourage that. The documentation coding standards are basically exactly what I just said. Document everything in detail to the best level you can. I'm certainly open to having coding standards that deal with some of these smells. Again, they're not necessarily hard rules. There are cases where you do need 17 if statements in something. I'm definitely open to documenting good guidelines for good code. If you're interested in working on that, let me know. We can try and get that into the documentation. Before, there was a talk, a core conversation talk about Selenium testing. As you said, it's system testing from what I understood. What do we have or do we plan something to unit test our JavaScript because I know there is like some JavaScript framework like Jasmine or something like this. Unit testing JavaScript. I believe jQuery has something that it uses but I've never worked with it. It's an area I don't have much experience with, so I can't really say. On the PHP side, almost the entire PHP world has moved over to using PHP unit as their testing framework, except us, of course. Which focuses on unit testing and code coverage tools. On the JavaScript side, I just don't know what the tools are. One of the bad code smells in Drupal, I think, is that we don't define which functions are public and which functions are private. I sometimes see whenever a web check is reviewing a patch for stable 777 she'll be freaking out because function definitions are changing and we have no system of which functions we can use as API functions we can rely upon and which functions might change in the future. I think we need to have a much stronger definition of which functions are public. I would agree and actually a fair part of that I think is moving over to OO and having real protected methods in part because as long as you have these kind of pseudo private functions just a normal function you're still going to treat that as an API even if you shouldn't and that goes to the developer too they're just, you know, writing along a normal function and that becomes an excuse to not write a better API and think through what do I really need to use here and thinking through that interface so I agree interface-driven development does not imply object-oriented developments, it just helps. If remarked functions as public or private in the doxygen then we could actually have a task that would point out are we not allowed to call this function don't do it so the task would fail. Having a unit test that tests that you don't call a certain function you cannot actually do in PHP unless you're using Runkit Runkit for those not familiar with it is a PHP plugin that lets you change the nature of the language out from under you. It's a very, very dangerous program, a module I don't use it myself but that's the only way to do something like you know make sure this function is not called and not just if that's something we can unit test for but that would be something that we could document as a convention of if the dock block for this function has at private on it or something like that then it's allowed to change between dupal versions and if you try to call it directly it's your own fault. But at the same time we have to also culturally say if there's something that I cannot do without calling that private function then that is a critical bug in the API and should be fixed even if that means adding functions within a stable line like in 7879 that would have to come as a match set I think. I think we've got time for like two or three more questions. So do you think that we can use the Coder module at the moment to detect code smell and what can we do to improve the Coder module to do it better? Coder module is essentially a giant pile of regexes so if we can write a regex that can detect some of these smells then probably we can use that as a smell test no pun intended. Probably it would be something like you know check for you know check dock blocks if a dock block doesn't have a description under a parameter flag that if a dock block doesn't have its parameter list doesn't match up to the function that if you probably could write one that checks a function or a method and says if there are more than five if statements flag that that may or may not be useful most of these require humans to actually analyze if it's real but certainly you could have a check for bad signs tool there are actually such tools in other languages I believe there's one for Java someone said one over here what's that per critic pearl critic apparently pearl has one called critic I don't know of any offhand for PHP but I have no problem with us writing one what's that PHP coach niffer you're right there is one let's use it other questions hang on he's coming with a microphone so we can record it sorry Larry I notice on your website that you specify that you're a Zen certified engineer I wonder if that would be a recommendation you'd make to anyone to help with our continued professional development make our code better and that sort of thing I heard the first part I'm a Zen certified engineer your certified Zen engineer I wonder if that would be in any way useful for our continued professional development and improvement for Drupal code generally Zen certification I don't think they're sponsoring this conference so I'm okay with saying this it tests your knowledge of the breadth of PHP not your knowledge as an actual good developer it tests how well you can memorize the PHP manual I'm not sure I'd recommend that anyone get Zen certified I would recommend reading through the Zen certifications study guide because that is a good introduction to parts of the PHP language that you probably have not seen before but that Drupal has started to use more of but as someone who is Zen certified I don't place a great deal of value in it on a statement there I think that's about it thank you all for coming