I spend my days reticulating splines at Automattic, and this really cool website, nomadbased.io, kind of suggests that I'm some kind of digital nomad roaming the world and working.
I want to motivate my talk today with a thought-provoking question. How buggy would your software be if your compiler (insert your favorite word there) could pinpoint every single problem with your code? Not just typing errors, not just syntax errors, not just missing semicolons, but every single problem, no matter how mundane or obscure. And we're actually going to answer that question. It's a secret. It's not a real hidden secret, but it's a secret to writing high-performance software, to writing code of exceptional quality, and to writing maintainable software. What is the secret? Well, it's us. It's the community. It's other people. It's the premise that there's always someone smarter than we are who can come into our code and fix something that we have trouble doing. It's the fact that, even though we're always learning and always growing, we can't learn fast enough, grow quickly enough, and dedicate enough time to keep up with the project that we're working on, like WordPress. It grows too fast. We can't do it alone. So our biggest asset is others.
And so this talk is called Playing Well with Others. Although maybe a more fitting title, and I'm going to read it in case you can't see it, is "how to develop non-trivial computer programs, which potentially will be programmed by several people, which must last for a substantial period of time and be maintained and revised by others, and which must be continually explainable at varying levels of detail to a variety of people with a variety of backgrounds by yet other people." And that's thanks to a man named Gary Knott. That's really the challenge we have. And the secret to doing this is writing code with a mindset of working with other people.
It's not all that different from the way we learn, growing up, how to interact with others. Act consistently. Don't mess with other people's stuff. Treat each other with respect. Stop trying to prove how smart you are, and communicate clearly. These all have parallels in programming. I want to run through some basic, practical ideas on how to do this, but I want to start with a misnomer, the edge case, because it seems so central to a lot of these issues. Maybe we've often heard something like, "This is ready. I'm 90% of the way there. The only things that remain are all these edge cases. We'll take care of them." But especially with large projects like WordPress, when your project runs 25% of the web, what does edge case even mean? An interesting thing happens: edge cases stop existing, because your little edge case is impacting 10,000 people every day.
And again, I want to motivate this with a picture. Maybe someone can take a guess at what this is. Maybe you've seen it before. It's the output from a finite element analysis of the stresses imposed on a flat table with a ball sitting on it. The stress inside the ball is not calculated, which is why it's solid. And there's something really profound here. Everything that we see on the bottom of this picture, every single ripple, every single curl, every single gradient of variation, is caused by one small point, one particular point. I mean, this point isn't even a millimeter wide. It's smaller than that. Every single thing that's interesting about this picture comes from the one edge case. And this is how it is with our software most of the time. We have edge cases. We have boundary conditions. We have a little bug. And the impact of those ripples throughout the software in interesting and unexpected ways. A lot of times, everything that's actually interesting about the software we write is due to these edge cases. And playing well with others means thinking about these things.
As a precondition, we should always be thinking about edge cases and boundary conditions, how they impact the software as a whole, and how they impact other people's code. On the screen is just one list of things that I'm always trying to think about as I write code. What happens if I get an array when I expect a scalar value? What happens if the server is down? Am I just going to start grinding away, trying to re-hit that server, and accidentally run a denial-of-service attack on my own server? What happens if the code I write depends on a PHP function that was introduced in a version that 20% of the web doesn't run? Weird things. Weird things that seem to be okay, but whose impacts and whose bugs ripple kind of nefariously and nebulously throughout the interwebs.
But moving on, I want to start by talking about things that we bring in, things that we spit out, and what goes on in between. Playing well with others means understanding that input is aggressive. When we're writing code, the input that we receive is armed and dangerous. Some of you may know little Bobby Tables; it's something that I originally had in my presentation and took out, but SQL injection should be something we're all familiar with, because it's devastating when it happens. We need to treat things that other people send us with a certain suspicion, and make sure that we don't let the spewing output from someone else mess up everything else throughout the remainder of the program. We need to force people into the right behavior, which sounds kind of rude. It sounds kind of not very playful. But there's a long-term effect: the more we guide people into a properly functioning way of writing code and interacting with us, the more positive that relationship will be and the fewer problems we'll have down the line. And of course we always want to be programming defensively, because we don't want to accidentally introduce a problem because somebody else is doing something wrong.
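As an illustration of treating input as hostile, here is a minimal sketch of a parameterized query. It is written in Python with the standard `sqlite3` module rather than the PHP the talk is about, and the `students` table, `make_demo_db`, and `find_student` are hypothetical names invented for this example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema for this sketch)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO students (name) VALUES (?)", ("Alice",))
    return conn


def find_student(conn: sqlite3.Connection, name: str) -> list:
    """Look up students by exact name, treating `name` as hostile input."""
    # The "?" placeholder hands `name` to SQLite as a value, never as SQL text,
    # so quotes or semicolons inside it cannot change the statement.
    cursor = conn.execute("SELECT id, name FROM students WHERE name = ?", (name,))
    return cursor.fetchall()
```

With this shape, a Bobby Tables style string is just a name that matches no rows; the table is left untouched.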
A lot of times we hear, "Be very liberal with what you accept and conservative with what you produce in code." But I think this leads to lots of problems, especially when we're writing code that other people use. Long term, we ought to be extremely conservative and unforgiving with what we accept. If you want an example that isn't directly about programming, think about the command-line utilities we use every day, and how confusing it can be when there are 40 different ways of doing something. You're looking online, trying to figure out how to run these commands, and there's no clear way. We'll look at that in just a moment. But by forcing or guiding each other into how we intend things to be done, we can actually sustain the community. We can encourage each other to learn and to work well together.
I just want to make a very quick note while we're talking about input and errors and whatnot. I don't want to go into detail, but never underestimate the power of lists when programming. Lists can be used to encapsulate your data and remove an entire class of bugs, the way garbage collection removes an entire class of memory issues. Go find a good functional programming talk on YouTube if you want to learn more.
What I really want to do is give an example of being conservative with our input. Oh man, it doesn't matter that this text is small. What's evident from this function is that there's a lot going on. This is the get_post function in WordPress. My goal here isn't to be controversial or to put anyone down. But everything that you see here is written to try to figure out what the heck was sent to this function. You can send a WP_Post object. You can send a number with a post ID. You can send a string with a number in the string. With PHP, you could actually send a string with a number and then some other random string of characters after the number, and it would work. And you can send a null. And so it's nice.
It's flexible. We can throw anything at this and expect some valuable output. But what happens when this goes wrong? When something happens and there's an error, this becomes extremely difficult to debug. Why is there an error? Did I send the wrong input? Did I send the input in the wrong way? How do I trace back the path of the data? It's very complicated to debug. And beyond that, when we're looking at guides on the internet, tutorials and how-tos, how am I supposed to call this? What is the best way to get a post out of the database?
Being conservative with our input looks like this function here. We have one specific type of input that we allow, and we're very clear about what happens. If we don't get exactly that expected input, we're just going to bail. We check the quality and the quantity of the data. In other words, we check the type that we're receiving and we check the range. This function should never, ever, ever, ever, ever get a negative post ID, because that makes no sense. If you're writing code that other people are going to use, and there are certain parameter values that should never happen, values that should never exist, then blow up. The faster we abort, the faster we blow up on a developer who uses this in a suspicious way, the faster they're going to be redirected to your code. They're going to see what is expected there, and they're going to learn how to use it. Then they're going to be able to trust your code, because they have a guarantee: now that they know how to use it, every time they call it, it's going to work. And if there's a problem, they know exactly where to go. They're going to check the input, which gives them something like a two-second check to see if they're sending the right thing. If they are, then they know that there must be an issue with your code, and they can contact you about that.
When we're spitting things out, we don't want to confuse people.
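The conservative-input function described above is shown on a slide that is not part of this transcript. The following is a minimal Python sketch of the same idea, not the WordPress code: `get_post_by_id` and the `posts` dictionary are hypothetical, and the choice to raise exceptions is an assumption about what "bail" and "blow up" could look like.

```python
def get_post_by_id(posts: dict, post_id: int) -> dict:
    """Return the post stored under `post_id`, accepting exactly one kind of input."""
    # Check the type: only a real integer is accepted. bool is a subclass of
    # int in Python, so it is rejected explicitly; strings such as "12" or
    # "12abc" are never coerced.
    if isinstance(post_id, bool) or not isinstance(post_id, int):
        raise TypeError(f"post_id must be an int, got {type(post_id).__name__}")
    # Check the range: a zero or negative post ID makes no sense, so fail fast.
    if post_id <= 0:
        raise ValueError(f"post_id must be positive, got {post_id}")
    # A missing post raises KeyError instead of returning a different type.
    return posts[post_id]
```

A caller who passes `"12"` gets an immediate `TypeError` that names the bad argument, which is the two-second check on the input that the talk describes.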
Just as we talked about receiving spurious input, we want to be careful that we aren't making the problem worse by sending out spurious output. A lot of these things lend themselves to the idea that static analysis packages can run over our code, meaning there are automated tools that we can use to discover bugs, even complicated bugs, memory leak bugs, things that aren't that obvious. But these tools depend on certain assumptions that they can make about our code. The more we break those assumptions, the less power we have to do high-level analysis and the less power we have to build trust in our code.
When we're writing functions, our functions should always return a legitimate value. If we're writing code that someone else is going to be consuming, we should never give them fear or uncertainty or doubt about whether they can count on us. And again, I don't want to pick on anything here, but these are good examples. This is wp_insert_post. What does it spit out? Anybody know? Post ID, I heard? Well, what happens if there was a failure? It might be zero. But it might not be zero, because if you passed true as the second parameter, it might be a WP_Error. If you pass true as the second parameter, this could be a legitimate number or it might be a WP_Error. But it could conceivably be zero in this case too, which makes me wonder what the heck is going on. It's confusing because the data isn't always legitimate. I would normally want to set a post ID variable equal to the result of this function, thinking that it's going to work. But if it doesn't work, that post ID doesn't represent a post ID at all, and I still need that variable. So when writing something that produces output other people are going to use, we should try as hard as we can to always return a single type, a single structure of data. Who here in this room is familiar with the list operator, the list() function in PHP? Well, if you're not familiar with it, you should check it out.
It's been here since PHP 4, and it lets us return multiple outputs from a function as an array and unpack them into named variables. I'm not sure how clear it is on the display, but I'm using a hypothetical safer wp_insert_post here, and it returns a list of a post ID and a Boolean value for has_error. Now, every time I call this function, I can be guaranteed that the post ID is going to be an integer and that has_error is going to be a Boolean; every time I call it, has_error is going to be true or false. So I can check whether there was an error by checking the error variable, and then I can always count on the post ID. This is part of a bigger discussion around being explicit and being clear about our failure modes, because software fails. We want to make sure that we don't get people to count on our code and then spit out something unexpected when it fails.
This function isn't altogether wrong. It returns null when it goes bad, although you wouldn't know that unless you realize that returning without a value in PHP returns null. In JavaScript, it doesn't return null, it returns undefined. And when it succeeds, it returns this WP_Query. I don't really have a solution in this particular case; I'm just pointing out that this could return a valid query or it could return null. And we have a tendency, we like to do things like this: we like to chain functions. If I'm calling one function as the parameter to another function to chain these things together, then I ask, what happens when the inner function fails? How do we handle the failure mode here? It's not obvious. It's not at all obvious. We're going to have to split it out. We're going to have to use some kind of conditional to make sure we don't pass the bad value on. And so we can't chain it. There are ways around this. There are ways to make these work.
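The "hypothetical safer wp_insert_post" is only described in the talk, so here is a hedged sketch of the same single-shape return in Python, where tuple unpacking plays the role of PHP's `list()`. The names `safer_insert_post` and `describe_insert`, and the in-memory `store` list standing in for a database, are assumptions made for this example.

```python
from typing import List, Tuple


def safer_insert_post(store: List[str], title: object) -> Tuple[int, bool]:
    """Return (post_id, has_error); always an int and a bool, in that order."""
    # Failure is reported through the flag, never by changing the return type.
    if not isinstance(title, str) or not title.strip():
        return 0, True
    store.append(title)
    return len(store), False


def describe_insert(store: List[str], title: object) -> str:
    """Show how a caller can rely on the shape of the result."""
    post_id, has_error = safer_insert_post(store, title)
    if has_error:
        return "insert failed"
    return f"created post {post_id}"
```

Because both values are always present and always of the same type, the caller needs one check on `has_error` and never has to ask whether `post_id` is really a post ID.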
The most important thing I'm trying to communicate is that we ought to be thinking about how other people are going to be impacted by failures in our code.
And then in between the input and the output, we've got to fill in the gaps. This is the point where I talk about comments. I love comments. But actually, a lot of comments in your code can, kind of ironically, be an indicator of bad code, just the same way that a paucity of comments in your code can indicate bad code. It's not that hard. Great comments are there for what's not there. And to put it in perspective, here's a great quote from the guy I mentioned earlier. I'm going to read this out loud again. We want to explain at strategic, algorithmic, and conceptual levels. The main point with comments is that, unlike the programmer at the time of coding, who by the way is consumed and saturated in this problem, in the context of what's going on in the variables and the functions, the reader at a later date can't grasp, can't remember, can't recall the functional meaning of the variables being manipulated without help.
And I'd like to remind us all that we are that different reader six months from now. We're a different person altogether, just as all of us in our seats here today are different people than we were six months ago. We're constantly learning. We're constantly engaging ourselves in new projects. So the thing that seems today like we'll never forget it is going to disappear two weeks from now, when we start on a new piece of code or a new project, saturate ourselves with that, and find ourselves saying about some new thing, "I'm never going to forget it." The comment is there specifically to draw that context back in so that we can get right on track again. So here's a quick list of some very practical comments, or types of comments, that I like to use. They help other people when they're reading your code, or help you when you're reading it in the future.
A comment should describe how something could have been but was specifically chosen not to be. In this example I'm calling preg_match, which I'm using to determine whether a string is a valid regular expression. Now, I could simply use the return value here. It'll return false if the pattern isn't valid. But if that happens, it's going to spit out an error, and I don't want the error output. So someone could come around and, you know, "fix" the code by saying, "This is unnecessary, let me take out all this junk and just return," and they put an at sign in front of the call to mute the error. But actually, if you do that, it could throw off some unit tests. So the comment says: I could have done it this way, guys, but if you come into this code and you think this looks silly, be warned, it's here for a reason.
When I program, I like to think of a little sticky note right in the middle of my monitor: the people that I work with are not idiots. Maybe some of you in this room have experienced somebody coming into your code and "fixing" it. They see something that's obviously wrong, they rearrange it and fix it, and now everything's broken. We've got to trust each other. We work with incredible people, and usually if something seems odd to us, there's a particular reason for it being odd. We should write those reasons out.
In this case I'm commenting something that seems unintuitive. I'm grabbing more data from the database than I need to. Someone might come in and say, "Well, we could improve the performance here by grabbing less data from the database." But there's a note that says, "Oh hey, by the way, there's a process here where we might reject some of these database rows, and it's faster to grab it all, process it in PHP, and throw out the ones we don't need than to go back and make a second, third, fourth, or even fifth call to the database."
So yeah, there might be a small performance disadvantage here, but the comment says that overall this is saving us from some major performance hits.
We should also point out risks. A lot of the time, the code we write is valid and good and great for 80% of the runtime, but there are certain conditions under which this code should not be trusted, or should be run in a different operating mode. In this case I have a hypothetical function that works well, except that it has an order of n to the 42nd, if that means anything to you, under a certain condition. So I say, you know, if you're going to be doing this, use the fuzzy approximation instead of the exact one. Risky code is sometimes necessary, but it should always be pointed out. It's kind of like the road on the mountain that has a sharp turn: there's no way to avoid it, so you need to alert people so that they know to slow down.
Something I like to tell people, or think about, when we have these discussions back and forth about commenting the obvious, is that if you have to pause to think about it, it's not obvious.
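The two kinds of comment just described, the "why not the simpler way" note and the risk warning, can be sketched as follows. This is Python rather than the PHP on the slides, so the details differ from the `preg_match` example: both functions and the specific pitfalls named in the comments are illustrative assumptions, not the speaker's code.

```python
import re
from itertools import permutations


def is_valid_regex(pattern: str) -> bool:
    """Report whether `pattern` compiles as a regular expression."""
    # WHY NOT SIMPLER: it is tempting to shorten this to
    # `return bool(re.compile(pattern))`. Don't. re.compile raises re.error
    # on a bad pattern instead of returning something falsy, and callers of
    # this function rely on always getting a bool back.
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


def shortest_order_exact(stops, distance):
    """Return the ordering of `stops` with the smallest total distance."""
    # RISK: this tries every ordering, so the work grows factorially with
    # len(stops). It is fine for a handful of stops; for longer lists use a
    # cheaper approximation and accept a slightly longer route.
    return min(
        permutations(stops),
        key=lambda order: sum(distance(a, b) for a, b in zip(order, order[1:])),
    )
```

Neither comment restates what the code does; each records what a future reader cannot see from the code alone.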
Sometimes there are flame wars: "You shouldn't comment that, it's obvious," or "You should comment that, it's not obvious." So go to the author of the code and ask them what the snippet does. Just give them that snippet, don't give them everything else, and if they have to pause and say, "Oh yeah, it does that," then it's not obvious. So comments should trivialize these trivia, the things that are mostly obvious but not entirely.
For example, a lot of times we do something like this, where we're chopping up a string. It's a very obvious operation; you can tell what's going on if you read the code, but only up to the point of knowing that it's chopping up a string. It really begs the question, what does the string look like? So if you're operating on some data in a way that leaves that question open, throw in a comment with an example of what that string would look like. In this case I've done it above, and you know instantly when you read this code what's happening.
Similarly, if you work on objects or arrays that come into your function, it might not seem like a big deal when you're programming it. You know what you need and you know what's there. But knowing the structure of that object is really important for someone who's going to come in and read your code. They need to know what the bigger picture is, and they don't know all the details that you do. Besides, it gives them a heads-up when they hit the top of your function: this is what we're going to be dealing with. Otherwise they might get 75 percent of the way through your code, and then all of a sudden you're using a member of an object they didn't know about. It just helps a lot. It is so useful when reviewing code, when reading code, when modifying code, to know from the get-go what kind of input we expect. And sometimes it doesn't have to be that complicated. Just an example works wonders.
This little gem is from the Zen of Python, which many of you are probably familiar with: explicit is better than implicit. Sometimes there's just a real value in
verbosity. Anybody recognize this? Anybody want to say what it is? Back in the back, what is this? Yeah, it's that SSL thing. What you don't see in this slide is that there's a mountain of conditionals above and below this statement. This was a nasty bug that Apple had a couple of years back, one that validated invalid SSL certificates. Without getting too far into motivations, or what could have been, or why something was, it is definitely not obvious when you're scanning through this list of nested conditionals that this is wrong, because the second goto fail line is indented just like the first, and there are a bunch of other nested conditionals with indents, so it looks normal. Because there are no brackets, no curlies, it doesn't jump out. Had there been curlies, it would have been uglier, verifiably uglier, but the indentation on the second line would also have stuck out like a sore thumb. There's a value in verbosity when we're communicating. We don't always want to choose what we consider the prettiest route, because if it doesn't help somebody else understand what's going on, it's going to hurt our project.
Douglas Crockford, who wrote JavaScript: The Good Parts, has a pretty strong stance on things like this. He says that we should avoid features in a language that are often useful but occasionally hazardous. This really touches on the automatic semicolon insertion debate. If there's something about the language that can cause disaster, that can cause hazardous things, that's toxic, that can produce bugs that don't look like bugs, that appear in strange ways and are hard to find, then this feature is a bug. We will benefit, our project will benefit, and our peers will benefit if we simply choose not to use it, out of respect. We say, "I know that I could do this. I know that I could prove I know how to do this if I did it. But I'm not going to do it, because that's going to help someone else understand what's going on."
Bjarne Stroustrup, the author
of C++, has a thing to say about nested structures as well. He says we should view them with great suspicion: deeply nested conditionals, deeply nested loops. When we write code that starts branching, code that gets very complicated, where the control flow can't easily be thought through, it misleads us. It gets complicated and it gets hard to reason about. We should view that with great suspicion and try to write code that has a simpler flow. Probably the best summary of this is a quote from a gentleman named Bartosz Milewski. We need structure not because it's better to look at, and not because it runs better on the computer; the computer can run spaghetti code just fine. We need to program with structure and elegance because our finite human brains aren't good at understanding lots of complexity.
Real quickly, I want to hit on when you just gotta. Sometimes you just gotta, right? You have a very performance-critical point in your application, or a very security-critical point, and you just have to write something that's ugly and a mess and convoluted and complicated. But we don't have to throw out our ability to work with others. We just have to hold closer to those principles: communicate what we're doing, test what we're doing, and explain it. This is one of my favorite PHP functions. It's symbolic of all those things that we can't touch. It's symbolic of that function written by somebody who left the company ten years ago, which nobody has been able to touch because they're afraid it's going to break everything, even though no one understands it. This prints out "hello world."
If you have to do it, if you've just got to do it, then there's almost always an easier way to do it that doesn't meet your performance goal or doesn't meet your security goal. There's almost always a simpler way to write the code. So write it in a comment, and if you have a development environment, run both of them. This is called dual algorithms, from an idea from
one of my programming heroes, Steve Maguire. Run it both ways and compare the output, and if the output is different, you fail. You see, these secret or mysterious or complicated algorithms are risky because they don't get the same kind of review. It's not obvious what's going on. We can afford to check them against non-performant or simpler algorithms to make sure that risk doesn't topple over.
To summarize everything about playing well with others in writing code: don't be cute. This is not about proving who you are. The best programmers prove their intelligence by writing extremely simple code that everybody will look at and say, "Oh, that's obvious, I could do that." Maybe they can't, but it looks like they can, and that's the key. Not to be overly complicated, but to be consistent. To handle input in the same way every time, so that people can rely on what they're sending to you, and to spit out the same kind of thing every time, so people can trust what you're producing. And ultimately the biggest thing here is to communicate, communicate, communicate what we're doing in as clear a manner as we can.
I would like to open up to questions. I think we only have one microphone in this room, so get your running shoes on.
Curious to get your thoughts on using exceptions for flow of control, as opposed to the typical WordPress "return whatever you feel like returning from the function."
In a project like WordPress, we have varying levels of reach to the end user. I don't know if I would ever argue for using an exception for control flow, versus using an exception to blow up. But there's a certain level of risk assessment we should perform when choosing how quickly to blow up, and I'm a huge advocate of blowing up early. I think the earlier we can make the whole system crash, the more quickly other developers will notice that something's up and fix it before it becomes a mysterious problem later on. You can't really do that when
your code is hitting the end user, but there are a lot of places where we can. So we can actually have varying levels of exception, varying levels of error, and varying repercussions of those errors. We just have to be subjective. We have to evaluate the cost of the risk, what would happen if it went wrong, and use that to guide us.
Hi. On returning a single type of value: in WordPress we deal with a legacy code base that has a lot of things that do not comply with that. What do you suggest is the better way to fix that without breaking backward compatibility?
We can always move forward. In my hypothetical safer wp_insert_post, I was able to do that without breaking anything, because everything that relies on wp_insert_post can continue to use wp_insert_post. But we can write new functions that are nothing more than wrappers around existing functions, which hide that complexity and perform the complicated things. There's a lot of contention about using Boolean input parameters, and we can hide a lot of that behind new functions, so that the function name makes explicit what results we expect to come out. It's a very complicated question in a big project, because we don't want to multiply the names in the function namespace, but it's definitely doable, especially with some of the more heavily used and more important functions.
All right, I think we're at time, and State of the Word is coming up upstairs in the auditorium, next to where lunch is, at five. Thank you very much.
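The wrapper approach from the last answer could be sketched like this. It is a Python illustration under stated assumptions: `legacy_insert_post` is a hypothetical stand-in that mimics a mixed return type controlled by a Boolean flag, and `LegacyError` and `insert_post_or_raise` are names invented here, not WordPress functions.

```python
class LegacyError:
    """Hypothetical error object returned (not raised) by the legacy function."""

    def __init__(self, message: str) -> None:
        self.message = message


def legacy_insert_post(post: dict, return_error: bool = False):
    """Stand-in for an old API: returns an ID, 0, or an error object."""
    if not post.get("title"):
        return LegacyError("empty title") if return_error else 0
    return 42  # pretend this is the ID of the newly stored post


def insert_post_or_raise(post: dict) -> int:
    """New, explicitly named wrapper: returns an int ID or raises ValueError."""
    # The Boolean flag is hidden here, so callers never have to choose it.
    result = legacy_insert_post(post, return_error=True)
    if isinstance(result, LegacyError):
        raise ValueError(result.message)
    if result == 0:
        raise ValueError("insert failed")
    return result
```

Existing callers of the legacy function are untouched, while new code gets a name that says what comes out: an ID on success and an exception otherwise.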