Thank you everyone for coming. This session is Aphorisms of API Design. If that's not what you're looking for, you're in the wrong room. My name is Larry Garfield. You may know me as Crell online. If you want to make fun of me on Twitter during the session, that's where you do so. I'm a senior architect at Palantir.net; we're a web consulting firm in Chicago focusing on Drupal. For Drupal 8, I'm the Web Services initiative lead. I'm the Drupal representative to the PHP Framework Interoperability Group. I'm an advisor to the Drupal Association. And a general-purpose busybody who flies around and tells people what's wrong with their code. That's kind of what I do.

Let's start off by defining our terms a bit. An API is an application programming interface. That is, it is code intended for other code to use, or for people who write code. But this is not something end users see. We're talking about the interface from code to code, from coder to coder. And an aphorism is a concise statement containing a subjective truth or observation, cleverly and pithily written. Cleverly and pithily written, that sounds like me. So what we're talking about here are not hard-and-fast rules. They're more like guidelines. So what we're going to be doing is covering some clever and pithy sayings about good code. Everyone's down for that? Sound like a good plan? All right.

The first one we need to know: learn the rules like a pro so you can break them like an artist. This is not specific to programming, but it applies here. For everything I'm about to say, I guarantee you will find an exception to it. I can find an exception to it. But these are still things you want to understand, so that when you don't do it, you have a good reason. When you don't follow these guidelines, you understand why you would want to skip over them.

First off, everyone know Isaac Asimov? Most people, I hope.
In one of his books, The Gods Themselves, which I do recommend, he posits that if there are multiple universes, there are infinite universes. Basic idea of the book: humans are contacted by aliens from a parallel dimension, and plot happens, and one of the characters observes, you know, if there are two universes, why would there only be two? There are probably a whole lot more parallel universes. Because really, two is the least likely number in the universe. Why would there only ever be two of something? Things could be unique. There could be a whole lot of something. Why would there only be two?

And that means that in software, if there are multiple possible implementations of a given problem, if there are multiple possible ways to deal with a certain situation, then you should assume that there are an infinite number of possible implementations, an infinite number of options. So what does that mean? At a very applied level, who's ever written an API in which you control options with constants, with numbers? A couple of people? Yeah, don't do that.

So let's say we've got this code here, load articles by status, whatever. And so we pass in one of these status constants, published or draft, and then, you know, look those up from some mechanism, some view or database query or whatever, and then load them. Great, and that's fine, that's wonderful, and it's right until you try to extend it or let someone else extend it, and it goes boom. Because you didn't say anything about three, but guaranteed someone's going to want another status. So how do you fix that? Make that a string. There are a lot more strings, and they're less likely to bump into each other. This is exactly what Drupal does when it's doing it right, and in some places there are contrib modules that have done it the wrong way and have regretted it. I think Voting API is a good example.
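The difference between numbered constants and strings can be sketched roughly like this. It is an illustration in Python, not the code from the slide, and every name in it (the constants, the module prefixes, `load_articles_by_status`) is made up for the example.

```python
from __future__ import annotations

# Hypothetical illustration; none of these names come from Drupal or Voting API.

# Integer constants: two independent extensions each pick "the next free number".
STATUS_DRAFT = 1
STATUS_PUBLISHED = 2
MODULE_A_STATUS_ARCHIVED = 3      # added by one extension
MODULE_B_STATUS_NEEDS_REVIEW = 3  # added by another: a silent collision

# String identifiers: prefixed by their owner, so extensions do not collide.
ARTICLES = [
    {"id": 1, "status": "published"},
    {"id": 2, "status": "draft"},
    {"id": 3, "status": "module_a.archived"},
    {"id": 4, "status": "module_b.needs_review"},
]


def load_articles_by_status(status: str) -> list[dict]:
    """Return every article whose status matches the given string."""
    if not isinstance(status, str):
        raise TypeError(f"status must be a string, got {type(status).__name__}")
    return [article for article in ARTICLES if article["status"] == status]
```

With integers, the two extensions above cannot tell their statuses apart. With prefixed strings, a third or a thirtieth status can be added without coordinating with anyone.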
Early versions of Voting API used constants for different modes, and a whole bunch of Voting API extending modules came along and tried to add their own constants, and things exploded.

But wait, Larry, what about booleans? Aren't booleans two-state? Isn't that a case of something of which there's only ever two? No, because booleans are not a two-state value. A boolean does not mean either-or; it does not mean A or B. A boolean is true or false. A boolean is true or the absence of true. A boolean is a single value. It's a value that is or is not. It's a very subtle distinction, but it's something to keep in mind. If you know that something could possibly be true or not, that's a boolean. If you could be using option A or option B, and A and B are the only ones you can imagine ever existing, you're wrong. There's going to be a C, and someone's going to come up with it, and you're going to be stuck if you try using a boolean for that. Don't use booleans for modes.

Good example here. Access is a boolean. Does user five have access to edit node 12? Yes, no. It is the presence of access or the absence of access; that is not option A or option B. Access control, which system decides whether the user has that access, is n possibilities. In Drupal 7, you only had one access control callback, but you could exchange it. In Drupal 8, you can have many access control callbacks. And you have to account for potentially an infinite number of them.

Corollary to this, which I like to call Garfield's Law because I'm not egotistical at all: one is a special case of many. Some of you may have heard me say this before. What does that mean? Let's take a look at node_load_multiple(). You've all used node_load_multiple() at some point. So in Drupal 6, this is how node_load() worked, vastly simplified. You pass in a node ID. We look up the base record out of the node table, and then we do extra stuff with it, call hooks and so forth to make extra database queries, and return that object.
And that's great and that's wonderful. Right up until you actually want to do the same thing to seven different nodes at once, and then, oh, good. The only way to do it would be to query the node table, load all the nodes' base records, and then call all of those hooks on each node individually. And this is what you did in Drupal 6 with your own custom code. You looped over some node IDs and did node_load() on each one of them, and ended up generating about 500 database queries. This is known as the SELECT N+1 problem. It is also known as "your database administrator hates you." Please don't make your database administrator hate you.

This was fixed in Drupal 7. And in Drupal 8, any entity operation is innately a multiple operation. node_load() is simply a thin wrapper around loading multiple nodes. hook_node_load() always assumes you're dealing with multiple nodes, because if you have an array of nodes, the iteration you do is exactly the same whether you have one, 50, or 100. It's the exact same code, which means you can treat one as a special case of many. That greatly simplifies your code. If you have to deal with, you know, I may have one of these, or I may have to do a multi-operation, that one case is always a subset of the multiple operation. Merge them.

Which brings us to our first guideline. N is the only number. Don't assume you know how many of something there are going to be. There will be many of them. Many is a non-defined number.

Number two. A quote from Rasmus Lerdorf, creator of PHP: fail fast, fail cheap, be lazy. Everyone here is lazy, right? If you're not, then you're not doing a good job as a programmer. Good programmers make the code debug for them, because really, why are you going to waste time debugging when you can make the machine do it for you? The whole point of having a computer is to make the computer do work for you so you don't have to. Don't plan for every eventuality. Don't plan for everything that could possibly happen.
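Going back to "one is a special case of many" for a moment, the idea can be sketched like this. It is a Python analogy with an in-memory stand-in for the node table, not Drupal's implementation; `NODE_TABLE`, `QUERY_LOG`, `load_multiple` and `load` are all hypothetical names.

```python
from __future__ import annotations

# Hypothetical in-memory stand-in for a node table; not Drupal's API.
NODE_TABLE = {1: "About", 2: "Contact", 3: "Blog"}
QUERY_LOG: list[str] = []  # records each simulated database round trip


def load_multiple(node_ids: list[int]) -> dict[int, dict]:
    """Load any number of nodes with a single (simulated) query."""
    QUERY_LOG.append(f"SELECT ... WHERE nid IN ({len(node_ids)} ids)")
    return {
        nid: {"nid": nid, "title": NODE_TABLE[nid]}
        for nid in node_ids
        if nid in NODE_TABLE
    }


def load(node_id: int) -> dict | None:
    """Loading one node is just the many case with a list of one."""
    return load_multiple([node_id]).get(node_id)
```

Loading three nodes through `load_multiple` costs one simulated query. Calling `load` in a loop would cost one per node, which is the N+1 pattern described above. There is only one code path to maintain either way.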
Figure out how it's going to break in a way that's going to help you.

Good example of that: Drupal 6. This is an excerpt from the theme function. And, you know, go through the system and check, all right, we're calling this theme key. Does the theme key exist? If not, all right, just return null. Except that null in PHP gets cast to an empty string when you try to use it as a string. If you have a typo in your theme function somewhere, in one of the four places you need to specify it, you don't know. In fact, the system will not only not tell you, it will actively try to keep you from finding out that you have a typo. This is not good.

Who has ever run into the problem where, you know, you're trying to use a formatter in Drupal 6? And first, who remembers developing in Drupal 6? Okay. You've all had the experience of developing a formatter and having that crazy long name for the theme function you need for it. And you're trying to write it and it just doesn't show up and you have no idea why. This is why. I have lost tens of hours of my life to this problem that I will never get back. Whoever decided this was a good idea, I blame you.

We fixed this in Drupal 7. Some of the logic changed, but now, if there's no hook found, we log it. Which means if you are getting a weird problem where you're trying to theme something and it doesn't show up, you can check the log and it'll tell you, oh, by the way, this theme hook's not found. And you can look at it and go, oh, duh, I have a typo. And you can fix this in about 5 to 10 seconds instead of 5 to 10 hours. Yes, I've been there.

Who's old enough to remember DOS? Uh-huh. This was a pretty useless error message, wasn't it? Who's run into the Drupal equivalent? Does anyone know what this even means? A couple of people? What this means is that somewhere in a form array or a render array, you have a string where you're supposed to have an array. Where? I have no idea. You have no idea.
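One way a recursive structure like that could fail more usefully is to carry the path along while walking it, so the error names the exact spot. This is a sketch under assumptions, written in Python rather than PHP; `validate_children` and `RenderArrayError` are hypothetical names, and the `#` prefix only loosely imitates the render-array convention for properties.

```python
from __future__ import annotations


class RenderArrayError(ValueError):
    """Raised when a nested element has the wrong type."""


def validate_children(element: dict, path: str = "root") -> None:
    """Recursively check that every child of a nested array is itself a dict.

    Keys starting with '#' are treated as properties and skipped.
    """
    for key, child in element.items():
        if str(key).startswith("#"):
            continue
        child_path = f"{path}/{key}"
        if not isinstance(child, dict):
            # Report where the problem is, not just that there is one.
            raise RenderArrayError(
                f"Expected an array at {child_path}, "
                f"got {type(child).__name__}: {child!r}"
            )
        validate_children(child, child_path)
```

Given a form where `actions/submit` is accidentally the string `'Save'`, the exception message contains `root/actions/submit`, which is the piece of information the anonymous-array error never gives you.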
And because that system is recursive, try debugging to find it. You're still not going to find it. You're pretty much left with guess and check. This is a very, very bad thing. This is why big anonymous arrays are not a good data structure: they are impossible to debug. Because all this error message tells you is, by the way, somewhere there's a bug in an array. It's Drupal. There's lots of arrays. This doesn't help me.

This might be an American joke. Code failures are like voting in Chicago. Do people actually get that joke here? Fail early, fail often. Some people get it now? Yeah. Vote early, vote often. It's the tradition in Chicago.

The way to do that is you constrain your inputs and fail usefully. What do we mean by that? Constrain inputs: good APIs are picky. This is an excerpt from the database system in Drupal 7 and Drupal 8. If you call this fields() method and you pass in two arrays, you're fine. If you pass in something that is not an array, this code does not know how to handle that. So we specify it in the method signature. If you pass something that is not an array to this method, PHP will fatal on you and tell you exactly where you called this method from with something that's not an array, go fix it, you moron. PHP is slightly more polite than that, but that's the idea. But you can know exactly, oh, I have a bug right here that I can go fix. Instead of 15 function calls later, something not working because you passed something that is not an array when it expects an array. Be picky; that will help you.

Another example, preExecute(). In this case, we can't detect it just through the language, but SQL insert queries let you specify that this field should have this value. They also let you specify that this field should have whatever default value is defined in the database. Completely legit things to do. If you specify the same field in both of those lists, your query will break, because SQL has no idea what to do.
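Both kinds of pickiness, rejecting the wrong type at the call site and rejecting a field that shows up in both lists, could be sketched roughly as follows. This is a Python analogy and not the Drupal database layer; the class, method and exception names are hypothetical.

```python
from __future__ import annotations


class FieldsOverlapError(Exception):
    """Raised when a field is given both an explicit value and a default."""


class InsertQuery:
    """Hypothetical insert-query builder that is picky about its inputs."""

    def __init__(self, table: str) -> None:
        self.table = table
        self.insert_fields: dict[str, object] = {}
        self.default_fields: list[str] = []

    def fields(self, values: dict) -> "InsertQuery":
        # Fail at the call site, not fifteen calls later.
        if not isinstance(values, dict):
            raise TypeError(f"fields() expects a dict, got {type(values).__name__}")
        self.insert_fields.update(values)
        return self

    def use_defaults(self, names: list) -> "InsertQuery":
        if not isinstance(names, list):
            raise TypeError(f"use_defaults() expects a list, got {type(names).__name__}")
        self.default_fields.extend(names)
        return self

    def pre_execute(self) -> None:
        # The language cannot catch this one, so check it before building SQL.
        overlap = sorted(set(self.insert_fields) & set(self.default_fields))
        if overlap:
            raise FieldsOverlapError(
                "These fields were given both a value and a default: "
                + ", ".join(overlap)
            )
```

Either failure points at the caller's mistake by name, before anything reaches the database.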
If you just let this go all the way through, you will get a completely useless error message from MySQL giving you a parse error, and MySQL error messages are not the most useful. Instead, we check for that and throw an exception. And that exception tells you exactly what you did wrong: you specified this field in both places. Don't do that. And you now know exactly what it is you did wrong and can go fix it. Save yourself time by telling yourself what you did wrong.

A good programmer is someone who always looks both ways before crossing a one-way street, because some idiot is going to drive the wrong way down that street. That idiot is usually user data.

It also means: if you are not developing under E_STRICT and E_ALL... who knows what these mean, E_STRICT and E_ALL? Okay, you're all using this, right? You will be by the end of the day. If you're not doing this, you are doing it wrong, because you are giving up the ability for PHP to find bugs for you. Make PHP find bugs for you so you don't have to. That's why the capability is there. Make PHP run in picky mode. You want it that way.

Another corollary here. Who's heard the phrase "we don't babysit broken code"? We don't use it as much anymore in Drupal, for good reason. "We don't babysit broken code" is code for "we don't care about developers." Do not take that approach. Always make sure your code fails in a useful fashion as soon as possible. Fail fast, fail cheap, fail usefully.

Number three. You can't teach what you don't know. And if you can't teach it, you don't know it well enough. Okay, it makes sense. It also means you don't understand what you can't document. If you can't explain what's going on, you don't actually know what's going on. And neither do I. If you don't document something, I have no idea what you're doing. This is somewhere that Drupal actually does really well. When putting together this presentation originally, it took me a while to find some examples where Drupal didn't do this right.
But I found some. This is the FileTransferFTP class in Drupal 7. And it starts off pretty good. So we've got a proper docblock on this method. It's named well, so we know it's a factory method of some kind. And we know it takes two parameters. One is a string, one is an array. It returns FileTransferFTP, so that's the name of a class. And we know we're going to get back that class or a subclass of it based on the available options. Cool. So I know what I'm going to get back now. And what I need to pass in is a jail string. What does jail mean? I have no idea what jail means. Is that a path for a chroot? Is it just an arbitrary string? I have no idea. To this day, I still don't actually know. And settings. Okay. Well, we know it's an array. I'm going to assume it's an associative array with configuration keys of some kind. What are the possible options? Anyone actually know? I don't. This tells me nothing. Don't be this. Do better than this.

Contrib is much more of a mixed bag. Here's my favorite example: Date module. This is a bit old, so they might have fixed this by now. date_entity_metadata_field_getter(). All right. It's a getter callback to return date values as datestamps. Okay, then. And its parameters are object, options, name, object type, and context. Oh, my God. Well, object. So does that mean a standard class object? Does that mean an entity? Does that mean... I mean, there's lots of kinds of objects. I don't even know what this is. Could that be an array that we use as an object? Because Drupal's done that a lot. Object type. Wait, why do we care about object type? Wouldn't that be intrinsic in the object? That's what a class is. Oh, wait. Maybe it's not a classed object, but I don't know. Array of options. Well, we're right back to settings again. No idea what that is. Name. What's the name of the object? I have no idea. Context.
So I'm not going to trust your code. It could mean you don't actually know what's going on. You don't know what you're actually writing. And so I don't want to use your code. Or it could mean you know what's going on and you're just embarrassed about it. But I don't know that. I'm going to assume one of the others. If you're embarrassed about it, document the fact that you know it's terrible so that someone else doesn't blame you for it being terrible. In the Linux kernel, there is a block of code somewhere which has a docblock on it, or a comment above it, that says, "I don't know why this works, but it does." This is an incredibly useful comment, because I know here be dragons. I know not to touch this unless I have cleared a week to figure out what's going on. I don't even remember which block of code it is, but it's some weird, probably bitwise logic or something.

What should you document in your code? Every single function, every single method of every single class, every object property in your entire class, every constant defined in the class or globally, every parameter to every function or every method, every return value from every function and every method, no exceptions. Document those too, actually. If you find a contrib module that doesn't do this, file a bug, mark it major. And yes, Jennifer Hodgdon, the Drupal documentation lead, has told me it's OK to say that.

Even better, usage documentation. You're all familiar with the phrase, a picture is worth 1,000 words. A code sample is worth 1,000 comments. How do I use a piece of code? How am I expected to use something? This is the documentation page for Gearman. Gearman is a queuing server that has very good PHP support. On their front page for the project, they have a code sample of exactly how you should use it.
They have English text describing what's going on and what you're supposed to do. They have a diagram showing how it all fits together. Oh my god, this is amazing. I have never used Gearman in a project. I want to, because their documentation is just that good. Can you say that about your code? If not, you've got work to do. Documentation is part of your API. Do not forget that part.

If you'll excuse me a moment for a side rant on documentation. You may have heard this line before: clearly written code with well-named methods is self-documenting. Great, you don't actually need to write comments. Docblocks, fine, but you don't need to do comments inline. No. It's a wonderful thing to shoot for, but no.

Here's an example. This is actual code I wrote for a non-Drupal project a few years ago. It's called normalize character set. OK. So I assume that's what it's going to do. And it's going to pass in a value, which I'm assuming is the string we're going to normalize. And I switch based on the encoding. And in the case of ASCII, I do nothing. In the case of UTF-8, I do... what in the hell am I doing? Converting to HTML and back. And false. Why does false mean Windows? Anyone know what's going on here? Anyone know why I'm doing any of this nonsense? I mean, huh? This does not make any sense to me at all, and I wrote it.

Here's the code with the actual comments that are in it. mb_detect_encoding() in most situations only supports two character sets. And so we're going to guess that if it's not a found character set, then it's probably Windows-1252, which is the garbage character set used by Microsoft Word and nothing else in the universe, and therefore is still polluting computer systems the world over. This is why Microsoft Word is evil. This code is here to compensate for Microsoft Word being evil. Now I know that, because it's been commented. I'll put up the rest of this.
"I have absolutely no idea why UTF-8 strings need to be converted from UTF-8 to UTF-8. But if this code is removed, everything breaks." To this day, I do not know why I had to put this code in there, but I did. And I left a comment so that someone coming by to maintain this project after me a year later doesn't waste their time trying to optimize away this line of code and then wonder why it breaks. And so why are we doing it this way? Well, htmlentities(), in my experience, is the most forgiving tool PHP has for guessing at the character set of a string. PHP has like four different mechanisms to do that. And this is the one that, in my experience, is most reliable. Why? I have absolutely no idea. But it works, so I document that fact and move on. And then Windows, all right. A false return means that it couldn't figure it out, so we're gonna guess it's Windows. And if it's correct, it'll work. If it's not correct, it's gonna break. But now we know where to look if something breaks. It's probably a character set other than these that we're dealing with.

Document your code, comment your code. Don't explain what, explain why. These comments are explaining why we're doing this ridiculously silly thing, so that you don't think I'm a fool for doing this ridiculously silly thing. That's good comments. End rant.

So aphorism number three: docs or it didn't happen. This is enforced in the core issue queue. You should enforce it in your issue queues too.

Number four: a UI is not an API. Seems obvious, right? A user interface is not an application programming interface. These are different things. For the very simple reason that a user is not a program. Most of the time. Unless you're Kevin Flynn. A UI is a client of your API. It is not itself the API, it is just a client of it. It is a user of your API. Why would we have a website without a UI? What are you talking about, Larry? Who said anything about a website? I didn't, did you? I didn't hear you say it. No, we're just talking about APIs.
And really, websites are just a small fraction of the software out there. Maybe you need to do a command line tool. You want something that'll work in Drush. Or some other command line tool like Clex. Or you want to test it. And you want to test it correctly, using PHPUnit instead of SimpleTest. Tests or it didn't happen. Or maybe you're doing a REST API. Who was just here for the REST session in the last slot? Your code needs to work through the REST API. Your code needs to be able to work without there being a form involved anywhere. Just REST calls going back and forth. Because someone's going to try and do that with your code. And maybe, maybe you'll have some forms in a web page. This is an edge case. Web forms are an edge case as far as APIs are concerned. If you treat this as your only output, you don't have an API. You have a pile of code. A pile of code does not qualify as an API. A website is not an API. It uses an API. It is just one use case of an API. The API can exist on its own without a website, without HTTP at all in most cases. And if it can't, there's a problem.

There's a common saying in software development: you're not done until you have three implementations. If you're designing an API, you need at least three different systems that are able to use it differently to prove that you got it right. This is what we did for Drupal 7 and for Drupal 8 with the database layer. Why do we have three databases in core? To make sure we can support multiple databases correctly. Before we had all three in there, it didn't entirely work. We had to refactor it for each one of those three. Once we had that, people started adding support for Oracle or Microsoft SQL Server in contrib. I didn't even know about it. And that's good. That means we did it right.

Your API, you want three implementations. What three implementations do you want? Well, your first one's a unit test. You always want that implementation. You want a web services call.
You want a command line interface for it and a website. Pick three. If you're designing an API, pick three of these and make sure it works. Otherwise, you don't actually know you did it right. And this also means your API can work without any one of these in particular. It means you're decoupled from your website and your command line tool. That means you did the right thing. A UI is not an API. Don't pretend it is. Which, incidentally, means if you're doing CRUD operations through hook_form_alter(), stop. It's going to break.

I love this one. You know that saying about standing on the shoulders of giants? Drupal is standing on a huge pile of midgets. Drupal 7 had nearly 1,000 contributors. Drupal 8 is around 1,600 and climbing. There are lots of people who have done stuff before you. Don't reinvent the wheel. There's plenty of them out there already. Don't add to API bloat. There's too much code in the world as is. Who's a fan of XKCD? You've seen this one before, haven't you? Don't do this. Don't create standard number 15.

When possible, leverage existing patterns. Mimicry is the highest form of flattery. It means you're complimenting someone, to say, your work is good enough that I'm just going to use it. It's easier to remember, because you already know this other thing, and you can do it the same way. It means less work for you. Your brain is not that big. You can hold more things that way. And it takes less work. I would rather reuse some other design, some other code, and go home at 5 o'clock than pull an all-nighter four nights a week. Does anyone disagree? You'd rather pull an all-nighter four nights a week? I didn't think so.

Follow existing patterns. Learn design patterns, especially for Drupal 8 with all of this new OO code. Go look into existing design pattern resources. There's a link to some good sources for that. Follow your platform's patterns. If you're using Drupal: entities, hooks, plugins, and so forth. Use these tools that are already there.
It's easier for you to develop. It's easier for people using your API to develop. You want to make something swappable in Drupal 8? Either make it a service or use plugins. Both of these, everyone who knows Drupal 8 is going to be familiar with. You follow those patterns, your API becomes really, really easy to learn. Do not make your API hard to learn. If you're doing something with Symfony, it uses YAML configuration files. It uses events. It has its bundles mechanism for extensions. Whatever system you're using, if you're using Zend, Cake, whatever, it's going to have these patterns already. Follow them, go with the flow. It'll make your life easier and everyone else's life easier. And documentation. The best API is the one you didn't have to write.

Number six. There's a very good book, 97 Things Every Software Architect Should Know. I highly recommend it. One of the lines in there: use uncertainty as a driver. What does that mean? It means don't make decisions if you don't have to. Put off making decisions if you can. You want to make changing your mind cheap. You don't know enough at the beginning of a project to really decide what you want to do. And certainly, your client doesn't. Your client is going to change his mind, probably right before launch, multiple times. Who's actually had that experience? This is how you make your life easy when that happens.

You can only change things that have been encapsulated, that have been carved off into their own swappable piece. For example, logging. The system wants to have logging, fine, whatever. And we're going to log to the database because, well, it's there. And that's great and wonderful until your client comes along and says, no, no. Our sysadmins want everything logged to syslog instead, so they can have all of the logs in the whole system together. That happened to me two months ago. No, actually, sometimes you want to display them on screen during development, too. So they're in my face.
And if they're really important, we need to send someone a page in the middle of the night to wake them up. And if you're Jeff Eaton, you write a module to send watchdog messages to Twitter. Yes, he actually did that. I don't know why you'd want to, but you can. Don't decide in advance where you're going to log, because, guaranteed, you will have to change that later. Instead, simply hide it behind an interface. It's really simple to do this one, guys. Just throw stuff into an interface. You have an object behind it. You're done. You want to log somewhere else? You swap out the object. Problem solved.

Or better yet, don't design your own, and just use the existing interface that's already standard in the PHP community. This is a recent product of the PHP Framework Interoperability Group: PSR-3. It is a standard logging interface. We are trying to get this into Drupal 8. Basic idea: logging is a commodity. Why do we have everyone developing their own logging interfaces, their own logging mechanisms? That's silly. Here's the logging interface. Use it. You're fine. The main one is log(). You give it a level, a message, and context information. Context here is the same thing as placeholders in watchdog(). In fact, it's based off of them. And all these others, these are just utility methods for the different log levels. Cool. Use that. You're done. Use this for logging. And it doesn't matter what's on the other end. Your client changes their mind five minutes before launch. You can swap out the object and you're done.

How about caching? Well, we could cache to the database. But then APC comes along and you want to cache some stuff there. But then memcache comes along and you've got multiple servers. So you can't use APC. You have to put stuff in memcache instead. Oh, great. Now I need to rewrite my caching layer. And oh, wait. Now Redis is the cool thing instead of memcache. And the sysadmins want to use what's cool. And Redis is cool. So we have to use that now, because the sysadmins want to do so.
And oh, wait. Now they want to use Riak. I don't even know what that is, but it's supposed to be good for caching. I don't care. I shouldn't have to think about this at my application level. Don't decide in advance. You should not have to decide. Carve that out. In Drupal, we have this interface already. This is from Drupal 7: DrupalCacheInterface. Get, get multiple, fine. What's on the other side of this? I don't care if it's memcache, or Redis, or Riak, or whatever, or something that doesn't even exist yet. I can punt that question to later. Avoid decision making; punt questions to later. Or even better, don't write your own interface for that. There's another PSR standard in the works for caching. I'm actually the editor for it. Stay tuned. Hopefully that'll be out in the not too distant future. And then everyone can cache using the same common API.

Encapsulation avoids decision making. You want to avoid decision making until you can't get away from it. Put it off as long as possible. And in order to do that, you need loose coupling. You need your code to be loosely coupled, not relying on implementation details. You want explicit interfaces to hide behind. Hide implementation details behind interfaces so that you can change what's on the other side of that interface without breaking all the things. You want that.

How do we get that? Dependency injection. You've heard me say it before. I'm going to keep on saying it. One of the great things in Drupal 8 is that we have most of the system factored out this way, so that you can swap pieces out. It's not just for testing. Dependency injection is great for testability. It is also great for changes that happen right before launch, because you can just swap out one class for another class and be done with it. And the rest of the system doesn't break.

Another way to get this: make sure you separate your logic from your data. Oh, come on. Only a few laughs at that one? Separation of concerns.
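As a rough sketch of hiding the cache behind an interface and injecting it, here is a Python analogy. It is not DrupalCacheInterface and not the PSR caching proposal; `CacheBackend`, `MemoryCache`, `NullCache` and `ArticleRepository` are hypothetical names, and the two-method interface is deliberately smaller than a real one would be.

```python
from __future__ import annotations

from typing import Protocol


class CacheBackend(Protocol):
    """The only thing calling code is allowed to know about the cache."""

    def get(self, cid: str) -> object | None: ...

    def set(self, cid: str, value: object) -> None: ...


class MemoryCache:
    """In-process backend, standing in for APC, memcache, Redis, and so on."""

    def __init__(self) -> None:
        self._items: dict[str, object] = {}

    def get(self, cid: str) -> object | None:
        return self._items.get(cid)

    def set(self, cid: str, value: object) -> None:
        self._items[cid] = value


class NullCache:
    """Backend that stores nothing; handy in tests or during development."""

    def get(self, cid: str) -> object | None:
        return None

    def set(self, cid: str, value: object) -> None:
        pass


class ArticleRepository:
    """Receives its cache from outside instead of constructing one itself."""

    def __init__(self, cache: CacheBackend) -> None:
        self._cache = cache
        self.builds = 0  # counts how often the expensive path runs

    def title(self, article_id: int) -> str:
        cid = f"article:{article_id}:title"
        cached = self._cache.get(cid)
        if cached is not None:
            return str(cached)
        self.builds += 1  # pretend this is the expensive part
        title = f"Article {article_id}"
        self._cache.set(cid, title)
        return title
```

Swapping `MemoryCache()` for `NullCache()`, or for some backend that does not exist yet, is a one-argument change at construction time. `ArticleRepository` itself never has to change.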
You want to have interface-driven development. This is, again, a big change in Drupal 8, and it's a good change. Because we have node objects that are classed and clearly defined, with an interface you can rely on, which are separate from the logic to email someone about a node or whatever. These are separate objects, separate problems.

Your logic, your services, should be stateless and context-free. What does that mean? It means if I send an email using the email service, the next time I try to send an email, that first call doesn't affect it. No state is maintained between those. Why is that good? It means I can change the order in which I do things and not break things. I don't want my code to break depending on the order in which I do things. Wherever possible, code that has actual business logic in it should be stateless. You should be able to call it 50 times and get the exact same result without anything changing out from under you. When things change out from under you is when things break. State is the enemy of stable code.

Follow the single responsibility principle. One object does one thing. That thing could be being a node. It could be loading a node. It could be emailing a node. It could be exposing a node on a REST URL. Whatever it is, one object does one thing. And dependency injection. That makes it easy to swap things out. If you don't dependency inject, it means you have hard-coded classes. And when you need to change something at the last second, you can't avoid making decisions. Don't make decisions if you can get away with not doing so, because that lets you make the decision later, when you have more information.

Great, all this delegation adds indirection. And indirection requires abstraction. OK, this makes sense. And abstraction doesn't solve complexity. It just hides it. Abstraction does not change how complex something is. It simply means you don't have to deal with it directly. It is still there. Abstraction is not free.
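Before getting into what abstraction costs, here is a small sketch of the stateless, single-purpose, dependency-injected service just described. It is a Python analogy; the `log(level, message, context)` shape loosely follows the logging interface mentioned earlier, and `Logger`, `ListLogger` and `NodeNotifier` are hypothetical names, not Drupal or PSR-3 code.

```python
from __future__ import annotations

from typing import Protocol


class Logger(Protocol):
    """Minimal logging interface: a level, a message, and context values."""

    def log(self, level: str, message: str, context: dict | None = None) -> None: ...


class ListLogger:
    """Collects records in memory; a syslog or database logger could replace it."""

    def __init__(self) -> None:
        self.records: list[tuple[str, str]] = []

    def log(self, level: str, message: str, context: dict | None = None) -> None:
        # Fill {placeholder} tokens in the message from the context values.
        for key, value in (context or {}).items():
            message = message.replace("{" + key + "}", str(value))
        self.records.append((level, message))


class NodeNotifier:
    """Does one thing: builds a notification about a node.

    It keeps no state between calls, so call order cannot change the result.
    """

    def __init__(self, logger: Logger) -> None:
        self._logger = logger  # injected once, never reassigned

    def notify(self, recipient: str, node_title: str) -> str:
        body = f"To {recipient}: '{node_title}' was updated."
        self._logger.log("info", "Notified {recipient}", {"recipient": recipient})
        return body
```

Calling `notify` with the same arguments before or after any other call returns the same text, because nothing is remembered between calls. The only side effect goes through the injected logger, which can be swapped without touching `NodeNotifier`.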
We're talking about all this great abstraction stuff, but there is a cost to it you have to weigh against. On the performance front, at the language level, for example, call_user_func_array() costs as much as three PHP function calls. And using the magic __call() method costs as much as three method calls. I actually benchmarked this; it's about that expensive. By the way, guess what pretty much the entirety of Drupal is based on? Now, that's OK. Function calls are not that expensive these days. Modern PHP is pretty good about that. But that cost is still there. You have to be aware of it. Or further up, we've all used the query builders in Drupal, db_select(), right? db_select() is 30% slower than just using db_query(), plus more memory overhead. And that's not even a complicated query system. As query builders go, ours is pretty simplistic. Don't use db_select() if you can get away with using db_query(), and save yourself some time. And also, save yourself complexity. Save yourself confusion, because it is very easy to set up a system where you can't actually figure out what's going on. There's just too many layers of abstraction to dig through. Getting that balance right, where you have just enough abstraction to keep the problems away from you, but not so much that you can't figure out what's going on when you need to, is hard. Balancing that is hard. And you're probably going to get it wrong on your first try. But it's still a skill you need to learn. There are two ways of constructing software. One is to make it so simple that there are obviously no deficiencies. The other is to make it so complicated that there are no obvious deficiencies. Great, this doesn't help me much. Guess which one is harder? The unavoidable price of reliability is simplicity. Now, this doesn't necessarily mean simplicity at large, but within a certain area, keep something as simple as possible. Build a series of simple solutions that you collect together into a solution for the larger problem. 
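The cost of indirection mentioned earlier in this passage is something you can measure rather than guess at. The sketch below is a rough Python analogy, not a reproduction of the PHP benchmark from the talk: call_indirect() loosely stands in for call_user_func_array(), and the Proxy class with __getattr__ loosely stands in for a magic __call() method. All names are invented for the example, and the numbers it prints will vary by machine and say nothing about PHP itself.

```python
import timeit
from typing import Any, Callable, Dict, Tuple


def add(a: int, b: int) -> int:
    return a + b


def call_indirect(func: Callable[..., Any], args: Tuple[Any, ...]) -> Any:
    """Call a function through one extra layer, unpacking its arguments."""
    return func(*args)


class Calculator:
    def add(self, a: int, b: int) -> int:
        return a + b


class Proxy:
    """Forwards unknown attribute lookups to a wrapped object."""

    def __init__(self, target: Any) -> None:
        self._target = target

    def __getattr__(self, name: str) -> Any:
        # Only invoked when normal attribute lookup on the proxy fails.
        return getattr(self._target, name)


def measure(number: int = 200_000) -> Dict[str, float]:
    """Time the same addition via direct, indirect, and proxied calls."""
    proxy = Proxy(Calculator())
    return {
        "direct": timeit.timeit(lambda: add(1, 2), number=number),
        "indirect": timeit.timeit(lambda: call_indirect(add, (1, 2)), number=number),
        "proxy": timeit.timeit(lambda: proxy.add(1, 2), number=number),
    }


if __name__ == "__main__":
    for label, seconds in measure().items():
        print(f"{label:>8}: {seconds:.4f}s")
```

All three paths compute the same answer; the only thing that changes is how many layers the call passes through, which is exactly the trade-off being weighed.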
That helps you break up the problem space better. That helps you keep the simplicity where you need it and still be able to solve the complex problems. But remember, there is no problem in the world that cannot be solved by adding another layer of abstraction, except abstraction. And finally, Drupal has idiosyncrasies, but really, there's nothing special about it. Doesn't matter what version we're talking about, it's just software. And no one understands what's going on in Drupal, except for Drupal developers. And not even all of them do, really. Maybe you're working with another system like Symfony. No one understands Symfony's quirks except for Symfony developers, and a lot of them don't get it either. No one understands PHP's weird quirks except PHP developers, and there's an awful lot of them that don't get them either. And believe me, there are plenty of quirks in PHP. Whatever system it is you're building, no one actually understands it except you. And probably, if it's been more than a week since you worked on it, neither do you. We all know the guy on the left here, right? That's Károly Négyesi, chx. One of Drupal's top developers, one of the lead people behind Drupal 5, 6, and 7. Crazy smart guy, has forgotten more about PHP than I know. He's a great guy, but doesn't matter how smart he is, John Resig can code rings around him when it comes to JavaScript. No matter how smart you are, and in whatever field you're working, there's someone who knows more. No matter what it is you're doing, odds are someone's done it before you and done it better, because we've been around a while. Think about it: third-generation languages, that means something higher than assembler, something more complex than assembler, have been around for 56 years. Who in this room has been around for 56 years? Uh-huh. No, nobody. PHP has been around for 18 years. I don't know about in the Czech Republic, but in the US, it's now old enough to vote. 
I don't know, is it old enough to drink or vote here? Okay, maybe. Drupal is 12 years old this year. Your site is six months old. I mean, really. What are the odds you're doing something that in 56 years no one has managed to run into? Pretty darn small. Find existing tools that you can use instead. If you're looking for PHP code, use Composer. Find code on Packagist that you don't have to develop; you can just use it because it's already there and you don't have to do anything special with it. Look at the Symfony2 components. We did; it made Drupal 8 a lot better. Look at Zend Framework 2. They've got good components there too. Maybe Symfony doesn't do what you need, and Zend does. Great, use that. If you need a larger system, don't write your own platform. Build an application on Symfony2. Build it on Zend Framework. Build it on CakePHP. Build it on, you know, Laravel, on whatever system you want to work with. Find some base platform to work with. If you're doing a CMS-type site, build it on Drupal, absolutely. Don't write your own CMS. Good God, don't write your own CMS anymore. No one in their right mind writes their own CMS anymore. Do not write your own system from scratch when so many people have written so much good code before you got there that you can just use. That's the wonder of open source. Please take advantage of it. If you have to write something yourself, leverage existing standards. Look at the stuff that the FIG is putting out regarding interfaces, regarding coding conventions, regarding all these other things. Don't re-architect when you don't have to. Learn HTTP. Who here has ever read at least a slight piece of the HTTP spec? Who has read more than half of it? More hands should be up here. Everything we do on the web is based on HTTP. Learn how it works. It does a lot more than you think. And you can leverage that if you know it's there. 
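As one example of what HTTP already provides, conditional requests let a client revalidate a cached copy without downloading it again: the server sends an ETag header, the client echoes it back in If-None-Match, and an unchanged resource is answered with 304 Not Modified and no body. The sketch below shows only that decision, in Python; make_etag() and respond() are hypothetical helper names, and the comparison is deliberately simplified (it ignores weak validators, comma-separated lists, and the `*` form that If-None-Match also allows).

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    # Simplified exact match; real servers handle more forms of this header.
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""
    return 200, headers, body
```

A first request with no If-None-Match header gets a 200 with the full body and an ETag; a repeat request that sends that ETag back gets a 304 with an empty body, so nothing has to be designed from scratch to get revalidation.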
If you need to do a web API of some kind, you wanna do it using REST. Why? Why would we bother using REST semantics and RESTful APIs in the first place? Why would we follow this existing prescribed pattern? What does REST actually stand for? REST stands for "I can go home and take a nap," because I can rest and not have to design the whole thing from scratch. You want caching for your API? HTTP has it built in. Use it. It's already been figured out. You need to do validation? It's already there. You need to handle authentication? There's multiple mechanisms built in. Don't reinvent the wheel. Just follow these patterns. You don't have to waste your time trying to redesign these things. Learn from existing tools. If you're actually going to write new code, look at PECL, the PHP Extension Community Library, C plugins for the PHP language itself. Look at them and see what they're doing. Can you just use one of them? Maybe. If not, do a PHP user-space version of it. If you're doing anything with JavaScript, look at jQuery. You can learn stuff about PHP from jQuery too. Learn from these other systems. Look at WordPress. It's fun to make fun of WordPress, but they have nine times the market share of Drupal. They must be doing something right. Look at their code and learn from it. Even if it's learning what not to do, you can still learn from it. Look at Symfony2. Look at Zend. Look at these other popular PHP frameworks. Whether you use them directly or not, you can learn from them, and that makes your code better. That means you don't have to invent as many things yourself. Look at Java. A lot of the documentation on Java object-oriented techniques applies to PHP, because PHP's object model was derived from Java's. It's evolved separately since then. Fine, cool, whatever. But there's still a lot of similarities. A lot of the documentation you find in the Java world applies just as well to PHP. Not all of it, but a lot of it. Look at these resources. 
Learn from them before you start down a road yourself. Look at design patterns. A lot of them talk about Java, but they apply generically. Look at these tools before you start designing so that you know what to do and what not to do. You know what paths are already trodden. There's probably a reason why the path less traveled is less traveled. Don't always follow that path if you don't have to. This is exactly what we did in Drupal 8. This is why Drupal 8 is such a radical departure. We are pulling in code from Symfony. We're pulling in code from Symfony CMF, from Zend Framework, from Doctrine, from Guzzle, from EasyRDF, from Assetic. We're using Twig. All so that we don't have to reinvent these wheels. Because Symfony has a perfectly good routing system and a perfectly good core pipeline that is aware of HTTP, so we used it. Because Zend had a perfectly good RSS parser that is way better than the three different ones we had in Drupal, and so we just used that one. Because let's face it, Twig beats the pants off PHPTemplate and is way better than anything we'd come up with ourselves. Let's just use it. And by the way, all of these things have existing documentation we don't have to write. That saves everybody time. Because really, there is a feature that absolutely no software project needs. Your code should never have this feature in it: ego. It has no place in software development. Which leads us to Durden's Law. I don't care what you're doing. I don't care how cool it is. You are not a special and unique snowflake. Which leads us to our grand unified theory of API design. Let's bring everything together. Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. Well, guess what? You always know where you live, don't you? 
Six months from now, some poor sucker is gonna have to do something with the code you're writing that you never thought of. They will not be able to modify your code, but they're gonna have to build on top of it and get something done that never in a million years would have occurred to you. And they're gonna be stuck with that, stuck dealing with whatever it is you gave them. That poor bastard's gonna be you. Plan accordingly. Some additional resources. I mentioned the book 97 Things Every Software Architect Should Know. I highly recommend reading that. It's a very easy read. Have people heard of The Daily WTF? It's a great research project. People think it's a humor site. It's actually a research project in software anti-patterns and process anti-patterns. Read this site; don't do what they do. It's a great way to learn what not to do. A lot of these quotes also came from softwarequotes.com. Another good resource, I highly recommend it. Thank you very much. My name's Larry Garfield. Please review the session online. Also, two other announcements. One, this room, next session, right after the coffee break, there's a session by Anthony Ferrara talking on many of the same subjects. Stay in this room for the next session. And I expect you all at the sprint on Friday to help work on Drupal 8. Thank you. I think we've got about five, 10 minutes for questions. So there's this very nice gentleman down here with a microphone. If you have questions, please raise your hand. He'll come running. Yeah. Can you hear me? I thought that you're the right person to ask my question. I have an idea. Maybe I'll get support or denial. I was thinking whether there could be a possibility of core or module sustainability between the versions. I spoke with others who don't like the idea. And I think that it's not a good idea to lose that amount of good code, good modules, during the version upgrades. 
Come to my core conversation tomorrow, third session. I think it's next floor up. I can never remember which floor is which here. But yes, I'm talking about exactly that question. So I'll talk to you about it then. Okay, thank you. Other questions? Somebody down here? Uh-oh. So we had those nice principles or aphorisms or whatever. And the truth is that in reality you code something and then you realize it's crap and you do it again or do something else. Or you let someone else code it, who is maybe just a trainee or someone, and then you won't be happy with it, but it works, and then you prepare for the fact that some day you'll have to do it in a different way. And shouldn't this also be part of the big theory, that things evolve and you learn? Say I want to write a method. I think it could be a good idea to write this method. And I start writing, and before I add the comments and everything I say, oh no, that was a bad idea. I remove it again. And so there's a question of the timing: when do you add your comments, and do you want to start with something really ugly and then evolve from that? Or do you want to start with something over-engineered, and then it comes back and bounces from over-engineered to oversimplified and to something else? So there's a kind of process, and you can't just do it all at once and have all these principles filled in in the perfect way. Sure. So it is a balancing act. Another good aphorism: plan to throw one away. Whatever it is you write first is probably wrong. Do that in small chunks. Your code will be better if you can write it once, realize it's wrong, throw it away, and redo it in bits and pieces rather than the entire system at once. That's another place where this divide and conquer makes sense. If you have an interface on something, and on that side of the interface, I know I've got crap. 
The code is a complete disaster, but I know it, and I know that it's hidden behind that interface. As long as that interface is decently designed, you can refactor that crap later. You can rewrite what your intern wrote, you can have your intern rewrite the crap that you wrote, whatever. As long as that interface is right, the rest of the code is fine. The key is getting that interface right. That's where the rule of three comes in. So write something, try it in a couple of situations, realize it's wrong, fix it, go through that process to let it stabilize. As for documentation, yes, just as you're writing, you probably don't wanna document every method the instant you write it. I would say before you commit, make sure you've documented. Before that, yeah, you're gonna be throwing stuff away, writing, rewriting, whatever; don't bother. Document when you commit. Or at the very least, document before you push your branch. I would also say never leave something uncommented at the end of the day. Tomorrow morning, you are not going to remember what you were thinking. Last thing you do in the day, go through your code, make sure you comment it, make sure you update any previous comments and doc blocks to make sure they're right, because that way, tomorrow you is going to thank you. Do not make tomorrow you hate your guts. Does that make sense as a guideline? All right, over here. Hi, so you mentioned that we shouldn't reinvent the wheel, that we have a lot of things out there and we need to use them. But if you look at Drupal, you listed like eight, 10 components. As individual pieces, yes, they are good. But as a product, or maybe if you look at it another way, we are taking a few well-known, working things, putting them together, and creating a new product like Drupal, and we are saying that it's gonna be fine. But how do we validate that if we put these things together, they know each other and they work well? That's gonna vary widely with the system. 
In the case of, for example, Symfony, we wanted to make major changes to Drupal 8, and we knew we were gonna make major changes to Drupal 8, and realized, okay, the changes we wanna make, Symfony has already done. So we can just skip over three years' worth of development and build stuff on top of that. That is probably gonna be similar to where we wanted to end up anyway, but we managed to skip three years' worth of development. In the case of Twig, we actually looked at it and said, all right, is this going to be compatible with our system? And it turns out a lot of the stuff Twig wants you to do in order to set it up, Drupal was already doing, so it was a good fit. We also looked at other tools. We looked at Imagine, I think it's called, an image manipulation library that wasn't following autoload standards, wasn't following modern conventions. We're like, okay, trying to use this library would be more trouble than it's worth. And so we passed on it and just kept doing our own thing there. So a lot of it does come down to case-by-case decisions. There will be cases where the existing code is not going to work, or where the existing code is far more than you need. That's okay. Still do the research and look at it and say, okay, could I use this? Would this be better than writing it myself? Can I save myself time with this? The answer may be no, and that's fine, but you have to ask the question first, honestly, and say, you know, is it worth my time to write my own thing, or is just using this library going to be better in the long run? And go with whatever works in that case. So there's a lot of case-by-case decision there. So there's some people in the Drupal community who might say, with documentation, I've written you this module, it's free, you know, I've done all the work on it, why should I have to sit there and document it? I've done enough already. How would you address those people? 
I didn't force you to release the module in the first place. You release the module because you want to help other people, you want to give back, you want to share. I mean, if you're releasing a module, it's a good thing if lots of people can use it. If you didn't consider it a good thing for lots of people to be able to use your module, why did you release it in the first place? So you already have the intent of other people benefiting from this module, from this code you're releasing. If that's your goal, documentation improves your ability to achieve that goal. Uncommented code that you just released out into the wild is useless, people can't use it. If that's the case, don't bother releasing it. This doesn't mean you have an obligation to maintain it for free for all time or anything like that, no. But if you are going to try and do the open source thing and release code to help other people, do it right and document it so other people know how to be helped by it. And if people know how to be helped by it, that means they also know how to contribute back more easily. And I would rather contribute to a module or a project that is very well documented because I know what's going on and I know how to extend it, I know how to improve it without breaking things, without breaking assumptions. And so you will get more and better contributors if that's what your goal is, if you have good documentation so people know what they're supposed to do with your code in the first place. Code on its own is not enough if that's your goal. If your goal is just I'm gonna write something and get it done and then go home and I don't actually care, don't bother releasing it. It's not worth the effort to generalize to release it. And that's okay too, but just be aware of which is which and when you're doing one or the other. Do you have any questions over on this side of the room? All right, thank you all for coming. Enjoy the rest of the conference. 
Please rate the session, and I'll see you all tomorrow and on Friday at the code sprints.