[Opening largely inaudible; the recording mentions Indiana Jones and the year 1900.] ...we think that they might have been used to calculate the position of ships on the water. All of this technically makes it the first computer that ever existed.

Just like the Antikythera mechanism, most applications that we work on today were developed years ago, created by developers who left no documentation, and we just can't reach them anymore. So we hoped that the Antikythera could help shed some light, help us gain some insight into some legacy code bases. I was on my way to pick it up to share it with you all. But it's been stolen. If anyone sees this man, let me know. Top 10 most wanted list. In this talk we're going to go on an adventure. We're going to infiltrate the casino and we're going to recover the mechanism.

[inaudible] Robert Glass [inaudible] 40 to 80% of [inaudible]. Only successful projects get maintained. If it's not successful, no one's going to put any effort into it. It turns out that most of the time, once we identify an issue, the fix is fairly simple. Finding that issue can take days or weeks. Understanding a product is the dominant maintenance activity. This means that our primary task as developers isn't to write code, but to understand it.
[inaudible] If we are going in, we can't go in with a sledgehammer. Some parts of the application are very valuable and very risky to change. Which parts are they? Which bits are dangerous? Which bits will break everything if I touch them without really understanding what is going on? What kind of team created this? Was it a young team? Was it an experienced team? Were they contractors? Knowing this can help us gain a bigger picture of how the application was written. This helps us a lot later on.

So normally, I get a new project and I start with the README. If you're really lucky, this will exist. The README normally explains, if you're lucky, why the application exists. As programmers, we're very good at working out technical details, but understanding the motivations for an application is something that you can't get just by looking at the code.

Do you know someone that goes to the casino that Michael's hiding in quite a lot? Talk to them. Find out what pain they went through. Perhaps the roulette table is rigged, or perhaps they're familiar with the code base and they know that any database access is in that file called util.php. Utilise co-workers.
Utilise people that have seen the code before. Use their experience to help you.

Inventory the code. Make sure that you have everything that you need. There's nothing worse than getting halfway through a project and then realising that you're missing a key part of it. It'd be like Indy going into the casino without his whip. There's no easy way to do this. There's no general checklist that works for every project. But as you work for a company over time, you start to notice patterns. They have the same database requirements. They have the same third-party requirements. Spot these patterns, make a list, and it will help you in the future. You may have all of the code, but what about the dependencies? Shared libraries, third-party services, databases. Which of these are optional? Does it need specific versions of a dependency? Try and build the project. If you can't build it now, before you make changes, you definitely won't be able to build it afterwards. Finally, accept that you're still missing something. You don't know what you don't know. Just try and minimise that. Try and get as much information as you can before you start.

So, it's time to go in. I've done my homework. We're ready to navigate the casino. But we get to the door and there's a security guard. He told me that I could only enter the casino if I had the password. So, I thought, I know the password. So, I told him, "this talk is great". And he said, "No, wrong. That is not the password." It seems like a bug to me. But I had the code base, so I started having a look.

In a new code base, errors are your best friend. Try and trigger an error condition, one that the application handles. When you trigger an error, the application throws an exception. And once you have that error message, search for it, because error messages tend to be fixed. If you're working with real data, it might come from a database, it might come from a third-party service.
But errors tend to be in the code base itself. And as soon as you find that error message, you know where it was triggered. You can work backwards from there. So, I searched for "incorrect password". That's what Michael said to me. And it brought me to this piece of code. And he's right, the password isn't "this talk is great". It's "shibboleet". So, I went back to the casino. I said, sorry, my mistake. Shibboleet. And he tipped his hat, and in I went. And that's the best introduction to a code base ever. There was a defined problem, and we could focus on a small amount of code very fast.

Sadly, that was the easy part. Now, we've got to find what Michael is hiding. There's a lot of people in the front playing on the slots, but we're looking for the real tables. A back room somewhere. That's probably where the Antikythera is hidden. It's time to take a look around and find an entry point. This could be called index.php, bootstrap.php, web.php, something like that. And in here, we're looking for configuration loaders, routing information, that kind of thing. What makes this place tick? We need to know this inside out, as we're likely to be seeing a lot of this place as we explore.

If you're new to an application, if you're new to the casino, it's like a maze. Don't bother with fancy tools or anything. Just take pen and paper, and sketch. How do you get from place to place? To get through to the security section, I need to get through the front, through authentication, and make the correct request to security. This is just a sketch of something I was working on, a microservices architecture, and any time I needed to authenticate, it went through this process. Once I sketched it out, I didn't have to look at the code again, because I knew which services it was talking to and which endpoints it was hitting. A 60-second job that saved me hours over the course of the project.
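The error-message search from the door incident can be sketched as a small shell function. This is a minimal sketch, assuming a `grep` that supports recursive search with `-r` (GNU and BSD grep both do); the function name `find_message` and the `src/` directory are hypothetical and not part of the casino code base.

```shell
# find_message MESSAGE [DIR]
# Print "file:line:text" for every place a fixed string appears under DIR.
# Hypothetical helper, for illustration only.
find_message() {
    message=$1
    dir=${2:-.}
    # -r: search DIR recursively
    # -n: include the line number of each match
    # -F: treat MESSAGE as a literal string, not a regular expression
    # --: stop option parsing so a message starting with "-" is safe
    grep -rnF -- "$message" "$dir"
}
```

A call such as `find_message "That is not the password" src/` lists each file and line where the message is defined, which is the point to start working backwards from. The function exits with a non-zero status when nothing matches, a hint that the message may come from a database or a third-party service instead.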
Once we have a decent idea of how the data flows through the system, we can start looking at the details, looking at the code, and working out what it does. Whatever you do, don't trust the comments. They're designed to lie to you 90% of the time. Look at this example. Based on what you can see, you'd think that it returns a random number for the roulette table. But we can't be sure. No matter what the documentation says, the source code is the ultimate truth, the best, most definitive and most up-to-date documentation you're likely to find.

But saying that, don't trust the code. You can always work out what code is doing, but it may not be doing what it says it's doing. For example, method names. They can lie to you. Just because a method is named ReadConfig, there's no guarantee that it's not writing a megabyte of data to disk under the hood. Going back to our roulette random number, just because a routine is named roulette random doesn't mean that it's not hard-coded to return the number 4. Check the code. People don't do this on purpose. Things just kind of happen over time. But this can be useful information, too. How did the code get this way? What other changes were made around the same time? Were the development team under a lot of pressure at that point in time? Should we be extra careful with other code that changed in the same time period?

Before we go and put everything on number 4 on roulette, we need to make sure that this code is actually used. It could have been built for a brighter future that it never actually got to. Static analysis tools, things like Scrutinizer, can help. But there's always my personal favourite. Stick a die() in there. If it doesn't trigger, the code isn't running. Nice and easy. But be aware that this isn't perfect. Some code only runs in certain environments. You might not trigger this in development. You might trigger it in production.
You need to make sure that the application isn't doing different things depending on where it's running.

Look for things that don't look quite right. That file that's called configurationloader.php actually has 2,500 lines in it. It's probably doing more than loading configuration. I decided to search through the casino code base for something like this. And while searching, I found something useful. When the fire alarm trips, all of the door locks are released. This could be my opportunity to get into the back rooms, to get into the secure areas of the casino, and find Michael and the Antikythera mechanism.

Let's just do one more pass of the code to get ourselves ready. By now, we're starting to understand the terminology used by the original developers. Make note of this for yourself and for others. Different parts of the code might use the same term in different ways. Which version of that word does this particular part of the code mean? Look for things with specific names. At the casino, driving licence validator is a good name. On joind.in, which is a speaker review site, talk comment approver is a good name. State manager is a bad name. What state are we managing? In fact, anything with manager, util or helper as the suffix probably isn't that useful in terms of getting to know what it actually does.

Look for how things are intended to be used. If you find a super complicated class with a dozen constructor parameters, perhaps you're not supposed to instantiate it yourself. Look for a factory that will create the object for you. Lots of the parameters may be defaulted. Often, you can spot patterns. Complement that information with what you learn as you make changes to the code. Perhaps most of the database access is located in our inappropriately named utility module. Maybe each screen in the application is backed by its own class. You've just uncovered your first principles. They might not be ideal, but at least they exist. Then look for changes that break those principles.
Consistently bad is better than inconsistently good.

And finally, prove you understand it. Don't just dive in and start adding and deleting code. Build something that uses that code. Maybe that's a new endpoint or, preferably, a test of some kind. Prove that you know how you got to that code, what it does, and how it interacts with everything else in the system. And then start making your changes.

At this point, I'm not ashamed to admit it: I don't understand it. I'm anxious to get the mechanism back, but I'm not quite comfortable making the changes that I need yet. I need to go back and take a better look at the casino plans and plan my escape route for when I do eventually find the Antikythera mechanism. We've been through the code base. We've had a look for the patterns. We've had a look for anything that doesn't look quite right. But there must be more that it can tell us. Everything up until this point has been fairly process focused. Follow step A, then B, then C. We need to get our hands dirty. We need to work out how things actually work. I have my own copy of the casino. I can do whatever I want. I can change whatever I need. So let's get started.

Use everything. I'm an everything debugger. I use echo and die when convenient, and Xdebug when working with complex data or when I need to hook into something and change values interactively. When debugging, everything goes. You want to use a global variable? Fine. You want to redefine a method? Yep. Adding conditionals, manipulating the load path. If you want to edit the dependencies in your Composer vendor folder, do it. This is your checkout. No one else is going to see this. It's to help you understand how the application works. I do anything I need to make sure I understand what's going on.

I'm a big DDD developer. That's die-driven debugging. This is your debugging sledgehammer. You go in, you knock down walls, you work out what's going on. I want to see how many people are currently in the casino. Echo, var_dump and die.
Make it easy to see your debugging information by adding fences around it. Use conditionals too. Only show what you need to see. Remember, when debugging, everything goes. Jenny Wong, she keynoted here last year. If she's not in the casino, show me who else is. I'm looking for allies.

If you need to know how you got to a certain point in the program's execution, use debug_backtrace. You can throw an exception, but something could catch that. If you use debug_backtrace and die, you know that you're going to get the information that you need. Nothing's going to catch that.

Don't be afraid to extend and change behaviour. I once had an object that was being mutated, and I couldn't work out where it was being changed. So I extended it. I used a new class, and I overrode the setter to throw an exception. I used this successfully to find out what time the security team changes shift at the casino, so I know when to strike.

If you feel like you should do more of a proper job, you can use Xdebug. This is more of a hammer and chisel than a sledgehammer. The first thing I always do in a new application is enable xdebug.scream. This disables the @ operator for error suppression. If you see some behaviour that doesn't feel quite right, but you can't work out what's going on, this can show you all of the warnings that are being suppressed. It's usually a reasonably good indicator that you should look at that area of the code a little bit more closely.

But most people will use Xdebug as an interactive debugger. That's its primary use case. There's loads of other talks on that, so I won't go into too much detail. But you can use xdebug_break to drop into an interactive shell at any point. Xdebug isn't a standalone tool, though. Use your standard debugging techniques with it too. Conditionally break out into the interactive debugger. You can even change values to make it look like Jenny is here with us in the building, even though she isn't.
Let's start thinking about other tools that are available. Things like XHProf. This is primarily a profiling tool. Why would this help us debug? Well, it can show things like function execution counts, memory usage, runtime, all kinds of information. If a single non-built-in method accounts for 80% of the calls in the script's execution, that's probably where you want to start looking. The same holds true if a single function uses 90% of the memory. Investigate where all of your resources are being used. That's probably an important part. By combining our tools, there's so much more that we can do. Don't choose one tool and use it exclusively. Use everything that's available to us.

However, we have to be careful when we're doing this. The observer effect is when things change behaviour when they're being looked at. Using a debugger can trigger special cases in your code. Imagine trying to find a race condition only to find that your debugger makes certain functions synchronous rather than asynchronous. In this case, all you can do is print values, log things, and look at it by hand. And anything can cause this. I once heard about some code that only worked when trace-level logging was enabled. When the logging was compiled out at build time, the code had several race conditions. It turns out that the logging code slowed it down just enough to mask those issues. And that code was shipped to production with trace logging enabled. It works. But they tried to attach a debugger, they tried to dive in, and everything just worked in development.

At this point, I'm feeling better about the application, but I'm still not 100%. I don't want to miss anything that might help me. I only get one shot at recovering the Antikythera. But the code isn't the only source of information. Your version control system is awesome too. We'll never be able to understand a complex system by looking at a single snapshot of it, a single point in time.
When we limit ourselves to what's visible in the code, we miss a lot of valuable information. Instead, we need to understand both how the system came to be and how the people working on it interacted with each other.

My one-stop shop for all of this is git pickaxe. Specify a string and it will search everything for it. Diffs, commit messages, everything. There's an equivalent for other version control systems if you're using Mercurial or SVN. I use git, so I use it to search the casino's development history for Antikythera, and it brings up this commit. Michael's added a new storage facility for the Antikythera, but it isn't in there yet. That's great to know, as it'll be tough to get it out once it's been locked away. I'll save that for later. I'll keep digging into the source control history.

Think about what's changed recently. Code that changes is likely to change again. Code changes for a reason. Perhaps it's under active development right now. Perhaps it's a module that has too many responsibilities, or the feature area is poorly understood and its definition has to keep changing as a result. Those are the areas that we should pay special attention to.

Look at the git history. When file A changes, does file B change? Does it happen every time? This is temporal coupling. There are two kinds. Explicit is okay. When it's an explicit coupling, we kind of expect that. Here, we see that when Michael changed the roulette table, he added the double zero to the game, and he updated the test at the same time. This makes a lot of sense. You update a class, you update its test. This is explicit temporal coupling. But there's a lot of implicit coupling in a lot of applications, and this is quite tough to spot. In this case, we can see that Michael removed a table and updated the fire escape plan at the same time. It makes sense. The floor plan has to be accurate. But they're implicitly linked, not explicitly.
We wouldn't know that you have to update the fire plan if the floor plan changes. This is the kind of information that gives you the additional context around how things are linked that you can't get just by looking at the code.

There's quite a few common causes of temporal coupling. Copy and paste is a prime candidate for extraction. Perhaps there's inadequate encapsulation. If concepts aren't encapsulated quite right, you need to edit multiple files to change any behaviour. Perhaps you have a producer and a consumer. Actually, that one's kind of expected. You change the producer, you change the consumers. But they might not be explicitly linked. That could be an implicit link. That's just something you have to know. It's down to your expertise to make an informed decision based on what you know about the application. Make sure that you don't just look at a single commit. Look at all commits within a day to find temporal coupling. Things might not change in the same commit, but they will generally change within 24 hours of each other if they are coupled.

We can use this data to feed into another tool called Code Maat. Code Maat can generate all kinds of statistics from project history. So I ran it on the casino and it generated this summary. There have been 2,200 commits from 106 different people. This tells us that either there's a big team working there or there's a lot of turnover. Both are interesting things to know. We can find organisational metrics. The VIP lounge is the one that's changed by the most people, with 26 unique authors and 181 unique change sets. To me, this says that if I want to change something, I should probably start around there. We can find age metrics. Only three files have changed in the last three months in the casino. This tells me that the code is generally quite stable. It tells me that a lot of the classes in there are single use. They have one job. They do it well.
It also shows me that the radio channels haven't been changed in seven months. Maybe I could listen in. Maybe that's my way in.

Finally, my favourite: coupling metrics. This shows us that the floor manager and the table factory change together 100% of the time. Any time we add a new table, the floor manager is updated. It also shows us that both the drinks vendor and the snacks vendor have changed ten times, and 60% of the time they changed together. Information like this really helps you understand how something runs. If you want to get into the casino and you're pitching for the drinks contract, you'll probably want to go for the snacks one too.

Let me give you a better example of how this can be useful. Let's take a real-world example. This is from the old version of joind.in. 90% of the time when config.php changes, database.php changes too. That's quite high coupling without an explicit link. You can probably imagine the scenario where it does make sense, but it's still implicit and we wouldn't have known about it.

As wonderful as it is, history analysis doesn't always work. It's not a silver bullet. Individual commit styles might bias the data. Some developers like to commit small isolated changes, while others prefer to make one huge commit every three days. It's important to limit the time period that you're analysing to a period shorter than the project's history. If you include too much historic data, you might skew the results and obscure important recent trends. You also risk flagging hotspots that no longer exist. History can be wrecked by automated tools. For example, PSR-2. You run PHP-CS-Fixer and all of your history is gone. You can't compare things before and after that was run. If you're in that situation, just limit the time period that you're searching to after that automated fix was run. The project's history can be amazingly useful. It can also be misleading.
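The coupling counts described above come from Code Maat, but the underlying idea can be sketched with git and awk. This is a rough sketch under stated assumptions, not how Code Maat computes its metrics: it only pairs files changed in the same commit (there is no 24-hour window), it reports raw counts rather than percentages, and the function name `co_changes` and the `--` commit marker are illustrative choices. It also assumes no tracked path starts with `--`.

```shell
# co_changes: read the output of
#   git log --name-only --pretty=format:'--%h'
# on stdin and print "count<TAB>fileA<TAB>fileB" for every pair of files
# that changed in the same commit, most frequent pair first.
co_changes() {
    awk '
        # Record one co-change for every pair of files in the finished commit.
        function flush_commit(    i, j, a, b, t) {
            for (i = 0; i < n; i++) {
                for (j = i + 1; j < n; j++) {
                    a = files[i]; b = files[j]
                    # Put each pair in a consistent order so that
                    # A+B and B+A are counted together.
                    if ((a "") > (b "")) { t = a; a = b; b = t }
                    pairs[a "\t" b]++
                }
            }
            n = 0
        }
        /^--/   { flush_commit(); next }   # commit marker: start a new commit
        NF == 0 { next }                   # blank separator line
                { files[n++] = $0 }        # a path touched by this commit
        END {
            flush_commit()
            for (p in pairs) printf "%d\t%s\n", pairs[p], p
        }
    ' | sort -t "$(printf '\t')" -k1,1nr -k2
}
```

Run inside a repository, `git log --name-only --pretty=format:'--%h' --since='6 months ago' | co_changes | head` shows the pairs that changed together most often. The `--since` limit follows the advice above about keeping the analysed period shorter than the project's full history.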
Use your judgement to decide which bits of information you use to inform your decisions.

Given that today's polished code will inevitably become the subject of someone else's future archaeological dig, what can we do to help them? How can we help them comprehend what we were thinking? What do you wish you had when you started this project?

Secure the site. Every file related to the project should be in version control. They should be able to run git clone and have everything they need.

Leave a Rosetta Stone. As you worked through the code base, as you learned the technical jargon of the domain that you were working in, you made notes. It helped. Imagine having that from day one. When someone says "[unclear]", they actually mean customer invoice. Write this down. It will help people that inherit your project in the future. It will be twice as useful for them as it is for you.

Build in instrumentation, tracing and visualisation hooks wherever applicable. This could be as simple as tracing "got here, got here, got here". Or it could be as intricate as an embedded HTTP server that displays the application's current state. That's actually really useful. We have those HTTP pages in a lot of our core pipeline stuff, and being able to see per second how it's behaving, what it's processing and what it's not is awesome. So, so useful for both development and production debugging.

Use consistent naming conventions to facilitate automatic static code analysis tools and search tools. And it helps humans. If there's a pattern to how you name things, once they learn one area of the code, they automatically know all the rest.

Build a knowledge map. Work out who's responsible for each area of the code. Who's responsible for the code that sends the emails? What about the code that approves all of the incoming orders? Is it just one person? That's an issue. What happens if they leave? Who can pick up that code?

And finally, explain why you did things, not just what you did. We're developers.
We're very good at reading code and working out what it does. But as I mentioned at the beginning, code can't show you the intent. It can only show you what it actually does. Given the business requirements, we can write new code to fulfil them, but we can't really work out those requirements from the code.

I'd like to leave you with one of my favourite comments in a code base. It's insightful, and it meant that anyone coming to that file knew exactly which parts they could and which parts they shouldn't try to change. I'm Michael, I'm [handle unclear] on Twitter, and you've all been awesome. Any questions?

Any questions from the floor, guys? No? Thank you very much. Thank you, Michael. That was a great talk.