Hi there, and thank you for joining. My name is Alessandro Michi. I work for Biobahn, a software company in Rome, where we use mostly Python. Since most of my hobbies involve programming, I often take part in coding competitions. It's a very fun sport. I'm just an amateur, not very high in the rankings, and I do it as a weekend hobby. But I think I have learned quite a lot about the general theory of coding competitions. It was fun, and it also helped me become a better programmer overall, though not in the way you would expect. It's not that you learn so many new tricks that you then use. It's really a more general attitude toward programming, and this is what I want to talk about in this talk.

The first point is: why would you do coding competitions with Python? Usually you would say that if you want to compete with others, you need a language that is as fast as possible, which is true. And in fact most competitors use one of three languages. As a typical coding competition, take the Google Code Jam. It's the largest single online coding competition, and it's the example we use throughout the talk because it's the cleanest and the easiest to track. The three most used programming languages are C++, Python, and Java, especially in the initial rounds. The qualification round is open to everybody. Then you need to qualify for the following rounds, and the field becomes smaller and smaller until the final round, where there are only a handful of people, 25 or 26 of the very best competitors. Compared to the C++ crowd, the Python crowd is more or less half at the very beginning, and it becomes one-tenth at the end. But still, there are people who use Python in the finals. And the reason is that there have been cases where the only solution submitted to one of the problems was in Python.
Take 2011, round three, one of the final rounds, for which only the top 25 competitors qualify. User Linguo is a proficient Python programmer. He codes in nine or ten languages in the competitions for fun, but when things get tough, he uses Python. He was ranked first because he was the only one who solved the large input data set of problem D, and he did it with Python. No one else managed to do it, not even the very best C++ programmers.

In 2013, round two, something even more interesting happened. bmerry is one of the professionals, one of the guys who can win the competitions, and he is a C++ guy. He uses C++ for almost everything. But he was the only one who solved the last problem of round two, and he solved it in Python. So he switched, and in just half the time, or even less, that the other C++ competitors had, he managed to implement the Python version and get it accepted. Now notice that he finished at 2 hours, 29 minutes, and 54 seconds. The full competition is two and a half hours, so he squeezed in the last, winning submission just six seconds before the end. This is also one of the lessons: never give up.

Finally, in 2017, and this happened last month, there is user Kevin Sogo, who is one of the youngest and very best on the scene. He is a Python guy. He uses mostly Python, except for C++ when he thinks that he really needs the speed boost. He was one of only four people who submitted a correct solution for the final problem, and three of them used Python, including Kevin Sogo. So the point is that it might, in certain cases, be a good idea to use Python. Let's try to understand why. Also, over time more and more people are using Python, both in the initial rounds, where there are a lot of people who are not as good as in the final rounds, and in the final rounds themselves.
In fact, in the past two years Python overtook Java as a language in the very final rounds.

So the plan for the talk is to give an intro to what exactly it means to compete, using the Google Code Jam as the example. Then we will try to solve a problem live. They say never do live coding. Let's see. Then we try to understand, from a theoretical point of view, what really matters in a coding competition. Once we have this, we can do a programming language comparison to understand the advantages and the weaknesses of every language. And then we see what we have learned: what to do if I want to do these competitions for fun, and whether I can actually get something out of them for my day-to-day coding, or even for life.

So, Google Code Jam. That's the main entry point. It's a competition that runs once per year. It has several online rounds and just one on-site round at a Google location. It has a quite sizable prize, a total of $20K. Depending on the round, you usually need to solve from three to five problems. They are problems with known correct solutions. There are other kinds of competitions where the complete or optimal solution is not known, but you can judge whether one solution is better than another. Here it's on-off: either it's correct or it's incorrect. You can program in any programming language because you do it on your own computer. You download the inputs, you run the code on your computer, and you upload the output. An online judge decides whether the output is correct. The competition is time-boxed to a few hours. Two and a half hours is the typical length, but some rounds are longer and some are shorter. The runtime allowed for your code is a few minutes: four minutes for the small data set and eight minutes for the large data set.
So every problem usually has a simpler, small data set and then a more complex, broader, large data set. You get independent scores if you solve the first, or the first and the second. Finally, you get a ranking based on the problem score, that is, how many problems you solved and how difficult they were (you are given the scores in advance), and on your submission time. So you get points if you are correct, and you get an advantage if you do it fast.

Usually a problem is structured like this. You have a problem statement that describes the universe: actors, rules (usually parametric rules), and a question, that is, your task. Then you have limits. This is very important. The parameters and the initial conditions of the universe are bounded: we will not give you more than so many actors, 10,000 or 10 or something. And there are usually different limits for the small and the large inputs. This is the main difference. Typically you need less insight to solve the small input, and more steps and more insight to solve the large input. Then you have sample test cases with their solutions, so you can check whether the solution you have developed actually works on a sample data set. And then you have the real test cases, the ones that you download, and if you solve them, you get points.

So in problem solving you have two things to keep in mind. First, you have constraints. You have your CPU, RAM, and storage. This is what you usually think of as the important thing in solving a problem with a computer. And then you have the code running time: your code cannot run for more than four minutes or eight minutes, depending on the input. But then you have wall clock time. That is really the big part, because you need to solve as many problems as possible in the two and a half hours that they give you. And if you solve them faster, you are higher in the ranking.
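The ranking rule just described, points first and submission time as the tie-breaker, can be sketched as a sort key. This is only a minimal illustration, not the official scoring code: the `Entry` record, the competitor names, and the numbers are hypothetical, and the real competition may apply rules that the talk does not cover.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    # Hypothetical scoreboard record, for illustration only.
    name: str
    points: int          # sum of the scores of the solved data sets
    last_submit_s: int   # time of the last correct submission, in seconds


def rank(entries):
    """Order competitors by points (high first), then by time (low first)."""
    return sorted(entries, key=lambda e: (-e.points, e.last_submit_s))


board = rank([
    Entry("ann", 30, 5400),
    Entry("bob", 45, 8000),
    Entry("cy", 30, 3600),
])
print([e.name for e in board])
```

With equal points, the competitor who submitted earlier comes first, which is why wall clock time matters even though it is not part of the score.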
Coding phases: we will talk quite a bit about those. The usual steps are these. You read the problem and try to understand how to map the universe onto some algorithm that answers the question you are given. This is the modeling. Then you need to implement the algorithm, actually typing it in. Then, depending on the problem, you may need performance tuning or debugging. Debugging is anything that you do after you are pretty sure you coded it well and it doesn't work. Performance tuning is when you realize that the solution, the model that you have, is too slow. And then you have run time, which is how much time you need to download the input, run, and upload the output.

Now, to give you an idea of what this looks like, we will try a live training session. That is, we read the problem and we try to code it, download, run, upload, and see if it works. Hopefully we will do it in time. The problem that I'm proposing I actually did live in one of this year's rounds, and I solved it in seven minutes and 11 seconds, the first time that I saw it. Now I have already seen it, so hopefully it will not take long. And still, that was not enough to qualify: the first thousand competitors qualified and I was 1,146th, so tough luck. So now the hard part, which is switching this and going here. OK.

So, can you read the statement? In the problem statements the universe is usually described in some kind of funny or interesting way, so you don't lose the fun of it. This one is about Annie, a bus driver with a high-stress job. She tried various things, and in the end she wants to do some horseback riding for fun. Now it's describing our universe, and we need to capture how to turn this into code. Annie is riding her horse to the east along a narrow, one-way road that runs west to east. She is currently at kilometer 0 of the road, and her destination is at kilometer D. Kilometers along the road are numbered from west to east.
There are other horses traveling east on the same road. All of them keep traveling forever, all of them are currently between Annie's horse and her destination, and the i-th horse is initially at kilometer K_i and travels at its maximum speed S_i. Then it says that horses cannot overtake each other. Annie's horse doesn't have any maximum speed: she can go at whatever speed she wants, as long as she doesn't pass any other horse. So the problem is that we have horses running on a segment of a line and we cannot pass them. To ensure a smooth ride for her and the horse, Annie wants to choose a single constant cruise control speed for her horse for the entire trip, from the current position to the destination, such that her horse will not pass any other horse. What is the maximum such speed that she can choose? This is very typical. It's a very simple universe.

Then you have the description of the input, which is typical. You have the number of test cases, then for each test case a line with D and N, that is, how far you need to go and how many other horses there are, and then N lines with where every horse starts and what its speed is. The output needs to be a string where we say what the cruise control speed is. Limits: the interesting part is that for the small data set we have only one or two other horses, while in the large data set we have up to 1,000 horses. Then you have a sample input, and for this input you know what the output is.

Typically the problems are quite standard in this respect. This is my scaffold solution: I read the number of test cases, I loop over the test cases, I build some kind of result, and I print the result. If I run this right now, I just get zero because I don't do any computing. Now, this one is quite simple, or at least quite simple for me.
I know that, since horses cannot overtake, every horse has a precise time at which it reaches the final destination, and I just want to look for the slowest horse. That is the horse that will reach the destination at the latest time, and I want to get there at the same time as it does. That's easy. Even if you do not practice coding competitions, you can get it done, maybe in slightly longer than seven minutes, but it's very fast once you get the hang of it.

So first I need to read D and N. For every test case I get an input line and I split it into D and N, which is the order in which they are given on the first line. Then I have a number of horses, N, so I need to read N more lines, and I call input again. Again, these are two numbers, so I split the line and I map it with int. And here I need to wrap it in a list because map returns an iterator. There I don't need it, but here I do. Let's see if this works. Usually you do something very rough first. I just want to see that I can read the input, more or less.

Now, for every horse I can tell at what time the horse will be at D, my destination. I have K and S for horse i, and my result is the maximum of the result itself and D minus K, which is the distance remaining for the horse, divided by its speed. Python is very nice here, so this is already a float and everything. Let's see if this works. This gives me the time it takes, but I don't really want the time, I need to give the speed. So I need to divide D by the result. Let's see if it works. And yes, so this was fast.

Now, how do I know if this is correct? I need to download a practice input. I'm already on the correct page, so I replace the file. Now I'm really testing, not on the sample test case, but on the real stuff.
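The reasoning above (find the horse that arrives last, then match its arrival time) can be written down as a short sketch. This is not the code typed during the talk. The function names, the way the input is tokenized, and the `Case #x: y` output format with six decimals are assumptions, and the example input is made up for illustration. It also assumes every horse has a positive speed and starts before kilometer D.

```python
def max_cruise_speed(d, horses):
    """Return the highest constant speed that never passes another horse.

    d is the destination in km; horses is a list of (k, s) pairs with
    the start position k in km and the speed s in km/h.
    """
    # The horse that reaches d last decides when we may arrive.
    latest_arrival = max((d - k) / s for k, s in horses)
    return d / latest_arrival


def solve(text):
    """Parse a whole input file and return the output lines."""
    tokens = iter(text.split())
    cases = int(next(tokens))
    lines = []
    for case in range(1, cases + 1):  # cases are numbered from 1, not 0
        d, n = int(next(tokens)), int(next(tokens))
        horses = [(int(next(tokens)), int(next(tokens))) for _ in range(n)]
        lines.append("Case #%d: %.6f" % (case, max_cruise_speed(d, horses)))
    return lines


# Two illustrative cases: one horse, then two horses.
example = "2\n2525 1\n2400 5\n300 2\n120 60\n60 90\n"
print("\n".join(solve(example)))
```

In the first case the only horse needs 125 / 5 = 25 hours, so the answer is 2525 / 25 = 101. Because the work per test case is a single pass over the horses, the same code handles one horse or a thousand.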
During the competition, this is how I get the score. I need to get the small input. Yes, it runs very fast, so I save the output to the small output file. Now I can submit my solution, which is the small output, and I see: case one, case zero. OK, this is a very typical mistake. Now this is debugging. Sometimes you have a problem and you need to understand what the problem is and fix it. Let's see if this was the real problem. Submit. Correct, great. Now, in this case I already know that I can handle 1,000 horses as well as a few, so I just go and download the large case. I save the large input and replace the file name. Then I do the same with the large input, and I save the output to the large output file. I need to select it. Submit. Correct, great. So this is how things typically work for a very simple problem.

What is interesting is that first... Yes, thank you. First, I didn't hit any of the constraints. The problem was so simple that CPU, RAM, storage, and running time were really not a problem. For the large data set it was a couple of seconds. So the only thing that really counted in this case was wall clock time. It's not part of the score, but it is something that puts you higher in the ranking. If you look at the various coding phases: modeling was slower just now because I needed to read the statement aloud, but usually it's fast, and coding is usually fast too. Performance tuning was not needed in this case, and debugging was just testing the thing. OK, you have a small problem, but you fix it immediately. And run time was basically zero as well. So, seven minutes. You can submit a problem in five minutes. This is what the really good people usually do for at least one or two of the problems. And everybody has a threshold below which the problem is easy enough that this is your target time if you want to compete.
Then there are generally solvable problems. This is something that you are able to solve. It's not as easy and straightforward as this one, but it's something that you are able to solve. How do you recognize this kind of problem? Most of the time, especially when you begin, there is still no real need to worry about the constraints, because the main problem is the modeling. It can take from two minutes to 20 minutes to do the modeling, that is, going from the text to an actual algorithm, data structures, et cetera. Coding is, again, usually not the real problem. Performance tuning you usually don't need. It's very, very rare, only if you make some mistake at the beginning. And this is the real place where PyPy beats CPython, because you need even less performance tuning.

The problem is debugging. The real key, and the real reason Python has an advantage, is that if something goes wrong, if you made a mistake in the implementation or even in the modeling, if things don't work, you need to understand where, how, and when. The debugging part is where Python has a huge advantage. Well, first of all, the language that you know best is the one that has the advantage. But if you know more than one language very well, Python has the advantage of stack traces and a very powerful print statement. So debugging is where Python really shines. And it's the largest and riskiest of the coding phases, because maybe you don't need any debugging at all, even in C++, but if you do, it might grow. It's easy to get stuck for one hour on a problem that you know you can solve, just because you made a silly mistake somewhere.

So how do you recognize a simple but unsolvable problem? This really depends on you. If you keep an eye on the scoreboard, you see that the pros can submit a correct solution to a problem very fast, or at least it doesn't take them long.
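The two debugging aids named above, print output and stack traces, can be seen in a small sketch. The function and the data are hypothetical and the bug is planted on purpose; only standard-library calls are used.

```python
import traceback
from collections import defaultdict


def arrival_times(d, horses):
    # Deliberately fragile: a horse with speed 0 triggers ZeroDivisionError.
    return [(d - k) / s for k, s in horses]


# print shows standard containers in a directly readable form.
graph = defaultdict(list)
graph["a"].append(("b", 3))
print(dict(graph), sorted({3, 1, 2}))

try:
    arrival_times(100, [(80, 10), (70, 0)])
except ZeroDivisionError:
    # The traceback names the function and the line where the error occurred.
    report = traceback.format_exc()
    print(report)
```

In a competition you would normally just let the exception propagate and read the trace in the terminal; capturing it with `traceback.format_exc()` here only makes the output easy to inspect.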
So you may know from the scoreboard that a problem is simple in general, but it's not solvable for you if you don't know how to do the modeling. This is where there is really the big difference between an experienced programmer, or experienced competitor, and new people. The other part is debugging. As a competitor, you might in the end find the algorithm and the data structures that describe the problem, but if things are at the limit of your capacity, most of the time will again be spent in debugging. So these are the real two key points. The key asset is your time, and the way you spend your time is mostly modeling and debugging. The rest is, in the typical case, a fraction.

Then you have generally unsolvable problems. This is something that is completely out of scope for you, and you can easily recognize it because the pros are having a hard time solving it. So you may not care about these.

Language comparison. So why do people still use C++, while a lot of people are using Python? CPython and PyPy: there is a slight difference, and since we are interested in Python, I have kept them separate. In the modeling part, CPython is the best because you have access to a huge number of libraries. Sometimes the solution is just that you already know a library, and you only need to code how to describe the problem to the library, and the library solves it. With PyPy you have slightly less of an advantage because some of the libraries don't work. And with C++ and Java it's usually harder to install stuff, so unless you have a perfect setup you may not have the libraries, and interfacing with libraries is harder too. So many more competitors just go with a straight hand-typed solution, not going for a library.

Coding.
The coding time in Python is excellent because it's very terse and you can be very clean. With C++ and Java it's a little bit longer, but coding is not really that important. Unless there are thousands of ifs because you have a very strange problem, coding is not what makes the difference.

Performance tuning can in some cases be important, and this is the reason why the pros use C++. In C++ you usually don't need to do performance tuning because C++ is fast and the compiler does most of the optimizations. Java and PyPy have a JIT, a just-in-time compiler, so they are fine there too. CPython sometimes needs very stupid optimizations that you have to keep track of, and if you don't, it can be very slow. Usually they are very ingrained in our brains, so they are easy to do, like not repeating a len or a function call inside a loop, but you still need to keep them in mind, so it's not so good. The point is that debugging is where CPython and PyPy have the big advantage. And as we saw, that is the big, big risk.

So now we move on to what we learn from coding competitions. The first point is strategic thinking. The wall clock time is the paramount asset. This was not clear to me at the beginning. You need to optimize it: manage your time, and be sure you know which phase you are in and whether you are doing the right thing in the right phase. At a higher level of strategy, you have problem selection. Usually you should go for the simple problems first. This means that you bank the points that you can, the low-hanging fruit. And among the simple problems, you should usually consider the simple solution to the simple problem. Brute force works in a lot of cases in the first rounds of the Code Jam. And once you have done the brute-force solution, you have a working solution.
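A working brute-force solution can double as a reference for a faster one. The sketch below shows the idea on a toy problem that is not from the talk (largest sum of a contiguous run of numbers): a quadratic brute force is compared against a linear candidate on random small inputs. All names and the choice of problem are illustrative assumptions.

```python
import random


def best_sum_brute(values):
    """Try every non-empty contiguous slice: O(n^2), fine for small inputs."""
    best = values[0]
    for i in range(len(values)):
        running = 0
        for j in range(i, len(values)):
            running += values[j]
            best = max(best, running)
    return best


def best_sum_fast(values):
    """Kadane's algorithm: O(n), the 'fancier' candidate for large inputs."""
    best = current = values[0]
    for v in values[1:]:
        current = max(v, current + v)
        best = max(best, current)
    return best


def cross_check(trials=200, seed=0):
    """Compare both versions on random small inputs; return a failing case."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = [rng.randint(-10, 10) for _ in range(rng.randint(1, 8))]
        if best_sum_brute(values) != best_sum_fast(values):
            return values
    return None


print(cross_check())
```

If the two versions ever disagree, `cross_check` returns the small input that exposes the difference, which is usually far easier to debug than a failed submission on the real data set.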
The brute-force solution will probably not be enough for the large, but at least you can use it to test whether your fancier solution gives the same results.

Then you always need to keep focused and keep an eye on the scoreboard. That is, you need to check what the other competitors are doing, especially the pros, so that you know whether something is actually solvable or not, and you don't lose time on something that is too far out of scope for you. And you really need to try to work on problems that can make a difference for you. In one instance, I knew I had trouble with a simpler problem, so I set it aside and tried a more complex one. When I had solved the small for the harder problem, I saw that even if I went back to the simple problem and solved it, that would not have been enough to qualify. So I gave myself 20 minutes to think about how to solve the large of the hard problem, which no one was doing. It was really a gamble, but there was no way to win otherwise. After a while, I noticed that the problem was really mostly a trick: the same solution as for the small would have worked for the large, even if it didn't look like it. So at the very last moment I just tried. I was not losing anything because I would not have qualified anyway, and it worked. I had an insight, but really it was the only thing I could do right then to have a chance at winning. So strategy is a big player in competitions.

The other things that you learn are focus and stress management. At every moment you need to have both the round strategy and the problem strategy. You are working on a single problem, but you need to qualify in the whole round, so you need to move up and down in the strategy levels to be sure that you are employing your time as wisely as possible.
Then something that is very important is keeping focus so that you don't make mistakes while coding. Every mistake may cost you debugging time, and the worst you can do is make a mistake both in modeling and in coding. Usually coding mistakes are easier to track. If you make a mistake in modeling, it's much harder to go back. One of the points is that everything in modeling your problem and in coding should be kept as simple as possible. This is quite strange. You think you might do fancy stuff, but no, most of the time the plainest solution is the most easily debuggable, and so it's the one that qualifies you. Then, and this is a mistake that I make very often, you need to be ready to question all your assumptions. Sometimes you decide that a certain problem has a greedy solution, so you go with that and you never question it, and it was just an assumption. That is something that you need to learn again and again. Always go back and be sure that you didn't leave anything behind.

Then there is stress management. This part I somehow didn't think was so important. But you need to keep calm in the face of a ticking wall clock, and this is not easy. I have been competing for five years, and still, when the competition starts, I can feel my heart rate going up and I feel sweaty. Keeping this under control is really one of the things that you learn in this hobby, and it's very useful in all kinds of situations. There is also the fact that you need to keep calm in the face of the risk of public shaming, or public humiliation, because what you submit will be made public. The solutions in the Google Code Jam are public. So sometimes you know you're doing something really horrible that should be optimized, but time is what counts.
And in the end, this is what counts for you. If other people then download your code to learn from it, see it, and say, oh God, this is pretty horrible, that's fine. You still need to keep calm in the face of this.

Then there is something quite amazing, which is: never give up. This happens in all sports. But with coding competitions, this gets much closer to the way we program. Sometimes, as in the case that I described, if you have just one way to get out of a situation, you may try it. It may not work, but you know that you did your best. And if it works, you actually get out of your problem. In my case, I qualified. Also, you can imagine bmerry coding like crazy. The final code that he submitted six seconds before the end was something like three pages full of ifs and everything. And you need to say: OK, I go on doing this even if I'm very, very tight on time and almost certainly will not have time to run it and submit it. If you do it six seconds before the gong, it means that you really believed in it.

So how do I start, or how do I get better at competitions? You get this advice online: the most important thing is practice. The second most important thing is practice again. And so is the third. There is nothing more important than practice. But now, for real, something useful. First, practice: you need to practice real-world competitions from start to end with a timer. The wall clock, the pressure that you feel, the stress and the focus need to be real, even in practice. Then you need to go for volume first. Do a lot of simple problems, even if they are obviously simple. You need to get your coding muscles going. Then you go to problems that are closer to your limits. You need to figure out which problems are at your limit and go for those.
Then learn from the code of other competitors, and then learn libraries. There are a lot of libraries. These are the ones that I find most amazing in coding competitions, and a lot of libraries just make many things much simpler. Then you need to learn data structures and algorithms. This is the course that I followed, which is amazing, and I invite everybody to take it.

So what do I learn for everyday programming? Basically, the phases are the same: modeling, coding, performance tuning, and debugging. And the point is that performance tuning and debugging have a much longer time scale, so this is where you need to optimize your wall clock time management. Also, strategic thinking: always try to get the low-hanging fruit before doing the big architecture. And always keep an eye on the scoreboard, that is, ask what exactly is useful now, what is the most useful thing that I can do now. Finally, focus and stress management is the thing that I really like the most. If you have a bug in production, or you're doing a hackathon, you are time-boxed and you want to solve something as soon as possible. Keeping calm in the face of the clock, and keeping calm with people looking at you, as in client-facing problem-solving situations, is something that looks much simpler once you do coding competitions. Job interviews as well: you have someone watching every keystroke. Or if you're trapped in a fire, you need to say: OK, keep calm, I need to solve the problem, not panic. Then again, strategic thinking works exactly the same in your life, and keeping an eye on the scoreboard is also one of the most important things. So I thank you for your attention, and I'm ready to take questions if you want. Thank you.

Thank you for the talk. I have a very quick question. It could be a bit specific. Can you hear me OK?

I hear very badly. Can you speak closer to the microphone?
Is it better?

Yeah, better.

So this could be a bit specific. I'm a Python newbie. I mostly use C/C++. I was curious why Python is better for debugging. What are the advantages over...

OK. The first and most important point is stack traces. This is where Python shines. Stack traces are usually very, very useful, and you get them for free, immediately. The other point is that the print statement, which is really the only debugging tool that you use in coding competitions, is much more powerful. You get a very nice representation of a lot of the data types, especially if you use standard data types, which is what you do most of the time. And in case you are forced to go into the debugger, which you really want to avoid, you just do import ipdb, ipdb.set_trace(), and the introspection of Python helps you really a lot. I have not done much C/C++ in my life, so I'm probably not an expert in debugging it, but when I needed to port my Python libraries to C/C++ for performance reasons, even though I already knew what I wanted to do, it was a pain: every small detail needed much more time to debug.

OK, thank you for your talk. I wanted to ask, because you said that there is also a time limit for execution, something like four minutes for the small data set: how is this measured? Because you are downloading the input and uploading the output. So how is it done?

You have a timer. When you download the data set, a timer starts, and you can only upload the result within four minutes. If you go over, you're not able to upload the solution.

OK, so basically if you have a more powerful computer, you have quite an advantage.

Yes, and quite surprisingly, this doesn't count most of the time. In my whole career, I did a lot of practice and a few Code Jams, probably five, and I think only once did I really have a solution that ran in five minutes instead of four.
And that was down to a trivial mistake that I had made. So yes, it may happen that a faster computer or a faster language makes a difference, but it's quite rare.

All right, I think we've probably got time for one more question, if anybody has one. All right, OK, please join me in giving Alexander a hand. Thank you very much. And please remember you can...