Welcome to this presentation with simple advice on doing a bioinformatics PhD. If you're a new PhD student or a new PhD supervisor, it gives you simple advice on how to avoid disaster. I'll go over making a course plan, making a project plan, making sure the data foundation is in place, doing the actual project and, finally, publishing.
Starting with the course plan. Systems vary greatly in terms of how many courses you're required to do during your PhD, but in any case, earlier is better. The sooner you do the courses, the more you'll get out of whatever you learn on them. And you should focus on taking courses that are either directly relevant to the project you're doing or focus on transferable skills such as scientific writing, presentation technique and project management. You should not do more than required. There's plenty of work to do during a PhD and too little time, so I would advise against doing more courses than your university requires.
Next up, the project plan. The biggest single advantage of doing a PhD in bioinformatics is that you have time for doing multiple projects, which you often will not have if you're doing wet lab work. This allows you to improve efficiency by doing multiple projects in parallel and to reduce the risk of failure, simply because when you do more projects, it's less likely that all of them will fail. That is, of course, unless you make a bad project plan, one that is designed like a house of cards where you're stacking the risks on top of each other so that if the first project fails, all projects fail. To avoid that, you need to have a contingency plan in case projects depend on each other. Also, make sure to have a safe first project. It will help to reduce stress as well. Minimize dependencies between projects so that if one fails, it doesn't affect the others, and so that you can work in parallel: when you're stuck on one project, you can move forward on another.
Let's talk a bit about the data foundation.
The biggest disadvantage of doing a PhD in bioinformatics is that you rely on data, and this is often data either made by collaborators or data that you need permission to work with, like medical data. This means the data is not in your hands, and you don't control when it arrives. Data will always be delayed, often by more than a year, and that means that by the time it arrives, it will be too late for you. You'll be running out of time on your PhD and you won't have time to actually work with the data that you are supposed to work with during your project. Future data is not data. If you remember one thing from this presentation, remember this phrase. The way to deal with this is to have a contingency plan. You need to have a plan that is based on data that actually exists and is in your hands. Future data should be seen as a bonus. If you get it, be happy. But don't bet your PhD on it.
So how do you go about actually doing the projects? The first thing, which I cannot stress enough, is to look at the data. Don't be the arrogant person who says, "This is a matrix. I know how matrices work. I can analyze that without looking at the data." When you look at the data, you will find issues, and it will reduce the debugging you have to do later when your scripts fail because of features in the data that you hadn't thought about. Also, automate your analysis. You may think that something is a one-off analysis, but you will need to rerun it. And even if you don't, it will improve the reproducibility of your research, which is a good thing. Use version control. You will almost certainly mess up big time at some point and want to roll back your code to an earlier version. And even if you don't, it will improve the findability of your code when everything is under version control in a repository on, for example, GitHub. Create example data. Having some small data set where you know what the result should be is key to doing debugging.
And it's also very useful later when you want to do live demonstrations of your software and work with some small data set where the analysis doesn't take forever. And finally, focus on the science. You're doing a PhD. You are not a software developer.
This leads me to the topic of publishing. To get a PhD, you need to publish. And one of the things that often comes up is, should you write a review in the beginning of your PhD? And I understand why it comes up, but I have mixed feelings about it. You should definitely do a literature survey. You need to know what had been done before you started. And you need to write up what you find, because it will be part of the introduction of your thesis. However, there are often too many reviews around. So if there's already an up-to-date review covering the topic that you did your literature search on, don't write another unnecessary review. The reason why people recommend it is probably to get a safe first paper. And I can definitely understand that, especially if you're doing wet lab biology. But when you're doing bioinformatics, there are so many other, better ways of getting that first paper. You'll probably need to put together some data set. Why not compile a database and publish that? You may need to create a tool to carry out some task that you need in your project. Or if there are tools around, you probably need to benchmark existing tools to figure out which is the best one. All of these are viable safe first publications. On top of that, you will almost certainly get some collaborative papers. As a bioinformatician, you tend to be working together with experimentalists or other bioinformaticians on a large number of projects. That leaves us with having a main paper at the end that combines everything, and that project will typically be higher risk, higher reward. And that's perfectly okay as long as you don't need that paper to graduate because you already have plenty of other papers, including first-author papers.
That's all I have to say about doing a PhD in bioinformatics. If you want to learn more about how to actually get your work published as papers, take a look at this presentation next. Thanks for your attention.