Perfect. So you're going to talk to us today about the transition from Python 2 to Python 3, right?

Yes.

Okay, perfect. And are you ready to start then?

I'm ready.

Wonderful. So Michael Howitz, the floor is yours.

Hello, everyone. Welcome to my presentation. We have nearly one million lines of Python 2 code in production. And now? In summer 2016, we started to think about this question. I'm going to present how we solved this problem and what you can learn for your own big and small Python 2 migration projects.

Let me start by introducing myself. Oh, I see this is a slightly older photo. Well, I have been a Python software developer since 2003. I was the head of development in the Python migration project my presentation is based on. My employer is a small software and consulting company named gocept. It is located in Halle, Germany. We do not have our own products to sell, but we develop and maintain projects for our customers. Currently, we work on some migration projects, migrating them to Python 3, or we have even finished the migration. I'd like to say thank you to gocept for the time to prepare and to give this talk.

Now you know a little bit about me. Let me ask you a single question. I took this question from the Python 2 end-of-life survey done by ActiveState in the last quarter of 2019. So the question might sound a bit off now, but nevertheless, the question is: how prepared do you feel for Python 2 end of life? There should be a button for you to answer this survey in Zoom. So let's try it. I can answer the question myself.

That's nice. We have 48% of the attendees voted so far. 50%. Come on. It's 28 of you that haven't voted yet. 27.

Okay, so I'll try to present the results of the survey and then we can compare them with the results of the poll. The results of the survey were that about half of the participants felt highly prepared, but about a third did not feel prepared at all.
So I do not see the results of the poll.

No, but so far we have 27% not prepared, 45% somewhat prepared and 27% highly prepared.

Okay, that's even less than the survey results, at least for the ones who are highly prepared. So the end of life of Python 2 is already history. If you do not feel well enough prepared to handle that event, let's see what this presentation can give you. I scheduled the following four items for you: a discussion of approaches, an introduction to Union CMS, as I use it as the example project for the presentation, doing the migration, and tips and tricks.

Let's dive into a discussion about possible approaches: what to do with an application running on Python 2. Discussion here means that I am the only one who discusses. Sorry. I'm going to present the following five approaches: sunset the application, replace it by something off the shelf, don't migrate, start over or fade out, and move to Python 3.

Let's start with the first approach. Python 2 is past its end of life, maybe your application too. Ask yourself: how important is the business case that the application supports? How long will I or my company earn money using this business case? Can a migration be a success, even financially? This approach might be the cheapest and the safest one, if it is possible at all. But you should at least think about it. Don't burn money by touching an application which is not, or no longer, worth investing in.

Is your application so special that it cannot be replaced? Maybe you can even replace it by something off the shelf. It is probably cheaper to buy a service from a company than to do everything yourself. Yes, I know there was a time when it was totally cool to write your own tools. Nowadays, there are lots of specialized service providers who can do a better job. If you want to keep your tool, it's your baby and you have to take care of it. On the other hand, with an off-the-shelf product you mostly get only an 80% solution. Maybe you can work around the missing 20%.
One of our customers, for instance, is replacing big parts of their homegrown ERP system by products backed by companies: the bug tracker, the internal CMS, the calendar, and so on. You still have to migrate the data. I know this can cause big trouble, but you only have to do it once. After the migration of the data, it is the task of the service provider to keep the service running and to fulfill feature requests. They have to take the maintenance burden. Your users might need to learn a new tool. They might require some customizations. But being forced to tell the user "the new application cannot do that, let's find a way around it" might cause less stress than having to ask "when do you need this new feature?" Even the company I work for, as a software shop, has a slogan which roughly translates into English as: better buy it instead of developing it on your own.

The next approach might be dangerous: don't migrate. My boss won't like that I mention this approach here because it could ruin our business, or save it if you are forced to change your mind in two years. If the use case of your application will go away soon, if neither maintenance nor development of new features is required, and if there are no bugs to fix, this approach might be a fit for you. But be careful: don't use this approach on critical infrastructure which has run for years and which nobody wants to take care of. What will happen if the server running the application dies? Okay, it won't be a problem, it runs in a virtual machine or in a Docker container. But does it really? Will you be able to find help if you have problems with the code within the next two years? So be aware: it will become harder to keep an old application running and to find someone who wants to invest time into it, even if you have the money.

If it is all in ruins, maybe start over.
Do that directly in Python 3 or even in another language, and try to keep at least the high-level tests so you do not have to do everything from scratch. There are many people who say don't do this, and I am one of them. It might be interesting to start on new ground. In the beginning it looks like a great opportunity to do everything correctly this time, like you read it in the book. Often this will become a great nightmare, at least for big enough projects. You can do this for a microservice, but not for a monolithic application. There might be exceptions to this rule of thumb, but why do you think you are the exception?

Rewrite projects are the ones which rarely succeed. Only rewriting is never enough. The new application should also be a lot more flexible and have a ton of new features, and you still have to migrate the data. Or do you dare to use the same old database? During the rewrite the old code still has to be maintained. Change requests now have to be implemented in both variants, the old one and the new one. You have to learn the use cases which led to the former application again, or do you have up-to-date requirements documentation? Users will not be happy with the new application. It is always like that, because it behaves differently. The old one they used to complain about will over time become the secret weapon to get things done. The minimal viable product for the new application will be quite big, especially as there is an old working one the users know and trust.

There is a sub-approach to start over named fade out: replace the parts of the old monolith one by one. This might work, but wouldn't it be easier if the monolith already ran on a recent Python version? Then you do not have to think in different Python versions if you have to work both on the monolith and on its successor.

If nothing else helps, you could even try to move your code and your tests to Python 3. For me this is the most interesting approach, because I love legacy code.
I admire the people who created it, and if I can make it even better it is a great opportunity. Of all the approaches I showed you, this one I think is the most future-proof. It uses what is already running in production and only transforms it. It can be challenging, but until now it was possible for each project we touched as a company.

Let me introduce Union CMS. Union CMS is a content management system once written for ver.di, a German trade union. Its first version went into production as early as 2003. It is now also used by the DGB, also known as Deutscher Gewerkschaftsbund. Union CMS is a multi-site, multi-user content management system. It has thousands of content editors and it serves its content to some million visitors. Sorry, the source code is not publicly available.

When we first touched the source code in 2009, it was based on Zope 2 and the Zope object database, ZODB. It was probably running Python 2.6, as Python 2.7 was not yet released in those days. Over the years we developed many new features for the CMS, so finally we ended up having nearly one million lines of Python code, counting both the core of the CMS and the projects using the CMS, which are built on top of the core.

The customers are interested in keeping the CMS running and maintainable for the coming years, so there was no way to sunset the application. Not migrating was also a no-go because of security considerations. Replacing it or starting over was not an option either: the customers had invested much time and effort in the previous years to train the editors and to create or migrate the content from an older Union CMS version. We were happy to migrate the core and the projects built upon Union CMS to Python 3.

Now let's take a look at the migration steps which worked well for Union CMS. I believe they can be used in Python 3 migration projects of any size. I'm going to present five main steps. Step one of the migration is general preparation.
This means: have someone who knows Python 2 and Python 3 and the differences in between. There are enough tutorials on the web, so I'm not going to cover this here. It greatly helps to have decent test coverage. Yes, this means you have to port the tests too, but the test coverage gives you confidence that your code still runs. In Union CMS, we currently have about 88% test coverage of the Python code. When we started, Union CMS had less test coverage. It increased during the migration project. Sorry, I do not have the actual numbers from where we started. Currently, the 88% are good enough.

Annotating each function with the expected types for the input and the output, also known as type annotations, might help. Checkers like mypy are able to find problems by statically analyzing your code, but in Python 2, type annotations have to be comments instead of being part of the language. So in Union CMS, we decided against using type annotations for the migration. It seemed too much hassle for an unknown gain.

Step two of the migration: clean up code and tests. This means all tests should be running and passing. Broken or disabled tests are generally not a good idea. You should try to fix them, or delete them if nothing helps. Some parts of the code are probably unused. You should not port them. The parts that are used will be difficult enough to port, so there's no need to work on unused code.

In Union CMS, it required a lot of detective work to find code which was no longer needed. There were Python modules which were not even imported, so they had no .pyc files. There were classes and functions which were not used anywhere. We had the advantage of knowing the code base relatively well, so we could find that code, or at least hopefully most of it. There was a presentation at this conference about the joy of deleting code; I think it was called something like that. In that talk, you can find ideas on how to find and delete this code.
Removing that code should be done like a surgeon: symbol by symbol, function by function, class by class. Remove everything that only this code needs: the imports, the global variables, the registrations, etc. Each deleted line does not need to be ported and supported later on.

Step three of the migration is: bring the dependencies to Python 3. That means you need a list of all direct and transitive Python package dependencies of your application. Transitive dependencies are the dependencies of the dependencies of the dependencies, you know it. How to get this list depends on the tools you use. Only two examples: if you're using pip to install your dependencies, call pip freeze to get this list. Union CMS uses zc.buildout. It lists all the needed dependencies in the generated run scripts, so we got them from there.

Each dependency has to be checked for Python 3 compatibility individually. This means: take a look at the Python Package Index, also known as PyPI, for the package and see if there is a version which declares Python 3 support. I know this step could be automated. But looking at the package gives you a feeling for the package. When was the latest release? Does the package still seem to be actively developed and maintained? Which version is the last one which supports both Python 2 and Python 3? On the next slide, I'm going to suggest porting to Python 3 while keeping Python 2 compatibility, at least for a while. So you need to know this version.

This step could also be used to inspect the stack you're building upon. Possibly you will find packages which should be replaced or which have a maintained successor package. At the end of the step, you have a snapshot of the versions you're using, the package versions needed for Python 3 compatibility, and the maximum package version numbers to keep Python 2 compatibility. When you know the needed versions, you can use them.
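As a rough illustration of the per-package check described above, here is a minimal sketch. It assumes that a package declares its supported versions through the standard `Programming Language :: Python :: ...` trove classifiers shown on PyPI. The helper names `declared_python_majors` and `porting_status` and the sample package data are made up for this example and do not come from the talk.

```python
PREFIX = "Programming Language :: Python :: "


def declared_python_majors(classifiers):
    """Return the set of Python major versions (e.g. {2, 3}) a package declares."""
    majors = set()
    for classifier in classifiers:
        if not classifier.startswith(PREFIX):
            continue
        # Remaining part is e.g. "2.7", "3", "3 :: Only" or "Implementation :: CPython".
        first = classifier[len(PREFIX):].split(" ")[0].split(".")[0]
        if first.isdigit():
            majors.add(int(first))
    return majors


def porting_status(classifiers):
    """Classify a package as 'both', 'py2-only', 'py3-only' or 'unknown'."""
    majors = declared_python_majors(classifiers)
    if {2, 3} <= majors:
        return "both"
    if 2 in majors:
        return "py2-only"
    if 3 in majors:
        return "py3-only"
    return "unknown"


# Hypothetical dependency list; real data would come from PyPI or pip freeze.
SAMPLE = {
    "legacy.pkg": ["Programming Language :: Python :: 2.7"],
    "bridge.pkg": [
        "Programming Language :: Python :: 2.7",
        "Programming Language :: Python :: 3.7",
    ],
    "modern.pkg": ["Programming Language :: Python :: 3 :: Only"],
    "silent.pkg": ["Programming Language :: Python :: Implementation :: CPython"],
}

for name in sorted(SAMPLE):
    print(name, porting_status(SAMPLE[name]))
```

For distributions that are already installed, `importlib.metadata.metadata(name).get_all("Classifier")` (Python 3.8 and newer) returns the declared classifiers. A missing classifier only means the support is not declared, so a manual look at the package is still worthwhile.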
Make sure you run on the newest versions which are Python 2 compatible. This might require some changes in your code so it works with these newer versions. This, of course, also means: run on the latest version of Python 2.7. I think it's Python 2.7.18, which was released this year.

There might be dependencies which are not yet ported to Python 3. You could try to replace them with other packages. According to the already cited ActiveState survey, more than half of the participants expected finding replacement packages to be a challenge in the migration. Sometimes you are the chosen one to port a package to Python 3. You could see that as an exercise for the migration of your application. Maybe it's even enough to port the part of the package you actually use.

To get a clear plan of where to start porting your application, you might need a dependency tree of the packages belonging to the application. It should at least exist in your head: the packages without dependencies, the packages with circular dependencies, which need to be ported together, at least the interdependent parts, and the packages which need the whole core to be ported before. In Union CMS, we knew the dependencies well enough, so there was no need for an explicit dependency graph. Unfortunately, this means I cannot suggest a good tool for creating that dependency tree. There is one which I have used more than once. It's called tl.eggdeps, but it might not fit your use case.

Let's move to step four: migrate the code. There are several tools to help you. Modernize is a package created by Armin Ronacher, the developer of Flask and many other Python packages. It is used to convert Python 2 code so it can run both on Python 2 and on Python 3. To achieve this goal, it adds a dependency on a library named six, as two times three is six. Depending on this six library makes it pretty easy to drop Python 2 support later on, as the code needed for compatibility is marked by using this library.
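To make the idea of marked compatibility code concrete, here is a minimal sketch of the pattern. The six library provides names such as `six.PY2`, `six.text_type` and `six.binary_type`. So that the example runs without installing anything, it defines local stand-ins with the same meaning, and the helper `to_text` is a made-up name for this illustration.

```python
import sys

# Local stand-ins for six.PY2, six.text_type and six.binary_type (assumption:
# in real code these would be imported from six instead of defined here).
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821, this name exists on Python 2 only
    binary_type = str
else:
    text_type = str
    binary_type = bytes


def to_text(value, encoding="utf-8"):
    """Return value as text on both Python versions, decoding bytes if needed."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected bytes or text, got %r" % type(value))


print(to_text(b"caf\xc3\xa9") == u"caf\xe9")
```

Because every compatibility decision goes through a small set of well-known names, a search for those names later finds the places that can be simplified once Python 2 support is dropped.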
I only have to grep for six. Although this step automatically changes the code, we had very few problems with the changes it created. We were happy about this first step running automatically, as it was clear the next ones would require manual work.

There is an alternative to modernize called futurize. It adds a dependency on a library named future. I personally do not like this library. It seems not as lightweight as six, which provides only a single module. Future has many modules, and when looking at the imports you are not always sure: are you importing from the standard library or are you importing from the future library? It feels like future is not made as a temporary dependency which can be removed with ease after the migration.

There are some problems which cannot be automatically fixed, but they can be detected. For instance, in Python 3 a class providing the method `__eq__` for equality comparison also has to implement `__hash__`. Pylint is a tool to find such issues. It has a mode where you can detect them. You need to install a Pylint older than version 2 on Python 2, because newer versions are Python 3 only and no longer contain the needed checkers. It will tell you about some problems that even your tests will not find but which can cause trouble when running the application.

Now the heavy lifting follows: run the tests on Python 3 and fix them until they pass. There might even be import errors which already disturb the test collection. Many of them will be due to implicit relative imports, which are no longer supported on Python 3. Running the tests will show all the problems modernize and Pylint could not find. For instance StringIO versus BytesIO. And if you imported `unicode_literals` and need native strings in both Python versions, you are forced to remove this `__future__` switch. Ask a web search engine of your choice about the specific problems you encounter. There are already solutions for most of them.
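The `__eq__` and `__hash__` issue mentioned above can be reproduced with a small sketch. The class names are invented for this illustration. On Python 3, a class that defines `__eq__` without `__hash__` gets its `__hash__` set to `None`, so its instances can no longer be put into sets or used as dictionary keys.

```python
class PointWithoutHash(object):
    """Defines __eq__ only: hashable on Python 2, unhashable on Python 3."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return (isinstance(other, PointWithoutHash)
                and (self.x, self.y) == (other.x, other.y))


class Point(object):
    """Defines __eq__ and a matching __hash__, so it works on both versions."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):  # Python 2 does not derive this from __eq__
        return not self == other

    def __hash__(self):
        # Equal objects must have equal hashes, so hash the compared fields.
        return hash((self.x, self.y))


def is_hashable(obj):
    """Return True if hash(obj) works, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable(PointWithoutHash(1, 2)))  # False when run on Python 3
print(is_hashable(Point(1, 2)))
print(len({Point(1, 2), Point(1, 2)}))
```

A test suite that never puts such objects into a set or dictionary will not notice the difference, which is why a static checker is useful here.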
I added two example pages in the links section on the last page of the presentation. If you are using pytest to run your tests, the plugin pytest-quarantine can come in handy. It allows you to curate a list of failing tests. This way you always know which tests you still have to fix, and you see if you broke tests by your changes which were successful before. You are even able to commit the quarantine list to a repository so multiple developers can work on fixing the tests. Make sure the tests still pass on Python 2, so you are always able to deploy to production.

In Union CMS it was really helpful to add comments to the code branches which are only needed by a single Python version, marking them with `# PY2` or `# PY3` respectively. This made it easier later on to remove Python 2 compatibility code which had nothing to do with the six library.

Okay, step five: switch to Python 3. This means: run your application instance against an at least fairly empty database. This way you make sure that at least the instance starts. Sometimes, and we already had this in our company, the tests run fine but the instance does not start. You might find some Unicode versus bytes issues while clicking through the application. Maybe you should write some additional tests for the problems you find when the application runs.

It might be time to migrate the data. Union CMS uses the ZODB. This database stores pickled Python objects. Pickles behave differently on Python 2 and Python 3. If you store an object containing an instance of str on Python 2, it can have an arbitrary encoding, as it is actually bytes data. But if you read this pickle on Python 3, it expects the data to be UTF-8 encoded, as it expects str to contain text data. So you have to convert these pickles. If you are using a relational database, you are probably lucky and able to skip this step.

Now you can deploy the application to a staging environment to test it in depth, end to end.
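The pickle difference described above can be reproduced with the standard library alone. This is only a sketch using the `pickle` module, not the ZODB tooling used in the project. The byte string below is a hand-written protocol 2 pickle of the kind Python 2 produces for the `str` value `'caf\xe9'` (Latin-1 encoded bytes), and the function names are made up for this example.

```python
import pickle

# Protocol 2 pickle of a Python 2 str holding the Latin-1 bytes 'caf\xe9':
# PROTO 2, SHORT_BINSTRING of length 4, BINPUT 0, STOP.
PY2_PICKLE = b"\x80\x02U\x04caf\xe9q\x00."


def load_as_bytes(data):
    """Load Python 2 str objects as bytes, leaving the decoding decision open."""
    return pickle.loads(data, encoding="bytes")


def load_as_text(data, source_encoding):
    """Load Python 2 str objects as text, decoding with the given encoding."""
    return pickle.loads(data, encoding=source_encoding)


def loads_with_defaults_fails(data):
    """Return True if a plain pickle.loads() cannot decode the old str."""
    try:
        pickle.loads(data)  # the standard library default encoding is ASCII
    except UnicodeDecodeError:
        return True
    return False


print(load_as_bytes(PY2_PICKLE))
print(load_as_text(PY2_PICKLE, "latin-1"))
print(loads_with_defaults_fails(PY2_PICKLE))
```

The sketch shows why a conversion step is needed: the pickle itself does not record which encoding the old `str` used, or whether it was text at all, so someone has to decide that per attribute.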
Even with 100% test coverage, some issues might slip through. Union CMS was tested by the editors who use it in their day job. They even wrote a test manual for an external tester to test the CMS on the different installations, because it was too tedious for them to do this. Be aware that the testers will find bugs the automatic tests did not find. Additionally, they will find bugs which already existed before the Python 3 migration was started. You have to think about how to deal with them.

After a successful test on staging, it's time to roll out to production. Maybe you can do a canary rollout. This means: let only a small fraction of the users hit the server running on Python 3, fix the occurring errors and then increase the portion of users which see the Python 3 version over time up to 100%. In Union CMS this approach was not possible. As I said, we had to convert the database to be usable on Python 3, so it was no longer usable on Python 2. So we stopped editing in the CMS for a weekend, converted a copy of the database and went back online after the conversion. The rollout went smoothly thanks to the many testers in the staging environment.

There is one final step: drop the Python 2 support. An application can only run on a single Python version. After the successful deployment on Python 3, the Python 2 compatibility code is no longer needed. The code branches checking for the Python version might cost at least some performance. Most of the checks will be done at import time, but some of them have to be done at runtime, and comments in the code about specific Python versions might distract the developers. It seems to be the best way to drop Python 2 support in a coordinated way, instead of doing it whenever the code is touched the next time. There is a tool called pyupgrade which can be used to remove most parts where the six library is used, so this can be done automatically.
Additionally, it is able to convert code to modern Python 3, including the usage of f-strings. After running this tool you can remove the imports of six, which it does not do automatically, at least not the version we used, and you can clean up the places you might have marked with PY2 or PY3 as I suggested. Do not underestimate this step. Even reading through the diff when removing Python 2 compatibility is a lengthy task. Don't forget to update the dependencies to the newest Python 3 only versions.

Some words about the time schedule. We used the following metrics to compute the effort for porting Union CMS to Python 3. We first ran python-modernize and, after checking in the changes, Pylint. For each warning Pylint showed we calculated 30 minutes to fix it, because these are the hard things. In most cases we needed less than the calculated 30 minutes. For each doctest we had, we calculated 20 minutes, because we knew from previous projects that some of them had to be understood and rewritten as unit tests to support both Python versions. The tests which had to be converted to unit tests took more than 20 minutes. We calculated five minutes for each Python module, including test modules, and two minutes for each Python module python-modernize had touched, to maybe fix the mess the tool created. Plus 135 minutes as a base effort per installable Python package. This means a package with a setup.py which can be installed via pip. These numbers added up to a budget which was more than enough to migrate the code.

We created a plan for the whole migration. We planned 19 working phases: one iteration each month with two weeks of implementation and testing. So there was still time for bug fixes and new features outside the migration project, because the customer and we did not want to stop everything only for this migration.
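The estimation rule described above is simple arithmetic and can be written down as a small function. The per-item minutes are the ones given in the talk; the function name and the input numbers in the example are invented for illustration and are not the Union CMS figures.

```python
# Minutes per item, as stated in the talk.
MINUTES_PER_PYLINT_WARNING = 30
MINUTES_PER_DOCTEST = 20
MINUTES_PER_MODULE = 5
MINUTES_PER_MODERNIZED_MODULE = 2
BASE_MINUTES_PER_PACKAGE = 135


def estimate_minutes(pylint_warnings, doctests, modules, modernized_modules,
                     packages=1):
    """Estimate the porting effort in minutes for one or more packages."""
    return (
        MINUTES_PER_PYLINT_WARNING * pylint_warnings
        + MINUTES_PER_DOCTEST * doctests
        + MINUTES_PER_MODULE * modules
        + MINUTES_PER_MODERNIZED_MODULE * modernized_modules
        + BASE_MINUTES_PER_PACKAGE * packages
    )


# Hypothetical package: 100 Pylint warnings, 40 doctests, 200 modules,
# 150 of them touched by python-modernize.
minutes = estimate_minutes(100, 40, 200, 150)
print(minutes, "minutes, about", round(minutes / 60.0, 1), "hours")
```

For the invented package this gives 3000 + 800 + 1000 + 300 + 135 = 5235 minutes, a bit more than 87 hours. As noted in the talk, the Pylint warnings tended to need less than the calculated time and the converted doctests more.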
Be aware that these 19 working phases were not only the migration from Python 2 to Python 3: we also migrated from Zope 2.13 to Zope 4 as the underlying main dependency, the application server. Additionally, we fixed some technical debt as we were working on the code. And actually we deployed a version running on Python 3 after the 15th iteration, so we were a bit faster than expected.

We originally planned to roll out after each iteration, so after each month, but the customer was not able to test the whole thing that often, and they did not trust our automatic tests that much. So we had a rollout every three months. Each rollout shipped the current status of the Python 3 migration, as it was still Python 2 compatible. Even the big first rollout on Python 3 could have been switched back to Python 2 if it had failed utterly on Python 3, so the bug fixes would have been deployed, but on Python 2.

Now I'm going to give you some tips and lessons learned. The migration project might take a while, so aim to use a recent Python 3 version so you do not have to start the next small migration project the day after the rollout. Python 3.7 seems to be a good starting point. Newer Python versions might make it a bit harder to still support Python 2. There are discussions about keeping it easy to run such code on newer Python 3 versions, but it might be a problem.

Merge often to the main branch, so do not use long-running branches. I suggested that your code should run on Python 2 and Python 3, so there's no need to keep it separate from the main branch, as that produces enough new problems.

Go the route of small steps. Your project might seem small enough to directly migrate to Python 3 without the intermediate step of supporting both versions as I suggested, but we did this on a seemingly small project and we got burnt.
Updating the dependencies to their Python 3 versions together with updating the code to Python 3 was too much to get it running. So we had to step back, port to the new dependency versions and analyze the problems on Python 2, as we knew this was a version which was running before, and then we could port again to Python 3. So it cost at least time and money.

The Python 3 migration of Union CMS took about one and a half years. It could have been done in a much shorter time, but we wanted to be sure that it was always in a deployable state and that features and bug fixes were always possible during the whole migration project without breaking what we had already achieved in the migration process. If you are still running Python 2.7 or even older in production, it should not matter if you keep doing so for some extra months to gain the extra certainty that the migration will not break everything into pieces.

That's all. Thank you for listening. I hope you enjoyed the presentation and have some nice takeaways for your own projects.

Well, thank you, Michael. Thank you very much. First of all, I'd like to share the result of the poll, which just finished now: we have 38% on somewhat prepared, 38% on highly prepared and 24% on not prepared. So I think our audience is a little bit different from the one you showed at the beginning. There we go. You can have a look at the results now.

Okay, thank you.

No problem at all. And then we also have a question from Ivan. So Ivan is asking: how would you balance the involvement of QA people, QA engineers, testers, test automation engineers during the process of rewriting the service application? How would you find the right balance between enough test validation without slowing the overall progress, but avoiding the typical scenario of leaving all the testing to the end?

Okay, that's an interesting question.
We were a team of about five people, so we did not have that many job titles, and the idea was to test each step automatically and by the customers. So there was no QA team. We are not such a big company, so I do not have any experience of how this would work in a bigger scenario.

Perfect. Very good. And then we had Enzi asking: are there any tools for adding tests for otherwise untested microservices?

A tool which automatically writes tests. It would be nice to have such a tool. There are ideas. I think there's even a talk at this conference about how tests nearly write themselves. I didn't attend it. So the way you can do it: you can take the code and write tests for the expected values and for the code branches, especially for the happy path, and then maybe add some for error handling.

Perfect. Thank you. Then David is asking: did you run any automatic tools on the complete code base and then hand out the parts to fix as work items, or did you run it file by file as work items?

We ran it on each package. Union CMS has about 30 pip-installable packages and we ran the automatic tools on each package. There are some which are quite big, so it took a relatively long time to run the tools, read through the diff and commit it, but it worked relatively well.

Perfect. Cool. And then the last question for now, I think, is from Keith. So how did you prioritize developer time between new feature development or bug fixes as opposed to the effort on the migration? Was there pressure from management to pause or delay your work on the migration for business reasons?

We prioritized by blocking weeks in the calendar for doing the migration.
So the idea was to have the migration at the beginning of the month so that the other weeks were free for bug fixing and development. We did not feel too much pressure to delay the migration, because it was important for the customer to be sure they are running on a secure environment which is updated to the current Python versions, which they wanted for security considerations.

Perfect. And that is the end of our Q&A and also the end of our talk. Michael, thank you so much. Thank you for coming and sharing with us some thoughts on testing and migration from Python 2 to Python 3.

Thank you. It was a pleasure.