All right, folks. We are at the last keynote of EuroPython 2021, and it is a great pleasure and an honor to introduce Joannah. Joannah is a very well-known member of our community. She was born in Uganda, Africa, and she now, I believe, lives in Canada. She is a Python core developer, a published author, and a director of the Python Software Foundation. She is going to tell us about Python, the bad parts. So I would really like to welcome Joannah here. Thank you so much, Joannah, for taking the time to prepare your talk and to be here with us. I leave you the stage, and I'll add your slides. This is a 45-minute talk, we are all very excited about it, and we wish you the best of luck in delivering it. Take it away, Joannah.
All right. Thank you. Thank you, everyone. As you've already been told, my name is Joannah, and today I'm going to give a talk titled Python, the bad parts. I'll start with a bit of what has already been said, because I had already planned for it. I'm originally from Africa, from a country called Uganda and a city called Kampala. I had to put Africa on the slide because most people think Africa is a country. Africa is not a country. Uganda is a country. I left Uganda two years ago to come to New Brunswick in Canada, where I'm currently staying and doing my PhD in collaboration with IBM and the University of New Brunswick. In terms of other things, I'm a Python core developer. I've been one for around 1.9 years, so one more month and it will be two years as a CPython core developer. I'm also a director of the PSF, and I will be talking a little bit about my work and my plans for the Python community. But first I would love to take the time to thank everyone who entrusted me with their vote to be a director of the PSF. I do not take it lightly, and I really appreciate the trust you put in me to be a director for this year. I'm excited for the many things that we are going to do during my term as a PSF director.
In my other life, as I hinted, I'm doing a PhD. My research is on garbage collection in Python. I look at garbage collection both for the reference implementation, CPython, and for alternative implementations like PyPy. I am funded by IBM under the Center for Advanced Studies at the University of New Brunswick. I authored Python 2 and 3 Compatibility some years ago, when the subject was probably very hard. I don't know if the subject is still hard right now, but you can check out the book if you want. I'd love to start by saying I'm really honored to be here, and I would love to thank the organizers of EuroPython. This is my first time attending a Europe-based Python conference, or even speaking at one for that matter. I spoke at a Ruby conference called EuRuKo many years ago, but I had never spoken at a Python conference in Europe. So I'm really honored to be on the speaker lineup for this year, and I don't take it for granted. So let's start with why we are actually here. Why do we attend conferences? Specifically, why do we come to PyCons? I started to attend PyCons in 2016, and my first PyCon was in South Africa. It's called PyCon South Africa, and it was held in Cape Town. That was my first encounter with a PyCon, and I didn't really know why I was going to one. Anyway, I got there, I talked to so many people, and I asked them: why do you come to PyCon? Why do you even attend conferences in the first place? The people there were, of course, Python users. It was a whole Python community. But they were not just a community united by the fact that they use Python; they were also friends in general. And so my conclusion after my very first PyCon, in Cape Town in South Africa, was that we all attend PyCons to celebrate the awesomeness of Python.
We celebrate Python as a language in different ways. Sometimes we come as a trip. It could be a vacation. But if we end up at a Python convention, in a PyCon hall, we are usually there to celebrate how awesome Python is, and the great things that Python as a language, as a community, or as an ecosystem has enabled us to do. So we celebrate Python's awesomeness first as a language. I started to use Python in 2013, or actually 2010, when I was an undergrad student. I had different reasons for using Python, but initially I was just forced to use the language. It was taught in class, so I was given assignments in Python. It was not fun then, because the whole class failed. The professor made it very hard, so the whole class failed the assignments. However, after failing two assignments, towards our final exam I read a whole Python book from cover to cover, and then I was able to pass the final exam. That's a story for another day. But I just fell in love with the syntax and the simplicity of the language. And I do not hold this belief alone. So many people have talked about how Python is awesome as a language, how it's used everywhere, how it's beginner friendly. I think I initially identified with the second reason. The first reason I started to appreciate as I became an expert. But also, Python as a language has helped us personally. I can speak for myself, but for many people in the audience as well, it has helped you find a job just because you know a skill, and you're able to pay your bills or support your family because you know Python or because you're using Python as a tool in your career. Python as a community is also awesome, as many people have said. Most popularly, Brett Cannon said that he came for the language. He was interested in Python the language.
But he ended up staying because of the community. And so many of us can attest to that personally. I started as a user, and I wasn't very involved. Then I started getting much deeper into the language because I had some research to do. However, I stayed because people in the community were interested in my contribution to the community, but also to the language as a whole. Personally, I was mentored by Victor Stinner to be a Python core developer. His mentorship did not only help me become an expert as a Python core developer, it also helped me in my professional life in general. And I have made so many friends, not just on a technical level but genuine friends in the community. So aside from the language or the technicalities of the things we talk about, the Python community is awesome as well. Python as an ecosystem is also awesome, and I, and probably you, can attest to it, because we are able to use Python the language mostly because of its ecosystem. By ecosystem I mean the different libraries. Python is a very popular language right now, with so many libraries literally simplifying our work in industry or academia or anything else we're doing. There is a vast number of Python libraries for scientific programming, there are alternative implementations like PyPy, Pyston and so many others, and there are libraries for the most popular or cutting-edge technologies like machine learning and deep learning. So the Python ecosystem has empowered us, its users, with great tools that simplify our lives in our work, on our projects, and in our businesses. And so again, at PyCons and in our daily lives as Python users, we also celebrate the ecosystem because of how awesome, how rich, how big, and how useful it is. However, today I want us to take a step back and talk about a different view of Python.
I've talked about its awesomeness and how it's been a great language, and there is no argument about that. Python is already good. Even as it is now, it's a good language. But I want to talk about some of the things we could improve, most especially as a language, but also as a community in general. My original title was Python, the Bad Parts, and then I thought about it. I had already submitted the title, so I couldn't change it, but as I was preparing my talk I said, maybe this title is actually going to trigger some people, mostly Python core developers, because my talk is mostly about the language itself. So I said, okay, I'm going to change it a little bit. Instead of looking at whatever I'm going to talk about as criticizing Python or airing Python's dirty laundry, maybe we could change this title and look at it as a reflection on Python's potential. Python is already doing interesting stuff, but today I want you to join me in reflecting on what Python can be in this day and age. Python is about 30 years old, and I'm sure Guido van Rossum probably started working on Python before I was born. So I may not be the right person to be criticizing it, and I'm sure Guido van Rossum had the best ideas for the language, because we are still using it today thanks to the many good decisions he made back then. However, 30 years have gone by since then, and I think now is also a good time to start talking about what Python can be, and what new things or what path Python could be taking right now. It's not necessarily a criticism. I think it comes from love, because I'm an avid Python user and I believe all of you are. So the intention of this talk is not to talk down to the people who put a lot of effort into the language. It's to have a quick reflection on what it could be.
What more could Python the language be? I'll start with a brief discussion of what successful languages are, based on a paper that was written a couple of years ago by two authors, Leo and Ariel. They did an empirical analysis of programming languages and gave us some insight: successful programming languages are not successful because of the new or complex features that are released all the time. Surprisingly, there are instead about three key things that make a language successful. One is accessible libraries. Python is a successful and popular language partly because it has accessible libraries. PyPI, the Python Package Index, the Python cheese shop where all our libraries live, is a very rich collection of Python libraries that Python developers have always used and continue to use. Second, according to the authors of this paper, languages are successful because there is proof of usage. There is no denying that Python is successful here. It meets this criterion because there is Python code running everywhere. There is Python code now running on Mars. NASA is using Python code. There's Python code in machine learning. Python code is being used in all walks of life, for web and scientific programming and things of all kinds. So there is proof of usage, and by this definition Python is already a very successful language. Third, sorry for that, successful languages are also successful because of the experience they give to their programmers. Again, Python has been hailed for the simplicity of its syntax, because it allows beginners to experiment easily, and also allows people to rapidly prototype ideas and bring up solutions in a very short amount of time. So by this definition, Python is popular and very successful, and we cannot argue with that.
However, the success that Python has attained can only be sustained by relevance. Like I said, the problems we were solving 30 years ago are probably not the same problems we are solving now. So as the core team of the Python language, or as its users, because Python is open source and everyone has a responsibility to make it better, we need to be looking at how relevant the language remains as time goes by, as hardware changes, as technology changes, and as our needs evolve. The success of Python is only going to be sustained by how it grows to stay relevant with the times. 40 years from now, we should probably not be having the same version of Python, and I have hope that we are always evolving. Today I want to talk about a few challenges that Python is facing. I don't want to talk about all of them, because if I started a thread on Twitter and asked people why they hate Python, there would probably be a million reasons. So today I will just talk about five areas where I find that Python could be improving, informed by my own understanding and by some discussions that have taken place on different mailing lists. I don't promise to cover every problem, and I don't claim these are the only priority problems right now. I also have to preface this by saying that it's my view. I'd love to start with performance. Now, Python is not C, so we are not expecting C-level performance in Python. Being a dynamic, interpreted language, it also has its limitations. But I believe there is still room to improve the performance of Python.
Python shares a paradigm with other languages, and I believe we can borrow from a lot of research that has been written specifically about dynamic languages and find a way of improving the performance that Python is able to give its users right now. There is work in the community, and I would like to shout out to the core developers at Microsoft. They've been spearheading a lot of the efforts to ensure that we have a faster Python interpreter, especially CPython, the reference implementation. And because of some of these challenges, we have also seen alternative implementations motivated by the not-so-good performance of Python to come up with things like JIT compilation and probably better garbage collection. They've been motivated by the same aspect of performance. So if we are to stay relevant as a language and move with the times, I think right now is a very important time to be looking at where the Python interpreter could be improved. Again, I know there is of course a lot of work happening, both on the reference implementation and on alternative implementations. But I think there should be some sort of priority given to performance right now. I'm not saying we can be like C, but we could be better, so that we do not lose some of our users. Again, Python was not built to be a language for everything, but we can do something about it. The other challenge that has been talked about is the standard library. My first language summit was in 2019, when I came to PyCon US, and one of the heated discussions we were having was about the standard library.
Very many talks and discussions have happened around the standard library on very many mailing lists, and usually most people are saying, okay, we have to trim the standard library and make sure we keep only what we need. I think that's valid. There is the aspect that we are still keeping very old things in the standard library, things that are not useful right now. However, most of the problem in the standard library is that we are not improving or maintaining the batteries we have in there. We can trim it as much as we want, but whatever we keep in the standard library, we need to find a way of making sure we improve the modules that remain, because otherwise we are not solving any problem. What happens 10 years from now? Will we still just keep trimming things off the standard library all the time? I think we should instead have a pathway towards improving the libraries we feel are important. I'm not against trimming it further, but I think we should also make sure we handle the underlying problem, which is that the modules are not maintained. The other aspect is garbage collection. Now, garbage collection as a topic has been around for about 60 years. Python is about 30 years old. Without putting anybody down, and I know most people know about the challenge, I think we should be looking at moving to tracing garbage collection and generational garbage collection, because the industry has moved and a lot of research has been done. Python is based on reference counting. We could call it a hybrid, because reference counting does not collect cycles, so there is a sort of tracing-like GC that handles cycles. GC research has shown us that tracing garbage collection does well in terms of performance, and we can improve it.
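The hybrid scheme described here, reference counting plus a separate collector for cycles, can be observed from plain Python. The following is a minimal sketch rather than anything shown in the talk: the `Node` class and both function names are made up for illustration, and the behaviour noted in the comments assumes CPython (an implementation with a tracing collector, such as PyPy, may free objects at different times).

```python
import gc
import weakref


class Node:
    """Hypothetical object that can hold a reference to another Node."""

    def __init__(self, name):
        self.name = name
        self.partner = None


def freed_by_refcount_alone():
    """Without a cycle, the object goes away when its last reference does."""
    node = Node("solo")
    probe = weakref.ref(node)  # a weak reference does not keep the object alive
    del node                   # last strong reference is gone
    return probe() is None     # True on CPython: reclaimed immediately


def cycle_needs_collector():
    """Objects in a reference cycle survive until the cycle collector runs."""
    was_enabled = gc.isenabled()
    gc.disable()               # keep automatic collections out of the experiment
    try:
        a, b = Node("a"), Node("b")
        a.partner, b.partner = b, a   # a -> b -> a: a reference cycle
        probe = weakref.ref(a)
        del a, b                      # counts never reach zero because of the cycle
        alive_before = probe() is not None
        gc.collect()                  # explicit run of the cycle collector
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("freed without a cycle:", freed_by_refcount_alone())
    print("cycle alive before/after gc.collect():", cycle_needs_collector())
```

The weak reference is only a probe: it lets the script ask whether the object still exists without itself keeping the object alive.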
We can even optimize it further with parallelism and so many other things. So if we are to be relevant and move with the times, I think we should be looking at improving garbage collection for CPython, the reference implementation, at some point. This is something that has been discussed for a long time. There are so many questions to answer in terms of supporting it, but I think this is a good time to start thinking about garbage collection. Again, if we are to be relevant, maybe we should be thinking about moving to tracing garbage collection, maybe generational, because as it stands, it looks like we are almost 20 years behind the landscape and behind the research. I accept the challenges we are still facing, but I still think we can do more in this area. The other interesting aspect, which I also touch on in my research, is the C API. We've hailed the alternative implementations that have come up. First of all, they've been motivated by the performance of the reference implementation. However, they've all been blocked by the fact that they cannot efficiently support the Python extension modules. The C API was built to be very simple. If you look at the implementation of the C API and compare it to the design at the time Python was built, they made some good decisions at that time. But now, as we've seen, some of those design decisions mean the C API has evolved to be hard to maintain, and alternative implementations cannot efficiently support it. I think it's only PyPy that has tried, maybe successfully, to support the C API, but the way they support it has also come with some performance degradation. If you go to the PyPy documentation, you can read about all the challenges they are facing by virtue of how the C API was designed. A couple of things come to mind.
It exposes too many things, and it's also tied to many VM implementation details. Garbage collection, for example, is exposed in it. So it has been a blocker for many alternative implementations trying to support it successfully. I've been reading a couple of papers, and since the time I think Lua was implemented, people have been using the Python C API as an example of how not to implement a C API. According to those papers, many people have learned from us how not to implement a C API, which is unfortunate, and I'm not criticizing anybody. But I think this is something we can start looking at right now to see if we can find a solution. So those are my key points about the language in general, but I would also love to talk about one thing in the Python ecosystem that some people have been grumbling about, and that is core development. When I talk about core development, I'm talking about how the language is managed and maintained in general: how bugs are received and triaged, how development is managed, and the response to pull requests and to contributions in general, first of all by the core team, but also by contributors. So when you look at core mentorship: the core team is not that big, and the active core developers are not that many, so it's too much work for very few people. I know that there is a lot of effort by core developers like Victor Stinner, even Guido van Rossum, and other people who have been mentoring many new members to join the core team. But there is still friction. Right now we have about 1,400 pull requests still open on the CPython project, and most of them have never been reviewed. This is an open problem. It remains open, and I don't know how we will effectively solve it.
However, in my view, from my two years in core development, I'm seeing that many people are grumbling about how core development is managed, about the idea that people are willing to submit PRs and nobody is there to review them. I'm not even blaming the core team, because almost no one is paid, people have families, and they do other things after work. Besides, the core team is very small. So I'm not blaming anybody. And there are other factors. Even if you had time as a core developer, you could not review every pull request, because every core developer has a different area and level of expertise. So it's unrealistic to expect one person to be reviewing everything. My point here is that it's a very difficult topic, but we need to be looking at it, because so many people are grumbling and talking about it. I think it's high time we started to talk about it, because it will affect how people look at the language. If they are submitting bugs and no one triages them, looks at them, or responds to them even after 10 years, then it starts to be a problem. People will probably look for other, better places to be. That is not the fault of any specific individual, but I guess it's something we need to talk about. So what is the way forward? I don't think I have solutions to all these problems, and I've talked about just a few of them. I don't even know if anybody has a single solution for all those challenges, but I think the way forward is in some way among us. The solutions to all the problems that Python as a language or ecosystem is facing live, in some way, in its community. We just need to uncover them. EuroPython, I think, is having about 1,000 attendees. PyCon US usually has thousands of attendees as well.
And I think as a collective unit we have a solution. The solutions are somewhere in us. We just need to find a way of tapping into them and organizing ourselves towards them. If we think more about some of the problems we're having, we could find a way through them. I just have a few tips that I'm going to talk about today. The challenges Python is facing as a language are not unique. Take some newer languages, like I told you, especially Lua, because I've read many papers that have been written by the Lua core team. When they started to build Lua, they first of all looked at Python and tried to avoid its problems. But we can also learn something from them, especially in how they evolved and changed specific aspects of Lua. What I'm trying to say is that Python's problems are not very unique. For example, I've been following the story of Lua's API, and they've made drastic changes, even changes that we as Python are afraid to make right now. They made many radical decisions in how Lua's API changed. Maybe we could learn from them. And again, take my advice with a grain of salt, because there are some things we cannot directly compare, but I'm saying we can learn from some of these newer languages and runtimes, like V8 and Rust. Look at the way V8 garbage collection works, both for normal JavaScript and for C extensions. They use handles, which I think is something HPy does. Again, there are things we could borrow from some of these newer languages. Their authors have written a couple of papers where they actually use Python as an example. They use Python's problems as lessons, so that they don't design languages that have the same challenges as Python.
So the first thing we can do is look at these languages and see what ideas we can borrow. Now, I talked about the C API and how it's problematic, and about problems like garbage collection and other performance-related problems. It's not going to be easy to change the C API or to move to tracing garbage collection, and those ideas are very radical. But what we can do is have researchers, like me and other people who are doing or considering a PhD, decide to work on some of these problems. I think what we need is people prototyping some of these radical ideas that the core team is probably still afraid to try. Then, from the insight these researchers produce, we can see whether some of these ideas will actually fit well with the language. That's one thing we could try, because, for example, for compatibility reasons there is a big fear of totally changing the C API, even if we have the solutions. But if we have people prototyping and creating insight, then the core team will probably have some confidence to gradually borrow some of these ideas. Some could even be merged into the project as they are, if they don't pose serious compatibility or other problems. So the idea is that we need to create insight: if you're a researcher, you should probably look at prototyping some of these radical ideas. I have already talked about my whole issue with the standard library, which is that just trimming stuff is not going to improve it unless we have a pathway and a commitment to improving the stuff we keep. Otherwise we are not solving the problem. Will we trim the standard library until we only have if, else and for? It won't make sense at some point. I think as we trim, let's find a pathway towards that.
And I think the question is, if you're trimming stuff and telling people to go to PyPI, let's have a clear path for where these deleted modules are going. I think that's going to give people some confidence. Otherwise, if we just delete stuff and we don't have a good place to put those things, that is not going to work for users. That's my view. Again, there are many efforts around this. When we talk about these problems, very many people are engineering solutions. However, I think this is a time to take caution, because over-engineering does not always come with performance or other benefits. Let's prioritize guided engineering. I've read stories elsewhere, in a language I don't want to mention right now, where they over-engineered something and removed it later. So over-engineering is not helping. I think the engineering should be guided by the problems we are currently facing, at least priority-wise. If something is a performance-related fix, let's wait and see whether it's really giving us the benefits we're claiming it has. And if it's not, even if it's running on top of cutting-edge technology, I think we shouldn't be merging it. So let's have a balance and not engineer for the sake of engineering. I think that applies to the core team. The PSF has also done a lot of work here: right now we have a developer in residence who works just on CPython. That has been a good step, but I also encourage other companies to think about sponsoring and financing developer-in-residence roles so that we have more people. This will improve the language, because if you're running on Python and we are not investing much in it, then we are probably building on crumbling infrastructure. So like I said, before I end the talk, I would love to say that the solutions to all our problems, to the bad parts of Python, live among us.
And if we cooperate, it doesn't matter how many we are, because Python is open source. Even if just 2,000 of us wanted to find a solution, we would be enough to make Python better. I think we just need to cooperate and be willing to share ideas, and if we have time, put in the time. For companies, if you can't put in time, then you can put in money to support the programs that the PSF runs to ensure that Python the language stays relevant, across all the activities that surround the language and its ecosystem. As for some of the things you can cooperate on: part of my plan as a PSF director is to highlight some of the research we have in our community. So if you do research in Python, I would love to talk to you. I have a couple of plans for us. If you do Python in education, you can talk to me during or after the conference, and if you're interested in the ambassador program, you can also talk to me. We can discuss more of the plans I have as an individual, and how we can work together with the PSF to achieve them for the language and the community. I would love to end by saying that the solution is within us in some way. We just need to find a more practical way of voicing our solutions. Thank you. These are my emails if you want to talk about anything. Thanks very much. I would love to hand over to the organizers for any questions you may have.
All right, thank you so much for the great talk. We just loved it, and I'm seeing lots of questions in the question and answer chat, so I will try to write down the questions so that they appear on the screen and it's easier for you to read them out and answer them. So we got one very interesting question. It's about the standard library. You mentioned that the standard library has sometimes been seen as a problem, but what is actually the problem with it?
Is it because it's too large, or what else?
Okay, so in the different discussions that have been had around the standard library, the problem has been, first of all, that there are old modules. But most importantly, many of the modules live in the standard library and are actually not maintained, for different reasons that I can't pinpoint myself, though it could be because of resources. Some modules up to now don't even have an expert, as in nobody understands them at all. So it's too big, and also too big for people to maintain. At some point it has grown bigger than us, and some things are obsolete. Again, Python is not the only language that has this problem. I think even in Ruby, the standard library has grown to the point that it has old stuff, and they also found that things living in the language itself do not have maintainers, and that the learning curve to contribute to the core language is steeper than if the library were, for example, on PyPI. So I think it makes more sense to have a more streamlined standard library and a path towards having more modules on PyPI, where they could probably get better maintenance. So again, it's too big and has obsolete stuff, but essentially it also poses a maintenance burden. So PyPI is better in that case, yeah.
All right, thank you. Great question and great answer. We have another very interesting question in the chat. What is the reason for switching from ref counting to a tracing garbage collector? If it were performance, what would be the difference that you would expect from such a switch?
So, based on research, and again, I do research in garbage collection, though I'm not asking anybody to go and read the existing research.
Mostly, people move away from reference counting for performance reasons, because, first, maintaining reference counts can pose an overhead. And the fact that we need a hybrid approach to solve all our problems regarding garbage collection is, I think, not ideal, because we still need some sort of tracing garbage collector to manage our cycles. I have no prediction of how much difference we would see in Python. I think predicting performance for such a change is something you can never really do in advance. But what I think, based on research, is that there are things tracing garbage collection could give us, such as parallelism, that we can't get with reference counting. From the insight we've had from other languages and from existing research, we are confident that tracing garbage collection can give us more benefits than reference counting.
Okay, Joannah, fantastic. Thank you so much for your answer. Very good question and fantastic answer, just like before. We are running a little bit out of time, so I'm afraid we have to stop here. But I would like to thank you for taking the time to prepare this talk and for delivering it. We really enjoyed it. Thank you so much, Joannah.